r3d.Rd
Fits a regression discontinuity design with distribution-valued outcomes (random distributions),
as developed in vandijcke2025;textualR3D,
using local polynomial regression on quantiles (method="simple"
) or local Fréchet regression
(method="frechet"
). Supports both sharp and fuzzy designs.
r3d(
X,
Y_list,
T = NULL,
cutoff = 0,
method = c("simple", "frechet"),
p = 2,
q_grid = seq(0.01, 0.99, 0.01),
fuzzy = FALSE,
kernel_fun = c("epanechnikov", "triangular", "uniform"),
s = 1,
bandwidths = NULL,
boot = FALSE,
boot_reps = 200,
boot_cores = 1,
alpha = 0.05,
test = c("none", "nullity", "homogeneity", "gini"),
test_ranges = NULL,
coverage = FALSE,
weights = NULL,
...
)
Numeric vector of running variable values (length \(n\)).
A list of length \(n\), where each element is a numeric vector representing the sample from the outcome distribution of one unit.
(Optional) For a fuzzy design: numeric or logical vector of length \(n\)
giving partial treatment status (0 or 1). If fuzzy=FALSE
, this can be omitted.
Numeric scalar: the threshold for treatment assignment. The design treats \(X \ge cutoff\) as above the cutoff.
Either "simple"
or "frechet"
:
"simple"
Local polynomial regression is done pointwise for each quantile, and then optionally rearranged to maintain monotonicity.
"frechet"
A global local Fréchet regression approach is done, projecting the entire curve onto quantile functions.
Integer specifying the local polynomial order (e.g., 2 for local quadratic).
Numeric vector of quantiles at which to estimate the distributional treatment effect.
Logical flag for fuzzy design. Default FALSE
for a sharp design.
Name of the kernel function. Possible choices are
"epanechnikov"
, "triangular"
, or "uniform"
.
Defaults to "triangular"
.
Integer expansion order in the pilot local polynomial step used for bandwidth selection. Defaults to 1.
Optional user-specified bandwidths. If NULL
(default), bandwidths are
automatically selected via r3d_bwselect
. For a sharp design, can be
either a single numeric value or a vector of length length(q_grid)
. For a fuzzy design,
must be a list with two components: num
(for the outcome equation) and den
(for the
first-stage equation). Where num
can be a scalar or vector of length length(q_grid)
,
and den
must be a scalar.
Logical. If TRUE
, automatically runs r3d_bootstrap
after estimation to produce uniform confidence bands and tests.
Integer, number of bootstrap draws if boot=TRUE
.
Number of CPU cores for parallelizing the bootstrap. Default 1.
Significance level for the uniform confidence bands. Default 0.05.
Character vector of tests to perform. Options include: "none"
, "nullity"
,
"homogeneity"
, or "gini"
, or a vector combining multiple test types. When "none"
,
no tests are performed. When "nullity"
, tests the null hypothesis that the treatment effect
is zero across all quantiles. When "homogeneity"
, tests the null hypothesis that the
treatment effect is constant across all quantiles. When "gini"
, tests the null hypothesis
that there is no difference in the Gini coefficients of the distributions above and below the cutoff.
List of numeric vectors defining the quantile ranges for testing. Each element should be a
vector of length 2 or more defining the ranges. For example, list(c(0.25, 0.75))
to test on
quantiles between 0.25 and 0.75, or list(c(0.25, 0.5, 0.75))
to test on ranges \[0.25, 0.5\] and \[0.5, 0.75 \].
If NULL
(default), tests are performed on the entire q_grid
.
Logical indicating whether to apply the coverage correction rule of thumb of calonico2018effect;textualR3D. Default is FALSE.
Optional numeric vector of weights for each observation that will be used to compute weighted quantiles. Defaults to unweighted.
Additional arguments passed to the bandwidth selection function r3d_bwselect
.
An S3 object of class "r3d"
, containing (among others):
tau
Estimated distributional treatment effect \(\tau(q)\) for each \(q\) in q_grid
.
q_grid
The quantile grid used.
bandwidths
The bandwidths used (either user-provided or MSE-/IMSE-optimal).
w_plus
, w_minus
The internal local polynomial weights on the plus/minus side.
e1_mat
, e2_mat
Residual matrices used for the multiplier bootstrap.
boot_out
If boot=TRUE
, a list of bootstrap results from r3d_bootstrap
.
This is the main user-facing function for the R3D approach. Internally, it:
Selects bandwidth(s) via r3d_bwselect
(if bandwidths=NULL
) or
uses user-provided bandwidths,
Computes local polynomial fits on each side of the cutoff for each quantile in q_grid
,
For method="frechet"
, performs a projection of the fitted quantile curves onto the
space of valid quantile functions (via isotonic regression),
(Optionally) runs the multiplier bootstrap for uniform inference.
When test
includes one or more test types and test_ranges
is specified, the function will
perform the specified tests on each range in test_ranges
. For example, if test=c("nullity", "homogeneity")
and test_ranges=list(c(0.25, 0.75), c(0.1, 0.9))
, four tests will be performed: nullity and homogeneity
tests on the range \[0.25, 0.75\], and nullity and homogeneity tests on the range \[0.1, 0.9\].
The Gini test (test="gini"
) examines whether there is a statistically significant difference in
inequality (as measured by the Gini coefficient) between the distributions above and below the cutoff.
It uses the estimated conditional quantile functions to calculate the Gini coefficients.
if (FALSE) { # \dontrun{
# Simulate data
set.seed(123)
n <- 100
X <- runif(n, -1, 1)
Y_list <- lapply(seq_len(n), function(i) {
# Suppose distribution is Normal with mean depending on X
rnorm(sample(30:50,1), mean=2 + 2*(X[i]>=0))
})
T <- as.numeric(X >= 0) * rbinom(n, 1, 0.8) # Fuzzy treatment
# Example 1: Test nullity and homogeneity on full range
fit1 <- r3d(X, Y_list, T=T, cutoff=0,
method="frechet", p=2, fuzzy=TRUE,
boot=TRUE, boot_reps=200, alpha=0.05,
test=c("nullity", "homogeneity"))
# Example 2: Test Gini coefficient difference
fit2 <- r3d(X, Y_list, cutoff=0, method="simple",
boot=TRUE, test="gini")
# Example 3: Test on multiple ranges
fit3 <- r3d(X, Y_list, cutoff=0,
boot=TRUE, test=c("nullity", "homogeneity", "gini"),
test_ranges=list(c(0.25, 0.5, 0.75)))
# Inspect results
print(fit1)
summary(fit1)
plot(fit1)
} # }