DR-Learner
Definition
The DR-Learner is a two-stage doubly robust estimator of the conditional average treatment effect (CATE) that regresses a doubly robust pseudo-outcome on the covariates.
Stage 1: Nuisance estimation
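- Propensity score: $\hat{\pi}(x)$, an estimate of $\pi(x) = P(A = 1 \mid X = x)$
- Outcome regression: $\hat{\mu}_a(x)$, an estimate of $\mu_a(x) = E[Y \mid X = x, A = a]$, for $a \in \{0, 1\}$
Stage 2: Pseudo-outcome regression
$$\hat{\tau}_{dr}(x) = \hat{E}_n[\hat{\varphi}(Z) \mid X = x]$$
where $Z = (X, A, Y)$, $\hat{E}_n[\cdot \mid X = x]$ denotes a generic regression estimator, and the pseudo-outcome is:
$$\hat{\varphi}(Z) = \frac{A - \hat{\pi}(X)}{\hat{\pi}(X)\{1 - \hat{\pi}(X)\}} \{Y - \hat{\mu}_A(X)\} + \hat{\mu}_1(X) - \hat{\mu}_0(X)$$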
Intuitive Understanding
Core idea:
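- Compute the doubly robust pseudo-outcome (the uncentered efficient influence function for the ATE)
- Smooth/regress this pseudo-outcome on the covariates $X$
- The second-stage regression can then exploit the structure of the CATE itself (smoothness, sparsity), separately from the structure of the nuisance functions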
Stage 1: Estimate π̂(x), μ̂₁(x), μ̂₀(x) using any ML method
↓
Stage 2: Compute pseudo-outcome φ̂(Z) for each observation
↓
Stage 3: Regress φ̂ on X to get τ̂(x)
Why “Doubly Robust”?
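- If either $\hat{\pi}$ or $\hat{\mu}_a$ is correct, the bias vanishes
- Even if both are wrong, only the product of errors remains: $\lvert \text{bias}(x) \rvert \lesssim \lvert \hat{\pi}(x) - \pi(x) \rvert \cdot \sum_{a} \lvert \hat{\mu}_a(x) - \mu_a(x) \rvert$ (for $\hat{\pi}$ bounded away from 0 and 1)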
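The two bullets above can be checked numerically. The following is a minimal simulation sketch, not part of the original note: the data-generating process, the sample size, and the helper name `dr_pseudo_outcome` are all illustrative assumptions. It averages the pseudo-outcome (so it checks the unconditional, ATE-level version of the claim) under three scenarios: a correct propensity score with a wrong outcome model, the reverse, and both wrong.

```python
import numpy as np

def dr_pseudo_outcome(y, a, pi_hat, mu1_hat, mu0_hat):
    """Doubly robust pseudo-outcome, one value per observation (hypothetical helper)."""
    mu_a_hat = np.where(a == 1, mu1_hat, mu0_hat)
    weight = (a - pi_hat) / (pi_hat * (1.0 - pi_hat))
    return weight * (y - mu_a_hat) + mu1_hat - mu0_hat

# Assumed data-generating process: tau(x) = 1 + x, so the true ATE is 1.
rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1.0, 1.0, size=n)
pi = 1.0 / (1.0 + np.exp(-x))      # true propensity score
mu0 = np.sin(x)                    # true control outcome regression
mu1 = mu0 + 1.0 + x                # true treated outcome regression
a = rng.binomial(1, pi)
y = np.where(a == 1, mu1, mu0) + rng.normal(size=n)

zeros = np.zeros(n)                # deliberately wrong outcome model
half = np.full(n, 0.5)             # deliberately wrong propensity score

ate_wrong_mu = dr_pseudo_outcome(y, a, pi, zeros, zeros).mean()
ate_wrong_pi = dr_pseudo_outcome(y, a, half, mu1, mu0).mean()
ate_both_wrong = dr_pseudo_outcome(y, a, half, zeros, zeros).mean()

print("correct pi, wrong mu:", ate_wrong_mu)
print("wrong pi, correct mu:", ate_wrong_pi)
print("both wrong:          ", ate_both_wrong)
```

In this setup the first two averages should land close to the true ATE of 1, while the third should be visibly biased, because neither error factor in the product is zero.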
Key Properties
Double Robustness
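The bias term depends only on the product of the propensity-score and outcome-regression errors:
$$\hat{b}(x) = \sum_{a \in \{0,1\}} \frac{\{\hat{\pi}(x) - \pi(x)\}\{\hat{\mu}_a(x) - \mu_a(x)\}}{a\,\hat{\pi}(x) + (1 - a)\{1 - \hat{\pi}(x)\}}$$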
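A short derivation sketch shows where the product structure comes from. It assumes the nuisance estimates are held fixed (as under sample splitting), writes $\pi$ and $\mu_a$ for the true functions with $\tau(x) = \mu_1(x) - \mu_0(x)$, and takes the pseudo-outcome to be $\hat{\varphi}(Z) = \frac{A - \hat{\pi}(X)}{\hat{\pi}(X)\{1 - \hat{\pi}(X)\}}\{Y - \hat{\mu}_A(X)\} + \hat{\mu}_1(X) - \hat{\mu}_0(X)$. Taking the conditional expectation given $X = x$ and subtracting $\tau(x)$:

```latex
\begin{aligned}
E[\hat{\varphi}(Z) \mid X = x]
  &= \frac{\pi(x)}{\hat{\pi}(x)}\{\mu_1(x) - \hat{\mu}_1(x)\}
   - \frac{1 - \pi(x)}{1 - \hat{\pi}(x)}\{\mu_0(x) - \hat{\mu}_0(x)\}
   + \hat{\mu}_1(x) - \hat{\mu}_0(x), \\
E[\hat{\varphi}(Z) \mid X = x] - \tau(x)
  &= \{\mu_1(x) - \hat{\mu}_1(x)\}\left\{\frac{\pi(x)}{\hat{\pi}(x)} - 1\right\}
   - \{\mu_0(x) - \hat{\mu}_0(x)\}\left\{\frac{1 - \pi(x)}{1 - \hat{\pi}(x)} - 1\right\} \\
  &= \frac{\{\hat{\pi}(x) - \pi(x)\}\{\hat{\mu}_1(x) - \mu_1(x)\}}{\hat{\pi}(x)}
   + \frac{\{\hat{\pi}(x) - \pi(x)\}\{\hat{\mu}_0(x) - \mu_0(x)\}}{1 - \hat{\pi}(x)}.
\end{aligned}
```

Each term is a propensity-score error multiplied by an outcome-regression error, so the conditional bias is zero whenever either factor is zero.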
Rate Adaptation
- Adapts to the smoothness of the CATE
- Decoupled from the smoothness of the individual nuisance functions
- Can achieve a faster rate than the plug-in estimator
Oracle Efficiency
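Achieves the oracle rate under the following condition (assuming each nuisance estimator attains its minimax rate, so that the product of the nuisance error rates is no slower than the oracle rate):
$$\frac{1}{2 + d/\alpha} + \frac{1}{2 + d/\beta} \geq \frac{1}{2 + d/\gamma}$$
where:
- $\alpha$: propensity score smoothness
- $\beta$: outcome regression smoothness
- $\gamma$: CATE smoothness
- $s$: harmonic mean smoothness of the nuisance functions; when $\alpha = \beta = s$, the condition reduces to $s \geq \frac{d/2}{1 + d/\gamma}$
- $d$: covariate dimension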
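As a rough illustration of how the smoothness parameters and the dimension interact, the sketch below compares rate exponents. It assumes each nuisance estimator attains the usual nonparametric minimax rate $n^{-s/(2s+d)}$ for an $s$-smooth function in $d$ dimensions, and that oracle efficiency requires the product of the two nuisance rates to be no slower than the oracle CATE rate. The function names are hypothetical.

```python
def minimax_exponent(smoothness: float, d: int) -> float:
    """Exponent r in the assumed minimax rate n^(-r) for a smooth function in d dimensions."""
    return smoothness / (2.0 * smoothness + d)

def is_oracle_efficient(alpha: float, beta: float, gamma: float, d: int) -> bool:
    """Check whether the product of nuisance rates is at least as fast as the oracle rate."""
    # Multiplying rates n^(-r1) * n^(-r2) adds the exponents.
    nuisance_product = minimax_exponent(alpha, d) + minimax_exponent(beta, d)
    oracle = minimax_exponent(gamma, d)
    return nuisance_product >= oracle

# Low dimension, moderately smooth nuisances: the condition holds.
low_dim = is_oracle_efficient(alpha=1.0, beta=1.0, gamma=2.0, d=2)
# High dimension, rough nuisances, very smooth CATE: the condition fails.
high_dim = is_oracle_efficient(alpha=1.0, beta=1.0, gamma=10.0, d=20)
print(low_dim, high_dim)
```

The second case shows the typical failure mode: when the CATE is much smoother than the nuisance functions relative to the dimension, the nuisance error product dominates the oracle error.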
Algorithm
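# DR-Learner Algorithm
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def dr_learner(X, A, Y, n_folds=5):
    # Stage 1: Cross-fitted nuisance estimation (gradient boosting is a placeholder; any ML method works)
    pi_hat, mu1_hat, mu0_hat = (np.zeros(len(Y)) for _ in range(3))
    for train, test in KFold(n_folds, shuffle=True, random_state=0).split(X):
        Xtr, Atr, Ytr = X[train], A[train], Y[train]
        pi_hat[test] = GradientBoostingClassifier().fit(Xtr, Atr).predict_proba(X[test])[:, 1]
        # Fit each outcome model on its own arm, but predict for every held-out unit
        mu1_hat[test] = GradientBoostingRegressor().fit(Xtr[Atr == 1], Ytr[Atr == 1]).predict(X[test])
        mu0_hat[test] = GradientBoostingRegressor().fit(Xtr[Atr == 0], Ytr[Atr == 0]).predict(X[test])
    pi_hat = np.clip(pi_hat, 0.01, 0.99)  # assumed clipping to keep the inverse weights bounded
    # Stage 2: Compute pseudo-outcomes
    phi_hat = (mu1_hat - mu0_hat) + \
        (A - pi_hat) / (pi_hat * (1 - pi_hat)) * \
        (Y - A * mu1_hat - (1 - A) * mu0_hat)
    # Stage 3: Regress pseudo-outcome on X
    tau_hat = GradientBoostingRegressor().fit(X, phi_hat)
    return tau_hat  # fitted model: tau_hat.predict(X_new) gives the CATE estimates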
Comparison with Other Learners
| Method | Key Idea | Pros | Cons |
|---|---|---|---|
| T-Learner | Separate models per treatment | Simple | No sharing across groups |
| S-Learner | Single model with A as feature | Shares info | May miss heterogeneity |
| X-Learner | Two-stage imputation | Good for imbalance | Complex |
| R-Learner | Residualize then regress | Orthogonality | Requires product rate |
| DR-Learner | DR pseudo-outcome regression | Double robustness, rate adaptation | Stability condition needed |
Theoretical Guarantee
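Main Error Bound (Theorem 2):
$$\hat{\tau}_{dr}(x) - \tilde{\tau}(x) = \hat{E}_n[\hat{b}(X) \mid X = x] + o_P(R_n^*(x))$$
where:
- $\tilde{\tau}(x) = \hat{E}_n[\varphi(Z) \mid X = x]$: the oracle estimator (using the true pseudo-outcome)
- $\hat{b}(x)$: bias from nuisance estimation
- $R_n^*(x)$: oracle root mean squared error, $R_n^*(x)^2 = E[\{\tilde{\tau}(x) - \tau(x)\}^2]$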
Stability Condition required: The second-stage regression estimator must be stable with respect to input perturbations.
Related Concepts
- Pseudo-outcome - the core component of the DR-Learner
- Doubly Robust Estimator - theoretical foundation
- CATE - the estimation target
- Oracle Efficiency - the theoretical goal
- Cross-fitting - prevents overfitting
- R-Learner - related methodology
Comparison: DR-Learner vs R-Learner
| Aspect | DR-Learner | R-Learner |
|---|---|---|
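| Pseudo-outcome | $\frac{A - \hat{\pi}}{\hat{\pi}(1 - \hat{\pi})}(Y - \hat{\mu}_A) + \hat{\mu}_1 - \hat{\mu}_0$ | $\frac{Y - \hat{m}(X)}{A - \hat{\pi}(X)}$, weighted by $\{A - \hat{\pi}(X)\}^2$, where $\hat{m}$ estimates $E[Y \mid X]$ |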
| Rate condition | Product rate | Product rate |
| Oracle condition | Product of nuisance rates no slower than the oracle rate | Weaker for lp-R-Learner |
| Implementation | Simpler | More complex (lp version) |
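To make the comparison concrete, the sketch below computes both pseudo-outcomes on simulated data with the true nuisance functions plugged in. The data-generating process (a constant effect of 2) and the helper names are illustrative assumptions. The R-Learner form used here is the standard residual-on-residual construction, in which the pseudo-outcome $(Y - m(X)) / (A - \pi(X))$ is regressed with weights $(A - \pi(X))^2$ and $m(x) = E[Y \mid X = x]$.

```python
import numpy as np

def dr_pseudo_outcome(y, a, pi_hat, mu1_hat, mu0_hat):
    """DR-Learner pseudo-outcome (unweighted second-stage regression)."""
    mu_a_hat = np.where(a == 1, mu1_hat, mu0_hat)
    return (a - pi_hat) / (pi_hat * (1.0 - pi_hat)) * (y - mu_a_hat) + mu1_hat - mu0_hat

def r_pseudo_outcome(y, a, pi_hat, m_hat):
    """R-Learner pseudo-outcome and the weights for its second-stage regression."""
    resid_a = a - pi_hat
    return (y - m_hat) / resid_a, resid_a ** 2

# Assumed setup: constant treatment effect tau = 2, known nuisance functions.
rng = np.random.default_rng(1)
n = 100_000
x = rng.uniform(-1.0, 1.0, size=n)
pi = 0.3 + 0.4 * (x > 0)           # true propensity score: 0.3 or 0.7
mu0 = x ** 2
tau = 2.0
mu1 = mu0 + tau
m = mu0 + pi * tau                 # E[Y | X]
a = rng.binomial(1, pi)
y = mu0 + a * tau + rng.normal(size=n)

phi_dr = dr_pseudo_outcome(y, a, pi, mu1, mu0)
phi_r, w_r = r_pseudo_outcome(y, a, pi, m)

# With a constant effect, the second-stage "regression" is just a (weighted) mean.
tau_dr = phi_dr.mean()
tau_r = np.average(phi_r, weights=w_r)
print(tau_dr, tau_r)
```

Both estimates should be close to 2 here. The practical difference is that the DR-Learner feeds an unweighted pseudo-outcome to any regression method, whereas the R-Learner needs a second-stage method that accepts observation weights.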
Applications
- Medicine: Heterogeneous treatment effects in clinical trials
- Policy: Subgroup-specific policy effects
- Marketing: Personalized treatment response
- Social Science: Causal effect heterogeneity
Implementation
Python (econml):
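from econml.dr import DRLearner
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
# Y: outcome, T: discrete treatment, X: effect modifiers, W: additional controls
dr = DRLearner(model_propensity=LogisticRegression(),
               model_regression=RandomForestRegressor(),
               model_final=RandomForestRegressor())
dr.fit(Y, T, X=X, W=W)
cate = dr.effect(X_test)  # estimated CATE for each row of X_test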
R (grf):
library(grf)
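# grf does not implement the DR-Learner itself; causal_forest residualizes Y and W
# (R-Learner style) and is shown as a related alternative. In grf, W is the treatment.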
cf <- causal_forest(X, Y, W)
tau_hat <- predict(cf)$predictions
References
- kennedyOptimalDoublyRobust2023 - DR-Learner theory and oracle efficiency
- chernozhukovDoubleDebiasedMachine2018 - DML framework
- nieQuasiOracleEstimationHeterogeneous2020 - R-Learner