Doubly Robust Estimator
Definition
The Doubly Robust (DR) Estimator combines an outcome-regression model and a propensity-score model, and it remains consistent as long as at least one of the two is correctly specified.
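Notation: $Z = (X, T, Y)$ with covariates $X$, binary treatment $T$, and outcome $Y$; $\mu_a(x) = E[Y \mid X = x, T = a]$ is the outcome regression and $\pi(x) = P(T = 1 \mid X = x)$ is the propensity score.
DR Estimator for the ATE:
$$\hat\tau_{DR} = \frac{1}{n}\sum_{i=1}^{n} \hat\varphi(Z_i)$$
where the pseudo-outcome (the uncentered efficient influence function) is:
$$\hat\varphi(Z) = \hat\mu_1(X) - \hat\mu_0(X) + \frac{T\,(Y - \hat\mu_1(X))}{\hat\pi(X)} - \frac{(1 - T)\,(Y - \hat\mu_0(X))}{1 - \hat\pi(X)}$$
or, in an equivalent form:
$$\hat\varphi(Z) = \hat\mu_1(X) - \hat\mu_0(X) + \frac{T - \hat\pi(X)}{\hat\pi(X)\,(1 - \hat\pi(X))}\,\bigl(Y - \hat\mu_T(X)\bigr)$$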
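As a minimal sketch of the definition, the pseudo-outcome and its sample mean can be computed directly from arrays of nuisance predictions. The function names `dr_pseudo_outcome` and `dr_ate` and the two-unit toy data are illustrative assumptions, not part of any library.

```python
import numpy as np

def dr_pseudo_outcome(y, t, mu0_hat, mu1_hat, pi_hat):
    """AIPW pseudo-outcome per unit; all inputs are 1-D arrays of equal length."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    treated_term = t * (y - mu1_hat) / pi_hat
    control_term = (1.0 - t) * (y - mu0_hat) / (1.0 - pi_hat)
    return mu1_hat - mu0_hat + treated_term - control_term

def dr_ate(y, t, mu0_hat, mu1_hat, pi_hat):
    """DR estimate of the ATE: the sample mean of the pseudo-outcomes."""
    return float(np.mean(dr_pseudo_outcome(y, t, mu0_hat, mu1_hat, pi_hat)))

# Hypothetical two-unit example that can be checked by hand.
y = np.array([3.0, 1.0])
t = np.array([1, 0])
mu0 = np.array([1.0, 1.0])   # predicted outcome under control
mu1 = np.array([2.0, 2.0])   # predicted outcome under treatment
pi = np.array([0.5, 0.5])    # predicted propensity score

phi = dr_pseudo_outcome(y, t, mu0, mu1, pi)
# Unit 1 (treated): 1 + (3 - 2) / 0.5 = 3; unit 2 (control): 1 - (1 - 1) / 0.5 = 1
print(phi, dr_ate(y, t, mu0, mu1, pi))
```

In practice the propensity predictions are usually clipped away from 0 and 1 before dividing, which this sketch leaves out.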
Intuitive Understanding
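Three estimation strategies, the third combining the first two:
- Outcome Regression (OR): $\hat\tau_{OR} = \frac{1}{n}\sum_i \bigl[\hat\mu_1(X_i) - \hat\mu_0(X_i)\bigr]$
- Inverse Propensity Weighting (IPW): $\hat\tau_{IPW} = \frac{1}{n}\sum_i \Bigl[\frac{T_i Y_i}{\hat\pi(X_i)} - \frac{(1 - T_i)\,Y_i}{1 - \hat\pi(X_i)}\Bigr]$
- Doubly Robust: OR + an IPW correction term applied to the outcome-model residuals $Y_i - \hat\mu_{T_i}(X_i)$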
OR only: biased if μ̂ is wrong
IPW only: biased if π̂ is wrong
DR: consistent if either μ̂ OR π̂ is correct!
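The three statements above can be illustrated with a small simulation. This is a sketch under assumed conditions: the data-generating process (true ATE of 2), the deliberately wrong nuisance values, and the helper names are all hypothetical, and the true nuisance functions are plugged in where a model is "correct".

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-x))       # true propensity P(T=1 | X)
t = rng.binomial(1, pi_true)
mu0_true = x                              # true E[Y | X, T=0]
mu1_true = x + 2.0                        # true E[Y | X, T=1], so the ATE is 2
y = np.where(t == 1, mu1_true, mu0_true) + rng.normal(size=n)

# Deliberately misspecified nuisances (hypothetical choices).
mu0_bad, mu1_bad = np.zeros(n), np.ones(n)
pi_bad = np.full(n, 0.5)

def or_ate(mu0, mu1):
    return float(np.mean(mu1 - mu0))

def ipw_ate(pi):
    return float(np.mean(t * y / pi - (1 - t) * y / (1 - pi)))

def dr_ate(mu0, mu1, pi):
    phi = mu1 - mu0 + t * (y - mu1) / pi - (1 - t) * (y - mu0) / (1 - pi)
    return float(np.mean(phi))

results = {
    "OR, wrong mu": or_ate(mu0_bad, mu1_bad),
    "IPW, wrong pi": ipw_ate(pi_bad),
    "DR, right mu, wrong pi": dr_ate(mu0_true, mu1_true, pi_bad),
    "DR, wrong mu, right pi": dr_ate(mu0_bad, mu1_bad, pi_true),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under this setup the two single-model estimators should land visibly away from 2, while both DR variants should land close to it.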
Why “Doubly Robust”?
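- $\hat\mu_a = \mu_a$ (outcome model correct): the residual $Y - \hat\mu_T(X)$ has conditional mean zero, so the expectation of the augmentation term is 0 whatever $\hat\pi$ is
- $\hat\pi = \pi$ (propensity model correct): the weighting is exact, so the weighted residual term offsets the bias of the outcome model in expectation
- Even if both are wrong, the bias is proportional to the product of the errors (assuming $\hat\pi$ is bounded away from 0 and 1):
$$|\text{Bias}| \lesssim \|\hat\pi - \pi\|_2 \cdot \sum_{a \in \{0,1\}} \|\hat\mu_a - \mu_a\|_2$$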
Key Properties
Double Robustness Property
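Theorem: If either of the following two conditions holds, $\hat\tau_{DR}$ is consistent for the ATE:
- The outcome model is correctly specified: $\hat\mu_a(x) \xrightarrow{p} \mu_a(x)$ for $a \in \{0, 1\}$
- The propensity model is correctly specified: $\hat\pi(x) \xrightarrow{p} \pi(x)$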
Semiparametric Efficiency
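The DR estimator is semiparametrically efficient when both nuisance models are consistently estimated (with only one correct it stays consistent but is generally not efficient):
- Constructed based on the efficient influence function
- Attains the semiparametric efficiency bound
- Has the lowest asymptotic variance among regular asymptotically linear estimators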
Rate Doubly Robust
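$\sqrt{n}$-consistent under a product-rate condition:
$$\|\hat\pi - \pi\|_2 \cdot \|\hat\mu_a - \mu_a\|_2 = o_p(n^{-1/2}), \quad a \in \{0, 1\}$$
e.g., a rate of $o_p(n^{-1/4})$ for each nuisance estimator is sufficient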
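The product-rate condition is what lets flexible, slower-converging nuisance estimators be used. In practice it is usually paired with cross-fitting, so that each unit's pseudo-outcome uses nuisance models fit on other units. The sketch below is one way to do that with NumPy only. Cross-fitting, the linear and logistic learners (simple stand-ins for ML models), the clipping threshold, and all function names are assumptions of this example.

```python
import numpy as np

def add_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    return np.linalg.lstsq(add_intercept(X), y, rcond=None)[0]

def fit_logit(X, t, n_iter=25):
    """Unregularized logistic regression via Newton-Raphson."""
    A = add_intercept(X)
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        grad = A.T @ (t - p)
        hess = (A * (p * (1.0 - p))[:, None]).T @ A
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def cross_fit_aipw(X, t, y, n_folds=2, clip=0.01, seed=0):
    """Cross-fitted DR estimate of the ATE and its standard error."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    phi = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Nuisance models are fit on the training folds only.
        b1 = fit_ols(X[train & (t == 1)], y[train & (t == 1)])
        b0 = fit_ols(X[train & (t == 0)], y[train & (t == 0)])
        g = fit_logit(X[train], t[train])
        # Predictions and pseudo-outcomes are formed on the held-out fold.
        A_test = add_intercept(X[test])
        mu1, mu0 = A_test @ b1, A_test @ b0
        pi = np.clip(1.0 / (1.0 + np.exp(-A_test @ g)), clip, 1.0 - clip)
        phi[test] = (mu1 - mu0 + t[test] * (y[test] - mu1) / pi
                     - (1 - t[test]) * (y[test] - mu0) / (1.0 - pi))
    return float(phi.mean()), float(phi.std(ddof=1) / np.sqrt(n))

# Hypothetical simulated data with a true ATE of 2.
rng = np.random.default_rng(42)
n = 4000
X = rng.normal(size=(n, 2))
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * X[:, 0])))
y = 1.0 + X[:, 0] + X[:, 1] + 2.0 * t + rng.normal(size=n)

tau_hat, se = cross_fit_aipw(X, t, y)
print(f"ATE estimate: {tau_hat:.3f} (SE {se:.3f})")
```

Swapping `fit_ols` and `fit_logit` for more flexible learners keeps the same structure; only the nuisance fitting step changes.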
Mathematical Derivation
Efficient Influence Function
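The efficient influence function for the ATE $\tau = E[Y(1) - Y(0)]$:
$$\psi(Z) = \mu_1(X) - \mu_0(X) + \frac{T\,(Y - \mu_1(X))}{\pi(X)} - \frac{(1 - T)\,(Y - \mu_0(X))}{1 - \pi(X)} - \tau$$
Properties:
- $E[\psi(Z)] = 0$ and $\mathrm{Var}(\psi(Z)) = E[\psi(Z)^2]$ is the semiparametric variance bound
- Neyman orthogonal: with nuisances $\eta = (\mu_0, \mu_1, \pi)$ and true value $\eta_0$, $\frac{\partial}{\partial r} E\bigl[\psi(Z; \tau, \eta_0 + r(\eta - \eta_0))\bigr]\Big|_{r = 0} = 0$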
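Because the estimator is a sample mean of pseudo-outcomes, the variance bound suggests a simple plug-in standard error: the sample standard deviation of the pseudo-outcomes divided by $\sqrt{n}$. The sketch below assumes the pseudo-outcomes have already been computed; `dr_inference` and the toy values are hypothetical.

```python
import numpy as np

def dr_inference(phi, z=1.96):
    """Point estimate, standard error, and Wald interval from pseudo-outcomes."""
    phi = np.asarray(phi, dtype=float)
    n = phi.size
    tau_hat = float(phi.mean())
    # The sample variance of phi estimates Var(psi), the variance of the
    # centered influence function.
    se = float(phi.std(ddof=1) / np.sqrt(n))
    return tau_hat, se, (tau_hat - z * se, tau_hat + z * se)

# Hypothetical pseudo-outcomes for four units.
tau_hat, se, ci = dr_inference([3.0, 1.0, 2.0, 2.0])
print(tau_hat, se, ci)
```

The interval relies on asymptotic normality, so it is only a rough guide in small samples such as this toy input.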
Bias Analysis
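For fixed nuisance estimates $\hat\mu_0, \hat\mu_1, \hat\pi$:
$$E[\hat\varphi(Z)] - \tau = E\left[\frac{(\hat\pi(X) - \pi(X))\,(\hat\mu_1(X) - \mu_1(X))}{\hat\pi(X)}\right] + E\left[\frac{(\hat\pi(X) - \pi(X))\,(\hat\mu_0(X) - \mu_0(X))}{1 - \hat\pi(X)}\right]$$
→ a product of errors form: the bias vanishes if either $\hat\pi = \pi$ or $\hat\mu_a = \mu_a$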
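The product-of-errors structure of the bias can be checked exactly on a small discrete population, with no sampling noise. In this sketch the three covariate strata, their probabilities, and the misspecified nuisance values are hypothetical numbers chosen for illustration.

```python
import numpy as np

# Population over three covariate strata (hypothetical numbers).
p_x = np.array([0.2, 0.5, 0.3])     # P(X = x)
pi = np.array([0.3, 0.5, 0.7])      # true propensity per stratum
mu0 = np.array([1.0, 2.0, 3.0])     # true E[Y | X, T=0]
mu1 = np.array([2.0, 4.0, 6.0])     # true E[Y | X, T=1]
tau = float(np.sum(p_x * (mu1 - mu0)))   # true ATE

def expected_dr(mu0_hat, mu1_hat, pi_hat):
    """Exact E[pseudo-outcome] under the true law for fixed nuisance values."""
    treated = pi * (mu1 - mu1_hat) / pi_hat                # E[T (Y - mu1_hat) / pi_hat | X]
    control = (1 - pi) * (mu0 - mu0_hat) / (1 - pi_hat)    # E[(1 - T)(Y - mu0_hat) / (1 - pi_hat) | X]
    return float(np.sum(p_x * (mu1_hat - mu0_hat + treated - control)))

def product_bias(mu0_hat, mu1_hat, pi_hat):
    """Bias implied by the product-of-errors expression."""
    term1 = (pi_hat - pi) * (mu1_hat - mu1) / pi_hat
    term0 = (pi_hat - pi) * (mu0_hat - mu0) / (1 - pi_hat)
    return float(np.sum(p_x * (term1 + term0)))

# Misspecified nuisances (hypothetical).
mu0_bad, mu1_bad = mu0 + 0.5, mu1 - 1.0
pi_bad = np.array([0.4, 0.4, 0.6])

bias_both_wrong = expected_dr(mu0_bad, mu1_bad, pi_bad) - tau
bias_mu_right = expected_dr(mu0, mu1, pi_bad) - tau
bias_pi_right = expected_dr(mu0_bad, mu1_bad, pi) - tau
print(bias_both_wrong, product_bias(mu0_bad, mu1_bad, pi_bad))
print(bias_mu_right, bias_pi_right)
```

The first printed pair should agree, and the second pair should be zero up to floating-point error, matching the two robustness cases.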
Comparison: OR vs IPW vs DR
| Aspect | Outcome Regression | IPW | Doubly Robust |
|---|---|---|---|
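| Model needed | Outcome model $\mu$ | Propensity model $\pi$ | Both |
| Consistency | If $\hat\mu$ correct | If $\hat\pi$ correct | If either correct |
| Efficiency | Not efficient | Not efficient | Semiparametrically efficient (if both correct) |
| Variance | Low if $\hat\mu$ is good | High with extreme weights | Low when both are good; still inflated by extreme weights |
| With ML | Regularization bias | Variance issues | First-order insensitive to errors in both |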
Extensions
CATE Estimation
DR-Learner: regress the DR pseudo-outcome on the covariates $X$, i.e. $\hat\tau(x) = \hat{E}[\hat\varphi(Z) \mid X = x]$
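A minimal sketch of the second-stage regression follows. It makes several simplifying assumptions that are not from the source: the propensity score is treated as known, the outcome model is a fixed and slightly wrong function, the final regression is plain least squares, and the sample splitting normally used between the two stages is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.uniform(-1.0, 1.0, size=n)
pi = 0.5 + 0.2 * x                       # propensity, assumed known in this sketch
t = rng.binomial(1, pi)
mu0 = x
mu1 = x + 1.0 + 2.0 * x                  # true CATE: tau(x) = 1 + 2x
y = np.where(t == 1, mu1, mu0) + rng.normal(size=n)

# Stage 1 stand-ins: correct mu0, but mu1_hat misses the 2x interaction.
mu0_hat = x
mu1_hat = x + 1.0

# DR pseudo-outcome; its conditional mean given x is tau(x) because pi is correct.
phi = (mu1_hat - mu0_hat
       + t * (y - mu1_hat) / pi
       - (1 - t) * (y - mu0_hat) / (1 - pi))

# Stage 2: regress the pseudo-outcome on the covariates (here, linear in x).
design = np.column_stack([np.ones(n), x])
intercept, slope = np.linalg.lstsq(design, phi, rcond=None)[0]
print(f"estimated CATE: {intercept:.2f} + {slope:.2f} * x")
```

The fitted line should be close to the true CATE of 1 + 2x even though the outcome model is wrong, because the propensity score is correct.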
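ATT Estimation
One standard doubly robust form for the ATT needs only the control outcome model $\hat\mu_0$ and the propensity score, with $n_1 = \sum_i T_i$:
$$\hat\tau_{ATT} = \frac{1}{n_1}\sum_{i=1}^{n}\left[T_i - \frac{(1 - T_i)\,\hat\pi(X_i)}{1 - \hat\pi(X_i)}\right]\bigl(Y_i - \hat\mu_0(X_i)\bigr)$$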
Longitudinal Settings
Time-varying treatments, combined with g-computation
Related Concepts
- Pseudo-outcome - the core component of the DR estimator
- DR-Learner - DR extension for CATE
- Influence Function - the theoretical foundation of DR
- Neyman-Orthogonal Score - the orthogonality property
- Propensity Score - treatment assignment probability
- Double-Debiased ML - related framework
Historical Context
- Robins, Rotnitzky, Zhao (1994): introduced the augmented IPW estimator, the first estimator of doubly robust form
- Scharfstein, Rotnitzky, Robins (1999): connection to semiparametric theory
- Bang & Robins (2005): popularized the term “Doubly Robust Estimation”
- Chernozhukov et al. (2018): combination with ML (DML)
Implementation
Python (econml):
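```python
from econml.dr import LinearDRLearner

# Y: outcome, T: discrete treatment, X: effect modifiers, W: additional controls
dr = LinearDRLearner()
dr.fit(Y, T, X=X, W=W)
ate = dr.ate(X)  # average of the estimated CATE over the rows of X
```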
R (AIPW package):
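```r
library(AIPW)
library(SuperLearner)  # provides the SL.* learners named below

# Y: outcome, A: binary exposure, W: covariates
AIPW_SL <- AIPW$new(Y = Y, A = A, W = W,
                    Q.SL.library = c("SL.glm", "SL.ranger"),  # outcome model
                    g.SL.library = c("SL.glm", "SL.ranger"))  # propensity model
AIPW_SL$fit()
AIPW_SL$summary()
```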
References
- Robins, Rotnitzky, Zhao (1994) - Original DR estimator
- Kennedy (2023) - Optimal DR for CATE
- Chernozhukov et al. (2018) - DML framework
- Bang & Robins (2005) - “Doubly Robust Estimation in Missing Data and Causal Inference Models”