Doubly Robust Estimator

Definition

A doubly robust (DR) estimator combines an outcome regression model with a propensity score model, and it remains consistent as long as at least one of the two is correctly specified.

DR estimator of the ATE, with $Z = (X, A, Y)$: $\hat{\tau}_{DR} = \frac{1}{n}\sum_{i=1}^n \hat{\varphi}(Z_i)$

where the pseudo-outcome (the uncentered efficient influence function, evaluated at the estimated nuisances) is: $\hat{\varphi}(Z) = \underbrace{\hat{\mu}_1(X) - \hat{\mu}_0(X)}_{\text{outcome regression}} + \underbrace{\frac{A(Y - \hat{\mu}_1(X))}{\hat{\pi}(X)} - \frac{(1-A)(Y - \hat{\mu}_0(X))}{1-\hat{\pi}(X)}}_{\text{IPW augmentation}}$

or, in an equivalent form: $\hat{\varphi}(Z) = \hat{\mu}_1(X) - \hat{\mu}_0(X) + \frac{A - \hat{\pi}(X)}{\hat{\pi}(X)(1-\hat{\pi}(X))}(Y - A\hat{\mu}_1(X) - (1-A)\hat{\mu}_0(X))$
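The two forms of the pseudo-outcome can be checked against each other numerically. The following is a minimal NumPy sketch, not a reference implementation: the function names and the toy inputs are hypothetical, and the nuisance predictions are random placeholders rather than fitted models.

```python
import numpy as np

def dr_pseudo_outcomes(y, a, mu1_hat, mu0_hat, pi_hat):
    """Pseudo-outcome: outcome-regression contrast plus IPW-weighted residuals."""
    or_term = mu1_hat - mu0_hat
    ipw_aug = a * (y - mu1_hat) / pi_hat - (1 - a) * (y - mu0_hat) / (1 - pi_hat)
    return or_term + ipw_aug

def dr_pseudo_outcomes_alt(y, a, mu1_hat, mu0_hat, pi_hat):
    """Equivalent single-weight form of the same pseudo-outcome."""
    mu_obs_hat = a * mu1_hat + (1 - a) * mu0_hat  # prediction for the observed arm
    weight = (a - pi_hat) / (pi_hat * (1 - pi_hat))
    return mu1_hat - mu0_hat + weight * (y - mu_obs_hat)

# Toy inputs (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n = 1_000
a = rng.integers(0, 2, size=n)           # binary treatment indicator
y = rng.normal(size=n)                   # observed outcome
mu1_hat = rng.normal(size=n)             # placeholder predictions of E[Y | X, A=1]
mu0_hat = rng.normal(size=n)             # placeholder predictions of E[Y | X, A=0]
pi_hat = rng.uniform(0.1, 0.9, size=n)   # placeholder propensity scores

phi = dr_pseudo_outcomes(y, a, mu1_hat, mu0_hat, pi_hat)
phi_alt = dr_pseudo_outcomes_alt(y, a, mu1_hat, mu0_hat, pi_hat)

tau_dr = phi.mean()                      # DR estimate = sample mean of pseudo-outcomes
forms_agree = np.allclose(phi, phi_alt)  # both forms give the same values
```

In practice the three `*_hat` arrays would come from fitted outcome and propensity models evaluated at each $X_i$.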

Intuitive Understanding

Three estimation strategies, the third combining the first two:

Outcome Regression (OR): $\hat{\tau}_{OR} = \frac{1}{n}\sum_i [\hat{\mu}_1(X_i) - \hat{\mu}_0(X_i)]$
Inverse Propensity Weighting (IPW): $\hat{\tau}_{IPW} = \frac{1}{n}\sum_i \left[\frac{A_i Y_i}{\hat{\pi}(X_i)} - \frac{(1-A_i)Y_i}{1-\hat{\pi}(X_i)}\right]$
Doubly Robust: OR + IPW correction term

OR only:        biased if μ̂ is wrong
IPW only:       biased if π̂ is wrong
DR:             consistent if at least one of μ̂ and π̂ is correct

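Why “Doubly Robust”?

$\hat{\mu} = \mu_0$ (outcome model correct): the augmentation term has expectation zero
$\hat{\pi} = \pi_0$ (propensity model correct): the weighting is exact, so the bias of the outcome model cancels
Even if both are wrong, the bias is bounded by the product of the two errors: $O(||\hat{\mu} - \mu_0|| \cdot ||\hat{\pi} - \pi_0||)$

Here the subscript 0 on $\mu_0$ and $\pi_0$ marks the true nuisance function, not the control arm.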
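A small simulation can illustrate the robustness pattern described above. This is a sketch under loud assumptions: the data-generating process is made up, the "correct" nuisances are the true functions plugged in directly (an oracle shortcut instead of fitted models), and the "wrong" nuisances deliberately ignore $X$.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
tau_true = 2.0

# Simulated observational data: X confounds both treatment and outcome.
x = rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-x))          # P(A=1 | X)
a = rng.binomial(1, pi_true)
y = x + tau_true * a + rng.normal(size=n)   # E[Y | X, A] = X + 2A

# "Correct" nuisances: the true functions (oracle shortcut for illustration).
mu1_ok, mu0_ok, pi_ok = x + tau_true, x, pi_true
# "Wrong" nuisances: constants that ignore X entirely.
mu1_bad = np.full(n, y[a == 1].mean())
mu0_bad = np.full(n, y[a == 0].mean())
pi_bad = np.full(n, a.mean())

def or_estimate(mu1, mu0):
    return np.mean(mu1 - mu0)

def ipw_estimate(pi):
    return np.mean(a * y / pi - (1 - a) * y / (1 - pi))

def dr_estimate(mu1, mu0, pi):
    phi = mu1 - mu0 + a * (y - mu1) / pi - (1 - a) * (y - mu0) / (1 - pi)
    return np.mean(phi)

estimates = {
    "OR, mu wrong": or_estimate(mu1_bad, mu0_bad),
    "IPW, pi wrong": ipw_estimate(pi_bad),
    "DR, mu wrong / pi correct": dr_estimate(mu1_bad, mu0_bad, pi_ok),
    "DR, mu correct / pi wrong": dr_estimate(mu1_ok, mu0_ok, pi_bad),
    "DR, both wrong": dr_estimate(mu1_bad, mu0_bad, pi_bad),
}
for name, value in estimates.items():
    print(f"{name:28s} {value:.3f}   (truth: {tau_true})")
```

In this design the two DR estimates with one correct nuisance should land close to the true effect of 2, while the misspecified OR and IPW estimates, and DR with both nuisances wrong, keep the confounding bias.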

Key Properties

Double Robustness Property

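Theorem: Under the usual identification assumptions (unconfoundedness and overlap), $\hat{\tau}_{DR}$ is consistent if at least one of the following two conditions holds:

The outcome model is correctly specified: $\hat{\mu}_a(x) \xrightarrow{p} E[Y|X=x, A=a]$ for $a \in \{0, 1\}$
The propensity model is correctly specified: $\hat{\pi}(x) \xrightarrow{p} P(A=1|X=x)$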

Semiparametric Efficiency

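When both nuisance models are consistently estimated, the DR estimator is semiparametrically efficient:

It is built on the efficient influence function
It attains the semiparametric efficiency bound
It has the lowest asymptotic variance among regular estimators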

$\text{Var}(\hat{\tau}_{DR}) = \frac{1}{n}E[\varphi(Z; \tau_0, \eta_0)^2] + o(n^{-1})$
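The variance expression suggests a plug-in standard error: the sample variance of the pseudo-outcomes divided by $n$. The sketch below assumes the pseudo-outcomes have already been computed and that the normal approximation is adequate; `dr_ate_with_ci` is a hypothetical helper name, and the input values are toy numbers.

```python
import numpy as np

def dr_ate_with_ci(phi, z=1.96):
    """Point estimate, standard error, and Wald interval from pseudo-outcomes.

    Subtracting the sample mean inside the standard deviation centers the
    pseudo-outcomes at tau_hat, which mirrors E[phi^2] for the centered
    influence function.
    """
    phi = np.asarray(phi, dtype=float)
    n = phi.size
    tau_hat = phi.mean()
    se = phi.std(ddof=1) / np.sqrt(n)
    return tau_hat, se, (tau_hat - z * se, tau_hat + z * se)

# Toy pseudo-outcomes (hypothetical values).
tau_hat, se, (lower, upper) = dr_ate_with_ci([1.0, 2.0, 3.0, 4.0, 5.0])
```

The default `z=1.96` corresponds to a nominal 95% interval.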

Rate Doubly Robust

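$\hat{\tau}_{DR}$ is $\sqrt{n}$-consistent under the product rate condition: $||\hat{\mu} - \mu_0|| \cdot ||\hat{\pi} - \pi_0|| = o_P(n^{-1/2})$

Example: it suffices that each nuisance estimator converges faster than $n^{-1/4}$, i.e. at rate $o_P(n^{-1/4})$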
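As a worked check of the product rate condition, suppose (purely as an illustrative assumption) that the two nuisance errors shrink polynomially with hypothetical exponents $\alpha, \beta > 0$:

$||\hat{\mu} - \mu_0|| = O_P(n^{-\alpha}), \quad ||\hat{\pi} - \pi_0|| = O_P(n^{-\beta}) \;\Rightarrow\; ||\hat{\mu} - \mu_0|| \cdot ||\hat{\pi} - \pi_0|| = O_P(n^{-(\alpha+\beta)})$

The product is $o_P(n^{-1/2})$ whenever $\alpha + \beta > 1/2$. For instance, $\alpha = \beta = 0.3$ gives $n^{-0.6}$; at $n = 10^4$ this is $10^{-2.4} \approx 0.004$, compared with $n^{-1/2} = 0.01$. The condition also allows a trade-off: if one nuisance converges at the parametric rate ($\beta = 1/2$), any $\alpha > 0$ is enough for the other.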

Mathematical Derivation

Efficient Influence Function

Efficient influence function of the ATE $\tau = E[Y(1) - Y(0)]$, with nuisance $\eta = (\mu_0, \mu_1, \pi)$: $\varphi(Z; \tau, \eta) = \mu_1(X) - \mu_0(X) - \tau + \frac{A(Y - \mu_1(X))}{\pi(X)} - \frac{(1-A)(Y - \mu_0(X))}{1 - \pi(X)}$

Properties:

$E[\varphi(Z; \tau_0, \eta_0)] = 0$
$E[\varphi(Z; \tau_0, \eta_0)^2] =$ semiparametric variance bound
Neyman orthogonal: $\partial_\eta E[\varphi]|_{\eta_0} = 0$
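The mean-zero and orthogonality properties can be checked numerically on a toy design where expectations are exact sums. The sketch below assumes a hypothetical binary covariate with made-up true nuisances. It approximates the directional derivative of $E[\varphi]$ by a central finite difference and contrasts it with the plug-in score $\mu_1(X) - \mu_0(X) - \tau$, which is not orthogonal.

```python
import numpy as np

# Hypothetical discrete design: X takes two values with equal probability.
p_x = np.array([0.5, 0.5])
pi0 = np.array([0.3, 0.7])    # true P(A=1 | X)
mu1 = np.array([2.0, 3.0])    # true E[Y | X, A=1]
mu0 = np.array([1.0, 1.0])    # true E[Y | X, A=0]
tau0 = float(np.sum(p_x * (mu1 - mu0)))

def mean_eif(m1, m0, pi):
    """Exact E[phi(Z; tau0, eta)] at nuisance values eta = (m1, m0, pi)."""
    cond = (m1 - m0 - tau0
            + pi0 * (mu1 - m1) / pi
            - (1 - pi0) * (mu0 - m0) / (1 - pi))
    return float(np.sum(p_x * cond))

def mean_plugin(m1, m0):
    """Exact E[m1(X) - m0(X) - tau0], the non-orthogonal plug-in score."""
    return float(np.sum(p_x * (m1 - m0 - tau0)))

# Arbitrary perturbation directions for (mu1, mu0, pi).
h1 = np.array([0.4, -0.2])
h0 = np.array([0.3, 0.1])
hp = np.array([0.1, -0.2])
t = 1e-4  # finite-difference step

def directional_derivative(f):
    """Central difference of s -> f(s) at s = 0."""
    return (f(t) - f(-t)) / (2 * t)

mean_at_truth = mean_eif(mu1, mu0, pi0)
d_eif = directional_derivative(
    lambda s: mean_eif(mu1 + s * h1, mu0 + s * h0, pi0 + s * hp))
d_plugin = directional_derivative(
    lambda s: mean_plugin(mu1 + s * h1, mu0 + s * h0))
```

Here `mean_at_truth` and `d_eif` should be zero up to numerical error, whereas `d_plugin` equals the average of `h1 - h0` and is nonzero, so first-order nuisance errors pass straight into the plug-in estimate.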

Bias Analysis

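Treating the fitted nuisances as fixed, and writing $\mu_a$ and $\pi_0$ for the true functions: $E[\hat{\tau}_{DR}] - \tau_0 = E\left[\frac{(\hat{\pi} - \pi_0)(\hat{\mu}_1 - \mu_1)}{\hat{\pi}} + \frac{(\hat{\pi} - \pi_0)(\hat{\mu}_0 - \mu_0)}{1-\hat{\pi}}\right]$

→ Every term is a product of a propensity error and an outcome error, so the bias vanishes when either model is correct.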
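The product structure of the bias can be seen by computing $E[\hat{\varphi}] - \tau_0$ exactly on a small discrete design. The sketch below uses a hypothetical binary covariate with made-up true nuisances and error vectors. It evaluates the bias directly from the definition of the pseudo-outcome, without relying on a closed-form bias expression.

```python
import numpy as np

# Hypothetical discrete design: X takes two values with equal probability.
p_x = np.array([0.5, 0.5])
pi0 = np.array([0.3, 0.7])    # true P(A=1 | X)
mu1 = np.array([2.0, 3.0])    # true E[Y | X, A=1]
mu0 = np.array([1.0, 1.0])    # true E[Y | X, A=0]

def dr_bias(d_mu1, d_mu0, d_pi):
    """Exact E[phi_hat] - tau0 for nuisance errors (d_mu1, d_mu0, d_pi)."""
    m1, m0, pi = mu1 + d_mu1, mu0 + d_mu0, pi0 + d_pi
    # Conditional mean of the pseudo-outcome given X, using
    # E[A (Y - m1) | X] = pi0 * (mu1 - m1), and similarly for the control arm.
    cond_mean = (m1 - m0
                 + pi0 * (mu1 - m1) / pi
                 - (1 - pi0) * (mu0 - m0) / (1 - pi))
    return float(np.sum(p_x * (cond_mean - (mu1 - mu0))))

zero = np.zeros(2)
e_mu1 = np.array([0.4, -0.2])   # made-up outcome-model errors
e_mu0 = np.array([0.3, 0.1])
e_pi = np.array([0.1, -0.2])    # made-up propensity errors

bias_mu_correct = dr_bias(zero, zero, e_pi)      # only pi is wrong
bias_pi_correct = dr_bias(e_mu1, e_mu0, zero)    # only mu is wrong
bias_both_wrong = dr_bias(e_mu1, e_mu0, e_pi)    # both wrong
bias_half_errors = dr_bias(e_mu1 / 2, e_mu0 / 2, e_pi / 2)
```

The first two biases are zero up to rounding, and halving both errors shrinks the bias by roughly a factor of four (not exactly four, because $\hat{\pi}$ also appears in the denominators), which is the signature of a second-order, product-type bias.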

Comparison: OR vs IPW vs DR

Aspect	Outcome Regression	IPW	Doubly Robust
Model needed	$\mu_a(x)$	$\pi(x)$	Both
Consistency	If $\hat{\mu}$ correct	If $\hat{\pi}$ correct	If either correct
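Efficiency	Not efficient	Not efficient	Semiparametrically efficient when both models are correct
Variance	Low if $\hat{\mu}$ good	High with extreme $\hat{\pi}$	Attains the efficiency bound when both are good; still sensitive to extreme $\hat{\pi}$
With ML	Regularization bias	Variance issues	First-order insensitive to errors in either nuisance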

Related Concepts

Pseudo-outcome - Core building block of the DR estimator
DR-Learner - DR extension for CATE
Influence Function - Theoretical foundation of DR
Neyman-Orthogonal Score - Orthogonality property
Propensity Score - Treatment assignment probability
Double-Debiased ML - Related framework

Historical Context

Robins, Rotnitzky, Zhao (1994): First proposed the estimator that later became known as doubly robust
Scharfstein, Rotnitzky, Robins (1999): Connection to semiparametric theory
Bang & Robins (2005): Established the “Doubly Robust Estimation” name
Chernozhukov et al. (2018): Combination with ML (DML)

Implementation

Python (econml):

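from econml.dr import LinearDRLearner

# Y: outcome, T: discrete treatment, X: effect modifiers, W: additional controls
dr = LinearDRLearner()
dr.fit(Y, T, X=X, W=W)
ate = dr.ate(X)  # average of the estimated effects over the rows of X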

R (AIPW package):

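library(AIPW)
library(SuperLearner)  # provides the SL.* learner wrappers; SL.ranger also needs the ranger package
# Y: outcome, A: binary exposure, W: covariates
AIPW_SL <- AIPW$new(Y = Y, A = A, W = W,
                    Q.SL.library = c("SL.glm", "SL.ranger"),  # outcome model learners
                    g.SL.library = c("SL.glm", "SL.ranger"))  # propensity model learners
AIPW_SL$fit()
AIPW_SL$summary()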

References

Robins, Rotnitzky, Zhao (1994) - Original DR estimator
Bang & Robins (2005) - “Doubly Robust Estimation in Missing Data and Causal Inference Models”
Chernozhukov et al. (2018) [chernozhukovDoubleDebiasedMachine2018] - DML framework
Kennedy (2023) [kennedyOptimalDoublyRobust2023] - Optimal DR for CATE