Tae Hyun Kim (Lowell)

DR-Learner

Definition

DR-Learner is a two-stage doubly robust estimator of the CATE: it regresses a pseudo-outcome on the covariates.

Stage 1: Nuisance estimation

  • Propensity score: $\hat{\pi}(x)$, an estimate of $P(A = 1 \mid X = x)$
  • Outcome regression: $\hat{\mu}_a(x)$, an estimate of $E[Y \mid X = x, A = a]$ for $a \in \{0, 1\}$

Stage 2: Pseudo-outcome regression: $\hat{\tau}_{DR}(x) = \hat{E}_n[\hat{\varphi}(Z) \mid X = x]$

Here the pseudo-outcome is: $\hat{\varphi}(Z) = \underbrace{\hat{\mu}_1(X) - \hat{\mu}_0(X)}_{\text{plug-in}} + \underbrace{\frac{A - \hat{\pi}(X)}{\hat{\pi}(X)(1-\hat{\pi}(X))}\left(Y - A\hat{\mu}_1(X) - (1-A)\hat{\mu}_0(X)\right)}_{\text{augmentation/correction}}$
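As a minimal sketch of the formula above, the pseudo-outcome can be computed elementwise with NumPy once nuisance estimates are available. The function name `dr_pseudo_outcome` and the clipping threshold `eps` are illustrative assumptions, not part of the definition.

```python
import numpy as np

def dr_pseudo_outcome(y, a, pi_hat, mu1_hat, mu0_hat, eps=1e-3):
    """Doubly robust pseudo-outcome for a binary treatment a in {0, 1}."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    mu1 = np.asarray(mu1_hat, dtype=float)
    mu0 = np.asarray(mu0_hat, dtype=float)
    # Assumption: clip propensities away from 0 and 1 so the weights stay finite.
    pi = np.clip(np.asarray(pi_hat, dtype=float), eps, 1 - eps)

    plug_in = mu1 - mu0                              # mu1_hat(X) - mu0_hat(X)
    residual = y - a * mu1 - (1 - a) * mu0           # Y minus the fitted mean of the observed arm
    correction = (a - pi) / (pi * (1 - pi)) * residual
    return plug_in + correction

# Toy inputs: one treated and one control observation.
phi = dr_pseudo_outcome(
    y=[3.0, 1.0], a=[1, 0],
    pi_hat=[0.5, 0.5], mu1_hat=[2.0, 2.0], mu0_hat=[1.0, 1.0],
)
```

For the treated unit the residual is 1 and the weight is 1/0.5, so the correction adds 2 to the plug-in value of 1; for the control unit the residual is 0 and the pseudo-outcome equals the plug-in value.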

Intuitive Understanding

Key idea:

  1. Compute the doubly robust pseudo-outcome (the uncentered efficient influence function of the ATE)
  2. Smooth/regress this pseudo-outcome on $X$
  3. Exploit the structure of the CATE (smoothness, sparsity) separately from that of the nuisance functions

Step 1:  Estimate π̂(x), μ̂₁(x), μ̂₀(x) using any ML method

Step 2:  Compute pseudo-outcome φ̂(Z) for each observation

Step 3:  Regress φ̂ on X to get τ̂(x)

Why “Doubly Robust”?

  • If either $\hat{\pi}$ or $\hat{\mu}$ is correct, the bias vanishes
  • Even if both are wrong, only a product of errors remains: $O(\|\hat{\pi} - \pi_0\| \cdot \|\hat{\mu} - \mu_0\|)$
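The two bullets above can be checked numerically. The simulation below is a sketch under assumed settings (a scalar covariate, a logistic propensity, a true CATE of 1 + x so that the ATE is 1). It plugs in the true nuisances or deliberately wrong ones and averages the pseudo-outcome. All names and the data-generating process are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1, 1, n)
pi0 = 1 / (1 + np.exp(-x))        # true propensity score
mu0_true = x                      # true control outcome regression
mu1_true = 2 * x + 1              # true treated outcome regression; CATE = 1 + x, ATE = 1
a = rng.binomial(1, pi0)
y = np.where(a == 1, mu1_true, mu0_true) + rng.normal(size=n)

def pseudo_outcome(pi, m1, m0):
    # Same formula as the DR pseudo-outcome, with arbitrary plugged-in nuisances.
    return m1 - m0 + (a - pi) / (pi * (1 - pi)) * (y - a * m1 - (1 - a) * m0)

zeros = np.zeros(n)               # deliberately wrong outcome model
half = np.full(n, 0.5)            # deliberately wrong propensity model

ate_right_pi = pseudo_outcome(pi0, zeros, zeros).mean()          # wrong mu, right pi
ate_right_mu = pseudo_outcome(half, mu1_true, mu0_true).mean()   # right mu, wrong pi
ate_both_wrong = pseudo_outcome(half, zeros, zeros).mean()       # both wrong

print(ate_right_pi, ate_right_mu, ate_both_wrong)
```

With one nuisance correct, the average should land close to the true ATE of 1. With both wrong, the remaining bias is driven by the product of the two errors and does not vanish in this design.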

Key Properties

Double Robustness

The bias term depends only on the product of the propensity score and outcome regression errors: $\text{Bias} = O(\|\hat{\pi} - \pi_0\| \cdot \|\hat{\mu} - \mu_0\|)$
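To see where the product structure comes from, the short derivation below treats the nuisance estimates as fixed functions (as they are under sample splitting) and takes the conditional expectation of the pseudo-outcome. The notation is an assumption of this sketch: $\pi_0$ and $\mu_a$ denote the true propensity score and outcome regressions, and $\tau = \mu_1 - \mu_0$; arguments $(X)$ are dropped in the last two lines.

```latex
\begin{aligned}
\hat{\varphi}(Z)
  &= \hat{\mu}_1(X) - \hat{\mu}_0(X)
     + \frac{A}{\hat{\pi}(X)}\bigl\{Y - \hat{\mu}_1(X)\bigr\}
     - \frac{1-A}{1-\hat{\pi}(X)}\bigl\{Y - \hat{\mu}_0(X)\bigr\}, \\[4pt]
E\bigl[\hat{\varphi}(Z) \mid X\bigr]
  &= \hat{\mu}_1 - \hat{\mu}_0
     + \frac{\pi_0}{\hat{\pi}}\,(\mu_1 - \hat{\mu}_1)
     - \frac{1-\pi_0}{1-\hat{\pi}}\,(\mu_0 - \hat{\mu}_0), \\[4pt]
E\bigl[\hat{\varphi}(Z) \mid X\bigr] - \tau
  &= \frac{\pi_0 - \hat{\pi}}{\hat{\pi}}\,(\mu_1 - \hat{\mu}_1)
     + \frac{\pi_0 - \hat{\pi}}{1-\hat{\pi}}\,(\mu_0 - \hat{\mu}_0).
\end{aligned}
```

Each remaining term multiplies a propensity error by an outcome-regression error, so the conditional bias is zero when either nuisance is exactly right and second-order small when both are only approximately right.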

Rate Adaptation

  • Adapts to the smoothness $\gamma$ of the CATE
  • Decoupled from the smoothness $\alpha, \beta$ of the individual nuisance functions
  • Can achieve a faster rate than the plug-in estimator

Oracle Efficiency

The oracle rate is achieved under the following condition: $\sqrt{\alpha\beta} \geq \frac{d/2}{\sqrt{1 + \frac{d}{\gamma}}\sqrt{1 + \frac{d}{2s}}}$

where:

  • $\alpha$: propensity score smoothness
  • $\beta$: outcome regression smoothness
  • $\gamma$: CATE smoothness
  • $s$: harmonic mean smoothness
  • $d$: covariate dimension
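As a small arithmetic aid, the inequality above can be evaluated directly for given smoothness values. The helper below only transcribes the condition as written in this note; the function name is hypothetical, and `s` is passed in explicitly because the note does not spell out which quantities its harmonic mean is taken over.

```python
import math

def oracle_condition_holds(alpha, beta, gamma, s, d):
    """Evaluate sqrt(alpha*beta) >= (d/2) / (sqrt(1 + d/gamma) * sqrt(1 + d/(2s)))."""
    lhs = math.sqrt(alpha * beta)
    rhs = (d / 2) / (math.sqrt(1 + d / gamma) * math.sqrt(1 + d / (2 * s)))
    return lhs >= rhs, lhs, rhs

# Smooth nuisances in moderate dimension: the left-hand side is comfortably larger.
ok_smooth, lhs_smooth, rhs_smooth = oracle_condition_holds(alpha=2, beta=2, gamma=2, s=2, d=4)

# Rough nuisances in higher dimension: the left-hand side falls short.
ok_rough, lhs_rough, rhs_rough = oracle_condition_holds(alpha=0.5, beta=0.5, gamma=4, s=0.5, d=10)
```

Holding everything else fixed, the right-hand side grows with $d$, so the condition becomes harder to satisfy as the covariate dimension increases.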

Algorithm

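# DR-Learner algorithm (scikit-learn sketch; any ML method can replace the random forests)
# Assumes X, A, Y are NumPy arrays and A is coded 0/1.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

def dr_learner(X, A, Y, n_folds=5, eps=0.01):
    n = len(Y)
    pi_hat, mu1_hat, mu0_hat = np.zeros(n), np.zeros(n), np.zeros(n)

    # Stage 1: cross-fitted nuisance estimation (fit on training folds, predict on the held-out fold)
    for train, test in KFold(n_folds, shuffle=True, random_state=0).split(X):
        X_tr, A_tr, Y_tr = X[train], A[train], Y[train]
        pi_hat[test] = RandomForestClassifier().fit(X_tr, A_tr).predict_proba(X[test])[:, 1]
        # Fit each outcome model on one arm only, but predict for every held-out unit
        mu1_hat[test] = RandomForestRegressor().fit(X_tr[A_tr == 1], Y_tr[A_tr == 1]).predict(X[test])
        mu0_hat[test] = RandomForestRegressor().fit(X_tr[A_tr == 0], Y_tr[A_tr == 0]).predict(X[test])
    pi_hat = np.clip(pi_hat, eps, 1 - eps)  # keep the inverse-propensity weights bounded

    # Compute pseudo-outcomes
    phi_hat = (mu1_hat - mu0_hat) + \
              (A - pi_hat) / (pi_hat * (1 - pi_hat)) * \
              (Y - A * mu1_hat - (1 - A) * mu0_hat)

    # Stage 2: regress pseudo-outcome on X
    tau_hat = RandomForestRegressor().fit(X, phi_hat)

    return tau_hat  # fitted model; tau_hat.predict(X_new) gives CATE estimates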

Comparison with Other Learners

| Method | Key Idea | Pros | Cons |
|---|---|---|---|
| T-Learner | Separate models per treatment | Simple | No sharing across groups |
| S-Learner | Single model with A as feature | Shares info | May miss heterogeneity |
| X-Learner | Two-stage imputation | Good for imbalance | Complex |
| R-Learner | Residualize then regress | Orthogonality | Requires product rate |
| DR-Learner | DR pseudo-outcome regression | Double robustness, rate adaptation | Stability condition needed |

Theoretical Guarantee

Main Error Bound (Theorem 2): $\hat{\tau}_{DR}(x) - \tilde{\tau}(x) = \hat{E}_n[\hat{b}(X) \mid X = x] + o_P\left(\sqrt{R_n^*(x)}\right)$

where:

  • $\tilde{\tau}(x)$: oracle estimator (uses the true pseudo-outcome)
  • $\hat{b}(x)$: bias from nuisance estimation
  • $R_n^*(x)$: oracle variance

Stability condition required: the second-stage regression estimator must be stable under perturbations of its input.
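One family of second-stage estimators for which this kind of stability is usually straightforward to argue is linear smoothers, whose fitted values are a weighted sum of the inputs. That claim is an assumption of this sketch, not something the note states. The Nadaraya-Watson smoother below is an illustration with a hypothetical function name, a scalar covariate, and an arbitrary bandwidth.

```python
import numpy as np

def nw_second_stage(x_train, phi_hat, x_eval, bandwidth=0.2):
    """Nadaraya-Watson regression of pseudo-outcomes on a scalar covariate."""
    x_train, phi_hat, x_eval = (
        np.asarray(v, dtype=float) for v in (x_train, phi_hat, x_eval)
    )
    # Gaussian kernel weights: rows index evaluation points, columns index training points.
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    w /= w.sum(axis=1, keepdims=True)   # each row of weights sums to one
    return w @ phi_hat                  # linear in phi_hat

# Perturbing the pseudo-outcomes by delta shifts the fit by exactly the smoothed delta.
x = np.linspace(0.0, 1.0, 50)
phi = np.sin(2 * np.pi * x)
delta = 0.1 * np.cos(x)
fit_perturbed = nw_second_stage(x, phi + delta, x)
fit_plus_smoothed_delta = nw_second_stage(x, phi, x) + nw_second_stage(x, delta, x)
```

Because the weights depend only on the covariates, an error in the pseudo-outcomes passes through the smoother as a weighted average of that error, which is the form of the $\hat{E}_n[\hat{b}(X) \mid X = x]$ term in the bound.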

Related Concepts

  • Pseudo-outcome - core component of the DR-Learner
  • Doubly Robust Estimator - theoretical foundation
  • CATE - estimation target
  • Oracle Efficiency - theoretical goal
  • Cross-fitting - guards against overfitting
  • R-Learner - related method

Comparison: DR-Learner vs R-Learner

| Aspect | DR-Learner | R-Learner |
|---|---|---|
| Pseudo-outcome | $\hat{\mu}_1 - \hat{\mu}_0 + \text{correction}$ | $(Y - \hat{\mu})(A - \hat{\pi})/\text{var}$ |
| Rate condition | Product rate | Product rate |
| Oracle condition | $\sqrt{\alpha\beta} \geq \ldots$ | Weaker for lp-R-Learner |
| Implementation | Simpler | More complex (lp version) |

Applications

  • Medicine: Heterogeneous treatment effects in clinical trials
  • Policy: Subgroup-specific policy effects
  • Marketing: Personalized treatment response
  • Social Science: Causal effect heterogeneity

Implementation

Python (econml):

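from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from econml.dr import DRLearner

# Y: outcome, T: treatment, X: effect modifiers, W: additional controls (optional)
dr = DRLearner(model_propensity=LogisticRegression(),
               model_regression=RandomForestRegressor(),
               model_final=RandomForestRegressor())
dr.fit(Y, T, X=X, W=W)
cate = dr.effect(X_test)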

R (grf):

library(grf)
# grf's causal_forest has similar doubly robust properties
cf <- causal_forest(X, Y, W)
tau_hat <- predict(cf)$predictions

References

  • kennedyOptimalDoublyRobust2023 - DR-Learner theory and oracle efficiency
  • chernozhukovDoubleDebiasedMachine2018 - DML framework
  • nieQuasiOracleEstimationHeterogeneous2020 - R-Learner
