IPW (Inverse Propensity Weighting)
Definition
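IPW estimates treatment effects by weighting each unit by the inverse of its propensity score:
$$\hat{\tau}_{IPW} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{T_i Y_i}{e(X_i)} - \frac{(1 - T_i) Y_i}{1 - e(X_i)} \right]$$
Here, $e(X) = P(T = 1 \mid X)$ is the Propensity Score.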
Intuitive Understanding
Why inverse weighting?
“More weight to underrepresented samples”
| Situation | Treatment probability | Weight | Meaning |
|---|---|---|---|
| Treatment rare | $e(X) = 0.1$ | $1/0.1 = 10$ | This person represents 10 individuals |
| Treatment common | $e(X) = 0.9$ | $1/0.9 \approx 1.1$ | Nearly 1:1 representation |
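The weights follow directly from the propensity. A minimal sketch, taking 0.1 and 0.9 as illustrative treatment probabilities for treated units (the helper name `ipw_weight` is hypothetical, not from any library):

```python
def ipw_weight(treated: bool, propensity: float) -> float:
    """Return 1/e for a treated unit and 1/(1 - e) for a control unit."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity must lie strictly between 0 and 1")
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


# A treated unit from a group where treatment is rare stands in for ~10 units.
rare = ipw_weight(True, 0.1)
# A treated unit from a group where treatment is common counts roughly once.
common = ipw_weight(True, 0.9)
print(rare, round(common, 2))
```

The same function gives control units the mirror-image weight, so a control unit with propensity 0.9 is the rare case and receives the large weight.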
Resampling Perspective
IPW is equivalent to:
- “Replicating” each sample according to its weight
- Creating a hypothetical pseudo-RCT (pseudo-randomized experiment)
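The pseudo-RCT idea can be checked numerically: after weighting, a confounder should be distributed the same way in both arms. The sketch below uses simulated data with a single binary covariate and assumes the true propensity is known; all names and numbers are illustrative.

```python
import random

random.seed(0)

# Hypothetical data: binary covariate X raises the chance of treatment.
# The true propensity is assumed known here to keep the sketch short.
rows = []
for _ in range(20000):
    x = 1 if random.random() < 0.5 else 0
    e = 0.8 if x == 1 else 0.2
    t = 1 if random.random() < e else 0
    rows.append((x, t, e))


def covariate_mean(rows, arm, weighted):
    """Mean of X in one arm, optionally in the IPW pseudo-population."""
    num = den = 0.0
    for x, t, e in rows:
        if t != arm:
            continue
        w = 1.0
        if weighted:
            w = 1.0 / e if t == 1 else 1.0 / (1.0 - e)
        num += w * x
        den += w
    return num / den


raw_gap = covariate_mean(rows, 1, False) - covariate_mean(rows, 0, False)
weighted_gap = covariate_mean(rows, 1, True) - covariate_mean(rows, 0, True)
print(f"raw gap in X: {raw_gap:.3f}, weighted gap in X: {weighted_gap:.3f}")
```

In the raw sample the treated arm contains far more units with X = 1; in the weighted sample the gap should be close to zero, which is what a randomized experiment would produce.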
Mathematical Derivation
ATE Identification
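Under Strong Ignorability (unconfoundedness, $(Y(1), Y(0)) \perp T \mid X$, and overlap, $0 < e(X) < 1$):
$$E\left[\frac{T Y}{e(X)}\right] = E\left[\frac{E[T \mid X]\, E[Y(1) \mid X]}{e(X)}\right] = E[Y(1)], \qquad E\left[\frac{(1 - T) Y}{1 - e(X)}\right] = E[Y(0)]$$
Therefore:
$$\tau_{ATE} = E[Y(1)] - E[Y(0)] = E\left[\frac{T Y}{e(X)} - \frac{(1 - T) Y}{1 - e(X)}\right]$$
Sample Estimator
$$\hat{\tau}_{IPW} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{T_i Y_i}{\hat{e}(X_i)} - \frac{(1 - T_i) Y_i}{1 - \hat{e}(X_i)} \right]$$
where $\hat{e}(X_i)$ is the estimated propensity score.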
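A minimal sketch of the sample estimator on simulated data, where a binary covariate confounds treatment and outcome and the true ATE is set to 2. The data-generating process, the function names, and the use of the true propensity in place of an estimate are all assumptions made for illustration.

```python
import random

random.seed(1)

# Hypothetical simulation: X confounds treatment and outcome; true ATE is 2.
data = []
for _ in range(20000):
    x = 1 if random.random() < 0.5 else 0
    e = 0.8 if x == 1 else 0.2          # true propensity, assumed known
    t = 1 if random.random() < e else 0
    y = 2.0 * t + 3.0 * x + random.gauss(0.0, 1.0)
    data.append((t, y, e))


def ipw_ate(data):
    """Horvitz-Thompson style IPW estimate of the ATE."""
    total = 0.0
    for t, y, e in data:
        total += t * y / e - (1 - t) * y / (1.0 - e)
    return total / len(data)


def naive_difference(data):
    """Unweighted difference in mean outcomes between arms."""
    treated = [y for t, y, _ in data if t == 1]
    control = [y for t, y, _ in data if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


naive = naive_difference(data)
ipw = ipw_ate(data)
print(f"naive: {naive:.2f}, IPW: {ipw:.2f}")
```

The naive difference absorbs the effect of X and overshoots, while the weighted estimate should land near the simulated effect of 2.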
Normalized Version
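More stable when the PS is estimated (the Hájek estimator):
$$\hat{\tau}_{norm} = \frac{\sum_{i} T_i Y_i / \hat{e}(X_i)}{\sum_{i} T_i / \hat{e}(X_i)} - \frac{\sum_{i} (1 - T_i) Y_i / (1 - \hat{e}(X_i))}{\sum_{i} (1 - T_i) / (1 - \hat{e}(X_i))}$$
Advantages: the weights sum to 1 within each arm, which typically reduces variance and keeps each arm's weighted mean within the range of the observed outcomes. The small finite-sample bias this introduces vanishes as $n$ grows.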
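One concrete way to see the stability difference: adding a constant to every outcome should not change a treatment effect. The normalized estimator respects this exactly, while the unnormalized one generally does not. A small sketch with made-up records (function names are hypothetical):

```python
def ht_ate(data):
    """Unnormalized IPW: divide the weighted sums by n."""
    n = len(data)
    return sum(t * y / e - (1 - t) * y / (1.0 - e) for t, y, e in data) / n


def hajek_ate(data):
    """Normalized IPW: divide each arm's weighted sum by its total weight."""
    num1 = sum(t * y / e for t, y, e in data)
    den1 = sum(t / e for t, y, e in data)
    num0 = sum((1 - t) * y / (1.0 - e) for t, y, e in data)
    den0 = sum((1 - t) / (1.0 - e) for t, y, e in data)
    return num1 / den1 - num0 / den0


# Toy records (treatment, outcome, propensity); values are made up.
data = [(1, 5.0, 0.8), (1, 6.0, 0.4), (0, 3.0, 0.5), (0, 1.0, 0.2)]
# Adding a constant to every outcome should not change a treatment effect.
shifted = [(t, y + 100.0, e) for t, y, e in data]

print(ht_ate(data), ht_ate(shifted))        # moves with the shift
print(hajek_ate(data), hajek_ate(shifted))  # unaffected by the shift
```

The unnormalized estimate moves because the treated and control weights do not sum to the same total in a finite sample; normalization removes that dependence.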
IPW for ATT
See ATT
Pros and Cons
Advantages
| Advantage | Description |
|---|---|
| Simple | Intuitive and easy to implement |
| Nonparametric | No outcome-model assumptions required |
| Theoretical justification | Consistent for the ATE when the PS model is correctly specified and strong ignorability holds |
| Flexibility | Applicable to a variety of estimands |
Disadvantages
| Disadvantage | Description |
|---|---|
| Dependence on PS estimation | Biased when the PS is misspecified |
| Sensitivity to extreme PS | Unstable when $e(X) \approx 0$ or $e(X) \approx 1$ |
| High variance | Especially when overlap is weak |
| Difficult in high dimensions | The PS model is hard to specify and estimate when there are many covariates |
Extreme PS Problem
Problem
When $e(X) \approx 0$ or $e(X) \approx 1$:
- Weights explode: $1/e(X) \to \infty$ or $1/(1 - e(X)) \to \infty$
- Estimator becomes unstable
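To make the instability concrete, the sketch below shows how a single near-zero propensity can dominate the weighted sample. It uses the Kish effective sample size, $(\sum w)^2 / \sum w^2$, as a diagnostic; that diagnostic is an addition for illustration and the data are made up.

```python
def kish_ess(weights):
    """Kish effective sample size: (sum w)^2 / sum w^2."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# 99 treated units with e = 0.5 and one treated unit with e = 0.001.
propensities = [0.5] * 99 + [0.001]
weights = [1.0 / e for e in propensities]

largest_share = max(weights) / sum(weights)
print(f"largest weight: {max(weights):.0f}")
print(f"share of total weight held by one unit: {largest_share:.2f}")
print(f"effective sample size: {kish_ess(weights):.1f} of {len(weights)}")
```

With equal weights the effective sample size equals the number of units; here one unit carries most of the weight, so the estimate behaves as if it were based on only a handful of observations.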
Solutions
- Trimming: remove samples with extreme PS
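- Overlap Weighting: use bounded weights ($1 - e(X)$ for treated units, $e(X)$ for controls); this targets the overlap population rather than the ATE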
- Weight clipping: set an upper bound on weights
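The three remedies can be sketched as small helper functions. The thresholds (0.05, 0.95) and the cap (20) are illustrative assumptions, not recommendations from the source, and the function names are hypothetical.

```python
def raw_weight(t, e):
    """Standard IPW weight: 1/e for treated, 1/(1 - e) for control."""
    return 1.0 / e if t == 1 else 1.0 / (1.0 - e)


def trim(units, lower=0.05, upper=0.95):
    """Trimming: drop units whose propensity falls outside [lower, upper]."""
    return [(t, e) for t, e in units if lower <= e <= upper]


def clipped_weight(t, e, cap=20.0):
    """Weight clipping: cap the raw weight at an upper bound."""
    return min(raw_weight(t, e), cap)


def overlap_weight(t, e):
    """Overlap weighting: 1 - e for treated, e for control (always below 1)."""
    return 1.0 - e if t == 1 else e


# Toy (treatment, propensity) pairs; the thresholds above are illustrative.
units = [(1, 0.5), (1, 0.001), (0, 0.3), (0, 0.999)]
print([raw_weight(t, e) for t, e in units])
print(trim(units))
print([clipped_weight(t, e) for t, e in units])
print([overlap_weight(t, e) for t, e in units])
```

Trimming and overlap weighting both change the population the estimate refers to, while clipping keeps every unit but trades some bias for lower variance.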
Implementation
Python (EconML)
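```python
from econml.dr import LinearDRLearner
from sklearn.linear_model import LogisticRegression

# Note: LinearDRLearner is a doubly robust learner. It combines the propensity
# model below with an outcome model (EconML's default), so it is not pure IPW.
model = LinearDRLearner(model_propensity=LogisticRegression())
# X must be passed by keyword; T is a discrete treatment indicator.
model.fit(Y, T, X=X)
ate = model.effect(X).mean()
```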
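For comparison, plain IPW needs only a propensity model. The library-free sketch below assumes a single continuous covariate and simulated data with a true ATE of 2; it fits a logistic propensity model with a hand-written Newton iteration (standing in for a library logistic regression) and then applies normalized IPW. All names and the data-generating process are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ts, iterations=25):
    """Fit P(T=1 | x) = sigmoid(a + b*x) by Newton's method (one covariate)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, t in zip(xs, ts):
            p = sigmoid(a + b * x)
            r = t - p               # contribution to the log-likelihood gradient
            v = p * (1.0 - p)       # contribution to the curvature
            g0 += r
            g1 += r * x
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


random.seed(2)
n = 10000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ts = [1 if random.random() < sigmoid(x) else 0 for x in xs]
ys = [2.0 * t + 1.5 * x + random.gauss(0.0, 1.0) for x, t in zip(xs, ts)]

# Step 1: estimate the propensity score from (X, T) only.
a, b = fit_logistic(xs, ts)
e_hat = [sigmoid(a + b * x) for x in xs]

# Step 2: normalized IPW estimate; no outcome model is involved.
num1 = sum(t * y / e for t, y, e in zip(ts, ys, e_hat))
den1 = sum(t / e for t, e in zip(ts, e_hat))
num0 = sum((1 - t) * y / (1.0 - e) for t, y, e in zip(ts, ys, e_hat))
den0 = sum((1 - t) / (1.0 - e) for t, e in zip(ts, e_hat))
ate_ipw = num1 / den1 - num0 / den0

print(f"estimated propensity model: a={a:.2f}, b={b:.2f}")
print(f"IPW ATE estimate: {ate_ipw:.2f} (simulation truth: 2.0)")
```

In practice the propensity model would usually come from a library and would be checked for overlap before the weights are used.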
R
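```r
library(WeightIt)

# Propensity score weights for the ATE
w_out <- weightit(treat ~ x1 + x2, data = df, method = "ps", estimand = "ATE")

# Weighted outcome regression; the coefficient on treat is the IPW estimate.
# lm() treats the weights as fixed, so use robust standard errors for inference.
fit <- lm(y ~ treat, data = df, weights = w_out$weights)
```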
Related Concepts
- Re-weighting Methods Overview - consolidated overview of reweighting methods
- Propensity Score - the core tool
- Doubly Robust Estimator - IPW + outcome regression
- CBPS - directly optimizing balance
- Trimming - handling extreme PS
- Overlap Weighting - stable weighting
Application: Correcting RTB Win Selection Bias
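In RTB, training only on won impressions introduces win selection bias. Correct it by weighting each won impression's loss by the inverse of its estimated win propensity:
$$\mathcal{L}_{IPW} = \sum_{i \in \text{won}} \frac{\ell(y_i, \hat{y}_i)}{\hat{w}_i}$$
where $\hat{w}_i$ is the estimated probability of winning impression $i$.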
The win propensity is estimated via Survival Analysis (Kaplan-Meier) or gradient boosting. Weight stabilization (clipping, normalization) is essential. For details, see Multi-Task Learning (IPW-ESCM²).
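A minimal sketch of an inverse-win-propensity weighted log loss with both stabilizers applied (clipping and normalization). The function name, the cap of 10, and the toy values are assumptions for illustration; the win probabilities are taken as given rather than estimated.

```python
import math


def ipw_log_loss(labels, predictions, win_probs, max_weight=10.0):
    """Clipped, self-normalized inverse-win-propensity log loss."""
    weights = [min(1.0 / w, max_weight) for w in win_probs]
    total_weight = sum(weights)
    loss = 0.0
    for y, p, w in zip(labels, predictions, weights):
        loss -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return loss / total_weight


# Won impressions only: click labels, model predictions, estimated win rates.
labels = [1, 0, 0, 1]
predictions = [0.7, 0.2, 0.4, 0.6]
win_probs = [0.9, 0.8, 0.05, 0.5]   # the third impression was unlikely to win

print(ipw_log_loss(labels, predictions, win_probs))
```

Impressions that were unlikely to be won stand in for the many similar auctions that were lost, so they receive larger weights, up to the cap. When all win probabilities are equal the loss reduces to the ordinary mean log loss.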
References
- yaoSurveyCausalInference2021 - Section 3.1.3
- Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score
- Horvitz, D. G., & Thompson, D. J. (1952). A generalization of sampling without replacement
- Zhang et al. (2016). Bid-aware Gradient Descent (KDD)