Positivity (Overlap)
Definition
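The probability of receiving treatment lies strictly between 0 and 1 for every covariate value $x$ in the support of $X$.
For binary treatment $W \in \{0, 1\}$:
$$0 < P(W = 1 \mid X = x) < 1 \quad \text{for all } x$$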
Intuitive Understanding
Core Idea
“Both the treated and control groups are observable across every combination of characteristics”
- At any value of $X$, both treated and control outcomes can be estimated
- Causal effects can be estimated without extrapolation
Common Support
The region where the covariate distributions of the treated and control groups overlap:
 Control distribution   Treatment distribution
          ___                 ___
         /   \               /   \
        /     \             /     \
       /       \           /       \
   ___/      overlap region         \___
         <=====================>
             Common Support
Positivity Violations
1. Deterministic Treatment
Treatment is deterministic at certain values of $X$, so that $P(W = 1 \mid X = x)$ is exactly 0 or 1 there.
Examples:
- People over 65 are only ever offered program A
- A new product is not launched in certain regions
- Patients with contraindications cannot be prescribed the drug
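When the covariate is discrete, one rough way to look for deterministic assignment is to compute the empirical treatment rate within each stratum and flag strata where it is exactly 0 or 1. The following is a minimal sketch under that assumption; the toy data and the names `age_group`, `W`, and `deterministic_strata` are hypothetical. In a small sample, a flagged stratum may reflect a practical rather than a structural violation.

```python
import numpy as np

def deterministic_strata(strata, W):
    """Return {stratum: treatment rate} for strata whose rate is exactly 0 or 1."""
    strata = np.asarray(strata)
    W = np.asarray(W)
    flagged = {}
    for s in np.unique(strata):
        rate = W[strata == s].mean()  # empirical P(W = 1 | X = s)
        if rate == 0.0 or rate == 1.0:
            flagged[str(s)] = float(rate)
    return flagged

# Hypothetical toy data: everyone in the "65+" stratum is treated
age_group = np.array(["<65", "<65", "<65", "<65", "65+", "65+", "65+"])
W = np.array([1, 0, 0, 1, 1, 1, 1])

flagged = deterministic_strata(age_group, W)
print(flagged)
```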
2. Practical Positivity Violation
Both treatment arms are theoretically possible, but one of them is not observed in the data for some covariate values, for example because of:
- Small sample size
- Rare covariate combinations
Propensity Score Perspective
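With the propensity score $e(x) = P(W = 1 \mid X = x)$, positivity requires $0 < e(x) < 1$.
Estimated propensity scores very close to 0 or 1 are a sign of a positivity violation.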
Impact of Violations
1. IPW Instability
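In Inverse Propensity Weighting, each unit is weighted by the inverse of the probability of the treatment it actually received, where $e(X_i)$ is the propensity score and $Y_i$ the observed outcome:
$$\hat{\tau}_{IPW} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{W_i Y_i}{e(X_i)} - \frac{(1 - W_i) Y_i}{1 - e(X_i)} \right]$$
When $e(X_i) \to 0$ or $e(X_i) \to 1$, weights explode.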
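To make the instability concrete, here is a small sketch that computes inverse propensity weights for a handful of hypothetical scores and summarizes them with the Kish effective sample size. The arrays `ps` and `W` and the helper `ipw_weights` are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def ipw_weights(ps, W):
    """Inverse propensity weights: 1/e for treated units, 1/(1-e) for controls."""
    ps = np.asarray(ps, dtype=float)
    W = np.asarray(W)
    return np.where(W == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# Hypothetical propensity scores: the last treated unit has a score close to 0
ps = np.array([0.5, 0.4, 0.6, 0.5, 0.001])
W = np.array([1, 0, 1, 0, 1])

w = ipw_weights(ps, W)
# Kish effective sample size: (sum of weights)^2 / sum of squared weights
ess = w.sum() ** 2 / (w ** 2).sum()
print(w)
print(f"Effective sample size: {ess:.2f} of {len(w)}")
```

A single unit with a score near 0 dominates the weighted sum, so the effective sample size collapses to roughly one unit even though five were observed.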
2. Inestimability
In regions where $e(x) = 0$:
- $E[Y(1) \mid X = x]$ is inestimable (no treated units)
In regions where $e(x) = 1$:
- $E[Y(0) \mid X = x]$ is inestimable (no control units)
3. High Variance
The weaker the overlap, the higher the variance of the estimator.
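The relationship between overlap and variance can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the data-generating process, the use of the true propensity score $e(x) = \mathrm{sigmoid}(kx)$, and the helper name `ipw_ate_sd`. A larger `k` pushes scores toward 0 and 1, which weakens overlap.

```python
import numpy as np

def ipw_ate_sd(k, n=500, reps=300, seed=0):
    """Monte Carlo SD of the IPW ATE estimate when e(x) = sigmoid(k * x)."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(reps):
        X = rng.normal(size=n)
        e = 1.0 / (1.0 + np.exp(-k * X))      # true propensity score
        W = rng.binomial(1, e)
        Y = 1.0 * W + X + rng.normal(size=n)  # true ATE is 1
        est = np.mean(W * Y / e - (1 - W) * Y / (1 - e))
        estimates.append(est)
    return float(np.std(estimates))

sd_good = ipw_ate_sd(k=0.5)  # scores stay near 0.5: strong overlap
sd_poor = ipw_ate_sd(k=3.0)  # scores pile up near 0 and 1: weak overlap
print(f"SD with strong overlap: {sd_good:.3f}")
print(f"SD with weak overlap:   {sd_poor:.3f}")
```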
Diagnostic Methods
1. Propensity Score Histogram
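# Compare PS distributions of treated/control groups
# (ps: array of estimated propensity scores, W: 0/1 treatment indicator array)
import matplotlib.pyplot as plt
plt.hist(ps[W == 1], bins=30, range=(0, 1), alpha=0.5, label='Treated')
plt.hist(ps[W == 0], bins=30, range=(0, 1), alpha=0.5, label='Control')
plt.xlabel('Propensity score')
plt.ylabel('Count')
plt.legend()
plt.show()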
- Good overlap: the two distributions overlap substantially
- Poor overlap: the distributions are separated
2. Proportion of Extreme PS
# Share of units whose propensity score falls outside [0.01, 0.99]
extreme_ps = (ps < 0.01) | (ps > 0.99)
print(f"Extreme PS: {extreme_ps.mean()*100:.1f}%")
3. Checking Common Support
Check the intersection of the PS ranges of the treated and control groups.
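One simple way to carry out this check is sketched below: take the larger of the two group minima and the smaller of the two group maxima. The helper `common_support` and the example scores are hypothetical. If the returned lower limit exceeds the upper limit, the two ranges do not intersect at all.

```python
import numpy as np

def common_support(ps, W):
    """Return (low, high): the intersection of the treated and control PS ranges."""
    ps = np.asarray(ps, dtype=float)
    W = np.asarray(W)
    treated, control = ps[W == 1], ps[W == 0]
    low = max(treated.min(), control.min())
    high = min(treated.max(), control.max())
    return low, high

# Hypothetical estimated scores
ps = np.array([0.05, 0.20, 0.40, 0.60, 0.30, 0.55, 0.80, 0.95])
W = np.array([0, 0, 0, 0, 1, 1, 1, 1])

low, high = common_support(ps, W)
inside = (ps >= low) & (ps <= high)
print(f"Common support: [{low:.2f}, {high:.2f}]")
print(f"Units inside: {inside.sum()} of {len(ps)}")
```

A range check like this is coarse: it says nothing about how many units of each group sit inside the interval, so it works best alongside the histogram.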
Solutions
1. Trimming
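Remove samples with extreme propensity scores, keeping only units with
$$\alpha \le e(X_i) \le 1 - \alpha$$
Typically $\alpha$ is a small threshold; Crump et al. (2009) suggest $\alpha = 0.1$ as a rule of thumb.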
For details: Trimming
- Advantage: stable estimation
- Disadvantage: changes the estimand (the ATE for the full population becomes the ATE for the retained subpopulation)
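A minimal sketch of symmetric trimming follows. The helper `trim`, the default `alpha=0.1`, and the toy arrays are illustrative assumptions; the threshold should be chosen for the application at hand.

```python
import numpy as np

def trim(ps, alpha=0.1):
    """Boolean mask of units with alpha <= ps <= 1 - alpha."""
    ps = np.asarray(ps, dtype=float)
    return (ps >= alpha) & (ps <= 1.0 - alpha)

# Hypothetical scores, treatments, and outcomes
ps = np.array([0.02, 0.15, 0.50, 0.85, 0.97])
W = np.array([0, 0, 1, 1, 1])
Y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

keep = trim(ps, alpha=0.1)
ps_t, W_t, Y_t = ps[keep], W[keep], Y[keep]  # analysis continues on these
print(f"Kept {keep.sum()} of {len(ps)} units")
```

Reporting how many units were dropped, and from which group, helps readers judge how far the trimmed population is from the original one.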
2. Overlap Weighting
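Assign less weight to regions with extreme PS: treated units receive weight $1 - e(X_i)$ and control units receive weight $e(X_i)$:
$$w_i = W_i \, (1 - e(X_i)) + (1 - W_i) \, e(X_i)$$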
For details: Overlap Weighting
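As a sketch, overlap weights in their usual form give treated units weight $1 - e$ and control units weight $e$, so every weight stays within $[0, 1]$. The helpers `overlap_weights` and `weighted_effect` and the toy data are hypothetical. The resulting contrast targets the population with good overlap rather than the full population.

```python
import numpy as np

def overlap_weights(ps, W):
    """Overlap weights: 1 - e for treated units, e for controls."""
    ps = np.asarray(ps, dtype=float)
    W = np.asarray(W)
    return np.where(W == 1, 1.0 - ps, ps)

def weighted_effect(Y, W, w):
    """Difference of weighted outcome means between treated and control units."""
    Y, W, w = map(np.asarray, (Y, W, w))
    treated = np.sum(w * W * Y) / np.sum(w * W)
    control = np.sum(w * (1 - W) * Y) / np.sum(w * (1 - W))
    return treated - control

# Hypothetical data: the last two units have extreme scores
ps = np.array([0.5, 0.5, 0.99, 0.01])
W = np.array([1, 0, 1, 0])
Y = np.array([3.0, 1.0, 10.0, -5.0])

w = overlap_weights(ps, W)
effect = weighted_effect(Y, W, w)
print(w)
print(effect)
```

Units with extreme scores receive weights near 0 here, whereas under inverse propensity weighting they would receive the largest weights.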
3. Bounds Estimation
Report bounds on the effect, rather than a point estimate, in regions where positivity is violated.
This is a partial identification approach.
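One simple instance of this idea, sketched here under strong assumptions, is a worst-case bound for discrete strata: the outcome is assumed to lie in a known range `[y_min, y_max]`, treatment is assumed unconfounded within strata, and a stratum's missing arm is replaced by the most pessimistic or most optimistic value. The function name `worst_case_ate_bounds` and the toy data are hypothetical.

```python
import numpy as np

def worst_case_ate_bounds(strata, W, Y, y_min, y_max):
    """Worst-case ATE bounds when some strata lack treated or control units.

    A missing arm's mean outcome is replaced by y_min or y_max, whichever
    makes the effect smallest (lower bound) or largest (upper bound).
    """
    strata, W, Y = map(np.asarray, (strata, W, Y))
    lower = upper = 0.0
    for s in np.unique(strata):
        in_s = strata == s
        share = in_s.mean()  # empirical P(X = s)
        y1, y0 = Y[in_s & (W == 1)], Y[in_s & (W == 0)]
        mu1_lo, mu1_hi = (y1.mean(), y1.mean()) if y1.size else (y_min, y_max)
        mu0_lo, mu0_hi = (y0.mean(), y0.mean()) if y0.size else (y_min, y_max)
        lower += share * (mu1_lo - mu0_hi)
        upper += share * (mu1_hi - mu0_lo)
    return lower, upper

# Hypothetical data with outcomes in [0, 1]; stratum "b" has no treated units
strata = np.array(["a", "a", "a", "a", "b", "b"])
W = np.array([1, 1, 0, 0, 0, 0])
Y = np.array([1.0, 1.0, 0.0, 1.0, 0.0, 1.0])

lo, hi = worst_case_ate_bounds(strata, W, Y, y_min=0.0, y_max=1.0)
print(f"ATE bounds: [{lo:.3f}, {hi:.3f}]")
```

The width of the interval grows with the share of the population living in strata that violate positivity.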
4. Extrapolation (caution required)
Model-based extrapolation:
- Strongly depends on model assumptions
- Sensitivity analysis is essential
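The dependence on model assumptions can be illustrated with a toy sensitivity check: fit two outcome models on the region where treated units exist and compare their predictions at a covariate value with no treated units. The simulated data, the cutoff at 0.5, and the choice of polynomial models are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: treated units are only observed for x < 0.5
x_treated = rng.uniform(0.0, 0.5, size=1000)
y_treated = x_treated ** 2 + rng.normal(scale=0.05, size=1000)

x_new = 0.9  # a covariate value with no treated units

# Two outcome models that are both plausible on the observed range
linear = np.polynomial.Polynomial.fit(x_treated, y_treated, deg=1)
quadratic = np.polynomial.Polynomial.fit(x_treated, y_treated, deg=2)

pred_linear = linear(x_new)
pred_quadratic = quadratic(x_new)
print(f"Linear extrapolation:    {pred_linear:.2f}")
print(f"Quadratic extrapolation: {pred_quadratic:.2f}")
```

Inside the observed range the two fits are hard to tell apart given the noise, yet their extrapolated predictions diverge, which is why the extrapolated estimate should be reported alongside results from alternative specifications.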
Related Concepts
- Causal Assumptions Overview - consolidated overview of the three core assumptions
- Strong Ignorability - Ignorability + Positivity
- Propensity Score - $e(X) = P(W = 1 \mid X)$
- IPW - a method sensitive to positivity
- Trimming - responding to positivity violations
- Overlap Weighting - a robust weighting method
References
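- Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika.
- Yao, L., et al. (2021). A survey on causal inference. ACM Transactions on Knowledge Discovery from Data. Section 2.3.
- Crump, R. K., et al. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika.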