Optimal Targeting Policy
Definition
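An Optimal Targeting Policy is a rule $\pi: \mathcal{X} \to \{0, 1\}$ that maps covariates to a treatment decision so as to maximize policy value:
$$\pi^* = \arg\max_{\pi} V(\pi), \qquad V(\pi) = \mathbb{E}\big[Y(\pi(X))\big]$$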
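Accounting for a per-treatment cost $c$ and a margin $m$, treating a customer with uplift $\tau(x)$ yields an expected incremental profit of $m\,\tau(x) - c$. The optimal rule is therefore a threshold policy, $\pi^*(x) = \mathbb{1}\{\tau(x) > c/m\}$: treat only customers whose uplift exceeds the break-even point $c/m$.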
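The break-even rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names and the `cate_hat` array are hypothetical, and the cost and margin are the values quoted under Project Application (cost 12.73, margin 30%).

```python
import numpy as np

def breakeven_uplift(cost: float, margin: float) -> float:
    """Minimum incremental revenue per customer at which treating pays off."""
    if not 0 < margin <= 1:
        raise ValueError("margin must be in (0, 1]")
    return cost / margin

def threshold_policy(cate: np.ndarray, cost: float, margin: float) -> np.ndarray:
    """Return 1 where uplift exceeds the break-even point, else 0."""
    return (cate > breakeven_uplift(cost, margin)).astype(int)

# Hypothetical CATE estimates, in dollars of incremental revenue per customer.
cate_hat = np.array([10.0, 42.0, 45.0, 80.0, -5.0])

treat = threshold_policy(cate_hat, cost=12.73, margin=0.30)

# Expected incremental profit of the policy: margin * uplift - cost, treated only.
expected_profit = float(np.sum(treat * (0.30 * cate_hat - 12.73)))

print(round(breakeven_uplift(12.73, 0.30), 2), treat, round(expected_profit, 2))
```

Only the customers at 45 and 80 clear the break-even point of about 42.43. The customer at 42 has positive uplift but would still lose money once cost is taken into account.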
Intuitive Understanding
Once uplift estimation (CATE) tells us “how much more each person will buy,” the policy decides “so whom do we treat.” It is the step that turns a continuous CATE into a binary decision — the industrial instance of Policy Learning.
Methods
- Threshold on CATE: $\hat\pi(x) = \mathbb{1}\{\hat\tau(x) > \text{cost}/\text{margin}\}$; simple and powerful (e.g., econml).
- Policy Tree / DR Policy Tree (Athey & Wager 2021, Kitagawa & Tetenov 2018): directly learn an interpretable rule. However, quantizing a continuous CATE into rules can lose information.
- Risk-adjusted policy: when CATE is uncertain due to positivity violations, tune conservativeness via a parameter $\lambda$.
- Value validation: estimate policy value before deployment with OPE (IPW/AIPW/DR).
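As an illustration of the value-validation step, the sketch below estimates the value of a threshold policy from logged data with an AIPW (doubly robust) estimator. Everything in it is an assumption made for the example: the synthetic data, the function name `aipw_policy_value`, the clipping level, and the use of the true outcome models as nuisance estimates. In practice the propensity and outcome models would be fitted, ideally with cross-fitting.

```python
import numpy as np

def aipw_policy_value(y, t, pi, e_hat, mu0_hat, mu1_hat, clip=0.01):
    """Doubly robust (AIPW) estimate of E[Y(pi(X))] from logged data."""
    e = np.clip(e_hat, clip, 1 - clip)            # guard against extreme propensities
    mu_pi = np.where(pi == 1, mu1_hat, mu0_hat)   # outcome model under the policy's action
    mu_t = np.where(t == 1, mu1_hat, mu0_hat)     # outcome model under the logged action
    p_logged = np.where(t == 1, e, 1 - e)         # probability of the logged action
    match = (t == pi).astype(float)               # 1 where logged action agrees with policy
    scores = mu_pi + match / p_logged * (y - mu_t)
    return float(scores.mean())

# Synthetic logged data (assumed for illustration only).
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(0, 1, n)
tau = 100 * x - 20                                # true uplift in dollars
e_true = np.full(n, 0.5)                          # randomized logging policy
t = rng.binomial(1, e_true)
y = 50 + t * tau + rng.normal(0, 10, n)

pi = (tau > 42.43).astype(int)                    # candidate threshold policy
v_hat = aipw_policy_value(y, t, pi, e_true, np.full(n, 50.0), 50 + tau)
v_true = float(np.mean(50 + pi * tau))            # known only because data are simulated
print(round(v_hat, 2), round(v_true, 2))
```

The estimate is in outcome units. To compare policies on profit, one would apply the margin and subtract the treatment cost for treated customers before averaging.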
Project Application
Dunnhumby: breakeven $42.43 (cost $12.73 / margin 30%). Optimal 31.3% targeting → $2,426 profit (125% ROI); targeting everyone yields a −$4,657 loss. The CATE-threshold beats the PolicyTree by $742. With PS AUC 0.989 (positivity violation), identification is restricted to the 17% overlap region → a conservative policy with λ=0.7–1.0 is recommended. (project canonical)
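One way a λ-tuned conservative policy could look is sketched below. The exact form used in the project is not stated here, so the rule is an assumption: it treats a customer only when a pessimistic estimate, the CATE minus λ times its standard error, still clears the break-even point. The function name and the numbers in the arrays are hypothetical.

```python
import numpy as np

def conservative_policy(cate_hat, cate_se, breakeven, lam):
    """Treat only if the pessimistic uplift (estimate - lam * SE) clears break-even.

    ASSUMPTION: this lower-bound form is illustrative, not the project's formula.
    """
    lower = cate_hat - lam * cate_se
    return (lower > breakeven).astype(int)

# Hypothetical CATE estimates and standard errors (dollars).
cate_hat = np.array([50.0, 50.0, 90.0, 30.0])
cate_se = np.array([5.0, 20.0, 50.0, 1.0])

decisions = {
    lam: conservative_policy(cate_hat, cate_se, 42.43, lam).tolist()
    for lam in (0.0, 0.7, 1.0)
}
for lam, d in decisions.items():
    print(lam, d)
```

With λ = 0 the rule reduces to the plain CATE threshold. As λ grows, customers with large but noisy estimates drop out first, which is the intended behavior when overlap is poor.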
Related Concepts
- Targeting Overview ← hub
- Uplift Modeling / CATE — inputs to the policy
- Policy Learning — general theory (P2)
- Off-Policy Evaluation — policy value validation
References
- MOC-Targeting
- Study Roadmap — Track 3 (Athey-Wager 2021, Kitagawa-Tetenov 2018 originals)