Dynamic Treatment Regimes (DTR / OTR)
Definition
A DTR is a sequence of decision rules, one per decision point, each mapping the accumulated history (covariates, prior treatments, intermediate outcomes) to a treatment. The optimal treatment regime (OTR) is the regime that maximizes the expected long-term outcome. Estimation methods:
- Q-learning — backward induction on the Q-function (regression based)
- A-learning — contrast/advantage modeling (robust to baseline misspecification)
- G-estimation — structural nested mean model (Robins)
- OWL — outcome-weighted learning (classification perspective)
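Assumptions: sequential ignorability (no unmeasured confounding at each decision point, given the history), together with consistency and positivity.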
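To make the backward induction behind Q-learning concrete, here is a minimal sketch for two decision points with binary treatments. Everything in it is an assumption chosen for illustration: the simulated data, the linear working models, and the names `fit_ols`, `stage2_features`, `stage1_features`, `rule_stage2`, and `rule_stage1` are hypothetical and not taken from any library. Treatments are randomized in the simulation, so sequential ignorability holds by design. The stage-1 linear model is only a working approximation, because the stage-2 maximum is generally not linear in the history.

```python
import numpy as np


def fit_ols(features, target):
    """Ordinary least squares fit; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef


def stage2_features(x1, a1, x2, a2):
    # Working model for Q2: main effects plus an a2-by-x2 interaction.
    return np.column_stack([np.ones_like(x1), x1, a1, x2, a2, a2 * x2])


def stage1_features(x1, a1):
    # Working model for Q1: main effects plus an a1-by-x1 interaction.
    return np.column_stack([np.ones_like(x1), x1, a1, a1 * x1])


# Simulated two-stage trial (illustrative data, not from the source).
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)                         # baseline covariate
a1 = rng.integers(0, 2, size=n).astype(float)   # randomized stage-1 treatment
x2 = 0.5 * x1 + 0.3 * a1 + rng.normal(size=n)   # intermediate covariate
a2 = rng.integers(0, 2, size=n).astype(float)   # randomized stage-2 treatment
# Final outcome: stage-2 treatment helps only when x2 < 0.5.
y = x1 + x2 + 0.5 * a1 + a2 * (1.0 - 2.0 * x2) + rng.normal(size=n)

# Step 1 (last stage): regress the outcome on the full history and a2.
beta2 = fit_ols(stage2_features(x1, a1, x2, a2), y)

# Step 2: pseudo-outcome = predicted outcome under the best stage-2 action.
q2_treat = stage2_features(x1, a1, x2, np.ones(n)) @ beta2
q2_control = stage2_features(x1, a1, x2, np.zeros(n)) @ beta2
pseudo_outcome = np.maximum(q2_treat, q2_control)

# Step 3 (first stage): regress the pseudo-outcome on the stage-1 history.
beta1 = fit_ols(stage1_features(x1, a1), pseudo_outcome)


def rule_stage2(x2_new):
    """Estimated stage-2 rule: treat when the fitted a2 contrast is positive."""
    return (beta2[4] + beta2[5] * np.asarray(x2_new) > 0).astype(int)


def rule_stage1(x1_new):
    """Estimated stage-1 rule: treat when the fitted a1 contrast is positive."""
    return (beta1[2] + beta1[3] * np.asarray(x1_new) > 0).astype(int)


print("stage-2 contrast (intercept, slope):", beta2[4], beta2[5])
print("stage-1 contrast (intercept, slope):", beta1[2], beta1[3])
```

The estimated regime is the pair (`rule_stage1`, `rule_stage2`), applied in order as the history accrues. With observational rather than randomized data, the same recursion applies, but its validity rests on the identification assumptions stated above and on the working models being adequate.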
Intuitive Understanding
Personalized, adaptive treatment over time — deciding “what to do next given everything so far” to optimize the long-term outcome. It is the core of CV pillar #4 (clinical sequential decision-making) and the time-series extension of personalization.
Related Concepts
- Policy Learning · MDP · Off-Policy Evaluation · Clinical Decision-Making Overview
Key Papers
- Murphy, “Optimal dynamic treatment regimes”, JRSS-B 65(2), 2003
- Robins — structural nested models / g-estimation
- Hernán & Robins, Causal Inference: What If, 2020 — Part III (g-methods)