do-operator
Definition
The do-operator is Pearl’s formalization of intervention.
$$P(Y \mid do(X = x))$$
This denotes the distribution of $Y$ after an intervention that sets $X$ to the value $x$.
It is distinct from observational conditioning $P(Y \mid X = x)$:
- $P(Y \mid X = x)$: the distribution of $Y$ when $X = x$ is observed
- $P(Y \mid do(X = x))$: the distribution of $Y$ when $X$ is set to $x$ by intervention
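The gap between observing and intervening can be made concrete with a small simulation. The sketch below assumes a hypothetical toy model (binary common cause `z`, treatment `x`, outcome `y`, with made-up probabilities); none of these names or numbers come from the references. For this model the analytic values are 0.70 for conditioning and 0.55 for intervening.

```python
import random

def sample(n, do_x=None, seed=0):
    """Draw n samples from a toy model: z -> x, z -> y, x -> y.

    If do_x is given, x is forced to that value and ignores z (intervention).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                      # common cause
        if do_x is None:
            x = rng.random() < (0.8 if z else 0.2)  # x depends on z
        else:
            x = do_x                                # x no longer depends on z
        p_y = 0.1 + 0.2 * x + 0.5 * z               # y depends on x and z
        y = rng.random() < p_y
        rows.append((z, x, y))
    return rows

def mean_y(rows):
    """Fraction of samples with y == True."""
    return sum(y for _, _, y in rows) / len(rows)

observed = sample(100_000)
# Conditioning: keep only the rows where x happened to be True
p_y_given_x1 = mean_y([r for r in observed if r[1]])
# Intervening: regenerate the data with x forced to True
p_y_do_x1 = mean_y(sample(100_000, do_x=True, seed=1))

print(f"P(y=1 | x=1)     ~ {p_y_given_x1:.3f}")
print(f"P(y=1 | do(x=1)) ~ {p_y_do_x1:.3f}")
```

Conditioning on `x` also shifts the distribution of `z` (units with `x` true are mostly those with `z` true), which is why the conditional probability is higher than the interventional one here.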
Intuitive Understanding
“The demand for a product priced at \$50” and “the demand when the price is set to \$50” are different.
The former mixes the effect of the price with whatever caused the price to be \$50 (e.g., high quality), whereas the latter is the effect of changing only the price, independent of quality.
The do-operator captures this difference mathematically.
Key Properties
Graphical Interpretation
$do(X = x)$ in a DAG:
- Removes all arrows entering $X$
- Fixes the value of $X$ to $x$
This is “surgical intervention.”
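As a minimal sketch of this graph surgery, a DAG can be stored as a mapping from each variable to its set of parents. The helper name `do` and the dictionary representation are illustrative choices, not part of any library; the graph is the pricing DAG used in this note's DoWhy example.

```python
def do(parents, node):
    """Return a copy of the DAG in which `node` has no incoming edges.

    `parents` maps each variable to the set of its direct causes.
    """
    mutilated = {v: set(ps) for v, ps in parents.items()}
    mutilated[node] = set()  # cut every arrow pointing into `node`
    return mutilated

# cost -> price, quality -> price, quality -> demand, price -> demand
dag = {
    "cost": set(),
    "quality": set(),
    "price": {"cost", "quality"},
    "demand": {"quality", "price"},
}

after = do(dag, "price")
for variable, causes in after.items():
    print(variable, "<-", sorted(causes))
```

Only the incoming edges of `price` are removed. The edges `price -> demand` and `quality -> demand` survive, so the effect of the fixed price on demand is still propagated through the graph.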
Pearl’s Causal Hierarchy
| Level | Example query |
|---|---|
| 1. Association | $P(Y \mid X)$ |
| 2. Intervention | $P(Y \mid do(X))$ |
| 3. Counterfactual | $P(Y_x \mid X = x', Y = y')$ |
The do-operator corresponds to level 2 (intervention).
Back-door Adjustment Formula
If there is an adjustment set $Z$ that satisfies the Back-door Criterion:
$$P(Y \mid do(X = x)) = \sum_z P(Y \mid X = x, Z = z)\, P(Z = z)$$
This allows interventional effects to be estimated from observational data.
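A worked example may help here. The sketch below builds the observational joint distribution of a hypothetical binary model (`z -> x`, `z -> y`, `x -> y`, with made-up probabilities) and then applies back-door adjustment using only that joint. All names and numbers are assumptions for illustration.

```python
from itertools import product

# Hypothetical toy model: binary z -> x, z -> y, x -> y
p_z = {0: 0.5, 1: 0.5}
p_x1_given_z = {0: 0.2, 1: 0.8}

def p_y1(x, z):
    """P(y = 1 | x, z) in the toy model."""
    return 0.1 + 0.2 * x + 0.5 * z

# Observational joint P(x, y, z): the only input to the adjustment below
joint = {}
for x, y, z in product((0, 1), repeat=3):
    px = p_x1_given_z[z] if x else 1 - p_x1_given_z[z]
    py = p_y1(x, z) if y else 1 - p_y1(x, z)
    joint[(x, y, z)] = p_z[z] * px * py

def prob(**fixed):
    """Probability of the event given as keyword arguments x, y, z."""
    return sum(
        p for (x, y, z), p in joint.items()
        if all({"x": x, "y": y, "z": z}[k] == v for k, v in fixed.items())
    )

def backdoor(x):
    """sum_z P(y=1 | x, z) * P(z), computed from the joint only."""
    return sum(
        prob(x=x, y=1, z=z) / prob(x=x, z=z) * prob(z=z) for z in (0, 1)
    )

naive = prob(x=1, y=1) / prob(x=1)  # P(y=1 | x=1)
adjusted = backdoor(1)              # P(y=1 | do(x=1))
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this model the naive conditional is 0.70 while the adjusted value is 0.55, and `backdoor(1) - backdoor(0)` recovers the coefficient 0.2 that `x` contributes to `y`.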
Front-door Adjustment
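Used when back-door adjustment is not possible because the confounders of $X$ and $Y$ are unobserved, but a mediator $M$ satisfying the front-door criterion is observed:
$$P(Y \mid do(X = x)) = \sum_m P(M = m \mid X = x) \sum_{x'} P(Y \mid X = x', M = m)\, P(X = x')$$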
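To check the front-door idea numerically, the sketch below assumes a hypothetical binary model with a hidden confounder `u` (`u -> x`, `u -> y`, `x -> m`, `m -> y`). The estimator sees only the joint over `(x, m, y)`; the ground truth is computed separately from the full model for comparison. All names and probabilities are made up for illustration.

```python
from itertools import product

# Hypothetical toy model with hidden confounder u (all variables binary)
p_u1 = 0.5
p_x1_given_u = {0: 0.2, 1: 0.8}
p_m1_given_x = {0: 0.1, 1: 0.9}

def p_y1(m, u):
    """P(y = 1 | m, u) in the toy model."""
    return 0.1 + 0.3 * m + 0.5 * u

def bern(p1, value):
    """P(V = value) for a binary V with P(V = 1) = p1."""
    return p1 if value else 1 - p1

# Observational joint over (x, m, y); u is summed out and never observed
joint = {}
for x, m, y in product((0, 1), repeat=3):
    joint[(x, m, y)] = sum(
        bern(p_u1, u) * bern(p_x1_given_u[u], x)
        * bern(p_m1_given_x[x], m) * bern(p_y1(m, u), y)
        for u in (0, 1)
    )

NAMES = ("x", "m", "y")

def prob(**fixed):
    """Probability of the event given as keyword arguments x, m, y."""
    return sum(
        p for key, p in joint.items()
        if all(key[NAMES.index(k)] == v for k, v in fixed.items())
    )

def front_door(x):
    """sum_m P(m | x) * sum_x' P(y=1 | x', m) * P(x'), from the joint only."""
    total = 0.0
    for m in (0, 1):
        p_m_given_x = prob(x=x, m=m) / prob(x=x)
        inner = sum(
            prob(x=xp, m=m, y=1) / prob(x=xp, m=m) * prob(x=xp)
            for xp in (0, 1)
        )
        total += p_m_given_x * inner
    return total

def truth(x):
    """P(y=1 | do(x)) computed from the full model, including u."""
    return sum(
        bern(p_m1_given_x[x], m) * bern(p_u1, u) * p_y1(m, u)
        for m in (0, 1) for u in (0, 1)
    )

naive = prob(x=1, y=1) / prob(x=1)
print(f"naive: {naive:.2f}, front-door: {front_door(1):.2f}, truth: {truth(1):.2f}")
```

The front-door estimate matches the ground truth even though `u` never enters the calculation, while the naive conditional does not.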
Example
DoWhy Implementation
from dowhy import CausalModel

# `data` is assumed to be a pandas DataFrame with the columns
# cost, quality, price, and demand.
model = CausalModel(
    data=data,
    treatment='price',
    outcome='demand',
    graph="""digraph {
        cost -> price;
        quality -> price; quality -> demand;
        price -> demand;
    }"""
)
# Identify the intervention effect
identified = model.identify_effect()
print(identified.get_backdoor_variables())  # ['quality']
# Estimate
estimate = model.estimate_effect(
    identified,
    method_name="backdoor.linear_regression"
)
print(f"Slope of E[demand | do(price)]: {estimate.value:.3f}")
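The snippet above relies on a `data` table that is not defined in this note. One way to try it out is to simulate data that follows the same graph. The generator below is a hypothetical sketch using only the standard library: the function name, coefficients, and noise level are all assumptions, and the assumed true slope of demand on price is -1.0.

```python
import random

def make_pricing_data(n=5000, seed=0):
    """Simulate cost -> price, quality -> price, quality -> demand, price -> demand."""
    rng = random.Random(seed)
    columns = {"cost": [], "quality": [], "price": [], "demand": []}
    for _ in range(n):
        cost = 1 if rng.random() < 0.3 else 0
        quality = 1 if rng.random() < 0.5 else 0
        # Higher cost and higher quality both push the price up (40, 50 or 60)
        price = 40 + 10 * cost + 10 * quality
        # Assumed true causal slope of demand on price: -1.0
        demand = 100 - 1.0 * price + 30 * quality + rng.gauss(0, 2)
        columns["cost"].append(cost)
        columns["quality"].append(quality)
        columns["price"].append(price)
        columns["demand"].append(demand)
    return columns

columns = make_pricing_data()
# With pandas installed, the table expected above can be built with:
#   data = pandas.DataFrame(columns)
at_50 = [d for p, d in zip(columns["price"], columns["demand"]) if p == 50]
print(f"mean demand among rows with price == 50: {sum(at_50) / len(at_50):.1f}")
```

Because high-quality products are over-represented among the rows priced at 50 in this simulation, the raw mean demand at that price sits above the value of 65 that the simulated model implies for setting the price to 50 across the whole population.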
Causal Effect vs. Conditional Expectation
# Conditional expectation (observation)
E_Y_given_X = data[data['price'] == 50]['demand'].mean()
# Causal effect (intervention) - back-door adjustment over quality
adjusted = []
covered = 0.0  # total weight of quality levels that have rows at price == 50
for quality_level in data['quality'].unique():
    subset = data[(data['price'] == 50) & (data['quality'] == quality_level)]
    weight = (data['quality'] == quality_level).mean()  # P(quality = q)
    if len(subset) > 0:
        adjusted.append(subset['demand'].mean() * weight)
        covered += weight
# Renormalize so that skipped (empty) strata do not shrink the estimate
E_Y_do_X = sum(adjusted) / covered
print(f"E[Y|X=50]: {E_Y_given_X:.2f}")
print(f"E[Y|do(X=50)]: {E_Y_do_X:.2f}")
Related Concepts
- SCM - the theoretical foundation of the do-operator
- Back-door Criterion - the graphical condition under which adjustment identifies an interventional distribution
- DAG - representation of causal structure
- Counterfactual Reasoning - level 3 (counterfactual) queries
- Potential Outcomes - alternative causal framework
- Intervention Types - Perfect, Soft, Unknown interventions
- Interventional Discovery Overview - causal discovery using interventions
References
- Pearl, J. (2009). Causality: Models, Reasoning, and Inference.
- Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal Inference in Statistics: A Primer.
- Comprehensive Personalized Pricing Guide, Part IV, §11.2