Tae Hyun Kim (Lowell)

do-operator

3 min read #causal-inference #scm

Definition

The do-operator is Pearl’s formalization of intervention.

$$P(Y \mid do(X = x))$$

This denotes the distribution of $Y$ after the intervention that sets $X$ to the value $x$.

It is distinct from observational conditioning $P(Y \mid X = x)$:

  • $P(Y \mid X = x)$: the distribution of $Y$ when $X = x$ is observed
  • $P(Y \mid do(X = x))$: the distribution of $Y$ when $X = x$ is intervened upon

Intuitive Understanding

“The demand for a product priced at $50” and “the demand when the price is set to $50” are different.

The former mixes the effect of the price with whatever caused the price to be $50 (e.g., high quality), whereas the latter isolates the effect of the price itself, because the intervention cuts the link between quality and price.

The do-operator captures this difference mathematically.
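
The gap between the two quantities can be seen in a small simulation. This is a minimal sketch under made-up assumptions: the structural equations, the 0.8/0.2 pricing probabilities, and the names `simulate` and `mean_demand_at` are all hypothetical and are not taken from any real pricing data. Quality raises both the chance of a price of 50 and the demand, so it confounds the two.

```python
import random

def simulate(n, do_price=None, seed=0):
    """Draw n (price, demand) pairs from a toy SCM; do_price overrides the price mechanism."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        quality = 1 if rng.random() < 0.5 else 0
        if do_price is None:
            # Observational regime: high-quality products are usually priced at 50.
            p_high = 0.8 if quality == 1 else 0.2
            price = 50 if rng.random() < p_high else 30
        else:
            # Interventional regime: the price is set by fiat, ignoring quality.
            price = do_price
        # Assumed demand mechanism: falls with price, rises with quality, plus noise.
        demand = 100 - price + 40 * quality + rng.gauss(0, 5)
        rows.append((price, demand))
    return rows

def mean_demand_at(rows, price):
    """Average demand among the rows whose price equals `price`."""
    values = [d for p, d in rows if p == price]
    return sum(values) / len(values)

obs_mean = mean_demand_at(simulate(100_000), 50)
do_mean = mean_demand_at(simulate(100_000, do_price=50), 50)
print(f"E[demand | price = 50]     ~ {obs_mean:.1f}")
print(f"E[demand | do(price = 50)] ~ {do_mean:.1f}")
```

Under these assumed equations the analytic values are 82 for the observational mean (80% of products priced at 50 are high quality) and 70 for the interventional mean (50% are), so the simulated numbers should land close to those.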

Key Properties

Graphical Interpretation

In a DAG, $do(X = x)$:

  1. Removes all arrows entering $X$
  2. Fixes the value of $X$ to $x$

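This is often called a “surgical” intervention: only the mechanism that determines $X$ is replaced, while every other mechanism in the model is left unchanged.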
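
The two steps can be sketched on a toy structural causal model. The representation below (a dict mapping each variable to its parents and a mechanism function) and the helper names `do` and `evaluate` are illustrative assumptions, and the mechanisms are deterministic and invented purely to keep the example short.

```python
def do(scm, target, value):
    """Return a new SCM in which `target` has no parents and always equals `value`."""
    new_scm = dict(scm)                    # the original model is left untouched
    new_scm[target] = ((), lambda: value)  # step 1: drop incoming arrows; step 2: fix the value
    return new_scm

def evaluate(scm, order):
    """Compute every variable, in topological `order`, from the values of its parents."""
    values = {}
    for node in order:
        parent_names, mechanism = scm[node]
        values[node] = mechanism(*(values[p] for p in parent_names))
    return values

# Hypothetical mechanisms for the graph cost -> price <- quality -> demand <- price.
scm = {
    "cost": ((), lambda: 20),
    "quality": ((), lambda: 1),
    "price": (("cost", "quality"), lambda cost, quality: cost + 10 + 20 * quality),
    "demand": (("quality", "price"), lambda quality, price: 100 - price + 40 * quality),
}
order = ["cost", "quality", "price", "demand"]

observed = evaluate(scm, order)                     # price follows its own mechanism
intervened = evaluate(do(scm, "price", 30), order)  # price is forced to 30
print(observed)
print(intervened)
```

Only the entry for `price` changes in the intervened model; `demand` keeps its original mechanism and simply receives the new price as input.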

Pearl’s Causal Hierarchy

| Level | Question type | Example |
|---|---|---|
| 1 | Association | $P(Y \mid X)$ |
| 2 | Intervention | $P(Y \mid do(X))$ |
| 3 | Counterfactual | $P(Y_x \mid X', Y')$ |

The do-operator corresponds to level 2 (intervention).

Back-door Adjustment Formula

If there is an adjustment set $Z$ that satisfies the Back-door Criterion:

$$P(Y \mid do(X = x)) = \sum_z P(Y \mid X = x, Z = z)\, P(Z = z)$$

This allows interventional effects to be estimated from observational data.
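
As a worked example, the formula can be evaluated exactly on a small discrete model. The probability tables below are invented for illustration, and `prob`, `conditional`, and `backdoor` are hypothetical helper names; the point is only that the adjusted quantity is computed from the observational joint distribution alone.

```python
from itertools import product

# Hypothetical model in which Z confounds X and Y (all variables binary).
p_z = {0: 0.5, 1: 0.5}
p_x1_given_z = {0: 0.2, 1: 0.8}  # P(X=1 | Z=z)
p_y1_given_xz = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.5, (1, 1): 0.7}  # P(Y=1 | X=x, Z=z)

# Observational joint P(x, y, z) built from the factorization P(z) P(x|z) P(y|x,z).
joint = {}
for x, y, z in product((0, 1), repeat=3):
    px = p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]
    py = p_y1_given_xz[(x, z)] if y == 1 else 1 - p_y1_given_xz[(x, z)]
    joint[(x, y, z)] = p_z[z] * px * py

def prob(joint, **fixed):
    """Probability of the event given by keyword arguments x=, y=, z= (any subset)."""
    return sum(
        p for (x, y, z), p in joint.items()
        if all({"x": x, "y": y, "z": z}[k] == v for k, v in fixed.items())
    )

def conditional(joint, y, x):
    """Observational P(Y=y | X=x)."""
    return prob(joint, x=x, y=y) / prob(joint, x=x)

def backdoor(joint, y, x):
    """Back-door adjusted P(Y=y | do(X=x)) = sum_z P(y | x, z) P(z)."""
    return sum(
        prob(joint, x=x, y=y, z=z) / prob(joint, x=x, z=z) * prob(joint, z=z)
        for z in (0, 1)
    )

print(f"P(Y=1 | X=1)     = {conditional(joint, 1, 1):.2f}")
print(f"P(Y=1 | do(X=1)) = {backdoor(joint, 1, 1):.2f}")
```

With these tables the observational value is 0.62 and the adjusted value is 0.50. The difference arises because observing $X = 1$ makes $Z = 1$ more likely, whereas the adjustment weights each stratum by its marginal probability $P(Z = z)$.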

Front-door Adjustment

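Used when back-door adjustment is not possible (for example, when the confounder of $X$ and $Y$ is unobserved) but a mediator $M$ satisfying the front-door criterion is available:

$$P(Y \mid do(X = x)) = \sum_m P(M = m \mid X = x) \sum_{x'} P(Y \mid M = m, X = x')\, P(X = x')$$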
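
The formula can be checked numerically on a toy model with a hidden confounder. Everything below is an assumption made for illustration: the graph U -> X, U -> Y, X -> M, M -> Y, the probability tables, and the helper names `bern`, `front_door`, and `true_effect`. The front-door estimate uses only the observed joint over (X, M, Y); the ground truth uses U and is available only because the model is written by hand.

```python
from itertools import product

# Hypothetical model, all variables binary; U is never observed by the analyst.
p_u1 = 0.4
p_x1_given_u = {0: 0.3, 1: 0.9}   # P(X=1 | U=u)
p_m1_given_x = {0: 0.2, 1: 0.7}   # P(M=1 | X=x)
p_y1_given_mu = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.4, (1, 1): 0.9}  # P(Y=1 | M=m, U=u)

def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1 - p_one

# Observed joint P(x, m, y), with U summed out.
observed = {}
for x, m, y in product((0, 1), repeat=3):
    observed[(x, m, y)] = sum(
        bern(p_u1, u) * bern(p_x1_given_u[u], x) * bern(p_m1_given_x[x], m)
        * bern(p_y1_given_mu[(m, u)], y)
        for u in (0, 1)
    )

def front_door(joint, y, x):
    """P(Y=y | do(X=x)) from the observed joint over (x, m, y) via the front-door formula."""
    p_x = {a: sum(p for (xa, _, _), p in joint.items() if xa == a) for a in (0, 1)}
    total = 0.0
    for m in (0, 1):
        p_m_given_x = sum(joint[(x, m, b)] for b in (0, 1)) / p_x[x]
        # Inner sum over x': P(y | m, x') P(x')
        inner = sum(
            joint[(a, m, y)] / sum(joint[(a, m, b)] for b in (0, 1)) * p_x[a]
            for a in (0, 1)
        )
        total += p_m_given_x * inner
    return total

def true_effect(y, x):
    """Ground-truth P(Y=y | do(X=x)), computed with the hidden U."""
    return sum(
        bern(p_m1_given_x[x], m) * bern(p_u1, u) * bern(p_y1_given_mu[(m, u)], y)
        for m in (0, 1) for u in (0, 1)
    )

print(f"front-door estimate: {front_door(observed, 1, 1):.3f}")
print(f"ground truth:        {true_effect(1, 1):.3f}")
```

In this model both values equal 0.528, even though the estimate never touches U.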

Example

DoWhy Implementation

from dowhy import CausalModel

# `data` is assumed to be a pandas DataFrame with the columns
# 'cost', 'quality', 'price', and 'demand'.

model = CausalModel(
    data=data,
    treatment='price',
    outcome='demand',
    graph="""digraph {
        cost -> price;
        quality -> price; quality -> demand;
        price -> demand;
    }"""
)

# Identify the intervention effect
identified = model.identify_effect()
print(identified.get_backdoor_variables())  # ['quality']

# Estimate
estimate = model.estimate_effect(
    identified,
    method_name="backdoor.linear_regression"
)
print(f"Slope of E[demand | do(price)]: {estimate.value:.3f}")
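
To see roughly what a regression-based back-door estimate amounts to, the sketch below adjusts for quality by hand on simulated data. It is not DoWhy's internal code: the linear structural equations, the true slope of -2.0, and the helper names `slope_intercept` and `residuals` are assumptions chosen for illustration. Quality is partialled out of both price and demand (the Frisch-Waugh-Lovell approach), which gives the same price coefficient as regressing demand on price and quality together.

```python
import random

def slope_intercept(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (b, a)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return b, my - b * mx

def residuals(xs, ys):
    """Residuals of ys after a simple linear regression on xs."""
    b, a = slope_intercept(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Assumed linear SCM matching the graph above; the causal slope of price on demand is -2.0.
rng = random.Random(0)
n = 20_000
cost = [rng.gauss(20, 3) for _ in range(n)]
quality = [rng.gauss(0, 1) for _ in range(n)]
price = [c + 5 * q + rng.gauss(0, 2) for c, q in zip(cost, quality)]
demand = [100 - 2.0 * p + 8 * q + rng.gauss(0, 3) for p, q in zip(price, quality)]

# Naive regression of demand on price, ignoring the confounder.
naive_slope, _ = slope_intercept(price, demand)
# Adjusted slope: remove quality from both variables, then regress the residuals.
adjusted_slope, _ = slope_intercept(residuals(quality, price), residuals(quality, demand))

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
```

Under these assumptions the naive slope is pulled toward zero (about -0.95 in expectation) because high quality raises both price and demand, while the adjusted slope should be close to -2.0.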

Causal Effect vs. Conditional Expectation

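# Conditional expectation (observation): E[Y | X = 50]
E_Y_given_X = data.loc[data['price'] == 50, 'demand'].mean()

# Causal effect (intervention) via back-door adjustment over quality:
# E[Y | do(X = 50)] = sum_z E[Y | X = 50, Z = z] * P(Z = z)
weighted_sum = 0.0
covered_weight = 0.0
for quality_level in data['quality'].unique():
    subset = data[(data['price'] == 50) & (data['quality'] == quality_level)]
    weight = (data['quality'] == quality_level).mean()  # P(Z = z)
    if len(subset) > 0:
        weighted_sum += subset['demand'].mean() * weight
        covered_weight += weight

# Strata with no rows at price == 50 violate positivity and cannot be estimated.
# Renormalize over the covered strata so the weights still sum to 1.
E_Y_do_X = weighted_sum / covered_weight if covered_weight > 0 else float('nan')

print(f"E[Y|X=50]: {E_Y_given_X:.2f}")
print(f"E[Y|do(X=50)]: {E_Y_do_X:.2f}")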

Related Concepts

  • SCM - the theoretical foundation of the do-operator
  • Back-door Criterion - the graphical condition under which adjustment identifies the effect of an intervention
  • DAG - representation of causal structure
  • Counterfactual Reasoning - level 3 (counterfactual) queries
  • Potential Outcomes - alternative causal framework
  • Intervention Types - Perfect, Soft, Unknown interventions
  • Interventional Discovery Overview - causal discovery using interventions

References

  • Pearl, J. (2009). Causality: Models, Reasoning, and Inference.
  • Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal Inference in Statistics: A Primer.
  • Comprehensive Personalized Pricing Guide, Part IV, §11.2
