Tae Hyun Kim (Lowell)

Back-door Criterion

4 min read #causal-inference #scm #dag

Definition

The Back-door Criterion (Pearl, 1993) is a graphical criterion for identifying a causal effect from observational data. It determines whether a set of variables $Z$ is sufficient to identify the causal effect $X \rightarrow Y$.

Formal Definition:

A set of variables $Z$ satisfies the back-door criterion relative to $(X, Y)$ if:

  1. No variable in $Z$ is a descendant of $X$
  2. $Z$ blocks every back-door path connecting $X$ and $Y$

Back-door Path

Back-door path: a path from $X$ to $Y$ that begins with an arrow pointing into $X$

X ← ... → Y   (back-door path)
X → ... → Y   (front-door path, causal path)

Example:

    Z
   ↙ ↘
  X   Y
  • Back-door path: $X \leftarrow Z \rightarrow Y$
  • This path transmits non-causal association

Intuitive Understanding

Core idea:

Causal effect = Total association - Spurious association (via back-door)

Total association between X and Y:
  1. Causal path: X → ... → Y
  2. Back-door paths: X ← ... → Y (spurious)

Back-door criterion: block 2 to leave only 1

Analogy:

  • Front door: the path through which X affects Y (causal)
  • Back door: the path through which X and Y are connected via a common cause (non-causal)
  • Close all the back doors → the causal effect becomes identifiable

Back-door Adjustment Formula

If $Z$ satisfies the back-door criterion:

$$P(Y|do(X=x)) = \sum_z P(Y|X=x, Z=z) \cdot P(Z=z)$$

Or, for the continuous case:

$$E[Y|do(X=x)] = \int E[Y|X=x, Z=z] \cdot p(z) \, dz$$

Meaning:

  • Conditioning on $Z$ and then averaging over the marginal distribution of $Z$ (not its distribution given $X$) yields the causal effect
  • The interventional distribution can be computed from observational data
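
As a minimal sketch of the discrete formula, the block below builds the observational joint for the simple graph Z → X, Z → Y, X → Y and compares plain conditioning with back-door adjustment. All probabilities and the names `joint`, `prob`, `naive`, and `backdoor_adjusted` are made up for illustration; only the formula comes from the text above.

```python
from itertools import product

# Hypothetical numbers for the graph Z -> X, Z -> Y, X -> Y (all variables binary).
P_Z = {0: 0.5, 1: 0.5}                        # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}               # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,    # P(Y=1 | X=x, Z=z), keyed by (x, z)
                 (1, 0): 0.3, (1, 1): 0.7}

NAMES = ("x", "y", "z")


def joint():
    """Observational joint P(x, y, z) implied by the numbers above."""
    table = {}
    for x, y, z in product((0, 1), repeat=3):
        px = P_X1_GIVEN_Z[z] if x else 1 - P_X1_GIVEN_Z[z]
        py = P_Y1_GIVEN_XZ[(x, z)] if y else 1 - P_Y1_GIVEN_XZ[(x, z)]
        table[(x, y, z)] = P_Z[z] * px * py
    return table


def prob(table, **fixed):
    """Sum the joint over all cells that match the fixed values, e.g. prob(t, x=1, z=0)."""
    return sum(p for cell, p in table.items()
               if all(cell[NAMES.index(k)] == v for k, v in fixed.items()))


def naive(table, x):
    """P(Y=1 | X=x): plain conditioning, still confounded by Z."""
    return prob(table, x=x, y=1) / prob(table, x=x)


def backdoor_adjusted(table, x):
    """sum_z P(Y=1 | X=x, Z=z) * P(Z=z), using only the observational joint."""
    total = 0.0
    for z in (0, 1):
        p_y_given_xz = prob(table, x=x, y=1, z=z) / prob(table, x=x, z=z)
        total += p_y_given_xz * prob(table, z=z)
    return total


t = joint()
naive_diff = naive(t, 1) - naive(t, 0)
adjusted_diff = backdoor_adjusted(t, 1) - backdoor_adjusted(t, 0)
print(f"naive: {naive_diff:.2f}, adjusted: {adjusted_diff:.2f}")
```

With these made-up numbers the naive contrast is 0.44, while the adjusted contrast is 0.20, which matches the effect of X built into `P_Y1_GIVEN_XZ` (0.3 − 0.1 = 0.7 − 0.5 = 0.2).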

Algorithm: Finding Adjustment Sets

Step 1: Enumerate All Back-door Paths

Starting from $X$, follow the arrows pointing into $X$ and trace every path that reaches $Y$
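
Step 1 can be sketched as a depth-first walk. The representation (a DAG as a set of `(parent, child)` pairs) and the function name `backdoor_paths` are assumptions made for this illustration, not part of any library.

```python
def backdoor_paths(edges, x, y):
    """Return every simple path from x to y whose first edge points into x.

    edges: iterable of (parent, child) pairs describing a DAG.
    Each path is a list of nodes. Arrow directions are ignored while walking,
    except for the first step, which must traverse an edge parent -> x backwards.
    """
    edges = set(edges)
    neighbours = {}
    for parent, child in edges:
        neighbours.setdefault(parent, set()).add(child)
        neighbours.setdefault(child, set()).add(parent)

    paths = []

    def walk(path):
        node = path[-1]
        if node == y:
            paths.append(list(path))
            return
        for nxt in sorted(neighbours.get(node, ())):
            if nxt in path:
                continue  # keep paths simple: never revisit a node
            if len(path) == 1 and (nxt, x) not in edges:
                continue  # the first step must be an arrow INTO x
            walk(path + [nxt])

    walk([x])
    return paths


# Graph with a collider on the back-door path: Z1 -> C <- Z2, Z1 -> X, Z2 -> Y, X -> Y
m_graph = {("Z1", "C"), ("Z2", "C"), ("Z1", "X"), ("Z2", "Y"), ("X", "Y")}
print(backdoor_paths(m_graph, "X", "Y"))
```

The causal path X → Y is skipped because its first edge leaves X; only X ← Z1 → C ← Z2 → Y is returned for this graph.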

Step 2: Determine How to Block Each Path

  • Fork (X ← Z → Y): conditioning on Z blocks it
  • Chain (… → Z → …): conditioning on Z blocks it
  • Collider (… → Z ← …): already blocked, as long as neither Z nor any descendant of Z is conditioned on
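
The three rules can be written as a small check on a single path. This is a sketch under the same assumed representation (a DAG as `(parent, child)` pairs, a path as a list of nodes); `descendants` and `is_blocked` are hypothetical helper names.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows forward."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_blocked(edges, path, z):
    """True if conditioning on the set z blocks the given path."""
    edges, z = set(edges), set(z)
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, mid) in edges and (nxt, mid) in edges
        if is_collider:
            # A collider blocks unless it, or one of its descendants, is in z.
            if mid not in z and not (descendants(edges, mid) & z):
                return True
        elif mid in z:
            # Forks and chains are blocked by conditioning on the middle node.
            return True
    return False


m_graph = {("Z1", "C"), ("Z2", "C"), ("Z1", "X"), ("Z2", "Y"), ("X", "Y")}
path = ["X", "Z1", "C", "Z2", "Y"]
print(is_blocked(m_graph, path, set()))        # collider C left alone
print(is_blocked(m_graph, path, {"C"}))        # conditioning on C
print(is_blocked(m_graph, path, {"C", "Z1"}))  # C plus the fork node Z1
```

A path is blocked as soon as one of its interior nodes blocks it, which is why the function returns on the first blocking node it finds.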

Step 3: Select the Adjustment Set

  • A set of variables that blocks all back-door paths
  • Does not include any descendant of $X$
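
For small graphs, Step 3 can be brute-forced over all candidate sets. The sketch below does not enumerate paths; it swaps in an equivalent check: $Z$ contains no descendant of $X$, and $Z$ d-separates $X$ from $Y$ once the arrows leaving $X$ are removed. d-separation is tested with the moral ancestral graph method. All function names are hypothetical, and the exhaustive search is exponential in the number of variables.

```python
from itertools import combinations


def reachable(edges, start, forward=True):
    """Nodes reachable from any node in `start`, following arrows forward
    (descendants) or backward (ancestors)."""
    step = {}
    for parent, child in edges:
        src, dst = (parent, child) if forward else (child, parent)
        step.setdefault(src, set()).add(dst)
    seen, stack = set(), list(start)
    while stack:
        for node in step.get(stack.pop(), ()):
            if node not in seen:
                seen.add(node)
                stack.append(node)
    return seen


def d_separated(edges, x, y, z):
    """Moral-ancestral-graph test: is x independent of y given z in the DAG?"""
    z = set(z)
    keep = {x, y} | z
    keep |= reachable(edges, keep, forward=False)   # smallest ancestral set
    links = {node: set() for node in keep}
    parents = {}
    for a, b in edges:
        if a in keep and b in keep:
            links[a].add(b)
            links[b].add(a)
            parents.setdefault(b, set()).add(a)
    for group in parents.values():                  # "marry" parents of a common child
        for p, q in combinations(sorted(group), 2):
            links[p].add(q)
            links[q].add(p)
    seen, stack = {x}, [x]
    while stack:                                    # undirected search that never enters z
        for node in links[stack.pop()]:
            if node not in seen and node not in z:
                seen.add(node)
                stack.append(node)
    return y not in seen


def satisfies_backdoor(edges, x, y, z):
    edges = set(edges)
    if set(z) & reachable(edges, [x]):
        return False                                # condition 1: no descendant of X
    no_out_of_x = {(a, b) for a, b in edges if a != x}
    return d_separated(no_out_of_x, x, y, z)        # condition 2: back-door paths blocked


def adjustment_sets(edges, x, y):
    """Every set of non-descendants of x that satisfies the back-door criterion."""
    edges = set(edges)
    nodes = {n for e in edges for n in e} - {x, y}
    candidates = sorted(nodes - reachable(edges, [x]))
    return [set(combo)
            for size in range(len(candidates) + 1)
            for combo in combinations(candidates, size)
            if satisfies_backdoor(edges, x, y, combo)]


m_graph = {("Z1", "C"), ("Z2", "C"), ("Z1", "X"), ("Z2", "Y"), ("X", "Y")}
print(adjustment_sets(m_graph, "X", "Y"))
```

On the collider graph used here, the empty set is accepted and `{"C"}` alone is rejected, while `{"C", "Z1"}` is accepted because Z1 re-blocks the path that C opens.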

Examples

Example 1: Simple Confounding

    Z
   ↙ ↘
  X → Y

Back-door path: $X \leftarrow Z \rightarrow Y$

Adjustment set: $\{Z\}$

Formula: $E[Y|do(X=x)] = \sum_z E[Y|X=x, Z=z] \cdot P(Z=z)$

Example 2: Multiple Confounders

  Z1    Z2
   ↘  ↙  ↘
    X  →  Y

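Back-door paths:

  1. $X \leftarrow Z1 \rightarrow Y$ (none, if there is no direct Z1→Y; the graph as drawn has no such edge)
  2. $X \leftarrow Z2 \rightarrow Y$

Adjustment set: $\{Z2\}$ is sufficient for the graph as drawn, and $\{Z1, Z2\}$ also satisfies the criterion. If a direct Z1→Y edge is present, both paths must be blocked and $\{Z1, Z2\}$ is required.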

Example 3: Collider on Back-door Path

  Z1 → C ← Z2
   ↓       ↓
   X   →   Y

Back-door path: $X \leftarrow Z1 \rightarrow C \leftarrow Z2 \rightarrow Y$

  • C is a collider → the path is already blocked!
  • Adjustment set: $\emptyset$ (no control is needed)

Caution: controlling for C opens the path → bias arises
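
The bias from conditioning on C can be checked numerically. This sketch uses made-up probabilities for the graph above (independent fair coins Z1 and Z2, and C defined as "Z1 or Z2"); the built-in effect of X on Y is 0.2, and all function names are hypothetical.

```python
from itertools import product

NAMES = ("z1", "z2", "c", "x", "y")


def joint():
    """Joint over (z1, z2, c, x, y) for Z1 -> C <- Z2, Z1 -> X, Z2 -> Y, X -> Y."""
    table = {}
    for z1, z2, x, y in product((0, 1), repeat=4):
        c = int(z1 or z2)                     # collider: C = Z1 OR Z2 (assumption)
        p_x1 = 0.8 if z1 else 0.2             # P(X=1 | Z1), made-up numbers
        p_y1 = 0.1 + 0.2 * x + 0.5 * z2       # P(Y=1 | X, Z2): effect of X is 0.2
        p = 0.25 * (p_x1 if x else 1 - p_x1) * (p_y1 if y else 1 - p_y1)
        table[(z1, z2, c, x, y)] = p
    return table


def prob(table, **fixed):
    """Sum the joint over all cells that match the fixed values."""
    return sum(p for cell, p in table.items()
               if all(cell[NAMES.index(k)] == v for k, v in fixed.items()))


def effect_unadjusted(table):
    """P(Y=1 | X=1) - P(Y=1 | X=0): the empty adjustment set."""
    p1 = prob(table, x=1, y=1) / prob(table, x=1)
    p0 = prob(table, x=0, y=1) / prob(table, x=0)
    return p1 - p0


def effect_adjusting_for_c(table):
    """Adjustment formula applied with Z = {C}, which the criterion forbids here."""
    total = 0.0
    for c in (0, 1):
        p1 = prob(table, x=1, y=1, c=c) / prob(table, x=1, c=c)
        p0 = prob(table, x=0, y=1, c=c) / prob(table, x=0, c=c)
        total += (p1 - p0) * prob(table, c=c)
    return total


t = joint()
print(f"unadjusted: {effect_unadjusted(t):.3f}")
print(f"adjusting for C: {effect_adjusting_for_c(t):.3f}")
```

With these numbers the unadjusted contrast recovers 0.200, while adjusting for C gives roughly 0.096: conditioning on the collider makes Z1 and Z2 dependent, which reopens the back-door path.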

Example 4: Mediator

  Z
  ↓
  X → M → Y

Causal path: $X \rightarrow M \rightarrow Y$

Back-door path: none (Z affects only X)

Adjusting for $\{Z\}$ is optional (since there is no back-door path)

Caution: do NOT adjust for M (M is a descendant of X on the front-door path, so adjusting for it would block the causal effect itself)
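
A small numeric check of this caution, with made-up probabilities for X → M → Y. Z is left out because it only affects X; X is simply given a marginal of 0.5. The function names are hypothetical.

```python
from itertools import product

P_M1_GIVEN_X = {0: 0.2, 1: 0.9}   # P(M=1 | X=x), made-up numbers
P_Y1_GIVEN_M = {0: 0.1, 1: 0.6}   # P(Y=1 | M=m), made-up numbers

NAMES = ("x", "m", "y")


def joint():
    """Joint over (x, m, y) for the chain X -> M -> Y with P(X=1) = 0.5."""
    table = {}
    for x, m, y in product((0, 1), repeat=3):
        pm = P_M1_GIVEN_X[x] if m else 1 - P_M1_GIVEN_X[x]
        py = P_Y1_GIVEN_M[m] if y else 1 - P_Y1_GIVEN_M[m]
        table[(x, m, y)] = 0.5 * pm * py
    return table


def prob(table, **fixed):
    """Sum the joint over all cells that match the fixed values."""
    return sum(p for cell, p in table.items()
               if all(cell[NAMES.index(k)] == v for k, v in fixed.items()))


def total_effect(table):
    """P(Y=1 | X=1) - P(Y=1 | X=0); valid here because there is no back-door path."""
    return (prob(table, x=1, y=1) / prob(table, x=1)
            - prob(table, x=0, y=1) / prob(table, x=0))


def effect_adjusting_for_m(table):
    """Adjustment formula misapplied with Z = {M}, the mediator."""
    total = 0.0
    for m in (0, 1):
        p1 = prob(table, x=1, y=1, m=m) / prob(table, x=1, m=m)
        p0 = prob(table, x=0, y=1, m=m) / prob(table, x=0, m=m)
        total += (p1 - p0) * prob(table, m=m)
    return total


t = joint()
print(f"total effect: {total_effect(t):.2f}")
print(f"after adjusting for M: {effect_adjusting_for_m(t):.2f}")
```

The total effect is 0.35 with these numbers, but it shrinks to zero (up to floating-point error) once M is held fixed, because all of X's influence on Y flows through M.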

Sufficient vs Minimal Adjustment Sets

Sufficient Adjustment Set

  • Any set that satisfies the back-door criterion

Minimal Adjustment Set

  • A sufficient set such that no proper subset of it is also sufficient
  • Contains no unnecessary variables

Trade-off:

  • More variables: more robust (guards against omitted confounding)
  • Fewer variables: more efficient (lower variance)
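
Given a complete list of sufficient sets, the minimal ones can be picked out by discarding any set that has a sufficient proper subset. A minimal sketch; `minimal_sets` is a hypothetical helper, and the example list is written by hand for a graph where Z is the only confounder and W is an extra pre-treatment variable.

```python
def minimal_sets(sufficient):
    """Keep only the sets that have no proper subset in the list.

    Assumes `sufficient` lists ALL sufficient adjustment sets; otherwise a
    set may look minimal only because its smaller subset was never listed.
    """
    sets = [frozenset(s) for s in sufficient]
    return [s for s in sets if not any(other < s for other in sets)]


# Hand-written example: {Z} and {Z, W} both satisfy the criterion.
sufficient = [{"Z"}, {"Z", "W"}]
print(minimal_sets(sufficient))
```

Here only `{"Z"}` survives; `{"Z", "W"}` is sufficient but not minimal, which is the robustness-versus-efficiency trade-off described above.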

Limitations

  1. Dependence on DAG correctness: if the DAG is wrong, the conclusion is wrong
  2. Unmeasured confounders: if unmeasured variables exist, they cannot be blocked
  3. Sufficient but not necessary: the back-door criterion is a sufficient, not a necessary, condition

Front-door Criterion

X → M → Y
X ← U → Y   (U: unobserved confounder)
  • An alternative when back-door paths cannot be blocked
  • Identification that exploits a mediator

Instrumental Variables

  • An alternative when back-door paths cannot be blocked
  • Exploits the Instrument → X → Y structure

Related Concepts

  • DAG - Visualizing causal structure
  • Confounder - Creates back-door paths
  • Collider - Blocks back-door paths
  • d-separation - Conditional independence in a DAG
  • Propensity Score - Implements back-door adjustment
  • Unconfoundedness - No unmeasured confounders

References

  • Pearl, J. (1993). Comment: Graphical models, causality and intervention
  • Pearl, J. (2009). Causality: Models, Reasoning, and Inference
  • rohrerThinkingClearlyCorrelations - Introduces the back-door criterion
