Tae Hyun Kim (Lowell)

DAG (Directed Acyclic Graph)

#causal-inference #scm #dag

Definition

A DAG (Directed Acyclic Graph) is a graph that visually represents the causal relationships among variables. It is a core tool in causal inference for understanding confounding structure and choosing an identification strategy.

Components:

  • Node: represents a variable
  • Directed Edge (arrow): represents a direct causal effect ($A \rightarrow B$ means “A affects B”)
  • Acyclic: following the arrows never leads from a variable back to itself, so no variable can cause itself (no cycles)

Key features:

  • Non-parametric: an arrow can represent any functional form (linear, nonlinear, etc.)
  • Qualitative: represents only the existence of an effect, not its magnitude
  • Requires domain knowledge: a DAG cannot be constructed from data alone
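
The components above can be sketched in code. This is a minimal illustration, assuming a DAG is stored as a list of (parent, child) pairs; the function name `is_acyclic` and the example graphs are hypothetical and not taken from any library. It checks the "acyclic" requirement with Kahn's algorithm (repeatedly removing nodes that have no incoming arrow).

```python
from collections import deque


def is_acyclic(edges):
    """Return True if the directed graph given as (parent, child) pairs has no cycle."""
    nodes = {node for edge in edges for node in edge}
    children = {node: [] for node in nodes}
    indegree = {node: 0 for node in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no incoming arrow.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Every node can be removed only when there is no directed cycle.
    return removed == len(nodes)


# Hypothetical example graphs.
dag = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
cyclic = dag + [("Y", "Z")]  # adds the cycle Z -> Y -> Z

print(is_acyclic(dag))     # True
print(is_acyclic(cyclic))  # False
```

Note that the edge list carries no coefficients or functional forms, which mirrors the non-parametric, qualitative nature of a DAG.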

Three Elementary Structures

The three basic structures that determine how association is transmitted in a DAG:

1. Chain

A → B → C
  • Meaning: A indirectly affects C through B
  • Association: transmits causal association between A and C
  • Conditioning: conditioning on B blocks the A-C association

2. Fork

A ← B → C
  • Meaning: B is the common cause of both A and C (a confounder)
  • Association: transmits a non-causal (spurious) association between A and C
  • Conditioning: conditioning on B blocks the A-C spurious association

3. Inverted Fork / Collider

A → B ← C
  • Meaning: B is the effect of both A and C (a Collider)
  • Association: the path transmits no association between A and C (blocked by default)
  • Conditioning: conditioning on B creates a spurious association between A and C (collider bias)
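
The behavior of the three structures can be checked with a small simulation. This is a sketch under strong assumptions: every arrow is a linear effect with coefficient 1, all noise is standard normal, and "conditioning on B" is approximated by the partial correlation given B. The helper names (`corr`, `partial_corr`, `simulate`) are illustrative only.

```python
import random


def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def partial_corr(x, y, z):
    """Correlation of x and y after linearly adjusting both for z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5


def simulate(structure, n=20000, seed=0):
    """Draw samples of (A, B, C) from a linear-Gaussian version of one structure."""
    rng = random.Random(seed)
    a, b, c = [], [], []
    for _ in range(n):
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        if structure == "chain":       # A -> B -> C
            ai = e1
            bi = ai + e2
            ci = bi + e3
        elif structure == "fork":      # A <- B -> C
            bi = e1
            ai = bi + e2
            ci = bi + e3
        elif structure == "collider":  # A -> B <- C
            ai = e1
            ci = e2
            bi = ai + ci + e3
        else:
            raise ValueError(structure)
        a.append(ai)
        b.append(bi)
        c.append(ci)
    return a, b, c


results = {}
for structure in ("chain", "fork", "collider"):
    a, b, c = simulate(structure)
    results[structure] = (corr(a, c), partial_corr(a, c, b))
    marginal, conditional = results[structure]
    print(f"{structure:9s} corr(A,C)={marginal:+.2f}  corr(A,C | B)={conditional:+.2f}")
```

Under these assumptions, the chain and the fork should show a clearly positive correlation between A and C that drops to roughly zero given B, while the collider should show roughly zero correlation that turns clearly negative given B.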

Path and Association

Types of Paths

  1. Causal Path: a path that follows the direction of the arrows (A → B → C)
  2. Non-causal Path: a path that contains a segment going against the direction of the arrows

Rules for Transmitting Association

  • A path transmits association unless it is blocked
  • Blocking conditions:
    • There is a collider on the path and neither that collider nor any of its descendants is conditioned on
    • There is a non-collider on the path and that variable is conditioned on
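
The blocking rules can be written as a short function. This is a sketch, assuming the DAG is a list of (parent, child) pairs and a path is a list of node names; `descendants` and `is_blocked` are hypothetical helpers. Following the usual d-separation rule, the sketch also treats conditioning on a descendant of a collider as opening the path.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def is_blocked(edges, path, conditioned):
    """True if the path transmits no association given the conditioning set."""
    edge_set = set(edges)
    conditioned = set(conditioned)
    # Inspect every interior node together with its two neighbours on the path.
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, mid) in edge_set and (nxt, mid) in edge_set
        if is_collider:
            opened = mid in conditioned or bool(descendants(edges, mid) & conditioned)
            if not opened:
                return True   # unconditioned collider blocks the path
        elif mid in conditioned:
            return True       # conditioned non-collider blocks the path
    return False


chain = [("A", "B"), ("B", "C")]
fork = [("B", "A"), ("B", "C")]
collider = [("A", "B"), ("C", "B"), ("B", "D")]  # D is a descendant of the collider B
path = ["A", "B", "C"]

print(is_blocked(chain, path, []))        # False
print(is_blocked(chain, path, ["B"]))     # True
print(is_blocked(fork, path, ["B"]))      # True
print(is_blocked(collider, path, []))     # True
print(is_blocked(collider, path, ["B"]))  # False
print(is_blocked(collider, path, ["D"]))  # False
```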

Back-door Criterion

Back-door Criterion (Pearl, 1993): a condition for identifying a causal effect

Definition: To identify the causal effect $X \rightarrow Y$, choose a set of variables to condition on such that:

  1. The set blocks every path between $X$ and $Y$ that begins with an arrow pointing into $X$ (back-door path)
  2. The set contains no descendant of $X$

Example:

    Z
   ↙ ↘
  X   Y
  • $Z$ is a confounder: $X \leftarrow Z \rightarrow Y$ (back-door path)
  • Conditioning on $Z$ makes the causal effect identifiable
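
To make the example concrete, the sketch below compares a naive comparison with the standard back-door adjustment, which averages the stratum-specific outcome probabilities over the distribution of Z. All numbers are hypothetical and chosen only for illustration: X, Y, and Z are binary, and X raises the probability of Y by 0.2 in every stratum of Z. The graph also includes the arrow X → Y, so there is an effect to recover.

```python
# Hypothetical binary model for the graph Z -> X, Z -> Y, X -> Y.
p_z = {0: 0.5, 1: 0.5}           # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.8}  # P(X = 1 | Z = z)


def p_y1(x, z):
    """P(Y = 1 | X = x, Z = z); the effect of X is +0.2 in every stratum of Z."""
    return 0.1 + 0.2 * x + 0.5 * z


def p_x_given_z(x, z):
    """P(X = x | Z = z)."""
    return p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]


def naive(x):
    """P(Y = 1 | X = x): conditions on X only, so the back-door path stays open."""
    p_x = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    joint = sum(p_y1(x, z) * p_x_given_z(x, z) * p_z[z] for z in p_z)
    return joint / p_x


def adjusted(x):
    """Back-door adjustment: sum over z of P(Y = 1 | X = x, Z = z) * P(Z = z)."""
    return sum(p_y1(x, z) * p_z[z] for z in p_z)


naive_diff = naive(1) - naive(0)
adjusted_diff = adjusted(1) - adjusted(0)
print(round(naive_diff, 3), round(adjusted_diff, 3))  # prints 0.5 0.2
```

The naive difference (0.5) mixes the effect of X with the association carried by the back-door path through Z; the adjusted difference recovers the 0.2 built into the hypothetical model.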

Guide to Drawing a DAG

Variables to Include

  1. Treatment (independent variable)
  2. Outcome (dependent variable)
  3. Confounders (common causes)
  4. Mediators (treatment → mediator → outcome)
  5. Colliders (treatment → collider ← outcome)
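
As a rough aid when listing variables, the roles above can be read off a drawn DAG. The sketch below is deliberately simplified: it only looks at direct arrows, matching the three patterns in the list, and would miss confounders or mediators that act through longer paths. The function name `classify` and the node names are hypothetical.

```python
def classify(edges, treatment, outcome):
    """Label each other variable by its direct arrows to the treatment and outcome."""
    edge_set = set(edges)
    others = {node for edge in edges for node in edge} - {treatment, outcome}
    roles = {}
    for v in sorted(others):
        if (v, treatment) in edge_set and (v, outcome) in edge_set:
            roles[v] = "confounder"   # treatment <- v -> outcome
        elif (treatment, v) in edge_set and (v, outcome) in edge_set:
            roles[v] = "mediator"     # treatment -> v -> outcome
        elif (treatment, v) in edge_set and (outcome, v) in edge_set:
            roles[v] = "collider"     # treatment -> v <- outcome
        else:
            roles[v] = "other"
    return roles


# Hypothetical DAG with treatment T and outcome Y.
edges = [
    ("Z", "T"), ("Z", "Y"),  # Z is a common cause
    ("T", "M"), ("M", "Y"),  # M lies on the causal pathway
    ("T", "C"), ("Y", "C"),  # C is a common effect
    ("T", "Y"),
]
roles = classify(edges, "T", "Y")
print(roles)  # {'C': 'collider', 'M': 'mediator', 'Z': 'confounder'}
```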

Cautions

  • Include all relevant variables, in particular every common cause of two variables already in the DAG
  • Arrow direction follows causal direction (consider temporal order)
  • Mark unmeasured variables too (dashed line or U)

Example: Education and Income

      Intelligence
       ↙        ↘
  Education  →  Income
  • Back-door path: Education ← Intelligence → Income
  • Solution: condition on Intelligence to block the back-door path
  • Causal effect: Education → Income becomes identifiable
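
The example can be illustrated with simulated data. This is a sketch under stated assumptions, not an analysis of real data: all relationships are linear, noise is standard normal, Intelligence is fully measured, and the coefficients (including a "true" effect of 2.0 for Education on Income) are hypothetical. Conditioning on Intelligence is implemented by regressing it out of both Education and Income and then regressing the residuals on each other.

```python
import random


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


rng = random.Random(42)
n = 20000
TRUE_EFFECT = 2.0  # hypothetical effect of Education on Income

intelligence = [rng.gauss(0, 1) for _ in range(n)]
education = [1.0 * i + rng.gauss(0, 1) for i in intelligence]
income = [TRUE_EFFECT * e + 3.0 * i + rng.gauss(0, 1)
          for e, i in zip(education, intelligence)]

# Naive estimate: the back-door path through Intelligence is left open.
naive = slope(education, income)

# Adjusted estimate: remove the part of each variable explained by Intelligence.
adjusted = slope(residuals(intelligence, education),
                 residuals(intelligence, income))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

With these coefficients the naive slope should land well above 2.0 (about 3.5 in expectation), while the adjusted slope should be close to the hypothetical true effect.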

DAG vs SEM

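| Aspect        | DAG                  | SEM                              |
|---------------|----------------------|----------------------------------|
| Parametric    | No (qualitative)     | Yes (functional form specified)  |
| Focus         | Identification       | Estimation                       |
| Arrow meaning | Any causal effect    | Specific functional relationship |
| Use           | Conceptual reasoning | Statistical modeling             |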

Limitations

  1. Untestable assumptions: whether the DAG is correct cannot be verified from data
  2. Complexity: a real-world DAG quickly becomes complicated
  3. Temporal dynamics: a static DAG struggles to represent feedback loops
  4. Unmeasured variables: measuring all relevant variables is difficult

Related Concepts

  • Confounder - Common cause, creates back-door paths
  • Collider - Common effect, creates bias when conditioned on
  • Mediator - A variable on the causal pathway
  • Back-door Criterion - Conditions for causal effect identification
  • d-separation - Rules for determining independence in a DAG
  • SCM - Structural Causal Model
  • Propensity Score - A method for adjusting for confounding

References

  • Pearl, J. (2009). Causality: Models, Reasoning, and Inference
  • Rohrer, J. M. (2018). Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data - Psychological applications of DAGs
