SCM (Structural Causal Model)
Definition
An SCM (Structural Causal Model) is a framework for mathematically expressing the causal relationships among variables. It is the core of Pearl’s causal inference framework.
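Formal Definition: an SCM is a tuple $M = \langle V, U, F, P(U) \rangle$.
| Component | Meaning |
|---|---|
| $V = \{V_1, \dots, V_n\}$ | Endogenous variables (observable variables) |
| $U = \{U_1, \dots, U_n\}$ | Exogenous variables (unobserved) |
| $F = \{f_1, \dots, f_n\}$ | Structural equations: $V_i := f_i(\mathrm{PA}_i, U_i)$ |
| $P(U)$ | Distribution over the exogenous variables |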
Structural Equation:
$$V_i := f_i(\mathrm{PA}_i, U_i)$$
Each variable $V_i$ is determined by its parent variables $\mathrm{PA}_i$ and its exogenous noise $U_i$.
Example
Causal Graph G: X → Y → Z
SCM M:
- V = {X, Y, Z}
- U = {U_X, U_Y, U_Z}
- F = { X := U_X, Y := f(X) + U_Y, Z := g(Y) + U_Z }
- P = { U_X ~ N(0,1), U_Y ~ N(0,1), U_Z ~ N(0,1) }
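Joint Distribution (Markov factorization):
$$P(X, Y, Z) = P(X)\,P(Y \mid X)\,P(Z \mid Y)$$
In general, $P(V_1, \dots, V_n) = \prod_i P(V_i \mid \mathrm{PA}_i)$.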
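The example SCM can be sampled by evaluating the structural equations in topological order. The following is a minimal sketch, assuming f(x) = 2x and g(y) = y (hypothetical choices, since the example leaves f and g unspecified). It also checks one consequence of the factorization for the chain X → Y → Z: X and Z are correlated marginally but approximately uncorrelated once Y is regressed out.

```python
import numpy as np

def sample_chain_scm(n, f=lambda x: 2.0 * x, g=lambda y: y, seed=0):
    """Draw n samples from X -> Y -> Z with standard normal noise.

    f and g are illustrative assumptions, not part of the original example.
    """
    rng = np.random.default_rng(seed)
    u_x, u_y, u_z = rng.standard_normal((3, n))  # exogenous variables
    x = u_x              # X := U_X
    y = f(x) + u_y       # Y := f(X) + U_Y
    z = g(y) + u_z       # Z := g(Y) + U_Z
    return x, y, z

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing c out of both."""
    res_a = a - np.polyval(np.polyfit(c, a, 1), c)
    res_b = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(res_a, res_b)[0, 1]

x, y, z = sample_chain_scm(100_000)
print("corr(X, Z)      =", round(np.corrcoef(x, z)[0, 1], 3))
print("pcorr(X, Z | Y) =", round(partial_corr(x, z, y), 3))
```

With these linear choices the marginal correlation is about 2/√6, while the partial correlation given Y is close to zero, as the factor P(Z | Y) suggests.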
SCM vs DAG
| Aspect | DAG | SCM |
|---|---|---|
| Representation | Qualitative (presence of arrows) | Quantitative (functional form) |
| Information content | Causal direction only | Causal direction + functional relationships |
| Use | Identification | Identification + Estimation |
| Intervention | Conceptual representation | Mathematically manipulable |
Causal Edge Assumption
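The value of each variable is fully determined by its parent variables and its own exogenous term:
$$V_i := f_i(\mathrm{PA}_i, U_i)$$
Meaning, when the exogenous terms $U_i$ are mutually independent:
- Causal sufficiency: all common causes are observed
- No unmeasured confounders (in the ideal case)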
Intervention (do-operator)
Definition
$do(X = x)$: forcibly set the variable $X$ to the value $x$
Mathematical manipulation:
- Replace the structural equation of $X$ with $X := x$
- Keep the other equations unchanged
Example
Original SCM:
X := U_X
Y := 2X + U_Y
Z := Y + U_Z
After $do(X = 3)$:
X := 3 ← Changed!
Y := 2X + U_Y ← Uses X = 3
Z := Y + U_Z
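The intervention above can be simulated by swapping out the equation for X and leaving the rest of the model untouched. This is a minimal sketch; the function name `sample` and its `do_x` parameter are illustrative, not part of any library.

```python
import numpy as np

def sample(n, do_x=None, seed=0):
    """Sample the SCM X := U_X, Y := 2X + U_Y, Z := Y + U_Z.

    If do_x is given, the equation for X is replaced by X := do_x.
    """
    rng = np.random.default_rng(seed)
    u_x, u_y, u_z = rng.standard_normal((3, n))
    x = u_x if do_x is None else np.full(n, float(do_x))  # only this line changes
    y = 2 * x + u_y   # unchanged equation
    z = y + u_z       # unchanged equation
    return x, y, z

x_obs, y_obs, z_obs = sample(100_000)
x_int, y_int, z_int = sample(100_000, do_x=3)

print("E[Y] observational:", round(y_obs.mean(), 2))
print("E[Y | do(X=3)]    :", round(y_int.mean(), 2))
print("E[Z | do(X=3)]    :", round(z_int.mean(), 2))
```

Under do(X = 3) both Y and Z are centered near 6, because the downstream equations still apply to the forced value of X.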
Interventional vs Observational
- Observational: $P(Y \mid X = x)$, the observational conditional (includes confounding)
- Interventional: $P(Y \mid do(X = x))$, the distribution under causal intervention (removes confounding)
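The gap between the two quantities shows up as soon as a common cause is present. The sketch below uses a hypothetical confounded SCM (W → X, W → Y, X → Y with made-up coefficients, not taken from the examples above) and compares the regression slope of Y on X with the effect obtained by actually intervening on X.

```python
import numpy as np

def sample_confounded(n, do_x=None, seed=0):
    """Hypothetical SCM with a confounder W:
    W := U_W,  X := W + U_X,  Y := 2X + 3W + U_Y.
    """
    rng = np.random.default_rng(seed)
    u_w, u_x, u_y = rng.standard_normal((3, n))
    w = u_w
    x = w + u_x if do_x is None else np.full(n, float(do_x))
    y = 2 * x + 3 * w + u_y
    return w, x, y

n = 200_000

# Observational: slope of E[Y | X = x], which mixes the causal path and the path through W.
_, x_obs, y_obs = sample_confounded(n)
obs_slope = np.polyfit(x_obs, y_obs, 1)[0]

# Interventional: E[Y | do(X=1)] - E[Y | do(X=0)], which isolates the X -> Y path.
_, _, y_do0 = sample_confounded(n, do_x=0)
_, _, y_do1 = sample_confounded(n, do_x=1)
causal_effect = y_do1.mean() - y_do0.mean()

print("observational slope :", round(obs_slope, 2))
print("interventional effect:", round(causal_effect, 2))
```

In this model the observational slope is about 3.5, while the interventional effect equals the structural coefficient 2.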
Causal Effect Identification
From SCM to DAG
Extract a DAG $G$ from an SCM $M$:
- Each variable in $V$ is a node
- $V_j$ appears as an argument of $f_i$ (i.e. $V_j \in \mathrm{PA}_i$) ⟹ edge $V_j \to V_i$
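The extraction is mechanical once each structural equation records its parents. A minimal sketch, assuming the SCM is stored as a dictionary mapping each variable to a `(parents, function)` pair (an illustrative representation, not a standard one); `graphlib` from the Python standard library (3.9+) is used only to confirm that the result is acyclic.

```python
from graphlib import TopologicalSorter

# Illustrative encoding: variable -> (parent names, structural function).
# The last argument of each function is the exogenous term.
scm = {
    "X": ((), lambda u: u),
    "Y": (("X",), lambda x, u: 2 * x + u),
    "Z": (("Y",), lambda y, u: y + u),
}

def extract_dag(scm):
    """Return (nodes, edges): one node per variable, one edge per parent."""
    nodes = list(scm)
    edges = [(parent, child)
             for child, (parents, _) in scm.items()
             for parent in parents]
    return nodes, edges

nodes, edges = extract_dag(scm)

# static_order() raises graphlib.CycleError if the parent relation has a cycle.
predecessors = {v: set(parents) for v, (parents, _) in scm.items()}
order = list(TopologicalSorter(predecessors).static_order())

print("nodes:", nodes)
print("edges:", edges)
print("topological order:", order)
```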
do-calculus
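Pearl’s three rules rewrite expressions containing $do(\cdot)$; when a causal effect is identifiable, repeated application converts the interventional distribution into an expression over observational distributions only. Here $G_{\overline{X}}$ denotes $G$ with all edges into $X$ removed, and $G_{\underline{Z}}$ denotes $G$ with all edges out of $Z$ removed.
Rule 1 (Insertion/deletion of observations): $P(y \mid do(x), z, w) = P(y \mid do(x), w)$ if $(Y \perp\!\!\!\perp Z \mid X, W)$ in $G_{\overline{X}}$
Rule 2 (Action/observation exchange): $P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)$ if $(Y \perp\!\!\!\perp Z \mid X, W)$ in $G_{\overline{X}\,\underline{Z}}$
Rule 3 (Insertion/deletion of actions): $P(y \mid do(x), do(z), w) = P(y \mid do(x), w)$ if $(Y \perp\!\!\!\perp Z \mid X, W)$ in $G_{\overline{X}\,\overline{Z(W)}}$, where $Z(W)$ is the set of $Z$-nodes that are not ancestors of any $W$-node in $G_{\overline{X}}$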
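As a worked illustration of how the rules are applied, the back-door adjustment formula can be derived in three steps. The derivation assumes a set $Z$ that satisfies the back-door criterion relative to $(X, Y)$: no node in $Z$ is a descendant of $X$, and $Z$ blocks every path between $X$ and $Y$ that starts with an arrow into $X$.

$$
\begin{aligned}
P(y \mid do(x)) &= \sum_z P(y \mid do(x), z)\, P(z \mid do(x)) && \text{(law of total probability)} \\
&= \sum_z P(y \mid x, z)\, P(z \mid do(x)) && \text{(Rule 2: } Y \perp\!\!\!\perp X \mid Z \text{ in } G_{\underline{X}}\text{)} \\
&= \sum_z P(y \mid x, z)\, P(z) && \text{(Rule 3: } Z \perp\!\!\!\perp X \text{ in } G_{\overline{X}}\text{)}
\end{aligned}
$$

The final expression contains no $do(\cdot)$ term, so it can be estimated from observational data alone.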
Counterfactuals
An SCM enables counterfactual reasoning:
Counterfactual query: “If X had been x′, what would Y have been?”
Three-step procedure:
- Abduction: Observe evidence $E = e$, infer $P(U \mid E = e)$
- Action: Modify the SCM with $do(X = x')$
- Prediction: Compute $Y_{x'}$ in the modified model
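The three steps can be sketched on the linear SCM used in the intervention example (X := U_X, Y := 2X + U_Y, Z := Y + U_Z). The evidence values below are hypothetical. Because the noise is additive and every variable is observed, abduction recovers the exogenous terms exactly; in general it yields a posterior distribution over U rather than a point value.

```python
def abduct(x_obs, y_obs, z_obs):
    """Step 1 (abduction): solve the structural equations for the noise terms."""
    return {
        "U_X": x_obs,              # from X := U_X
        "U_Y": y_obs - 2 * x_obs,  # from Y := 2X + U_Y
        "U_Z": z_obs - y_obs,      # from Z := Y + U_Z
    }

def predict(u, do_x=None):
    """Steps 2 and 3 (action, prediction): optionally replace X's equation,
    then re-evaluate the model with the inferred noise held fixed."""
    x = u["U_X"] if do_x is None else do_x
    y = 2 * x + u["U_Y"]
    z = y + u["U_Z"]
    return {"X": x, "Y": y, "Z": z}

# Hypothetical evidence: one observed unit with X = 1, Y = 3, Z = 3.5.
u = abduct(1.0, 3.0, 3.5)
factual = predict(u)                   # reproduces the evidence
counterfactual = predict(u, do_x=3.0)  # "had X been 3 for this same unit"

print("inferred noise :", u)
print("factual        :", factual)
print("counterfactual :", counterfactual)
```

Unlike a plain intervention, the counterfactual keeps this unit's own noise values, so the answer is specific to the observed individual rather than an average over the population.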
Linear SCM
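Linear Gaussian SCM:
$$V_i := \sum_{j \in \mathrm{PA}_i} \beta_{ij} V_j + U_i, \quad U_i \sim N(0, \sigma_i^2)$$
Matrix form:
$$V = BV + U \;\Longrightarrow\; V = (I - B)^{-1} U$$
where $B$ is the matrix of coefficients $\beta_{ij}$.
Characteristics:
- A closed-form solution exists: $I - B$ is invertible because the graph is acyclic
- LiNGAM: if the noise is non-Gaussian instead, the DAG itself is identifiable from observational data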
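The matrix form of a linear SCM, V = (I - B)^{-1} U with B holding the edge coefficients, can be checked numerically. The sketch below assumes the chain X → Y → Z with coefficients 2 and 1 and unit-variance noise (illustrative values), and compares the closed-form covariance with an empirical estimate.

```python
import numpy as np

# B[i, j] is the coefficient of V_j in the equation for V_i (order: X, Y, Z).
B = np.array([
    [0.0, 0.0, 0.0],   # X := U_X
    [2.0, 0.0, 0.0],   # Y := 2X + U_Y
    [0.0, 1.0, 0.0],   # Z := Y + U_Z
])

A = np.linalg.inv(np.eye(3) - B)   # V = A @ U

# With Cov(U) = I, the implied covariance of V is A @ A.T.
cov_closed_form = A @ A.T

# Empirical check: push sampled noise through the same linear map.
rng = np.random.default_rng(0)
U = rng.standard_normal((3, 100_000))
V = A @ U
cov_empirical = np.cov(V)          # rows are variables

print("closed form:\n", cov_closed_form)
print("empirical:\n", cov_empirical.round(2))
```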
Causal Discovery from SCM
Goal
Recover the underlying SCM (or DAG) from data
Approaches
- Constraint-based (PC, FCI): Conditional independence tests
- Score-based (GES, FGES): Score optimization
- Asymmetry-based (LiNGAM): Distributional asymmetries
Identifiability
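- Markov Equivalence: DAGs that imply the same conditional independencies belong to the same Markov Equivalence Class and cannot be told apart from observational data without further assumptions
- Non-Gaussian: with linear relationships and non-Gaussian noise, a unique DAG can be identified with LiNGAM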
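Markov equivalence can be tested graphically: two DAGs over the same nodes are Markov equivalent exactly when they share the same skeleton and the same v-structures (colliders a → c ← b with a and b non-adjacent). A minimal sketch, assuming DAGs are given as lists of `(parent, child)` edges with no isolated nodes; the helper names are illustrative.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the graph: a set of unordered node pairs."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """All colliders a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges_1, edges_2):
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))

chain = [("X", "Y"), ("Y", "Z")]      # X -> Y -> Z
fork = [("Y", "X"), ("Y", "Z")]       # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]   # X -> Y <- Z

print("chain ~ fork    :", markov_equivalent(chain, fork))
print("chain ~ collider:", markov_equivalent(chain, collider))
```

The chain and the fork fall in the same class, so constraint-based and score-based methods can at best return that class; the collider has a distinct independence pattern and stands alone.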
Related Concepts
- DAG - Graphical representation of an SCM
- Confounder - Common cause of treatment and outcome (possibly unmeasured)
- d-separation - Graphical conditional independence
- Markov Equivalence Class - Observationally equivalent graphs
- Back-door Criterion - Causal effect identification
- Potential Outcomes - Alternative causal framework
- Markov Property - Graph-distribution relationship
- Graph Foundations Overview - Complete overview of graphical representations
References
- Pearl, J. (2009). Causality: Models, Reasoning, and Inference
- yaoSurveyCausalInference2021 - SCM in causal discovery context