Score-Based Methods Overview
Overview
Score-based methods assign a score function to each graph and search for the graph that best fits the data. Unlike constraint-based methods, they optimize model fit without CI tests.
Mermaid source (click to expand)
> flowchart LR > Data[Data] --> Score[Score Function] > Score --> Search[Search Algorithm] > Search --> Best[Best Graph/MEC] >
Core Idea
Optimization problem:
- : Score function (BIC, BGe, etc.)
- : Data
- Search over MECs for efficiency
Comparison of Main Algorithms
| Algorithm | Search Strategy | Complexity | Feature |
|---|---|---|---|
| GES | Greedy | Forward-backward phases | |
| FGES | Parallel Greedy | Fast | Parallelized GES |
| Hill-climbing | Local search | Simple but local optima | |
| NOTEARS | Continuous opt | Acyclicity constraint |
Score Functions
BIC (Bayesian Information Criterion)
- : Number of parameters
- : Sample size
- Penalizes complexity
BGe (Bayesian Gaussian equivalent)
Marginal likelihood with Gaussian assumption:
- Uses conjugate priors
- Closed-form computation
BDeu (Bayesian Dirichlet equivalent uniform)
Score for discrete data
Score Equivalence
Definition: The score is constant within a MEC
Implication:
- Does not distinguish DAGs within a MEC
- Can directly search over CPDAGs
Advantages and Disadvantages
Advantages
-
No Faithfulness Required
- Weaker assumption than constraint-based methods
- Can work even under unfaithful distributions
-
Principled Model Selection
- Bayesian framework
- Automatic complexity control (BIC penalty)
-
Global Optimization
- Global fit rather than local CI decisions
Disadvantages
-
NP-hard Problem
- DAG space is super-exponential
- Greedy search → local optima
-
Score Function Choice
- Results depend on the score
- Misspecification issues
-
Computational Cost
- Slow in high dimensions
- → Mitigated by FGES
Constraint vs Score Comparison
| Aspect | Constraint-Based | Score-Based |
|---|---|---|
| Core operation | CI tests | Score optimization |
| Assumption | Faithfulness | Score decomposability |
| Local vs Global | Local decisions | Global fit |
| Noise sensitivity | High (CI test errors) | Moderate |
| Computation | Many CI tests | Score evaluations |
Directions for Improvement
Hybrid Methods
Hybrid Methods Overview:
- Build skeleton with constraints, then orient edges with scores
- Examples: GFCI, MMHC
Continuous Optimization
NOTEARS:
- Transforms the DAG space into a continuous space
- Leverages an acyclicity penalty
Parallelization
FGES:
- Parallelizes edge addition/removal
- Handles large-scale data
Related Concepts
Algorithm Details
- GES - Greedy Equivalence Search
- FGES - Fast GES
- Scoring Criteria - BIC, BGe, BDeu details
Theoretical Foundations
- Markov Equivalence Class - Search space
- CPDAG - Output representation
Comparison
- Constraint-Based Methods Overview - Alternative approach
- Hybrid Methods Overview - Combining constraint and score
- Asymmetry-Based Methods Overview - A different approach
Key Papers
- Chickering, D.M. (2002). Optimal Structure Identification with Greedy Search
- Heckerman et al. (1995). Learning Bayesian Networks
- zangaSurveyCausalDiscovery2023 - Section 3.2