Score-Based Methods Overview
개요
Score-based methods는 그래프에 score function을 부여하고, 데이터에 가장 적합한 그래프를 탐색. Constraint-based와 달리 CI tests 없이 model fit 최적화.
Mermaid source (click to expand)
> flowchart LR > Data[Data] --> Score[Score Function] > Score --> Search[Search Algorithm] > Search --> Best[Best Graph/MEC] >
핵심 아이디어
Optimization problem:
- : Score function (BIC, BGe, etc.)
- : Data
- Search over MECs for efficiency
주요 알고리즘 비교
| Algorithm | Search Strategy | Complexity | Feature |
|---|---|---|---|
| GES | Greedy | Forward-backward phases | |
| FGES | Parallel Greedy | Fast | Parallelized GES |
| Hill-climbing | Local search | Simple but local optima | |
| NOTEARS | Continuous opt | Acyclicity constraint |
Score Functions
BIC (Bayesian Information Criterion)
- : Number of parameters
- : Sample size
- Penalizes complexity
BGe (Bayesian Gaussian equivalent)
Marginal likelihood with Gaussian assumption:
- Conjugate priors 사용
- Closed-form computation
BDeu (Bayesian Dirichlet equivalent uniform)
Discrete data용 score
Score Equivalence
Definition: Score가 MEC 내에서 일정
Implication:
- MEC 내 DAGs를 구분하지 않음
- CPDAG를 직접 탐색 가능
장단점
장점
-
No Faithfulness Required
- Constraint-based보다 약한 가정
- Unfaithful distribution에서도 동작 가능
-
Principled Model Selection
- Bayesian framework
- Automatic complexity control (BIC penalty)
-
Global Optimization
- Local CI decisions 아닌 global fit
단점
-
NP-hard Problem
- DAG space super-exponential
- Greedy search → local optima
-
Score Function Choice
- 결과가 score에 의존
- Misspecification 문제
-
Computational Cost
- High-dimensional에서 느림
- → FGES로 완화
Constraint vs Score Comparison
| Aspect | Constraint-Based | Score-Based |
|---|---|---|
| Core operation | CI tests | Score optimization |
| Assumption | Faithfulness | Score decomposability |
| Local vs Global | Local decisions | Global fit |
| Noise sensitivity | High (CI test errors) | Moderate |
| Computation | Many CI tests | Score evaluations |
개선 방향
Hybrid Methods
Hybrid Methods Overview:
- Constraint로 skeleton 후 Score로 orientation
- 예: GFCI, MMHC
Continuous Optimization
NOTEARS:
- DAG space를 continuous space로 변환
- Acyclicity penalty 활용
Parallelization
FGES:
- Edge 추가/제거 병렬화
- 대규모 데이터 처리
관련 개념
Algorithm Details
- GES - Greedy Equivalence Search
- FGES - Fast GES
- Scoring Criteria - BIC, BGe, BDeu 상세
Theoretical Foundations
- Markov Equivalence Class - Search space
- CPDAG - Output representation
Comparison
- Constraint-Based Methods Overview - 대안적 접근
- Hybrid Methods Overview - Constraint + Score 결합
- Asymmetry-Based Methods Overview - 다른 접근
참고 논문
- Chickering, D.M. (2002). Optimal Structure Identification with Greedy Search
- Heckerman et al. (1995). Learning Bayesian Networks
- zangaSurveyCausalDiscovery2023 - Section 3.2