#causal-inference
51 notes
- AIPW (Augmented Inverse Probability Weighting) - $\hat{\mu}_t(X)$: Outcome model ($E[Y|T=t, X]$)
- Anytime-Valid Inference Overview Game-theoretic statistics that resolves the "peeking" problem of fixed-sample hypothesis testing. The mathematical foundation for real-time monitoring of identification-validity drift.
- Applied Causal Inference for Pricing — CATE & SCM Across Public Datasets An applied case study using only public datasets (LendingClub, iPinYou) that combines CATE estimation for price-sensitivity heterogeneity with SCM-based moderator analysis to design individual-level, risk-based pricing and RTB bidding policies — all findings illustrative and projected, not proprietary.
- ATT (Average Treatment Effect on the Treated) Average treatment effect for the group that actually received treatment
- Back-door Criterion The Back-door Criterion (Pearl, 1993) is a graphical criterion for identifying a causal effect from observational data. It determines whether a set of variables $Z$ is sufficient to identify the causal effect of $X$ on $Y$.
- BART (Bayesian Additive Regression Trees) A Bayesian ensemble method that models the outcome as a sum of many trees
- CATE (Conditional Average Treatment Effect) The Conditional Average Treatment Effect (CATE) is the average treatment effect given covariates $X=x$:
- Causal Forest Causal Forest is a causal-inference application of the Generalized Random Forest (GRF) proposed by Athey, Tibshirani, and Wager (2019), splitting so as to maximize the heterogeneity of treatment effects.
- Causal Inference Under Partial Identification — Sensitivity and Evidence Hierarchies When real-world data fail strong ignorability, point identification gives way to bounds, proxies, and sensitivity analysis — an honest hierarchy of evidence that connects credible causal claims to semiparametric efficiency.
- CEVAE (Causal Effect Variational Autoencoder) A method that uses a VAE to infer latent confounders and estimate causal effects.
- CFR (Counterfactual Regression) A deep learning method that learns balanced representations via IPM (Integral Probability Metric) regularization
- Collider A collider is a variable affected by both the treatment (X) and the outcome (Y) (a common effect). In the structure X → C ← Y, C is a collider.
- Confounder A confounder is a variable that affects both the treatment (X) and the outcome (Y) (a common cause), creating a spurious (non-causal) association between X and Y.
- Constraint-Based Methods Overview Constraint-based methods recover the causal graph by testing conditional independence (CI) relations in the data. Under the faithfulness assumption, they exploit the correspondence between CI relations and d-separation.
- d-separation d-separation (directional separation) is a graphical criterion in a DAG for determining whether two sets of variables are conditionally independent given a third set.
- DAG (Directed Acyclic Graph) A DAG (Directed Acyclic Graph) is a graph that visually represents the causal relationships among variables. It is a core tool in causal inference for grasping confounding structure and deciding an identification strategy.
- do-operator The do-operator is Pearl's formalization of intervention.
- Double/Debiased Machine Learning (DML) A methodology for performing valid statistical inference on a low-dimensional parameter of interest $\theta_0$ in the presence of a high-dimensional nuisance parameter $\eta_0$.
- Doubly Robust Estimator The Doubly Robust (DR) Estimator combines an outcome-regression model and a propensity-score model, remaining consistent as long as just one of the two is correctly specified.
- DR-Learner The DR-Learner is a two-stage doubly robust estimator for CATE that regresses a pseudo-outcome on the covariates.
- Dunnhumby — Track 2: Causal Targeting via Heterogeneous Treatment Effects Meta-learner / Causal Forest CATE under severe positivity violation (PS AUC 0.989); an OPE-validated policy targets ~31% of customers and surfaces counter-intuitive negative-CATE segments. Hypothesis-generating on public data.
- Efficient Influence Function Among the regular asymptotically linear (RAL) estimators of a (semi)parametric model, the IF with the smallest variance is the efficient influence function (EIF), and its variance equals the semiparametric efficiency bound (the supremum of the Cramér-Rao bounds over all parametric submodels)…
- Endogeneity Endogeneity is the problem that arises when an explanatory variable is correlated with the error term.
- ESCM² (Entire Space Counterfactual Multi-Task Model) A model that integrates a counterfactual risk regularizer based on the Inverse Propensity Score (IPS) and the Doubly Robust estimator into ESMM, in order to address ESMM's two theoretical limitations — Inherent Estimation Bias (IEB) and Potential Independence Priority (PIP).
- From Estimation to Action — How HTE Drives Personalized Policy Across Domains One methodological spine — estimate heterogeneous treatment effects and turn them into individual-level policies — powers both clinical sequential treatment decisions and industrial targeting, pricing, and recommendation.
- Fundamental Problem of Causal Inference The problem that, for the same individual, the outcomes under treatment (W=1) and control (W=0) cannot be observed simultaneously
- HTE (Heterogeneous Treatment Effects) The phenomenon in which the treatment effect varies with an individual's characteristics
- Influence Function If an estimator $\hat\psi$ of a functional parameter $\psi:\mathcal{P}\to\mathbb{R}$ is asymptotically linear, then an influence function (IF) $\phi$ exists such that
- Instrumental Variables Instrumental variables (IV) are exogenous variables used to address the problem of endogeneity.
- IPW (Inverse Propensity Weighting) Estimating treatment effects by using the inverse of the propensity score as weights
- ITE (Individual Treatment Effect) The treatment effect for individual $i$
- Marketing Attribution at Scale — From Simulation to Causal Inference A case study comparing 10+ multi-touch attribution methods against a known-ground-truth simulator, then scaling them on the public Criteo dataset, closing the loop with budget off-policy evaluation for channel allocation.
- Mediator A mediator is an intermediate variable lying on the causal pathway through which a treatment (X) affects an outcome (Y). In the structure X → M → Y, M is the mediator.
- Meta-learners Meta-learners are algorithms that estimate the CATE by leveraging existing supervised learning methods (base learners).
- Negative Control Outcome (NCO) An NCO is an outcome variable known a priori to be unaffected by the treatment, yet still influenced by the same confounder $U$. By contrast, an NCE (negative control exposure) is an exposure with no causal effect on the outcome. A nonzero "apparent effect" on an NCO signals unmeasured confounding (detection), which can then be corrected via proximal methods.
- One-step Estimator Corrects first-order bias by adding the empirical mean of the estimated EIF to the plug-in $\psi(\hat P)$:
- Partial Identification When point identification is impossible due to a lack of assumptions, we only know that the parameter lies in the identified set $\Theta_I$ (often an interval $[\theta_L,\theta_U]$) compatible with the data plus assumptions. Manski's assumption-free / worst-case bounds are the starting point. sharp bounds =…
- Positivity (Overlap) The probability of receiving treatment lies strictly between 0 and 1 for every covariate value
- Propensity Score Matching (PSM) Matching treated and control individuals with similar propensity scores
- Proximal Causal Inference When unmeasured confounding $U$ is present, the causal effect is identified using two types of proxies:
- R-Learner R-Learner (Residualized Learner) is a meta-learner that estimates the CATE using residualized outcomes and residualized treatments based on the Robinson Transformation.
- Representation Learning Overview Methods for learning representations that are independent of treatment while remaining useful for outcome prediction.
- S-Learner The S-Learner (Single Learner) is a meta-learner that fits a single response model with the treatment indicator included as a feature, then computes the CATE from it.
- SCM (Structural Causal Model) An SCM (Structural Causal Model) is a framework for mathematically expressing the causal relationships among variables. It is the core of Pearl's causal inference framework.
- Score-Based Methods Overview Score-based methods assign a score function to each graph and search for the graph that best fits the data. Unlike constraint-based methods, they optimize model fit without CI tests.
- Strong Ignorability An assumption combining Ignorability and Positivity
- SUTVA (Stable Unit Treatment Value Assumption) The potential outcome of one unit is not affected by the treatment assignment of other units, and only a single version exists for each treatment level.
- T-Learner The T-Learner (Two Learner) is a meta-learner that estimates the CATE by training separate models for the treatment group and the control group.
- TMLE (Targeted Maximum Likelihood Estimation) A procedure that corrects (targets) a plug-in estimator toward the target parameter:
- Treatment Effects Overview A systematic overview of the treatment effects that serve as the estimands in the Potential Outcome Framework.
- X-Learner The X-Learner is a three-stage meta-learner that uses imputed treatment effects to exploit group-size imbalance and the structural properties of the CATE.
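
As a rough illustration of how the AIPW, Doubly Robust Estimator, and IPW entries above fit together, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration and comes from none of the notes: the data-generating process, the helper names `fit_logistic` and `aipw_ate`, the linear outcome models, and the clipping threshold. It also skips cross-fitting, which a DML-style implementation would normally add.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Logistic regression via Newton's method (X must include an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                          # score of the log-likelihood
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

def aipw_ate(x, t, y, clip=0.01):
    """AIPW estimate of the ATE with a logistic propensity model and linear outcome models."""
    X = np.column_stack([np.ones_like(x), x])
    # Propensity score e(X) = P(T=1 | X), clipped to guard against near-violations of positivity.
    e_hat = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    e_hat = np.clip(e_hat, clip, 1.0 - clip)
    # Outcome models mu_t(X) = E[Y | T=t, X], fitted separately in each arm.
    coef1, *_ = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)
    coef0, *_ = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)
    mu1, mu0 = X @ coef1, X @ coef0
    # Plug-in difference plus inverse-propensity-weighted residual corrections.
    psi = mu1 - mu0 + t * (y - mu1) / e_hat - (1.0 - t) * (y - mu0) / (1.0 - e_hat)
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(psi))

# Hypothetical simulation: one confounder x, true ATE fixed at 2.0.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x))).astype(float)
y = 2.0 * t + x + rng.normal(size=n)

ate, se = aipw_ate(x, t, y)
print(f"AIPW ATE estimate: {ate:.3f} (standard error {se:.3f})")
```

In this setup both nuisance models are correctly specified, so the estimate should land near the simulated effect of 2.0. Swapping one of them for a deliberately wrong model is a simple way to see the doubly robust property in action.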