Applied Causal Inference for Pricing — CATE & SCM Across Public Datasets

Price is the strongest lever in almost any business, yet most pricing analysis stops at aggregate elasticity. A statement such as “average price elasticity is $-2$” creates the false impression that every customer responds identically. This note documents an applied case study that uses causal inference to decompose price response down to the individual level. It relies exclusively on two public datasets, and every number is illustrative or projected, not a proprietary result.

Public-data-only. This case study is designed entirely on public datasets (LendingClub, iPinYou). All quantitative figures in this note (heterogeneity ratios, variance decomposition, margin/ROI improvements) are illustrative/projected values meant to demonstrate the methodology, and contain no private or proprietary analytical results.

Problem — why the “average” is not enough

Pricing has three core questions:

What is the causal effect of price (or interest rate, or bid price) on demand, conversion, or default?
How does the heterogeneity of that effect — i.e., price sensitivity — vary with customer and contextual features?
What is the resulting optimal pricing policy?

Aggregate regression structurally hides the second question. A single average elasticity can mask one segment that is nearly insensitive to price and another that is highly sensitive. Worse, price is not randomly assigned: better credit grades get lower rates, and more competitive slots draw higher bids. Ignoring this selection bias distorts the elasticity estimate itself. We therefore need two axes simultaneously: (a) causal identification that controls for confounding, and (b) estimation of the effect’s heterogeneity.
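The selection-bias point can be illustrated with a small synthetic simulation. This is a minimal sketch under stated assumptions, not the case study's code: the data-generating process, the coefficient values, and the helper names (`ols_slope`, `residualize`, `simulate`) are all hypothetical, and the outcome is a continuous score rather than a real default flag.

```python
import random

def ols_slope(x, y):
    """Slope from a one-variable least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(v, g):
    """Remove the linear effect of g from v and return centered residuals."""
    n = len(v)
    mv, mg = sum(v) / n, sum(g) / n
    b = ols_slope(g, v)
    return [vi - mv - b * (gi - mg) for vi, gi in zip(v, g)]

def simulate(n=20000, true_effect=0.5, seed=0):
    """Synthetic loans in which risk drives both the rate and the outcome."""
    rng = random.Random(seed)
    risk, rate, outcome = [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)                               # hypothetical credit-risk score
        r = g + rng.gauss(0, 1)                           # riskier borrowers are charged more
        y = true_effect * r + 2.0 * g + rng.gauss(0, 1)   # risk also worsens the outcome
        risk.append(g)
        rate.append(r)
        outcome.append(y)
    return risk, rate, outcome

risk, rate, outcome = simulate()
naive = ols_slope(rate, outcome)  # ignores the confounder entirely
adjusted = ols_slope(residualize(rate, risk), residualize(outcome, risk))
print(f"naive slope    = {naive:.2f}")
print(f"adjusted slope = {adjusted:.2f}  (true effect = 0.50)")
```

In this setup the naive slope converges to 1.5 rather than the true 0.5, because the rate also proxies for risk. Adjusting for the confounder recovers the true effect up to sampling noise.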

Data (public)

Dataset	Domain	Scale (approx.)	Treatment	Outcome	Access
LendingClub	P2P lending	~2.26M loans (2007–2018)	interest rate	default	Free on Kaggle
iPinYou	RTB ad auctions	~31.6M bids	bid price	win / click / conversion	Free on GitHub

Both datasets offer a continuous treatment and rich covariates, making them well suited to estimating heterogeneity in price elasticity. LendingClub is rich in borrower features (FICO, DTI, income), fitting risk-based pricing; iPinYou is rich in campaign, time-of-day, and exchange context, fitting RTB ROI optimization.

Method / Pipeline

The full pipeline has four stages: identification → heterogeneity estimation → moderator analysis → policy learning.

1) Causal identification (SCM)

First, a Structural Causal Model (SCM) makes the price-formation mechanism explicit as a DAG. For LendingClub:

$$
\text{apply} \rightarrow \text{underwrite} \rightarrow \text{credit grade} \rightarrow \text{rate}(T) \rightarrow \text{repayment decision}(Y)
$$

The DAG determines what to adjust for and what to leave alone (blocking back-door paths, not conditioning on colliders). Credit grade, for instance, influences repayment as well as the rate, so it opens a back-door path that must be blocked. In iPinYou the auction structure itself supplies the SCM: along the path bid → win probability → impression → click/conversion, it separates the direct effect of the bid from the indirect effect mediated by impression.

2) CATE estimation (Double-Debiased ML + Causal Forest)

CATE (Conditional Average Treatment Effect) is the treatment effect conditional on covariates $X$: $\tau(x) = \mathbb{E}[Y(t') - Y(t) \mid X = x]$. Under a continuous treatment, it is interpreted as a local derivative with respect to price, that is, a conditional price sensitivity (an elasticity when outcome and price are measured in logs).
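One common way to make the continuous-treatment reading concrete is the partially linear model. It is offered here as an assumption, not as the note's stated specification, and the baseline function $g$ and noise term $\varepsilon$ are symbols introduced only for this illustration:

$$
Y = \tau(X)\,T + g(X) + \varepsilon, \qquad \mathbb{E}[\varepsilon \mid X, T] = 0
$$

Under this model, $\tau(x) = \partial\, \mathbb{E}[Y \mid T = t, X = x] / \partial t$, so the CATE is exactly the local slope in price. For a positive outcome such as demand, replacing $Y$ and $T$ by their logarithms turns that slope into the conditional price elasticity.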

To remove selection bias from observed confounders, we use Double-Debiased ML (DML). We fit an outcome model $\hat{m}(x)$ and a treatment model $\hat{e}(x)$ as nuisance functions, residualize $Y$ and $T$ on them with cross-fitting, and regress residual on residual. The resulting estimator is Neyman-orthogonal: the first-order term of the nuisance estimation error cancels. On top of this, a Causal Forest partitions the covariate space along directions of large effect heterogeneity, yielding segment-level elasticities together with their confidence intervals.
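The residualize-then-regress recipe can be sketched on synthetic data. This is a minimal illustration under stated assumptions: the data-generating process, the two fixed segments, and the binned-mean nuisance models are hypothetical simplifications. In particular, the fixed segments stand in for the splits a Causal Forest would learn, and no confidence intervals are computed.

```python
import random

def simulate(n=40000, seed=1):
    """Synthetic (x, t, y) rows with a segment-specific price effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()                      # hypothetical covariate in [0, 1)
        tau = -0.5 if x < 0.5 else -2.0       # true effect differs by segment
        t = 2.0 * x + rng.gauss(0, 1)         # price depends on x (confounding)
        y = tau * t + 3.0 * x + rng.gauss(0, 1)
        rows.append((x, t, y))
    return rows

def fit_bin_means(rows, col, bins):
    """Crude nuisance model: mean of column `col` within equal-width bins of x."""
    sums, counts = [0.0] * bins, [0] * bins
    for row in rows:
        b = min(int(row[0] * bins), bins - 1)
        sums[b] += row[col]
        counts[b] += 1
    overall = sum(sums) / max(sum(counts), 1)
    return [s / c if c else overall for s, c in zip(sums, counts)]

def dml_segment_effects(rows, bins=20, folds=2):
    """Cross-fitted residual-on-residual slope, computed per segment."""
    resid = []
    for k in range(folds):
        train = [r for i, r in enumerate(rows) if i % folds != k]
        test = [r for i, r in enumerate(rows) if i % folds == k]
        e_hat = fit_bin_means(train, 1, bins)  # treatment model e(x)
        m_hat = fit_bin_means(train, 2, bins)  # outcome model m(x)
        for x, t, y in test:                   # residualize held-out rows only
            b = min(int(x * bins), bins - 1)
            resid.append((x, t - e_hat[b], y - m_hat[b]))
    effects = {}
    for seg in (0, 1):
        part = [(rt, ry) for x, rt, ry in resid if (x >= 0.5) == bool(seg)]
        effects[seg] = sum(rt * ry for rt, ry in part) / sum(rt * rt for rt, ry in part)
    return effects

effects = dml_segment_effects(simulate())
print({seg: round(val, 2) for seg, val in effects.items()})
```

Each row is residualized with nuisance models fitted on the other fold, which is the cross-fitting step. The per-segment slopes should land close to the simulated effects of -0.5 and -2.0.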

3) Moderator analysis (SCM-based decomposition)

If CATE tells us “who is sensitive,” the SCM moderator analysis decomposes “why, and in which context, they are sensitive.” In iPinYou, competitive intensity, time-of-day, and exchange moderate the bid effect; in LendingClub, credit-grade bands create non-linearity in the rate effect. Here Instrumental Variables enter as an auxiliary identification tool — using variation that shifts the treatment without directly affecting the outcome (such as within-grade rate variation) as an instrument, we identify local effects even when latent confounding remains.
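The instrumental-variable idea can be sketched with the single-instrument ratio estimator $\mathrm{Cov}(Z, Y) / \mathrm{Cov}(Z, T)$. Everything below is a hypothetical illustration: the simulated instrument `z` merely stands in for something like within-grade rate variation, and the effect size and noise structure are assumptions chosen for clarity.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

def simulate_iv(n=50000, true_effect=-1.0, seed=2):
    """Synthetic data with an unobserved confounder and a valid instrument."""
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                  # hypothetical instrument: shifts the rate only
        ui = rng.gauss(0, 1)                  # latent confounder, never observed
        ti = zi + ui + rng.gauss(0, 1)        # treatment moved by both
        yi = true_effect * ti + 2.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        t.append(ti)
        y.append(yi)
    return z, t, y

z, t, y = simulate_iv()
ols = cov(t, y) / cov(t, t)   # biased: the confounder is not adjusted for
iv = cov(z, y) / cov(z, t)    # ratio (Wald-type) IV estimate
print(f"OLS = {ols:.2f}, IV = {iv:.2f}, true effect = -1.00")
```

The instrument recovers the effect only because, by construction, it affects the outcome solely through the treatment. With real data that exclusion restriction has to be argued from the DAG, not assumed.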

4) Policy learning (Optimal Targeting Policy)

The estimated $\tau(x)$ feeds into a business objective to derive an individualized optimal price. In lending, we maximize expected margin

$$
\mathbb{E}[\pi(r)] = P(\text{repay}\mid r)\, r\, L - P(\text{default}\mid r)\, \text{LGD}\, L - C
$$

over segments to obtain a rate $r^*(x)$; in RTB, we learn a bid function $b^*(x)$ that maximizes expected value $V = \text{pCTR}\times\text{pCVR}\times\text{CPA}$ under a budget constraint. This is the stage where CATE is cashed out into actual decisions.
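For the lending side, the margin objective can be maximized with a simple grid search. The sketch below is illustrative only: the logistic default curve, its parameters, the loan size, LGD, cost, and the two segment definitions are hypothetical assumptions, and it takes $P(\text{repay}\mid r) = 1 - P(\text{default}\mid r)$.

```python
import math

def p_default(rate, base, sensitivity):
    """Hypothetical logistic curve: default probability rises with the rate."""
    return 1.0 / (1.0 + math.exp(-(base + sensitivity * rate)))

def expected_margin(rate, base, sensitivity, loan=10_000.0, lgd=0.6, cost=100.0):
    """Expected margin: P(repay) * r * L - P(default) * LGD * L - C."""
    pd = p_default(rate, base, sensitivity)
    return (1.0 - pd) * rate * loan - pd * lgd * loan - cost

def best_rate(base, sensitivity, grid=None):
    """Grid search for the margin-maximizing rate between 5% and 35%."""
    grid = grid or [i / 1000 for i in range(50, 351)]
    return max(grid, key=lambda r: expected_margin(r, base, sensitivity))

# Two hypothetical segments that differ only in how strongly the rate moves default risk.
segments = {"low_sensitivity": (-4.0, 8.0), "high_sensitivity": (-4.0, 20.0)}
policy = {name: best_rate(base, sens) for name, (base, sens) in segments.items()}
for name, rate in policy.items():
    print(f"{name}: r* = {rate:.3f}")
```

The segment whose default risk reacts more strongly to the rate ends up with the lower optimal rate, which is the qualitative pattern a differentiated policy exploits. A fuller version would also model acceptance of the offer and, for RTB, the budget constraint.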

Key findings (illustrative / projected)

All figures below are public-data-based illustrative/projected values. Focus on the pattern, not the absolute magnitude.

5–11x heterogeneity in price sensitivity. Estimated elasticity differed by roughly 5–11x between the least and most sensitive segments. This implies that a single-price policy wastes enormous surplus — overpricing the sensitive and underpricing the insensitive.
60–80% of variance from contextual/temporal moderators. About 60–80% of CATE variation arose not from fixed customer attributes but from contextual and temporal moderators (competitive intensity, time-of-day, exchange, credit-grade band, etc.). In other words, much of price personalization is a question of “when and in what situation,” not “who.”
Margin / ROI improvement (projected). Comparing CATE-based differentiated pricing against a single-price baseline, the simulation projected a meaningful improvement in margin (lending) and ROI (RTB). These are counterfactual simulation results from the policy-learning stage, not realized deployment outcomes.
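The heterogeneity ratio and the variance share can be computed along the following lines once per-unit CATE estimates exist. This is a hedged sketch: the function names are hypothetical, and the inputs are toy values chosen to make the arithmetic obvious, not results from the case study.

```python
from collections import defaultdict

def variance(values):
    """Population variance of a sequence of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def explained_share(cates, labels):
    """Share of CATE variance explained by group means: Var(E[tau | label]) / Var(tau)."""
    groups = defaultdict(list)
    for tau, label in zip(cates, labels):
        groups[label].append(tau)
    fitted = [sum(groups[label]) / len(groups[label]) for label in labels]
    return variance(fitted) / variance(cates)

def heterogeneity_ratio(segment_effects):
    """Ratio of the largest to the smallest absolute segment-level effect."""
    magnitudes = [abs(e) for e in segment_effects]
    return max(magnitudes) / min(magnitudes)

# Toy inputs: the effect depends only on context, never on the customer.
cates = [-1.0, -1.0, -3.0, -3.0]
context = ["peak", "peak", "off_peak", "off_peak"]
customer = ["a", "b", "a", "b"]

context_share = explained_share(cates, context)
customer_share = explained_share(cates, customer)
ratio = heterogeneity_ratio([-0.5, -2.0, -4.0])
print(context_share, customer_share, ratio)
```

In this toy case context explains all of the variance and customer identity explains none. With real estimates the groupings overlap, so the shares need not sum to one, and estimation noise in the CATEs inflates the total variance.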

Lessons

Personalization may be more about “context” than “who.” That most of the variance comes from dynamic moderators (time, competition, slot) suggests contextual / dynamic pricing is a bigger lever than static customer segments. This is also a natural bridge to contextual bandits and dynamic policy.
Identification comes before estimation. Before bolting on a fancy ML estimator, the SCM/DAG must first decide what to control for. As long as price is non-randomly assigned, elasticity estimates stay contaminated by selection bias unless the right confounders are adjusted for; cross-fitting and orthogonalization then keep errors in the flexible nuisance models from leaking into the estimate.
Continuous treatment is trickier than binary. Price, rate, and bid are inherently continuous, so binary-treatment meta-learners do not transfer directly. We need derivative (elasticity) estimation and dose-response curves, for which partially-linear / non-parametric DML variants are the natural choice.
Public data can still close the methodology loop. Even without proprietary data, the full identification–estimation–policy cycle can be demonstrated end-to-end on public datasets. The absolute numbers are illustrative, but the pipeline and decision structure are identical to production.

CATE — conditional average treatment effect; a conditional elasticity under continuous price
SCM — structural causal model; the basis for identification and moderator decomposition
Double-Debiased ML — removes selection bias via Neyman-orthogonality and cross-fitting
Causal Forest — estimates segment-level elasticity by splitting to maximize effect heterogeneity
Optimal Targeting Policy — cashes the estimated CATE out into individualized optimal prices
Instrumental Variables — auxiliary tool for identifying local effects under latent confounding
