Tae Hyun Kim (Lowell)

Instrumental Variables

Definition

Instrumental variables (IV) are exogenous variables used to address the problem of endogeneity.

Conditions for a valid instrument $Z$:

  1. Relevance: $Cov(Z, X) \neq 0$ (the instrument affects the endogenous variable)
  2. Exclusion Restriction: $Cov(Z, u) = 0$ (the instrument has no direct effect on the outcome)

$Z \rightarrow X \rightarrow Y$ ($Z$ affects $Y$ only through $X$)
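
Why these two conditions are enough can be seen from a short derivation. This is a sketch that assumes the linear structural model $Y = \beta_0 + \beta_1 X + u$ with a single instrument; take the covariance of both sides with $Z$:

```latex
\begin{aligned}
\operatorname{Cov}(Z, Y) &= \beta_1 \operatorname{Cov}(Z, X) + \operatorname{Cov}(Z, u) \\
                         &= \beta_1 \operatorname{Cov}(Z, X)
                            \quad \text{(exclusion restriction: } \operatorname{Cov}(Z, u) = 0\text{)} \\
\Rightarrow\quad \beta_1 &= \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
                            \quad \text{(relevance: } \operatorname{Cov}(Z, X) \neq 0\text{)}
\end{aligned}
```

The exclusion restriction removes the error term, and relevance guarantees that the ratio is well defined.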

Intuitive Understanding

In the price endogeneity problem, we need to find “price variation that is unrelated to demand.”

An instrumental variable acts like a “natural experiment.” A cost shock (e.g., a rise in raw material prices) affects price but does not directly affect a consumer’s willingness to pay.
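
The intuition can be checked on simulated data. The sketch below is illustrative only: the data-generating process, the coefficient values, and the variable names are assumptions, not part of any real dataset. A demand shock moves both price and quantity, so the OLS slope is biased, while the cost shock moves price only, so the ratio $Cov(Z, Y) / Cov(Z, X)$ recovers the assumed elasticity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
true_elasticity = -2.0                 # assumed value for the simulation

cost_shock = rng.normal(size=n)        # instrument Z: shifts price only
demand_shock = rng.normal(size=n)      # unobserved u: shifts price and quantity

# Sellers raise prices when costs rise and when demand is high;
# the second channel is the source of endogeneity.
log_price = 0.5 * cost_shock + 0.5 * demand_shock + rng.normal(scale=0.1, size=n)
log_quantity = true_elasticity * log_price + demand_shock


def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]


# OLS slope: contaminated by the demand shock
ols_slope = cov(log_price, log_quantity) / np.var(log_price, ddof=1)
# IV (ratio) estimator: uses only the cost-driven price variation
iv_slope = cov(cost_shock, log_quantity) / cov(cost_shock, log_price)

print(f"True elasticity: {true_elasticity:.2f}")
print(f"OLS estimate:    {ols_slope:.2f}")
print(f"IV estimate:     {iv_slope:.2f}")
```

In this setup the OLS estimate is pulled toward zero because high demand raises both price and quantity, whereas the IV estimate stays close to the assumed value.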

Key Properties

2SLS (Two-Stage Least Squares)

Stage 1: Regress the endogenous variable on the instrument: $X = \pi_0 + \pi_1 Z + v$

Stage 2: Regress the outcome on the predicted endogenous variable: $Y = \beta_0 + \beta_1 \hat{X} + u$
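
The two stages can be written out by hand with plain least squares. This is a minimal sketch for one endogenous regressor and one instrument; the function name `two_stage_least_squares` and the simulated coefficients are assumptions made for illustration.

```python
import numpy as np


def two_stage_least_squares(y, x, z):
    """Manual 2SLS with one endogenous regressor x and one instrument z.

    Returns (pi_hat, beta_hat): first-stage and second-stage coefficients,
    each ordered as [intercept, slope].
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), z])

    # Stage 1: x = pi_0 + pi_1 * z + v
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ pi_hat

    # Stage 2: y = beta_0 + beta_1 * x_hat + u
    X_hat = np.column_stack([np.ones(n), x_hat])
    beta_hat, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return pi_hat, beta_hat


# Simulated data (assumed values): u enters both x and y, so x is endogenous
rng = np.random.default_rng(1)
n = 20_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 1.0 + 0.8 * z + 0.6 * u + rng.normal(size=n)
y = 2.0 - 1.5 * x + u

pi_hat, beta_hat = two_stage_least_squares(y, x, z)
print(f"First-stage slope:  {pi_hat[1]:.3f}")
print(f"Second-stage slope: {beta_hat[1]:.3f}")
```

The point estimates from this manual version match 2SLS, but the standard errors from a naive second-stage OLS do not, because they ignore that $\hat{X}$ is itself estimated. A dedicated IV routine is the safer choice for inference.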

Weak Instrument Problem

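When the first-stage F-statistic is low, the instrument is only weakly correlated with the endogenous variable and the weak instrument problem arises:

  • Bias: 2SLS is biased toward OLS in finite samples, and even a small violation of the exclusion restriction can make the bias more severe than OLS
  • Distorted confidence intervals: conventional standard errors understate the true uncertainty

Rule of thumb (commonly associated with Stock-Yogo): a first-stage F-statistic > 10 is usually considered safe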
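
To make the F-statistic concrete, the sketch below computes it by hand for a single instrument and compares a strong and a weak first stage. The helper name `first_stage_f` and the simulated coefficients (0.5 and 0.02) are assumptions chosen for illustration.

```python
import numpy as np


def first_stage_f(x, z):
    """F-statistic for H0: pi_1 = 0 in x = pi_0 + pi_1 * z + v (one instrument)."""
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    resid = x - Z @ pi_hat

    rss_unrestricted = resid @ resid                  # model with the instrument
    rss_restricted = np.sum((x - x.mean()) ** 2)      # intercept-only model
    # One restriction in the numerator, n - 2 residual degrees of freedom
    return (rss_restricted - rss_unrestricted) / (rss_unrestricted / (n - 2))


rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)
v = rng.normal(size=n)

f_strong = first_stage_f(0.5 * z + v, z)    # instrument moves x noticeably
f_weak = first_stage_f(0.02 * z + v, z)     # instrument barely moves x

print(f"Strong instrument F: {f_strong:.1f}")
print(f"Weak instrument F:   {f_weak:.1f}")
```

With several instruments or additional controls, the relevant statistic is the joint F-test on the excluded instruments only, which this single-instrument sketch does not cover.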

Valid Instruments in Pricing

| Instrument Type | Example | Validity |
|---|---|---|
| Cost shifters | Raw material prices, exchange rates, shipping costs | Costs affect price but have no direct effect on consumer WTP |
| Hausman IV | Price of the same product in other markets | Cost shocks are common, demand shocks are local |
| Competitive structure | Number of competitors, BLP IV | Competition affects price |
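
As an illustration of the Hausman IV row, the sketch below builds a leave-one-out average: for each product and market, the mean price of the same product in all other markets. The function name `hausman_iv`, the column names, and the toy data are assumptions, and the code assumes one row per product-market pair.

```python
import pandas as pd


def hausman_iv(df, price_col="price", product_col="product"):
    """Mean price of the same product across all other markets (leave-one-out).

    Assumes one row per product-market pair. A product sold in only one
    market has no other markets to average over and gets NaN.
    """
    grouped = df.groupby(product_col)[price_col]
    total = grouped.transform("sum")
    count = grouped.transform("count")
    # Remove the row's own price from the product-level total before averaging
    return (total - df[price_col]) / (count - 1)


toy = pd.DataFrame({
    "product": ["A", "A", "A", "B", "B"],
    "market": ["north", "south", "west", "north", "south"],
    "price": [10.0, 12.0, 14.0, 20.0, 30.0],
})
toy["hausman_iv"] = hausman_iv(toy)
print(toy)
```

The instrument is only as good as the assumption in the table: if demand shocks are correlated across markets (for example, a national advertising campaign), prices elsewhere are no longer excluded from local demand.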

Example

Python Implementation

import numpy as np
from linearmodels.iv import IV2SLS

# Data: `data` is assumed to be a pandas DataFrame with the columns used below
# Y: log quantity, X: log price (endogenous), Z: cost shock (instrument)
data['log_quantity'] = np.log(data['quantity'])
data['log_price'] = np.log(data['price'])  # named column so params['log_price'] resolves

iv_model = IV2SLS(
    dependent=data['log_quantity'],
    exog=data[['const', 'quality']],       # exogenous control variables
    endog=data['log_price'],               # endogenous variable
    instruments=data[['cost_shock', 'competitor_price']]  # instruments
)

iv_results = iv_model.fit(cov_type='robust')

print(f"IV elasticity: {iv_results.params['log_price']:.3f}")
print(f"Standard error: {iv_results.std_errors['log_price']:.3f}")
# first_stage.diagnostics is a DataFrame with one row per endogenous variable
first_stage_f = iv_results.first_stage.diagnostics.loc['log_price', 'f.stat']
print(f"First-stage F-statistic: {first_stage_f:.2f}")

BLP Instruments

Instruments proposed by Berry-Levinsohn-Pakes (1995):

import pandas as pd

# Sum of characteristics of other products within the same market
def create_blp_iv(data, characteristics, market_col='market', product_col='product'):
    """Generate BLP-style instruments (assumes one row per product per market)."""
    ivs = {}
    for char in characteristics:
        # Market-level total of the characteristic, minus the product's own value
        market_sums = data.groupby(market_col)[char].transform('sum')
        ivs[f'blp_{char}'] = market_sums - data[char]
    # product_col is not needed when each row is a distinct product-market pair
    return pd.DataFrame(ivs, index=data.index)

blp_ivs = create_blp_iv(data, ['horsepower', 'weight', 'mpg'])

Instrument Diagnostics

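# Weak instrument test
import numpy as np
import statsmodels.api as sm

instruments = ['cost_shock', 'competitor_price']

# First-stage regression: endogenous variable on the instruments and exogenous controls
first_stage = sm.OLS(data['log_price'], data[['const', 'quality'] + instruments]).fit()
# Joint F-test on the excluded instruments only (not on the controls)
f_test = first_stage.f_test(', '.join(f'{name} = 0' for name in instruments))
f_stat = float(np.squeeze(f_test.fvalue))

print(f"First-stage F-statistic: {f_stat:.2f}")
if f_stat < 10:
    print("Warning: possible weak instrument")
else:
    print("Instrument strength adequate")

# Overidentification test (J-test) - when there are more instruments than endogenous variables
# H0: all instruments are valid (the Sargan form assumes homoskedastic errors)
sargan_stat = iv_results.sargan.stat
sargan_pval = iv_results.sargan.pval
print(f"Sargan test: stat={sargan_stat:.2f}, p={sargan_pval:.3f}")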
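
For reference, the Sargan statistic can also be computed by hand: regress the 2SLS residuals on all instruments and use $n R^2$, which is chi-squared under H0 with degrees of freedom equal to the number of instruments minus the number of endogenous regressors. The sketch below is a homoskedastic, single-regressor version on simulated data; the function name `sargan_test` and all coefficients are assumptions, and `z_bad` is deliberately given a direct effect on the outcome.

```python
import numpy as np
from scipy import stats


def sargan_test(y, x, instruments):
    """Sargan overidentification test for one endogenous regressor x.

    instruments: array of shape (n, q) with q >= 2.
    Returns (statistic, p_value).
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), instruments])
    X = np.column_stack([np.ones(n), x])

    # 2SLS: project X on the instruments, then regress y on the projection
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    resid = y - X @ beta                     # residuals use the actual x

    # Auxiliary regression of the residuals on all instruments
    fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    r_squared = 1 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)

    stat = n * r_squared
    dof = instruments.shape[1] - 1           # instruments minus endogenous regressors
    return stat, stats.chi2.sf(stat, dof)


rng = np.random.default_rng(3)
n = 5_000
z_good = rng.normal(size=n)
z_other = rng.normal(size=n)
z_bad = rng.normal(size=n)
u = rng.normal(size=n)

# Case 1: both instruments are valid
x1 = 0.8 * z_good + 0.8 * z_other + 0.6 * u + rng.normal(size=n)
y1 = -1.5 * x1 + u
stat_valid, p_valid = sargan_test(y1, x1, np.column_stack([z_good, z_other]))

# Case 2: z_bad affects y directly, violating the exclusion restriction
x2 = 0.8 * z_good + 0.8 * z_bad + 0.6 * u + rng.normal(size=n)
y2 = -1.5 * x2 + u + 0.5 * z_bad
stat_invalid, p_invalid = sargan_test(y2, x2, np.column_stack([z_good, z_bad]))

print(f"Valid instruments:   stat={stat_valid:.2f}, p={p_valid:.3f}")
print(f"Invalid instrument:  stat={stat_invalid:.2f}, p={p_invalid:.3g}")
```

A rejection says that at least one instrument is inconsistent with the others, not which one. The test also has no power if all instruments are invalid in the same direction.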

Related Concepts

  • Endogeneity - the problem IV addresses
  • A-B Testing - the alternative free of endogeneity
  • Double-Debiased ML - IV combined with ML (DRIV)
  • Price Elasticity - the target estimated via IV

References

  • Angrist, J. D., & Pischke, J. S. (2008). Mostly Harmless Econometrics.
  • Stock, J. H., & Yogo, M. (2005). “Testing for Weak Instruments in Linear IV Regression.”
  • Berry, S., Levinsohn, J., & Pakes, A. (1995). “Automobile Prices in Market Equilibrium.”
  • Comprehensive Personalized Pricing Guide, Part II, §6
