A/B Testing
Definition
A/B testing is the online application of the randomized controlled trial (RCT): users are randomly assigned to one of two or more variants, and the causal effect is estimated by comparing outcomes across the groups.
- Control (A): the existing version
- Treatment (B): the new version
Because assignment is random, the simple difference in group means is an unbiased estimator of the average causal effect (the ATE).
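As a minimal sketch of the difference-in-means estimator, the simulation below assigns synthetic users by coin flip and compares group averages. The data, the true effect of 2.0, and the helper name `difference_in_means` are illustrative assumptions, not part of any real experiment.

```python
import random
import statistics


def difference_in_means(treatment, control):
    """Return the estimated effect and its standard error (unequal variances)."""
    effect = statistics.fmean(treatment) - statistics.fmean(control)
    se = (statistics.variance(treatment) / len(treatment)
          + statistics.variance(control) / len(control)) ** 0.5
    return effect, se


# Synthetic data: TRUE_EFFECT is an assumption of this simulation.
rng = random.Random(0)
TRUE_EFFECT = 2.0
control, treatment = [], []
for _ in range(20_000):
    baseline = rng.gauss(10.0, 3.0)       # outcome this user would have under A
    if rng.random() < 0.5:                # coin-flip assignment
        treatment.append(baseline + TRUE_EFFECT)
    else:
        control.append(baseline)

effect, se = difference_in_means(treatment, control)
print(f"estimate = {effect:.3f}, "
      f"95% CI = [{effect - 1.96 * se:.3f}, {effect + 1.96 * se:.3f}]")
```

With a sample this large the estimate should land close to the simulated true effect, and the interval narrows as the sample grows.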
Intuitive Understanding
A/B testing is the “gold standard for establishing causal relationships.”
In observational data, the correlation between price and demand can be distorted by confounders. In a randomized experiment, assignment is independent of every confounder, observed or unobserved, so any systematic difference in outcomes between the groups can be attributed to the treatment.
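The contrast can be illustrated with a small simulation. Everything here is assumed for illustration: a `peak_season` confounder that raises both demand and the chance of a high price, and a true price effect of -5 units of demand.

```python
import random
import statistics


def simulate(randomized, n=50_000, seed=1):
    """Naive difference in mean demand between high-price and low-price units."""
    rng = random.Random(seed)
    high, low = [], []
    for _ in range(n):
        peak_season = rng.random() < 0.5          # hypothetical confounder
        if randomized:
            high_price = rng.random() < 0.5       # coin flip, ignores the season
        else:
            # Observational setting: high prices are mostly set in peak season
            high_price = rng.random() < (0.8 if peak_season else 0.2)
        # Assumed data-generating process: true effect of a high price is -5
        demand = 100 + 20 * peak_season - 5 * high_price + rng.gauss(0, 5)
        (high if high_price else low).append(demand)
    return statistics.fmean(high) - statistics.fmean(low)


observational = simulate(randomized=False)
experimental = simulate(randomized=True)
print(f"observational estimate: {observational:+.2f}")
print(f"randomized estimate:    {experimental:+.2f}")
```

In the observational run the naive comparison even gets the sign wrong, because high prices coincide with peak-season demand. The randomized run recovers an estimate close to the assumed -5.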
Key Properties
Why experiment?
| Approach | Assumptions | Risk |
|---|---|---|
| Observational study | Unconfoundedness, exclusion restriction | Bias when assumptions are violated |
| A/B testing | Only requires SUTVA | Ethical/cost constraints |
Particularities of pricing experiments
| Challenge | Description | Mitigation strategy |
|---|---|---|
| Ethical concerns | Charging different prices for the same product can be perceived as unfair | Region/time-based experiments |
| Interference | Information sharing between customers | Cluster randomization |
| Long-term effects | Brand and loyalty effects | Long-term tracking |
| Sample contamination | Multiple devices/accounts | Deterministic assignment |
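The cluster randomization row can be sketched as follows: the unit of assignment is a cluster (here a region) rather than an individual user, so customers who share information see the same price. The function names, the `"price_test"` experiment label, and the region identifiers are hypothetical.

```python
import hashlib


def assign_cluster(cluster_id, experiment_name, treatment_prob=0.5):
    """Deterministically assign a whole cluster (e.g. a region) to one group."""
    key = f"{experiment_name}:{cluster_id}".encode("utf-8")
    # Take the first 8 bytes of a stable digest and map them to 10,000 buckets
    bucket = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % 10_000
    return "treatment" if bucket < treatment_prob * 10_000 else "control"


def assign_user(user_region, experiment_name):
    """Every user inherits the assignment of the region they belong to."""
    return assign_cluster(user_region, experiment_name)


# Two users in the same (hypothetical) region always see the same variant.
group_a = assign_user("region-17", "price_test")
group_b = assign_user("region-17", "price_test")
print(group_a, group_b)
```

Because randomization happens at the cluster level, the effective sample size is closer to the number of clusters than the number of users, which the analysis has to account for.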
Deterministic random assignment
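import hashlib

def randomize(user_id, experiment_name, treatment_prob=0.5):
    """Hash-based deterministic assignment."""
    # hashlib gives a stable digest; the built-in hash() is salted per process for strings
    key = f"{user_id}_{experiment_name}".encode("utf-8")
    hash_value = int(hashlib.sha256(key).hexdigest(), 16) % 100
    return 'treatment' if hash_value < treatment_prob * 100 else 'control'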
The same user is always assigned to the same group, providing a consistent experience.
Example
Price A/B test
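import hashlib
import numpy as np
from scipy import stats

class PricingExperiment:
    # experiment_name and its default value are illustrative additions
    def __init__(self, control_price, treatment_price, experiment_name="price_test"):
        self.control_price = control_price
        self.treatment_price = treatment_price
        self.experiment_name = experiment_name
        # Observed outcomes per group (e.g. 0/1 conversions or revenue per user)
        self.results = {'control': [], 'treatment': []}

    def randomize(self, user_id, treatment_prob=0.5):
        # Stable hash, so the same user always lands in the same group
        key = f"{user_id}_{self.experiment_name}".encode("utf-8")
        bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
        return 'treatment' if bucket < treatment_prob * 100 else 'control'

    def get_price(self, user_id):
        group = self.randomize(user_id)
        return self.treatment_price if group == 'treatment' else self.control_price

    def analyze(self):
        control = np.array(self.results['control'])
        treatment = np.array(self.results['treatment'])
        # Two-sample t-test on the difference in group means
        t_stat, p_value = stats.ttest_ind(treatment, control)
        effect = treatment.mean() - control.mean()
        return {'effect': effect, 'p_value': p_value}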
Analyzing results
- Conversion rate difference: treatment conversion rate minus control conversion rate
- Statistical significance: p-value < 0.05
- Practical significance: Is the effect size meaningful for the business?
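One way to run these three checks together is a two-proportion z-test, sketched below. The function name `conversion_test`, the `min_effect` threshold, and the conversion counts are assumptions made for illustration; the business threshold in particular has to come from the pricing context.

```python
from statistics import NormalDist


def conversion_test(conv_c, n_c, conv_t, n_t, min_effect=0.0, alpha=0.05):
    """Two-sided z-test for the difference between two conversion rates."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    diff = p_t - p_c
    # Pooled standard error for the test statistic (null: equal rates)
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = (pooled * (1 - pooled) * (1 / n_c + 1 / n_t)) ** 0.5
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval
    se = (p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "diff": diff,
        "p_value": p_value,
        "ci": (diff - z_crit * se, diff + z_crit * se),
        "statistically_significant": p_value < alpha,
        # Practical significance: point estimate clears the business threshold
        "practically_significant": abs(diff) >= min_effect,
    }


# Hypothetical counts: 500/10,000 conversions in control, 600/10,000 in treatment
result = conversion_test(500, 10_000, 600, 10_000, min_effect=0.005)
print(result)
```

A result can be statistically significant without being practically significant, which is why the two flags are reported separately.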
Related Concepts
- Statistical Power - Sample size determination
- CUPED - Variance reduction technique
- Design Effect - Impact of cluster randomization
- ATE - Estimand
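To show how statistical power and the design effect feed into planning, here is a rough sketch using the standard normal-approximation formula for comparing two proportions. The function names, the 5% baseline rate, the cluster size, and the intra-cluster correlation are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_effect,
                          alpha=0.05, power=0.8):
    """Approximate users needed per group to detect an absolute lift."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / min_detectable_effect ** 2)


def design_effect(cluster_size, icc):
    """Variance inflation from cluster randomization: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


# Hypothetical plan: 5% baseline conversion, 1 percentage point lift
n_individual = sample_size_per_group(0.05, 0.01)
# Hypothetical clusters of 50 users with intra-cluster correlation 0.02
n_clustered = ceil(n_individual * design_effect(50, 0.02))
print(n_individual, n_clustered)
```

The clustered plan needs more users per group because outcomes within a cluster are correlated and carry less independent information.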
References
- Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.
- Comprehensive Personalized Pricing Guide, Part V, §13