#experiments

7 notes

A/B Testing A/B testing is the online application of the randomized controlled trial (RCT), estimating causal effects by randomly assigning users to two or more variants.
Anytime-Valid Inference Overview Anytime-valid inference is a game-theoretic approach to statistics that resolves the "peeking" problem of fixed-sample hypothesis testing. It provides the mathematical foundation for real-time monitoring of identification-validity drift.
Confidence Sequence A confidence sequence (CS) $(C_t)_{t\ge1}$ is a sequence of confidence intervals with time-uniform coverage:
CUPED CUPED (Controlled-experiment Using Pre-Experiment Data) is a technique that leverages pre-experiment data to reduce the variance of A/B tests.
Design Effect The Design Effect (DEFF) measures the impact of a complex sampling design on variance relative to simple random sampling.
e-process (e-value) An e-value $E$ is a nonnegative random variable with $\mathbb{E}_P[E]\le 1$ ($\forall P\in H_0$) under the null $H_0$. An e-process $(E_t)$ is a nonnegative process such that $E_\tau$ is an e-value at any stopping time $\tau$ ($\mathbb{E}[E_\tau]\le 1$); typically a nonnegative supermartingale under the null.…
Statistical Power Statistical power is the probability of detecting an effect when it truly exists.
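
As a rough illustration of the Statistical Power entry, the sketch below approximates the power of a two-sided, two-sample z-test with equal arm sizes and a known common standard deviation. The function name `two_sample_power` and its parameters are hypothetical and chosen only for this example; the linked note may use a different test or setup.

```python
from statistics import NormalDist


def two_sample_power(effect, sigma, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    Assumes equal arm sizes and a known, common standard deviation `sigma`.
    `effect` is the true difference in means between the two arms.
    """
    z = NormalDist()  # standard normal
    se = sigma * (2.0 / n_per_arm) ** 0.5   # std. error of the difference in means
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)   # two-sided critical value
    shift = abs(effect) / se                # standardized true effect
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = two_sample_power(effect=0.1, sigma=1.0, n_per_arm=1570)
print(f"power = {power:.3f}")
```

With no true effect the function returns the significance level `alpha`, and power rises toward 1 as `n_per_arm` grows, which is the trade-off a sample-size calculation works with.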