Test constantly, but test rigorously

Stance

Test constantly, but test rigorously. Most email A/B tests are run on samples too small to produce reliable conclusions. Say so when it is true.

The volume reality

What detecting a small shift actually costs

At a 2% baseline campaign click rate, you need roughly 80,000 sends per cell to detect a 10% relative shift (2.0% to 2.2%) at 95% confidence (two-sided) and 80% power, and around 315,000 per cell to detect a 5% relative shift.
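
These figures can be approximated with the standard normal-approximation formula for comparing two proportions. The following is a minimal sketch, assuming a two-sided test and equal cell sizes; the function name `sends_per_cell` and its parameters are illustrative, not taken from any platform's tooling.

```python
from math import ceil
from statistics import NormalDist


def sends_per_cell(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate sends per cell for a two-sided two-proportion z-test.

    baseline      -- control click rate, e.g. 0.02 for 2%
    relative_lift -- smallest relative shift worth detecting, e.g. 0.10
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


for lift in (0.10, 0.05):
    n = sends_per_cell(0.02, lift)
    print(f"{lift:.0%} relative shift: {n:,} sends per cell")
```

Halving the detectable shift roughly quadruples the required volume, because the sample size scales with the inverse square of the absolute difference.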

The platform intermediation effects people argue about are usually smaller than that. If your list is in the tens of thousands or low six figures, most elaborate tests will give you a confidence interval that is wide and centred near zero, which cannot separate a real effect from noise. See Volume thresholds and Sample size and power.
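
To see how wide that interval gets, the sketch below estimates the 95% margin on the difference between two cells, expressed relative to the baseline rate. It assumes both cells sit close to the same 2% baseline and uses a normal approximation; `relative_margin` is a hypothetical helper name, and the cell sizes are example values only.

```python
from math import sqrt
from statistics import NormalDist


def relative_margin(baseline, sends_per_cell, confidence=0.95):
    """Half-width of the CI on the difference between two cells,
    as a fraction of the baseline rate (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the difference, assuming both cells are near baseline.
    se = sqrt(2 * baseline * (1 - baseline) / sends_per_cell)
    return z * se / baseline


for n in (10_000, 25_000, 100_000):
    margin = relative_margin(0.02, n)
    print(f"{n:>7,} per cell: +/- {margin:.1%} relative")
```

Under these assumptions, 25,000 sends per cell gives a margin of roughly 12% relative in either direction, so a true 5% shift is indistinguishable from no effect.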

What rigour requires

A real holdout, and the discipline to trust it over the dashboard.
Comfort with a distribution rather than a verdict. This is the harder half, and it is a hiring and culture problem, not a software one.
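
As one illustration of reporting a distribution rather than a verdict, the sketch below draws from Beta posteriors for two cells and summarises the relative lift as an interval plus a probability. It assumes a uniform Beta(1, 1) prior and simple Monte Carlo sampling; the click counts are invented for the example, and `lift_posterior` is a hypothetical helper, not part of any testing tool.

```python
import random


def lift_posterior(clicks_a, sends_a, clicks_b, sends_b, draws=20_000, seed=0):
    """Summarise the posterior of relative lift (variant B over control A)."""
    rng = random.Random(seed)  # fixed seed so the summary is repeatable
    lifts = []
    for _ in range(draws):
        # Beta(1 + clicks, 1 + non-clicks): posterior under a uniform prior.
        rate_a = rng.betavariate(1 + clicks_a, 1 + sends_a - clicks_a)
        rate_b = rng.betavariate(1 + clicks_b, 1 + sends_b - clicks_b)
        lifts.append(rate_b / rate_a - 1)
    lifts.sort()
    return {
        "p_variant_better": sum(lift > 0 for lift in lifts) / draws,
        "low": lifts[int(0.025 * draws)],
        "median": lifts[draws // 2],
        "high": lifts[int(0.975 * draws)],
    }


# Hypothetical counts: 400 vs 430 clicks on 20,000 sends per cell.
summary = lift_posterior(400, 20_000, 430, 20_000)
print(summary)
```

With counts like these, the interval spans both negative and positive lift, so the honest summary is a range and a probability, not a winner.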

Related

Sample size and power
Frequentist and Bayesian testing
Uplift and incrementality

description	Most email A/B tests run on samples too small to produce reliable conclusions. Call this out when relevant.
tags	principle, testing, statistics, power

Email, Lifecycle & CRM Marketing Knowledge Base