Reading Your Results
Significance, confidence intervals, and the discipline of not peeking.
Statistical significance is a threshold, not a verdict. A 95% confidence threshold means that when there is no real difference, roughly one test in twenty will still cross the line by chance. Run twenty flat tests and expect about one to look like a winner by accident.
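The one-in-twenty arithmetic can be checked with a few lines. This is a minimal sketch, not part of any testing tool: the function name falsePositiveOdds and its parameters are made up for illustration, and it assumes the tests are independent and all judged at the same 5% threshold.

```javascript
// Sketch: how often do truly flat tests produce a false "winner"?
// Assumption: tests are independent and share one significance level (alpha).
function falsePositiveOdds(testCount, alpha = 0.05) {
  // Average number of flat tests that cross the threshold by chance.
  const expectedFalseWinners = testCount * alpha;
  // Probability that at least one flat test crosses the threshold.
  const chanceOfAtLeastOne = 1 - Math.pow(1 - alpha, testCount);
  return { expectedFalseWinners, chanceOfAtLeastOne };
}

const odds = falsePositiveOdds(20);
console.log(odds.expectedFalseWinners.toFixed(1)); // "1.0"
console.log(odds.chanceOfAtLeastOne.toFixed(2));   // "0.64"
```

Under those assumptions, twenty flat tests yield one false winner on average, and the chance of seeing at least one is closer to two in three than one in twenty.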
Always read the confidence interval alongside the point estimate. '+3.2% with a CI of -0.5% to +6.9%' means the test cannot rule out a small loss. '+3.2% with a CI of +1.4% to +5.1%' means the whole interval sits above zero: you have something.
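The rule above is mechanical enough to write down. The sketch below is one possible reading of it: the function name readInterval and the labels 'win', 'loss', and 'inconclusive' are hypothetical, chosen here for illustration, and the inputs are assumed to be lifts in percent as in the two quoted examples.

```javascript
// Sketch: classify a result by where its confidence interval sits relative to zero.
// The labels are illustrative, not taken from any particular tool.
function readInterval({ ciLow, ciHigh }) {
  if (ciLow > 0) return 'win';   // entire interval above zero
  if (ciHigh < 0) return 'loss'; // entire interval below zero
  return 'inconclusive';         // interval includes zero
}

// The two examples from the text, with the same +3.2% point estimate:
console.log(readInterval({ lift: 3.2, ciLow: -0.5, ciHigh: 6.9 })); // "inconclusive"
console.log(readInterval({ lift: 3.2, ciLow: 1.4, ciHigh: 5.1 }));  // "win"
```

Note that the function never looks at the point estimate. Both examples report +3.2%, and only the interval separates them.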
Segment the result only after you have read the primary metric, never before. Segment results are hypotheses for the next test, not conclusions for this one. And do not call a result early because the dashboard looks good on day three. The dashboard always looks good on day three.
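One way to enforce the no-peeking rule is to fix the stopping conditions before launch and check them in code rather than by eye. The sketch below assumes a plan with a planned duration and a planned sample size; canCallResult and the fields plannedDays, plannedVisitors, daysRun, and visitors are all hypothetical names, not part of any real dashboard.

```javascript
// Sketch: refuse to call a test until the pre-registered plan is complete.
// All field names here are assumptions made for illustration.
function canCallResult(plan, progress) {
  const ranFullDuration = progress.daysRun >= plan.plannedDays;
  const reachedSample = progress.visitors >= plan.plannedVisitors;
  return ranFullDuration && reachedSample;
}

const plan = { plannedDays: 14, plannedVisitors: 20000 };

// Day three: the dashboard may look great, but the plan is not done.
console.log(canCallResult(plan, { daysRun: 3, visitors: 4100 }));   // false
// Plan complete: now read the primary metric.
console.log(canCallResult(plan, { daysRun: 14, visitors: 20350 })); // true
```

The design choice is that the check never sees the lift or the interval, so a good-looking number cannot influence the decision to stop.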
Example
// A flat result is still a result. Log it.
const result = {
  test: 'hero-cta-verb-v1',
  outcome: 'flat',
  primaryMetric: 'ctr',
  lift: 0.4,
  ciLow: -1.2,  // the interval from ciLow to ciHigh includes zero...
  ciHigh: 2.0,
  significant: false,  // ...so the result is not significant
};

Try this on the Shop
The best way to learn is to ship. Open this surface and apply the lesson.
Hold the Stone of Significance