A/B Test Significance Calculator

Already ran your test? Enter visitors and conversions for control A and variant B to estimate conversion rates, lift, a two-proportion z-test p-value, confidence interval, and Bayesian probability that B beats A.

This page analyzes completed test results. To plan traffic before launch, use the sample size calculator or power analysis calculator.


Test inputs

Control A
Variant B

When unsure, keep the two-sided test. Use one-sided only if the direction was specified before the experiment.

Result

Control CVR: -
Variant CVR: -
Absolute lift: -
Relative lift: -
p-value: -
z statistic: -
Difference CI: -
P(B beats A): -
Bayesian lift interval: -

How to read the output

Related statistics tools

For planning before a test, use sample size or power analysis. For a one-sample exact proportion check, use the binomial test calculator. For a general confidence-interval workflow, use the CI & hypothesis test wizard. For clinical or risk-difference framing, use the risk difference confidence interval calculator.

FAQ

Which formula does this calculator use?

It uses a two-proportion z-test with a pooled standard error for the p-value. The difference interval can use Newcombe-Wilson or Wald, and the Bayesian readout samples Beta(1 + conversions, 1 + non-conversions) posteriors.
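As a rough illustration of the pooled z-test described above, here is a minimal Python sketch using only the standard library. The function name `two_proportion_z_test` and its parameters are hypothetical, chosen for this example; the calculator's own code is not shown on this page and may differ in detail.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(visitors_a, conv_a, visitors_b, conv_b, two_sided=True):
    """Pooled two-proportion z-test (hypothetical helper); returns (z, p_value)."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    # Pooled rate: the single conversion rate assumed under "A and B are equal".
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the z-test is undefined")
    z = (rate_b - rate_a) / se
    upper_tail = 1 - NormalDist().cdf(z)
    if two_sided:
        # Counts a deviation in either direction as evidence.
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    else:
        # One-sided: only "B is better than A" counts as evidence.
        p_value = upper_tail
    return z, p_value


z, p = two_proportion_z_test(1000, 100, 1000, 130)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

With 100 conversions from 1,000 control visitors and 130 from 1,000 variant visitors, z comes out a little above 2.1 and the two-sided p-value lands between 0.03 and 0.04.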

Does statistical significance mean I should ship the variant?

No. A significant p-value says only that a difference at least as large as the one observed would be unlikely if both variants had the same true conversion rate. A launch decision also needs effect size, cost, guardrail metrics, segment behavior, and business context.

Should I choose a one-sided or two-sided test?

Use the two-sided test unless you committed before the experiment that only a lift in the variant direction matters. Two-sided is the safer default because it can detect movement in either direction.

How should I read small-sample results?

When visitors or expected conversions are very low, the normal z approximation can be unstable. Treat the result as a rough signal and compare it with an exact binomial workflow or run a larger test.
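One possible cross-check for small samples is Fisher's exact test on the 2x2 table of conversions and non-conversions. This is a swapped-in technique for illustration: it is not the one-sample binomial test mentioned above, and nothing on this page indicates that the calculator uses it. The function name `fisher_exact_two_sided` is hypothetical. The sketch uses exact integer arithmetic and is intended only for small counts.

```python
from math import comb


def fisher_exact_two_sided(visitors_a, conv_a, visitors_b, conv_b):
    """Two-sided Fisher exact p-value for a 2x2 conversion table (hypothetical helper)."""
    total_conv = conv_a + conv_b
    total_visitors = visitors_a + visitors_b

    def weight(k):
        # Number of ways to place k of the total conversions in group A.
        return comb(visitors_a, k) * comb(visitors_b, total_conv - k)

    lowest = max(0, total_conv - visitors_b)
    highest = min(visitors_a, total_conv)
    observed = weight(conv_a)
    # Sum every table that is no more likely than the observed one.
    extreme = sum(weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed)
    return extreme / comb(total_visitors, total_conv)


# 8 of 10 control visitors convert versus 1 of 6 variant visitors.
print(round(fisher_exact_two_sided(10, 8, 6, 1), 4))  # 0.035
```

If the exact p-value and the z-test p-value disagree noticeably, that is a hint the normal approximation is not reliable at this sample size.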

Why can the frequentist and Bayesian readings look different?

The p-value asks how surprising the data would be if both rates were equal. The Bayesian probability estimates how often the variant rate beats the control rate under the chosen posterior model, so the numbers answer different questions.
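The Bayesian readout can be sketched as a Monte Carlo comparison of two Beta(1 + conversions, 1 + non-conversions) posteriors, as the formula answer above describes. The function name `bayesian_readout`, the number of draws, the fixed seed, and the choice to report a 95% interval on relative lift are all assumptions made for this example.

```python
import random


def bayesian_readout(visitors_a, conv_a, visitors_b, conv_b, draws=20000, seed=0):
    """Return (P(B beats A), 95% interval for relative lift). Hypothetical helper."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    wins = 0
    lifts = []
    for _ in range(draws):
        # Beta(1 + conversions, 1 + non-conversions) posterior for each arm.
        rate_a = rng.betavariate(1 + conv_a, 1 + visitors_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + visitors_b - conv_b)
        wins += rate_b > rate_a
        lifts.append((rate_b - rate_a) / rate_a)
    lifts.sort()
    lower = lifts[int(0.025 * draws)]
    upper = lifts[int(0.975 * draws) - 1]
    return wins / draws, (lower, upper)


prob, lift_interval = bayesian_readout(1000, 100, 1000, 130)
print(f"P(B beats A) = {prob:.3f}, relative lift interval = {lift_interval}")
```

For the same data, a two-sided p-value near 0.035 and a P(B beats A) near 0.98 are both plausible outputs, which shows why the two numbers should not be compared directly.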

Can I check the result many times while the test is running?

Repeated peeking increases false positives for ordinary fixed-horizon tests. Plan the sample size and analysis rule first, or use a sequential method that is designed for interim checks.
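A small simulation can make the peeking problem concrete. The sketch below runs A/A tests, where both arms share the same true rate, and compares how often a pooled z-test crosses the 5% threshold at any interim look versus only at the final look. The true rate, number of looks, visitors per look, and run count are arbitrary assumptions chosen to keep the example fast.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_stat(visitors, conv_a, conv_b):
    """Pooled z statistic for two arms with equal visitor counts."""
    pooled = (conv_a + conv_b) / (2 * visitors)
    se = sqrt(pooled * (1 - pooled) * (2 / visitors))
    return 0.0 if se == 0 else (conv_b - conv_a) / visitors / se


def simulate_peeking(true_rate=0.10, looks=10, visitors_per_look=100,
                     runs=500, alpha=0.05, seed=1):
    """Return (false-positive rate with peeking, false-positive rate at final look)."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    peek_hits = final_hits = 0
    for _run in range(runs):
        visitors = conv_a = conv_b = 0
        crossed = significant = False
        for _look in range(looks):
            visitors += visitors_per_look
            conv_a += sum(rng.random() < true_rate for _ in range(visitors_per_look))
            conv_b += sum(rng.random() < true_rate for _ in range(visitors_per_look))
            significant = abs(z_stat(visitors, conv_a, conv_b)) > critical
            crossed = crossed or significant  # stop-at-first-win behaviour
        peek_hits += crossed
        final_hits += significant  # result at the planned horizon only
    return peek_hits / runs, final_hits / runs


peeking_rate, fixed_rate = simulate_peeking()
print(f"any-look false positives: {peeking_rate:.2f}, final-look only: {fixed_rate:.2f}")
```

Because there is no real difference between the arms, every "significant" result here is a false positive, and the any-look rate should come out well above the nominal 5%.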

Why might my result differ from another A/B calculator?

Calculators can differ by one-sided versus two-sided tests, pooled versus unpooled standard errors, Wald versus Newcombe intervals, prior choices, rounding, and whether they apply sequential or multiple-testing adjustments.
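To see how much the interval method alone can matter, the sketch below computes a 95% interval for the difference B minus A with both the Wald method (unpooled standard error) and the Newcombe method built from per-arm Wilson score intervals. The function names are hypothetical, and this is an illustration of the two standard formulas rather than a copy of this calculator's code.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(conversions, visitors, z):
    """Wilson score interval for a single proportion."""
    rate = conversions / visitors
    denom = 1 + z * z / visitors
    center = (rate + z * z / (2 * visitors)) / denom
    half = z * sqrt(rate * (1 - rate) / visitors + z * z / (4 * visitors ** 2)) / denom
    return center - half, center + half


def difference_intervals(visitors_a, conv_a, visitors_b, conv_b, confidence=0.95):
    """Return (wald, newcombe) intervals for rate_b - rate_a."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    rate_a, rate_b = conv_a / visitors_a, conv_b / visitors_b
    diff = rate_b - rate_a
    # Wald: normal approximation with an unpooled standard error.
    se = sqrt(rate_a * (1 - rate_a) / visitors_a + rate_b * (1 - rate_b) / visitors_b)
    wald = (diff - z * se, diff + z * se)
    # Newcombe: combine the Wilson limits of each arm.
    low_a, high_a = wilson_interval(conv_a, visitors_a, z)
    low_b, high_b = wilson_interval(conv_b, visitors_b, z)
    newcombe = (
        diff - sqrt((rate_b - low_b) ** 2 + (high_a - rate_a) ** 2),
        diff + sqrt((high_b - rate_b) ** 2 + (rate_a - low_a) ** 2),
    )
    return wald, newcombe


wald, newcombe = difference_intervals(1000, 100, 1000, 130)
print("Wald:    ", wald)
print("Newcombe:", newcombe)
```

At 1,000 visitors per arm the two intervals nearly coincide; the gap tends to widen as samples shrink or rates approach 0 or 1.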