Choose a scenario and enter summary statistics. The wizard returns test statistics, critical values, confidence intervals, p-values, and a clear step log. Share the URL to reproduce the same setup.
Results
Provide inputs and run the analysis to see the summary, interval, and decision.
Static example shown before you run the analysis: with the one-sample mean defaults (n=10, x̄=5.2, s=2.4, null mean 4, 95% confidence), the standard error is about 0.759, the test statistic is t≈1.581 with df=9, and the 95% confidence interval is about [3.483, 6.917]. The two-tailed p-value is about 0.148, so this example does not reject the null at α = 0.05.
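The arithmetic behind the static example can be reproduced with a short sketch. The critical value below is an assumption taken from a standard t table (df = 9, 0.975 quantile) rather than computed, and the variable names are illustrative only, not the wizard's own.

```python
import math

# Inputs from the static one-sample example.
n, mean, sd, null_mean = 10, 5.2, 2.4, 4.0

# Assumption: tabulated t critical value for df = 9 at the 0.975 quantile.
T_CRIT_DF9_975 = 2.262157

se = sd / math.sqrt(n)                 # standard error of the mean
t_stat = (mean - null_mean) / se       # one-sample t statistic
df = n - 1                             # degrees of freedom
margin = T_CRIT_DF9_975 * se           # half-width of the 95% interval
ci = (mean - margin, mean + margin)

print(f"SE = {se:.3f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Rounded to three decimals, this gives SE 0.759, t 1.581, and the interval [3.483, 6.917], matching the figures quoted above.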
Key metrics
Conclusion
How it is computed
P-value visual
Teacher notes
- Student’s t quantiles are derived via the regularised incomplete beta function, matching textbook lookup tables even for small samples.
- Welch degrees of freedom handle unequal variances, while Wilson score intervals and Newcombe’s interval for a difference of proportions keep coverage close to nominal when proportions are near 0 or 1.
- The shareable URL stores the scenario, summary statistics, tail choice, and confidence level so group members can replicate the report instantly.
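As a rough illustration of the first note, the sketch below evaluates the regularised incomplete beta function with a continued fraction and uses it for the Student's t CDF, a two-tailed p-value, and a quantile found by bisection. The function names, the iteration limits, and the bisection search are assumptions made for this example; the wizard's actual implementation is not shown on this page.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularised incomplete beta I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def two_tailed_p(t, df):
    """P(|T| >= |t|) for Student's t with df degrees of freedom."""
    return reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))

def t_cdf(t, df):
    tail = 0.5 * two_tailed_p(t, df)
    return 1.0 - tail if t >= 0 else tail

def t_quantile(p, df, lo=-1000.0, hi=1000.0):
    """Quantile by bisection on the CDF (simple, not the fastest approach)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(two_tailed_p(1.5811388, 9), 3))
print(round(t_quantile(0.975, 9), 3))
```

With the one-sample defaults (t≈1.581, df=9) the p-value comes out near 0.148, and the 0.975 quantile for df=9 is close to the tabulated 2.262.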
How to use the interval and test workflow
Pick the statistical question first: one mean, two means, one proportion, two proportions, or paired data. Then enter either summary statistics or raw counts consistently so the confidence interval, p-value, and effect direction describe the same assumption set.
How it works
The wizard chooses the matching z, t, or proportion procedure from your sample size, standard deviation, and alternative-hypothesis settings. Calculations keep internal precision and round only for display, so use the shown interval endpoints, test statistic, and p-value as a coherent report rather than mixing them with another tool's rounded intermediates.
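A small sketch of why rounded intermediates should not be mixed in: using the one-sample defaults from this page, rounding the standard error to two decimals before computing t shifts the statistic in the third decimal. The variable names are illustrative assumptions.

```python
import math

n, mean, sd, null_mean = 10, 5.2, 2.4, 4.0

# Full internal precision, rounded only when displayed.
se_full = sd / math.sqrt(n)
t_full = (mean - null_mean) / se_full

# Same calculation, but starting from a standard error rounded to 0.76.
se_rounded = round(se_full, 2)
t_from_rounded = (mean - null_mean) / se_rounded

print(f"t from full precision: {t_full:.3f}")
print(f"t from rounded SE:     {t_from_rounded:.3f}")
```

Here the two values are about 1.581 and 1.579. The gap is small, but the same effect can move a borderline p-value across a threshold.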
When to use
Use this page for classroom checks, experiment triage, A/B-test sanity checks, and early analysis notes where you need transparent assumptions before a fuller statistical review. It is not a substitute for study design, sampling-bias review, or regulated reporting.
Common mistakes to avoid
- Using a one-tailed alternative when the research question is actually two-tailed.
- Choosing an independent two-sample test for paired before/after observations.
- Mixing sample standard deviation, population standard deviation, and standard error.
- Reading a p-value as the probability that the null hypothesis is true.
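To illustrate the paired-versus-independent mistake, the sketch below computes both a paired t statistic and a Welch two-sample t statistic on the same before/after measurements. The data are hypothetical, invented for this example, and the Welch degrees of freedom use the usual Welch-Satterthwaite approximation.

```python
import math
from statistics import mean, variance

# Hypothetical before/after measurements on the same six subjects.
before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]
after = [12.9, 12.0, 13.9, 13.1, 12.8, 13.2]
n = len(before)

# Paired analysis: work with the per-subject differences.
diffs = [a - b for a, b in zip(after, before)]
se_paired = math.sqrt(variance(diffs) / n)
t_paired = mean(diffs) / se_paired

# Welch analysis: treats the two columns as independent samples.
v1 = variance(before) / n
v2 = variance(after) / n
t_welch = (mean(after) - mean(before)) / math.sqrt(v1 + v2)
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n - 1) + v2 ** 2 / (n - 1))

print(f"paired: t = {t_paired:.2f}, df = {n - 1}")
print(f"Welch:  t = {t_welch:.2f}, df = {df_welch:.2f}")
```

Because between-subject variation cancels in the differences, the paired statistic is much larger here than the Welch statistic. Choosing the independent test for paired data would discard that information.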
Interpretation and worked example
Start by stating the null value and alternative direction in words. After calculating, read the confidence interval for plausible effect sizes and the p-value for compatibility with the null model. If the interval contains the null value, report that uncertainty explicitly instead of turning the result into a simple pass/fail claim.
FAQ
What does the p-value shading show?
The shaded area matches the p-value under the null distribution. Two-tailed tests shade both sides, while one-tailed tests shade only one side.
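The relationship between tail choice and shaded area can be sketched for a z statistic with the standard library's `NormalDist`. The function name `shaded_area` and its tail labels are assumptions for this example; for a t test the same logic applies with the t distribution's CDF.

```python
from statistics import NormalDist

def shaded_area(z, tail):
    """Area under the standard normal curve that the p-value represents."""
    std = NormalDist()
    if tail == "two":
        return 2 * (1 - std.cdf(abs(z)))   # both sides beyond |z|
    if tail == "upper":
        return 1 - std.cdf(z)              # right side only
    if tail == "lower":
        return std.cdf(z)                  # left side only
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

for tail in ("two", "upper", "lower"):
    print(tail, round(shaded_area(1.96, tail), 3))
```

For z = 1.96 the two-tailed area is about 0.05, twice the upper-tail area of about 0.025.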
How are the Wilson and Newcombe intervals computed?
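A Wilson interval is centred on the sample proportion shrunk toward 0.5 and uses a z critical value for its width. Newcombe's method combines the Wilson limits of the two individual proportions into lower and upper bounds for their difference.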
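A minimal sketch of both intervals, assuming the standard Wilson score formula and Newcombe's hybrid score method for a difference of independent proportions (without continuity correction). The function names are hypothetical, and the page does not state whether the wizard applies a continuity correction.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a single proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def newcombe_interval(x1, n1, x2, n2, confidence=0.95):
    """Newcombe hybrid score interval for p1 - p2 (independent samples)."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, confidence)
    l2, u2 = wilson_interval(x2, n2, confidence)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper

print(wilson_interval(10, 20))
print(newcombe_interval(56, 70, 48, 80))
```

For 10 successes out of 20, the Wilson interval is roughly (0.299, 0.701). For 56/70 versus 48/80, the Newcombe interval for the difference is roughly (0.052, 0.334).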
What should I define first for confidence intervals or tests?
Choose the test family and enter the sample statistic, sample size, and confidence or significance level. Confirm whether you need a one-sided or two-sided interpretation before reading the result.
Why can confidence interval or test results differ from those of other tools?
Differences usually come from test family, sidedness, alpha, and sample statistic definitions. Match those assumptions before comparing this result with another CalcBE page, spreadsheet, or external tool.
How should I judge the reliability of the result?
Treat the displayed result as reliable only for the stated test family, sidedness, alpha, and sample statistic definitions. For official reporting, regulated work, or purchasing decisions, verify the inputs against the source document or provider rule you must follow.