Confidence Interval & Hypothesis Test Wizard

Run t and z workflows for means and proportions, review each formula step, and share the same setup with your team.

Choose a scenario and enter summary statistics. The wizard returns test statistics, critical values, confidence intervals, p-values, and a clear step log. Share the URL to reproduce the same setup.

Inputs

The form has three groups: Scenario, Confidence & tails, and Sample summary.

Results

Provide inputs and run the analysis to see the summary, interval, and decision.

Static example before you run analysis: with the one-sample mean defaults (n=10, x̄=5.2, s=2.4, null mean 4, 95% confidence), the standard error is about 0.759, the test statistic is t≈1.581 with df=9, and the 95% confidence interval is about [3.483, 6.917]. The two-tailed p-value is about 0.148, so this example does not reject the null at α = 0.05.
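
The figures in the static example can be checked with a short script. This is a minimal sketch, not the wizard's own code: it uses only the Python standard library, integrates the Student's t density numerically with Simpson's rule, and finds the critical value by bisection. The helper names (t_pdf, t_cdf, t_critical) are illustrative assumptions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(prob, df):
    """Quantile for prob > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# One-sample mean defaults quoted in the static example
n, mean, sd, null_mean, conf = 10, 5.2, 2.4, 4.0, 0.95

df = n - 1
se = sd / math.sqrt(n)                        # standard error of the mean
t_stat = (mean - null_mean) / se              # test statistic
p_two = 2 * (1 - t_cdf(abs(t_stat), df))      # two-tailed p-value
t_crit = t_critical(1 - (1 - conf) / 2, df)   # two-sided critical value
ci = (mean - t_crit * se, mean + t_crit * se)

print(f"SE={se:.3f}  t={t_stat:.3f}  df={df}")
print(f"CI=[{ci[0]:.3f}, {ci[1]:.3f}]  p={p_two:.3f}")
```

The interval contains the null mean of 4, which agrees with the decision not to reject at α = 0.05.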

P-value visual

The shaded area represents the p-value relative to the null distribution (Student’s t or standard normal).

Teacher notes

How to use the interval and test workflow

Pick the statistical question first: one mean, two means, one proportion, two proportions, or paired data. Then enter either summary statistics or raw counts, not a mix of the two, so that the confidence interval, p-value, and effect direction all rest on the same assumptions.

How it works

The wizard chooses the matching z, t, or proportion procedure from your sample size, standard deviation, and alternative-hypothesis settings. Calculations keep internal precision and round only for display, so use the shown interval endpoints, test statistic, and p-value as a coherent report rather than mixing them with another tool's rounded intermediates.
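
As a small illustration of why rounded intermediates should not be mixed in, the sketch below recomputes the one-sample t statistic from the page's default inputs twice: once at full precision and once from a standard error rounded to two decimals, as if copied from another tool's display. The two-decimal rounding is an assumption chosen for the example.

```python
import math

n, mean, sd, null_mean = 10, 5.2, 2.4, 4.0

# Full internal precision
se_full = sd / math.sqrt(n)
t_full = (mean - null_mean) / se_full

# Standard error rounded to two decimals before reuse
se_rounded = round(se_full, 2)
t_from_rounded = (mean - null_mean) / se_rounded

print(f"{t_full:.3f} vs {t_from_rounded:.3f}")  # prints "1.581 vs 1.579"
```

The gap is small here, but it can matter when a statistic sits near a critical value.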

When to use

Use this page for classroom checks, experiment triage, A/B-test sanity checks, and early analysis notes where you need transparent assumptions before a fuller statistical review. It is not a substitute for study design, sampling-bias review, or regulated reporting.

Interpretation and worked example

Start by stating the null value and the direction of the alternative in words. After calculating, read the confidence interval as the range of plausible effect sizes and the p-value as a measure of compatibility with the null model. If the interval contains the null value, report that uncertainty explicitly instead of reducing the result to a simple pass/fail claim.

FAQ

What does the p-value shading show?

The shaded area matches the p-value under the null distribution. Two-tailed tests shade both sides, while one-tailed tests shade only one side.
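
The relationship between tails and shaded area can be sketched for the standard normal case. The function name z_p_value and the tail labels "two", "upper", and "lower" are hypothetical and are not taken from the wizard.

```python
from statistics import NormalDist

def z_p_value(z, tail="two"):
    """Area under the standard normal curve that counts as the p-value."""
    cdf = NormalDist().cdf
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))   # both tails beyond |z|
    if tail == "upper":
        return 1 - cdf(z)              # right tail only
    if tail == "lower":
        return cdf(z)                  # left tail only
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

for tail in ("two", "upper", "lower"):
    print(tail, round(z_p_value(1.96, tail), 3))
```

For a positive statistic, the two-tailed p-value is twice the upper-tail one, which is why the same z can be significant in a one-tailed test and not in a two-tailed test.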

How are the Wilson and Newcombe intervals computed?

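Wilson (score) intervals are centered on an adjusted proportion that pulls the sample proportion toward 1/2, with a width set by a z critical value. Newcombe's method combines the two single-sample Wilson intervals to form bounds for the difference between two proportions.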
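
A minimal sketch of the standard textbook forms of both intervals follows, assuming no continuity correction. The wizard's exact implementation is not shown on this page, so the function names and the sample counts are illustrative.

```python
import math
from statistics import NormalDist

def wilson(x, n, conf=0.95):
    """Wilson score interval for x successes in n trials."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom   # adjusted proportion
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def newcombe(x1, n1, x2, n2, conf=0.95):
    """Newcombe hybrid score interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson(x1, n1, conf)
    l2, u2 = wilson(x2, n2, conf)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper

# Hypothetical counts: 15 of 50 versus 10 of 50
lo, hi = wilson(15, 50)
d_lo, d_hi = newcombe(15, 50, 10, 50)
print(f"Wilson: [{lo:.3f}, {hi:.3f}]")
print(f"Newcombe: [{d_lo:.3f}, {d_hi:.3f}]")
```

Unlike the simple Wald interval, the Wilson interval stays inside [0, 1] even when the sample proportion is near 0 or 1.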

What should I define first for a confidence interval or test?

Choose the test family and enter the sample statistic, sample size, and confidence or significance level. Confirm whether you need a one-sided or two-sided interpretation before reading the result.

Why can confidence interval or test results differ from those of other tools?

Differences usually come from the test family, sidedness, alpha, or the definition of the sample statistic. Match those assumptions before comparing this result with another CalcBE page, a spreadsheet, or an external tool.

How should I judge the reliability of the result?

Treat the displayed result as reliable only for the stated test family, sidedness, alpha, and sample statistic definitions. For official reporting, regulated work, or purchasing decisions, verify the inputs against the source document or provider rule you must follow.