Results
PMF formula
P(X=k)=C(K,k)·C(N−K,n−k)/C(N,n)
Tip: “at least one” is P(X≥1)=1−P(X=0).
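A minimal sketch of the formula and the tip above, in Python. The function name `hypergeom_pmf` is illustrative and not part of the calculator; it assumes non-negative integer inputs with K ≤ N and n ≤ N.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when drawing n items without replacement from N items, K of them successes."""
    # Values of k outside the support have probability zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# "At least one" via the complement: P(X >= 1) = 1 - P(X = 0).
p_zero = hypergeom_pmf(0, N=52, K=4, n=5)   # no aces in a 5-card hand
p_at_least_one = 1 - p_zero
print(round(p_zero, 4), round(p_at_least_one, 4))
```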
Distribution (PMF table & bar chart)
| k | P(X=k) | CDF |
|---|---|---|
Simulation (Monte Carlo)
Set trials and seed to make a run reproducible. For large ranges, the tool bins the histogram to stay fast.
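The idea of a seeded Monte Carlo run can be sketched in Python as follows. This is not the tool's own code: the function name, the use of `random.Random`, and the absence of histogram binning are all assumptions made for illustration.

```python
import random
from collections import Counter

def simulate_hypergeom(N, K, n, trials, seed):
    """Estimate P(X = k) by repeatedly drawing n items without replacement."""
    rng = random.Random(seed)                 # a fixed seed makes the run repeatable
    population = [1] * K + [0] * (N - K)      # 1 marks a success, 0 a failure
    counts = Counter()
    for _ in range(trials):
        successes = sum(rng.sample(population, n))   # sample() draws without replacement
        counts[successes] += 1
    return {k: c / trials for k, c in sorted(counts.items())}

freq = simulate_hypergeom(N=52, K=4, n=5, trials=20000, seed=42)
for k, share in freq.items():
    print(k, round(share, 4))
```

Running it twice with the same `trials` and `seed` gives identical frequencies; changing the seed gives a slightly different estimate of the same distribution.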
Worked examples & interpretation
What the inputs mean
- N: population size (total items).
- K: number of “successes” in the population (items you care about).
- n: number of draws (sample size).
- k: successes in your sample (the random variable X).
When to use this (vs. binomial)
- Hypergeometric: sampling without replacement (the success probability changes after each draw).
- Binomial: independent trials with a fixed success probability (often “with replacement” or a very large population).
- Rule of thumb: if the sampling fraction n/N is small, the hypergeometric distribution and the binomial with p = K/N give similar results.
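To see how the sampling fraction affects the match between the two models, here is a rough Python comparison. The helper names and the two example populations are chosen for illustration only.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible values of k get probability 0 automatically.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def max_abs_gap(N, K, n):
    """Largest pointwise difference between the two PMFs, using p = K/N."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p))
               for k in range(n + 1))

small_fraction = max_abs_gap(N=10000, K=1000, n=10)   # n/N = 0.001
large_fraction = max_abs_gap(N=20, K=2, n=10)         # n/N = 0.5
print(small_fraction, large_fraction)
```

Both cases use p = 0.1 and n = 10; only the population size differs, and the gap is far larger when the sample is half the population.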
Worked examples (try the presets)
Cards: draw 5 from 52, how likely to get exactly 2 aces?
Set N=52, K=4, n=5, query “Exactly” with k=2.
Inspection: 10 defectives in 100, sample 8 — probability of at least 1 defective?
Set N=100, K=10, n=8, query “At least” with k=1.
“At least one” quickly (complement trick)
Compute P(X ≥ 1) = 1 − P(X = 0). You can also use the helper button “P(X≥1)”.
Common pitfalls
- Invalid k: the support runs from k_min = max(0, n−(N−K)) to k_max = min(n, K). Outside it, P(X=k) = 0.
- Define “success” clearly: “success” is just the label for the item type you are counting (e.g., “defective”, “red”, “ace”).
- Large ranges: the PMF table can be omitted for performance; use the probability result and/or simulation.
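The support bounds from the first pitfall are easy to compute directly. A small Python sketch, with an illustrative function name:

```python
def support(N, K, n):
    """Return (k_min, k_max), the smallest and largest possible success counts."""
    k_min = max(0, n - (N - K))   # draws left over once every failure has been taken
    k_max = min(n, K)             # cannot exceed the draws or the available successes
    return k_min, k_max

# Only 3 failures exist, so a 5-item sample must contain at least 2 successes.
print(support(N=10, K=7, n=5))    # (2, 5)
print(support(N=52, K=4, n=5))    # (0, 4)
```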
How to use this calculator effectively
Use this calculator when samples are drawn without replacement from a finite population. Define population size, success count, draw size, and the target success count before interpreting any tail probability.
How it works
The page evaluates the hypergeometric probability mass and cumulative tails from combinations. It keeps the finite-population assumption explicit, so compare scenarios by changing N, K, n, or k one at a time and watching how the exact probability and expected successes move.
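As an illustration of changing one input at a time, the Python sketch below varies only the draw size n and reports the expected successes, using the standard mean E[X] = n·K/N, alongside P(X ≥ 1). The function names and the chosen values of n are assumptions for the example.

```python
from math import comb

def prob_at_least(k, N, K, n):
    """P(X >= k), summed from combinations."""
    upper = min(n, K)
    favourable = sum(comb(K, j) * comb(N - K, n - j) for j in range(k, upper + 1))
    return favourable / comb(N, n)

def expected_successes(N, K, n):
    return n * K / N

N, K = 100, 10
rows = [(n, expected_successes(N, K, n), prob_at_least(1, N, K, n))
        for n in (4, 8, 16)]          # only n changes between scenarios
for n, mean, p in rows:
    print(f"n={n:2d}  E[X]={mean:.2f}  P(X>=1)={p:.4f}")
```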
When to use
Use it for quality-control sampling, card or lottery-style draws, classroom probability checks, and risk thresholds where items are not returned after each draw. If draws are independent or effectively with replacement, compare against a binomial model instead.
Common mistakes to avoid
- Using hypergeometric when each draw is independent or the population is effectively unlimited.
- Entering sample size n larger than population size N, or success count K larger than N.
- Comparing tail probabilities without checking whether you need exactly k, at most k, or at least k successes.
- Treating a rounded display value as an exact probability in downstream work.
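The input mistakes in the list above can be caught before any probability is computed. The following Python check is a hypothetical helper, not the calculator's own validation logic.

```python
def validate_inputs(N, K, n):
    """Return a list of problems with the inputs; an empty list means they are usable."""
    problems = []
    for name, value in (("N", N), ("K", K), ("n", n)):
        if not isinstance(value, int) or value < 0:
            problems.append(f"{name} must be a non-negative integer")
    if not problems:
        if K > N:
            problems.append("success count K cannot exceed population size N")
        if n > N:
            problems.append("sample size n cannot exceed population size N")
    return problems

print(validate_inputs(52, 4, 5))     # []
print(validate_inputs(10, 12, 11))   # two problems: K > N and n > N
```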
Interpretation and worked example
Start with one sampling plan and record P(X = k), P(X ≤ k), and P(X ≥ k). Then change only the draw size or success threshold to see how risk changes. If the direction feels surprising, verify that success and failure categories match the real population before using the result.
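A sketch of recording the three quantities for one sampling plan, in Python. The function names are illustrative. Note that both tails include k itself, so P(X ≤ k) + P(X ≥ k) − P(X = k) = 1.

```python
from math import comb

def pmf(j, N, K, n):
    return comb(K, j) * comb(N - K, n - j) / comb(N, n)

def tails(k, N, K, n):
    """Return (P(X = k), P(X <= k), P(X >= k))."""
    lo, hi = max(0, n - (N - K)), min(n, K)       # support bounds
    exactly = pmf(k, N, K, n) if lo <= k <= hi else 0.0
    at_most = sum(pmf(j, N, K, n) for j in range(lo, min(k, hi) + 1))
    at_least = sum(pmf(j, N, K, n) for j in range(max(k, lo), hi + 1))
    return exactly, at_most, at_least

exactly, at_most, at_least = tails(1, N=100, K=10, n=8)
print(exactly, at_most, at_least)
```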
FAQ
What is the hypergeometric model used for?
It models draws from a finite population where each item is not returned, so probabilities change after each draw.
How do I set inputs?
Set population size N, number of success items K, draw size n, and target success count k (or a range) before checking tails and cumulative values.
Why does this differ from binomial?
The binomial model assumes independent draws with a fixed success probability, as when sampling with replacement. The hypergeometric model accounts for the population being depleted by each draw, so it is exact for sampling without replacement.
How should I interpret the tail result?
Tail results summarize 'at least / at most' scenarios and are useful for risk checks, thresholds, and decision boundaries.
Can I use results for decisions?
Use them as a quantitative signal for comparing scenarios, then check the model assumptions and operational limits against your actual situation before making a final decision.
How it’s calculated
C(K,k)·C(N−K,n−k)/C(N,n), computed in log-space for stability.
Support: k_min = max(0, n−(N−K)), k_max = min(n, K).
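One common way to evaluate the formula in log-space is through the log-gamma function, sketched below in Python. The page does not say which method it uses, so `lgamma` and the function names here are assumptions.

```python
from math import exp, lgamma

def log_comb(a, b):
    """Natural log of C(a, b), using ln(m!) = lgamma(m + 1)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def hypergeom_pmf_log(k, N, K, n):
    k_min, k_max = max(0, n - (N - K)), min(n, K)
    if k < k_min or k > k_max:
        return 0.0                      # outside the support
    log_p = log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)
    return exp(log_p)                   # exponentiate only at the very end

print(hypergeom_pmf_log(2, N=52, K=4, n=5))
print(hypergeom_pmf_log(500, N=10**6, K=10**5, n=5000))
```

Adding and subtracting logarithms avoids forming the very large binomial coefficients directly, at the cost of a small floating-point error.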