Dirichlet Distribution Generator & Visualizer

Q: Why do components negatively correlate?

Because all components must sum to 1, any increase in one component has to be offset by a decrease in the others. For a Dirichlet distribution, every pair of components therefore has negative covariance.
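The size of that negative correlation follows from the standard Dirichlet moments. The snippet below is a minimal sketch, not this tool's code; the function name dirichlet_correlation is illustrative.

```python
import math


def dirichlet_correlation(alpha):
    """Theoretical correlation matrix for Dirichlet(alpha), all alpha_i > 0, K >= 2."""
    a0 = sum(alpha)
    k = len(alpha)
    corr = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                # Cov(x_i, x_j) = -a_i * a_j / (a0^2 * (a0 + 1)); dividing by the
                # two standard deviations leaves this closed form.
                ratio = (alpha[i] * alpha[j]) / ((a0 - alpha[i]) * (a0 - alpha[j]))
                corr[i][j] = -math.sqrt(ratio)
    return corr


corr = dirichlet_correlation([2.0, 3.0, 5.0])
for row in corr:
    print([round(c, 3) for c in row])
```

Every off-diagonal entry is negative. With K=2 the correlation is exactly -1, because the second component is fully determined by the first.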

Q: Why do samples stick to corners?

If one or more α values are below 1, or total concentration is small, the density moves toward simplex boundaries.

Q: Why can the exported row stop summing to 1 exactly?

Rounded output can lose exact equality even though the underlying sample still sums to 1 before rounding.
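A small sketch of the effect, assuming each component is rounded independently (the tool's actual export logic may differ). The helper names round_row and sums_to_one are hypothetical; sums_to_one shows one way a downstream check could allow for the worst-case rounding error.

```python
def round_row(row, decimals):
    """Round each component independently, as a plain export step might."""
    return [round(x, decimals) for x in row]


def sums_to_one(row, decimals, slack=1e-12):
    """Accept a rounded row if its sum is within the worst-case rounding error of 1."""
    # Each value can move by at most half a unit in the last kept decimal place.
    tolerance = len(row) * 0.5 * 10 ** (-decimals) + slack
    return abs(sum(row) - 1.0) <= tolerance


exact = [1 / 3, 1 / 3, 1 / 3]
rounded = round_row(exact, 2)      # [0.33, 0.33, 0.33]
print(sum(exact), sum(rounded))
print(sums_to_one(rounded, 2))
```

The rounded row sums to about 0.99 instead of 1, yet it still passes the tolerance check because the gap is explained by rounding alone.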

Q: How is this different from Beta?

A Dirichlet distribution covers probability vectors of any length K, while Beta is the K=2 special case described through a single component x1 (the other component is 1 − x1).

Q: What should I do first on this page?

Start with a low-dimensional baseline, then change one α value or concentration setting at a time.

What is a Dirichlet distribution?

A Dirichlet distribution is a distribution over probability vectors (x1,…,xK) where each component is non‑negative and the total sums to 1. This space is called a simplex.
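One common way to draw such a vector is to sample independent Gamma(α_i, 1) variables and divide by their sum. The sketch below uses only Python's standard library; it illustrates the technique and is not necessarily how this tool samples. The name sample_dirichlet is illustrative.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gammas."""
    # Assumes every alpha_i > 0; extremely small alpha_i could underflow to a
    # zero total, which this sketch does not guard against.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(42)            # fixed seed so the run is repeatable
x = sample_dirichlet([2.0, 3.0, 5.0], rng)
print(x, sum(x))
```

Each call returns a point on the simplex: every entry is non-negative and the entries sum to 1 up to floating-point error.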

α (alpha) can be interpreted like pseudo‑counts. The relative sizes of α determine the mean vector.
α0 = Σα_i is the concentration (strength): larger α0 ⇒ tighter around the mean; smaller α0 ⇒ more variability.
If some α_i < 1, samples tend to be sparse and stick to corners/edges; if all α_i > 1, the density has a single mode in the interior of the simplex and falls to zero at the boundary.
K=2 is a special case: x1 ~ Beta(α1,α2) (this tool shows the Beta overlay and links to the Beta tool).
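The points above can be checked numerically with the standard closed-form moments. This is a minimal sketch with illustrative function names (dirichlet_mean_var, beta_mean_var), not code from the tool.

```python
def dirichlet_mean_var(alpha):
    """Per-component theoretical mean and variance of Dirichlet(alpha)."""
    a0 = sum(alpha)                                   # concentration
    means = [a / a0 for a in alpha]                   # relative sizes of alpha
    variances = [a * (a0 - a) / (a0 ** 2 * (a0 + 1)) for a in alpha]
    return means, variances


def beta_mean_var(a, b):
    """Theoretical mean and variance of Beta(a, b)."""
    total = a + b
    return a / total, a * b / (total ** 2 * (total + 1))


loose = dirichlet_mean_var([2.0, 3.0, 5.0])       # alpha0 = 10
tight = dirichlet_mean_var([20.0, 30.0, 50.0])    # same mean, alpha0 = 100
print(loose)
print(tight)

# K = 2: the first component matches Beta(alpha1, alpha2).
k2_means, k2_vars = dirichlet_mean_var([2.0, 3.0])
print((k2_means[0], k2_vars[0]), beta_mean_var(2.0, 3.0))
```

Scaling every α_i by the same factor leaves the mean vector unchanged and shrinks every variance, which is the sense in which α0 acts as a strength.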

Common use cases: Bayesian priors for categorical probabilities, topic proportions, mixture weights, and probability‑like test data. You don’t need to enter personal information to use it.

Presets

Pick a practical preset (it regenerates instantly; you can tweak after applying).

Tip: For large K, use profile JSON for sharing instead of long URLs.

Generator

Choose a parameterization, generate samples, then inspect means, marginals, and diagnostics.

Parameterization

Dimension (K)

Labels (comma-separated)

α (same for all components)

All components use α_i = α. Good starting point to see “corner vs center”.

Concentration (α0)

Enter a mean vector m whose entries sum to 1. The tool derives each α_i = m_i × α0.
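A sketch of that conversion, with the kind of input checks it implies. The function name alpha_from_mean and the tolerance value are assumptions for illustration; the tool's own validation rules are not documented here.

```python
def alpha_from_mean(mean, concentration, tol=1e-9):
    """Convert a mean vector and concentration alpha0 into alpha_i = m_i * alpha0."""
    if concentration <= 0:
        raise ValueError("concentration must be > 0")
    if any(m <= 0 for m in mean):
        # A zero or negative mean entry would give alpha_i <= 0, which is invalid.
        raise ValueError("every mean entry must be > 0")
    if abs(sum(mean) - 1.0) > tol:
        raise ValueError("mean vector must sum to 1")
    return [m * concentration for m in mean]


alpha = alpha_from_mean([0.2, 0.3, 0.5], 10.0)
print(alpha, sum(alpha))
```

The derived α values sum back to the chosen concentration, so the mean and the strength can be adjusted independently.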

Component	Mean (m_i)

All α values must be >0. Smaller values (<1) encourage sparse, corner-heavy vectors.

Component	α_i

Sample size (N)

Bins (histograms)

RNG

Show components (marginals)

Up to 5 components are used for marginal histograms. (For large K, the checkbox list is hidden; use the index input instead.)

Show correlation heatmap (theory; small K only)

Preview rounding (decimals)

Export rounding (optional)

JSON mode

Copy format (preview)

Per-component stats

Component	Theory mean	Sample mean	Theory var	Sample var

Samples preview (first 20)

Profile JSON (save/restore settings)

Share URLs contain settings only. For large K, use profile JSON to save/restore without long URLs.

Import profile JSON

Tip: Don’t include confidential labels (customer names, etc.) in shared profiles.

How to use this tool

Use this page to generate probability vectors that must stay non-negative and sum to 1.

Use in 3 steps

Start with a small dimension such as K=3 and a preset that is easy to interpret.
Generate the sample, then review theoretical means, marginals, and row previews together.
Change one α value or the total concentration at a time so you can isolate mean shifts from concentration shifts.

How to read the result

Each row is one probability vector. The means show the expected share of each component, while the concentration controls how tightly samples stay near that mean. Because all components must sum to 1, increases in one component reduce room for others.
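To see how sample statistics line up with theory, the sketch below draws rows with a normalized-gamma sampler and compares per-component means and variances. It is illustrative only: the sampler, the seed, and N=5000 are assumptions, not the tool's settings.

```python
import random
import statistics


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) row by normalizing independent gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


alpha = [2.0, 3.0, 5.0]
a0 = sum(alpha)
rng = random.Random(7)
rows = [sample_dirichlet(alpha, rng) for _ in range(5000)]

theory_mean = [a / a0 for a in alpha]
theory_var = [a * (a0 - a) / (a0 ** 2 * (a0 + 1)) for a in alpha]

columns = list(zip(*rows))                      # one tuple per component
sample_mean = [statistics.fmean(col) for col in columns]
sample_var = [statistics.variance(col) for col in columns]

for i in range(len(alpha)):
    print(i, theory_mean[i], sample_mean[i], theory_var[i], sample_var[i])
```

With a few thousand rows the sample columns should sit close to the theoretical ones; large gaps usually point to a small N or a parameter entry mistake.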

Boundary checks

If any α_i<1, expect more mass near simplex corners or edges.
Rounding on export can make a row's displayed values sum to slightly more or less than 1.
When K=2, compare with the Beta tool because it is the matching special case.
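The first check can be illustrated through a single component, using the fact that one Dirichlet component follows Beta(α_i, α0 − α_i). The sketch below is an assumption-laden example: the helper name share_near_zero, the 0.01 threshold, and the sample size are all chosen for illustration.

```python
import random


def share_near_zero(alpha_i, alpha_0, n, rng, threshold=0.01):
    """Fraction of draws of one component that fall below `threshold`."""
    # Marginal of a single Dirichlet component: x_i ~ Beta(alpha_i, alpha_0 - alpha_i).
    draws = (rng.betavariate(alpha_i, alpha_0 - alpha_i) for _ in range(n))
    return sum(x < threshold for x in draws) / n


rng = random.Random(0)
sparse = share_near_zero(0.1, 0.3, 2000, rng)   # symmetric alpha = 0.1, K = 3
dense = share_near_zero(5.0, 15.0, 2000, rng)   # symmetric alpha = 5, K = 3
print(sparse, dense)
```

With α_i = 0.1 a large share of draws lands almost exactly on zero, which is what corner-heavy rows look like from one component's point of view. With α_i = 5 essentially none do.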

Frequently asked questions

Why do components negatively correlate?