

Confusion Matrix Calculator


Enter TP, FP, TN, and FN to calculate accuracy, precision, recall, specificity, F1 score, and prevalence from a binary confusion matrix.

This first release is intentionally binary-only. Use it when you need a quick audit of classification metrics without moving into multiclass, threshold tuning, or ROC/PR curves.

How to use

  1. Enter the confusion matrix counts: TP, FP, TN, and FN.
  2. Optionally rename the positive and negative classes so the matrix reads like your own dataset.
  3. Read precision and recall beside accuracy, especially when one class is rare.
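The steps above can be sketched as a small function. This is a minimal illustration of the standard formulas the page computes, not the page's actual implementation; the function name and example counts are made up for the demonstration.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the six metrics this page reports from a binary confusion matrix."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "prevalence": (tp + fn) / total,  # share of actual positives
    }

# Illustrative counts only
print(confusion_metrics(tp=40, fp=10, tn=45, fn=5))
```

Each metric is a simple ratio of the four counts, which is what makes a binary matrix easy to audit by hand.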

Wave 3 statistics expansion

Binary classification metrics from one matrix

Use this page after a model or rule has already produced binary predictions. It is a readout page for classification metrics, not a threshold-tuning workflow.


Accuracy is not the whole story

Accuracy can stay high when the negative class dominates. In that situation, a classifier may look fine overall while still missing many positive cases. That is why this page keeps recall, specificity, precision, and prevalence beside accuracy.
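A hypothetical imbalanced dataset makes this concrete. The counts below are invented for illustration: 10 positives among 1,000 cases, with a classifier that catches only 2 of them.

```python
# Hypothetical imbalanced dataset: 10 positives among 1,000 cases.
tp, fn = 2, 8        # only 2 of the 10 actual positives are caught
tn, fp = 985, 5      # the dominant negative class is mostly handled correctly

accuracy = (tp + tn) / (tp + fp + tn + fn)   # 0.987 — looks excellent
recall = tp / (tp + fn)                      # 0.2 — 80% of positives missed
print(accuracy, recall)
```

Accuracy of 98.7% hides a recall of 20%, which is exactly why this page reports them side by side.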

Precision vs recall

Precision asks, “when the model says positive, how often is it right?” Recall asks, “of all actual positives, how many did it catch?” Improving one can hurt the other, so the right balance depends on the cost of false positives versus false negatives.
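The trade-off can be seen by comparing two hypothetical classifiers on the same data. Both the counts and the "strict"/"loose" framing below are invented for illustration.

```python
# Two hypothetical classifiers evaluated on 100 actual positives.
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    return tp / (tp + fp), tp / (tp + fn)

strict = precision_recall(tp=60, fp=5, fn=40)    # rarely flags, so few false positives
loose = precision_recall(tp=90, fp=60, fn=10)    # flags often, so few false negatives
print(strict)  # high precision (~0.92), lower recall (0.60)
print(loose)   # lower precision (0.60), high recall (0.90)
```

Neither classifier is simply "better": the strict one suits costly false positives, the loose one suits costly false negatives.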

Binary-first release

This first release stays with binary classification so each metric remains easy to audit from TP, FP, TN, and FN. Multiclass matrices, ROC curves, and threshold sweeps belong in a later expansion rather than being mixed into the first version.

Frequently asked questions

Why is accuracy alone not enough?

Accuracy can look high when one class is rare. In imbalanced datasets, precision, recall, specificity, and prevalence often tell you more about model behavior than accuracy alone.

What is the difference between precision and recall?

Precision asks how often predicted positives are correct. Recall asks how many actual positives the model catches. One focuses on false positives, the other on false negatives.

Does this page support multiclass confusion matrices?

No. This first release is intentionally limited to binary classification so the core metrics stay easy to audit and compare.

Does the share URL include my counts or labels?

No. The share URL stores only lightweight settings such as decimal places. Counts and custom labels stay in your browser.
