How to use
- Paste one score,label pair per line. Scores should rank positives above negatives when the model is working well.
- Optionally set the positive label explicitly if your labels are not obvious from values like "1" or "case".
- Read AUC for ranking quality, then inspect the threshold table and ROC curve to choose a practical operating cutoff.
Wave 4 classification expansion
Binary score ranking to ROC and AUC
This first release is binary-only and score-only. Use it before precision-recall curves or calibration plots when the main question is score ordering and threshold trade-off.
Inputs
Paste score,label rows and run the calculation to inspect ROC trade-offs.
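As a sketch of what parsing those rows might look like (a hypothetical helper, not the calculator's actual code; the name `parse_rows` and its defaulting rule are assumptions):

```python
def parse_rows(text, positive=None):
    """Parse 'score,label' lines into (scores, 0/1 labels).

    Hypothetical helper. If `positive` is not given, the lexicographically
    larger label value is assumed positive (works for "1" vs "0"); set it
    explicitly for string labels like "case" vs "control".
    """
    scores, labels = [], []
    for line in text.strip().splitlines():
        score_str, label = line.split(",", 1)
        scores.append(float(score_str))
        labels.append(label.strip())
    if positive is None:
        positive = max(set(labels))
    y = [1 if lab == positive else 0 for lab in labels]
    return scores, y
```

Setting the positive label explicitly avoids surprises when the default ordering rule picks the wrong class.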
ROC curve
Run the calculator to inspect the ROC curve. Focus the chart and use ← / → to move across thresholds.
Threshold table
The table lists the first rows from the threshold sweep over distinct scores. Lowering the threshold usually increases sensitivity and reduces specificity.
| Threshold ≥ | Sensitivity | Specificity | FPR | TP | FP | TN | FN |
|---|---|---|---|---|---|---|---|
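The sweep behind a table like this can be sketched as follows (illustrative only; `threshold_table` is a hypothetical name, labels are assumed to be 1 for positive and 0 for negative, and the calculator's exact tie handling may differ):

```python
def threshold_table(scores, labels):
    """One row per candidate threshold: predict positive when score >= t.

    A sketch of a threshold sweep, not the calculator's implementation.
    Assumes both classes are present in `labels`.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    rows = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn, tn = pos - tp, neg - fp
        rows.append({
            "threshold": t,
            "sensitivity": tp / pos,
            "specificity": tn / neg,
            "fpr": fp / neg,
            "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        })
    return rows
```

Walking the rows from top to bottom shows the trade-off directly: each lower threshold can only keep or raise sensitivity and keep or lower specificity.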
How to read ROC AUC
ROC AUC asks how well the score ranks positives above negatives across all thresholds. A value near 0.5 means the ordering is close to random. A higher value means positives tend to receive higher scores than negatives.
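That ranking interpretation has a direct computation: AUC equals the fraction of positive/negative pairs in which the positive scores higher, with ties counted as half (the Mann-Whitney formulation). A minimal sketch, assuming 0/1 labels and at least one example of each class:

```python
def roc_auc(scores, labels):
    """AUC as the probability a random positive outranks a random negative.

    Pairwise O(P*N) sketch of the Mann-Whitney formulation; ties count 0.5.
    Assumes labels contain at least one 1 and one 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tie: neither ordering is preferred
    return wins / (len(pos) * len(neg))
```

Perfect separation gives 1.0, reversed ordering gives 0.0, and a score that cannot distinguish the classes sits at 0.5.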
Thresholds still matter
A high AUC does not choose the operating cutoff for you. The right threshold depends on what hurts more in your workflow: false positives, false negatives, or delay caused by review volume.
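As an illustration of how that cost asymmetry drives the cutoff, here is a minimal sketch with made-up per-threshold error counts and hypothetical cost weights (only the fp_cost/fn_cost ratio matters):

```python
# Hypothetical per-threshold false-positive/false-negative counts,
# as they might come out of a score sweep. Illustrative data only.
rows = [
    {"threshold": 0.9, "fp": 0, "fn": 3},
    {"threshold": 0.5, "fp": 2, "fn": 1},
    {"threshold": 0.2, "fp": 6, "fn": 0},
]

def best_threshold(rows, fp_cost, fn_cost):
    """Pick the row minimizing expected cost fp_cost*FP + fn_cost*FN."""
    return min(rows, key=lambda r: fp_cost * r["fp"] + fn_cost * r["fn"])
```

If missed positives are five times as costly as false alarms, the low cutoff wins; flip the weights and the high cutoff wins. The AUC is the same in both cases, which is why it cannot choose the threshold for you.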
Confusion matrix vs ROC AUC
Use a confusion matrix when the threshold is already fixed and you need one operational snapshot. Use ROC AUC when the threshold is still open and you need to compare the whole sweep before deciding.
Frequently asked questions
How is ROC AUC different from a confusion matrix?
A confusion matrix describes one chosen threshold. ROC AUC describes how sensitivity and specificity trade off across many thresholds, so it tells you how well scores rank positives above negatives before you commit to one operating cutoff.
What changes when I move the threshold?
Lower thresholds classify more cases as positive, which usually increases sensitivity and decreases specificity. Higher thresholds do the reverse. The threshold table lets you inspect that trade-off row by row.
Does a high AUC automatically tell me which threshold to use?
No. AUC summarizes ranking quality across thresholds, but the operational threshold still depends on the cost of false positives versus false negatives, prevalence, and workflow constraints.
Does this page support multiclass ROC?
No. This first release is intentionally limited to binary classification so the score ordering, ROC curve, and threshold table stay easy to audit.
Related