There are a few types of baselines:
- random baseline
- simple heuristic
- zero rule baseline [1]
- human baseline
- existing solutions
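As a concrete illustration of the zero rule baseline (always predict the most common class), here is a minimal sketch in plain Python. The function name `zero_rule_predict` and the toy labels are assumptions made for this example, not part of any particular library.

```python
from collections import Counter


def zero_rule_predict(train_labels, n_predictions):
    """Always predict the most frequent label seen in the training labels."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n_predictions


# Toy labels, invented for illustration only.
train_labels = ["spam", "ham", "ham", "ham", "spam"]
predictions = zero_rule_predict(train_labels, 3)
print(predictions)  # ['ham', 'ham', 'ham']
```

Any model worth deploying should at least beat this predictor on the metric you care about.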
Notes on Random Baseline
Random baselines can assume two kinds of prior:
- uniform prior (same probability for all classes)
- label prior (match the label probability)
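The two priors can be sketched as two small sampling functions. This is only an illustration using the standard library; the function names, the seed, and the 10/90 toy label split are assumptions made for the example.

```python
import random
from collections import Counter


def uniform_baseline(classes, n, rng):
    """Draw n predictions with the same probability for every class."""
    return rng.choices(classes, k=n)


def label_prior_baseline(train_labels, n, rng):
    """Draw n predictions with probabilities matching the observed label frequencies."""
    counts = Counter(train_labels)
    classes = list(counts)
    weights = [counts[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_labels = ["pos"] * 10 + ["neg"] * 90  # toy data: 10% positive

uniform_preds = uniform_baseline(["pos", "neg"], 10_000, rng)
prior_preds = label_prior_baseline(train_labels, 10_000, rng)

print(Counter(uniform_preds)["pos"] / 10_000)  # close to 0.5
print(Counter(prior_preds)["pos"] / 10_000)    # close to 0.1
```

The only difference between the two baselines is the sampling weights: equal for every class in the first case, proportional to label counts in the second.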
The F1 score is a simple measuring stick for comparing a model’s performance against the random baseline:
TL;DR
By knowing the label prior, you can already tell a lot about whether a classification model is performing better or worse than a random model that predicts with the label prior.
F1 as a simple measuring stick
| Scenario | Interpretation |
| --- | --- |
| F1 = label prior | Same performance as random model using label prior |
| F1 < label prior | Worse performance than random model using label prior |
| F1 > label prior | Better performance than random model using label prior |
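To see why the label prior works as a reference point, consider the binary case. A random model whose predictions are independent of the true labels has, in expectation, precision equal to the label prior p and recall equal to its own positive-prediction rate q, so F1 is roughly 2pq / (p + q). When the model predicts with the label prior (q = p), this reduces to p. The sketch below computes F1 from the expected confusion rates; the function name `expected_f1` is an assumption for this example, and the result approximates the average F1 only for reasonably large samples.

```python
def expected_f1(label_prior, predict_rate):
    """F1 of a random binary classifier, computed from expected confusion rates.

    label_prior:  fraction of examples that are truly positive (p).
    predict_rate: probability that the random model predicts positive (q).
    Assumes predictions are independent of the true labels and that
    label_prior and predict_rate are not both zero.
    """
    tp = label_prior * predict_rate          # expected true-positive rate
    fp = (1 - label_prior) * predict_rate    # expected false-positive rate
    fn = label_prior * (1 - predict_rate)    # expected false-negative rate
    return 2 * tp / (2 * tp + fp + fn)


for p in (0.05, 0.2, 0.5):
    # Second column: predicting with the label prior; third: predicting 50/50.
    print(p, expected_f1(p, p), expected_f1(p, 0.5))
```

With a label prior of 0.2, for example, a model scoring an F1 well below 0.2 on the positive class is doing worse than guessing with the label prior.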
Footnotes
[1] A special case of the simple heuristic: always pick the most common class.