Bayes Classifier

The Bayes classifier simply assigns each observation to the most likely class, given its predictor values. In other words, it assigns a test observation with predictor vector x to the class j for which

$\displaystyle \Pr ({Y=j \mid X=x})$ is largest.

In a two-class problem, where there are only two possible response values, the Bayes classifier corresponds to a threshold of 0.5: predict one class if $\Pr ( {Y=j \mid X=x}) > 0.5$ and the other class otherwise. The set of points where the probability is exactly 50% forms a decision boundary called the Bayes decision boundary.
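To make the 0.5 threshold concrete, here is a minimal sketch for a hypothetical two-class problem in which we pretend to know the true conditional distributions (an assumption not made for real data): X | Y=1 ~ N(+1, 1), X | Y=0 ~ N(-1, 1), with equal priors. Bayes' theorem then gives the posterior $\Pr(Y=1 \mid X=x)$ exactly, and the classifier thresholds it at 0.5.

```python
import math

# Hypothetical setup (my assumption, not from the text):
# X | Y=1 ~ N(+1, 1), X | Y=0 ~ N(-1, 1), with equal priors Pr(Y=1) = Pr(Y=0) = 0.5.

def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior_class1(x):
    """Pr(Y=1 | X=x) via Bayes' theorem; the equal priors cancel."""
    p1 = normal_pdf(x, +1.0)
    p0 = normal_pdf(x, -1.0)
    return p1 / (p1 + p0)

def bayes_classify(x):
    """Predict class 1 when Pr(Y=1 | X=x) > 0.5, else class 0."""
    return 1 if posterior_class1(x) > 0.5 else 0

# By symmetry, the Bayes decision boundary here is x = 0:
# posterior_class1(0.0) is exactly 0.5, and the prediction flips as x crosses 0.
print(bayes_classify(0.3))   # 1
print(bayes_classify(-0.3))  # 0
```

With symmetric class densities and equal priors the boundary sits exactly halfway between the two means; unequal priors would shift it toward the less likely class.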

It is possible to prove that the Bayes classifier produces the lowest possible test error rate, called the Bayes error rate, which plays a role analogous to the irreducible error. In general, the overall Bayes error rate is given by

$\displaystyle 1-\mathbb{E} (\max_j \Pr (Y=j \mid X))$
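This expectation can be estimated by Monte Carlo in a setting where the true conditionals are known. The sketch below reuses a hypothetical symmetric Gaussian problem (my assumption, not from the text): X | Y=1 ~ N(+1, 1), X | Y=0 ~ N(-1, 1), equal priors, for which the Bayes error rate is known analytically to be $\Phi(-1) \approx 0.159$.

```python
import math
import random

# Hypothetical setup (assumption for illustration):
# X | Y=1 ~ N(+1, 1), X | Y=0 ~ N(-1, 1), equal priors.
# The Bayes error rate 1 - E[max_j Pr(Y=j | X)] equals Phi(-1) ~ 0.159 here.

def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def max_posterior(x):
    """max_j Pr(Y=j | X=x) for the two classes; equal priors cancel."""
    p1, p0 = normal_pdf(x, 1.0), normal_pdf(x, -1.0)
    return max(p1, p0) / (p1 + p0)

random.seed(0)
n = 200_000
# Draw X from the marginal mixture: pick a class, then sample X | Y.
xs = (random.gauss(1.0 if random.random() < 0.5 else -1.0, 1.0) for _ in range(n))
bayes_error = 1.0 - sum(max_posterior(x) for x in xs) / n
print(f"Monte Carlo Bayes error rate: {bayes_error:.3f}")  # close to 0.159
```

Even the optimal classifier misclassifies about 16% of observations in this setup, because the two class densities overlap; no method can do better on average.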

For real data, however, we do not know the conditional distribution of Y given X, so computing the Bayes classifier is impossible. The Bayes classifier therefore serves as an unattainable gold standard against which to compare other methods.

Reference:

An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani