Bayesian Classifier
Last updated
See the Terminology Review before reading further.
Assume we are to classify an object, based on the evidence provided by the feature vector x, as class w1 or class w2.
Let w1: class 1, w2: class 2.
If P(w1|x) > P(w2|x), then x belongs to w1; otherwise to w2.
Applying Bayes' rule, this becomes the minimum-error Bayesian classifier:
If p(x|w1)P(w1) > p(x|w2)P(w2), then x belongs to w1; otherwise to w2.
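A minimal sketch of this decision rule, assuming (hypothetically, not from the text) 1-D Gaussian class-conditional densities with made-up parameters:

```python
import math

# Illustrative values (assumptions): equal priors, unit-variance Gaussians.
P_w1, P_w2 = 0.5, 0.5
mu1, mu2 = 0.0, 2.0
sigma = 1.0

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def classify(x):
    """Minimum-error Bayes rule: pick the class with the larger p(x|w)P(w)."""
    score1 = gaussian_pdf(x, mu1, sigma) * P_w1
    score2 = gaussian_pdf(x, mu2, sigma) * P_w2
    return "w1" if score1 > score2 else "w2"
```

With equal priors and equal variances this reduces to picking the nearer class mean, so `classify(0.5)` gives `"w1"` and `classify(1.5)` gives `"w2"`.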
Since x can be either discrete or continuous, lowercase p is used for the density p(x|w1); for the discrete label w, uppercase P is used, as in P(w) and P(w|x).
For binary classification,
P(e)=P(e|w1)P(w1)+P(e|w2)P(w2)
If P(w1)=P(w2)=0.5, then P(e) = 0.5 (P(e|w1) + P(e|w2)).
The optimal decision rule minimizes P(e|x) at every value of x, so that the integral
P(e) = ∫ P(e|x) p(x) dx
is minimized.
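Since the Bayes rule errs at each x with probability min(P(w1|x), P(w2|x)), the integrand equals min(p(x|w1)P(w1), p(x|w2)P(w2)). A numeric sketch that approximates this integral for two hypothetical unit-variance Gaussians (parameters are illustrative assumptions):

```python
import math

def gaussian_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_error(p1=0.5, mu1=0.0, mu2=2.0, lo=-10.0, hi=12.0, n=20000):
    """Approximate P(e) = integral of min(P(w1)p(x|w1), P(w2)p(x|w2)) dx
    with a midpoint Riemann sum over [lo, hi]."""
    p2 = 1.0 - p1
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += min(p1 * gaussian_pdf(x, mu1), p2 * gaussian_pdf(x, mu2)) * dx
    return total
```

For these parameters the analytic minimum error is Phi(-1) ≈ 0.1587, and the numeric estimate agrees closely.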
For any given problem, the minimum probability of error is achieved by the likelihood ratio test (LRT) decision rule; in this sense it is the best possible classifier.
The penalty for misclassification can be weighted differently for each class.
For example, misclassifying a cancer patient as healthy is a much more serious error than the reverse.
Let C_ij be the cost of choosing class w_i when w_j is the true class.
e.g. C21 is the cost of wrongly classifying as w2 when the true class is w1.
Expected value of the cost:
R = E[C] = ∫_{R1} { C11 p(x|w1)P(w1) + C12 p(x|w2)P(w2) } dx + ∫_{R2} { C21 p(x|w1)P(w1) + C22 p(x|w2)P(w2) } dx,
where R_i is the region of feature space in which the classifier decides w_i.
After some rearrangement (and since p(x) does not affect the decision rule), this takes the form of a likelihood ratio test: decide w1 if p(x|w1)/p(x|w2) > ((C12 − C22) P(w2)) / ((C21 − C11) P(w1)), otherwise decide w2.
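A sketch of this minimum-risk likelihood ratio test, reusing the hypothetical Gaussian class conditionals from earlier; the cost values are made-up illustrations in which misclassifying a true w1 case (C21) is penalized 10x more than the opposite error:

```python
import math

def gaussian_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Illustrative (assumed) costs: C_ij = cost of choosing w_i when w_j is true.
C11, C12, C21, C22 = 0.0, 1.0, 10.0, 0.0
P_w1, P_w2 = 0.5, 0.5
mu1, mu2 = 0.0, 2.0

def classify_min_risk(x):
    """Minimum-risk LRT: decide w1 iff the likelihood ratio
    p(x|w1)/p(x|w2) exceeds ((C12 - C22) P(w2)) / ((C21 - C11) P(w1))."""
    ratio = gaussian_pdf(x, mu1) / gaussian_pdf(x, mu2)
    threshold = ((C12 - C22) * P_w2) / ((C21 - C11) * P_w1)
    return "w1" if ratio > threshold else "w2"
```

Here the threshold is 0.1 instead of 1, so the decision boundary shifts from x = 1.0 to about x = 2.15: points such as x = 2.0, which the equal-cost rule would assign to w2, are now assigned to w1 because missing a true w1 case is so much more costly.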