Linear Discriminant Analysis
LDA is a classification method that maximizes the separation between classes.

The LDA concept covered in StatQuest is Fisher's linear discriminant, which maximizes the ratio of the between-class variance to the within-class variance of the projected data:

$$J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$$

where $m_1, m_2$ are the projected class means and $s_1^2, s_2^2$ are the projected within-class variances.
Here, LDA is explained under the following assumptions:

- Normally distributed data
- Equal class covariances
LDA is a supervised method of dimension reduction. It extracts a basis $w$ for projecting the data that

- maximizes separability between classes
- minimizes scatter within the same class

Equivalently, find the basis $w$ that maximizes the objective

$$J(w) = \frac{w^T S_B w}{w^T S_W w}$$

where $S_B$ is the between-class scatter matrix and $S_W$ is the within-class scatter matrix.
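For two classes, the maximizing direction has a closed form, $w \propto S_W^{-1}(\mu_1 - \mu_2)$. A minimal NumPy sketch (function and variable names are illustrative, not from the notes):

```python
import numpy as np

def fisher_direction(X1, X2):
    """X1, X2: (n_i, p) arrays of samples for classes 1 and 2.
    Returns the unit direction w that maximizes J(w) for K = 2."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices
    S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
    # Closed-form maximizer of J(w): S_W^{-1} (mu1 - mu2)
    w = np.linalg.solve(S_W, mu1 - mu2)
    return w / np.linalg.norm(w)
```

Projecting the data onto this $w$ gives the one-dimensional representation with the best class separation under the Fisher criterion.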
For a fixed $x$, choose the class $k$ that gives the maximum posterior probability $P(Y = k \mid X = x)$.

The classification can be expressed in terms of the posterior probability using Bayes' rule:

$$P(Y = k \mid X = x) = \frac{f_k(x)\,\pi_k}{\sum_{l=1}^{K} f_l(x)\,\pi_l}$$

If the likelihood $f_k(x)$ is assumed to be a multivariate Gaussian distribution, then

$$f_k(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x - \mu_k)^T \Sigma^{-1} (x - \mu_k)\right)$$

Here, the covariance matrix $\Sigma$ is assumed to be the same for all classes. If the class covariances are not equal, use Quadratic Discriminant Analysis (QDA) instead.
What is covariance? Read here
How do we find the class $k$ that gives the maximum posterior probability? Take the log of the posterior. Since the denominator does not depend on $k$, maximizing $P(Y = k \mid X = x)$ is equivalent to maximizing the linear discriminant function

$$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \log \pi_k$$

Then, estimate the linear discriminant function from the data ($i$: $i$th data point):

$$\hat{\pi}_k = \frac{N_k}{N}, \qquad \hat{\mu}_k = \frac{1}{N_k}\sum_{i:\, y_i = k} x_i, \qquad \hat{\Sigma} = \frac{1}{N - K}\sum_{k=1}^{K}\sum_{i:\, y_i = k} (x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^T$$
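The rule "score each class with $\delta_k(x)$, pick the largest" can be sketched as follows (function names and the 1-based class numbering are assumptions, matching the labels used in the example below):

```python
import numpy as np

def discriminant_score(x, mu_k, icov, prior_k):
    """delta_k(x) = x^T Sigma^{-1} mu_k - 0.5 mu_k^T Sigma^{-1} mu_k + log pi_k."""
    return x @ icov @ mu_k - 0.5 * mu_k @ icov @ mu_k + np.log(prior_k)

def classify(x, mus, icov, priors):
    """Return the 1-based class index with the largest discriminant score."""
    scores = [discriminant_score(x, mu, icov, p) for mu, p in zip(mus, priors)]
    return int(np.argmax(scores)) + 1
```

Note that $\delta_k$ is linear in $x$, which is why the resulting decision boundaries are hyperplanes.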
Assumptions

- Normally distributed data
- Equal class covariances
- Dimension (number of features): $p = 2$
- Number of classes: $K = 2$
- Total number of data points: $N = 6$
X = 2×6
    1  2  2  3  3  4
    3  3  4  1  2  2

Y = 1×6
    1  1  1  2  2  2
Estimate the linear discriminant functions from the data (pooled covariance with denominator $N - K$):

mu1 = 2×1
    1.6667
    3.3333

mu2 = 2×1
    3.3333
    1.6667

cov = 2×2
    0.3333    0.1667
    0.1667    0.3333

icov = 2×2
     4    -2
    -2     4
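These estimates can be reproduced with NumPy. A sketch under the assumption that the pooled covariance uses the unbiased $N - K$ denominator (array names follow the notes):

```python
import numpy as np

X = np.array([[1, 2, 2, 3, 3, 4],
              [3, 3, 4, 1, 2, 2]], dtype=float)  # p x N = 2 x 6
Y = np.array([1, 1, 1, 2, 2, 2])

mu1 = X[:, Y == 1].mean(axis=1)   # class-1 mean
mu2 = X[:, Y == 2].mean(axis=1)   # class-2 mean

# Pooled within-class covariance, denominator N - K
D1 = X[:, Y == 1] - mu1[:, None]
D2 = X[:, Y == 2] - mu2[:, None]
N, K = Y.size, 2
cov = (D1 @ D1.T + D2 @ D2.T) / (N - K)
icov = np.linalg.inv(cov)
```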
A second example, with three classes:

- Dimension (number of features): $p = 2$
- Number of classes: $K = 3$
- Total number of data points: $N = 6$
To set the decision boundary between two classes $k$ and $l$, find the set of points $x$ where the discriminant scores are equal, i.e. $\delta_k(x) = \delta_l(x)$. With equal class covariances this boundary is linear in $x$.
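Setting $\delta_k(x) = \delta_l(x)$ and collecting terms gives a hyperplane $a^T x + b = 0$. A sketch using the two-class estimates computed above (equal priors assumed; the helper name is illustrative):

```python
import numpy as np

def boundary_coeffs(mu_k, mu_l, icov, prior_k=0.5, prior_l=0.5):
    """Return (a, b) such that the boundary delta_k(x) = delta_l(x) is a @ x + b = 0."""
    a = icov @ (mu_k - mu_l)
    b = -0.5 * (mu_k + mu_l) @ icov @ (mu_k - mu_l) + np.log(prior_k / prior_l)
    return a, b

mu1 = np.array([5 / 3, 10 / 3])
mu2 = np.array([10 / 3, 5 / 3])
icov = np.array([[4., -2.], [-2., 4.]])
a, b = boundary_coeffs(mu1, mu2, icov)
# Points with a @ x + b > 0 are assigned to class 1
```

For this symmetric example the boundary reduces to the line $x_2 = x_1$, which is easy to confirm by plotting the six points.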