# Linear Discriminant Analysis

## Concept

{% embed url="https://youtu.be/azXCzI57Yfc" %}
StatQuest YouTube
{% endembed %}

## Introduction

LDA is a classification method that maximizes the separation between classes.

The LDA concept covered in StatQuest is **Fisher's linear discriminant**:

* Maximize the ratio of the variance between the classes to the variance within the classes:

$$
S = \frac{\sigma_{\text{between}}^{2}}{\sigma_{\text{within}}^{2}} = \frac{(\vec{w}\cdot\vec{\mu}_{1}-\vec{w}\cdot\vec{\mu}_{0})^{2}}{\vec{w}^{T}\Sigma_{1}\vec{w}+\vec{w}^{T}\Sigma_{0}\vec{w}} = \frac{\left(\vec{w}\cdot(\vec{\mu}_{1}-\vec{\mu}_{0})\right)^{2}}{\vec{w}^{T}(\Sigma_{0}+\Sigma_{1})\vec{w}}
$$
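The ratio above is maximized (up to scale) by $$w \propto (\Sigma_0+\Sigma_1)^{-1}(\mu_1-\mu_0)$$. A minimal NumPy sketch of this, using made-up toy data (the variable names are my own, not from StatQuest):

```python
import numpy as np

# Two toy classes (rows are samples); illustrative values only
X0 = np.array([[1.0, 3.0], [2.0, 3.0], [2.0, 4.0]])
X1 = np.array([[3.0, 1.0], [3.0, 2.0], [4.0, 2.0]])

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
S0 = np.cov(X0, rowvar=False)   # within-class covariance, class 0
S1 = np.cov(X1, rowvar=False)   # within-class covariance, class 1

# Maximizer of (w.(mu1-mu0))^2 / (w^T (S0+S1) w), up to scale:
w = np.linalg.solve(S0 + S1, mu1 - mu0)
w /= np.linalg.norm(w)          # normalize for readability

# Fisher ratio achieved by this direction
ratio = (w @ (mu1 - mu0)) ** 2 / (w @ (S0 + S1) @ w)
print(w, ratio)
```

Any other direction gives a smaller ratio, which is easy to spot-check against the coordinate axes.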

Here, LDA is explained under the following assumptions:

1. Normally distributed classes
2. Equal class covariances

### Dimension Reduction Problem

LDA is a supervised method of dimension reduction.

It extracts a basis $$w$$ for data projection that

* maximizes separability between classes
* while minimizing scatter within the same class

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fpf1NqwW04ADRnxb9SzjT%2Fimage.png?alt=media\&token=4cef3e42-9f52-4e14-a17e-0a081bd6b922)

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2FlLM8vPYLj2MK3X8aXrqQ%2Fimage.png?alt=media\&token=c8ae935e-716c-4af6-bfce-88b80b2f58f6)

#### Mathematics

Find the basis $$w$$ that minimizes the cost function

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fi4l3ju4w5OxrcVQITZFt%2Fimage.png?alt=media\&token=891b66d2-3d94-46a5-b05c-9c73ede62f57)

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2FuALMEtpau7UolXlphKdD%2Fimage.png?alt=media\&token=e735a5ce-360c-4a7b-b6d8-a336620e18b5)
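In the multi-class form this optimization is commonly solved as an eigenvalue problem: the projection directions are the leading eigenvectors of $$S_W^{-1} S_B$$ (within-class and between-class scatter). A hedged NumPy sketch on toy data, with variable names of my own choosing:

```python
import numpy as np

# Toy 2-class data in 2-D (illustrative values only)
X = np.array([[1, 3], [2, 3], [2, 4], [3, 1], [3, 2], [4, 2]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

mu = X.mean(axis=0)
Sw = np.zeros((2, 2))   # within-class scatter
Sb = np.zeros((2, 2))   # between-class scatter
for k in np.unique(y):
    Xk = X[y == k]
    mk = Xk.mean(axis=0)
    Sw += (Xk - mk).T @ (Xk - mk)
    d = (mk - mu).reshape(-1, 1)
    Sb += len(Xk) * (d @ d.T)

# Leading eigenvector of Sw^{-1} Sb gives the best 1-D projection
vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
w = np.real(vecs[:, np.argmax(np.real(vals))])
z = X @ w               # projected (1-D) data
```

The projection `z` separates the class means while keeping each class compact, which is exactly the ratio the figures describe.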

### Classification Problem

#### Posterior Probability Function

For a fixed $$x$$, choose the class $$k$$ that gives the maximum posterior probability

$$
P(Y=k | X=x)
$$

The classification can be expressed in terms of the posterior probability function using Bayes' rule as

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-7c912696e0e892c92df6753974147d20e2cd30ed%2Fimage.png?alt=media)
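Concretely, Bayes' rule turns class priors $$\pi\_k$$ and likelihoods $$f\_k(x)$$ into posteriors $$p\_k(x)$$. A small NumPy sketch with hypothetical parameter values (shared covariance, as LDA assumes):

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Multivariate normal density N(x; mu, cov)."""
    d = len(mu)
    diff = x - mu
    expo = -0.5 * diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

# Hypothetical parameters for two classes
priors = np.array([0.5, 0.5])                      # pi_k
mus = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
cov = np.eye(2)                                    # shared covariance

x = np.array([0.5, 0.5])
likelihoods = np.array([gaussian_pdf(x, m, cov) for m in mus])    # f_k(x)
posteriors = priors * likelihoods / np.sum(priors * likelihoods)  # p_k(x)
print(posteriors, posteriors.argmax())
```

With equal priors, the class whose mean is closest (in Mahalanobis distance) wins, so this `x` is assigned to class 0.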

### As a Multivariate Gaussian Distribution

If the likelihood function $$f\_k(x)$$ of each class is assumed to be a multivariate Gaussian distribution, then the posterior $$p\_k(x)$$ can be written as

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-70417e6a4f02ba8905747ee70860aacc4be04dbf%2Fimage.png?alt=media)

Here, the covariance matrices of all classes are assumed to be equal.

> If the covariances are not equal, use Quadratic Discriminant Analysis (QDA) instead.

What is covariance? Read here

#### Derivation

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-6995609aa782fa96079ece9efa7c8078fe9ff8ee%2Fimage.png?alt=media)

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-f6d0a6e2cac04d77f1e6d90e0fb85f06e93596e8%2Fimage.png?alt=media)

### Linear Discriminant Functions

How do we find the class *k* that gives the maximum posterior probability $$p\_k(x)$$?

Take the log of $$p\_k(x)$$

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-62730e1381fc21138ff3c66a13568d363320e342%2Fimage.png?alt=media)

Maximizing $$\log(p\_k(x))$$ is equivalent to maximizing $$\delta\_k(x)$$

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-9799f35dab4398c052c94fd625cff6fa29175f70%2Fimage.png?alt=media)
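A small sketch of evaluating the linear discriminant score $$\delta\_k(x) = x^{T}\Sigma^{-1}\mu\_k - \tfrac{1}{2}\mu\_k^{T}\Sigma^{-1}\mu\_k + \log \pi\_k$$ in Python; the parameter values below are hypothetical:

```python
import numpy as np

def delta(x, mu_k, icov, pi_k):
    """delta_k(x) = x^T Sigma^{-1} mu_k - 0.5 mu_k^T Sigma^{-1} mu_k + log(pi_k)."""
    return x @ icov @ mu_k - 0.5 * mu_k @ icov @ mu_k + np.log(pi_k)

# Hypothetical parameters: two classes, shared covariance
icov = np.linalg.inv(np.array([[1.0, 0.3], [0.3, 1.0]]))
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
pis = [0.5, 0.5]

x = np.array([2.5, 2.8])
scores = [delta(x, m, icov, p) for m, p in zip(mus, pis)]
print(np.argmax(scores))   # predicted class index
```

Because the quadratic term $$x^{T}\Sigma^{-1}x$$ is shared by every class, it cancels and the score is linear in $$x$$.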

### Estimating the Linear Discriminant Function

#### From the training dataset, estimate

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-2b4314667f0f1d188808aea8022baa8ffba21960%2Fimage.png?alt=media)

> i: i-th data point, j: j-th dimension.

Then, estimate the linear discriminant function from

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-c0c1cd14c2263e131bd5e252865a7a49637e1927%2Fimage.png?alt=media)
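The estimates ($$\hat{\pi}\_k = N\_k/N$$, $$\hat{\mu}\_k$$ the class mean, and the pooled covariance with denominator $$N-K$$) can be sketched in Python; the function and variable names are my own, and the data reuses the Example-1 dataset below:

```python
import numpy as np

def fit_lda(X, y):
    """Estimate LDA parameters: priors, class means, pooled covariance."""
    classes = np.unique(y)
    N, K = len(X), len(classes)
    priors = {k: np.mean(y == k) for k in classes}        # pi_hat_k = N_k / N
    means = {k: X[y == k].mean(axis=0) for k in classes}  # mu_hat_k
    # Pooled covariance: sum of within-class scatters over (N - K)
    S = sum((X[y == k] - means[k]).T @ (X[y == k] - means[k]) for k in classes)
    return priors, means, S / (N - K)

# Rows are samples; same six points as the MATLAB example below
X = np.array([[1, 3], [2, 3], [2, 4], [3, 1], [3, 2], [4, 2]], dtype=float)
y = np.array([1, 1, 1, 2, 2, 2])
priors, means, cov = fit_lda(X, y)
```

Plugging these estimates into $$\delta\_k(x)$$ gives the fitted classifier.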

### Classification with LDA

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-e7c279da18a5e52cea85f34d589ba756df5cd370%2Fimage.png?alt=media)

## Example

### In MATLAB

## LDA example

Assumptions

* Normally distributed classes
* Equal class covariances

### Example 1: 2-class classification

* Dimension (feature number): p = 2
* Number of classes: K = 2
* Total data points: N = 6

```matlab
N1=3; N2=3; N=N1+N2; K=2;
% Dataset
x1=[1;3]; x2=[2;3]; x3=[2;4]; x4=[3;1]; x5=[3;2]; x6=[4;2];
% Label  class 1
y1=1; y2=1; y3=1;
% Label class 2
y4=2; y5=2; y6=2;

X=[x1 x2 x3 x4 x5 x6]
Y=[y1 y2 y3 y4 y5 y6]
```

X = 2×6

```
1     2     2     3     3     4
3     3     4     1     2     2
```

Y = 1×6

```
1     1     1     2     2     2
```

**Estimate Linear Discriminant functions**

```matlab
% Prior
pi1=N1/N
pi2=N2/N

% mu
mu1=sum(X(:,1:3),2)/ N1
mu2=sum(X(:,4:6),2)/ N2

%covariance
sum_temp1=0;
for i=1:N1
  	sum_temp1=sum_temp1+(X(:,i)-mu1)*(X(:,i)-mu1)';
end
sum_temp2=sum_temp1;
for i=N1+1:N
	sum_temp2=sum_temp2+(X(:,i)-mu2)*(X(:,i)-mu2)';
end

%cov1=cov2=cov
cov=1/(N-K)*sum_temp2
icov=inv(cov)

% Delta
LD1=@(x) x'*(icov)*(mu1)-0.5*(mu1)'*(icov)*(mu1)+log(pi1)
LD2=@(x) x'*(icov)*(mu2)-0.5*(mu2)'*(icov)*(mu2)+log(pi2)
```

***

mu1 = 2×1

```
1.6667
3.3333
```

mu2 = 2×1

```
3.3333
1.6667
```

cov = 2×2

```
0.3333    0.1667
0.1667    0.3333
```

icov = 2×2

```
4.0000   -2.0000
-2.0000    4.0000
```


![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-a85b3fc364b88a406bd64e456a0856c25a9d5a2c%2Fimage.png?alt=media)
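The MATLAB computation above can be cross-checked in Python (a NumPy transcription, not part of the original); the estimates and predictions match the printed MATLAB output:

```python
import numpy as np

# Columns are samples, matching the MATLAB layout
X = np.array([[1, 2, 2, 3, 3, 4],
              [3, 3, 4, 1, 2, 2]], dtype=float)
N1 = N2 = 3; N = 6; K = 2

pi1, pi2 = N1 / N, N2 / N
mu1 = X[:, :3].mean(axis=1)     # -> [1.6667, 3.3333]
mu2 = X[:, 3:].mean(axis=1)     # -> [3.3333, 1.6667]

# Pooled covariance over (N - K), as in the MATLAB code
S = sum(np.outer(X[:, i] - mu1, X[:, i] - mu1) for i in range(3)) \
  + sum(np.outer(X[:, i] - mu2, X[:, i] - mu2) for i in range(3, 6))
cov = S / (N - K)               # -> [[0.3333, 0.1667], [0.1667, 0.3333]]
icov = np.linalg.inv(cov)       # -> [[4, -2], [-2, 4]]

LD1 = lambda x: x @ icov @ mu1 - 0.5 * mu1 @ icov @ mu1 + np.log(pi1)
LD2 = lambda x: x @ icov @ mu2 - 0.5 * mu2 @ icov @ mu2 + np.log(pi2)

# Each training point is classified back into its own class
pred = [1 if LD1(X[:, i]) > LD2(X[:, i]) else 2 for i in range(N)]
print(pred)                     # -> [1, 1, 1, 2, 2, 2]
```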


### Example 2: 3-class classification

* Dimension (feature number): p = 2
* Number of classes: K = 3
* Total data points: N = 6

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-3c1554b184f9e071487962d402dcd715df97fe35%2Fimage.png?alt=media)

To find the decision boundary between two classes, set their discriminant functions equal:

![](https://3698175758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MAwtzMy_pbrChIExFtN%2Fuploads%2Fgit-blob-193f7d8a5e66d1e9be5312e0731b44b29aeed2e4%2Fimage.png?alt=media)
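Because each $$\delta\_k$$ is linear in $$x$$, every pairwise boundary $$\delta\_k(x) = \delta\_l(x)$$ is a straight line $$a \cdot x + b = 0$$. A sketch using the Example-1 estimates, so the coefficients can be checked by hand:

```python
import numpy as np

# Parameters estimated in Example 1 above
icov = np.array([[4.0, -2.0], [-2.0, 4.0]])
mu1, mu2 = np.array([5 / 3, 10 / 3]), np.array([10 / 3, 5 / 3])
pi1 = pi2 = 0.5

# delta_1(x) - delta_2(x) = a . x + b ; the boundary is a . x + b = 0
a = icov @ (mu1 - mu2)
b = -0.5 * (mu1 @ icov @ mu1 - mu2 @ icov @ mu2) + np.log(pi1 / pi2)
print(a, b)   # a = [-10, 10], b = 0  ->  boundary is the line x2 = x1
```

For K = 3 the same computation is repeated for each pair of classes, and the three half-plane boundaries tile the plane into the regions shown above.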
