Linear Discriminant Analysis


Concept

Introduction

LDA is a classification method that maximizes the separation between classes.

The LDA concept presented in the StatQuest YouTube video is Fisher's linear discriminant:

  • Maximize the ratio of the between-class variance to the within-class variance:

$$S=\frac{\sigma_{\text{between}}^{2}}{\sigma_{\text{within}}^{2}}=\frac{(\vec{w}\cdot\vec{\mu}_{1}-\vec{w}\cdot\vec{\mu}_{0})^{2}}{\vec{w}^{T}\Sigma_{1}\vec{w}+\vec{w}^{T}\Sigma_{0}\vec{w}}=\frac{\left(\vec{w}\cdot(\vec{\mu}_{1}-\vec{\mu}_{0})\right)^{2}}{\vec{w}^{T}(\Sigma_{0}+\Sigma_{1})\vec{w}}$$

Here, LDA is explained under the following assumptions:

  1. Normally distributed classes

  2. Equal class covariances

Dimension Reduction Problem

LDA is a supervised method of dimension reduction.

Extract a basis w for data projection that

  • maximizes separability between classes,

  • while minimizing scatter within the same class.

Mathematics

Find the basis w that maximizes the cost function above (the ratio of between-class variance to within-class variance); for two classes this has a closed-form solution, sketched below.
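For the two-class case, the maximizing direction is the well-known closed-form solution $w \propto S_W^{-1}(\mu_1 - \mu_0)$, where $S_W$ is the within-class scatter matrix. A minimal MATLAB sketch; the variable names X1 and X0 are illustrative assumptions, not from this page:

% Fisher's discriminant direction for two classes (illustrative sketch)
% X1, X0: p-by-N1 and p-by-N0 data matrices, one sample per column
mu1 = mean(X1, 2);  mu0 = mean(X0, 2);
S1  = (X1 - mu1)*(X1 - mu1)';     % scatter of class 1 about its mean
S0  = (X0 - mu0)*(X0 - mu0)';     % scatter of class 0 about its mean
Sw  = S1 + S0;                    % within-class scatter S_W
w   = Sw \ (mu1 - mu0);           % direction maximizing the ratio S
w   = w / norm(w);                % normalize; only the direction matters
z   = w' * [X1 X0];               % 1-D projection of all samples

Projecting onto this single direction reduces p-dimensional data to one dimension while keeping the classes as separated as possible.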

Classification Problem

Posterior Probability Function

For fixed x, choose the class k that gives the maximum posterior probability

$$P(Y=k \mid X=x)$$

Using Bayes' rule, the classification can be expressed in terms of the posterior probability function $p_k(x)$, the class likelihood $f_k(x)$, and the prior $\pi_k$ as

$$p_k(x) = P(Y=k \mid X=x) = \frac{f_k(x)\,\pi_k}{\sum_{l=1}^{K} f_l(x)\,\pi_l}$$

As a Multivariate Gaussian Distribution

If the likelihood function $f_k(x)$ is assumed to be a multivariate Gaussian distribution, then

$$f_k(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_k)^T \Sigma_k^{-1} (x-\mu_k)\right)$$

Here, the covariance matrices of all classes are assumed to be equal: $\Sigma_k = \Sigma$ for every k.

If the covariances are not equal, use Quadratic Discriminant Analysis instead.

What is covariance? Read here

Derivation

How do we find the class k that gives the maximum posterior probability $p_k(x)$? Take the log of $p_k(x)$. Since the denominator and the factors common to all classes do not depend on k, maximizing $\log(p_k(x))$ is equivalent to maximizing the discriminant function $\delta_k(x)$.

Linear Discriminant Functions

$$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \log(\pi_k)$$

Estimating the Linear Discriminant Functions

From the training dataset, estimate

$$\hat{\pi}_k = \frac{N_k}{N}, \qquad \hat{\mu}_k = \frac{1}{N_k}\sum_{i:\,y_i=k} x_i, \qquad \hat{\Sigma} = \frac{1}{N-K}\sum_{k=1}^{K}\sum_{i:\,y_i=k}(x_i-\hat{\mu}_k)(x_i-\hat{\mu}_k)^T$$

where i indexes the data points (i-th sample) and j indexes the feature dimensions.

Then, estimate the linear discriminant function from

$$\hat{\delta}_k(x) = x^T \hat{\Sigma}^{-1} \hat{\mu}_k - \frac{1}{2}\hat{\mu}_k^T \hat{\Sigma}^{-1} \hat{\mu}_k + \log(\hat{\pi}_k)$$
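These estimates map directly onto code. Below is a minimal sketch of a generic estimator for K classes; the function name lda_fit and its interface are illustrative assumptions, not from this page:

function [pis, mus, Sigma] = lda_fit(X, Y, K)
% X: p-by-N data (one sample per column), Y: 1-by-N labels in 1..K
[p, N] = size(X);
pis = zeros(1, K);  mus = zeros(p, K);  S = zeros(p, p);
for k = 1:K
    Xk = X(:, Y == k);            % samples belonging to class k
    Nk = size(Xk, 2);
    pis(k) = Nk / N;              % prior estimate  pi_k = Nk/N
    mus(:, k) = mean(Xk, 2);      % mean estimate   mu_k
    D = Xk - mus(:, k);           % deviations from the class mean
    S = S + D * D';               % accumulate within-class scatter
end
Sigma = S / (N - K);              % pooled covariance, divided by N-K
end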

Classification with LDA

Assign x to the class k with the largest discriminant value:

$$\hat{y} = \arg\max_{k}\, \hat{\delta}_k(x)$$

Example

In MATLAB
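Before working through the manual example, note that MATLAB's Statistics and Machine Learning Toolbox provides fitcdiscr, which fits the same linear discriminant model directly. A minimal usage sketch, assuming the dataset X, Y constructed in Example 1 below (the query point x0 is a hypothetical choice):

Mdl   = fitcdiscr(X', Y');    % expects one observation per row; linear by default
x0    = [2; 2];               % hypothetical query point
label = predict(Mdl, x0')     % predicted class label for x0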

LDA example

Assumption

  • Normally distributed classes

  • Equal class covariances

Example 1: 2-class classification

  • Dimension (number of features): p = 2

  • Number of classes: K = 2

  • Total number of data points: N = 6

N1=3; N2=3; N=N1+N2; K=2;
% Dataset (one sample per column)
x1=[1;3]; x2=[2;3]; x3=[2;4]; x4=[3;1]; x5=[3;2]; x6=[4;2];
% Labels, class 1
y1=1; y2=1; y3=1;
% Labels, class 2
y4=2; y5=2; y6=2;

X=[x1 x2 x3 x4 x5 x6]
Y=[y1 y2 y3 y4 y5 y6]

X = 2×6

     1     2     2     3     3     4
     3     3     4     1     2     2

Y = 1×6

     1     1     1     2     2     2

Estimate Linear Discriminant functions

% Priors
pi1=N1/N
pi2=N2/N

% Class means
mu1=sum(X(:,1:3),2)/N1
mu2=sum(X(:,4:6),2)/N2

% Pooled covariance (both classes share one covariance matrix)
sum_temp1=0;
for i=1:N1
    sum_temp1=sum_temp1+(X(:,i)-mu1)*(X(:,i)-mu1)';
end
sum_temp2=sum_temp1;          % carry over the class-1 scatter
for i=N1+1:N
    sum_temp2=sum_temp2+(X(:,i)-mu2)*(X(:,i)-mu2)';
end

% cov1 = cov2 = cov (pooled estimate, divided by N-K)
cov=1/(N-K)*sum_temp2
icov=inv(cov)

% Linear discriminant functions delta_k(x)
LD1=@(x) x'*(icov)*(mu1)-0.5*(mu1)'*(icov)*(mu1)+log(pi1)
LD2=@(x) x'*(icov)*(mu2)-0.5*(mu2)'*(icov)*(mu2)+log(pi2)

mu1 = 2×1

    1.6667
    3.3333

mu2 = 2×1

    3.3333
    1.6667

cov = 2×2

    0.3333    0.1667
    0.1667    0.3333

icov = 2×2

    4.0000   -2.0000
   -2.0000    4.0000

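With LD1 and LD2 available, a new point is assigned to the class with the larger discriminant value. A short usage sketch; the test point x0 = [2.5; 1] is a hypothetical choice:

x0 = [2.5; 1];                % hypothetical test point
if LD1(x0) > LD2(x0)          % compare discriminant values
    class = 1
else
    class = 2
end

Here LD1(x0) = -7.36 and LD2(x0) = 7.64 (to two decimals), so x0 is assigned to class 2. For this dataset the discriminants reduce to LD1(x) = 10*x(2) - 17.36 and LD2(x) = 10*x(1) - 17.36, so the decision boundary is simply the line x(2) = x(1).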

Example 2: 3-class classification

  • Dimension (number of features): p = 2

  • Number of classes: K = 3

  • Total number of data points: N = 6

To set the decision boundary between two classes k and l, find the points x where the discriminant functions are equal, $\delta_k(x) = \delta_l(x)$; with K = 3 this produces three pairwise boundaries. A sketch of the computation follows.
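The 3-class case can be computed by reusing the lda_fit sketch from above. Since the page states only N = 6 for this example, the relabeling of the six points into three classes below is my own assumption for illustration:

% Hypothetical 3-class relabeling of the six points (assumed, not from the page)
X = [1 2 2 3 3 4;
     3 3 4 1 2 2];
Y = [1 1 2 2 3 3];
K = 3;

[pis, mus, Sigma] = lda_fit(X, Y, K);    % estimator sketched earlier
iS = inv(Sigma);

% Evaluate all K discriminant functions and pick the argmax
x0    = [2; 2];                          % hypothetical query point
delta = zeros(1, K);
for k = 1:K
    delta(k) = x0'*iS*mus(:,k) - 0.5*mus(:,k)'*iS*mus(:,k) + log(pis(k));
end
[~, class] = max(delta)                  % predicted class for x0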