Kernel Density Estimation, Histogram, Parzen Window
k-Nearest neighbor
Kernel Density Estimation Method
The kernel model of the pdf is

\[
\hat{f}_b(x) = \frac{1}{Nb} \sum_{i=1}^{N} k\!\left(\frac{x - x_i}{b}\right),
\]

and it has one parameter, b, called the bandwidth.
Example: the Gaussian kernel (1D) for k,

\[
k(z) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right).
\]
Finding the optimum b: we look for the value of b that minimizes the difference between the real shape of f(x) and the shape of our model \hat{f}_b(x).
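As a concrete illustration, here is a minimal NumPy sketch of the estimator above with a Gaussian kernel; the function names, the synthetic data, and the bandwidth value are illustrative choices, not part of the original.

import numpy as np

def gaussian_kernel(z):
    # Standard 1-D Gaussian kernel k(z) = exp(-z^2/2) / sqrt(2*pi).
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def kde(x, samples, b):
    # Kernel density estimate f_hat_b(x); samples are x_1..x_N, b is the bandwidth.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - samples[None, :]) / b   # one row per query point
    return gaussian_kernel(z).sum(axis=1) / (len(samples) * b)

# Hypothetical usage on synthetic data:
samples = np.random.default_rng(0).normal(size=200)
print(kde([0.0, 1.0, 2.0], samples, b=0.3))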
Example
Let \{x_i\}_{i=1}^{N} be a one-dimensional dataset (the multi-dimensional case is similar) whose examples were drawn from a distribution with an unknown pdf f, with x_i \in \mathbb{R} for all i = 1, \ldots, N.
How do we measure the goodness of the estimate?
A reasonable choice of measure of this difference is the mean integrated squared error (MISE):

\[
\mathrm{MISE}(b) = \mathbb{E}\left[ \int_{\mathbb{R}} \left( \hat{f}_b(x) - f(x) \right)^2 dx \right]
\]

As the name suggests, we square the difference between the real pdf f and our model of it, \hat{f}_b, integrate over x, and take the expectation over random samples of size N.
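To see why the cost that follows is a sensible surrogate (a standard expansion, not spelled out in the original), expand the square:

\[
\mathrm{MISE}(b) = \mathbb{E}\left[\int \hat{f}_b(x)^2\,dx\right] - 2\,\mathbb{E}\left[\int \hat{f}_b(x) f(x)\,dx\right] + \int f(x)^2\,dx .
\]

The last term does not depend on b, and the middle term, which involves the unknown f, can be estimated from the data by leave-one-out cross-validation; dropping the constant term yields the cost below.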
Now, to find the optimal value b^* for b, we minimize the cost defined as

\[
\int_{\mathbb{R}} \hat{f}_b(x)^2\,dx \;-\; \frac{2}{N}\sum_{i=1}^{N} \hat{f}_b^{(i)}(x_i),
\]

where \hat{f}_b^{(i)} is the kernel model computed with example x_i excluded:

\[
\hat{f}_b^{(i)}(x) = \frac{1}{(N-1)\,b}\sum_{k \neq i} k\!\left(\frac{x - x_k}{b}\right).
\]
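A minimal sketch of this selection procedure, assuming a Gaussian kernel, a grid-based numerical approximation of the integral, and a simple grid search over candidate bandwidths (all of these choices, and the values used, are illustrative rather than prescribed by the original):

import numpy as np

def gaussian_kernel(z):
    # Repeated here so the snippet is self-contained.
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def loo_estimates(samples, b):
    # Leave-one-out estimates f_hat_b^{(i)}(x_i) for every training point.
    n = len(samples)
    k = gaussian_kernel((samples[:, None] - samples[None, :]) / b)
    np.fill_diagonal(k, 0.0)            # exclude the x_i term itself
    return k.sum(axis=1) / ((n - 1) * b)

def cv_cost(samples, b, grid):
    # Cost: integral of f_hat_b(x)^2 minus 2/N times the sum of LOO estimates.
    n = len(samples)
    f_hat = gaussian_kernel((grid[:, None] - samples[None, :]) / b).sum(axis=1) / (n * b)
    integral = np.sum(f_hat ** 2) * (grid[1] - grid[0])   # Riemann-sum approximation
    return integral - 2.0 * loo_estimates(samples, b).mean()

# Hypothetical usage: pick b* by grid search on synthetic data.
rng = np.random.default_rng(0)
samples = rng.normal(size=300)
grid = np.linspace(samples.min() - 3.0, samples.max() + 3.0, 2000)
candidates = np.linspace(0.05, 1.0, 40)
b_star = min(candidates, key=lambda b: cv_cost(samples, b, grid))
print("b* =", b_star)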
Limitations
Memory-based method: all training samples must be stored.
When a new sample is added, the whole estimation process has to be re-run.
The curse of dimensionality applies: use it only for low-dimensional problems.