Parameter, Density Estimation
Lecture PPT: TAMU
For a Bayesian classifier, the task reduces to estimating the probability density function (pdf) of the likelihood p(x|w_i), which is a continuous distribution.
How can p(x|w_i) be estimated from samples of x?
Given the subset X_i of the samples in X that belong to class w_i, estimate p(x|w_i).
Density estimation models the pdf of the unknown probability distribution from which the dataset was drawn.
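As a concrete illustration of estimating p(x|w_i) from X_i, here is a minimal sketch that assumes each class-conditional density is a 1D Gaussian and that the class priors are equal. The function names `fit_class_conditionals` and `likelihood`, and the toy data, are made up for this example.

```python
import numpy as np

def fit_class_conditionals(X, y):
    """Fit one 1D Gaussian per class label; returns {label: (mean, std)}."""
    params = {}
    for label in np.unique(y):
        Xi = X[y == label]                      # samples of X that belong to class w_i
        params[label] = (Xi.mean(), Xi.std())   # maximum-likelihood estimates
    return params

def likelihood(x, mean, std):
    """Evaluate the fitted Gaussian pdf p(x | w_i) at x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))

# Toy data (assumption for illustration): two classes with different means.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(4.0, 1.0, 200)])
y = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])

params = fit_class_conditionals(X, y)
x_new = 0.5
scores = {label: likelihood(x_new, m, s) for label, (m, s) in params.items()}
predicted = max(scores, key=scores.get)         # equal priors assumed
```

With unequal priors, each score would be multiplied by P(w_i) before taking the maximum.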
The methods are classified as
Parametric
Gaussian Mixture Model
Multivariate normal distribution (MND)
Non-parametric
Kernel density estimation: histogram, Parzen window
k-nearest neighbor
Kernel density estimation has one parameter, b, the bandwidth.
Example: Gaussian kernel (1D) for k
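In the standard formulation (assumed here, with N training samples x_i and kernel function k), the kernel model and the 1D Gaussian kernel can be written as:

```latex
\hat{f}_b(x) = \frac{1}{N b} \sum_{i=1}^{N} k\!\left(\frac{x - x_i}{b}\right),
\qquad
k(z) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right)
```

A small b gives a spiky estimate that follows individual samples; a large b gives an over-smoothed one.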
Find the optimum b: we look for the value of b that minimizes the difference between the real shape of f(x) and the shape of our model $\hat{f}_b(x)$.
Let $\{x_i\}_{i=1}^{N}$ be a one-dimensional dataset (the multi-dimensional case is similar) whose examples were drawn from a distribution with an unknown pdf $f$, with $x_i \in \mathbb{R}$ for all $i = 1, \ldots, N$.
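For such a one-dimensional dataset, a kernel density estimate with a Gaussian kernel can be sketched as follows. This is an illustrative implementation under the standard definition of the kernel model, not a reference one; the names `gaussian_kernel` and `kde`, the synthetic samples, and the value b = 0.3 are assumptions.

```python
import numpy as np

def gaussian_kernel(z):
    """Standard normal density, used as the kernel k."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def kde(x, samples, b):
    """Evaluate f_b(x) = 1/(N b) * sum_i k((x - x_i) / b) at each point of x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - samples[None, :]) / b     # shape (len(x), N)
    return gaussian_kernel(z).sum(axis=1) / (len(samples) * b)

# Synthetic 1D dataset (assumption): 500 draws from a standard normal.
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, 500)

grid = np.linspace(-6.0, 6.0, 1201)
density = kde(grid, samples, b=0.3)
area = density.sum() * (grid[1] - grid[0])      # Riemann sum, should be close to 1
```

Every evaluation touches all N samples, which is why the method has to keep the whole training set in memory.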
How do we measure the goodness of the estimate?
A reasonable measure of this difference is the mean integrated squared error (MISE): we square the difference between the real pdf $f$ and our model of it, $\hat{f}_b$, integrate over x, and take the expectation.
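Written out in the standard form (notation assumed here; the expectation is taken over training sets of size N), the MISE is:

```latex
\mathrm{MISE}(b) = \mathbb{E}\left[ \int_{\mathbb{R}} \left( \hat{f}_b(x) - f(x) \right)^2 \, dx \right]
```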
Now, to find the optimal value b* of b, we minimize a cost that estimates the b-dependent part of the MISE from the data alone, since the true pdf f is unknown.
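One common choice for this cost, assumed here, is the leave-one-out cross-validation estimate $\int \hat{f}_b^2(x)\,dx - \frac{2}{N}\sum_{i=1}^{N} \hat{f}_b^{(i)}(x_i)$, where $\hat{f}_b^{(i)}$ is the kernel model built without the sample $x_i$. The sketch below evaluates it on a hypothetical grid of candidate bandwidths; `loo_cost`, the candidate values, and the synthetic data are illustrative choices.

```python
import numpy as np

def gaussian_kernel(z):
    """Standard normal density, used as the kernel k."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def loo_cost(samples, b):
    """Leave-one-out cost: integral of f_b^2 minus 2/N * sum_i f_b^(i)(x_i)."""
    n = len(samples)
    diff = samples[:, None] - samples[None, :]      # pairwise x_i - x_j

    # Term 1: integral of f_b(x)^2. For Gaussian kernels it has a closed form,
    # because two kernels of width b convolve into one of width b * sqrt(2).
    s = b * np.sqrt(2.0)
    integral = gaussian_kernel(diff / s).sum() / (n * n * s)

    # Term 2: f_b^(i)(x_i), the model built without x_i, evaluated at x_i.
    K = gaussian_kernel(diff / b)
    np.fill_diagonal(K, 0.0)                        # drop each point's own kernel
    f_loo = K.sum(axis=1) / ((n - 1) * b)

    return integral - 2.0 * f_loo.mean()

# Synthetic 1D dataset (assumption): 300 draws from a standard normal.
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, 300)

candidates = np.array([0.005, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8, 1.5, 3.0])
costs = np.array([loo_cost(samples, b) for b in candidates])
b_star = candidates[np.argmin(costs)]               # grid-search estimate of b*
```

A grid search is the simplest option; a finer grid or a 1D optimizer would refine b* further.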
Memory-based method: all training samples must be stored.
For each additional sample, the whole estimate must be recomputed.
The curse of dimensionality applies: use only for low-dimensional problems.
The Hundred-Page Machine Learning Book
머신러닝/패턴인식, 오일석