Clustering
The value of k, the number of clusters, is a hyperparameter that the data analyst has to tune. Several techniques exist for selecting k, but none of them is proven optimal. Most require the analyst to make an “educated guess” by looking at some metrics or by examining cluster assignments visually.
Methods
Prediction strength (see Section 9.2.3 of [1])
Gap statistic method
Elbow method
Average silhouette method
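Two of the methods listed above, the elbow method and the average silhouette method, can be illustrated with a minimal sketch. It assumes k-means as the clustering algorithm and small synthetic 2-D data. All names (`init_centroids`, `kmeans`, `wcss`, `average_silhouette`, `make_blobs`) are hypothetical helpers written for this example, and the deterministic farthest-point initialization is a choice made here for reproducibility, not something the methods prescribe.

```python
import math
import random


def init_centroids(points, k):
    """Farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares: the metric plotted in the elbow method."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def average_silhouette(points, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def make_blobs(centers, n_per_center=30, spread=0.5, seed=1):
    """Synthetic data: Gaussian blobs around the given centers."""
    rng = random.Random(seed)
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(n_per_center)
    ]


points = make_blobs([(0, 0), (10, 0), (5, 9)])
results = {}
for k in range(2, 7):
    labels, centroids = kmeans(points, k)
    results[k] = (wcss(points, labels, centroids), average_silhouette(points, labels))
    print(f"k={k}  WCSS={results[k][0]:.1f}  avg silhouette={results[k][1]:.3f}")

# Average silhouette method: choose the k with the highest mean silhouette.
best_k = max(results, key=lambda k: results[k][1])
print("k chosen by average silhouette:", best_k)
```

For the elbow method, the analyst would plot the WCSS values against k and look for the point where the curve stops dropping sharply. With three well-separated blobs as generated here, that bend and the silhouette peak should both fall at k = 3. On real data the picture is often less clear, which is where the “educated guess” comes in.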
[1] The Hundred-Page Machine Learning Book http://themlbook.com/wiki/doku.php
[2] Machine Learning, Huang, VTech