How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis
Citations
6,199 citations
4,283 citations
4,123 citations
Cites background or methods from "How Many Clusters? Which Clustering..."
...The single-link clustering method seems to be not related to a statistical model and does not perform well in instances where clusters are not well separated (e.g., Fraley and Raftery 1998)....
[...]
...Finally, in a range of applications of model-based clustering, model choice based on BIC has given good results (Campbell, Fraley, Murtagh, and Raftery 1997; Campbell, Fraley, Stanford, Murtagh, and Raftery 1999; DasGupta and Raftery 1998; Fraley and Raftery 1998; Stanford and Raftery 2000)....
[...]
...Sections 2–5 include a review of material covered in earlier work (Fraley and Raftery 1998) and elsewhere....
[...]
...A non-Gaussian component can often be approximated by several Gaussian ones (e.g., Dasgupta and Raftery 1998; Fraley and Raftery 1998)....
[...]
3,860 citations
3,770 citations
Cites methods from "How Many Clusters? Which Clustering..."
...As advocated in previous studies [5,22], we use Bayesian Information Criterion (BIC) to assess the best supported model, and therefore the number and nature of clusters....
[...]
References
38,681 citations
36,760 citations
"How Many Clusters? Which Clustering..." refers to methods in this paper
...When EM is used to find the maximum mixture likelihood, a more reliable approximation to twice the log Bayes factor called the BIC (Schwarz [32]) is applicable: $2 \log p(x \mid M) + \text{constant} \approx 2\, l_M(x, \hat{\theta}) - m_M \log(n) \equiv \mathrm{BIC}$, where $p(x \mid M)$ is the (integrated) likelihood of the data for the model $M$, $l_M(x, \hat{\theta})$ is the maximized mixture loglikelihood for the model, and $m_M$ is the number of independent parameters to be estimated in the model....
[...]
[...]
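The BIC approximation quoted above can be illustrated with a minimal sketch. It assumes univariate Gaussian mixtures and uses hand-picked, approximately maximum-likelihood parameters in place of an actual EM fit. The function names (`mixture_loglik`, `bic`) and the toy data are hypothetical and do not come from the cited paper. Under the sign convention of the quoted formula, a larger BIC indicates stronger support for a model.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_loglik(data, weights, means, variances):
    """Log-likelihood of the data under a univariate Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(w * gaussian_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        total += math.log(density)
    return total

def bic(loglik, n_params, n_obs):
    """BIC = 2 * (maximized log-likelihood) - (number of parameters) * log(n)."""
    return 2.0 * loglik - n_params * math.log(n_obs)

# Toy data (assumed for illustration): two tight groups near -2 and +2.
data = [-2.1, -1.9, -2.0, 2.0, 1.9, 2.1]
n = len(data)

# One component: the sample mean and variance are the exact MLEs (2 parameters).
mean1 = sum(data) / n
var1 = sum((x - mean1) ** 2 for x in data) / n
bic_one = bic(mixture_loglik(data, [1.0], [mean1], [var1]), 2, n)

# Two components: parameters chosen by hand near the maximum, not fitted by EM
# (1 mixing proportion + 2 means + 2 variances = 5 parameters).
v = 0.02 / 3.0
bic_two = bic(mixture_loglik(data, [0.5, 0.5], [-2.0, 2.0], [v, v]), 5, n)

print("BIC, one component:", bic_one)
print("BIC, two components:", bic_two)
```

For this toy data the two-component model receives the larger BIC, which matches the visible grouping. In practice the log-likelihood would come from an EM fit rather than from hand-set parameters.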
34,729 citations
"How Many Clusters? Which Clustering..." refers to background in this paper
...This latter quantity falls in the range [0, 1], and values near zero imply ill-conditioning [34]....
[...]
24,320 citations
"How Many Clusters? Which Clustering..." refers to methods in this paper
...The most common relocation method— k-means (MacQueen [14], Hartigan and Wong [15])—reduces the within-group sums of squares....
[...]
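The claim that k-means reduces the within-group sums of squares can be checked on a small example. The sketch below uses a plain batch iteration (assign each point to its nearest center, then recompute the centers as group means). This is a simplification, not MacQueen's or Hartigan and Wong's algorithm, and the data, starting centers, and function names are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def within_group_ss(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return sum(sq_dist(p, centers[k]) for p, k in zip(points, labels))

def assign(points, centers):
    """Assign each point to the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]

def update(points, labels, centers):
    """Move each center to the mean of its members (empty groups stay put)."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(p[d] for p in members) / len(members)
                                 for d in range(len(old))))
    return new_centers

# Toy data (assumed): two visible groups, with a deliberately poor start.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
centers = [(0.0, 0.0), (1.0, 0.0)]

history = []
for _ in range(5):
    labels = assign(points, centers)
    history.append(within_group_ss(points, labels, centers))
    centers = update(points, labels, centers)

print(history)  # within-group sum of squares at each iteration
print(labels)   # final group assignment
```

Each iteration leaves the within-group sum of squares unchanged or lower, because neither reassigning a point to a nearer center nor moving a center to its group mean can increase it.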