On the number of components in a Gaussian mixture model (Paper)

2014 | Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery | Cited by 245
Bayesian Methods and Mixture Models | Advanced Clustering Algorithms Research | Data Management and Algorithms

Abstract

Mixture distributions, in particular normal mixtures, are applied to data with two main purposes in mind. One is to provide an appealing semiparametric framework in which to model unknown distributional shapes, as an alternative to, say, the kernel density method. The other is to use the mixture model to provide a probabilistic clustering of the data into g clusters corresponding to the g components in the mixture model. In both situations, there is the question of how many components to include in the normal mixture model. We review various methods that have been proposed to answer this question.

WIREs Data Mining Knowl Discov 2014, 4:341–355. doi: 10.1002/widm.1135

This article is categorized under: Technologies > Machine Learning
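
To make the question of how many components to include more concrete, the sketch below fits univariate normal mixtures with g = 1 to 4 components by EM and compares them using the Bayesian information criterion (BIC). The abstract does not name any specific method, so the choice of BIC here is an assumption made for illustration, as are the synthetic data, the quantile-based starting values, the variance floor, and the helper names `normal_pdf`, `fit_mixture_em`, and `bic`. It is a minimal sketch under those assumptions, not the procedure recommended by the article.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture_em(data, g, n_iter=200, var_floor=1e-3):
    """Fit a g-component univariate normal mixture by EM (hypothetical helper).

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic start (an assumption): means at evenly spaced sample quantiles.
    means = [ordered[(2 * i + 1) * n // (2 * g)] for i in range(g)]
    grand_mean = sum(data) / n
    pooled_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [pooled_var] * g
    weights = [1.0 / g] * g

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from the posteriors.
        for i in range(g):
            n_i = max(sum(r[i] for r in resp), 1e-12)
            weights[i] = n_i / n
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / n_i
            var_i = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, data)) / n_i
            # The floor keeps a component from collapsing onto a single point.
            variances[i] = max(var_i, var_floor)

    log_lik = sum(
        math.log(max(sum(w * normal_pdf(x, m, v)
                         for w, m, v in zip(weights, means, variances)), 1e-300))
        for x in data
    )
    return weights, means, variances, log_lik


def bic(log_lik, g, n):
    """BIC = -2 log L + d log n, with d = 3g - 1 free parameters in one dimension."""
    n_params = 3 * g - 1  # (g - 1) weights + g means + g variances
    return -2.0 * log_lik + n_params * math.log(n)


# Synthetic data (an assumption): two well-separated normal clusters.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(150)]
        + [random.gauss(6.0, 1.0) for _ in range(150)])

fits = {g: fit_mixture_em(data, g) for g in range(1, 5)}
bic_by_g = {g: bic(fits[g][3], g, len(data)) for g in fits}
best_g = min(bic_by_g, key=bic_by_g.get)  # smaller BIC is preferred

for g in sorted(bic_by_g):
    print(f"g = {g}: BIC = {bic_by_g[g]:.1f}")
print("g with the smallest BIC:", best_g)
```

The posterior probabilities computed in the E-step are also what give the probabilistic clustering mentioned in the abstract: each observation can be assigned to the component for which its posterior probability is largest. BIC is only one candidate criterion, and EM can converge to local maxima, so in practice several starting values are usually tried.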