Variational Bayesian Model Selection for Mixture Distributions (paper)

2001 · Cited by 356
Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Statistical Methods and Bayesian Inference

Abstract

Mixture models, in which a probability distribution is represented as a linear superposition of component distributions, are widely used in statistical modeling and pattern recognition. One of the key tasks in the application of mixture models is the determination of a suitable number of components. Conventional approaches based on cross-validation are computationally expensive, are wasteful of data, and give noisy estimates for the optimal number of components. A fully Bayesian treatment, based on Markov chain Monte Carlo methods for instance, will return a posterior distribution over the number of components. However, in practical applications it is generally convenient, or even computationally essential, to select a single, most appropriate model. Recently it has been shown, in the context of linear latent variable models, that hierarchical priors governed by continuous hyperparameters, whose values are set by type-II maximum likelihood, can be used to optimize model complexity. In this paper we extend this framework to mixture distributions by considering the classical task of density estimation using mixtures of Gaussians. We show that, by setting the mixing coefficients to maximize the marginal log-likelihood, unwanted components can be suppressed, and the appropriate number of components for the mixture can be determined in a single training run without recourse to cross-validation. Our approach uses a variational treatment based on a factorized approximation to the posterior distribution.
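
To make the phrase "linear superposition of component distributions" concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It evaluates a one-dimensional Gaussian mixture density and shows that a component whose mixing coefficient is zero contributes nothing, which is the sense in which an unwanted component is "suppressed". The function names `gaussian_pdf` and `mixture_pdf` and all numeric values are hypothetical, chosen only for illustration; the variational optimization of the mixing coefficients is not implemented here.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Mixture density p(x) = sum_k weights[k] * N(x | means[k], variances[k]).

    The mixing coefficients must be non-negative and sum to one.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("mixing coefficients must be non-negative and sum to 1")
    # Stack the per-component densities: shape (K, N) for N evaluation points.
    components = np.stack(
        [gaussian_pdf(x, m, v) for m, v in zip(means, variances)]
    )
    # Linear superposition: weighted sum over the component axis.
    return weights @ components


# Hypothetical example: three components, the third with zero mixing coefficient.
xs = np.linspace(-10.0, 10.0, 2001)
full = mixture_pdf(xs, [0.5, 0.5, 0.0], [-2.0, 2.0, 7.0], [1.0, 1.0, 0.5])
pruned = mixture_pdf(xs, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
```

Here `full` and `pruned` describe the same density: driving a mixing coefficient to zero is equivalent to removing that component from the model, so the effective number of components can shrink without retraining separate models of different sizes.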