Variable Selection in Finite Mixture of Regression Models (Paper)

2007; Journal of the American Statistical Association; Cited by 236
Bayesian Methods and Mixture Models; Statistical Distribution Estimation and Applications; Statistical Methods and Bayesian Inference

Abstract

In the applications of finite mixture of regression models, a large number of covariates are often used and their contributions toward the response variable vary from one component to another of the mixture model. This creates a complex variable selection problem. Existing methods, such as AIC and BIC, are computationally expensive as the number of covariates and the components in the mixture model increase. In this paper, we introduce a penalized likelihood approach for variable selection in finite mixture of regression models. The new method introduces a penalty which depends on the sizes of regression coefficients and the mixture structure. The new method is shown to have the desired sparsity property. A data adaptive method for selecting tuning parameters, and an EM-algorithm for efficient numerical computations are developed. Simulations show that the method has very good performance with much lower demand on computing power. The new method is also illustrated by analyzing a real data set in marketing applications.
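
As a rough illustration of the kind of objective the abstract describes, a penalized log-likelihood for a K-component mixture of regressions could be written as below. This is a sketch under stated assumptions, not a formula taken from this page: the Gaussian component density \phi, the symbols \pi_k (mixing proportions), \beta_k (component regression coefficients), \sigma_k^2 (component variances), \lambda_k (tuning parameters), and the L1 form of the penalty are all chosen here for illustration, and the paper's actual penalty may differ.

```latex
% Assumed illustrative form: mixture log-likelihood minus a component-weighted L1 penalty
\tilde{\ell}_n(\theta)
  = \sum_{i=1}^{n} \log \Big\{ \sum_{k=1}^{K} \pi_k \,
      \phi\big(y_i;\, x_i^{\top}\beta_k,\, \sigma_k^{2}\big) \Big\}
  \;-\; \sum_{k=1}^{K} \pi_k \sum_{j=1}^{p} \lambda_k \, |\beta_{kj}|
```

In this assumed form, the first term is the ordinary mixture log-likelihood and the second term depends on both the sizes of the coefficients (through |\beta_{kj}|) and the mixture structure (through the weights \pi_k). A penalty that is non-smooth at zero, such as the absolute value used here, can set individual coefficients exactly to zero, which is what allows different covariates to be selected in different components.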