Forward Regression for Ultra-High Dimensional Variable Screening

2009, Journal of the American Statistical Association, cited by 371
Statistical Methods and Inference; Bayesian Methods and Mixture Models; Gene expression and cancer classification

Abstract

Motivated by the seminal theory of Sure Independence Screening (Fan and Lv 2008, SIS), we investigate here another popular and classical variable screening method, namely, forward regression (FR). Our theoretical analysis reveals that FR can identify all relevant predictors consistently, even if the predictor dimension is substantially larger than the sample size. In particular, if the dimension of the true model is finite, FR can discover all relevant predictors within a finite number of steps. To practically select the “best” candidate from the models generated by FR, the recently proposed BIC criterion of Chen and Chen (2008) can be used. The resulting model can then serve as an excellent starting point, from where many existing variable selection methods (e.g., SCAD and Adaptive LASSO) can be applied directly. FR’s outstanding finite sample performances are confirmed by extensive numerical studies.
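
The two-stage procedure the abstract describes (build a nested path of models by forward regression, then pick one model from that path with a BIC-type criterion) can be illustrated with a minimal sketch. This is not the authors' code. The function names `forward_regression` and `select_by_ebic` are hypothetical, the predictors and response are assumed to be centered (no intercept is fitted), and the criterion log(RSS/n) + k(log n + 2 log d)/n is an assumed extended-BIC form, since the abstract does not state the formula.

```python
import numpy as np


def forward_regression(X, y, max_steps):
    """Greedy forward regression (illustrative sketch).

    At each step, add the predictor whose inclusion gives the smallest
    residual sum of squares (RSS) for the enlarged least-squares fit.
    Returns the selection order and the RSS after each step.
    """
    n, d = X.shape
    selected, rss_path = [], []
    for _ in range(min(max_steps, d)):
        best_j, best_rss = None, np.inf
        for j in range(d):
            if j in selected:
                continue
            cols = selected + [j]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ beta
            rss = float(resid @ resid)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        rss_path.append(best_rss)
    return selected, rss_path


def select_by_ebic(selected, rss_path, n, d):
    """Pick a model from the nested path by an assumed extended-BIC score.

    Score for the model with the first k predictors (assumption, see lead-in):
        log(RSS_k / n) + k * (log n + 2 * log d) / n
    """
    scores = [
        np.log(rss / n) + k * (np.log(n) + 2.0 * np.log(d)) / n
        for k, rss in enumerate(rss_path, start=1)
    ]
    k_best = int(np.argmin(scores)) + 1
    return selected[:k_best]


if __name__ == "__main__":
    # Synthetic example with more predictors than observations (d > n).
    rng = np.random.default_rng(0)
    n, d = 100, 200
    X = rng.standard_normal((n, d))
    y = 3 * X[:, 0] - 2 * X[:, 1] + 2 * X[:, 2] + 0.5 * rng.standard_normal(n)

    path, rss = forward_regression(X, y, max_steps=6)
    print("selection order:", path)
    print("model chosen by the assumed criterion:", select_by_ebic(path, rss, n, d))
```

The sketch refits least squares from scratch for every candidate, which keeps the logic easy to follow but costs on the order of d fits per step; a practical implementation would update the fit incrementally. The selected model would then be the starting point for a method such as SCAD or Adaptive LASSO, as the abstract suggests.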