Multivariate Two-Sample Tests Based on Nearest Neighbors

1986 · Journal of the American Statistical Association · Citations: 252
Statistical Distribution Estimation and Applications; Advanced Statistical Methods and Models; Bayesian Methods and Mixture Models

Abstract

A new class of simple tests is proposed for the general multivariate two-sample problem based on the (possibly weighted) proportion of all k nearest neighbor comparisons in which observations and their neighbors belong to the same sample. Large values of the test statistics give evidence against the hypothesis H of equality of the two underlying distributions. Asymptotic null distributions are explicitly determined and shown to involve certain nearest neighbor interaction probabilities. Simple infinite-dimensional approximations are supplied. The unweighted version yields a distribution-free test that is consistent against all alternatives; optimally weighted statistics are also obtained and asymptotic efficiencies are calculated. Each of the tests considered is easily adapted to a permutation procedure that conditions on the pooled sample. Power performance for finite sample sizes is assessed in simulations.

Key Words: Distribution-free; Kth nearest neighbor; Infinite-dimensional approximation
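The unweighted statistic and the permutation procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's own code: the function names (`knn_indices`, `same_sample_proportion`, `knn_two_sample_test`) are hypothetical, and the sketch assumes Euclidean distance, arbitrary tie-breaking among equidistant neighbors, and a Monte Carlo approximation to the permutation distribution with the usual add-one p-value correction.

```python
import numpy as np

def knn_indices(pooled, k):
    """Indices of the k nearest neighbors of each point in the pooled sample."""
    # Pairwise squared Euclidean distances (assumed metric for this sketch).
    diffs = pooled[:, None, :] - pooled[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dist2, np.inf)  # a point is never its own neighbor
    return np.argsort(dist2, axis=1)[:, :k]

def same_sample_proportion(nbrs, labels):
    """Proportion of all k-NN comparisons where point and neighbor share a label."""
    return float(np.mean(labels[nbrs] == labels[:, None]))

def knn_two_sample_test(x, y, k=3, n_perm=999, seed=0):
    """Return (statistic, Monte Carlo permutation p-value); large values reject H."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x), dtype=int),
                             np.ones(len(y), dtype=int)])
    # Conditioning on the pooled sample: neighbors are fixed, only labels move.
    nbrs = knn_indices(pooled, k)
    observed = same_sample_proportion(nbrs, labels)
    perm_stats = np.array([
        same_sample_proportion(nbrs, rng.permutation(labels))
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(perm_stats >= observed)) / (n_perm + 1)
    return observed, float(p_value)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(30, 2))
    y = rng.normal(3.0, 1.0, size=(30, 2))  # shifted alternative
    stat, p = knn_two_sample_test(x, y, k=3)
    print(f"statistic={stat:.3f}, p-value={p:.3f}")
```

Because the permutation procedure conditions on the pooled sample, the neighbor structure is computed once and only the sample labels are shuffled. The weighted variants mentioned in the abstract are not covered by this sketch.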