Correcting sample selection bias by unlabeled data (paper)

2007, ANU Open Research (Australian National University), cited by 418
Machine Learning and Data Classification; Machine Learning and Algorithms; Gaussian Processes and Bayesian Inference

Abstract

We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
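
The abstract gives no formulas, so the following is only a minimal sketch of one way "matching distributions in feature space" could be realized: choose non-negative weights on the training points so that their weighted mean under a kernel feature map is close to the mean of the unlabeled test points. The Gaussian kernel, the bandwidth `sigma`, the weight cap `upper`, the projected-gradient solver, and all function names are assumptions made for illustration. They are not taken from the paper, which may use a different kernel, additional constraints, or a different optimizer.

```python
import numpy as np


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mean_matching_objective(beta, K, kappa, n, m):
    """Squared feature-space distance between the weighted training mean and
    the test mean, dropping the constant term that does not depend on beta."""
    return beta @ K @ beta / n**2 - 2.0 * beta @ kappa / (n * m)


def mean_matching_weights(x_train, x_test, sigma=1.0, upper=10.0, n_iter=500):
    """Hypothetical helper: weights in [0, upper] for each training point,
    found by projected gradient descent on the objective above."""
    n, m = len(x_train), len(x_test)
    K = rbf_kernel(x_train, x_train, sigma)
    # kappa[i] = sum over test points of k(x_train[i], x_test[j]).
    kappa = rbf_kernel(x_train, x_test, sigma).sum(axis=1)
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    lipschitz = 2.0 * np.linalg.eigvalsh(K)[-1] / n**2
    beta = np.ones(n)  # start from uniform weights
    for _ in range(n_iter):
        grad = 2.0 * K @ beta / n**2 - 2.0 * kappa / (n * m)
        # Gradient step followed by projection onto the box [0, upper]^n.
        beta = np.clip(beta - grad / lipschitz, 0.0, upper)
    return beta


# Toy usage with synthetic data (assumed, not from the paper): training
# inputs centred at 0, unlabeled test inputs centred at 1.
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=(200, 1))
x_test = rng.normal(1.0, 1.0, size=(200, 1))
beta = mean_matching_weights(x_train, x_test)
print("unweighted training mean:", x_train.mean())
print("reweighted training mean:", np.average(x_train[:, 0], weights=beta))
print("test mean:", x_test.mean())
```

Weights obtained this way could then be passed as per-sample weights to a learner trained on the labeled training set. Note that no density of either distribution is estimated at any point; only kernel evaluations between samples are used, which is the property the abstract emphasizes.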