On sampling without replacement with unequal probabilities of selection (paper)

Biometrika, 1967. Cited by 228.

Abstract

A sample of n different units is to be drawn from a population or stratum in such a way that unit i has probability np_i, assumed less than 1, of appearing in the sample. A mathematical solution of this problem is given by a formula from which the required probability of selection of any possible sample can be calculated: this formula is an extension of one, due to Durbin, for n = 2. The required np_i can be achieved in practice in three ways: (a) by evaluating the required probabilities for all possible samples, and selecting one; (b) selecting units without replacement, with probabilities of selection that must be recalculated after each drawing; and (c) by selecting up to n units with replacement, the first drawing being made with probabilities p_i, and all subsequent ones with probabilities proportional to p_i/(1 − np_i), and rejecting completely any sample that does not contain n different units. Method (c) seems likely to be the most convenient in practice. The probability of the simultaneous appearance in the sample of any pair of units is relatively easily calculated, so that unbiased variance estimates can be obtained without undue labour.
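Method (c) can be illustrated with a minimal Python sketch. This is not the paper's own code. The function name sampford_sample, the max_tries safeguard and the rng parameter are assumptions made for illustration, and the p_i are assumed to sum to 1. The abstract describes stopping as soon as a repeat occurs; the sketch instead draws all n units and then checks for repeats, which accepts and rejects exactly the same samples.

```python
import random


def sampford_sample(p, n, rng=None, max_tries=100_000):
    """Rejective sketch of method (c): return n distinct unit indices.

    p         -- selection probabilities p_i (assumed to sum to 1)
    n         -- sample size, with every n * p_i required to be below 1
    rng       -- optional random.Random instance (hypothetical convenience)
    max_tries -- hypothetical cap on the number of rejected attempts
    """
    rng = rng or random.Random()
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must sum to 1")
    if any(n * pi >= 1.0 for pi in p):
        raise ValueError("every n * p_i must be less than 1")

    units = list(range(len(p)))
    # Weights for the second and later draws: proportional to p_i / (1 - n p_i).
    later_weights = [pi / (1.0 - n * pi) for pi in p]

    for _ in range(max_tries):
        # First draw uses p_i; the remaining n - 1 draws use the later weights.
        # All draws are made with replacement.
        sample = rng.choices(units, weights=p, k=1)
        sample += rng.choices(units, weights=later_weights, k=n - 1)
        # Reject the whole attempt unless it contains n different units.
        if len(set(sample)) == n:
            return sorted(sample)
    raise RuntimeError("no sample with n distinct units within max_tries")
```

For example, sampford_sample([0.1, 0.2, 0.3, 0.4], 2) returns a pair of distinct indices, and over many repetitions unit i should appear in roughly a fraction 2 p_i of the samples. The rejection rate grows as any n p_i approaches 1, which is why the sketch caps the number of attempts.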