Estimating Population Mean with Missing Data in Unequal Probability Sampling
Nonresponse is a serious obstacle to the validity of survey estimates, because missing values in the data bias the estimates. The problem is how to deal with missing values once they have been deemed impossible to recover. One way of exploring a possible lack of representativity due to missing data is to estimate the response probabilities, which is usually done with a logistic regression model. The drawback of the logit model, however, is that it requires the values of the model's explanatory variables to be known for all nonrespondents. Bethlehem (2012) showed that the response probabilities can be estimated by a weighting adjustment technique without individual data on the nonrespondents. Here we consider the case in which it is uncertain whether nonresponse is related to any of the covariates. Moreover, instead of simple random sampling, we consider a general unequal probability sampling scheme for selecting respondents. This paper presents a modification of the Bethlehem (2012) proposal for unequal probability sampling that yields unbiased estimators of the population total or average of a variable of interest, together with a variance estimator, and compares them with the usual estimators through numerical simulations. (original abstract)
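To make the idea of weighting adjustment without nonrespondent data concrete, the following is a minimal Python sketch. It is not the estimator proposed in the paper. It assumes a generic post-stratification setup in which the population count of each weighting class is known, and in which each respondent's inclusion probability and class label are available. The function name `poststratified_mean` and all argument names are hypothetical. Within each class, the ratio of the design-weighted respondent count to the known class size plays the role of an estimated response probability, and the Horvitz-Thompson-type class total is divided by it.

```python
from collections import defaultdict


def poststratified_mean(y, pi, cls, class_sizes):
    """Estimate a population mean from respondents only (illustrative sketch).

    y, pi, cls  : per-respondent values, inclusion probabilities, class labels.
    class_sizes : assumed-known population count N_h for each class h.
    Returns (estimated mean, implied response probability per class).
    """
    weighted_count = defaultdict(float)  # sum of 1/pi_i over respondents in class
    weighted_sum = defaultdict(float)    # sum of y_i/pi_i over respondents in class
    for yi, p, h in zip(y, pi, cls):
        weighted_count[h] += 1.0 / p
        weighted_sum[h] += yi / p

    total = 0.0
    implied_response = {}
    for h, n_h in class_sizes.items():
        if weighted_count[h] == 0.0:
            raise ValueError(f"no respondents in class {h!r}")
        # Design-weighted respondent count relative to the known class size.
        implied_response[h] = weighted_count[h] / n_h
        # Inflate the class total by the inverse of the implied response probability.
        total += weighted_sum[h] / implied_response[h]

    return total / sum(class_sizes.values()), implied_response


mean, response = poststratified_mean(
    y=[10, 20, 30],
    pi=[0.5, 0.5, 0.25],
    cls=["a", "a", "b"],
    class_sizes={"a": 8, "b": 12},
)
```

Only the class sizes are needed for the units that did not respond, which is the sense in which no individual nonrespondent data enter the calculation. The sketch gives a point estimate only and does not attempt variance estimation.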
- BETHLEHEM, J. G., (2012). Using response probabilities for assessing representativity. Statistics Netherlands, Discussion Paper.
- BETHLEHEM, J. G., KELLER, W. A., (1987). Linear weighting of sample survey data. Journal of Official Statistics. 3, 141-153.
- BREWER, K. R. W., (1963). A model of systematic sampling with unequal probabilities. Australian Journal of Statistics. 5, 5-13.
- CHANG, T., KOTT, P. S., (2008). Using calibration weighting to adjust for nonresponse under a plausible model. Biometrika. 95, 557-571.
- CHAUDHURI, A., (2010). Essentials of Survey Sampling. PHI Learning Private Limited. New Delhi.
- CHAUDHURI, A., ADHIKARY, A., DIHIDAR, S., (2000). Mean square error estimation in multi-stage sampling. Metrika. 52, 115-131.
- CHAUDHURI, A., PAL, S., (2002). Estimating proportions from unequal probability samples using randomized responses by Warner's and other devices. Journal of the Indian Society of Agricultural Statistics. 55(2), 174-183.
- FOLSOM, R. E., (1991). Exponential and logistic weight adjustment for sampling and nonresponse error reduction. Proceedings of Social Statistics Section, Washington, DC: American Statistical Association. 197-202.
- FULLER, W. A., LOUGHIN, M. M., BAKER, H. D., (1994). Regression weighting for the 1987-88 National Food Consumption Survey. Survey Methodology. 20, 75-85.
- HANSEN, M. H., HURWITZ, W. N., (1946). The problem of non-response in sample surveys. Journal of the American Statistical Association. 41, 517-529.
- HEITJAN, D. F., BASU, S., (1996). Distinguishing 'Missing at Random' and 'Missing Completely at Random'. The American Statistician. 50, 207-213.
- HORVITZ, D. G., THOMPSON, D. J., (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association. 47, 663-685.
- KOTT, P. S., (2006). Using calibration weighting to adjust for nonresponse and coverage errors. Survey Methodology. 32, 133-142.
- KOTT, P. S., CHANG, T., (2010). Using calibration weighting to adjust for nonignorable unit nonresponse. Journal of the American Statistical Association. 105(491), 1265-1275.
- LITTLE, R. J. A., (1986). Survey nonresponse adjustments for estimates of means. International Statistical Review. 54, 139-157.
- MIDZUNO, H., (1952). On the sampling system with probabilities proportionate to sum of sizes. Annals of the Institute of Statistical Mathematics. 3, 99-107.
- POLITZ, A. N., SIMMONS, W. R., (1949). An Attempt to Get 'Not-at-Homes' into the Sample Without Call-Backs. Journal of the American Statistical Association. 44, 9-31.
- POLITZ, A., SIMMONS, W., (1950). Note on an Attempt to Get 'Not-at-Homes' into the Sample Without Call-Backs. Journal of the American Statistical Association. 45, 136-137.
- RAJ, D., (1966). Some remarks on a simple procedure of sampling without replacement. Journal of the American Statistical Association. 61, 391-396.
- RUBIN, D. B., (1987). Multiple Imputation for Nonresponse in Surveys. J. Wiley & Sons, New York.
- RUBIN, D. B., (1976). Inference and missing data. Biometrika. 63, 581-592.
- SÄRNDAL, C. E., (2011). Dealing with survey nonresponse in data collection, in estimation. Journal of Official Statistics. 27, 1-21.
- SÄRNDAL, C. E., SWENSSON, B., WRETMAN, J., (1992). Model Assisted Survey Sampling. Springer-Verlag. New York.
- SETH, G. R., (1966). On estimators of variance of estimate of population total in varying probabilities. Journal of the Indian Society of Agricultural Statistics. 18, 52-56.
- SINGH, S., (2010). Layman's understanding of non-response: How Michael and Amy adjust a missing phone call. LIAISON, Statistical Society of Canada. 24(3), p. 67.
- VALLIANT, R., DORFMAN, A. H., ROYALL, R. M., (2000). Finite Population Sampling and Inference: A Prediction Approach. Wiley Series in Survey Methodology. New York.
- YATES, F., GRUNDY, P. M., (1953). Selection without replacement from within strata with probability proportional to size. Journal of the American Statistical Association. 75, 206-211.