2020 | 21 | no. 4 Special Issue | 144--158
Article title

Confidence Bands for a Distribution Function with Merged Data from Multiple Sources

Authors
Title variants
Publication languages
EN
Abstracts
EN
We develop a technique for record linkage on high dimensional data, where the two datasets may not have any common variable, and there may be no training set available. Our methodology is based on sparse, high dimensional principal components. Since large and high dimensional datasets are often prone to outliers and aberrant observations, we propose a technique for estimating robust, high dimensional principal components. We present theoretical results validating the robust, high dimensional principal component estimation steps, and justifying their use for record linkage. Some numeric results and remarks are also presented. (original abstract)
Year
Volume
21
Pages
144--158
Physical description
Contributors
  • University of Maryland, USA
Bibliography
  • BERK, R. H., JONES, D. H., (1978). Relatively optimal combinations of test statistics. Scand. J. Statist., 5(3), pp. 158-162.
  • BICKEL, P. J., FREEDMAN, D. A., (1981). Some asymptotic theory for the bootstrap. Ann. Statist., 9(6), pp. 1196-1217.
  • BICKEL, P. J., KRIEGER, A. M., (1989). Confidence bands for a distribution function using the bootstrap. J. Amer. Statist. Assoc., 84(405), pp. 95-100.
  • BRESLOW, N. E., CHATTERJEE, N., (1999). Design and analysis of two-phase studies with binary outcome applied to Wilms tumour prognosis. Journal of the Royal Statistical Society: Series C (Applied Statistics), 48(4), pp. 457-468.
  • BRESLOW, N. E., LUMLEY, T., BALLANTYNE, C., CHAMBLESS, L., KULICH, M., (2009). Using the whole cohort in the analysis of case-cohort data. American J. Epidemiol., 169, pp. 1398-1405.
  • BRETH, M., (1978). Bayesian confidence bands for a distribution function. Ann. Statist., 6(3), pp. 649-657.
  • BRICK, J. M., DIPKO, S., PRESSER, S., TUCKER, C., YUAN, Y., (2006). Nonresponse bias in a dual frame sample of cell and landline numbers. The Public Opinion Quarterly, 70(5), pp. 780-793.
  • CERVANTES, I., JONES, M., ROJAS, L., BRICK, J., KURATA, J., GRANT, D., (2006). A review of the sample design for the California Health Interview Survey. In Proceedings of the Social Statistics Section, American Statistical Association, pp. 3023-3030.
  • CHATTERJEE, N., CHEN, Y.-H., MAAS, P., CARROLL, R. J., (2016). Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. J. Amer. Statist. Assoc., 111(513), pp. 107-117.
  • CHENG, R. C. H., ILES, T. C., (1983). Confidence bands for cumulative distribution functions of continuous random variables. Technometrics, 25(1), pp. 77-86.
  • COX, D. R., (1972). Regression models and life-tables. J. Roy. Statist. Soc. Ser. B, 34, pp. 187-220.
  • D'ANGIO, G. J., BRESLOW, N., BECKWITH, J. B., EVANS, A., BAUM, H., DELORIMIER, A., FERNBACH, D., HRABOVSKY, E., JONES, B., KELALIS, P., (1989). Treatment of Wilms' tumor. Results of the Third National Wilms' Tumor Study. Cancer, 64(2), pp. 349-360.
  • DVORETZKY, A., KIEFER, J., WOLFOWITZ, J., (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Statist., 27, pp. 642-669.
  • FREY, J., (2008). Optimal distribution-free confidence bands for a distribution function. J. Statist. Plann. Inference, 138(10), pp. 3086-3098.
  • GINÉ, E., NICKL, R., (2016). Mathematical foundations of infinite-dimensional statistical models. Cambridge Series in Statistical and Probabilistic Mathematics, [40]. Cambridge University Press, New York.
  • HARTLEY, H. O., (1962). Multiple frame surveys. In Proceedings of the Social Statistics Section, American Statistical Association, pp. 203-206.
  • HARTLEY, H. O., (1974). Multiple frame methodology and selected applications. Sankhyā Ser. C, 36, pp. 99-118.
  • HU, S. S., BALLUZ, L., BATTAGLIA, M. P., FRANKEL, M. R., (2011). Improving public health surveillance using a dual-frame survey of landline and cell phone numbers. American Journal of Epidemiology, 173(6), pp. 703-711.
  • KANOFSKY, P., SRINIVASAN, R., (1972). An approach to the construction of parametric confidence bands on cumulative distribution functions. Biometrika, 59, pp. 623-631.
  • KEIDING, N., LOUIS, T. A., (2016). Perils and potentials of self-selected entry to epidemiological studies and surveys. Journal of the Royal Statistical Society: Series A (Statistics in Society), 179(2), pp. 319-376.
  • KOLMOGOROV, A. N., (1933). Sulla determinazione empirica di una legge di distribuzione. Giornale dell'Istituto Italiano degli Attuari, 4, pp. 83-91.
  • MASSART, P., (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. Probab., 18(3), pp. 1269-1283.
  • METCALF, P., SCOTT, A., (2009). Using multiple frames in health surveys. Statistics in Medicine, 28(10), pp. 1512-1523.
  • OWEN, A. B., (1995). Nonparametric likelihood confidence bands for a distribution function. J. Amer. Statist. Assoc., 90(430), pp. 516-521.
  • SAEGUSA, T., (2019). Large sample theory for merged data from multiple sources. Ann. Statist., 47(3), pp. 1585-1615.
  • SAEGUSA, T., WELLNER, J. A., (2013). Weighted likelihood estimation under two-phase sampling. Ann. Statist., 41(1), pp. 269-295.
  • SCHAFER, R. E., ANGUS, J. E., (1979). Estimation of Weibull quantiles with minimum error in the distribution function. Technometrics, 21(3), pp. 367-370.
  • SMIRNOV, N. V., (1944). Approximate laws of distribution of random variables from empirical data. Uspehi Matem. Nauk, 10, pp. 179-206.
  • TSIRELSON, V. S., (1975). The density of the distribution of the maximum of a Gaussian process. Theory of Probability and its Applications, 20, pp. 847-865.
  • WANG, J., CHENG, F., YANG, L., (2013). Smooth simultaneous confidence bands for cumulative distribution functions. J. Nonparametr. Stat., 25(2), pp. 395-407.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171624032
