64th ISI World Statistics Congress - Ottawa, Canada

Semiparametric Adaptive Estimation Under Informative Sampling

Author

Kosuke Morikawa

Co-author

  • Jae Kwang Kim
  • Yoshikazu Terada

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: IPS Abstract

Keywords: informative_sampling, semiparametric_inference, survey_sampling

Session: IPS 460 - Inference under Informative Sample Designs

Monday 17 July 10 a.m. - noon (Canada/Eastern)

Abstract

In probability sampling, each unit of a large population is sampled according to pre-designated sampling weights. Using the sampling weights, we can remove the selection bias in the sample. For example, the Horvitz-Thompson estimator is well known to be consistent and asymptotically normal even if the sampling mechanism is informative; however, it is not necessarily efficient. This study derives the semiparametric efficiency bound for three types of target parameters: (i) Z-estimators, such as the mean; (ii) regression models; and (iii) conditional density functions. Our key idea is to regard the survey weights as random variables and derive the semiparametric efficiency bound for each target parameter.
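
As a small illustration of why the weights matter, the following sketch compares the unweighted sample mean with the Horvitz-Thompson estimator of a population mean under Poisson sampling. Everything here is assumed for illustration and is not taken from the study: the synthetic population, the logistic form of the inclusion probability, and the function names are all hypothetical.

```python
import math
import random


def simulate_population(n_pop, rng):
    """Draw y ~ N(0, 1) together with an inclusion probability that grows with y."""
    population = []
    for _ in range(n_pop):
        y = rng.gauss(0.0, 1.0)
        # Hypothetical informative design: units with larger y are more likely sampled.
        pi = 0.05 + 0.4 / (1.0 + math.exp(-y))
        population.append((y, pi))
    return population


def poisson_sample(population, rng):
    """Include each unit independently with its own inclusion probability."""
    return [(y, pi) for (y, pi) in population if rng.random() < pi]


def horvitz_thompson_mean(sample, n_pop):
    """Weight each sampled y by 1/pi and divide by the known population size."""
    return sum(y / pi for y, pi in sample) / n_pop


def naive_mean(sample):
    """Unweighted sample mean, which ignores the sampling design."""
    return sum(y for y, _ in sample) / len(sample)


rng = random.Random(0)
population = simulate_population(20000, rng)
sample = poisson_sample(population, rng)

true_mean = sum(y for y, _ in population) / len(population)
ht = horvitz_thompson_mean(sample, len(population))
naive = naive_mean(sample)

print(f"population mean: {true_mean:.3f}")
print(f"unweighted mean: {naive:.3f}")
print(f"Horvitz-Thompson: {ht:.3f}")
```

Because the inclusion probability increases with y in this toy design, the unweighted mean is pulled upward, while the weighted estimator stays close to the population mean.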

However, a conditional expectation involving the sampling weights is necessary to construct estimators that attain the semiparametric efficiency bound. We propose two types of adaptive estimators to estimate this expectation. One estimator assumes a parametric model for the sampling weights that is essentially the same as beta regression. The other estimator uses nonparametric working models with debiased machine learning. The proposed estimator with the parametric model is consistent and asymptotically normal even if the working model is incorrect; if the working model is correct, it is efficient in the class of regular and asymptotically linear estimators. The proposed estimator with the nonparametric working model is always efficient. A limited simulation study is conducted to investigate the finite-sample performance of the proposed method. The proposed method is applied to the 1999 Canadian Workplace and Employee Survey data.
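
To give a rough sense of what estimating a conditional expectation of the weights with sample splitting can look like, the sketch below smooths the weights by a cross-fitted, binned-mean estimate of E[w | y] among sampled units and plugs the result into a Horvitz-Thompson-type mean. This is not the estimator proposed in the abstract: the conditioning on y alone, the binned-mean working model, the two-fold split, the synthetic design, and all function names are assumptions made for illustration.

```python
import math
import random


def draw_sample(n_pop, rng):
    """Poisson-sample a synthetic population; return (sample, true_mean)."""
    sample, total = [], 0.0
    for _ in range(n_pop):
        y = rng.gauss(0.0, 1.0)
        total += y
        # Hypothetical design: inclusion probability depends on y and on noise u.
        u = rng.uniform(0.5, 1.5)
        pi = (0.05 + 0.4 / (1.0 + math.exp(-y))) * u
        if rng.random() < pi:
            sample.append((y, 1.0 / pi))
    return sample, total / n_pop


def bin_index(y, n_bins=30, lo=-3.0, hi=3.0):
    """Map y to an equal-width bin on [lo, hi], clamping the tails."""
    k = int((y - lo) / (hi - lo) * n_bins)
    return min(max(k, 0), n_bins - 1)


def fit_bin_means(train, n_bins=30):
    """Binned-mean estimate of E[w | y] among sampled units."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for y, w in train:
        k = bin_index(y, n_bins)
        sums[k] += w
        counts[k] += 1
    overall = sum(w for _, w in train) / len(train)
    # Fall back to the overall mean weight for empty bins.
    return [s / c if c > 0 else overall for s, c in zip(sums, counts)]


def cross_fitted_weights(sample, n_folds=2):
    """Predict each unit's smoothed weight from a model fit on the other folds."""
    smoothed = [0.0] * len(sample)
    for fold in range(n_folds):
        train = [sample[i] for i in range(len(sample)) if i % n_folds != fold]
        means = fit_bin_means(train)
        for i in range(fold, len(sample), n_folds):
            smoothed[i] = means[bin_index(sample[i][0])]
    return smoothed


rng = random.Random(1)
n_pop = 20000
sample, true_mean = draw_sample(n_pop, rng)
w_tilde = cross_fitted_weights(sample)

# Horvitz-Thompson-type means with the raw and the smoothed weights.
ht_raw = sum(w * y for y, w in sample) / n_pop
ht_smooth = sum(wt * y for (y, _), wt in zip(sample, w_tilde)) / n_pop

print(f"population mean:  {true_mean:.3f}")
print(f"raw weights:      {ht_raw:.3f}")
print(f"smoothed weights: {ht_smooth:.3f}")
```

Fitting the working model on one fold and predicting on the other is the sample-splitting idea behind debiased machine learning; a single run like this only shows that both estimators land near the population mean, so any comparison of their variances would need repeated simulation.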