Performance Metrics for Sample Selection Bias Correction

Author

An-Chiao Liu

Co-authors

An-Chiao Liu
Ton De Waal
Sander Scholtus
Katrijn Van Deun

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: CPS Abstract

Keywords: non-probability sample, performance metrics, selection bias

Session: CPS 01 - Statistical methodology II

Monday 17 July 8:30 a.m. - 9:40 a.m. (Canada/Eastern)

Abstract

When a population parameter is estimated from a non-probability sample, that is, a sample whose sampling mechanism is unknown, the estimate may suffer from selection bias. A commonly used correction assigns a pseudo-weight to each unit in the non-probability sample and estimates the target parameter by the weighted sum. However, the literature lacks a framework tailored to evaluating the assigned weights, and the evaluation frameworks developed for prediction problems may not be suitable for population parameter estimation. We aim to fill this gap by discussing several promising performance metrics inspired by classical calibration and by measures of selection bias. A simulation study and real data examples show that some of these metrics have a strong positive relationship with the mean squared error of the estimated population mean. Such metrics may therefore help with model selection when selection bias is corrected using logistic regression or machine learning algorithms.
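
As a rough illustration of the pseudo-weighting idea described in the abstract, the following Python sketch compares an unweighted and a pseudo-weighted estimate of a population mean on simulated data. Everything in it is an assumption made for illustration: the function names `weighted_mean` and `calibration_gap`, the simulated population, and the use of the true inverse selection probabilities as weights (in practice these would have to be estimated, for example by logistic regression). The calibration-style check is only one simple possibility and is not necessarily one of the metrics studied by the authors.

```python
import numpy as np


def weighted_mean(values, weights):
    """Pseudo-weighted estimate of a mean: sum(w * v) / sum(w)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))


def calibration_gap(aux, weights, pop_mean_aux):
    """Hypothetical calibration-style check: absolute gap between the
    weighted sample mean of an auxiliary variable and its known
    population mean. Smaller values suggest better-calibrated weights."""
    return abs(weighted_mean(aux, weights) - pop_mean_aux)


# Simulated population (assumption: one auxiliary variable x drives
# both the target variable y and the selection mechanism).
rng = np.random.default_rng(0)
N = 10_000
x = rng.normal(size=N)
y = 2.0 + x + rng.normal(scale=0.5, size=N)

# Selection probability increases with x, so the sample over-represents
# units with large x (and therefore large y).
p = 1.0 / (1.0 + np.exp(-(-2.0 + x)))
selected = rng.random(N) < p
x_s, y_s, p_s = x[selected], y[selected], p[selected]

# Unweighted estimate versus pseudo-weighted estimate. The true inverse
# selection probabilities stand in for estimated pseudo-weights here.
uniform_w = np.ones_like(y_s)
pseudo_w = 1.0 / p_s

naive_estimate = weighted_mean(y_s, uniform_w)
weighted_estimate = weighted_mean(y_s, pseudo_w)

# Calibration-style check on the auxiliary variable for both weight sets.
gap_uniform = calibration_gap(x_s, uniform_w, x.mean())
gap_pseudo = calibration_gap(x_s, pseudo_w, x.mean())

print("true mean:        ", y.mean())
print("naive estimate:   ", naive_estimate)
print("weighted estimate:", weighted_estimate)
print("calibration gap (uniform weights):", gap_uniform)
print("calibration gap (pseudo-weights): ", gap_pseudo)
```

In this toy setting the calibration gap can be computed without knowing the target variable in the population, which is what makes such a quantity a candidate for comparing competing weighting models.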