Alternative Sample Weighting Procedures of Household surveys: a comparison of calibration approaches used in National Socio-Economic Survey
64th ISI World Statistics Congress - Ottawa, Canada
Format: CPS Abstract
Keywords: calibration, sampling, weighting
Session: CPS 38 - Survey statistics I
Tuesday 18 July 8:30 a.m. - 9:40 a.m. (Canada/Eastern)
Statistics Indonesia, as a national government office, conducts the National Socio-Economic Survey, also known as the "mother of surveys", to collect multidimensional indicators of Indonesia's socio-economic welfare and other substantive indicators. Designed to estimate population aggregates down to the provincial and regional levels, this large-scale survey applies probability sampling with a total sample of approximately 320,000 households. Given the sampling design and the importance of the information obtained from this survey, Statistics Indonesia performs the basic procedures for weighting the sampled data: initial/base weight calculation, adjustment for non-response, adjustment for non-covered households, calibration, and weight trimming. Nevertheless, as science and technology advance, improvements in sampling methods are necessary, and national government offices must adapt to them. Calibration approaches have evolved into several model forms that address problems in the distribution of sampled data. Calibration was introduced by Deville and Särndal as a statistical procedure that generates a vector of adjustment factors (g-weights) minimizing a distance function from the design weights while reproducing known auxiliary totals. In this paper, we evaluate the calibration approaches used in the weighting process: linear/GREG, truncated, multivariate raking ratio, and logit. The study compares the resulting weights across these calibration techniques to determine which approach performs best. National Socio-Economic Survey datasets are resampled repeatedly, generating numerous sample sets that serve as the datasets in this Monte Carlo simulation study. Each sample set is then evaluated by the estimates and standard errors of several selected key variables, such as employment status, school participation, internet access, health, and access to social welfare programs.
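To make the g-weight idea above concrete, the following is a minimal sketch of linear (GREG-type) calibration and a truncated linear variant. It is not the implementation used in the study: the function names, the toy data, the bounds of 0.85 and 1.15, and the Newton-type iteration for the truncated case are all assumptions chosen for illustration.

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Linear calibration: g_i = 1 + x_i' lam, solved in closed form."""
    resid = totals - X.T @ d                 # gap between known totals and design-weighted totals
    H = X.T @ (d[:, None] * X)               # sum of d_i * x_i x_i'
    lam = np.linalg.solve(H, resid)
    g = 1.0 + X @ lam
    return d * g, g

def truncated_calibration(d, X, totals, lower, upper, max_iter=50, tol=1e-8):
    """Truncated linear calibration: g_i = clip(1 + x_i' lam, lower, upper).

    Newton-type iteration (an illustrative choice); only units whose g-weight
    lies strictly inside the bounds contribute to the update matrix. The solve
    fails if too few units remain inside the bounds.
    """
    lam = np.zeros(X.shape[1])
    for _ in range(max_iter):
        g = np.clip(1.0 + X @ lam, lower, upper)
        resid = totals - X.T @ (d * g)
        if np.max(np.abs(resid)) < tol:
            return d * g, g
        active = (g > lower) & (g < upper)
        H = X[active].T @ (d[active, None] * X[active])
        lam = lam + np.linalg.solve(H, resid)
    raise RuntimeError("calibration did not converge; bounds may be too tight")

# Hypothetical toy sample: 6 households, design weight 10 each,
# auxiliary variables = intercept and household size.
d = np.full(6, 10.0)
X = np.column_stack([np.ones(6), np.arange(1.0, 7.0)])
totals = np.array([60.0, 222.0])             # assumed known population totals

w_lin, g_lin = linear_calibration(d, X, totals)
w_trunc, g_trunc = truncated_calibration(d, X, totals, lower=0.85, upper=1.15)

print("linear g-weights:   ", np.round(g_lin, 4))
print("truncated g-weights:", np.round(g_trunc, 4))
```

In this toy case both sets of weights reproduce the assumed totals, but the truncated variant keeps every g-weight inside the chosen bounds, at the cost of larger adjustments for the units that are not at a bound.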
The results show that the linear approach estimates the indicators more accurately than the other approaches, although it is less precise. The truncated approach yields the most precise estimates, which differ only slightly from those of the other approaches. Based on the distribution of weights, the truncated and logit approaches produce better-distributed weights than the others because they bound the g-weights and thereby reduce extreme weight values. Therefore, the truncated approach is recommended for weighting this survey data, with the linear approach as an alternative.
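The comparison of weight distributions mentioned above can be illustrated with a few standard summaries. The sketch below is an assumption about how such a comparison might be made, not the diagnostics reported in the study: it computes the range of the g-weights, the coefficient of variation of the final weights, and Kish's approximate design effect due to unequal weighting, n * sum(w^2) / (sum(w))^2. The function name and the example weights are hypothetical.

```python
import numpy as np

def weight_summary(w, d):
    """Summarize final weights w relative to design weights d."""
    w = np.asarray(w, dtype=float)
    d = np.asarray(d, dtype=float)
    g = w / d                                   # g-weights (adjustment factors)
    cv = w.std() / w.mean()                     # coefficient of variation (population std)
    kish_deff = w.size * np.sum(w**2) / np.sum(w)**2   # equals 1 + cv**2
    return {"g_min": g.min(), "g_max": g.max(), "cv": cv, "kish_deff": kish_deff}

# Hypothetical weights: design weight 20 for every unit,
# final weights after some calibration step.
d_example = np.full(4, 20.0)
w_example = np.array([10.0, 10.0, 30.0, 30.0])

summary = weight_summary(w_example, d_example)
print(summary)
```

A narrower g-weight range and a Kish design effect closer to 1 indicate less variable weights, which is the sense in which bounded approaches can be described as better distributed.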