Minimax optimal classification with missing data

Author

Timothy Cannings

Co-authors

Torben Sell
Thomas Berrett

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: IPS Abstract

Keywords: classification, missingness

Session: IPS 66 - Recent Advances in High-Dimensional Machine Learning and Inference

Tuesday 18 July 10 a.m. - noon (Canada/Eastern)

Abstract

We introduce a new nonparametric framework for classification problems in the presence of missing data. The key assumption of our framework is that the regression function decomposes into an ANOVA-type sum of orthogonal functions, of which some (or many) may be zero. Our main goal is to derive the minimax rate for the excess risk in this problem. In addition to the ANOVA decomposition assumption, the rate depends on the smoothness of the regression function, a tail condition on the feature marginal distributions, and a margin assumption. Dimension enters the rate only through the largest dimension of the non-zero components of the decomposition, not through the ambient data dimension, so the rate can be faster than in the classical setting when using only complete cases. The classifier that attains the upper bound is based on a careful application of a hard-thresholding estimator to each of the terms in the ANOVA decomposition, which allows us to estimate the zero components at a very fast rate; a nearest-neighbour-based estimator is then used to estimate each of the non-zero components.
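
For readers who prefer symbols, the structure described in the abstract can be sketched as follows. All notation here is an assumption introduced for illustration and is not taken from the abstract: $X = (X_1, \dots, X_d)$ denotes the features, $\eta(x) = \mathbb{P}(Y = 1 \mid X = x)$ the regression function, $f_S$ the ANOVA components, $\hat f_S$ a nearest-neighbour-type estimate of $f_S$, and $\tau_S$ a threshold. The norm, the exact form of the orthogonality condition, and the plug-in rule are generic choices, not necessarily those of the authors.

```latex
% Assumed notation throughout; a generic sketch, not the authors' exact definitions.
\begin{align*}
  % ANOVA-type decomposition of the regression function into orthogonal components
  \eta(x) &= \sum_{S \subseteq \{1,\dots,d\}} f_S(x_S),
  \qquad f_S \perp f_{S'} \ \text{for } S \neq S', \\
  % Sparsity: many components may vanish; d^* is the largest active dimension
  d^* &= \max\bigl\{ |S| : f_S \not\equiv 0 \bigr\} \le d, \\
  % Hard thresholding: a component is kept only if its estimate is large enough
  \tilde f_S &= \hat f_S \, \mathbf{1}\bigl\{ \|\hat f_S\| > \tau_S \bigr\}, \\
  % Generic plug-in classifier built from the thresholded components
  \hat C(x) &= \mathbf{1}\Bigl\{ \sum_{S} \tilde f_S(x_S) \ge \tfrac{1}{2} \Bigr\}.
\end{align*}
```

In this sketch, $d^*$ plays the role of "the largest dimension of the non-zero components", which is the dimension that enters the rate in place of the ambient dimension $d$.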