64th ISI World Statistics Congress - Ottawa, Canada

Penalized Mixture Cure Models for Modeling a Time-to-Event Outcome with Long-term Survivors in a High-Dimensional Covariate Space


Kellie J. Archer


Han Fu


Format: CPS Abstract

Keywords: survival, cure, genomic

Session: CPS 61 - High-dimensional statistics

Tuesday 18 July 5:30 p.m. - 6:30 p.m. (Canada/Eastern)


Treatment decisions for patients diagnosed with acute myeloid leukemia (AML) are often based on cytogenetics and selected genetic mutations. However, approximately 40% of AML patients are cytogenetically normal (CN). While the European LeukemiaNet (ELN) prognostic risk classification further refines this group of patients into favorable, intermediate, and adverse risk groups, some cytogenetically normal patients will enjoy long-term relapse-free survival despite their ELN classification. In such cases, where an important subset of patients will not experience the event of interest, the assumptions of the Cox proportional hazards (PH) model are violated. Mixture cure models (MCMs) are therefore an appropriate alternative to the Cox PH model when an important cured fraction exists. Specifically, MCMs assume the population consists of two subgroups, those cured and those susceptible to the event of interest. There are thus two regression components, which permit identification of features associated with cure and/or with latency among susceptible patients. Because we are interested in relapse-free survival for CN-AML, ‘cured’ here is synonymous with attaining long-term relapse-free survival, so cured patients have a survival probability of 1. However, novel methods are needed to fit multivariable MCMs when the number of covariates exceeds the sample size, as is the case when high-throughput genomic assay data comprise the covariate space. Therefore, to identify prognostically relevant transcripts from high-throughput genomic assays, and to build a multivariable model that can distinguish cured patients from susceptible patients at lower or higher risk of relapse, we developed parametric and semi-parametric regularized MCMs that embed false discovery rate control.
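To make the two-component structure described above concrete, the following is a minimal Python sketch of the population survival function of a generic mixture cure model. It is an illustration under stated assumptions, not the regularized estimator developed in this work: the logistic form for the cure probability, the Weibull proportional hazards form for the latency, and all function and parameter names (cure_probability, weibull_latency_survival, population_survival, b0, b, shape, scale, beta) are hypothetical choices made for the example.

```python
import math


def cure_probability(x, b0, b):
    """Incidence component (assumed logistic): probability of being cured."""
    eta = b0 + sum(bj * xj for bj, xj in zip(b, x))
    return 1.0 / (1.0 + math.exp(-eta))


def weibull_latency_survival(t, z, shape, scale, beta):
    """Latency component (assumed Weibull PH): survival of susceptible patients."""
    lin = sum(bj * zj for bj, zj in zip(beta, z))
    return math.exp(-((t / scale) ** shape) * math.exp(lin))


def population_survival(t, x, z, b0, b, shape, scale, beta):
    """Mixture: cured patients keep survival probability 1 at every time t,
    susceptible patients follow the latency survival curve."""
    pi_cured = cure_probability(x, b0, b)
    s_susceptible = weibull_latency_survival(t, z, shape, scale, beta)
    return pi_cured + (1.0 - pi_cured) * s_susceptible


# Hypothetical parameter values, chosen only to show the plateau behaviour.
params = dict(x=[1.0], z=[1.0], b0=0.0, b=[0.0], shape=1.5, scale=2.0, beta=[0.3])
at_start = population_survival(0.0, **params)   # everyone is event-free at t = 0
plateau = population_survival(1e6, **params)    # tends to the cured fraction
```

With these illustrative values the cure probability is 0.5, so the population survival curve starts at 1 and levels off at 0.5 rather than falling to 0. A plateau of this kind is what distinguishes a setting with long-term survivors from one in which every patient eventually experiences the event.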
We examined the performance of our regularized MCMs using extensive simulation studies and compared them to a regularized Cox PH model, a regularized Weibull model, and two existing MCM approaches: Cmix and sign consistency in cure rate models (SCinCRM). We then applied our regularized MCMs to a CN-AML dataset. First, we fit univariable MCMs to identify baseline demographic features, clinical features, or selected gene mutations related to the probability of being cured and/or to the latency distribution (time to relapse). We then included gene expression values as candidate covariates in our novel regularized MCM to identify a parsimonious list of transcripts associated with cure or latency. An independent CN-AML dataset was used to validate the transcripts identified by our model. Our regularized MCM identified transcripts associated with cure and latency. Kaplan-Meier curves of cured versus susceptible patients, as well as of susceptible patients at lower versus higher risk of relapse or death, were well separated. In conclusion, our regularized MCMs identified important subsets of genes associated with cure and latency in CN-AML patients. Our results suggest that this group includes distinct transcriptionally defined subgroups with different biological properties, which may be useful for refining current risk stratification systems and for indicating which patients might be cured with chemotherapy alone and which should be referred for more aggressive therapies.