64th ISI World Statistics Congress - Ottawa, Canada

Statistical Methods for Complex Data Obtained from Administrative Health Databases


Dr Aya Mitani


  • WL: Prof. Wendy Lou
  • JH: Joan Hu
  • EP: Eleanor Pullenayegum
  • KL: Dr Kuan Liu
  • ZL: Zihang Lu
  • AM: Aya Mitani
  • Category: International Statistical Institute


    JH: There has been increasing interest in using administrative health data to achieve various scientific goals. Truncation issues arising in the analysis of administrative health records are particularly challenging. We present our recent projects on tackling a zero-truncation issue by integrating zero-truncated recurrent event data with available population-based demographic information. We use mental-health-related emergency department visits from an administrative cohort to motivate and illustrate the statistical analysis under loosely structured models. This presentation is based on joint work with Angela Chen (Simon Fraser University), Rhonda Rosychuk (University of Alberta), and Yi Xiong (University of Manitoba).
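
To make the zero-truncation issue concrete: individuals with no emergency department visits never appear in the records, so the observed mean count overstates the underlying visit rate. The sketch below is only an illustration under a simple parametric assumption (a zero-truncated Poisson model), not the loosely structured models of the talk. The function name and the visit counts are hypothetical.

```python
import math


def fit_zero_truncated_poisson(mean_observed, tol=1e-12):
    """Maximum-likelihood rate of a zero-truncated Poisson model.

    Solves lam / (1 - exp(-lam)) = mean_observed by bisection, where
    mean_observed is the average count among individuals with at least
    one event. A solution exists only if mean_observed exceeds 1.
    """
    if mean_observed <= 1:
        raise ValueError("mean of zero-truncated counts must exceed 1")
    lo, hi = 1e-9, mean_observed  # the rate is always below the truncated mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # -expm1(-mid) equals 1 - exp(-mid), computed accurately for small mid
        if mid / -math.expm1(-mid) < mean_observed:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Made-up visit counts for ten individuals, all with at least one visit.
visit_counts = [1, 1, 1, 2, 2, 3, 1, 4, 2, 1]
mean_obs = sum(visit_counts) / len(visit_counts)

lam_hat = fit_zero_truncated_poisson(mean_obs)
p_zero = math.exp(-lam_hat)                        # estimated share with no visits
n_total_hat = len(visit_counts) / (1 - p_zero)     # implied cohort size, zeros included

print(f"observed mean {mean_obs:.2f}, estimated rate {lam_hat:.3f}")
print(f"implied cohort size including unseen zeros: {n_total_hat:.1f}")
```

Under this assumed model, the estimated rate is lower than the observed mean, and the implied cohort size exceeds the number of individuals in the records. Population-based demographic information, as described in the abstract, offers an external check on that implied size.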

    EP: Longitudinal data collected as part of usual healthcare delivery are becoming increasingly available for research through electronic health records. However, a common feature of these data is that they are collected more frequently when patients are unwell. For example, newborns who are slow to regain their birthweight will require more frequent monitoring and will consequently have more weight measurements than their typically growing counterparts. Failing to account for this would lead to underestimation of the rate of growth of the population of newborns as a whole. I will discuss approaches to handling the informative nature of the observation process, including recent developments to handle other data complexities such as clustering, causal inference, and variable selection.
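
The underestimation described above can be seen in a toy calculation. The sketch below uses two hypothetical infants with made-up weights: a slow grower measured daily and a typical grower measured only twice. It compares a pooled least-squares slope with the average of the per-infant slopes. It illustrates the bias only and does not implement any of the correction methods discussed in the talk.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical data: (day, weight in grams).
# The slow grower gains 10 g/day and is weighed every day for 10 days.
slow = [(day, 3000 + 10 * day) for day in range(11)]
# The typical grower gains 30 g/day and is weighed only on days 0 and 10.
typical = [(0, 3000), (10, 3300)]

# Pooling all measurements lets the frequently measured infant dominate.
pooled_slope = ols_slope(slow + typical)

# Averaging per-infant slopes gives each infant equal weight.
mean_infant_slope = (ols_slope(slow) + ols_slope(typical)) / 2

print(f"pooled slope: {pooled_slope:.2f} g/day")
print(f"mean of per-infant slopes: {mean_infant_slope:.2f} g/day")
```

In this toy setting the pooled slope is about 16.25 g/day, below the 20 g/day average across the two infants, because the slow grower contributes 11 of the 13 measurements.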

    KL: Although administrative data are rich in information, key confounders might not be captured. Several Bayesian sensitivity analyses for unmeasured confounding have been developed that use bias parameters to capture the effect of unobserved confounders. However, these methods do not consider time-dependent unmeasured confounders. We will develop a parametric Bayesian sensitivity analysis that models time-dependent unmeasured confounders. We will formally define the sequential ignorability assumption given the unobserved confounders and discuss the identifiability of our model. We will apply our approach to study the effectiveness of oral vancomycin for pediatric primary sclerosing cholangitis using a multi-centre pediatric disease registry.

    ZL: Identifying disease phenotypes based on longitudinal traits is a common goal in biomedical studies. Compared to clustering a single longitudinal trait, integrating multiple longitudinal traits allows additional information to be incorporated into the clustering process, which can reveal co-existing longitudinal patterns and generate deeper biological insight. This talk will discuss a joint modeling approach for clustering multiple longitudinal traits using administrative data with complex structures. Results from analyzing real and simulated data will be presented and discussed.

    AM: Patients with periodontitis visit dental clinics routinely, and multiple markers on each tooth are recorded at each visit. To characterize the progression of periodontal markers on each tooth, we extend the multistate model framework to account for informative cluster size by incorporating the within-cluster resampling method and a cluster-weighted score function, from which we can obtain marginal inference on the association of time to disease progression with subject-level covariates. We assess the performance of the proposed methods through simulation studies and apply them to the longitudinal data obtained from the Canadian Armed Forces Oral Health Database.
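
As a rough illustration of the within-cluster resampling idea mentioned above, the sketch below repeatedly draws one tooth per subject, computes an estimate on each resampled data set, and averages the estimates. For simplicity it estimates a marginal mean of a binary tooth-level indicator rather than fitting a multistate model, so it is not the method of the talk. The function name and the data are hypothetical.

```python
import random
from statistics import mean


def wcr_mean(clusters, n_resamples=1000, seed=0):
    """Within-cluster resampling estimate of a marginal mean.

    Each resample keeps exactly one randomly chosen unit per cluster, so
    every cluster counts equally regardless of its size. The final
    estimate is the average over all resamples.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = [rng.choice(cluster) for cluster in clusters]
        estimates.append(mean(resample))
    return mean(estimates)


# Hypothetical progression indicators (1 = progressed), one list per subject.
# Cluster size is informative: subjects with fewer remaining teeth tend to
# have more progression.
subjects = [
    [1, 1],
    [1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1],
]

pooled = mean(tooth for subject in subjects for tooth in subject)
resampled = wcr_mean(subjects)

print(f"pooled tooth-level mean: {pooled:.3f}")
print(f"within-cluster resampling estimate: {resampled:.3f}")
```

The pooled mean is dominated by the subject with the most teeth, whereas the resampling estimate approximates the average of the subject-level means. That subject-level quantity is the target when cluster size is informative.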