64th ISI World Statistics Congress - Ottawa, Canada

Scale Simulation Random Variables for Inference from Longitudinal Microbiome Data

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: IPS Abstract

Keywords: Bayesian, longitudinal, measurement error

Session: IPS 107 - New statistical methods for longitudinal microbiome data

Tuesday 18 July 2 p.m. - 3:40 p.m. (Canada/Eastern)

Abstract

In the analysis of microbiome data, it is well known that the total number of sequenced DNA molecules is unrelated to the scale (i.e., biological load) of microbes in the communities being studied. Recognizing this limitation, some authors state that the only information in these data is compositional (i.e., relative) and argue for the use of Compositional Data Analysis (CoDA). CoDA is an axiomatic system that asserts that all estimators applied to compositional data must satisfy three invariances to avoid "spurious conclusions". While CoDA's concepts and warnings have provided insights to the microbiome field, CoDA itself has a number of key limitations that hamper further progress. Most importantly, CoDA's axioms restrict the types of scientific questions that can be asked of microbiome data. Its most important axiom, the principle of scale invariance, functionally restricts researchers to asking only questions that are invariant to the unobserved scale of the microbial communities being studied. This precludes differential abundance analysis, correlation analysis, and many types of longitudinal analyses. We have recently introduced Scale Reliant Inference (SRI) as an alternative to CoDA that replaces its axiomatic foundation with more familiar statistical criteria such as consistency, calibration, and bias. Using SRI, we have shown that CoDA's scale invariance axiom is too strong and that scale reliant questions can be answered so long as models carefully account for uncertainty stemming from the limitations of the observed data. Within SRI, Scale Simulation Random Variables (SSRVs) have emerged as a flexible and efficient framework for incorporating this type of uncertainty into analyses. In this talk, I will review SRI and present SSRVs through their relationship with a special type of Bayesian model called a Bayesian Partially Identified Model. I will illustrate how SSRVs can provide a rigorous, flexible, and practical framework for analyzing longitudinal microbiome data and discuss avenues for further development.
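
The idea of carrying scale uncertainty into a scale reliant estimand can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the speaker's implementation: the function name `simulate_log_fold_change`, the flat Dirichlet prior on composition, the normal distribution placed on the log ratio of the unobserved scales, and the example counts are all hypothetical choices made here for demonstration. The sketch compares only two samples and does not model longitudinal structure.

```python
import numpy as np

def simulate_log_fold_change(counts_a, counts_b, log_scale_ratio_mean=0.0,
                             log_scale_ratio_sd=0.5, n_draws=2000, seed=0):
    """Draw per-taxon log fold changes in absolute abundance (sample b vs. a).

    Hypothetical illustration: composition uncertainty comes from a Dirichlet
    posterior, scale uncertainty from an assumed normal model on log(W_b / W_a).
    """
    rng = np.random.default_rng(seed)

    # Step 1: uncertainty in the compositions given the observed counts
    # (Dirichlet posterior under an assumed flat Dirichlet(1, ..., 1) prior).
    props_a = rng.dirichlet(np.asarray(counts_a, dtype=float) + 1.0, size=n_draws)
    props_b = rng.dirichlet(np.asarray(counts_b, dtype=float) + 1.0, size=n_draws)

    # Step 2: uncertainty in the unobserved scale. The sequencing depth carries
    # no information about it, so it is drawn from an assumed scale model.
    log_scale_ratio = rng.normal(log_scale_ratio_mean, log_scale_ratio_sd,
                                 size=n_draws)

    # Step 3: combine both sources. Absolute abundance = proportion * scale,
    # so the log fold change is the sum of a compositional and a scale term.
    return np.log(props_b) - np.log(props_a) + log_scale_ratio[:, None]

# Made-up counts for three taxa in two samples.
counts_a = [100, 300, 600]
counts_b = [200, 300, 500]

draws = simulate_log_fold_change(counts_a, counts_b)
lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
for taxon, (lo, hi) in enumerate(zip(lower, upper)):
    print(f"taxon {taxon}: 95% interval for log fold change = [{lo:.2f}, {hi:.2f}]")
```

Setting `log_scale_ratio_sd=0.0` in this sketch corresponds to asserting that the scale ratio is known exactly, which narrows the intervals; widening it expresses less confidence about the unobserved scale.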