64th ISI World Statistics Congress - Ottawa, Canada

Strain genetic association studies within the human microbiome

Format: IPS Abstract

Session: IPS 107 - New statistical methods for longitudinal microbiome data

Tuesday 18 July 2 p.m. - 3:40 p.m. (Canada/Eastern)

Abstract

A rich ecosystem of statistical methods for microbial community epidemiology has evolved that can associate community features with phenotypes, covariates, and exposures across human (and other) populations. At the same time, computational methods for processing metagenomic sequences have improved, yielding increasingly precise features describing community taxa, functions, and strains. Due to the complexity of microbial genomics, however, quantitative methods for genetic epidemiology have not yet been adapted to analyze strain features in these contexts, particularly at scale. I will discuss a suite of statistical models developed to test several different ways in which microbial strain biology can be linked to population phenotypes, including 1) enriched (or depleted) gene variants, 2) nonrandom phylogenetic assortment (using Phylogenetic Generalized Linear Mixed Models, PGLMMs), and 3) strain-specific pathway carriage. These methods have been implemented in the R/Bioconductor package anpan, which has been validated using a variety of synthetic community datasets and applied to identify strain variants consistently associated with colorectal cancer (CRC) in the largest CRC meta-analysis to date (~1,500 metagenomes spanning ~750 subjects from 10 studies).
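
To make the first kind of test listed above (enriched or depleted gene variants) concrete, here is a minimal sketch in Python. It is not the anpan implementation and does not use its API. It assumes each sample has been reduced to a binary gene presence/absence call and a binary case/control label, and it substitutes a plain two-sided Fisher exact test for whatever model the package fits. The function names `fisher_exact_two_sided` and `gene_enrichment` are hypothetical and exist only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are phenotype groups (cases, controls); columns are gene
    present / gene absent. Illustrative stand-in, not the anpan model.
    """
    row1, row2 = a + b, c + d      # number of cases, number of controls
    col1 = a + c                   # samples carrying the gene
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x gene carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed;
    # the small tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def gene_enrichment(gene_present, is_case):
    """Build the 2x2 table from per-sample booleans and test it."""
    pairs = list(zip(gene_present, is_case))
    a = sum(1 for g, y in pairs if g and y)          # cases with gene
    b = sum(1 for g, y in pairs if not g and y)      # cases without gene
    c = sum(1 for g, y in pairs if g and not y)      # controls with gene
    d = sum(1 for g, y in pairs if not g and not y)  # controls without gene
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


if __name__ == "__main__":
    # Toy, made-up input: the gene is seen only in the three cases.
    present = [True, True, True, False, False, False]
    case = [True, True, True, False, False, False]
    table, p_value = gene_enrichment(present, case)
    print(table, p_value)
```

A per-gene test like this ignores covariates, repeated samples from the same subject, and phylogenetic structure among strains, which is part of why the abstract describes regression-based and mixed-model approaches instead. In practice the p-values from many genes would also need multiple-testing correction.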