Multivariable association discovery in population-scale meta-omics studies.

Overview

abstract

It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.

authors

Mallick, Himel
Rahnavard, Ali
McIver, Lauren J
Ma, Siyuan
Zhang, Yancong
Nguyen, Long H
Tickle, Timothy L
Weingart, George
Ren, Boyu
Schwager, Emma H
Chatterjee, Suvo
Thompson, Kelsey N
Wilkinson, Jeremy E
Subramanian, Ayshwarya
Lu, Yiren
Waldron, Levi
Paulson, Joseph N
Franzosa, Eric A
Bravo, Hector Corrada
Huttenhower, Curtis

publication date

November 16, 2021

published in

PLoS Computational Biology

Research

keywords

Computational Biology
Gastrointestinal Microbiome
Multivariate Analysis

Identity

PubMed Central ID

PMC8714082

Scopus Document Identifier

85119837381

Digital Object Identifier (DOI)

10.1101/2020.08.31.261214

PubMed ID

34784344

Additional Document Info

has global citation frequency

261

volume

17

issue

11
