Poster Presentation 43rd Lorne Genome Conference 2022

Robust differential composition and variability analysis for multisample single-cell and microbiome data (#223)

Stefano Mangiola 1, Castiel Zhao 1, Heejung Shim 2, Tony Papenfuss 1
  1. Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  2. School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC, Australia

Single-cell genomics, proteomics and metagenomics allow the unbiased characterisation of the composition of tissues and microorganism communities, which can be compared between conditions to identify biological drivers. This strategy has been critical for unveiling markers of progression in diseases such as cancer and pathogen infection. Developing a robust statistical method for differential composition analysis of single-cell data is therefore crucial for driving discoveries. However, available tools either cannot jointly model the compositional and count-based properties of the data or lack flexibility and robustness to overdispersion and outliers. Here we present sccomp, a flexible and robust Bayesian probabilistic framework for testing differential composition and variability of tissues and microbial communities from single-cell RNA sequencing, CyTOF and metagenomics data. The framework can transfer knowledge from a large set of integrated datasets to further increase accuracy in low-data regimes. We show that the proposed method accurately fits real data and outperforms state-of-the-art algorithms. Applying it to publicly available datasets, we show that it can identify novel compositional and variability associations and quantify the impact of outlier observations.