Withdrawn 43rd Lorne Genome Conference 2022

Improved identification of cell-free droplets and damaged cells in single-cell RNA-seq data (#233)

Walter Muskovic 1, Joseph E Powell 1
  1. Garvan Institute of Medical Research, Pyrmont, NSW, Australia
Advances in droplet-based single-cell RNA-sequencing (scRNA-seq) have dramatically increased throughput, allowing tens of thousands of cells to be routinely sequenced in a single experiment. In addition to intact cells, droplets capture cell-free "ambient" RNA and cell fragments. Dissociation of solid tissue can produce large quantities of ambient RNA, making it difficult to accurately distinguish cell-containing droplets from droplets containing high concentrations of ambient RNA. Fragile cell types may also become damaged during dissociation. Current methods for separating these groups often retain a significant number of droplets that do not contain cells – so-called empty droplets. In addition to the challenge of identifying empty droplets, there are currently no methods available to detect droplets containing damaged cells, which comprise partially lysed cells – the original source of the ambient RNA. To address this problem we have created a new method, DropletQC, that detects empty droplets, damaged cells, and intact cells, and accurately distinguishes them from one another. The approach is based on a novel quality control metric, the nuclear fraction, which quantifies for each droplet the fraction of RNA originating from unspliced, nuclear pre-mRNA. Ambient RNA consists predominantly of mature cytoplasmic mRNA. Hence, droplets that contain only ambient RNA have a low nuclear fraction compared to droplets containing cells. In contrast, damaged cells, which are depleted of cytoplasmic RNA, have a higher nuclear fraction than intact cells. The nuclear fraction therefore separates the three populations: low for empty droplets, intermediate for intact cells, and high for damaged cells. Applying the method to several heterogeneous scRNA-seq datasets, we demonstrate that DropletQC provides a powerful extension to existing computational methods for identifying empty droplets.
DropletQC has been implemented as an R package, which can be easily integrated into existing single-cell analysis workflows.
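
As an illustration of the nuclear fraction idea only, the following Python sketch computes the metric for a droplet from counts of intronic (unspliced) and exonic (spliced) reads and assigns a label using two fixed cut-offs. It is not the DropletQC implementation or its API: the `DropletCounts` type, the function names, and the thresholds `low` and `high` are hypothetical, and using intronic reads as a proxy for unspliced pre-mRNA is an assumption of this sketch. In practice, cut-offs would likely need to be chosen per dataset and per cell type rather than fixed in advance.

```python
import math
from dataclasses import dataclass


@dataclass
class DropletCounts:
    """Hypothetical per-droplet read counts (not a DropletQC data structure)."""
    barcode: str
    intronic_reads: int  # assumed proxy for unspliced, nuclear pre-mRNA
    exonic_reads: int    # assumed proxy for mature, spliced mRNA


def nuclear_fraction(droplet: DropletCounts) -> float:
    """Fraction of a droplet's reads that are intronic; NaN if it has no reads."""
    total = droplet.intronic_reads + droplet.exonic_reads
    if total == 0:
        return math.nan
    return droplet.intronic_reads / total


def classify(nf: float, low: float = 0.1, high: float = 0.6) -> str:
    """Label a droplet from its nuclear fraction.

    The cut-offs are illustrative placeholders, not values from the abstract.
    """
    if math.isnan(nf):
        return "unknown"
    if nf < low:
        return "empty_droplet"  # mostly mature cytoplasmic (ambient) RNA
    if nf > high:
        return "damaged_cell"   # cytoplasmic RNA depleted
    return "intact_cell"


if __name__ == "__main__":
    droplets = [
        DropletCounts("AAAC", intronic_reads=5, exonic_reads=95),
        DropletCounts("AAAG", intronic_reads=30, exonic_reads=70),
        DropletCounts("AAAT", intronic_reads=80, exonic_reads=20),
    ]
    for d in droplets:
        nf = nuclear_fraction(d)
        print(d.barcode, round(nf, 2), classify(nf))
```

The two-threshold rule simply mirrors the ordering described in the abstract (empty droplets lowest, damaged cells highest); it is a simplification chosen for readability.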