Poster Presentation 43rd Lorne Genome Conference 2022

Engineered barcode iPSCs coupled with unsupervised data analysis pipelines for scalable analysis of multi-lineage cell differentiation (#212)

Sophie Shen 1,2, Yuliangzi Sun 1,2, Woo Jun Shim 1,2, Tessa Werner 1,2, Stacey Andersen 1,2, Samuel Lukowski 3, Han Sheng Chiu 1,2, Di Xia 1,2, Xiaoli Chen 1,2, Joseph Powell 4,5, Quan Nguyen 1,2, Nathan Palpant 1,2
  1. University of Queensland, St. Lucia, QLD, Australia
  2. Institute for Molecular Bioscience, Brisbane, Queensland, Australia
  3. Boehringer Ingelheim, Vienna, Austria
  4. Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  5. University of New South Wales Cellular Genomics Futures Institute, Sydney, New South Wales, Australia

This study develops a cost-effective and versatile cell multiplexing platform, paired with an unsupervised computational data analysis pipeline, to scale up data generation and advance understanding of the mechanisms controlling cell differentiation. Using CRISPR gene editing, we engineered transcribed barcodes into the AAVS1 locus in WTC induced pluripotent stem cells (iPSCs), enabling parallel analysis of up to 20 isogenic cell lines without additional processing prior to capture for single-cell RNA-seq. Combining barcoded iPSCs with Cell Hashing for multiplexing, we performed 62 experimental permutations multiplexed into 4 single-cell reactions, capturing 62,208 single cells differentiating from pluripotency in vitro across 8 timepoints and 9 different small molecule-mediated signalling perturbations, with biological duplicates. These data reveal temporal and signalling-dependent mechanisms guiding differentiation of iPSCs into diverse mesendodermal cell types. Next, we designed a novel unsupervised data analysis pipeline to classify, cluster, and evaluate molecular control of cell differentiation. First, we use consortium-level epigenetic data to cluster cells based on expression of cell type-specific regulatory genes, identifying the most well-defined cell states to anchor biological diversity in the data. Second, we use label transfer to classify cell populations against benchmark in vivo multi-lineage single-cell developmental time course data. These two independent biological reference points provide a basis for classification, identifying 53 cell types spanning subpopulations of primordial germ cells and gastrulation-stage progenitor cells, as well as lateral plate and paraxial mesoderm, neural crest, and primitive gut tube. We then use NIH Epigenome Roadmap data to infer epigenetic co-modulation of genes, providing an unsupervised computational method to reveal the structural and regulatory basis of individual cell types.
Collectively, this study links new cell barcoding with unbiased computational methods to deconstruct molecular control of cell types, with applications in drug discovery, cell-cell interactions, organoid biology, disease modelling, and cell differentiation, accelerating understanding of the genetic control of cell decisions.
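
As a rough illustration of how transcribed barcodes could be used to assign captured cells back to their cell line of origin, the sketch below picks the dominant barcode per cell from its read counts. The abstract does not describe the demultiplexing method actually used, so everything here is an assumption made for illustration: the function names `assign_barcode` and `demultiplex`, the barcode labels such as "BC01", and the `min_count` and `min_ratio` thresholds are all hypothetical.

```python
from typing import Dict


def assign_barcode(counts: Dict[str, int],
                   min_count: int = 5,
                   min_ratio: float = 2.0) -> str:
    """Assign one cell to a barcode from its per-barcode read counts.

    Hypothetical rule (not taken from the abstract): the top barcode wins
    if it has at least `min_count` reads and at least `min_ratio` times
    the reads of the runner-up. Otherwise the cell is flagged.
    """
    if not counts:
        return "unassigned"
    # Rank barcodes from most to least abundant in this cell.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top_label, top_count = ranked[0]
    if top_count < min_count:
        return "unassigned"  # too few barcode reads to call
    second_count = ranked[1][1] if len(ranked) > 1 else 0
    if second_count > 0 and top_count / second_count < min_ratio:
        return "ambiguous"  # two barcodes compete, possibly a doublet
    return top_label


def demultiplex(cells: Dict[str, Dict[str, int]]) -> Dict[str, str]:
    """Apply assign_barcode to every cell in a {cell_id: counts} mapping."""
    return {cell_id: assign_barcode(c) for cell_id, c in cells.items()}


if __name__ == "__main__":
    example = {
        "cell_A": {"BC01": 40, "BC02": 3},
        "cell_B": {"BC01": 20, "BC02": 15},
        "cell_C": {"BC03": 2},
    }
    for cell_id, label in demultiplex(example).items():
        print(cell_id, label)
```

A ratio-based rule like this is only one simple option; in practice the thresholds would need to be tuned against the observed barcode count distributions.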