Poster Presentation 43rd Lorne Genome Conference 2022

Somatic point mutations are enriched in long non-coding RNAs with possible regulatory function in breast cancer (#118)

Narges Rezaei 1, Masroor Bayati 2, James Breen 3, Hamid Rabiee 4, Hamid Alinejad Rokny 5
  1. Center for Complex Biological Systems, University of California Irvine, Irvine, California, USA
  2. Department of Computer Science, University of Toronto, Toronto, CANADA
  3. Robinson Research Institute, University of Adelaide, Adelaide, SA, Australia
  4. Computational Biology and Bioinformatics, Sharif University of Technology, Tehran, Iran
  5. BioMedical Machine Learning Lab, Graduate School of Biomedical Engineering, UNSW SYDNEY, Sydney, NSW, Australia

Non-coding RNAs (ncRNAs) form a large portion of the mammalian genome; however, their biological functions in cancer are poorly characterized. In this study, using a newly developed tool, SomaGene, we analyze de novo somatic point mutations from the International Cancer Genome Consortium (ICGC) whole-genome sequencing data of 1,855 breast cancers. We identify 929 candidate ncRNAs that are significantly and explicitly mutated in breast cancer samples. By integrating data from the ENCODE regulatory features and the FANTOM5 expression atlas, we show that the candidate ncRNAs in breast cancer samples are significantly enriched for active chromatin histone marks (1.9 times), CTCF binding sites (2.45 times), DNase accessibility (1.76 times), HMM-predicted enhancers (2.26 times) and eQTL polymorphisms (1.77 times). Importantly, we show that the 929 ncRNAs contain a much higher level (3.64 times) of breast cancer-associated genome-wide association study (GWAS) single nucleotide polymorphisms (SNPs) than the genome-wide expectation. No such enrichment is seen for GWAS SNPs from other diseases. Using breast tissue-related Hi-C data, we then show that 82% of our candidate ncRNAs (1.9 times) significantly interact with the promoters of protein-coding genes, including previously known cancer-associated genes, suggesting a critical role for the candidate ncRNA genes in the activation of essential regulators of development and differentiation in breast cancer. We provide an extensive web-based resource (https://www.ihealthe.unsw.edu.au/research) to communicate our results to the research community. Our list of breast cancer-specific ncRNA genes has the potential to provide a better understanding of the underlying genetic causes of breast cancer. Lastly, the tool developed in this study can be used to analyze somatic mutations in all cancers.
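
The "times" figures above are fold enrichments relative to a genome-wide expectation. As a rough illustration of that kind of calculation, the sketch below divides the overlap rate among candidate regions by the overlap rate in a background set, and attaches an exact one-sided binomial tail probability. It is not SomaGene's implementation, and the abstract does not state which statistical test was used. The function names and all counts except the 929 candidates are hypothetical.

```python
from math import exp, lgamma, log


def fold_enrichment(hits_in_candidates, n_candidates,
                    hits_in_background, n_background):
    """Observed overlap rate among candidates divided by the background rate."""
    observed_rate = hits_in_candidates / n_candidates
    expected_rate = hits_in_background / n_background
    return observed_rate / expected_rate


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Each term is computed in log space so that large binomial
    coefficients do not overflow.
    """
    total = 0.0
    for i in range(k, n + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_term)
    return total


# Illustrative counts only: 929 is the number of candidates reported in the
# abstract; every other number here is a hypothetical placeholder.
n_candidates = 929
hits_in_candidates = 90    # hypothetical: candidates overlapping a feature
n_background = 20000       # hypothetical: size of the background ncRNA set
hits_in_background = 800   # hypothetical: background ncRNAs overlapping it

fe = fold_enrichment(hits_in_candidates, n_candidates,
                     hits_in_background, n_background)
p_value = binomial_upper_tail(hits_in_candidates, n_candidates,
                              hits_in_background / n_background)
print(f"fold enrichment = {fe:.2f}, one-sided p = {p_value:.3g}")
```

A binomial tail is only one reasonable choice here; a hypergeometric or permutation-based test may suit data where regions differ greatly in length or mutability.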