Poster Presentation 43rd Lorne Genome Conference 2022

A chromosome-level reference genome for Telopea speciosissima (New South Wales waratah) provides insight into waratah evolution (#138)

Stephanie H Chen 1,2, Jason G Bragg 1, Richard J Edwards 2
  1. Australian Institute of Botanical Science, Royal Botanic Garden Sydney, Sydney, NSW, Australia
  2. University of New South Wales, Sydney, NSW, Australia

Telopea is an eastern Australian genus of five species of long-lived shrubs in the family Proteaceae. Previous work has characterised population structure and patterns of introgression between Telopea species. Although these studies used a limited set of genetic markers, they point to the great potential of waratahs as a model clade for understanding divergence, environmental adaptation and speciation, particularly once a reference genome provides a genome-wide perspective. However, few Proteaceae genomes and no waratah genomes are available. We assembled the first chromosome-level reference genome for T. speciosissima (New South Wales waratah; 2n = 22) using Nanopore long reads, 10x Chromium linked reads and Hi-C data. The assembly spans 823 Mb (scaffold N50 of 69.0 Mb), with 97.8% of Embryophyta universal single-copy orthologues (BUSCOs; n = 1,614) complete. Read-depth analysis of 140 ‘Duplicated’ BUSCO genes reveals that almost all are real duplications. This increases confidence in protein family analyses based on the annotated protein-coding genes and highlights a possible need to revise the BUSCO set for this lineage. Genome annotation predicted 34,706 genes and pseudogenes, including 27,481 protein-coding genes. We examined the evolutionary dynamics of Telopea using the reference genome in conjunction with DArTseq (n = 244) and whole-genome shotgun sequencing (n = 14) of each of the seven lineages, three of which belong to T. speciosissima (coastal, upland and southern). Here, we discuss the population structure and demographic history of the genus. We also examined phylogenomic relationships and developed a scalable method for rapidly generating species trees from short-read data that maximises the recovery of informative data from genomic datasets. The waratah reference genome represents an important new genomic resource in Proteaceae that will accelerate our understanding of the origins and evolutionary dynamics of the Australian flora.
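
For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy scaffold lengths are hypothetical and chosen for illustration only; they are not taken from the waratah assembly, and this is not necessarily the tool the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in Mb (illustrative values only).
toy_scaffolds = [90, 80, 70, 60, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70
```

In this toy example the total length is 450 Mb, and the three longest scaffolds (90 + 80 + 70 = 240 Mb) are the first to cover at least half of it, so the N50 is 70 Mb.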