The fundamental developmental programme of an embryo relies on a robust pattern of gene expression, and deviations from this pattern may lead to catastrophic defects and disease. Regionalised spatial boundaries of gene expression are vital in establishing the embryonic body plan, yet the gene networks and components governing these boundaries remain cryptic. Understanding these patterns requires genome-wide transcript analysis. Recognition of the analytical power of such approaches has sparked the emergence of the new field of spatial transcriptomics (ST) and a rapidly growing library of specialised computational tools. However, mining ST datasets remains a challenge because of their complexity. To increase our capability to analyse ST datasets, we present a new computational tool to interrogate the publicly available Tomo-seq data of the developing Danio rerio embryo generated by Junker et al. (2014). We applied a novel hierarchical clustering algorithm to identify and characterise unannotated genes in the lateral plate mesoderm (LPM) domain in 3D. We selected the LPM because, although it contributes to tissues as diverse as the heart and kidneys, its cellular composition remains unclear.
Using a novel adaptive two-peak statistical model and unsupervised clustering techniques trained on previously known LPM markers, we systematically predicted new genetic elements contributing to the gene networks deployed at LPM boundaries. Candidate predictions include 34 genes validated within the ZFIN database and 65 novel genes with no prior annotation. Our computational approach reconstructed detailed modules of gene expression within the developing LPM, providing new insights into the complex development of embryonic boundaries and novel gene targets for further characterisation. Our study highlights the power of ST and the promise it holds for uncovering how embryonic spatial boundaries contribute to adult development.
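The general idea of grouping genes by spatial expression profile and flagging unannotated genes that co-cluster with known markers can be illustrated with a minimal sketch. This is not the novel algorithm or the two-peak model described above; it is standard average-linkage agglomerative clustering on a correlation distance, written with the Python standard library only. The gene names, the per-section counts, and the helper names (`pearson`, `cluster_profiles`, `candidates`) are all hypothetical and chosen purely for illustration.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return cov / sqrt(var_a * var_b)

def cluster_profiles(profiles, n_clusters):
    """Average-linkage agglomerative clustering on distance 1 - correlation."""
    genes = sorted(profiles)
    dist = {(g, h): 1.0 - pearson(profiles[g], profiles[h])
            for g in genes for h in genes}
    clusters = [[g] for g in genes]  # start with one cluster per gene

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[g, h] for g in c1 for h in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

def candidates(clusters, known_markers):
    """Genes sharing a cluster with a known marker but not themselves known."""
    known = set(known_markers)
    found = []
    for cluster in clusters:
        if known & set(cluster):
            found.extend(g for g in cluster if g not in known)
    return sorted(found)

# Toy data (invented): expression counts in 8 consecutive sections along one axis.
profiles = {
    "marker_a": [0, 1, 5, 9, 5, 1, 0, 0],
    "marker_b": [0, 2, 6, 8, 4, 1, 0, 0],
    "gene_x":   [0, 1, 4, 10, 6, 2, 0, 0],
    "gene_y":   [9, 5, 1, 0, 0, 0, 0, 1],
    "gene_z":   [0, 0, 0, 0, 1, 4, 9, 6],
}

clusters = cluster_profiles(profiles, n_clusters=3)
print(candidates(clusters, known_markers=["marker_a", "marker_b"]))
```

In this toy setting the two known markers act as labels for otherwise unsupervised clusters: any unlabelled gene whose profile along the sectioning axis closely tracks theirs is reported as a candidate. The number of clusters is fixed by hand here, which a real analysis would need to choose more carefully.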