Meiosis is a process required for sexual reproduction that generates gametes (egg or sperm cells) containing half the chromosomes of the parental cell. Meiotic crossovers, which are tightly regulated, generate recombined chromosomes that differ from the paternal and maternal chromosomes. Crossover distributions are not uniform and vary among species, sexes and even individuals. Single-cell DNA sequencing of gametes from an individual provides sufficient information to reconstruct the individual’s haplotypes at chromosome scale and to construct a personalised crossover profile. Dissecting variation in meiotic crossovers among individuals reveals diverse crossover phenotypes and creates more opportunities to study the fundamental mechanisms of meiotic regulation. However, the field lacks an efficient, specialised software tool for haplotype construction and crossover detection from single-cell DNA sequencing data, which limits the applicability of these approaches.
Here we introduce a computational toolset that enables efficient profiling of personalised meiotic crossovers from single-cell DNA sequencing datasets of gametes. The toolset has two components orientated towards different, complementary tasks: sgcocaller, a command-line tool implemented in the Nim programming language for processing large genomic datasets, and comapr (https://github.com/ruqianl/comapr), an accompanying R package for analyses that follow crossover identification, such as comparative analysis among individuals. sgcocaller includes a phasing module that utilises the specific features of single-gamete DNA sequences and achieves individualised chromosome-scale haplotype construction even with sparse read coverage per gamete. For crossover detection, sgcocaller implements a Hidden Markov Model with binomial emission probabilities that models the observed allele read counts to find shifts in the gametes’ haplotypes.
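To make the crossover-detection idea concrete, the following is a minimal Python sketch of a two-state Hidden Markov Model with binomial emissions, decoded with the Viterbi algorithm. It is not sgcocaller's implementation. The function names, the emission parameter `theta`, the fixed per-SNP `switch_prob`, and the assumption that alternative-allele counts are already oriented to the phased haplotypes are all hypothetical choices made for illustration.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def viterbi_haplotypes(alt_counts, total_counts, theta=0.9, switch_prob=0.01):
    """Most likely haplotype state (0 or 1) at each SNP of one gamete.

    Assumed model (illustrative values, not taken from the paper):
    in state 1 a read carries the alternative allele with probability
    theta, in state 0 with probability 1 - theta, and the state switches
    between adjacent SNPs with probability switch_prob.
    """
    n = len(alt_counts)
    if n == 0:
        return []
    p_alt = (1.0 - theta, theta)
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob)

    # Initialise with a uniform prior over the two haplotype states.
    score = [math.log(0.5) + log_binom_pmf(alt_counts[0], total_counts[0], p)
             for p in p_alt]
    back = []  # back[i - 1][s] = best previous state for state s at SNP i
    for i in range(1, n):
        new_score, pointers = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            move = score[1 - s] + log_switch
            pointers.append(s if stay >= move else 1 - s)
            emission = log_binom_pmf(alt_counts[i], total_counts[i], p_alt[s])
            new_score.append(max(stay, move) + emission)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def crossover_intervals(path):
    """Pairs of adjacent SNP indices between which the state changes."""
    return [(i - 1, i) for i in range(1, len(path)) if path[i] != path[i - 1]]
```

For example, `crossover_intervals(viterbi_haplotypes([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], [1] * 10))` places a single haplotype shift between the fifth and sixth SNPs, whereas one isolated discordant read is absorbed as noise because two state switches are less likely than one mismatched read under these parameters. A realistic model would probably let the transition probability depend on the genomic distance between SNPs; the fixed `switch_prob` is a simplification.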
Our open-source computational tools provide a fast and convenient workflow for constructing individual haplotypes and crossover profiles directly from single-cell gamete sequencing datasets, without requiring additional preprocessing of the large alignment files.