Accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION, bioRxiv, 2018-10-09

Abstract: Tandem repeats (TRs) can cause disease through their length, sequence motif interruptions, and nucleotide modifications. For many TRs, however, these features are very difficult, if not impossible, to assess, requiring low-throughput and labor-intensive assays. One example is a VNTR in ABCA7 for which we recently discovered that expanded alleles strongly increase risk of Alzheimer’s disease. Here, we investigated the potential of long-read whole genome sequencing to surmount these challenges, using the high-throughput PromethION platform from Oxford Nanopore Technologies. To overcome the limitations of conventional base calling and alignment, we developed an algorithm to study the TR size and sequence directly on raw PromethION current data. We report the long-read sequencing of multiple human genomes (n = 11) using only a single sequencing run and flow cell per individual. With the use of fresh DNA extractions, DNA shearing to approximately 20 kb and size selection, we obtained an average output of 70 gigabases (Gb) per flow cell, corresponding to a 21x genome coverage, and a maximum yield of 98 Gb (30x genome coverage). All ABCA7 VNTR alleles, including expansions up to 10,000 bases, were spanned by long sequencing reads, validated by Southern blotting. Classical approaches of TR length estimation suffered from low accuracy, low precision, DNA strand effects and/or inability to call pathogenic repeat expansions. In contrast, our novel NanoSatellite algorithm, which circumvents base calling by using dynamic time warping on raw PromethION current data, achieved more than 90% accuracy and high precision (5.6% relative standard deviation) of TR length estimation, and detected all clinically relevant repeat expansions. In addition, we identified alternative TR sequence motifs with high consistency, allowing determination of TR sequence and distinction of VNTR alleles with homozygous length. In conclusion, we validated the robustness of single-experiment whole genome long-read sequencing on PromethION, a prerequisite for application of long-read sequencing in the clinic. In addition, we outperformed Southern blotting, enabling improved characterization of the role of expanded ABCA7 VNTR alleles in Alzheimer’s disease, and opening new opportunities for TR research.
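
The abstract names dynamic time warping (DTW) as the technique used on raw current data. As a rough illustration of what DTW computes, here is a minimal sketch of the textbook algorithm on two one-dimensional sequences. It is not the NanoSatellite implementation: the function name `dtw_distance`, the absolute-difference cost, and the toy values are all assumptions made for this example.

```python
import math

def dtw_distance(signal, reference):
    """Textbook O(n*m) dynamic time warping distance between two 1-D sequences.

    Hypothetical helper for illustration; not taken from NanoSatellite.
    """
    n, m = len(signal), len(reference)
    # cost[i][j] = minimal cumulative cost of aligning signal[:i] with reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(signal[i - 1] - reference[j - 1])
            # Extend the cheapest of: advance signal only, advance reference only, advance both
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Toy values standing in for current levels (made up for this sketch).
stretched = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
template = [1.0, 2.0, 3.0]
print(dtw_distance(stretched, template))
```

Because DTW allows one sequence to be stretched relative to the other, a signal that dwells longer on each level can still align to the template at zero cost, which is why the approach suits signals sampled at a variable rate.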

biorxiv genomics 0-100-users 2018

Analyses of Neanderthal introgression suggest that Levantine and southern Arabian populations have a shared population history, bioRxiv, 2018-10-09

Abstract: Objectives. Modern humans are thought to have interbred with Neanderthals in the Near East soon after modern humans dispersed out of Africa. This introgression event likely took place in either the Levant or southern Arabia, depending on which dispersal route out of Africa was followed. In this study, we compare Neanderthal introgression in contemporary Levantine and southern Arabian populations to investigate Neanderthal introgression and to study Near Eastern population history. Materials and Methods. We analyzed genotyping data on >400,000 autosomal SNPs from seven Levantine and five southern Arabian populations and compared those data to populations from around the world including Neanderthal and Denisovan genomes. We used f4 and D statistics to estimate and compare levels of Neanderthal introgression between Levantine, southern Arabian, and comparative global populations. We also identified 1,581 putative Neanderthal-introgressed SNPs within our dataset and analyzed their allele frequencies as a means to compare introgression patterns in Levantine and southern Arabian genomes. Results. We find that Levantine and southern Arabian populations have similar levels of Neanderthal introgression to each other but lower levels than other non-Africans. Furthermore, we find that introgressed SNPs have very similar allele frequencies in the Levant and southern Arabia, which indicates that Neanderthal introgression is similarly distributed in Levantine and southern Arabian genomes. Discussion. We infer that the ancestors of contemporary Levantine and southern Arabian populations received Neanderthal introgression prior to separating from each other and that there has been extensive gene flow between these populations.
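
The abstract mentions D statistics without stating the formula. The sketch below shows the standard frequency-based ABBA-BABA form of the D statistic for four populations, as one way to read that sentence. It is not the authors' pipeline: the function name `d_statistic`, the assumption that inputs are derived-allele frequencies per SNP, and the toy values are all hypothetical.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Frequency-based ABBA-BABA D statistic over a set of SNPs.

    Each argument is a list of derived-allele frequencies (0..1), one per SNP,
    for populations P1, P2, P3 (e.g. the archaic genome) and the outgroup.
    Hypothetical helper for illustration only.
    """
    abba = baba = 0.0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        abba += (1 - a) * b * c * (1 - o)  # P2 shares the derived allele with P3
        baba += a * (1 - b) * c * (1 - o)  # P1 shares the derived allele with P3
    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / total

# Toy frequencies (made up): P2 shares every derived allele with P3, P1 shares none.
print(d_statistic([0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [0.0, 0.0]))  # → 1.0
```

Under this convention a positive D indicates excess allele sharing between P2 and P3 relative to P1, and a value near zero indicates that P1 and P2 are symmetrically related to P3. In practice significance is usually assessed with a block jackknife, which this sketch omits.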

biorxiv genomics 0-100-users 2018

A mouse tissue atlas of small non-coding RNA, bioRxiv, 2018-09-29

SUMMARY: Small non-coding RNAs (ncRNAs) play a vital role in a broad range of biological processes both in health and disease. A comprehensive quantitative reference of small ncRNA expression would significantly advance our understanding of ncRNA roles in shaping tissue functions. Here, we systematically profiled the levels of five ncRNA classes (miRNA, snoRNA, snRNA, scaRNA and tRNA fragments) across eleven mouse tissues by deep sequencing. Using fourteen biological replicates spanning both sexes, we determined that ~30% of small ncRNAs are distributed across the body in a tissue-specific manner, with some also being sexually dimorphic. We found that miRNAs are subject to “arm switching” between healthy tissues and that tRNA fragments are retained within tissues in both a gene- and a tissue-specific manner. Among the eleven profiled tissues, we confirmed that brain contains the largest number of unique small ncRNA transcripts, some of which were previously annotated while others are identified for the first time in this study. Furthermore, by combining these findings with single-cell ATAC-seq data, we were able to connect identified brain-specific ncRNAs with their cell types of origin. 
These results yield the most comprehensive characterization of specific and ubiquitous small RNAs in individual murine tissues to date, and we expect that these data will be a resource for the further identification of ncRNAs involved in tissue function in health and dysfunction in disease.

HIGHLIGHTS
- An atlas of tissue levels of multiple small ncRNA classes generated from 14 biological replicates of both sexes across 11 tissues
- Distinct distribution patterns of miRNA arms and tRNA fragments across tissues suggest the existence of tissue-specific mechanisms of ncRNA cleavage and retention
- miRNA expression is sex specific in healthy tissues
- Small RNA-seq and scATAC-seq data integration produces a detailed map of cell-type specific ncRNA profiles in the mouse brain

biorxiv genomics 0-100-users 2018

 

Created with the audiences framework by Jedidiah Carlson
