Accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION, bioRxiv, 2018-10-09

Abstract
Tandem repeats (TRs) can cause disease through their length, sequence motif interruptions, and nucleotide modifications. For many TRs, however, these features are very difficult, if not impossible, to assess, requiring low-throughput and labor-intensive assays. One example is a VNTR in ABCA7 for which we recently discovered that expanded alleles strongly increase risk of Alzheimer’s disease. Here, we investigated the potential of long-read whole genome sequencing to surmount these challenges, using the high-throughput PromethION platform from Oxford Nanopore Technologies. To overcome the limitations of conventional base calling and alignment, we developed an algorithm to study the TR size and sequence directly on raw PromethION current data.
We report the long-read sequencing of multiple human genomes (n = 11) using only a single sequencing run and flow cell per individual. With the use of fresh DNA extractions, DNA shearing to approximately 20 kb, and size selection, we obtained an average output of 70 gigabases (Gb) per flow cell, corresponding to 21x genome coverage, and a maximum yield of 98 Gb (30x genome coverage). All ABCA7 VNTR alleles, including expansions up to 10,000 bases, were spanned by long sequencing reads, validated by Southern blotting. Classical approaches of TR length estimation suffered from low accuracy, low precision, DNA strand effects, and/or inability to call pathogenic repeat expansions. In contrast, our novel NanoSatellite algorithm, which circumvents base calling by using dynamic time warping on raw PromethION current data, achieved more than 90% accuracy and high precision (5.6% relative standard deviation) of TR length estimation, and detected all clinically relevant repeat expansions. In addition, we identified alternative TR sequence motifs with high consistency, allowing determination of TR sequence and distinction of VNTR alleles with homozygous length.
In conclusion, we validated the robustness of single-experiment whole genome long-read sequencing on PromethION, a prerequisite for application of long-read sequencing in the clinic. In addition, we outperformed Southern blotting, enabling improved characterization of the role of expanded ABCA7 VNTR alleles in Alzheimer’s disease, and opening new opportunities for TR research.
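
The abstract names dynamic time warping (DTW) as the technique used to compare raw current data without base calling. As a rough illustration only, the following is a minimal textbook DTW distance between two numeric signals; it is not the NanoSatellite implementation, and the function name `dtw_distance`, the absolute-difference cost, and the toy signals are assumptions made for this sketch.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance with an absolute-difference cost.

    Illustrative sketch only: real nanopore signals would first need
    normalization, and a production tool would likely restrict the warping path.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both signals must be non-empty")

    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Hypothetical toy signals: the second is the first with one sample held longer,
# so warping absorbs the difference in duration.
reference = [1.0, 2.0, 3.0]
observed = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(reference, observed))
```

Because DTW allows one sample to align with several samples in the other signal, it tolerates the variable dwell times typical of nanopore current traces, which is presumably why it suits comparison of raw signal against an expected repeat-unit pattern.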

biorxiv genomics 0-100-users 2018

Altered chromatin localization of hybrid lethality proteins in Drosophila, bioRxiv, 2018-10-09

Abstract
Understanding hybrid incompatibilities is a fundamental pursuit in evolutionary genetics. In crosses between Drosophila melanogaster females and Drosophila simulans males, the interaction of at least three genes is necessary for hybrid male lethality: Hmr mel, Lhr sim, and gfzf sim. All three hybrid incompatibility genes encode chromatin-associated factors. While HMR and LHR physically bind each other and function together in a single complex, the connection between either of these proteins and gfzf remains mysterious. Here, we investigate the allele-specific chromatin binding patterns of gfzf. First, our cytological analyses show that there is little difference in protein localization of GFZF between the two species except at telomeric sequences. In particular, GFZF binds the telomeric retrotransposon repeat arrays, and the differential binding of GFZF at telomeres reflects the rapid changes in sequence composition at telomeres between D. melanogaster and D. simulans. Second, we investigate the patterns of GFZF and HMR co-localization and find that the two proteins do not normally co-localize in D. melanogaster. However, in inter-species hybrids, HMR shows extensive mis-localization to GFZF sites, and this altered localization requires the presence of gfzf sim. Third, we find by ChIP-Seq that over-expression of HMR and LHR within species is sufficient to cause HMR to mis-localize to GFZF binding sites, indicating that HMR has a natural low affinity for GFZF sites. Together, these studies provide the first insights into the different properties of gfzf between D. melanogaster and D. simulans as well as a molecular interaction between gfzf and Hmr in the form of altered protein localization.

biorxiv molecular-biology 0-100-users 2018

Analyses of Neanderthal introgression suggest that Levantine and southern Arabian populations have a shared population history, bioRxiv, 2018-10-09

Abstract
Objectives: Modern humans are thought to have interbred with Neanderthals in the Near East soon after modern humans dispersed out of Africa. This introgression event likely took place in either the Levant or southern Arabia, depending on which dispersal route out of Africa was followed. In this study, we compare Neanderthal introgression in contemporary Levantine and southern Arabian populations to investigate Neanderthal introgression and to study Near Eastern population history.
Materials and Methods: We analyzed genotyping data on >400,000 autosomal SNPs from seven Levantine and five southern Arabian populations and compared those data to populations from around the world, including Neanderthal and Denisovan genomes. We used f4 and D statistics to estimate and compare levels of Neanderthal introgression between Levantine, southern Arabian, and comparative global populations. We also identified 1,581 putative Neanderthal-introgressed SNPs within our dataset and analyzed their allele frequencies as a means to compare introgression patterns in Levantine and southern Arabian genomes.
Results: We find that Levantine and southern Arabian populations have similar levels of Neanderthal introgression to each other but lower levels than other non-Africans. Furthermore, we find that introgressed SNPs have very similar allele frequencies in the Levant and southern Arabia, which indicates that Neanderthal introgression is similarly distributed in Levantine and southern Arabian genomes.
Discussion: We infer that the ancestors of contemporary Levantine and southern Arabian populations received Neanderthal introgression prior to separating from each other and that there has been extensive gene flow between these populations.
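
The Materials and Methods mention D statistics as one way of estimating introgression. As a hedged illustration, the sketch below computes the common allele-frequency form of the ABBA-BABA D statistic for four populations (P1, P2, P3, outgroup). It is not the authors' pipeline: the function name `d_statistic`, the sign convention, and the toy frequencies are assumptions, and a real analysis would also estimate a standard error, for example by block jackknife.

```python
from typing import Sequence


def d_statistic(
    p1: Sequence[float],
    p2: Sequence[float],
    p3: Sequence[float],
    outgroup: Sequence[float],
) -> float:
    """ABBA-BABA D statistic from per-SNP derived-allele frequencies.

    Each argument holds one frequency per SNP, in the same SNP order.
    Under this (assumed) sign convention, D > 0 indicates excess allele
    sharing between P2 and P3; D < 0 indicates excess sharing between P1 and P3.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("all populations must cover the same SNPs")

    abba = 0.0
    baba = 0.0
    for f1, f2, f3, fo in zip(p1, p2, p3, outgroup):
        # Expected ABBA pattern: P2 and P3 carry the derived allele, P1 and outgroup do not.
        abba += (1.0 - f1) * f2 * f3 * (1.0 - fo)
        # Expected BABA pattern: P1 and P3 carry the derived allele, P2 and outgroup do not.
        baba += f1 * (1.0 - f2) * f3 * (1.0 - fo)

    total = abba + baba
    if total == 0.0:
        raise ValueError("no informative sites (ABBA + BABA is zero)")
    return (abba - baba) / total


# Hypothetical toy frequencies: P1 and P2 are identical, so the statistic is zero.
print(d_statistic([0.5, 0.2], [0.5, 0.2], [0.3, 0.9], [0.0, 0.0]))
```

A value near zero for two populations tested against the same archaic genome is the kind of result that would be consistent with the similar introgression levels reported above.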

biorxiv genomics 0-100-users 2018
