Precise temporal regulation of alternative splicing during neural development, bioRxiv, 2018-01-15
Abstract: Alternative splicing (AS) is a crucial step of gene expression that must be tightly controlled, but the precise timing of dynamic splicing switches during neural development and the underlying regulatory mechanisms are poorly understood. Here we systematically analyzed the temporal regulation of AS in a large number of transcriptome profiles of developing mouse cortices, in vivo purified neuronal subtypes, and neurons differentiated in vitro. Our analysis revealed early- and late-switch exons in genes with distinct functions, and these switches accurately define neuronal maturation stages. Integrative modeling suggests that these switches are under direct and combinatorial regulation by distinct sets of neuronal RNA-binding proteins including Nova, Rbfox, Mbnl and Ptbp. Surprisingly, various neuronal subtypes in the sensory systems lack Nova and/or Rbfox expression. These neurons retain the “immature” splicing program in early-switch exons, affecting numerous synaptic genes. These results provide new insights into the organization and regulation of the neurodevelopmental transcriptome.
biorxiv molecular-biology 0-100-users 2018

A systematic review of Drosophila short-term-memory genetics: meta-analysis reveals robust reproducibility, bioRxiv, 2018-01-14
Abstract: Geneticists use olfactory conditioning in Drosophila to identify learning genes; however, little is known about how these genes are integrated into short-term memory (STM) pathways. Here, we investigated the hypothesis that the STM evidence base is weak. We performed a systematic review and meta-analysis of the field. Using metrics to quantify variation between discovery articles and follow-up studies, we found that seven genes were both highly replicated and highly reproducible. However, ~80% of STM genes have never been replicated. While only a few studies investigated interactions, the reviewed genes could account for >1000% memory. This large summed effect size could indicate irreproducibility, many shared pathways, or that current assay protocols lack the specificity needed to identify core plasticity genes. Mechanistic theories of memory will require the convergence of evidence from system, circuit, cellular, molecular, and genetic experiments; systematic data synthesis is an essential tool for integrated neuroscience.
biorxiv animal-behavior-and-cognition 0-100-users 2018

Assembly of Long Error-Prone Reads Using Repeat Graphs, bioRxiv, 2018-01-13
ABSTRACT: The problem of genome assembly is ultimately linked to the problem of the characterization of all repeat families in a genome as a repeat graph. The key reason the de Bruijn graph emerged as a popular short read assembly approach is that it offered an elegant representation of all repeats in a genome that reveals their mosaic structure. However, most algorithms for assembling long error-prone reads use an alternative overlap-layout-consensus (OLC) approach that does not provide a repeat characterization. We present the Flye algorithm for constructing the A-Bruijn (assembly) graph from long error-prone reads, which, in contrast to the k-mer-based de Bruijn graph, assembles genomes using an alignment-based A-Bruijn graph. Unlike existing assemblers, Flye does not attempt to construct accurate contigs (at least at the initial assembly stage) but instead simply generates arbitrary paths in the (unknown) assembly graph and further constructs an assembly graph from these paths. Counter-intuitively, this fast but seemingly reckless approach results in the same graph as the assembly graph constructed from accurate contigs. Flye constructs (overlapping) contigs with possible assembly errors at the initial stage, combines them into an accurate assembly graph, resolves repeats in the assembly graph using small variations between various repeat instances that were left unresolved during the initial assembly stage, constructs a new, less tangled assembly graph based on resolved repeats, and finally outputs accurate contigs as paths in this graph. We benchmark Flye against several state-of-the-art Single Molecule Sequencing assemblers and demonstrate that it generates better or comparable assemblies for all analyzed datasets.
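To illustrate the abstract's remark that a de Bruijn graph exposes repeats, here is a minimal sketch of the classic k-mer-based construction on an error-free toy sequence. It is not Flye and not the alignment-based A-Bruijn graph; the function names and the toy genome are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Nodes with more than one successor, i.e. where paths diverge."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Hypothetical toy genome: the repeat ATTCAG occurs twice with different flanks.
toy_genome = "GGATTCAGCCATTCAGTT"
graph = de_bruijn_graph([toy_genome], k=4)
repeat_exits = branching_nodes(graph)
```

Both copies of the repeat collapse onto one path through the graph, and the node at the end of the repeat branches into the two distinct flanking sequences. Real long reads are error-prone, which is why exact k-mer matching of this kind does not carry over directly.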
biorxiv bioinformatics 0-100-users 2018

Accurate allele frequencies from ultra-low coverage pool-seq samples in evolve-and-resequence experiments, bioRxiv, 2018-01-12
Abstract: Evolve-and-resequence (E+R) experiments leverage next-generation sequencing technology to track the allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this power comes at the expense of high-coverage (>100x) sequencing of many pooled samples, which can be cost-prohibitive. Here, we show that accurate estimates of allele frequencies can be achieved with very shallow sequencing depths (<5x) via inference of known founder haplotypes in small genomic windows. This technique can be used to efficiently estimate frequencies for any number of bi-allelic SNPs in populations of any model organism founded with sequenced homozygous strains. Using both experimentally-pooled and simulated samples of Drosophila melanogaster, we show that haplotype inference can improve allele frequency accuracy by orders of magnitude for up to 50 generations of recombination, and is robust to moderate levels of missing data, as well as different selection regimes. Finally, we show that a simple linear model generated from these simulations can predict the accuracy of haplotype-derived allele frequencies in other model organisms and experimental designs. To make these results broadly accessible for use in E+R experiments, we introduce HAF-pipe, an open-source software tool for calculating haplotype-derived allele frequencies from raw sequencing data. Ultimately, by reducing sequencing costs without sacrificing accuracy, our method facilitates E+R designs with higher replication and resolution, and thereby, increased power to detect adaptive alleles.
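The idea of a haplotype-derived allele frequency can be sketched as a weighted sum: once founder haplotype frequencies have been inferred for a genomic window, the frequency of each bi-allelic SNP is the total frequency of the founders that carry the allele. The sketch below assumes the inference step has already been done and uses made-up numbers; it does not reproduce HAF-pipe's interface, and the function name is hypothetical.

```python
def haplotype_derived_af(founder_freqs, founder_genotypes):
    """Allele frequency per SNP as the frequency-weighted sum of founder genotypes.

    founder_freqs: inferred frequency of each founder haplotype in one window
        (assumed to sum to 1).
    founder_genotypes: one list per founder with 0/1 alleles per SNP
        (founders are homozygous strains, so one value per SNP suffices).
    """
    if abs(sum(founder_freqs) - 1.0) > 1e-6:
        raise ValueError("founder haplotype frequencies must sum to 1")
    n_snps = len(founder_genotypes[0])
    return [
        sum(freq * geno[j] for freq, geno in zip(founder_freqs, founder_genotypes))
        for j in range(n_snps)
    ]


# Hypothetical window: three founders, three SNPs.
founder_freqs = [0.5, 0.25, 0.25]
founder_genotypes = [
    [1, 0, 1],  # founder A
    [0, 0, 1],  # founder B
    [1, 1, 0],  # founder C
]
allele_freqs = haplotype_derived_af(founder_freqs, founder_genotypes)
```

Because every SNP in the window borrows information from the same few haplotype frequencies, reads covering any SNP in the window contribute to the estimate at all of them, which is one way to see why shallow coverage can suffice.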
biorxiv evolutionary-biology 0-100-users 2018

High accuracy haplotype-derived allele frequencies from ultra-low coverage pool-seq samples, bioRxiv, 2018-01-12
Abstract: Evolve-and-resequence experiments leverage next-generation sequencing technology to track allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this power comes at the expense of high-coverage (>100x) sequencing of many pooled samples, which can be cost-prohibitive. Here we show that accurate estimates of allele frequencies can be achieved with very shallow sequencing depths (<5x) via inference of known founder haplotypes in small genomic windows. This technique can be used to efficiently estimate frequencies for any number of alleles in any model system. Using both experimentally-pooled and simulated samples of Drosophila melanogaster, we show that haplotype inference can improve allele frequency accuracy by orders of magnitude, and that high accuracy is maintained after up to 200 generations of recombination, even in the presence of missing data or incomplete founder knowledge. By reducing sequencing costs without sacrificing accuracy, our method enables analysis of samples from more time-points and replicates, increasing the statistical power to detect adaptive alleles.
biorxiv evolutionary-biology 0-100-users 2018

Widespread and targeted gene expression by systemic AAV vectors: Production, purification, and administration, bioRxiv, 2018-01-12
ABSTRACT: We recently developed novel AAV capsids for efficient and noninvasive gene transfer across the central and peripheral nervous systems. In this protocol, we describe how to produce and systemically administer AAV-PHP viruses to label and/or genetically manipulate cells in the mouse nervous system and organs including the heart. The procedure comprises three separate stages: AAV production, intravenous delivery, and evaluation of transgene expression. The protocol spans eight days, excluding the time required to assess gene expression, and can be readily adopted by laboratories with standard molecular and cell culture capabilities. We provide guidelines for experimental design and choosing the capsid, cargo, and viral dose appropriate for the experimental aims. The procedures outlined here are adaptable to diverse biomedical applications, from anatomical and functional mapping to gene expression, silencing, and editing.
biorxiv neuroscience 0-100-users 2018