High-throughput mapping of single neuron projections by sequencing of barcoded RNA, bioRxiv, 2016-05-23
Summary. Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences (barcodes). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits.
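The readout described in this summary can be pictured as a barcode-by-area count table: each barcode stands for one neuron, and the share of its reads recovered from each dissected area is that neuron's projection profile. The following is a minimal sketch under stated assumptions, not the authors' pipeline. The function name `projection_profiles`, the `(area, barcode)` input format, and the example barcodes and area names are all hypothetical, and the sketch ignores sequencing-error correction and any normalisation a real analysis would need.

```python
from collections import defaultdict


def projection_profiles(reads, areas):
    """Turn (area, barcode) read pairs into per-barcode projection profiles.

    reads: iterable of (area, barcode) pairs (hypothetical input format).
    areas: ordered list of dissected area names; defines the column order.
    Returns {barcode: [fraction of that barcode's reads found in each area]}.
    """
    column = {area: i for i, area in enumerate(areas)}
    counts = defaultdict(lambda: [0] * len(areas))
    for area, barcode in reads:
        counts[barcode][column[area]] += 1

    profiles = {}
    for barcode, row in counts.items():
        total = sum(row)
        profiles[barcode] = [c / total for c in row]
    return profiles


# Made-up example: one barcode found mostly in area_A, one only in area_B.
example_reads = [
    ("area_A", "ACGTAC"), ("area_A", "ACGTAC"), ("area_A", "ACGTAC"),
    ("area_B", "ACGTAC"),
    ("area_B", "TTAGGC"), ("area_B", "TTAGGC"),
]
profiles = projection_profiles(example_reads, ["area_A", "area_B"])
```

Each row of the resulting table sums to 1, so rows can be compared across neurons regardless of how strongly each barcode was expressed.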
biorxiv neuroscience 0-100-users 2016
LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis, bioRxiv, 2016-05-04
Abstract. Motivation: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. Results: In this manuscript, we describe LD Hub, a centralized database of summary-level GWAS results for 177 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits.
In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. Availability and implementation: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org
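For readers unfamiliar with the method that LD Hub automates, the standard LD score regression model relates the expected association statistic of SNP j to its LD score l_j: E[chi2_j] = N * h2 * l_j / M + N * a + 1, where N is the sample size, M the number of SNPs, h2 the SNP heritability and a a confounding term. The sketch below fits that line by plain least squares on toy data. It is an illustration only, not the LD Hub or ldsc implementation: the function name `ldsc_h2` is hypothetical, and the sketch omits the regression weights and standard-error estimation that a real analysis needs.

```python
def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Unweighted least-squares fit of chi2_j = intercept + slope * l_j.

    Under the LD score regression model the slope equals N * h2 / M,
    so h2 is recovered as slope * M / N. Returns (h2, intercept).
    """
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    h2 = slope * n_snps / n_samples
    return h2, intercept


# Noise-free toy data generated with h2 = 0.5, N = 2000, M = 1000,
# so the true slope is N * h2 / M = 1 and the true intercept is 1.
toy_ld = [1.0, 2.0, 3.0, 4.0]
toy_chi2 = [2.0, 3.0, 4.0, 5.0]
h2_hat, intercept_hat = ldsc_h2(toy_chi2, toy_ld, n_samples=2000, n_snps=1000)
```

An intercept well above 1 would point to confounding such as population stratification rather than polygenic signal, which is why the intercept is estimated alongside the slope.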
biorxiv bioinformatics 0-100-users 2016
plasmidSPAdes: Assembling Plasmids from Whole Genome Sequencing Data, bioRxiv, 2016-04-16
Abstract. Motivation: Plasmids are stably maintained extra-chromosomal genetic elements that replicate independently from the host cell’s chromosomes. Although plasmids harbor biomedically important genes (such as genes involved in virulence and antibiotic resistance), there is a shortage of specialized software tools for extracting and assembling plasmid data from whole genome sequencing projects. Results: We present the plasmidSPAdes algorithm and software tool for assembling plasmids from whole genome sequencing data and benchmark its performance on a diverse set of bacterial genomes. Availability and implementation: plasmidSPAdes is publicly available at http://spades.bioinf.spbau.ru/plasmidSPAdes Contact: d.antipov@spbu.ru
biorxiv bioinformatics 0-100-users 2016
Div-Seq: A single nucleus RNA-Seq method reveals dynamics of rare adult newborn neurons in the CNS, bioRxiv, 2016-03-28
Abstract. Transcriptomes of individual neurons provide rich information about cell types and dynamic states. However, it is difficult to capture rare dynamic processes, such as adult neurogenesis, because isolation from dense adult tissue is challenging, and markers for each phase are limited. Here, we developed Div-Seq, which combines Nuc-Seq, a scalable single nucleus RNA-Seq method, with EdU-mediated labeling of proliferating cells. We first show that Nuc-Seq can sensitively identify closely related cell types within the adult hippocampus. We apply Div-Seq to track transcriptional dynamics of newborn neurons in an adult neurogenic region in the hippocampus. Finally, we find rare adult newborn GABAergic neurons in the spinal cord, a non-canonical neurogenic region. Taken together, Nuc-Seq and Div-Seq open the way for unbiased analysis of any complex tissue.
biorxiv neuroscience 0-100-users 2016
Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics, bioRxiv, 2016-03-24
Abstract. Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations were tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.
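The abstract does not state the expression it derives. As an illustration of how a gene-level statistic can be computed from summary data alone, the sketch below assumes the commonly cited S-PrediXcan form, Z_g ≈ sum over SNPs l of w_l * (sigma_l / sigma_g) * z_l, where w_l are the expression-prediction weights, z_l the GWAS z-scores, sigma_l the SNP standard deviations in a reference panel, and sigma_g^2 = w' * Cov * w the variance of predicted expression. Treat this as an assumption rather than the authors' derivation. The function name `spredixcan_z` and the toy inputs are hypothetical.

```python
import math


def spredixcan_z(weights, snp_z, cov):
    """Gene-level z-score from SNP-level summary statistics (assumed form).

    weights: prediction weights w_l for the SNPs in the gene's model.
    snp_z:   GWAS z-scores for the same SNPs, in the same order.
    cov:     reference-panel covariance matrix of those SNPs (list of lists).
    """
    n = len(weights)
    # Variance of predicted expression: w' * Cov * w.
    var_g = sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    sd_g = math.sqrt(var_g)
    # Weighted sum of SNP z-scores, each scaled by its standard deviation.
    numerator = sum(
        weights[i] * math.sqrt(cov[i][i]) * snp_z[i] for i in range(n)
    )
    return numerator / sd_g


# Toy inputs: two uncorrelated, unit-variance SNPs with equal weights.
z_gene = spredixcan_z([1.0, 1.0], [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```

Only the weights, the GWAS z-scores and a reference covariance matrix enter the calculation, which is why no individual-level genotype data are required.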
biorxiv bioinformatics 0-100-users 2016
Topologically associated domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation, bioRxiv, 2016-03-16
Abstract. In vertebrates and other Metazoa, developmental genes are found surrounded by dense clusters of highly conserved noncoding elements (CNEs). CNEs exhibit extreme levels of sequence conservation of unexplained origin, with many acting as long-range enhancers during development. Clusters of CNEs, termed genomic regulatory blocks (GRBs), define the span of regulatory interactions for many important developmental regulators. The function and genomic distribution of these elements close to important regulatory genes raise the question of how they relate to the 3D conformation of these loci. We show that GRBs, defined using clusters of CNEs, coincide strongly with the patterns of topological organisation in metazoan genomes, predicting the boundaries of topologically associating domains (TADs) at hundreds of loci. The set of TADs that are associated with high levels of noncoding conservation exhibits distinct properties compared to TADs called in chromosomal regions devoid of extreme noncoding conservation. The correspondence between GRBs and TADs suggests that TADs around developmental genes are ancient, slowly evolving genomic structures, many of which have had conserved spans for hundreds of millions of years. This relationship also explains the difference in TAD numbers and sizes between genomes. While the close correspondence between extreme conservation and the boundaries of this subset of TADs does not reveal the mechanism leading to the conservation of these elements, it provides a functional framework for studying the role of TADs in long-range transcriptional regulation.
biorxiv genomics 0-100-users 2016