High-throughput mapping of long-range neuronal projection using in situ sequencing, bioRxiv, 2018-04-04
Summary: Understanding neural circuits requires deciphering interactions among myriad cell types defined by spatial organization, connectivity, gene expression, and other properties. Resolving these cell types requires both single neuron resolution and high throughput, a challenging combination with conventional methods. Here we introduce BARseq, a multiplexed method based on RNA barcoding for mapping projections of thousands of spatially resolved neurons in a single brain, and relating those projections to other properties such as gene or Cre expression. Mapping the projections to 11 areas of 3579 neurons in mouse auditory cortex using BARseq confirmed the laminar organization of the three top classes (IT, PT-like and CT) of projection neurons. In-depth analysis uncovered a novel projection type restricted almost exclusively to transcriptionally-defined subtypes of IT neurons. By bridging anatomical and transcriptomic approaches at cellular resolution with high throughput, BARseq can potentially uncover the organizing principles underlying the structure and formation of neural circuits.
biorxiv neuroscience 100-200-users 2018
The organization of intracortical connections by layer and cell class in the mouse brain, bioRxiv, 2018-04-01
Abstract: The mammalian cortex is a laminar structure composed of many cell types densely interconnected in complex ways. Recent systematic efforts to map the mouse mesoscale connectome provide comprehensive projection data on interareal connections, but not at the level of specific cell classes or layers within cortical areas. We present here a significant expansion of the Allen Mouse Brain Connectivity Atlas, with ∼1,000 new axonal projection mapping experiments across nearly all isocortical areas in 49 Cre driver lines. Using 13 lines selective for cortical layer-specific projection neuron classes, we identify the differential contribution of each layer/class to the overall intracortical connectivity patterns. We find layer 5 (L5) projection neurons account for essentially all intracortical outputs. L2/3, L4, and L6 neurons contact a subset of the L5 cortical targets. We also describe the most common axon lamination patterns in cortical targets. Most patterns are consistent with previous anatomical rules used to determine hierarchical position between cortical areas (feedforward, feedback), with notable exceptions. While diverse target lamination patterns arise from every source layer/class, L2/3 and L4 neurons are primarily associated with feedforward type projection patterns and L6 with feedback. L5 has both feedforward and feedback projection patterns. Finally, network analyses revealed a modular organization of the intracortical connectome. By labeling interareal and intermodule connections as feedforward or feedback, we present an integrated view of the intracortical connectome as a hierarchical network.
biorxiv neuroscience 200-500-users 2018
The Genomic Formation of South and Central Asia, bioRxiv, 2018-03-31
Abstract: The genetic formation of Central and South Asian populations has been unclear because of an absence of ancient DNA. To address this gap, we generated genome-wide data from 362 ancient individuals, including the first from eastern Iran, Turan (Uzbekistan, Turkmenistan, and Tajikistan), Bronze Age Kazakhstan, and South Asia. Our data reveal a complex set of genetic sources that ultimately combined to form the ancestry of South Asians today. We document a southward spread of genetic ancestry from the Eurasian Steppe, correlating with the archaeologically known expansion of pastoralist sites from the Steppe to Turan in the Middle Bronze Age (2300-1500 BCE). These Steppe communities mixed genetically with peoples of the Bactria Margiana Archaeological Complex (BMAC) whom they encountered in Turan (primarily descendants of earlier agriculturalists of Iran), but there is no evidence that the main BMAC population contributed genetically to later South Asians. Instead, Steppe communities integrated farther south throughout the 2nd millennium BCE, and we show that they mixed with a more southern population that we document at multiple sites as outlier individuals exhibiting a distinctive mixture of ancestry related to Iranian agriculturalists and South Asian hunter-gatherers. We call this group Indus Periphery because they were found at sites in cultural contact with the Indus Valley Civilization (IVC) and along its northern fringe, and also because they were genetically similar to post-IVC groups in the Swat Valley of Pakistan.
By co-analyzing ancient DNA and genomic data from diverse present-day South Asians, we show that Indus Periphery-related people are the single most important source of ancestry in South Asia (consistent with the idea that the Indus Periphery individuals are providing us with the first direct look at the ancestry of peoples of the IVC), and we develop a model for the formation of present-day South Asians in terms of the temporally and geographically proximate sources of Indus Periphery-related, Steppe, and local South Asian hunter-gatherer-related ancestry. Our results show how ancestry from the Steppe genetically linked Europe and South Asia in the Bronze Age, and identify the populations that almost certainly were responsible for spreading Indo-European languages across much of Eurasia.
One Sentence Summary: Genome wide ancient DNA from 357 individuals from Central and South Asia sheds new light on the spread of Indo-European languages and parallels between the genetic history of two sub-continents, Europe and South Asia.
biorxiv genomics 500+-users 2018
Bayesian Inference for a Generative Model of Transcriptome Profiles from Single-cell RNA Sequencing, bioRxiv, 2018-03-30
Abstract: Transcriptome profiles of individual cells reflect true and often unexplored biological diversity, but are also affected by noise of biological and technical nature. This raises the need to explicitly model the resulting uncertainty and take it into account in any downstream analysis, such as dimensionality reduction, clustering, and differential expression. Here, we introduce Single-cell Variational Inference (scVI), a scalable framework for probabilistic representation and analysis of gene expression in single cells. Our model uses variational inference and stochastic optimization of deep neural networks to approximate the parameters that govern the distribution of expression values of each gene in every cell, using a non-linear mapping between the observations and a low-dimensional latent space. By doing so, scVI pools information between similar cells or genes while taking nuisance factors of variation such as batch effects and limited sensitivity into account. To evaluate scVI, we conducted a comprehensive comparative analysis to existing methods for distributional modeling and dimensionality reduction, all of which rely on generalized linear models. We first show that scVI scales to over one million cells, whereas competing algorithms can process at most tens of thousands of cells. Next, we show that scVI fits unseen data more closely and can impute missing data more accurately, both indicative of a better generalization capacity. We then utilize scVI to conduct a set of fundamental analysis tasks – including batch correction, visualization, clustering and differential expression – and demonstrate its accuracy in comparison to the state-of-the-art tools in each task. scVI is publicly available, and can be readily used as a principled and inclusive solution for multiple tasks of single-cell RNA sequencing data analysis.
biorxiv bioinformatics 0-100-users 2018
A comprehensive toolkit to enable MinION sequencing in any laboratory, bioRxiv, 2018-03-27
Abstract: Long-read sequencing technologies are transforming our ability to assemble highly complex genomes. Realising their full potential relies crucially on extracting high quality, high molecular weight (HMW) DNA from the organisms of interest. This is especially the case for the portable MinION sequencer which potentiates all laboratories to undertake their own genome sequencing projects, due to its low entry cost and minimal spatial footprint. One challenge of the MinION is that each group has to independently establish effective protocols for using the instrument, which can be time consuming and costly. Here we present a workflow and protocols that enabled us to establish MinION sequencing in our own laboratories, based on optimising DNA extractions from a challenging plant tissue as a case study. Following the workflow illustrated we were able to reliably and repeatedly obtain > 8.5 Gb of long read sequencing data with a mean read length of 13 kb and an N50 of 26 kb. Our protocols are open-source and can be performed in any laboratory without special equipment. We also illustrate some more elaborate workflows which can increase mean and average read lengths if this is desired. We envision that our workflow for establishing MinION sequencing, including the illustration of potential pitfalls, will be useful to others who plan to establish long-read sequencing in their own laboratories.
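The abstract above reports both a mean read length and an N50, which differ in how they weight long reads. As a minimal sketch of that difference, assuming the usual definition of N50 (the read length at which reads of that length or longer hold at least half of all sequenced bases), the two values can be computed as follows. The function name `n50` and the toy read lengths are hypothetical and are not taken from the preprint.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths (in bases).

    Assumes the common definition: sort reads from longest to shortest
    and return the length of the read at which the running total first
    reaches half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one read length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy read lengths in bases (illustrative only).
reads = [2_000, 5_000, 8_000, 13_000, 26_000, 30_000]
mean_length = sum(reads) / len(reads)
print(n50(reads), mean_length)
```

Because N50 is weighted by bases rather than by read count, a few long reads pull it well above the mean, which is consistent with an N50 larger than the mean read length.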
biorxiv genomics 500+-users 2018
Label-free prediction of three-dimensional fluorescence images from transmitted light microscopy, bioRxiv, 2018-03-27
Understanding living cells as integrated systems, a challenge central to modern biology, is complicated by limitations of available imaging methods. While fluorescence microscopy can resolve subcellular structure in living cells, it is expensive, slow, and damaging to cells. Here, we present a label-free method for predicting 3D fluorescence directly from transmitted light images and demonstrate that it can be used to generate multi-structure, integrated images.
biorxiv cell-biology 100-200-users 2018