Pooled CRISPR screening with single-cell transcriptome read-out, bioRxiv, 2016-10-28
Abstract: CRISPR-based genetic screens have revolutionized the search for new gene functions and biological mechanisms. However, widely used pooled screens are limited to simple read-outs of cell proliferation or the production of a selectable marker protein. Arrayed screens allow for more complex molecular read-outs such as transcriptome profiling, but they provide much lower throughput. Here we demonstrate CRISPR genome editing together with single-cell RNA sequencing as a new screening paradigm that combines key advantages of pooled and arrayed screens. This approach allowed us to link guide-RNA expression to the associated transcriptome responses in thousands of single cells using a straightforward and broadly applicable screening workflow.
biorxiv genomics 0-100-users 2016
The successor representation in human reinforcement learning, bioRxiv, 2016-10-28
Abstract: Theories of reward learning in neuroscience have focused on two families of algorithms, thought to capture deliberative vs. habitual choice. “Model-based” algorithms compute the value of candidate actions from scratch, whereas “model-free” algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an intermediate algorithmic family, the successor representation (SR), which balances flexibility and efficiency by storing partially computed action values: predictions about future events. These pre-computation strategies differ in how they update their choices following changes in a task. SR’s reliance on stored predictions about future states predicts a unique signature of insensitivity to changes in the task’s sequence of events, but flexible adjustment following changes to rewards. We provide evidence for such differential sensitivity in two behavioral studies with humans. These results suggest that the SR is a computational substrate for semi-flexible choice in humans, introducing a subtler, more cognitive notion of habit.
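The differential sensitivity described in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook form of the SR for a fixed policy, M = (I - γT)^-1 with state values V = M r; the three-state chain, the discount factor, and all function and variable names are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def successor_matrix(T, gamma):
    """Expected discounted future state occupancies under transition matrix T."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)


gamma = 0.5  # hypothetical discount factor

# Hypothetical chain 0 -> 1 -> 2, where state 2 is terminal (no outgoing moves).
T_old = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0]])
M_stored = successor_matrix(T_old, gamma)  # learned once, then cached

# Reward change: values update immediately, because V = M r reuses the cached M.
r_old = np.array([0.0, 0.0, 1.0])
r_new = np.array([0.0, 1.0, 0.0])
v_before = M_stored @ r_old
v_after_reward_change = M_stored @ r_new

# Transition change: state 0 now leads straight to state 2.
T_new = np.array([[0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
v_stale = M_stored @ r_old                         # cached SR ignores the change
v_true = successor_matrix(T_new, gamma) @ r_old    # what a full re-plan would give

print("start-state value, old reward:       ", v_before[0])
print("start-state value, new reward:       ", v_after_reward_change[0])
print("after transition change (stale SR):  ", v_stale[0])
print("after transition change (recomputed):", v_true[0])
```

In this toy setting the cached matrix adapts at once when rewards move but keeps returning the outdated value after the sequence of events changes, until M itself is relearned.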
biorxiv neuroscience 0-100-users 2016
FIDDLE: An integrative deep learning framework for functional genomic data inference, bioRxiv, 2016-10-19
Abstract: Numerous advances in sequencing technologies have revolutionized genomics by generating many types of functional genomic data. Statistical tools have been developed to analyze individual data types, but strategies to integrate disparate datasets under a unified framework are lacking. Moreover, most analysis techniques rely heavily on feature selection and data preprocessing, which increases the difficulty of addressing biological questions through the integration of multiple datasets. Here, we introduce FIDDLE (Flexible Integration of Data with Deep LEarning), an open-source, data-agnostic, flexible integrative framework that learns a unified representation from multiple data types to infer another data type. As a case study, we use multiple Saccharomyces cerevisiae genomic datasets to predict global transcription start sites (TSS) through the simulation of TSS-seq data. We demonstrate that one type of data can be inferred from other data types without manually specifying the relevant features and preprocessing. We show that models built from multiple genome-wide datasets perform profoundly better than models built from individual datasets. Thus FIDDLE learns the complex synergistic relationship within individual datasets and, importantly, across datasets.
biorxiv bioinformatics 0-100-users 2016
The hidden elasticity of avian and mammalian genomes, bioRxiv, 2016-10-17
Abstract: Genome size in mammals and birds shows remarkably little interspecific variation compared to other taxa. Yet, genome sequencing has revealed that many mammal and bird lineages have experienced differential rates of transposable element (TE) accumulation, which would be predicted to cause substantial variation in genome size between species. Thus, we hypothesize that there has been co-variation between the amount of DNA gained by transposition and lost by deletion during mammalian and avian evolution, resulting in genome size homeostasis. To test this model, we develop a computational pipeline to quantify the amount of DNA gained by TE expansion and lost by deletion over the last 100 million years (My) in the lineages of 10 species of eutherian mammals and 24 species of birds. The results reveal extensive variation in the amount of DNA gained via lineage-specific transposition, but also that DNA loss counteracted this expansion to varying extents across lineages. Our analysis of the rate and size spectrum of deletion events implies that DNA removal in both mammals and birds has proceeded mostly through large segmental deletions (>10 kb). These findings support a unified ‘accordion’ model of genome size evolution in eukaryotes whereby DNA loss counteracting TE expansion is a major determinant of genome size. Furthermore, we propose that extensive DNA loss, and not necessarily a dearth of TE activity, has been the primary force maintaining the greater genomic compaction of flying birds and bats relative to their flightless relatives.
biorxiv evolutionary-biology 0-100-users 2016
Local genetic effects on gene expression across 44 human tissues, bioRxiv, 2016-09-10
Abstract: Expression quantitative trait locus (eQTL) mapping provides a powerful means to identify functional variants influencing gene expression and disease pathogenesis. We report the identification of cis-eQTLs from 7,051 post-mortem samples representing 44 tissues and 449 individuals as part of the Genotype-Tissue Expression (GTEx) project. We find a cis-eQTL for 88% of all annotated protein-coding genes, with one-third having multiple independent effects. We identify numerous tissue-specific cis-eQTLs, highlighting the unique functional impact of regulatory variation in diverse tissues. By integrating large-scale functional genomics data and state-of-the-art fine-mapping algorithms, we identify multiple features predictive of tissue-specific and shared regulatory effects. We improve estimates of cis-eQTL sharing and effect sizes using allele specific expression across tissues. Finally, we demonstrate the utility of this large compendium of cis-eQTLs for understanding the tissue-specific etiology of complex traits, including coronary artery disease. The GTEx project provides an exceptional resource that has improved our understanding of gene regulation across tissues and the role of regulatory variation in human genetic diseases.
biorxiv genomics 0-100-users 2016
Genes mirror migrations and cultures in prehistoric Europe – a population genomic perspective, bioRxiv, 2016-09-02
Abstract: Genomic information from ancient human remains is beginning to show its full potential for learning about human prehistory. We review the last few years' dramatic finds about European prehistory based on genomic data from humans who lived many millennia ago and relate them to modern-day patterns of genomic variation. The earliest period, the Upper Palaeolithic, appears to contain several population turn-overs, followed by more stable populations after the Last Glacial Maximum and during the Mesolithic. Some 11,000 years ago, the migrations driving the Neolithic transition start from around Anatolia and reach the north and the west of Europe millennia later, followed by major migrations during the Bronze Age. These findings show that culture and lifestyle, rather than geography as is the case today, were major determinants of genomic differentiation and similarity in prehistoric Europe.
biorxiv evolutionary-biology 0-100-users 2016