Whole-genome deep learning analysis reveals causal role of noncoding mutations in autism, bioRxiv, 2018-05-11
Abstract: We address the challenge of detecting the contribution of noncoding mutations to disease with a deep-learning-based framework that predicts specific regulatory effects and deleterious disease impact of genetic variants. Applying this framework to 1,790 Autism Spectrum Disorder (ASD) simplex families reveals autism disease causality of noncoding mutations by demonstrating that ASD probands harbor transcriptional (TRDs) and post-transcriptional (RRDs) regulation-disrupting mutations of significantly higher functional impact than unaffected siblings. Importantly, we detect this significant noncoding contribution at each level, transcriptional and post-transcriptional, independently and after multiple hypothesis correction. Further analysis suggests involvement of noncoding mutations in synaptic transmission and neuronal development, and reveals a convergent genetic landscape of coding and noncoding (TRD and RRD) de novo mutations in ASD. We demonstrate that sequences carrying prioritized proband de novo mutations possess transcriptional regulatory activity and drive expression differentially, and highlight a link between noncoding mutations and IQ heterogeneity in ASD probands. Our predictive genomics framework illuminates the role of noncoding mutations in ASD, prioritizes high impact transcriptional and post-transcriptional regulatory mutations for further study, and is broadly applicable to complex human diseases.
biorxiv genomics 100-200-users 2018
Massive single-cell RNA-seq analysis and imputation via deep learning, bioRxiv, 2018-05-06
Recent advances in large-scale single cell RNA-seq enable fine-grained characterization of phenotypically distinct cellular states within heterogeneous tissues. We present scScope, a scalable deep-learning based approach that can accurately and rapidly identify cell-type composition from millions of noisy single-cell gene-expression profiles.
biorxiv bioinformatics 0-100-users 2018
Highly Multiplexed Single-Cell RNA-seq for Defining Cell Population and Transcriptional Spaces, bioRxiv, 2018-05-05
Abstract: We describe a universal sample multiplexing method for single-cell RNA-seq in which cells are chemically labeled with identifying DNA oligonucleotides. Analysis of a 96-plex perturbation experiment revealed changes in cell population structure and transcriptional states that cannot be discerned from bulk measurements, establishing a cost-effective means to survey cell populations from large experiments and clinical samples with the depth and resolution of single-cell RNA-seq.
biorxiv genomics 200-500-users 2018
Portraits of genetic intra-tumour heterogeneity and subclonal selection across cancer types, bioRxiv, 2018-05-05
Summary: Ongoing cancer evolution gives rise to intra-tumour heterogeneity (ITH), which is a major mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin and drivers of ITH across cancer types are poorly understood. Here, we extensively characterise ITH across 2,778 cancer whole genome sequences from 36 cancer types. We demonstrate that nearly all tumours (94.7%) with sufficient sequencing depth contain evidence of recent subclonal expansions, and that most cancer types show clear signs of positive selection in both clonal and subclonal protein coding variants. We find distinctive subclonal patterns of driver gene mutations, fusions, structural variation and copy-number alterations across cancer types. Dynamic, tumour type-specific changes of mutational processes between subclonal expansions shape differences between clonal and subclonal events. Our results underline the importance of ITH and its drivers in tumour evolution, and provide an unprecedented pan-cancer resource of extensively annotated subclonal events, laying a foundation for future cancer genomic studies.
biorxiv cancer-biology 100-200-users 2018
CRISPR-Cas9 interference in cassava linked to the evolution of editing-resistant geminiviruses, bioRxiv, 2018-05-04
Abstract: We used CRISPR-Cas9 in the staple food crop cassava with the aim of engineering resistance to African cassava mosaic virus, a member of a widespread and important family of plant-pathogenic DNA viruses. We found that between 33 and 48% of edited virus genomes evolved a conserved single-nucleotide mutation that confers resistance to CRISPR-Cas9 cleavage. Our study highlights the potential for virus escape from this technology. Care should be taken to design CRISPR-Cas9 experiments that minimize the risk of virus escape.
biorxiv plant-biology 100-200-users 2018
crisprQTL mapping as a genome-wide association framework for cellular genetic screens, bioRxiv, 2018-05-04
Abstract: Expression quantitative trait locus (eQTL) and genome-wide association studies (GWAS) are powerful paradigms for mapping the determinants of gene expression and organismal phenotypes, respectively. However, eQTL mapping and GWAS are limited in scope (to naturally occurring, common genetic variants) and resolution (by linkage disequilibrium). Here, we present crisprQTL mapping, a framework in which large numbers of CRISPR-Cas9 perturbations are introduced to each cell on an isogenic background, followed by single-cell RNA-seq (scRNA-seq). crisprQTL mapping is analogous to conventional human eQTL studies, but with individual humans replaced by individual cells; genetic variants replaced by unique combinations of ‘unlinked’ guide RNA (gRNA)-programmed perturbations per cell; and tissue-level RNA-seq of many individuals replaced by scRNA-seq of many cells. By randomly introducing gRNAs, a single population of cells can be leveraged to test for association between each perturbation and the expression of any potential target gene, analogous to how eQTL studies leverage populations of humans to test millions of genetic variants for associations with expression in a genome-wide manner. However, crisprQTL mapping is neither limited to naturally occurring, common genetic variants nor by linkage disequilibrium. As a proof-of-concept, we applied crisprQTL mapping to evaluate 1,119 candidate enhancers with no strong a priori hypothesis as to their target gene(s). Perturbations were made by a nuclease-dead Cas9 (dCas9) tethered to KRAB, and introduced at a mean ‘allele frequency’ of 1.1% into a population of 47,650 profiled human K562 cells (median of 15 gRNAs identified per cell). We tested for differential expression of all genes within 1 megabase (Mb) of each candidate enhancer, effectively evaluating 17,584 potential enhancer-target gene relationships within a single experiment.
At an empirical false discovery rate (FDR) of 10%, we identify 128 cis crisprQTLs (11%) whose targeting resulted in downregulation of 105 nearby genes. crisprQTLs were strongly enriched for proximity to their target genes (median 34.3 kilobases (Kb)) and the strength of H3K27ac, p300, and lineage-specific transcription factor (TF) ChIP-seq peaks. Our results establish the power of the eQTL mapping paradigm as applied to programmed variation in populations of cells, rather than natural variation in populations of individuals. We anticipate that crisprQTL mapping will facilitate the comprehensive elucidation of the cis-regulatory architecture of the human genome.
biorxiv genomics 200-500-users 2018
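The crisprQTL abstract above describes two steps that lend themselves to a small illustration: enumerating candidate enhancer-gene pairs within 1 Mb, and testing whether cells carrying a perturbation express a nearby gene at a lower level than cells without it. The following is a minimal sketch of that idea only. The function names, the toy coordinates, and the choice of a one-sided permutation test are assumptions made for illustration; the abstract does not specify the statistical test the authors used.

```python
import random
from statistics import mean


def pairs_within_window(enhancers, genes, window=1_000_000):
    """Yield (enhancer_id, gene_id) pairs on the same chromosome
    whose positions differ by at most `window` base pairs.

    `enhancers` and `genes` map an id to (chromosome, position);
    this layout is a hypothetical simplification.
    """
    for e_id, (e_chrom, e_pos) in enhancers.items():
        for g_id, (g_chrom, g_pos) in genes.items():
            if e_chrom == g_chrom and abs(e_pos - g_pos) <= window:
                yield e_id, g_id


def permutation_pvalue(expr, has_grna, n_perm=2000, seed=0):
    """One-sided permutation test for lower mean expression in
    perturbed cells than in unperturbed cells.

    Requires at least one perturbed and one unperturbed cell.
    This is a stand-in test, not the method from the preprint.
    """
    rng = random.Random(seed)
    perturbed = [x for x, h in zip(expr, has_grna) if h]
    control = [x for x, h in zip(expr, has_grna) if not h]
    observed = mean(perturbed) - mean(control)
    k = len(perturbed)
    pooled = list(expr)
    hits = 0
    for _ in range(n_perm):
        # Shuffle cell labels by shuffling the pooled expression values.
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration, not from the study).
enhancers = {"e1": ("chr1", 100_000), "e2": ("chr2", 5_000_000)}
genes = {
    "gA": ("chr1", 600_000),
    "gB": ("chr1", 2_000_000),
    "gC": ("chr2", 5_400_000),
}
candidate_pairs = list(pairs_within_window(enhancers, genes))

# Expression of one gene across ten cells; the first four carry the gRNA.
expr = [1, 0, 1, 0, 9, 10, 11, 10, 9, 12]
has_grna = [True, True, True, True, False, False, False, False, False, False]
p_value = permutation_pvalue(expr, has_grna)
```

In a real screen each cell carries many gRNAs and there are thousands of pairs, so the per-pair p-values would still need a multiple-testing step such as the empirical FDR the abstract mentions; that step is not shown here.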