crisprQTL mapping as a genome-wide association framework for cellular genetic screens, bioRxiv, 2018-05-04
Expression quantitative trait locus (eQTL) and genome-wide association studies (GWAS) are powerful paradigms for mapping the determinants of gene expression and organismal phenotypes, respectively. However, eQTL mapping and GWAS are limited in scope (to naturally occurring, common genetic variants) and resolution (by linkage disequilibrium). Here, we present crisprQTL mapping, a framework in which large numbers of CRISPR/Cas9 perturbations are introduced to each cell on an isogenic background, followed by single-cell RNA-seq (scRNA-seq). crisprQTL mapping is analogous to conventional human eQTL studies, but with individual humans replaced by individual cells; genetic variants replaced by unique combinations of ‘unlinked’ guide RNA (gRNA)-programmed perturbations per cell; and tissue-level RNA-seq of many individuals replaced by scRNA-seq of many cells. By randomly introducing gRNAs, a single population of cells can be leveraged to test for association between each perturbation and the expression of any potential target gene, analogous to how eQTL studies leverage populations of humans to test millions of genetic variants for associations with expression in a genome-wide manner. However, crisprQTL mapping is neither limited to naturally occurring, common genetic variants nor by linkage disequilibrium. As a proof-of-concept, we applied crisprQTL mapping to evaluate 1,119 candidate enhancers with no strong a priori hypothesis as to their target gene(s). Perturbations were made by a nuclease-dead Cas9 (dCas9) tethered to KRAB, and introduced at a mean ‘allele frequency’ of 1.1% into a population of 47,650 profiled human K562 cells (median of 15 gRNAs identified per cell). We tested for differential expression of all genes within 1 megabase (Mb) of each candidate enhancer, effectively evaluating 17,584 potential enhancer-target gene relationships within a single experiment.
At an empirical false discovery rate (FDR) of 10%, we identify 128 cis crisprQTLs (11%) whose targeting resulted in downregulation of 105 nearby genes. crisprQTLs were strongly enriched for proximity to their target genes (median 34.3 kilobases (Kb)) and the strength of H3K27ac, p300, and lineage-specific transcription factor (TF) ChIP-seq peaks. Our results establish the power of the eQTL mapping paradigm as applied to programmed variation in populations of cells, rather than natural variation in populations of individuals. We anticipate that crisprQTL mapping will facilitate the comprehensive elucidation of the cis-regulatory architecture of the human genome.
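The testing scheme described above (pair each candidate enhancer with every gene within 1 Mb, then compare expression between cells that carry a gRNA against that enhancer and cells that do not) can be illustrated with a minimal Python sketch. The coordinates, gene names, counts, and the functions `candidate_pairs` and `mean_expression_shift` are hypothetical. The abstract does not specify the statistical test used, so the plain difference in means below is only a stand-in for a proper differential expression test.

```python
from statistics import mean

WINDOW_BP = 1_000_000  # 1 Mb window, as stated in the abstract


def candidate_pairs(enhancers, genes, window=WINDOW_BP):
    """Return (enhancer, gene) pairs on the same chromosome within `window` bp."""
    pairs = []
    for enh_name, (enh_chrom, enh_pos) in enhancers.items():
        for gene_name, (gene_chrom, gene_tss) in genes.items():
            if enh_chrom == gene_chrom and abs(enh_pos - gene_tss) <= window:
                pairs.append((enh_name, gene_name))
    return pairs


def mean_expression_shift(cells, enhancer, gene):
    """Mean expression of `gene` in cells carrying a gRNA against `enhancer`
    minus the mean in all other cells (negative values suggest downregulation)."""
    carriers = [c["expr"][gene] for c in cells if enhancer in c["grnas"]]
    others = [c["expr"][gene] for c in cells if enhancer not in c["grnas"]]
    return mean(carriers) - mean(others)


# Toy inputs (hypothetical): name -> (chromosome, position in bp)
enhancers = {"enh1": ("chr1", 1_500_000), "enh2": ("chr2", 400_000)}
genes = {
    "GENE_A": ("chr1", 1_534_300),
    "GENE_B": ("chr1", 3_000_000),
    "GENE_C": ("chr2", 900_000),
}

# Each cell carries a random set of gRNAs and has per-gene counts.
cells = [
    {"grnas": {"enh1"}, "expr": {"GENE_A": 1, "GENE_B": 4, "GENE_C": 5}},
    {"grnas": {"enh1", "enh2"}, "expr": {"GENE_A": 2, "GENE_B": 5, "GENE_C": 5}},
    {"grnas": {"enh2"}, "expr": {"GENE_A": 6, "GENE_B": 4, "GENE_C": 6}},
    {"grnas": set(), "expr": {"GENE_A": 7, "GENE_B": 5, "GENE_C": 4}},
]

pairs = candidate_pairs(enhancers, genes)
shifts = {pair: mean_expression_shift(cells, *pair) for pair in pairs}
```

Because every cell carries several unlinked perturbations, the same population serves as both carriers and controls for every enhancer-gene pair, which is what lets one experiment evaluate thousands of pairs.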
biorxiv genomics 200-500-users 2018
FMRIPrep: a robust preprocessing pipeline for functional MRI, bioRxiv, 2018-04-26
Preprocessing of functional MRI (fMRI) involves numerous steps to clean and standardize data before statistical analysis. Generally, researchers create ad hoc preprocessing workflows for each new dataset, building upon a large inventory of tools available for each step. The complexity of these workflows has snowballed with rapid advances in MR data acquisition and image processing techniques. We introduce fMRIPrep, an analysis-agnostic tool that addresses the challenge of robust and reproducible preprocessing for task-based and resting fMRI data. FMRIPrep automatically adapts a best-in-breed workflow to the idiosyncrasies of virtually any dataset, ensuring high-quality preprocessing with no manual intervention. By introducing visual assessment checkpoints into an iterative integration framework for software-testing, we show that fMRIPrep robustly produces high-quality results on a diverse fMRI data collection comprising participants from 54 different studies in the OpenfMRI repository. We review the distinctive features of fMRIPrep in a qualitative comparison to other preprocessing workflows. We demonstrate that fMRIPrep achieves higher spatial accuracy as it introduces less uncontrolled spatial smoothness than commonly used preprocessing tools. FMRIPrep has the potential to transform fMRI research by equipping neuroscientists with a high-quality, robust, easy-to-use and transparent preprocessing workflow which can help ensure the validity of inference and the interpretability of their results.
biorxiv bioinformatics 200-500-users 2018
Single-trial neural dynamics are dominated by richly varied movements, bioRxiv, 2018-04-25
When experts are immersed in a task, do their brains prioritize task-related activity? Most efforts to understand neural activity during well-learned tasks focus on cognitive computations and specific task-related movements. We wondered whether task-performing animals explore a broader movement landscape, and how this impacts neural activity. We characterized movements using video and other sensors and measured neural activity using widefield and two-photon imaging. Cortex-wide activity was dominated by movements, especially uninstructed movements, reflecting unknown priorities of the animal. Some uninstructed movements were aligned to trial events. Accounting for them revealed that neurons with similar trial-averaged activity often reflected utterly different combinations of cognitive and movement variables. Other movements occurred idiosyncratically, accounting for trial-by-trial fluctuations that are often considered “noise”. This held true for extracellular Neuropixels recordings in cortical and subcortical areas. Our observations argue that animals execute expert decisions while performing richly varied, uninstructed movements that profoundly shape neural activity.
biorxiv neuroscience 200-500-users 2018
Spontaneous behaviors drive multidimensional, brain-wide activity, bioRxiv, 2018-04-22
Cortical responses to sensory stimuli are highly variable, and sensory cortex exhibits intricate spontaneous activity even without external sensory input. Cortical variability and spontaneous activity have been variously proposed to represent random noise, recall of prior experience, or encoding of ongoing behavioral and cognitive variables. Here, by recording over 10,000 neurons in mouse visual cortex, we show that spontaneous activity reliably encodes a high-dimensional latent state, which is partially related to the mouse’s ongoing behavior and is represented not just in visual cortex but across the forebrain. Sensory inputs do not interrupt this ongoing signal, but add onto it a representation of visual stimuli in orthogonal dimensions. Thus, visual cortical population activity, despite its apparently noisy structure, reliably encodes an orthogonal fusion of sensory and multidimensional behavioral information.
biorxiv neuroscience 200-500-users 2018
Single cell RNA-seq denoising using a deep count autoencoder, bioRxiv, 2018-04-14
Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at a cellular resolution. However, noise due to amplification and dropout may obstruct analyses, so scalable denoising methods for increasingly large but sparse scRNA-seq data are needed. We propose a deep count autoencoder network (DCA) to denoise scRNA-seq datasets. DCA takes the count distribution, overdispersion and sparsity of the data into account using a zero-inflated negative binomial noise model, and nonlinear gene-gene or gene-dispersion interactions are captured. Our method scales linearly with the number of cells and can therefore be applied to datasets of millions of cells. We demonstrate that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets. DCA outperforms existing methods for data imputation in quality and speed, enhancing biological discovery.
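To make the zero-inflated negative binomial (ZINB) noise model concrete, here is a minimal Python sketch of its probability mass function and the corresponding negative log-likelihood. It uses the standard mean/dispersion parameterization of the negative binomial; the names `mu`, `theta`, and `pi` and all function names are illustrative assumptions, not identifiers from the DCA software, and the autoencoder that would predict these parameters per gene and cell is not shown.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu and dispersion theta (variance = mu + mu**2 / theta)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_pmf(x, mu, theta, pi):
    """ZINB probability: with probability pi the count is a forced zero
    (dropout), otherwise it is drawn from the negative binomial."""
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def zinb_nll(counts, mu, theta, pi):
    """Negative log-likelihood of observed counts under shared ZINB parameters."""
    return -sum(math.log(zinb_pmf(x, mu, theta, pi)) for x in counts)


# Toy counts with many zeros (hypothetical data)
counts = [0, 0, 0, 3, 5, 0, 2, 0, 4, 1]
nll_with_dropout = zinb_nll(counts, mu=3.0, theta=2.0, pi=0.3)
nll_without_dropout = zinb_nll(counts, mu=3.0, theta=2.0, pi=0.0)
```

Setting `pi=0.0` reduces the model to a plain negative binomial, so comparing the two likelihoods shows how the zero-inflation term accounts for excess zeros in sparse data.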
biorxiv bioinformatics 200-500-users 2018
Parameterizing neural power spectra, bioRxiv, 2018-04-11
Electrophysiological signals across species and recording scales exhibit both periodic and aperiodic features. Periodic oscillations have been widely studied and linked to numerous physiological, cognitive, behavioral, and disease states, while the aperiodic “background” 1/f component of neural power spectra has received far less attention. Most analyses of oscillations are conducted on a priori, canonically-defined frequency bands without consideration of the underlying aperiodic structure, or verification that a periodic signal even exists in addition to the aperiodic signal. This is problematic, as recent evidence shows that the aperiodic signal is dynamic, changing with age, task demands, and cognitive state. It has also been linked to the relative excitation/inhibition of the underlying neuronal population. This means that standard analytic approaches easily conflate changes in the periodic and aperiodic signals with one another because the aperiodic parameters, along with oscillation center frequency, power, and bandwidth, are all dynamic in physiologically meaningful, but likely different, ways. In order to overcome the limitations of traditional narrowband analyses and to reduce the potentially deleterious effects of conflating these features, we introduce a novel algorithm for automatic parameterization of neural power spectral densities (PSDs) as a combination of the aperiodic signal and putative periodic oscillations. Notably, this algorithm requires no a priori specification of band limits and accounts for potentially-overlapping oscillations while minimizing the degree to which they are confounded with one another. This algorithm is amenable to large-scale data exploration and analysis, providing researchers with a tool to quickly and accurately parameterize neural power spectra.
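The conflation problem described above can be illustrated with a small Python sketch. The abstract does not give the functional form of the model, so this example assumes one plausible form: log10 power is the sum of a 1/f-like aperiodic line (offset minus exponent times log10 frequency) and a Gaussian peak defined by center frequency, height, and bandwidth. All function names and parameter values are hypothetical.

```python
import math


def aperiodic_log_power(freq, offset, exponent):
    """log10 power of an assumed 1/f**exponent background."""
    return offset - exponent * math.log10(freq)


def peak_log_power(freq, center, height, width):
    """Gaussian bump in log power describing one putative oscillation."""
    return height * math.exp(-((freq - center) ** 2) / (2.0 * width ** 2))


def model_log_power(freq, offset, exponent, peak):
    """Assumed full model: aperiodic background plus one periodic peak."""
    return aperiodic_log_power(freq, offset, exponent) + peak_log_power(freq, *peak)


def band_mean(freqs, log_powers, low, high):
    """Conventional narrowband measure: mean log power within [low, high] Hz."""
    values = [p for f, p in zip(freqs, log_powers) if low <= f <= high]
    return sum(values) / len(values)


freqs = [float(f) for f in range(2, 41)]
alpha_peak = (10.0, 0.6, 1.5)  # (center Hz, height, width): identical in both spectra

# Two spectra that differ only in the aperiodic exponent.
flat = [model_log_power(f, 1.0, 1.0, alpha_peak) for f in freqs]
steep = [model_log_power(f, 1.0, 2.0, alpha_peak) for f in freqs]

# Narrowband alpha power changes although the oscillation itself did not.
band_flat = band_mean(freqs, flat, 8.0, 12.0)
band_steep = band_mean(freqs, steep, 8.0, 12.0)

# Peak height measured relative to the aperiodic background is unchanged.
height_flat = model_log_power(10.0, 1.0, 1.0, alpha_peak) - aperiodic_log_power(10.0, 1.0, 1.0)
height_steep = model_log_power(10.0, 1.0, 2.0, alpha_peak) - aperiodic_log_power(10.0, 1.0, 2.0)
```

In this toy case a fixed 8-12 Hz band reports a large power difference between the two spectra, while the peak height above the background is the same in both, which is the kind of distinction a parameterized description is meant to preserve.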
biorxiv neuroscience 200-500-users 2018