Evaluating potential drug targets through human loss-of-function genetic variation, bioRxiv, 2019-01-29
Abstract: Human genetics has informed the clinical development of new drugs, and is beginning to influence the selection of new drug targets. Large-scale DNA sequencing studies have created a catalogue of naturally occurring genetic variants predicted to cause loss of function in human genes, which in principle should provide powerful in vivo models of human genetic “knockouts” to complement model organism knockout studies and inform drug development. Here, we consider the use of predicted loss-of-function (pLoF) variation catalogued in the Genome Aggregation Database (gnomAD) for the evaluation of genes as potential drug targets. Many drug targets, including the targets of highly successful inhibitors such as aspirin and statins, are under natural selection at least as extreme as known haploinsufficient genes, with pLoF variants almost completely depleted from the population. Thus, metrics of gene essentiality should not be used to eliminate genes from consideration as potential targets. The identification of individual humans harboring “knockouts” (biallelic gene inactivation), followed by individual recall and deep phenotyping, is highly valuable to study gene function. In most genes, pLoF alleles are sufficiently rare that ascertainment will be largely limited to heterozygous individuals in outbred populations. Sampling of diverse bottlenecked populations and consanguineous individuals will aid in identification of total “knockouts”. Careful filtering and curation of pLoF variants in a gene of interest is necessary to identify true LoF individuals for follow-up, and the positional distribution or frequency of true LoF variants may reveal important disease biology. Our analysis suggests that the value of pLoF variant data for drug discovery lies in deep curation informed by the nature of the drug and its indication, as well as the biology of the gene, followed by recall-by-genotype studies in targeted populations.
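The point about heterozygotes versus complete “knockouts” can be illustrated with standard Hardy-Weinberg arithmetic. The sketch below is not taken from the preprint: the function name, the allele frequency, the cohort size, and the inbreeding coefficient are all hypothetical values chosen for illustration, and it assumes a single cumulative pLoF allele frequency per gene.

```python
def expected_genotype_counts(p, n, f=0.0):
    """Expected numbers of heterozygous and biallelic pLoF carriers.

    p: cumulative pLoF allele frequency for one gene (hypothetical input)
    n: number of individuals sampled
    f: inbreeding coefficient (0 for an outbred population; 1/16 for
       offspring of first cousins)
    Uses Hardy-Weinberg proportions adjusted for inbreeding:
      P(het) = 2pq(1 - f),  P(biallelic) = p^2 + f*p*q
    """
    q = 1.0 - p
    het = 2.0 * p * q * (1.0 - f)
    biallelic = p * p + f * p * q
    return n * het, n * biallelic


# Illustrative values only: allele frequency 0.1%, 100,000 individuals.
het_outbred, ko_outbred = expected_genotype_counts(0.001, 100_000)
het_consang, ko_consang = expected_genotype_counts(0.001, 100_000, f=1 / 16)
print(het_outbred, ko_outbred, ko_consang)
```

With these made-up inputs, an outbred cohort is expected to contain roughly 200 heterozygotes but only about 0.1 biallelic individuals, while the same cohort with first-cousin parents is expected to contain about 6, which is the arithmetic behind sampling consanguineous individuals.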
biorxiv genomics 100-200-users 2019
Facilitating open-science with realistic fMRI simulation: validation and application, bioRxiv, 2019-01-29
Background: With advances in methods for collecting and analyzing fMRI data, there is a concurrent need to understand how to reliably evaluate and optimally use these methods. Simulations of fMRI data can aid in both the evaluation of complex designs and the analysis of data. New Method: We present fmrisim, a new Python package for standardized, realistic simulation of fMRI data. This package is part of BrainIAK, a recently released open-source Python toolbox for advanced neuroimaging analyses. We describe how to use fmrisim to extract noise properties from real fMRI data and then create a synthetic dataset with matched noise properties and a user-specified signal. Results: We validate the noise generated by fmrisim to show that it can approximate the noise properties of real data. We further show how fmrisim can help researchers find the optimal design in terms of power. Comparison with other methods: fmrisim ports the functionality of other packages to the Python platform while extending what is available in order to make it seamless to simulate realistic fMRI data. Conclusions: The fmrisim package holds promise for improving the design of fMRI experiments, which may facilitate both the pre-registration
biorxiv neuroscience 0-100-users 2019
Regular cycling between representations of alternatives in the hippocampus, bioRxiv, 2019-01-29
Cognitive faculties such as imagination, planning, and decision-making require the ability to represent alternative scenarios. In animals, split-second decision-making implies that the brain can represent alternatives at a commensurate speed. Yet despite this insight, it has remained unknown whether there exists neural activity that can consistently represent alternatives in <1 s. Here we report that neural activity in the hippocampus, a brain structure vital to cognition, can regularly cycle between representations of alternative locations (bifurcating paths in a maze) at 8 Hz. This cycling dynamic was paced by the internally generated 8 Hz theta rhythm, often occurred in the absence of overt deliberative behavior, and unexpectedly also governed an additional hippocampal representation defined by alternatives (heading direction). These findings implicate a fast, regular, and generalized neural mechanism underlying the representation of competing possibilities.
biorxiv neuroscience 100-200-users 2019
Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes Supplementary Information, bioRxiv, 2019-01-29
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism's function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved model of human mutation, we classify human protein-coding genes along a spectrum representing intolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.
biorxiv genomics 500+-users 2019
Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes, bioRxiv, 2019-01-29
Summary: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved human mutation rate model, we classify human protein-coding genes along a spectrum representing tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.
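One simple way to picture a "spectrum of tolerance to inactivation" is to compare the number of pLoF variants observed in a gene with the number a mutation model would predict. The sketch below is a hedged illustration of that idea only, not the authors' pipeline: the gene names, counts, and function names are hypothetical, and it omits the filtering and uncertainty estimates a real analysis would need.

```python
def oe_ratio(observed, expected):
    """Observed/expected pLoF ratio; values near 0 indicate strong depletion."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def rank_by_depletion(counts):
    """Order genes from most to least depleted of pLoF variants.

    counts: dict mapping gene name -> (observed, expected) pLoF counts,
            where 'expected' would come from a mutation rate model.
    """
    return sorted(counts, key=lambda gene: oe_ratio(*counts[gene]))


# Hypothetical toy data; these are not real genes or gnomAD counts.
toy_counts = {
    "GENE_A": (1, 40.0),   # far fewer pLoF variants than predicted
    "GENE_B": (18, 20.0),  # close to the predicted count
    "GENE_C": (6, 30.0),
}
ranking = rank_by_depletion(toy_counts)
print(ranking)
```

Genes at the front of the ranking carry far fewer pLoF variants than predicted and so sit toward the intolerant end of the spectrum; genes near a ratio of 1 appear to tolerate inactivating variants.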
biorxiv genomics 500+-users 2019
Activity-by-Contact model of enhancer specificity from thousands of CRISPR perturbations, bioRxiv, 2019-01-27
Mammalian genomes harbor millions of noncoding elements called enhancers that quantitatively regulate gene expression, but it remains unclear which enhancers regulate which genes. Here we describe an experimental approach, based on CRISPR interference, RNA FISH, and flow cytometry (CRISPRi-FlowFISH), to perturb enhancers in the genome, and apply it to test >3,000 potential regulatory enhancer-gene connections across multiple genomic loci. A simple equation based on a mechanistic model for enhancer function performed remarkably well at predicting the complex patterns of regulatory connections we observe in our CRISPR dataset. This Activity-by-Contact (ABC) model involves multiplying measures of enhancer activity and enhancer-promoter 3D contacts, and can predict enhancer-gene connections in a given cell type based on chromatin state maps. Together, CRISPRi-FlowFISH and the ABC model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
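The multiplication at the heart of the ABC model can be sketched in a few lines. The abstract only states that activity and contact are multiplied; the normalization used here (dividing each product by the sum of products over all candidate elements near the gene) is an assumption of this sketch, and the function name and input values are hypothetical.

```python
def abc_scores(activity, contact):
    """Activity-by-Contact style scores for candidate elements near one gene.

    activity: per-element enhancer activity measures (hypothetical units)
    contact:  per-element 3D contact frequency with the gene's promoter
    Each score is activity * contact for that element, divided by the sum of
    activity * contact over all candidate elements (assumed normalization).
    """
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have the same length")
    products = [a * c for a, c in zip(activity, contact)]
    total = sum(products)
    if total == 0:
        return [0.0] * len(products)
    return [p / total for p in products]


# Made-up values for three candidate elements around a single gene.
scores = abc_scores(activity=[10.0, 5.0, 1.0], contact=[0.1, 0.4, 1.0])
print(scores)
```

In this toy case the second element gets the highest score even though it is not the most active, because it contacts the promoter more often than the first; that interplay is what the multiplication is meant to capture.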
biorxiv genetics 200-500-users 2019