Dynamic design manipulation of millisecond timescale motions on the energy landscape of Cyclophilin A, bioRxiv, 2018-12-09
Abstract
Proteins need to interconvert between many conformations in order to function, many of which are formed transiently and are sparsely populated. Particularly when the lifetimes of these states approach the millisecond timescale, identifying the relevant structures and the mechanism by which they interconvert remains a tremendous challenge. Here we introduce a novel combination of accelerated MD (aMD) simulations and Markov State modelling (MSM) to explore these ‘excited’ conformational states. Applying this to the highly dynamic protein CypA, a protein involved in immune response and associated with HIV infection, we identify five principally populated conformational states and the atomistic mechanism by which they interconvert. A rational design strategy predicted that the mutant D66A should stabilise the minor conformations and substantially alter the dynamics, whereas the similar mutant H70A should leave the landscape broadly unchanged. These predictions are confirmed using CPMG and R1ρ solution-state NMR measurements. By accurately and reliably exploring functionally relevant, but sparsely populated, conformations with millisecond lifetimes in silico, our aMD/MSM method has tremendous promise for the design of dynamic protein free energy landscapes for both protein engineering and drug discovery.
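As a rough illustration of the Markov State modelling idea mentioned in the abstract above, the sketch below estimates a transition matrix from a discretised trajectory and derives the equilibrium state populations from it. It is a minimal toy example, not the authors' aMD/MSM pipeline: the two-state trajectory, the function names, and the lag of one frame are all assumptions made for illustration, and no reweighting of accelerated MD frames is attempted.

```python
def estimate_transition_matrix(traj, n_states, lag=1):
    """Row-normalised transition counts between discrete states at a given lag."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(traj[:-lag], traj[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no outgoing transitions gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def stationary_distribution(T, n_iter=1000):
    """Equilibrium populations by repeatedly applying p <- p T (power iteration)."""
    n = len(T)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        p = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
    return p


# Hypothetical trajectory: 0 = major state, 1 = sparsely populated minor state.
traj = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

T = estimate_transition_matrix(traj, n_states=2)
populations = stationary_distribution(T)
print("transition matrix:", T)
print("equilibrium populations:", populations)
```

In a real analysis the states would come from clustering simulation frames, and the lag time would be chosen by checking that the implied timescales have converged.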
biorxiv biophysics 0-100-users 2018
Chemogenetic ligands for translational neurotheranostics, bioRxiv, 2018-12-08
Abstract
Designer Receptors Exclusively Activated by Designer Drugs (DREADDs) are a popular chemogenetic technology for manipulation of neuronal activity in uninstrumented awake animals, with potential for precision medicine-based clinical theranostics. DREADD ligands developed to date are not appropriate for such translational applications. The prototypical DREADD agonist clozapine N-oxide (CNO) lacks brain entry and converts to clozapine. The second-generation DREADD agonist, Compound 21 (C21), was developed to overcome these limitations. We found that C21 has low brain penetrance, weak affinity, and low in vivo DREADD occupancy. To address these drawbacks, we developed two new DREADD agonists, JHU37152 and JHU37160, and the first dedicated positron emission tomography (PET) DREADD radiotracer, [18F]JHU37107. JHU37152 and JHU37160 exhibit high in vivo DREADD potency. [18F]JHU37107 combined with PET allows for DREADD detection in locally targeted neurons and at their long-range projections, enabling, for the first time, noninvasive and longitudinal neuronal projection mapping, with potential for neurotheranostic applications.
biorxiv neuroscience 0-100-users 2018
Fast and accurate large multiple sequence alignments using root-to-leave regressive computation, bioRxiv, 2018-12-08
Abstract
Inferences derived from large multiple alignments of biological sequences are critical to many areas of biology, including evolution, genomics, biochemistry, and structural biology. However, the complexity of the alignment problem imposes the use of approximate solutions. The most common is the progressive algorithm, which starts by aligning the most similar sequences, incorporating the remaining ones following the order imposed by a guide-tree. We developed and validated on protein sequences a regressive algorithm that works the other way around, aligning first the most dissimilar sequences. Our algorithm produces more accurate alignments than non-regressive methods, especially on datasets larger than 10,000 sequences. By design, it can run any existing alignment method in linear time, thus allowing the scale-up required for extremely large genomic analyses.
One Sentence Summary
Initiating alignments with the most dissimilar sequences allows slow and accurate methods to be used on large datasets.
biorxiv bioinformatics 200-500-users 2018
Models of archaic admixture and recent history from two-locus statistics, bioRxiv, 2018-12-08
Abstract
We learn about population history and underlying evolutionary biology through patterns of genetic polymorphism. Many approaches to reconstruct evolutionary histories focus on a limited number of informative statistics describing distributions of allele frequencies or patterns of linkage disequilibrium. We show that many commonly used statistics are part of a broad family of two-locus moments whose expectation can be computed jointly and rapidly under a wide range of scenarios, including complex multi-population demographies with continuous migration and admixture events. A full inspection of these statistics reveals that widely used models of human history fail to predict simple patterns of linkage disequilibrium. To jointly capture the information contained in classical and novel statistics, we implemented a tractable likelihood-based inference framework for demographic history. Using this approach, we show that human evolutionary models that include archaic admixture in Africa, Asia, and Europe provide a much better description of patterns of genetic diversity across the human genome. We estimate that an unidentified, deeply diverged population admixed with modern humans within Africa both before and after the split of African and Eurasian populations, contributing 4-8% genetic ancestry to individuals in world-wide populations.
Author Summary
Throughout human history, populations have expanded and contracted, split and merged, and exchanged migrants. Because these events affected genetic diversity, we can learn about human history by comparing predictions from evolutionary models to genetic data. Here, we show how to rapidly compute such predictions for a wide range of diversity measures within and across populations under complex demographic scenarios. While widely used models of human history accurately predict common measures of diversity, we show that they strongly underestimate the co-occurrence of low frequency mutations within human populations in Asia, Europe, and Africa. Models allowing for archaic admixture, the relatively recent mixing of human populations with deeply diverged human lineages, resolve this discrepancy. We use such models to infer demographic models that include both recent and ancient features of human history. We recover the well-characterized admixture of Neanderthals in Eurasian populations, as well as admixture from an as-yet unknown diverged human population within Africa, further suggesting that admixture with deeply diverged lineages occurred multiple times in human history. By simultaneously testing model predictions for a broad range of diversity statistics, we can assess the robustness of common evolutionary models, identify missing historical events, and build more informed models of human demography.
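For readers unfamiliar with linkage disequilibrium, the following sketch computes two classical two-locus statistics, the covariance D and the squared correlation r², from a small set of haplotypes. It is only meant to show what kind of quantity a two-locus statistic is. It is not the authors' inference framework, and the haplotype sample and function name are made up for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r2) for biallelic loci A and B.

    Each haplotype is a tuple (a, b), with 1 = derived allele and 0 = ancestral.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n  # derived-allele frequency at locus A
    p_b = sum(h[1] for h in haplotypes) / n  # derived-allele frequency at locus B
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # both derived together
    d = p_ab - p_a * p_b  # covariance between the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either locus is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical sample of ten haplotypes.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0), (0, 1)]
d, r2 = linkage_disequilibrium(sample)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

Here D is positive because the derived alleles at the two loci occur on the same haplotype more often than their individual frequencies would predict.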
biorxiv genetics 100-200-users 2018
The consensus molecular classification of muscle-invasive bladder cancer, bioRxiv, 2018-12-08
Abstract
Muscle-Invasive Bladder Cancer (MIBC) is a molecularly diverse disease with heterogeneous clinical outcomes. Several molecular classifications have been proposed, yielding diverse sets of subtypes, which hampers the clinical application of such knowledge. Here, we report the results of a large international effort to reach a consensus on MIBC molecular subtypes. Using 1750 MIBC transcriptomes and a network-based analysis of six independent MIBC classification systems, we identified a consensus set of six molecular classes: Luminal Papillary (24%), Luminal Non-Specified (8%), Luminal Unstable (15%), Stroma-rich (15%), Basal/Squamous (35%), and Neuroendocrine-like (3%). These consensus classes differ regarding underlying oncogenic mechanisms, infiltration by immune and stromal cells, and histological and clinical characteristics. This consensus system offers a robust framework that will enable testing and validating predictive biomarkers in future clinical trials.
biorxiv cancer-biology 100-200-users 2018
The genomic and proteomic landscape of the rumen microbiome revealed by comprehensive genome-resolved metagenomics, bioRxiv, 2018-12-08
Abstract
Ruminants provide essential nutrition for billions of people worldwide. The rumen is a specialised stomach adapted to the breakdown of plant-derived complex polysaccharides, and collectively the rumen microbiota encode the thousands of enzymes responsible. Here we present a comprehensive analysis of over 6.5 terabytes of Illumina and Nanopore sequence data, including assembly of 4941 metagenome-assembled genomes, and several single-contig, whole-chromosome assemblies of novel rumen bacteria. We also present the largest dataset of predicted proteins from the rumen, and provide rich annotation against public datasets. Together these data will form an essential part of future studies of rumen microbiome structure and function.
biorxiv microbiology 100-200-users 2018