BEAST 2.5: An Advanced Software Platform for Bayesian Evolutionary Analysis, bioRxiv, 2018-11-20
Abstract: Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.
Author summary: Bayesian phylogenetic inference methods have undergone considerable development in recent years, and joint modelling of rich evolutionary data, including genomes, phenotypes and fossil occurrences is increasingly common. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing scientific software is increasingly crucial to advancement in many fields of biology. The challenges range from practical software development and engineering, distributed team coordination, conceptual development and statistical modelling, to validation and testing. BEAST 2 is one such computational software platform for phylogenetics, population genetics and phylodynamics, and was first announced over 4 years ago. Here we describe the full range of new tools and models available on the BEAST 2.5 platform, which expand joint evolutionary inference in many new directions, especially for joint inference over multiple data types, non-tree models and complex phylodynamics.
biorxiv evolutionary-biology 100-200-users 2018
Deep Neural Networks and Kernel Regression Achieve Comparable Accuracies for Functional Connectivity Prediction of Behavior and Demographics, bioRxiv, 2018-11-20
Abstract: There is significant interest in the development and application of deep neural networks (DNNs) to neuroimaging data. A growing literature suggests that DNNs outperform their classical counterparts in a variety of neuroimaging applications, yet there are few direct comparisons of relative utility. Here, we compared the performance of three DNN architectures and a classical machine learning algorithm (kernel regression) in predicting individual phenotypes from whole-brain resting-state functional connectivity (RSFC) patterns. One of the DNNs was a generic fully-connected feedforward neural network, while the other two DNNs were recently published approaches specifically designed to exploit the structure of connectome data. By using a combined sample of almost 10,000 participants from the Human Connectome Project (HCP) and UK Biobank, we showed that the three DNNs and kernel regression achieved similar performance across a wide range of behavioral and demographic measures. Furthermore, the generic feedforward neural network exhibited similar performance to the two state-of-the-art connectome-specific DNNs. When predicting fluid intelligence in the UK Biobank, performance of all algorithms dramatically improved when sample size increased from 100 to 1000 subjects. Improvement was smaller, but still significant, when sample size increased from 1000 to 5000 subjects. Importantly, kernel regression was competitive across all sample sizes. Overall, our study suggests that kernel regression is as effective as DNNs for RSFC-based behavioral prediction, while incurring significantly lower computational costs. Therefore, kernel regression might serve as a useful baseline algorithm for future studies.
biorxiv neuroscience 100-200-users 2018
Distributed correlates of visually-guided behavior across the mouse brain, bioRxiv, 2018-11-20
Behavior arises from neuronal activity, but it is not known how the active neurons are distributed across brain regions and how their activity unfolds in time. Here, we used high-density Neuropixels probes to record from ~30,000 neurons in mice performing a visual contrast discrimination task. The task activated 60% of the neurons, involving nearly all 42 recorded brain regions, well beyond the regions activated by passive visual stimulation. However, neurons selective for choice (left vs. right) were rare, and found mostly in midbrain, striatum, and frontal cortex. Those in midbrain were typically activated prior to contralateral choices and suppressed prior to ipsilateral choices, consistent with a competitive midbrain circuit for adjudicating the subject’s choice. A brain-wide state shift distinguished trials in which visual stimuli led to movement. These results reveal concurrent representations of movement and choice in neurons widely distributed across the brain.
biorxiv neuroscience 100-200-users 2018
Factors associated with sharing email information and mental health survey participation in large population cohorts, bioRxiv, 2018-11-20
Abstract: People who opt to participate in scientific studies tend to be healthier, wealthier, and more educated than the broader population. While selection bias does not always pose a problem for analysing the relationships between exposures and diseases or other outcomes, it can lead to biased effect size estimates. Biased estimates may weaken the utility of genetic findings because the goal is often to make inferences in a new sample (such as in polygenic risk score analysis). We used data from UK Biobank and Generation Scotland and conducted phenotypic and genome-wide association analyses on two phenotypes that reflected mental health data availability: (1) whether participants were contactable by email for follow-up and (2) whether participants responded to a follow-up survey of mental health. We identified nine genetic loci associated with email contact and 25 loci associated with mental health survey completion. Both phenotypes were positively genetically correlated with higher educational attainment and better health and negatively genetically correlated with psychological distress and schizophrenia. Recontact availability and follow-up participation can act as further genetic filters for data on mental health phenotypes.
biorxiv genetics 100-200-users 2018
Tracing diagnosis trajectories over millions of inpatients reveal an unexpected association between schizophrenia and rhabdomyolysis, bioRxiv, 2018-11-20
Abstract: While it has been technically feasible to create longitudinal representations of individual health at a nationwide scale, the use of these techniques to identify novel disease associations for the risk stratification of patients has had limited success. Here, we created a large-scale US longitudinal disease network of traced readmission patterns (i.e., disease trajectories), merging data from over 10.4 million inpatients from 350 California hospitals through the Healthcare Cost and Utilization Project between 1980 and 2010. We were able to create longitudinal representations of disease progression mapping over 300 common diseases, including the well-known complication of heart failure after acute myocardial infarction. Surprisingly, out of these generated disease trajectories, we discovered an unknown association between schizophrenia, a chronic mental disorder, and rhabdomyolysis, a rare disease of muscle breakdown. It was found that 92 of 3674 patients (2.5%) with schizophrenia were readmitted for rhabdomyolysis (relative risk, 2.21 [95% confidence interval, 1.80–2.71]; P-value 9.54E-15), which has a general population incidence of 1 in 10,000. We validated this association using independent electronic health records from over 830,000 patients treated over seven years at the University of California, San Francisco (UCSF) medical center. A case review of 29 patients at UCSF who were treated for schizophrenia and who went on to develop rhabdomyolysis demonstrated that the majority of cases (62%) are idiopathic, which suggests a biological connection between these two diseases. Together, these findings demonstrate the power of using public disease registries in combination with electronic medical records to discover novel disease associations.
One Sentence Summary: Based on the longitudinal health records from millions of California inpatient discharges, we created a temporal network that enabled us to understand statewide patterns of hospital readmissions, which led to the novel finding that hospitalization for schizophrenia is significantly associated with rehospitalization for rhabdomyolysis.
biorxiv bioinformatics 0-100-users 2018
Disorganization of the histone core promotes organization of heterochromatin into phase-separated droplets, bioRxiv, 2018-11-19
Abstract: The heterochromatin protein HP1 is proposed to enable chromatin compaction via liquid droplet formation. Yet, a connection between phase separation and chromatin compaction has not been experimentally demonstrated. More fundamentally, how HP1 action at the level of a single nucleosome drives chromatin compaction remains poorly understood. Here we directly demonstrate that the S. pombe HP1 protein, Swi6, compacts arrays of multiple nucleosomes into phase-separated droplets. Using hydrogen-deuterium exchange, NMR, and mass-spectrometry, we further find that Swi6 substantially increases the accessibility and dynamics of buried histone residues within a mononucleosome. Restraining these dynamics via site-specific disulfide bonds impairs the compaction of nucleosome arrays into phase-separated droplets. Our results indicate that chromatin compaction and phase separation can be highly coupled processes. Further, we find that such coupling is promoted by a counter-intuitive function of Swi6, namely disorganization of the octamer core. Phase separation is canonically mediated by weak and dynamic multivalent interactions. We propose that dynamic exposure of buried histone residues increases opportunities for multivalent interactions between nucleosomes, thereby coupling chromatin compaction to phase separation. We anticipate that this new model for chromatin organization may more generally explain the formation of highly compacted chromatin assemblies beyond heterochromatin.
biorxiv biophysics 0-100-users 2018