Identification of hidden population structure in time-scaled phylogenies, bioRxiv, 2019-07-16
Abstract: Population structure influences genealogical patterns; however, data on how populations are structured are often unavailable or not directly observable. Inference of population structure is highly important in molecular epidemiology, where pathogen phylogenetics is increasingly used to infer transmission patterns and detect outbreaks. Discrepancies between observed and idealised genealogies, such as those generated by the coalescent process, can be quantified, and where significant differences occur, they may reveal the action of natural selection, host population structure, or other demographic and epidemiological heterogeneities. We have developed a fast non-parametric statistical test for detection of cryptic population structure in time-scaled phylogenetic trees. The test is based on contrasting estimated phylogenies with the theoretically expected phylodynamic ordering of common ancestors in two clades within a coalescent framework. These statistical tests have also motivated the development of algorithms which can be used to quickly screen a phylogenetic tree for clades which are likely to share a distinct demographic or epidemiological history. Epidemiological applications include identification of outbreaks in vulnerable host populations or rapid expansion of genotypes with a fitness advantage. To demonstrate the utility of these methods for outbreak detection, we applied the new methods to large phylogenies reconstructed from thousands of HIV-1 partial pol sequences. This revealed the presence of clades which had grown rapidly in the recent past and which were significantly concentrated in young men, suggesting recent and rapid transmission in that group. Furthermore, to demonstrate the utility of these methods for the study of antimicrobial resistance, we applied the new methods to a large phylogeny reconstructed from whole genome Neisseria gonorrhoeae sequences. We find that population structure detected using these methods closely overlaps with the appearance and expansion of mutations conferring antimicrobial resistance.
biorxiv evolutionary-biology 100-200-users 2019
Striatal activity reflects cortical activity patterns, bioRxiv, 2019-07-16
The dorsal striatum is organized into domains that drive characteristic behaviors [1–7], and receive inputs from different parts of the cortex [8,9] which modulate similar behaviors [10–12]. Striatal responses to cortical inputs, however, can be affected by changes in connection strength [13–15], local striatal circuitry [16,17], and thalamic inputs [18,19]. Therefore, it is unclear whether the pattern of activity across striatal domains mirrors that across the cortex [20–23] or differs from it [24–28]. Here we use simultaneous large-scale recordings in the cortex and the striatum to show that striatal activity can be accurately predicted by spatiotemporal activity patterns in the cortex. The relationship between activity in the cortex and the striatum was spatially consistent with corticostriatal anatomy, and temporally consistent with a feedforward drive. Each striatal domain exhibited specific sensorimotor responses that predictably followed activity in the associated cortical regions, and the corticostriatal relationship remained unvaried during passive states or performance of a task probing visually guided behavior. However, the task’s visual stimuli and corresponding behavioral responses evoked relatively more activity in the striatum than in associated cortical regions. This increased striatal activity involved an additive offset in firing rate, which was independent of task engagement but only present in animals that had learned the task. Thus, striatal activity largely reflects patterns of cortical activity, deviating from them in a simple additive fashion for learned stimuli or actions.
biorxiv neuroscience 100-200-users 2019
Supercentenarians and the oldest-old are concentrated into regions with no birth certificates and short lifespans, bioRxiv, 2019-07-16
Abstract: The observation of individuals attaining remarkable ages, and their concentration into geographic sub-regions or ‘blue zones’, has generated considerable scientific interest. Proposed drivers of remarkable longevity include high vegetable intake, strong social connections, and genetic markers. Here, we reveal new predictors of remarkable longevity and ‘supercentenarian’ status. In the United States, supercentenarian status is predicted by the absence of vital registration. The state-specific introduction of birth certificates is associated with a 69-82% fall in the number of supercentenarian records. In Italy, which has more uniform vital registration, remarkable longevity is instead predicted by low per capita incomes and a short life expectancy. Finally, the designated ‘blue zones’ of Sardinia, Okinawa, and Ikaria corresponded to regions with low incomes, low literacy, high crime rate and short life expectancy relative to their national average. As such, relative poverty and short lifespan constitute unexpected predictors of centenarian and supercentenarian status, and support a primary role of fraud and error in generating remarkable human age records.
biorxiv developmental-biology 500+-users 2019
Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics, bioRxiv, 2019-07-15
Abstract: Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models including the general time reversible (GTR) substitution model, rate heterogeneity among sites, and a range of coalescent models can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes-Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general purpose probabilistic implementation.
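For readers unfamiliar with the Jukes-Cantor substitution model named in this abstract, the following is a minimal sketch of its pairwise likelihood. It is not the authors' variational inference code and does not use Stan; the function names (`jc69_transition_probs`, `jc69_log_likelihood`, `jc69_distance`) are hypothetical, and the sketch assumes two aligned, gap-free nucleotide sequences with uniform base frequencies.

```python
import math


def jc69_transition_probs(d):
    """Return (p_same, p_diff) for branch length d > 0.

    d is the expected number of substitutions per site. p_diff is the
    probability of ending in one specific different nucleotide.
    """
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by distance d."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    p_same, p_diff = jc69_transition_probs(d)
    ll = 0.0
    for a, b in zip(seq_a, seq_b):
        # Stationary frequency 1/4 times the transition probability.
        ll += math.log(0.25) + math.log(p_same if a == b else p_diff)
    return ll


def jc69_distance(seq_a, seq_b):
    """Closed-form maximum-likelihood distance under Jukes-Cantor."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    p = mismatches / len(seq_a)
    if p >= 0.75:
        raise ValueError("proportion of differences too high for JC69")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A Bayesian treatment, whether by MCMC or variational inference, would place a prior on `d` and approximate its posterior rather than taking the single maximum-likelihood value returned by `jc69_distance`.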
biorxiv bioinformatics 0-100-users 2019
The ARRIVE guidelines 2019: updated guidelines for reporting animal research, bioRxiv, 2019-07-15
Abstract: Reproducible science requires transparent reporting. The ARRIVE guidelines were originally developed in 2010 to improve the reporting of animal research. They consist of a checklist of information to include in publications describing in vivo experiments to enable others to scrutinise the work adequately, evaluate its methodological rigour, and reproduce the methods and results. Despite considerable levels of endorsement by funders and journals over the years, adherence to the guidelines has been inconsistent, and the anticipated improvements in the quality of reporting in animal research publications have not been achieved. Here we introduce ARRIVE 2019. The guidelines have been updated and information reorganised to facilitate their use in practice. We used a Delphi exercise to prioritise the items and split the guidelines into two sets: the ARRIVE Essential 10, which constitute the minimum requirement, and the Recommended Set, which describes the research context. This division facilitates improved reporting of animal research by supporting a stepwise approach to implementation, and it helps journal editors and reviewers to verify that the most important items are being reported in manuscripts. We have also developed the accompanying Explanation and Elaboration document that serves 1) to explain the rationale behind each item in the guidelines, 2) to clarify key concepts and 3) to provide illustrative examples. We aim through these changes to help ensure that researchers, reviewers and journal editors are better equipped to improve the rigour and transparency of the scientific process and thus reproducibility.
biorxiv scientific-communication-and-education 0-100-users 2019
Releasing a preprint is associated with more attention and citations for the peer-reviewed article, bioRxiv, 2019-07-14
Abstract: Preprints in biology are gaining popularity, but release of a preprint still precedes only a fraction of peer-reviewed publications. We examined whether having a preprint on bioRxiv was associated with metrics of the corresponding peer-reviewed article. We assembled a dataset of 74,239 articles, 5,405 of which had a preprint, published in 39 journals. Based on log-linear regression and random-effects meta-analysis, articles with a preprint had a 51% higher Altmetric Attention Score and 37% more citations compared to articles without one. These associations were independent of several other article- and author-level variables (e.g., scientific subfield and last author publication age) and unrelated to journal-level variables such as access model and Impact Factor. This observational study can help researchers and publishers make informed decisions about how to incorporate preprints into their work.
biorxiv scientific-communication-and-education 200-500-users 2019
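As a note on how percentages such as "51% higher" and "37% more" in the preceding abstract typically relate to a log-linear regression: if the outcome is modelled on the natural-log scale, a coefficient beta on a binary predictor corresponds to a multiplicative effect of exp(beta). The sketch below illustrates only that conversion. The abstract does not give the authors' exact model specification, so the natural-log assumption and the helper names are hypothetical.

```python
import math


def percent_change_from_log_coef(beta):
    """Percent change in the outcome implied by a coefficient on the ln scale."""
    return (math.exp(beta) - 1.0) * 100.0


def log_coef_from_percent_change(pct):
    """Inverse conversion: the ln-scale coefficient implied by a percent change."""
    return math.log(1.0 + pct / 100.0)


# The two effect sizes quoted in the abstract, expressed as ln-scale coefficients.
beta_attention = log_coef_from_percent_change(51.0)
beta_citations = log_coef_from_percent_change(37.0)
```

Because effects are multiplicative on the original scale, ln-scale coefficients add while the percentages themselves do not.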