Genes mirror migrations and cultures in prehistoric Europe – a population genomic perspective, bioRxiv, 2016-09-02
Abstract: Genomic information from ancient human remains is beginning to show its full potential for learning about human prehistory. We review the last few years' dramatic findings about European prehistory based on genomic data from humans who lived many millennia ago and relate them to modern-day patterns of genomic variation. The earliest period, the Upper Palaeolithic, appears to contain several population turnovers, followed by more stable populations after the Last Glacial Maximum and during the Mesolithic. Some 11,000 years ago, the migrations driving the Neolithic transition start from around Anatolia and reach the north and the west of Europe millennia later, followed by major migrations during the Bronze Age. These findings show that culture and lifestyle, rather than geography as is the case today, were the major determinants of genomic differentiation and similarity in prehistoric Europe.
biorxiv evolutionary-biology 0-100-users 2016
Population-genomic inference of the strength and timing of selection against gene flow, bioRxiv, 2016-09-02
Abstract: The interplay of divergent selection and gene flow is key to understanding how populations adapt to local environments and how new species form. Here, we use DNA polymorphism data and genome-wide variation in recombination rate to jointly infer the strength and timing of selection, as well as the baseline level of gene flow under various demographic scenarios. We model how divergent selection leads to a genome-wide negative correlation between recombination rate and genetic differentiation among populations. Our theory shows that the selection density, i.e. the selection coefficient per base pair, is a key parameter underlying this relationship. We then develop a procedure for parameter estimation that accounts for the confounding effect of background selection. Applying this method to two datasets from Mimulus guttatus, we infer a strong signal of adaptive divergence in the face of gene flow between populations growing on and off phytotoxic serpentine soils. However, the genome-wide intensity of this selection is not exceptional compared to what M. guttatus populations may typically experience when adapting to local conditions. We also find that selection against genome-wide introgression from the selfing sister species M. nasutus has acted to maintain a barrier between these two species over at least the last 250 ky. Our study provides a theoretical framework for linking genome-wide patterns of divergence and recombination with the underlying evolutionary mechanisms that drive this differentiation.
biorxiv evolutionary-biology 100-200-users 2016
Paperfuge: An ultra-low cost, hand-powered centrifuge inspired by the mechanics of a whirligig toy, bioRxiv, 2016-09-01
Abstract: Sample preparation, including separation of plasma from whole blood or isolation of parasites, is an unmet challenge in many point of care (POC) diagnostics and requires centrifugation as the first key step. From the context of global health applications, commercial centrifuges are expensive, bulky and electricity-powered, leading to a critical bottleneck in the development of decentralized, electricity-free POC diagnostic devices. By uncovering the fundamental mechanics of an ancient whirligig toy (3300 B.C.E.), we design an ultra-low cost (20 cents), light-weight (2 g), human-powered centrifuge that is made out of paper (“paperfuge”). To push the operating limits of this unconventional centrifuge, we present an experimentally-validated theoretical model that describes the paperfuge as a non-linear, non-conservative oscillator system. We use this model to inform our design process, achieving speeds of 125,000 rpm and equivalent centrifugal forces of 30,000 g, with theoretical limits predicting one million rpm. We harness these speeds to separate pure plasma in less than 1.5 minutes and isolate malaria parasites in 15 minutes from whole human blood. By expanding the materials used, we implement centrifugal microfluidics using PDMS, plastic and 3D-printed devices, ultimately opening up new opportunities for electricity-free POC diagnostics, especially in resource-poor settings.
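The relation between rotation speed and "equivalent centrifugal force" quoted in the abstract can be illustrated with the standard formula for relative centrifugal force, RCF = ω²r / g₀. This is a minimal sketch, not the authors' model: the function names are hypothetical, and the radius is a free parameter here because the abstract does not state the radius at which the force was measured.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2, conventional standard value


def relative_centrifugal_force(rpm, radius_m):
    """Centrifugal acceleration at radius_m, expressed in multiples of g."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular speed in rad/s
    return omega ** 2 * radius_m / STANDARD_GRAVITY


def radius_for_force(rpm, target_g):
    """Radius (in metres) at which a rotor spinning at rpm produces target_g."""
    omega = 2.0 * math.pi * rpm / 60.0
    return target_g * STANDARD_GRAVITY / omega ** 2
```

Because the force scales with the square of the rotation speed, doubling the rpm at a fixed radius quadruples the force, which is why the speed achieved matters more than the disc size.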
biorxiv bioengineering 200-500-users 2016
Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature, bioRxiv, 2016-08-26
Abstract: We have empirically assessed the distribution of published effect sizes and estimated power by extracting more than 100,000 statistical records from about 10,000 cognitive neuroscience and psychology papers published during the past 5 years. The reported median effect size was d=0.93 (inter-quartile range 0.64-1.46) for nominally statistically significant results and d=0.24 (0.11-0.42) for non-significant results. Median power to detect small, medium and large effects was 0.12, 0.44 and 0.73, reflecting no improvement through the past half-century. Power was lowest for cognitive neuroscience journals. 14% of papers reported some statistically significant results, although the respective F statistic and degrees of freedom proved that these were non-significant; p value errors positively correlated with journal impact factors. False report probability is likely to exceed 50% for the whole literature. In light of our findings the recently reported low replication success in psychology is realistic and worse performance may be expected for cognitive neuroscience.
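To make "power to detect small, medium and large effects" concrete, the sketch below approximates the power of a two-sided, two-sample comparison for a given Cohen's d. It is an illustration under stated assumptions, not the procedure used in the paper: it assumes equal group sizes, uses a normal approximation to the t-test, and takes d = 0.2, 0.5 and 0.8 as the conventional small, medium and large benchmarks. The function name and the sample size in the usage note are hypothetical.

```python
from statistics import NormalDist


def approximate_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation).

    Assumes equal group sizes and a known, common standard deviation.
    """
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = effect_size_d * (n_per_group / 2.0) ** 0.5
    # Probability of landing in either rejection region.
    upper_tail = standard_normal.cdf(noncentrality - z_critical)
    lower_tail = standard_normal.cdf(-noncentrality - z_critical)
    return upper_tail + lower_tail
```

For example, calling `approximate_power(d, 20)` for d in (0.2, 0.5, 0.8) shows how quickly power falls for small effects at a fixed sample size; with d = 0 the function returns alpha, the false positive rate.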
biorxiv neuroscience 500+-users 2016
Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation, bioRxiv, 2016-08-25
Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either PacBio or Oxford Nanopore technologies, and achieves a contig NG50 of greater than 21 Mbp on both human and Drosophila melanogaster PacBio datasets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.
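The MinHash idea mentioned in the abstract can be sketched as follows. This is plain, unweighted MinHash over k-mer sets, not Canu's tf-idf weighted variant and not its actual code; the function names, the k-mer length, and the number of hash functions are illustrative assumptions. Each read is reduced to a short signature, and the fraction of matching signature positions estimates the Jaccard similarity of the two k-mer sets.

```python
import hashlib


def kmer_set(sequence, k):
    """Return the set of all length-k substrings of sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def minhash_signature(sequence, k=5, num_hashes=64):
    """Unweighted MinHash signature; sequence must be at least k long."""
    kmers = kmer_set(sequence, k)
    signature = []
    for seed in range(num_hashes):
        # A different salt per position acts as an independent hash function.
        salt = seed.to_bytes(8, "little")
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(kmer.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for kmer in kmers
        )
        signature.append(smallest)
    return signature


def estimated_jaccard(signature_a, signature_b):
    """Fraction of positions where two equal-length signatures agree."""
    matches = sum(1 for a, b in zip(signature_a, signature_b) if a == b)
    return matches / len(signature_a)
```

Comparing fixed-length signatures instead of full k-mer sets keeps the cost of each pairwise comparison constant, which is what makes all-versus-all overlap detection tractable on large read sets.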
biorxiv bioinformatics 100-200-users 2016
DeepAD: Alzheimer’s Disease Classification via Deep Convolutional Neural Networks using MRI and fMRI, bioRxiv, 2016-08-22
Abstract: To extract patterns from neuroimaging data, various statistical methods and machine learning algorithms have been explored for the diagnosis of Alzheimer’s disease among older adults in both clinical and research applications; however, distinguishing between Alzheimer’s and healthy brain data has been challenging in older adults (age > 75) due to highly similar patterns of brain atrophy and image intensities. Recently, cutting-edge deep learning technologies have rapidly expanded into numerous fields, including medical image analysis. This paper outlines state-of-the-art deep learning-based pipelines employed to distinguish Alzheimer’s magnetic resonance imaging (MRI) and functional MRI (fMRI) from normal healthy control data for a given age group. Using these pipelines, which were executed on a GPU-based high-performance computing platform, the data were strictly and carefully preprocessed. Next, scale- and shift-invariant low- to high-level features were obtained from a high volume of training images using a convolutional neural network (CNN) architecture. In this study, fMRI data were used for the first time in deep learning applications for the purposes of medical image analysis and Alzheimer’s disease prediction. These proposed and implemented pipelines, which demonstrate a significant improvement in classification output over other studies, resulted in high and reproducible accuracy rates of 99.9% and 98.84% for the fMRI and MRI pipelines, respectively. Additionally, for clinical purposes, subject-level classification was performed, resulting in average accuracy rates of 94.32% and 97.88% for the fMRI and MRI pipelines, respectively. Finally, a decision making algorithm designed for the subject-level classification improved the rate to 97.77% for the fMRI pipeline and 100% for the MRI pipeline.
biorxiv bioinformatics 0-100-users 2016