Adversarial childhood events are associated with Sudden Infant Death Syndrome (SIDS): an ecological study, bioRxiv, 2018-06-07
Abstract: Sudden Infant Death Syndrome (SIDS) is the most common cause of postneonatal infant death. The allostatic load hypothesis posits that SIDS is the result of perinatal cumulative painful, stressful, or traumatic exposures that tax neonatal regulatory systems. To test it, we explored the relationships between SIDS and two common stressors, male neonatal circumcision (MNC) and prematurity, using latitudinal data from 15 countries and over 40 US states during the years 1999-2016. We used linear regression analyses and likelihood ratio tests to calculate the association between SIDS and the stressors. SIDS prevalence was significantly and positively correlated with MNC and prematurity rates. MNC explained 14.2% of the variability of SIDS’s male bias in the US, reminiscent of the Jewish myth of Lilith, the killer of infant males. Combined, the stressors increased the likelihood of SIDS. Ecological analyses are useful to generate hypotheses but cannot provide strong evidence of causality. Biological plausibility is provided by a growing body of experimental and clinical evidence linking adverse preterm and early-life events with SIDS. Together with historical evidence, our findings emphasize the necessity of cohort studies that consider these environmental stressors with the aim of improving the identification of at-risk infants and reducing infant mortality.
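The abstract names linear regression and likelihood ratio tests but does not give the model specification. As a minimal sketch of how such an association test can work, assuming a single predictor, Gaussian errors, and a comparison against an intercept-only model, the Python below fits a least-squares line and computes the likelihood ratio statistic. The function names and the eight data points are hypothetical and made up purely for illustration; they are not values from the study.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss


def likelihood_ratio_test(x, y):
    """Compare an intercept-only model with a one-predictor model.

    Returns (slope, r_squared, lr_statistic, p_value).
    """
    n = len(y)
    mean_y = sum(y) / n
    rss_null = sum((yi - mean_y) ** 2 for yi in y)
    _, slope, rss_full = fit_simple_ols(x, y)
    # For Gaussian models with the variance estimated by maximum likelihood,
    # 2 * (loglik_full - loglik_null) reduces to n * ln(RSS_null / RSS_full).
    lr_statistic = n * math.log(rss_null / rss_full)
    # Asymptotic chi-square p-value with 1 degree of freedom:
    # P(chi2_1 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(lr_statistic / 2.0))
    r_squared = 1.0 - rss_full / rss_null
    return slope, r_squared, lr_statistic, p_value


# Hypothetical, made-up numbers (NOT from the study): an exposure rate in
# percent and an outcome rate per 1,000 for eight imaginary regions.
exposure = [10, 20, 30, 40, 50, 60, 70, 80]
outcome = [0.30, 0.42, 0.38, 0.55, 0.50, 0.66, 0.61, 0.75]

slope, r_squared, lr_statistic, p_value = likelihood_ratio_test(exposure, outcome)
print(f"slope={slope:.4f}  R^2={r_squared:.3f}  LR={lr_statistic:.2f}  p={p_value:.2g}")
```

The R² value is the "share of variability explained" in the sense the abstract uses for its 14.2% figure. As the abstract itself notes, a result of this kind on aggregated data can suggest a hypothesis but cannot establish causality.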
biorxiv pathology 100-200-users 2018
CaImAn: An open source tool for scalable Calcium Imaging data Analysis, bioRxiv, 2018-06-05
Abstract: Advances in fluorescence microscopy enable monitoring larger brain areas in vivo with finer time resolution. The resulting data rates require reproducible analysis pipelines that are reliable, fully automated, and scalable to datasets generated over the course of months. Here we present CaImAn, an open-source library for calcium imaging data analysis. CaImAn provides automatic and scalable methods to address problems common to pre-processing, including motion correction, neural activity identification, and registration across different sessions of data collection. It does this while requiring minimal user intervention, with good performance on computers ranging from laptops to high-performance computing clusters. CaImAn is suitable for two-photon and one-photon imaging, and also enables real-time analysis on streaming data. To benchmark the performance of CaImAn we collected a corpus of ground truth annotations from multiple labelers on nine mouse two-photon datasets. We demonstrate that CaImAn achieves near-human performance in detecting locations of active neurons.
biorxiv neuroscience 0-100-users 2018
Toward machine-guided design of proteins, bioRxiv, 2018-06-02
Abstract: Proteins, the molecular machines that underpin all biological life, are of significant therapeutic and industrial value. Directed evolution is a high-throughput experimental approach for improving protein function, but has difficulty escaping local maxima in the fitness landscape. Here, we investigate how supervised learning in a closed loop with DNA synthesis and high-throughput screening can be used to improve protein design. Using the green fluorescent protein (GFP) as an illustrative example, we demonstrate the opportunities and challenges of generating training datasets conducive to selecting strongly generalizing models. With prospectively designed wet lab experiments, we then validate that these models can generalize to unseen regions of the fitness landscape, even when constrained to explore combinations of non-trivial mutations. Taken together, this suggests a hybrid optimization strategy for protein design in which a predictive model is used to explore difficult-to-access but promising regions of the fitness landscape that directed evolution can then exploit at scale.
biorxiv synthetic-biology 100-200-users 2018
Alevin efficiently estimates accurate gene abundances from dscRNA-seq data, bioRxiv, 2018-06-01
Abstract: We introduce alevin, a fast end-to-end pipeline to process droplet-based single cell RNA sequencing data, which performs cell barcode detection, read mapping, unique molecular identifier deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads, and improves the accuracy of gene abundance estimates.
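The abstract does not describe alevin's deduplication algorithm, so the Python below is not alevin's method. It is only a toy sketch of the bias the abstract points to: counting unique molecules per cell and gene when gene-ambiguous reads are discarded, versus when they are kept. For the "kept" case it swaps in a deliberately naive stand-in (splitting each ambiguous molecule evenly across its compatible genes). The function name, barcodes, UMIs, and gene names are all hypothetical.

```python
from collections import defaultdict


def count_umis(reads, keep_ambiguous):
    """Count deduplicated molecules per (cell, gene).

    reads: iterable of (cell_barcode, umi, genes), where genes is a tuple of
           every gene the read is compatible with.
    keep_ambiguous: if False, reads compatible with more than one gene are
           dropped; if True, each such molecule is split evenly across its
           genes (a naive placeholder, NOT alevin's algorithm).
    """
    # Reads sharing cell barcode, UMI, and gene set are treated as PCR
    # duplicates of a single molecule.
    molecules = {(cell, umi, tuple(sorted(genes))) for cell, umi, genes in reads}

    counts = defaultdict(float)
    for cell, _umi, genes in molecules:
        if len(genes) == 1:
            counts[(cell, genes[0])] += 1.0
        elif keep_ambiguous:
            share = 1.0 / len(genes)
            for gene in genes:
                counts[(cell, gene)] += share
    return dict(counts)


# Hypothetical toy reads for one cell: (cell barcode, UMI, compatible genes).
toy_reads = [
    ("AAAC", "TTGA", ("GeneA",)),
    ("AAAC", "TTGA", ("GeneA",)),          # PCR duplicate of the read above
    ("AAAC", "CCGT", ("GeneA", "GeneB")),  # multimaps between two genes
    ("AAAC", "ACGT", ("GeneA", "GeneB")),  # multimaps between two genes
    ("AAAC", "GGAT", ("GeneB",)),
]

discarded = count_umis(toy_reads, keep_ambiguous=False)
kept = count_umis(toy_reads, keep_ambiguous=True)
print("discarding ambiguous reads:", sorted(discarded.items()))
print("keeping ambiguous reads:   ", sorted(kept.items()))
```

In this toy input, discarding the gene-ambiguous reads loses two of the four molecules, which is the kind of undercounting the abstract attributes to tools that drop such reads.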
biorxiv bioinformatics 100-200-users 2018
Antimicrobial exposure in sexual networks drives divergent evolution in modern gonococci, bioRxiv, 2018-05-31
Abstract: The sexually transmitted pathogen Neisseria gonorrhoeae is regarded as being on the way to becoming an untreatable superbug. Despite its clinical importance, little is known about its emergence and evolution, and how this corresponds with the introduction of antimicrobials. We present a genome-based phylogeographic analysis of 419 gonococcal isolates from across the globe. Results indicate that modern gonococci originated in Europe or Africa as late as the 16th century and subsequently disseminated globally. We provide evidence that the modern gonococcal population has been shaped by antimicrobial treatment of sexually transmitted and other infections, leading to the emergence of two major lineages with different evolutionary strategies. The well-described multi-resistant lineage is associated with high rates of homologous recombination and infection in high-risk sexual networks where antimicrobial treatment is frequent. A second, multi-susceptible lineage associated with heterosexual networks, where asymptomatic infection is more common, was also identified, with potential implications for infection control.
biorxiv genomics 0-100-users 2018
Evaluation of Deep Learning Strategies for Nucleus Segmentation in Fluorescence Images, bioRxiv, 2018-05-31
Identifying nuclei is often a critical first step in analyzing microscopy images of cells, and classical image processing algorithms are most commonly used for this task. Recent developments in deep learning can yield superior accuracy, but typical evaluation metrics for nucleus segmentation do not satisfactorily capture error modes that are relevant in cellular images. We present an evaluation framework to measure accuracy, types of errors, and computational efficiency; and use it to compare deep learning strategies and classical approaches. We publicly release a set of 23,165 manually annotated nuclei and source code to reproduce experiments and run the proposed evaluation methodology. Our evaluation framework shows that deep learning improves accuracy and can reduce the number of biologically relevant errors by half.
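The abstract does not specify how accuracy and error types are computed. One common object-level scheme, assumed here for illustration, matches each ground-truth nucleus to a predicted nucleus when their intersection over union (IoU) exceeds a threshold, then counts matches, missed objects, and extra objects. The Python sketch below uses small hand-made label images; the function name and the toy masks are hypothetical and are not taken from the released dataset or code.

```python
from collections import Counter


def object_f1(truth, pred, iou_threshold=0.5):
    """Object-level detection scores from two label images.

    truth, pred: 2D lists of integer labels of equal shape, 0 = background.
    Returns (true_positives, false_positives, false_negatives, f1).
    Uses a strict IoU > threshold test; for thresholds of 0.5 or more this
    guarantees each object takes part in at most one match.
    """
    truth_area = Counter()
    pred_area = Counter()
    overlap = Counter()
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t:
                truth_area[t] += 1
            if p:
                pred_area[p] += 1
            if t and p:
                overlap[(t, p)] += 1

    matched_truth = set()
    for (t, p), intersection in overlap.items():
        union = truth_area[t] + pred_area[p] - intersection
        if intersection / union > iou_threshold:
            matched_truth.add(t)

    tp = len(matched_truth)
    fn = len(truth_area) - tp   # ground-truth objects with no match (missed)
    fp = len(pred_area) - tp    # predicted objects with no match (extra)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return tp, fp, fn, f1


# Hypothetical toy masks: three true nuclei, three predicted objects.
truth_mask = [
    [1, 1, 0, 0, 2, 2],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
    [3, 3, 0, 0, 0, 0],
]
pred_mask = [
    [1, 1, 0, 0, 2, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 4, 4, 0],
]

tp, fp, fn, f1 = object_f1(truth_mask, pred_mask)
print(f"TP={tp} FP={fp} FN={fn} F1={f1:.3f}")
```

Counting missed and extra objects separately, rather than reporting only pixel overlap, is one way to surface the object-level error modes the abstract says typical metrics fail to capture.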
biorxiv bioinformatics 0-100-users 2018