Live Mouse Tracker: real-time behavioral analysis of groups of mice, bioRxiv, 2018-06-14
Preclinical studies of psychiatric disorders require the use of animal models to investigate the impact of environmental factors or genetic mutations on complex traits such as decision-making and social interactions. Here, we present a real-time method for behavior analysis of mice housed in groups that couples computer vision, machine learning and Triggered-RFID identification to track and monitor animals over several days in enriched environments. The system extracts a thorough list of individual and collective behavioral traits and provides a unique phenotypic profile for each animal. On mouse models, we study the impact of mutations of genes Shank2 and Shank3 involved in autism. Characterization and integration of data from behavioral profiles of mutated female mice reveals distinctive activity levels and involvement in complex social configuration.
biorxiv animal-behavior-and-cognition 100-200-users 2018
Real-time cryo-EM data pre-processing with Warp, bioRxiv, 2018-06-14
The acquisition of cryo-electron microscopy (cryo-EM) data from biological specimens is currently largely uncoupled from subsequent data evaluation, correction and processing. Therefore, the acquisition strategy is difficult to optimize during data collection, often leading to suboptimal microscope usage and disappointing results. Here we provide Warp, a software for real-time evaluation, correction, and processing of cryo-EM data during their acquisition. Warp evaluates and monitors key parameters for each recorded micrograph or tomographic tilt series in real time. Warp also rapidly corrects micrographs for global and local motion, and estimates the local defocus with the use of novel algorithms. The software further includes a deep learning-based particle picking algorithm that rivals human accuracy to make the pre-processing pipeline truly automated. The output from Warp can be directly fed into established tools for particle classification and 3D image reconstruction. In a benchmarking study we show that Warp automatically processed a published cryo-EM data set for influenza virus hemagglutinin, leading to an improvement of the nominal resolution from 3.9 Å to 3.2 Å. Warp is easy to install, computationally inexpensive, and has an intuitive and streamlined user interface.
biorxiv biophysics 0-100-users 2018
Adversarial childhood events are associated with Sudden Infant Death Syndrome (SIDS): an ecological study, bioRxiv, 2018-06-07
Sudden Infant Death Syndrome (SIDS) is the most common cause of postneonatal infant death. The allostatic load hypothesis posits that SIDS is the result of perinatal cumulative painful, stressful, or traumatic exposures that tax neonatal regulatory systems. To test it, we explored the relationships between SIDS and two common stressors, male neonatal circumcision (MNC) and prematurity, using latitudinal data from 15 countries and over 40 US states during the years 1999-2016. We used linear regression analyses and likelihood ratio tests to calculate the association between SIDS and the stressors. SIDS prevalence was significantly and positively correlated with MNC and prematurity rates. MNC explained 14.2% of the variability of SIDS’s male bias in the US, reminiscent of the Jewish myth of Lilith, the killer of infant males. Combined, the stressors increased the likelihood of SIDS. Ecological analyses are useful to generate hypotheses but cannot provide strong evidence of causality. Biological plausibility is provided by a growing body of experimental and clinical evidence linking adversary preterm and early-life events with SIDS. Together with historical evidence, our findings emphasize the necessity of cohort studies that consider these environmental stressors with the aim of improving the identification of at-risk infants and reducing infant mortality.
biorxiv pathology 100-200-users 2018
CaImAn: An open source tool for scalable Calcium Imaging data Analysis, bioRxiv, 2018-06-05
Advances in fluorescence microscopy enable monitoring larger brain areas in-vivo with finer time resolution. The resulting data rates require reproducible analysis pipelines that are reliable, fully automated, and scalable to datasets generated over the course of months. Here we present CaImAn, an open-source library for calcium imaging data analysis. CaImAn provides automatic and scalable methods to address problems common to pre-processing, including motion correction, neural activity identification, and registration across different sessions of data collection. It does this while requiring minimal user intervention, with good performance on computers ranging from laptops to high-performance computing clusters. CaImAn is suitable for two-photon and one-photon imaging, and also enables real-time analysis on streaming data. To benchmark the performance of CaImAn we collected a corpus of ground truth annotations from multiple labelers on nine mouse two-photon datasets. We demonstrate that CaImAn achieves near-human performance in detecting locations of active neurons.
biorxiv neuroscience 0-100-users 2018
Toward machine-guided design of proteins, bioRxiv, 2018-06-02
Proteins, molecular machines that underpin all biological life, are of significant therapeutic and industrial value. Directed evolution is a high-throughput experimental approach for improving protein function, but has difficulty escaping local maxima in the fitness landscape. Here, we investigate how supervised learning in a closed loop with DNA synthesis and high-throughput screening can be used to improve protein design. Using the green fluorescent protein (GFP) as an illustrative example, we demonstrate the opportunities and challenges of generating training datasets conducive to selecting strongly generalizing models. With prospectively designed wet lab experiments, we then validate that these models can generalize to unseen regions of the fitness landscape, even when constrained to explore combinations of non-trivial mutations. Taken together, this suggests a hybrid optimization strategy for protein design in which a predictive model is used to explore difficult-to-access but promising regions of the fitness landscape that directed evolution can then exploit at scale.
biorxiv synthetic-biology 100-200-users 2018
Alevin efficiently estimates accurate gene abundances from dscRNA-seq data, bioRxiv, 2018-06-01
We introduce alevin, a fast end-to-end pipeline to process droplet-based single cell RNA sequencing data, which performs cell barcode detection, read mapping, unique molecular identifier deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads, and improves the accuracy of gene abundance estimates.
biorxiv bioinformatics 100-200-users 2018
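To make the bias mentioned in the alevin abstract concrete, here is a minimal Python sketch of the baseline it contrasts with: UMI deduplication that counts distinct UMIs per (cell barcode, gene) and discards any read that maps to more than one gene. This is not alevin's algorithm, which the abstract does not describe in detail. The function name `naive_umi_counts` and the input layout (tuples of cell barcode, UMI, and a set of candidate genes) are assumptions made for illustration only.

```python
from collections import defaultdict


def naive_umi_counts(reads):
    """Count distinct UMIs per (cell barcode, gene), dropping ambiguous reads.

    `reads` is an iterable of (cell_barcode, umi, genes) tuples, where
    `genes` is the set of genes the read is compatible with. This layout
    is hypothetical and chosen only for this sketch.
    """
    umis_seen = defaultdict(set)  # (cell_barcode, gene) -> set of UMIs
    discarded = 0

    for cell_barcode, umi, genes in reads:
        if len(genes) != 1:
            # Gene-ambiguous read: the naive baseline throws it away,
            # which is the source of bias the abstract refers to.
            discarded += 1
            continue
        (gene,) = genes
        # Reads sharing cell barcode, gene, and UMI collapse to one molecule.
        umis_seen[(cell_barcode, gene)].add(umi)

    counts = {key: len(umis) for key, umis in umis_seen.items()}
    return counts, discarded
```

The second return value reports how many reads the baseline discards; genes that share sequence with other genes lose proportionally more reads under this scheme, so their abundances are underestimated.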