Tracking the popularity and outcomes of all bioRxiv preprints, bioRxiv, 2019-01-13
Researchers in the life sciences are posting work to preprint servers at an unprecedented and increasing rate, sharing papers online before (or instead of) publication in peer-reviewed journals. Though the increasing acceptance of preprints is driving policy changes for journals and funders, there is little information about their usage. Here, we collected and analyzed data on all 37,648 preprints uploaded to bioRxiv.org, the largest biology-focused preprint server, in its first five years. We find preprints are being downloaded more than ever before (1.1 million tallied in October 2018 alone) and that the rate of preprints being posted has increased to a recent high of 2,100 per month. We also find that two-thirds of preprints posted before 2017 were later published in peer-reviewed journals, and find a relationship between journal impact factor and preprint downloads. Lastly, we developed Rxivist.org, a web application providing multiple ways of interacting with preprint metadata.
biorxiv scientific-communication-and-education 500+-users 2019
Deep neural networks for interpreting RNA binding protein target preferences, bioRxiv, 2019-01-12
Deep learning has become a powerful paradigm to analyze the binding sites of regulatory factors including RNA-binding proteins (RBPs), owing to its strength to learn complex features from possibly multiple sources of raw data. However, the interpretability of these models, which is crucial to improve our understanding of RBP binding preferences and functions, has not yet been investigated in significant detail. We have designed a multitask and multimodal deep neural network for characterizing in vivo RBP binding preferences. The model incorporates not only the sequence but also the region type of the binding sites as input, which helps the model to boost the prediction performance. To interpret the model, we quantified the contribution of the input features to the predictive score of each RBP. Learning across multiple RBPs at once, we are able to avoid experimental biases and to identify the RNA sequence motifs and transcript context patterns that are the most important for the predictions of each individual RBP. Our findings are consistent with known motifs and binding behaviors of RBPs and can provide new insights about the regulatory functions of RBPs.
biorxiv bioinformatics 0-100-users 2019
Hunter-gatherer genomes reveal diverse demographic trajectories following the rise of farming in East Africa, bioRxiv, 2019-01-12
A major outstanding question in human prehistory is the fate of hunting and gathering populations following the rise of agriculture and pastoralism. Genomic analysis of ancient and contemporary Europeans suggests that autochthonous groups were either absorbed into or replaced by expanding farmer populations. Many of the hunter-gatherer populations persisting today live in Africa, perhaps because agropastoral transitions occurred later on the continent. Here, we present the first genomic data from the Chabu, a relatively isolated and marginalized hunting-and-gathering group from the Southwestern Ethiopian highlands. The Chabu are a distinct genetic population that carry the highest levels of Southwestern Ethiopian ancestry of any extant population studied thus far. This ancestry has been in situ for at least 4,500 years. We show that the Chabu are undergoing a severe population bottleneck which began around 40 generations ago. We also study other Eastern African populations and demonstrate divergent patterns of historical population size change over the past 60 generations between even closely related groups. We argue that these patterns demonstrate that, unlike in Europe, African hunter-gatherers responded to agropastoralism with diverse strategies.
biorxiv genetics 0-100-users 2019
Insulin enhances presynaptic glutamate release in the nucleus accumbens via opioid receptor-mediated disinhibition, bioRxiv, 2019-01-12
Insulin influences learning and cognition and activity in brain centers that mediate reward and motivation in humans. However, very little is known about how insulin influences excitatory transmission within brain reward centers such as the nucleus accumbens (NAc). Further, insulin dysregulation that accompanies obesity is linked to cognitive decline, depression, anxiety, and aberrant motivation that also rely on excitatory transmission in the NAc, but potential mechanisms are poorly understood. Here we show that insulin receptor activation increases presynaptic glutamate release via a previously unidentified form of opioid receptor-mediated disinhibition, whereas activation of IGF receptors by insulin decreases presynaptic glutamate release in the NAc. Furthermore, obesity results in a loss of the insulin receptor-mediated increases and a reduction in NAc insulin receptor surface expression, while preserving reductions in excitatory transmission mediated by IGF receptors. These results provide the first insights into how insulin influences excitatory transmission in the adult brain and have broad implications for the regulation of motivation and reward related processes by peripheral hormones.
biorxiv neuroscience 0-100-users 2019
The genome of C57BL6J Eve, the mother of the laboratory mouse genome reference strain, bioRxiv, 2019-01-12
Isogenic laboratory mouse strains are used to enhance reproducibility as individuals within a strain are essentially genetically identical. For the most widely used isogenic strain, C57BL6, there is also a wealth of genetic, phenotypic, and genomic data, including one of the highest quality reference genomes (GRCm38.p6). However, laboratory mouse strains are living reagents and hence genetic drift occurs and is an unavoidable source of accumulating genetic variability that can have an impact on reproducibility over time. Nearly 20 years after the first release of the mouse reference genome, individuals from the strain it represents (C57BL6J) are at least 26 inbreeding generations removed from the individuals used to generate the mouse reference genome. Moreover, C57BL6J is now maintained through the periodic reintroduction of mice from cryopreserved embryo stocks that are derived from a single breeder pair, aptly named C57BL6J Adam and Eve. To more accurately represent the genome of today's C57BL6J mice, we have generated a de novo assembly of the C57BL6J Eve genome (B6Eve) using high coverage, long-read sequencing, optical mapping, and short-read data. Using these data, we addressed recurring variants observed in previous mouse studies. We have also identified structural variations that impact coding sequences, closed gaps in the mouse reference assembly, some of which are in genes, and we have identified previously unannotated coding sequences through long read sequencing of cDNAs. This B6Eve assembly explains discrepant observations that have been associated with GRCm38-based analyses, and has provided data towards a reference genome that is more representative of the C57BL6J mice that are in use today.
biorxiv genomics 0-100-users 2019
Tumor mutational landscape is a record of the pre-malignant state, bioRxiv, 2019-01-12
Chromatin structure has a major influence on the cell-specific density of somatic mutations along the cancer genome. Here, we present a pan-cancer study in which we searched for the putative cancer cell-of-origin of 2,550 whole genomes, representing 32 cancer types by matching their mutational landscape to the regional patterns of chromatin modifications ascertained in 104 normal tissue types. We found that, in almost all cancer types, the cell-of-origin can be predicted solely from their DNA sequences. Our analysis validated the hypothesis that high-grade serous ovarian cancer originates in the fallopian tube and identified distinct origins of breast cancer subtypes. We also demonstrated that the technique is equally capable of identifying the cell-of-origin for a series of 2,044 metastatic samples from 22 of the tumor types available as primaries. Moreover, cancer drivers, whether inherited or acquired, reside in active chromatin regions in the respective cell-of-origin. Taken together, our findings highlight that many somatic mutations accumulate while the chromatin structure of the cell-of-origin is maintained and that this historical record, captured in the DNA, can be used to identify the often elusive cancer cell-of-origin.
biorxiv genomics 100-200-users 2019