Deep neural networks for interpreting RNA binding protein target preferences, bioRxiv, 2019-01-12
Deep learning has become a powerful paradigm for analyzing the binding sites of regulatory factors, including RNA-binding proteins (RBPs), owing to its ability to learn complex features from raw data, possibly from multiple sources. However, the interpretability of these models, which is crucial to improving our understanding of RBP binding preferences and functions, has not yet been investigated in significant detail. We have designed a multitask and multimodal deep neural network for characterizing in vivo RBP binding preferences. The model takes as input not only the sequence but also the region type of the binding sites, which improves prediction performance. To interpret the model, we quantified the contribution of the input features to the predictive score of each RBP. By learning across multiple RBPs at once, we are able to avoid experimental biases and to identify the RNA sequence motifs and transcript context patterns that are most important for the predictions of each individual RBP. Our findings are consistent with known motifs and binding behaviors of RBPs and can provide new insights into the regulatory functions of RBPs.
biorxiv bioinformatics 0-100-users 2019
Hunter-gatherer genomes reveal diverse demographic trajectories following the rise of farming in East Africa, bioRxiv, 2019-01-12
A major outstanding question in human prehistory is the fate of hunting and gathering populations following the rise of agriculture and pastoralism. Genomic analysis of ancient and contemporary Europeans suggests that autochthonous groups were either absorbed into or replaced by expanding farmer populations. Many of the hunter-gatherer populations persisting today live in Africa, perhaps because agropastoral transitions occurred later on the continent. Here, we present the first genomic data from the Chabu, a relatively isolated and marginalized hunting-and-gathering group from the Southwestern Ethiopian highlands. The Chabu are a distinct genetic population that carries the highest levels of Southwestern Ethiopian ancestry of any extant population studied thus far. This ancestry has been in situ for at least 4,500 years. We show that the Chabu are undergoing a severe population bottleneck that began around 40 generations ago. We also study other Eastern African populations and demonstrate divergent patterns of historical population size change over the past 60 generations, even between closely related groups. We argue that these patterns demonstrate that, unlike in Europe, African hunter-gatherers responded to agropastoralism with diverse strategies.
biorxiv genetics 0-100-users 2019
Insulin enhances presynaptic glutamate release in the nucleus accumbens via opioid receptor-mediated disinhibition, bioRxiv, 2019-01-12
Insulin influences learning, cognition, and the activity of brain centers that mediate reward and motivation in humans. However, very little is known about how insulin influences excitatory transmission within brain reward centers such as the nucleus accumbens (NAc). Further, insulin dysregulation that accompanies obesity is linked to cognitive decline, depression, anxiety, and aberrant motivation that also rely on excitatory transmission in the NAc, but potential mechanisms are poorly understood. Here we show that insulin receptor activation increases presynaptic glutamate release via a previously unidentified form of opioid receptor-mediated disinhibition, whereas activation of IGF receptors by insulin decreases presynaptic glutamate release in the NAc. Furthermore, obesity results in a loss of the insulin receptor-mediated increases and a reduction in NAc insulin receptor surface expression, while preserving reductions in excitatory transmission mediated by IGF receptors. These results provide the first insights into how insulin influences excitatory transmission in the adult brain and have broad implications for the regulation of motivation and reward-related processes by peripheral hormones.
biorxiv neuroscience 0-100-users 2019
The genome of C57BL6J Eve, the mother of the laboratory mouse genome reference strain, bioRxiv, 2019-01-12
Isogenic laboratory mouse strains are used to enhance reproducibility as individuals within a strain are essentially genetically identical. For the most widely used isogenic strain, C57BL6, there is also a wealth of genetic, phenotypic, and genomic data, including one of the highest quality reference genomes (GRCm38.p6). However, laboratory mouse strains are living reagents and hence genetic drift occurs and is an unavoidable source of accumulating genetic variability that can have an impact on reproducibility over time. Nearly 20 years after the first release of the mouse reference genome, individuals from the strain it represents (C57BL6J) are at least 26 inbreeding generations removed from the individuals used to generate the mouse reference genome. Moreover, C57BL6J is now maintained through the periodic reintroduction of mice from cryopreserved embryo stocks that are derived from a single breeder pair, aptly named C57BL6J Adam and Eve. To more accurately represent the genome of today's C57BL6J mice, we have generated a de novo assembly of the C57BL6J Eve genome (B6Eve) using high coverage, long-read sequencing, optical mapping, and short-read data. Using these data, we addressed recurring variants observed in previous mouse studies. We have also identified structural variations that impact coding sequences, closed gaps in the mouse reference assembly, some of which are in genes, and we have identified previously unannotated coding sequences through long read sequencing of cDNAs. This B6Eve assembly explains discrepant observations that have been associated with GRCm38-based analyses, and has provided data towards a reference genome that is more representative of the C57BL6J mice that are in use today.
biorxiv genomics 0-100-users 2019
Evolutionary Dynamics Do Not Motivate a Single-Mutant Theory of Human Language Supplementary Sections, bioRxiv, 2019-01-10
One of the most controversial hypotheses in cognitive science is the Chomskyan evolutionary conjecture that language arose instantaneously in our species as the result of a single staggeringly fortuitous mutation. Here we analyze the evolutionary dynamics implied by this hypothesis, which has never been formalized. The theory supposes the emergence and fixation of a single mutant (capable of the syntactic operation Merge) during a narrow historical window as a result of frequency-independent selection under a huge fitness advantage in a population of an effective size that is standardly assumed to have been no larger than ~15,000 early humans. We examine this proposal by combining diffusion analysis and extreme value theory to derive a probabilistic formulation of its dynamics. Perhaps counter-intuitively, a macro-mutation is much less likely a priori than multiple mutations with smaller fitness effects, yet both hypotheses predict fixation with high conditional probability. The consequences of this asymmetry have not been accounted for previously. Our results defuse any suggestion that evolutionary reasoning provides an independent rationale for the controversial single-mutant theory of language.
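The diffusion analysis mentioned above can be illustrated with a standard textbook result, Kimura's diffusion approximation for the fixation probability of a mutant under genic selection. This is a minimal sketch and not the authors' derivation: the function name, the assumption that census size equals effective size, and the example selection coefficients are all illustrative choices, and the sketch covers only neutral or advantageous mutants (s >= 0).

```python
import math


def fixation_probability(s, n_e, p0=None):
    """Kimura's diffusion approximation for a diploid population.

    s   : selection coefficient of the mutant allele (genic selection, s >= 0)
    n_e : effective population size (assumed equal to census size here)
    p0  : initial allele frequency; defaults to a single new copy, 1 / (2 * n_e)
    """
    if s < 0:
        raise ValueError("this sketch only handles s >= 0")
    if p0 is None:
        p0 = 1.0 / (2.0 * n_e)
    if s == 0.0:
        # Neutral limit: fixation probability equals the initial frequency.
        return p0
    # u(p0) = (1 - exp(-4 * Ne * s * p0)) / (1 - exp(-4 * Ne * s)),
    # written with expm1 for accuracy when the exponents are small.
    numerator = -math.expm1(-4.0 * n_e * s * p0)
    denominator = -math.expm1(-4.0 * n_e * s)
    return numerator / denominator


if __name__ == "__main__":
    n_e = 15_000  # upper bound on effective size quoted in the abstract
    for s in (0.0, 0.001, 0.01, 0.1, 1.0):  # hypothetical fitness advantages
        print(f"s = {s:<6} P(fix) = {fixation_probability(s, n_e):.6f}")
```

For a single new copy in a large population and small positive s, the result is close to 2s, so even a clearly advantageous mutant is usually lost by drift; only a very large s makes fixation likely.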
biorxiv evolutionary-biology 0-100-users 2019
Individual-Specific fMRI-Subspaces Improve Functional Connectivity Prediction of Behavior Supplemental, bioRxiv, 2019-01-10
There is significant interest in using resting-state functional connectivity (RSFC) to predict human behavior. Good behavioral prediction should in theory require RSFC to be sufficiently distinct across participants; if RSFC were the same across participants, then behavioral prediction would obviously be poor. Therefore, we hypothesize that removing common resting-state functional magnetic resonance imaging (rs-fMRI) signals that are shared across participants would improve behavioral prediction. Here, we considered 803 participants from the Human Connectome Project (HCP) with four rs-fMRI runs. We applied the common and orthogonal basis extraction (COBE) technique to decompose each HCP run into two subspaces: a common (group-level) subspace shared across all participants and a subject-specific subspace. We found that the first common COBE component of the first HCP run was localized to the visual cortex and was unique to the run. On the other hand, the second common COBE component of the first HCP run and the first common COBE component of the remaining HCP runs were highly similar and localized to regions within the default network, including the posterior cingulate cortex and precuneus. Overall, this suggests the presence of run-specific (state-specific) effects that were shared across participants. By removing the first and second common COBE components from the first HCP run, and the first common COBE component from the remaining HCP runs, the resulting RSFC improves behavioral prediction by an average of 11.7% across 58 behavioral measures spanning cognition, emotion and personality.
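The idea of splitting each run into a shared subspace and a subject-specific remainder can be sketched as follows. This is not the COBE algorithm: as a simpler stand-in, the shared basis is estimated here from an SVD of the column-wise concatenated data, and it is then projected out of every participant's time series. The function name, the toy data, and the choice of one common component are assumptions made for illustration only.

```python
import numpy as np


def remove_common_subspace(runs, n_common=1):
    """Project a shared temporal subspace out of each participant's data.

    runs     : list of arrays, each (timepoints x regions), with the same
               number of timepoints for every participant
    n_common : number of shared components to remove
    Returns the orthonormal common basis and the subject-specific residuals.
    """
    # Center each region's time series so the mean is not treated as signal.
    centered = [y - y.mean(axis=0, keepdims=True) for y in runs]
    # Stand-in for COBE: left singular vectors of the concatenated data.
    stacked = np.hstack(centered)  # timepoints x (participants * regions)
    u, _, _ = np.linalg.svd(stacked, full_matrices=False)
    basis = u[:, :n_common]  # orthonormal columns spanning the shared part
    # Subtract the projection onto the shared basis from each participant.
    residuals = [y - basis @ (basis.T @ y) for y in centered]
    return basis, residuals


# Toy data: one shared time course plus participant-specific noise.
rng = np.random.default_rng(0)
shared = np.sin(np.linspace(0, 6 * np.pi, 200))[:, None]  # 200 x 1
runs = [
    3.0 * (shared @ rng.normal(size=(1, 10))) + rng.normal(size=(200, 10))
    for _ in range(5)
]
basis, residuals = remove_common_subspace(runs, n_common=1)
# Functional connectivity could then be computed per participant, e.g.
# np.corrcoef(residuals[0], rowvar=False) gives a regions x regions matrix.
```

By construction the residuals are orthogonal to the removed basis, so any connectivity computed from them no longer reflects the shared component.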
biorxiv neuroscience 0-100-users 2019