Insights about variation in meiosis from 31,228 human sperm genomes, bioRxiv, 2019-05-02
Abstract: Meiosis, while critical for reproduction, is also highly variable and error-prone: crossover rates vary among humans and individual gametes, and chromosome nondisjunction leads to aneuploidy, a leading cause of miscarriage. To study variation in meiotic outcomes within and across individuals, we developed a way to sequence many individual sperm genomes at once. We used this method to sequence the genomes of 31,228 gametes from 20 sperm donors, identifying 813,122 crossovers, 787 aneuploid chromosomes, and unexpected genomic anomalies. Different sperm donors varied four-fold in the frequency of aneuploid sperm, and aneuploid chromosomes gained in meiosis I had 36% fewer crossovers than corresponding non-aneuploid chromosomes. Diverse recombination phenotypes were surprisingly coordinated: donors with high average crossover rates also made a larger fraction of their crossovers in centromere-proximal regions and placed their crossovers closer together. These same relationships were also evident in the variation among individual gametes from the same donor: sperm with more crossovers tended to have made crossovers closer together and in centromere-proximal regions. Variation in the physical compaction of chromosomes could help explain this coordination of meiotic variation across chromosomes, gametes, and individuals.
biorxiv genomics 100-200-users 2019
Tmem119-EGFP and Tmem119-CreERT2 transgenic mice for labeling and manipulating microglia, bioRxiv, 2019-05-02
Abstract: Microglia are specialized brain-resident macrophages with important functions in health and disease. To improve our understanding of these cells, the research community needs genetic tools to identify and control them in a manner that distinguishes them from closely related cell types. We have targeted the recently discovered microglia-specific Tmem119 gene to generate knock-in mice expressing EGFP (JAX#031823) or CreERT2 (JAX#031820) for the identification and manipulation of microglia, respectively. Genetic characterization of the locus and qPCR-based analysis demonstrate correct positioning of the transgenes and intact expression of endogenous Tmem119 in the knock-in mouse models. Immunofluorescence analysis further shows that parenchymal microglia, but not other brain macrophages, are completely and faithfully labeled in the EGFP-line at different time points of development. Flow cytometry indicates highly selective expression of EGFP in CD11b+CD45lo microglia. Similarly, immunofluorescence and flow cytometry analyses using a Cre-dependent reporter mouse line demonstrate activity of CreERT2 primarily in microglia upon tamoxifen administration with the caveat of activity in leptomeningeal cells. Finally, flow cytometric analyses reveal absence of EGFP expression and minimal activity of CreERT2 in blood monocytes of the Tmem119-EGFP and Tmem119-CreERT2 lines, respectively. These new transgenic lines extend the microglia toolbox by providing the currently most specific genetic labeling and control over these cells in the myeloid compartment of mice.
Significance statement: Tools that specifically label and manipulate only microglia are currently unavailable, but are critically needed to further our understanding of this cell type. Complementing and significantly extending recently introduced microglia-specific immunostaining methods that have quickly become a new standard in the field, we generated two mouse lines that label and control gene expression in microglia with high specificity and made them publicly available. Using these readily accessible mice, the research community will be able to study microglia biology with improved specificity.
biorxiv neuroscience 0-100-users 2019
Are place cells just memory cells? Memory compression leads to spatial tuning and history dependence, bioRxiv, 2019-05-01
Abstract: The observation of place cells has suggested that the hippocampus plays a special role in encoding spatial information. However, place cell responses are modulated by several non-spatial variables and are reported to be rather unstable. Here we propose a memory model of the hippocampus that provides a novel interpretation of place cells consistent with these observations. We hypothesize that the hippocampus is a memory device that takes advantage of the correlations between sensory experiences to generate compressed representations of the episodes that are stored in memory. A simple neural network model that can efficiently compress information naturally produces place cells that are similar to those observed in experiments. It predicts that the activity of these cells is variable and that the fluctuations of the place fields encode information about the recent history of sensory experiences. Place cells may simply be a consequence of a memory compression process implemented in the hippocampus.
biorxiv neuroscience 0-100-users 2019
Matryoshka RNA virus 1: a novel RNA virus associated with Plasmodium parasites in human malaria, bioRxiv, 2019-05-01
Abstract: Parasites of the genus Plasmodium cause human malaria. Yet nothing is known about the viruses that infect these divergent eukaryotes. We investigated the Plasmodium virome by performing a meta-transcriptomic analysis of blood samples from malaria patients infected with P. vivax, P. falciparum or P. knowlesi. This revealed a novel bi-segmented narna-like RNA virus restricted to P. vivax and named Matryoshka RNA virus 1 (MaRNAV-1) to reflect its “Russian doll” nature: a virus, infecting a parasite, infecting an animal. MaRNAV-1 was abundant in geographically diverse P. vivax from humans and mosquitoes. Notably, a related virus (MaRNAV-2) was identified in Australian birds infected with a Leucocytozoon - eukaryotic parasites that group with Plasmodium in the Apicomplexa subclass hematozoa. This is the first report of a Plasmodium virus. As well as broadening our understanding of the eukaryotic virosphere, the restriction to P. vivax may help understand P. vivax-specific biology in humans and mosquitoes.
biorxiv evolutionary-biology 100-200-users 2019
Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences, bioRxiv, 2019-04-30
Abstract: In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In biology, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Learning the natural distribution of evolutionary protein sequence variation is a logical step toward predictive and generative modeling for biology. To this end we use unsupervised learning to train a deep contextual language model on 86 billion amino acids across 250 million sequences spanning evolutionary diversity. The resulting model maps raw sequences to representations of biological properties without labels or prior domain knowledge. The learned representation space organizes sequences at multiple levels of biological granularity from the biochemical to proteomic levels. Unsupervised learning recovers information about protein structure: secondary structure and residue-residue contacts can be identified by linear projections from the learned representations. Training language models on full sequence diversity rather than individual protein families increases recoverable information about secondary structure. The unsupervised models can be adapted with supervision from quantitative mutagenesis data to predict variant activity. Predictions from sequences alone are comparable to results from a state-of-the-art model of mutational effects that uses evolutionary and structurally derived features.
biorxiv synthetic-biology 200-500-users 2019
Long-Term Exposure to Elevated Lipoprotein(a) Levels, Parental Lifespan and Risk of Mortality, bioRxiv, 2019-04-30
Abstract: Background: Elevated Lipoprotein(a) (Lp[a]) levels are associated with a broad range of atherosclerotic cardiovascular diseases (CVD). The impact of high Lp(a) levels on human longevity is, however, controversial. Our objectives were to determine whether genetically-determined Lp(a) levels are associated with parental lifespan and to assess the association between measured and genetically-determined Lp(a) levels and long-term all-cause and cardiovascular mortality.
Methods: We determined the association between a genetic risk score of 26 single nucleotide polymorphisms weighted for their impact on Lp(a) levels (wGRS) and parental lifespan (at least one long-lived parent; father still alive and older than 90 or father’s age of death ≥90 or mother still alive and older than 93 or mother’s age of death ≥93) in 139,362 participants from the UK Biobank. A total of 17,686 participants were considered as having high parental lifespan. We also investigated the association between Lp(a) levels and all-cause and cardiovascular mortality in 18,720 participants from the EPIC-Norfolk study.
Results: In the UK Biobank, increases in the wGRS (weighted for a 50 mg/dL increase in Lp(a) levels) were inversely associated with a high parental lifespan (odds ratio=0.92, 95% confidence interval [CI]=0.89-0.94, p=2.7×10⁻⁸). During the 20-year follow-up of the EPIC-Norfolk study, 5686 participants died (2412 from CVD-related causes). Compared to participants with Lp(a) levels <50 mg/dL, those with Lp(a) levels ≥50 mg/dL had an increased hazard ratio (HR) for all-cause (HR=1.17, 95% CI=1.08-1.27) and cardiovascular (HR=1.54, 95% CI=1.37-1.72) mortality. Compared to individuals with Lp(a) levels below the 50th percentile of the Lp(a) distribution (in whom event rates were 29.8% and 11.3%, respectively, for all-cause and cardiovascular mortality), those with Lp(a) levels equal to or above the 95th percentile of the population distribution (≥70 mg/dL) had HRs of 1.22 (95% CI=1.09-1.37, event rate 37.5%) and 1.71 (95% CI=1.46-2.00, event rate 20.0%) for all-cause mortality and cardiovascular mortality, respectively.
Conclusions: Results of this study suggest a potentially causal effect of Lp(a) on human longevity, support the use of parental lifespan as a tool to study the genetic determinants of human longevity, and provide a rationale for a trial of Lp(a)-lowering therapy in individuals with high Lp(a) levels.
biorxiv genetics 0-100-users 2019