A risk-reward tradeoff of high ribosome production in proliferating cells, bioRxiv, 2018-10-31
To achieve maximal growth, cells must manage a massive economy of ribosomal proteins (r-proteins) and RNAs (rRNAs), which are required to produce thousands of new ribosomes every minute. Although ribosomes are essential in all cells, disruptions to ribosome biogenesis lead to heterogeneous phenotypes. Here, we modeled these perturbations in Saccharomyces cerevisiae and show that challenges to ribosome biogenesis result immediately in acute loss of proteostasis (protein folding homeostasis). Imbalances in the synthesis of r-proteins and rRNAs lead to the rapid aggregation of newly synthesized orphan r-proteins and compromise essential cellular processes. In response, proteostasis genes are activated by an Hsf1-dependent stress response pathway that is required for recovery from r-protein assembly stress. Importantly, we show that exogenously bolstering the proteostasis network increases cellular fitness in the face of challenges to ribosome assembly, demonstrating the direct contribution of orphan r-proteins to cellular phenotypes. Our results highlight ribosome assembly as a linchpin of cellular homeostasis, representing a key proteostasis vulnerability for rapidly proliferating cells that may be compromised by diverse genetic, environmental, and xenobiotic conditions that generate orphan r-proteins.
biorxiv cell-biology 0-100-users 2018
Data Denoising with transfer learning in single-cell transcriptomics, bioRxiv, 2018-10-31
Single-cell RNA sequencing (scRNA-seq) data is noisy and sparse. Here, we show that transfer learning across datasets remarkably improves data quality. By coupling a deep autoencoder with a Bayesian model, SAVER-X extracts transferable gene-gene relationships across data from different labs, varying conditions, and divergent species to denoise new target datasets.
biorxiv bioinformatics 100-200-users 2018
Immediate visualization of recombination events and chromosome segregation defects in fission yeast meiosis, bioRxiv, 2018-10-31
Schizosaccharomyces pombe, also known as fission yeast, is an established model for studying chromosome biological processes. Over the years, research employing fission yeast has made important contributions to our knowledge about chromosome segregation during meiosis, as well as meiotic recombination and its regulation. Quantification of meiotic recombination frequency is not a straightforward undertaking, either requiring viable progeny for a genetic plating assay, or relying on laborious Southern blot analysis of recombination intermediates. Neither of these methods lends itself to high-throughput screens to identify novel meiotic factors. Here, we establish visual assays novel to Sz. pombe for characterizing chromosome segregation and meiotic recombination phenotypes. Genes expressing red, yellow, and/or cyan fluorophores from spore-autonomous promoters have been integrated into the fission yeast genome, either close to the centromere of chromosome I to monitor chromosome segregation, or on the arm of chromosome III to form a genetic interval at which recombination frequency can be determined. The visual recombination assay allows straightforward and immediate assessment of the genetic outcome of a single meiosis by epi-fluorescence microscopy without requiring tetrad dissection. We also demonstrate that the recombination frequency analysis can be automated by utilizing imaging flow cytometry to enable high-throughput screens. These assays have several advantages over traditional methods for analysing meiotic phenotypes.
biorxiv genetics 0-100-users 2018
Personalized and graph genomes reveal missing signal in epigenomic data, bioRxiv, 2018-10-31
Background: Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesized that using a generic reference could lead to incorrectly mapped reads and bias downstream results. Results: We show that accounting for genetic variation using a modified reference genome (MPG) or a de novo assembled genome (DPG) can alter histone H3K4me1 and H3K27ac ChIP-seq peak calls by either creating new personal peaks or by the loss of reference peaks. MPGs are found to alter approximately 1% of peak calls while DPGs alter up to 5% of peaks. We also show statistically significant differences in the number of reads observed in regions associated with the new, altered and unchanged peaks. We report that short insertions and deletions (indels), followed by single nucleotide variants (SNVs), have the highest probability of modifying peak calls. A counter-balancing factor is peak width, with wider calls being less likely to be altered. Next, because high-quality DPGs remain hard to obtain, we show that using a graph personalized genome (GPG) represents a reasonable compromise between MPGs and DPGs and alters about 2.5% of peak calls. Finally, we demonstrate that altered peaks have a genomic distribution typical of other peaks. For instance, for H3K4me1, 518 personal-only peaks were replicated using at least two of three approaches, 394 of which were inside or within 10 kb of a gene. Conclusions: Analysing epigenomic datasets with personalized and graph genomes allows the recovery of new peaks enriched for indels and SNVs. These altered peaks are more likely to differ between individuals and, as such, could be relevant in the study of various human phenotypes.
biorxiv bioinformatics 100-200-users 2018
The emergence of multiple retinal cell types through efficient coding of natural movies, bioRxiv, 2018-10-31
One of the most striking aspects of early visual processing in the retina is the immediate parcellation of visual information into multiple parallel pathways, formed by different retinal ganglion cell types each tiling the entire visual field. Existing theories of efficient coding have been unable to account for the functional advantages of such cell-type diversity in encoding natural scenes. Here we go beyond previous theories to analyze how a simple linear retinal encoding model with different convolutional cell types efficiently encodes naturalistic spatiotemporal movies given a fixed firing rate budget. We find that optimizing the receptive fields and cell densities of two cell types makes them match the properties of the two main cell types in the primate retina, midget and parasol cells, in terms of spatial and temporal sensitivity, cell spacing, and their relative ratio. Moreover, our theory gives a precise account of how the ratio of midget to parasol cells decreases with retinal eccentricity. Also, we train a nonlinear encoding model with a rectifying nonlinearity to efficiently encode naturalistic movies, and again find emergent receptive fields resembling those of midget and parasol cells that are now further subdivided into ON and OFF types. Thus our work provides a theoretical justification, based on the efficient coding of natural movies, for the existence of the four most dominant cell types in the primate retina that together comprise 70% of all ganglion cells.
biorxiv neuroscience 0-100-users 2018
Transfer learning in single-cell transcriptomics improves data denoising and pattern discovery, bioRxiv, 2018-10-31
Although single-cell RNA sequencing (scRNA-seq) technologies have shed light on the role of cellular diversity in human pathophysiology [1–3], the resulting data remains noisy and sparse, making reliable quantification of gene expression challenging. Here, we show that a deep autoencoder coupled to a Bayesian model remarkably improves UMI-based scRNA-seq data quality by transfer learning across datasets. This new technology, SAVER-X, outperforms existing state-of-the-art tools. The deep learning model in SAVER-X extracts transferable gene expression features across data from different labs, generated by varying technologies, and obtained from divergent species. Through this framework, we explore the limits of transfer learning in a diverse testbed and demonstrate that future human sequencing projects will unequivocally benefit from the accumulation of publicly available data. We further show, through examples in immunology and neurodevelopment, that SAVER-X can harness existing public data to enhance downstream analysis of new data, such as those collected in clinical settings.
biorxiv bioinformatics 100-200-users 2018