Charting a tissue from single-cell transcriptomes, bioRxiv, 2018-10-30
Abstract: Massively multiplexed sequencing of RNA in individual cells is transforming basic and clinical life sciences. However, in standard experiments, tissues must first be dissociated. Thus, after sequencing, information about the spatial relationships between cells is lost, although this knowledge is crucial for understanding cellular and tissue-level function. Recent attempts to overcome this fundamental challenge rely on additional in situ gene expression imaging data, which can guide spatial mapping of sequenced cells. Here we present a conceptually different approach that allows the spatial positions of cells to be reconstructed in a variety of tissues without using reference imaging data. We first show for several complex biological systems that distances between single cells in expression space increase monotonically with their physical distances across tissues. We therefore seek to map cells to tissue space such that this principle is optimally preserved, while matching existing imaging data when available. We show that this optimization problem can be cast as a generalized optimal transport problem and solved efficiently. We apply our approach successfully to reconstruct the mammalian liver and intestinal epithelium as well as fly and zebrafish embryos. Our results demonstrate a simple spatial expression organization principle and show that this principle (or future refined principles) can be used to infer, for individual cells, meaningful spatial position probabilities from the sequencing data alone.
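As a rough illustration of what casting cell placement as an optimal transport problem can look like, the sketch below runs plain entropy-regularized optimal transport (Sinkhorn iterations) on a toy cost matrix. It is not the authors' generalized formulation: it covers only the simpler case where a cell-to-site mismatch cost is already given (for instance from reference imaging data), and the function name, the parameters `epsilon` and `n_iters`, and the toy cost are all assumptions made for this example.

```python
import numpy as np


def sinkhorn_plan(cost, cell_mass, site_mass, epsilon=0.1, n_iters=500):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    cost[i, j] is a hypothetical mismatch between cell i and tissue site j.
    Returns a coupling matrix whose rows sum to cell_mass and whose
    columns sum (up to convergence) to site_mass.
    """
    kernel = np.exp(-cost / epsilon)
    u = np.ones_like(cell_mass)
    for _ in range(n_iters):
        v = site_mass / (kernel.T @ u)  # rescale to match column marginals
        u = cell_mass / (kernel @ v)    # rescale to match row marginals
    return u[:, None] * kernel * v[None, :]


# Toy data (invented for illustration): 3 cells, 3 sites on a line,
# where cell i is cheapest to place at site i.
cost = np.abs(np.subtract.outer(np.arange(3.0), np.arange(3.0)))
mass = np.full(3, 1.0 / 3.0)

plan = sinkhorn_plan(cost, mass, mass)

# Normalizing each row gives a per-cell probability distribution over sites.
position_probs = plan / plan.sum(axis=1, keepdims=True)
```

Each row of `position_probs` can be read as a spatial position probability distribution for one cell, which is the kind of output the abstract describes. The reference-free part of the method, which preserves pairwise distance structure rather than a fixed cost, would need a different objective than the one shown here.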
biorxiv systems-biology 100-200-users 2018
A deep proteome and transcriptome abundance atlas of 29 healthy human tissues, bioRxiv, 2018-06-27
Abstract: Genome-, transcriptome- and proteome-wide measurements provide valuable insights into how biological systems are regulated. However, even fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we have generated a systematic, quantitative and deep proteome and transcriptome abundance atlas from 29 paired healthy human tissues from the Human Protein Atlas Project, representing human genes by 17,615 transcripts and 13,664 proteins. The analysis revealed that few proteins show truly tissue-specific expression, that vast differences between mRNA and protein quantities exist within and across tissues, and that the expression levels of proteins are often more stable across tissues than those of transcripts. In addition, only ~2% of all exome variants and ~7% of all mRNA variants could be confidently detected at the protein level, showing that proteogenomics remains challenging, requires rigorous validation using synthetic peptides and needs more sophisticated computational methods. Many uses of this resource can be envisaged, ranging from the study of gene/protein expression regulation to protein biomarker specificity evaluation, to name a few.
biorxiv systems-biology 200-500-users 2018
Transcriptional burst initiation and polymerase pause release are key control points of transcriptional regulation, bioRxiv, 2018-03-03
Abstract: Transcriptional regulation occurs via changes to the rates of various biochemical processes. Sequencing-based approaches that average together many cells have suggested that polymerase binding and polymerase release from promoter-proximal pausing are two key regulated steps in the transcriptional process. However, single-cell studies have revealed that transcription occurs in short, discontinuous bursts, suggesting that transcriptional burst initiation and termination might also be regulated steps. Here, we develop and apply a quantitative framework to connect changes in both Pol II ChIP-seq and single-cell transcriptional measurements to changes in the rates of specific steps of transcription. Using a number of global and targeted transcriptional regulatory perturbations, we show that burst initiation rate is indeed a key regulated step, demonstrating that transcriptional activity can be frequency modulated. Polymerase pause release is a second key regulated step, but the rate of polymerase binding is not changed by any of the biological perturbations we examined. Our results establish an important role for transcriptional burst regulation in the control of gene expression.
biorxiv systems-biology 100-200-users 2018
The interaction landscape between transcription factors and the nucleosome, bioRxiv, 2017-12-29
Nucleosomes cover most of the genome and are thought to be displaced by transcription factors (TFs) in regions that direct gene expression. However, the modes of interaction between TFs and nucleosomal DNA remain largely unknown. Here, we use nucleosome consecutive affinity-purification systematic evolution of ligands by exponential enrichment (NCAP-SELEX) to systematically explore interactions between the nucleosome and 220 TFs representing diverse structural families. Consistent with earlier observations, we find that the vast majority of TFs have less access to nucleosomal DNA than to free DNA. The motifs recovered from TFs bound to nucleosomal and free DNA are generally similar; however, steric hindrance and scaffolding by the nucleosome result in specific positioning and orientation of the motifs. Many TFs preferentially bind close to the end of nucleosomal DNA, or to periodic positions at its solvent-exposed side. TFs often also bind nucleosomal DNA in a particular orientation, because the nucleosome breaks the local rotational symmetry of DNA. Some TFs also specifically interact with DNA located at the dyad position where only one DNA gyre is wound, whereas other TFs prefer sites spanning two DNA gyres and bind specifically to each of them. Our work reveals striking differences in TF binding to free and nucleosomal DNA, and uncovers a rich interaction landscape between the TFs and the nucleosome.
biorxiv systems-biology 100-200-users 2017
The null additivity of multi-drug combinations, bioRxiv, 2017-12-26
Abstract: From natural ecology [1–4] to clinical therapy [5–8], cells are often exposed to mixtures of multiple drugs. Two competing null models are used to predict the combined effect of drugs: response additivity (Bliss) and dosage additivity (Loewe) [9–11]. Here, noting that these models diverge as the number of drugs increases, we contrast their predictions with measurements of Escherichia coli growth under combinations of up to 10 different antibiotics. As the number of drugs increases, Bliss maintains accuracy while Loewe systematically loses its predictive power. The total dosage required for growth inhibition, which Loewe predicts should be fixed, steadily increases with the number of drugs, following a square root scaling. This scaling is explained by an approximation to Bliss where, inspired by RA Fisher's classical geometric model [12], dosages of independent drugs add up as orthogonal vectors rather than linearly. This dose-orthogonality approximation provides results similar to Bliss, yet uses the same dosage language as Loewe and is hence easier to implement and intuit. The rejection of dosage additivity in favor of effect additivity and dosage orthogonality provides a framework for understanding how multiple drugs and stressors add up in nature and the clinic.
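The contrast between the null models can be made concrete with a small sketch. It assumes that each dose is normalized by the dose of that drug which alone produces the target inhibition, and that the drugs are mixed in equal normalized amounts. The function names and this normalization are illustrative assumptions, not taken from the preprint.

```python
import math


def bliss_growth(relative_growths):
    """Bliss (response additivity): relative growth under a combination is
    the product of the relative growths under each drug alone."""
    return math.prod(relative_growths)


def loewe_index(normalized_doses):
    """Loewe (dosage additivity): normalized doses add linearly;
    the target inhibition is reached when the index equals 1."""
    return sum(normalized_doses)


def orthogonal_index(normalized_doses):
    """Dose orthogonality: normalized doses add like orthogonal vectors,
    so the index is the Euclidean norm of the dose vector."""
    return math.sqrt(sum(d * d for d in normalized_doses))


def total_dose_at_threshold(n_drugs, index):
    """Total normalized dose at which an equal-dose mix of n_drugs reaches
    index == 1. Both indices scale linearly with dose, so the per-drug
    dose is 1 / index(unit doses)."""
    per_drug = 1.0 / index([1.0] * n_drugs)
    return n_drugs * per_drug


for n in (1, 4, 9):
    print(n,
          total_dose_at_threshold(n, loewe_index),
          total_dose_at_threshold(n, orthogonal_index))
```

Under these assumptions the Loewe total stays at 1 for any number of drugs, while the orthogonal total grows as the square root of the number of drugs, which is the scaling the abstract reports.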
biorxiv systems-biology 100-200-users 2017
A deep mutational scan of an acidic activation domain, bioRxiv, 2017-12-09
Abstract: Transcriptional activation domains are intrinsically disordered peptides with little primary sequence conservation. These properties have made it difficult to identify the sequence features that define activation domains. For example, although acidic activation domains were discovered 30 years ago, we still do not know what role, if any, acidic residues play in these peptides. To address this question, we designed a rational mutagenesis scheme to independently test four sequence features theorized to control the strength of activation domains: acidity (negative charge), hydrophobicity, intrinsic disorder, and short linear motifs. To test enough mutants to deconvolve these four features, we developed a method to quantify the activities of thousands of activation domain variants in parallel. Our results with Gcn4, a classic acidic activation domain, suggest that acidic residues in particular regions keep two hydrophobic motifs exposed to solvent. We also found that the specific activity of the Gcn4 activation domain increases during amino acid starvation. Our results suggest that Gcn4 may have evolved to have low activity but high inducibility. Our results also demonstrate that high-throughput rational mutation scans will be powerful tools for unraveling the properties that control how intrinsically disordered proteins function.
biorxiv systems-biology 0-100-users 2017