Single-cell RNA-sequencing reveals profibrotic roles of distinct epithelial and mesenchymal lineages in pulmonary fibrosis, bioRxiv, 2019-09-07
Abstract: Pulmonary fibrosis (PF) is a form of chronic lung disease characterized by pathologic epithelial remodeling and accumulation of extracellular matrix (ECM). In order to comprehensively define the cell types, mechanisms and mediators driving fibrotic remodeling in lungs with pulmonary fibrosis, we performed single-cell RNA-sequencing of single-cell suspensions from 10 non-fibrotic control and 20 PF lungs. Analysis of 114,396 cells identified 31 distinct cell types. We report that a remarkable shift in epithelial cell phenotypes occurs in the peripheral lung in PF, and identify several previously unrecognized epithelial cell phenotypes, including a KRT5−KRT17+, pathologic ECM-producing epithelial cell population that was highly enriched in PF lungs. Multiple fibroblast subtypes were observed to contribute to ECM expansion in a spatially-discrete manner. Together these data provide high-resolution insights into the complexity and plasticity of the distal lung epithelium in human disease, and indicate that a diversity of epithelial and mesenchymal cells contribute to pathologic lung fibrosis.
One Sentence Summary: Single-cell RNA-sequencing provides new insights into pathologic epithelial and mesenchymal remodeling in the human lung.
biorxiv genomics 100-200-users 2019
Single Cell RNA-seq reveals ectopic and aberrant lung resident cell populations in Idiopathic Pulmonary Fibrosis, bioRxiv, 2019-09-06
Abstract: We provide a single cell atlas of Idiopathic Pulmonary Fibrosis (IPF), a fatal interstitial lung disease, focusing on resident lung cell populations. By profiling 312,928 cells from 32 IPF, 29 healthy control and 18 chronic obstructive pulmonary disease (COPD) lungs, we demonstrate that IPF is characterized by changes in discrete subpopulations of cells in the three major parenchymal compartments: the epithelium, endothelium and stroma. Among epithelial cells, we identify a novel population of IPF-enriched aberrant basaloid cells that co-express basal epithelial markers, mesenchymal markers, senescence markers and developmental transcription factors, and are located at the edge of myofibroblast foci in the IPF lung. Among vascular endothelial cells in the IPF lung parenchyma, we identify an expanded cell population transcriptomically identical to vascular endothelial cells normally restricted to the bronchial circulation. We confirm the presence of both populations by immunohistochemistry and independent datasets. Among stromal cells, we identify fibroblasts and myofibroblasts in both control and IPF lungs and leverage manifold-based algorithms (diffusion maps and diffusion pseudotime) to infer the origins of the activated IPF myofibroblast. Our work provides a comprehensive catalogue of the aberrant cellular transcriptional programs in IPF, demonstrates a new framework for analyzing complex disease with scRNAseq, and provides the largest lung disease single-cell atlas to date.
biorxiv genomics 0-100-users 2019
Insight into the genomic history of the Near East from whole-genome sequences and genotypes of Yemenis, bioRxiv, 2019-08-29
Abstract: We report high-coverage whole-genome sequencing data from 46 Yemeni individuals as well as genome-wide genotyping data from 169 Yemenis from diverse locations. We use this dataset to define the genetic diversity in Yemen and how it relates to people elsewhere in the Near East. Yemen is a vast region with substantial cultural and geographic diversity, but we found little genetic structure correlating with geography among the Yemenis, probably reflecting continuous movement of people between the regions. African ancestry from admixture in the past 800 years is widespread in Yemen and is the main contributor to the country’s limited genetic structure, with some individuals in Hudayda and Hadramout having up to 20% of their genetic ancestry from Africa. In contrast, individuals from Maarib appear to have been genetically isolated from the African gene flow and thus have genomes likely to reflect Yemen’s ancestry before the admixture. This ancestry was comparable to the ancestry present during the Bronze Age in the distant Northern regions of the Near East. After the Bronze Age, the South and North of the Near East therefore followed different genetic trajectories: in the North, the Levantines admixed with a Eurasian population carrying steppe ancestry whose impact never reached as far south as Yemen, where people instead admixed with Africans, leading to the genetic structure observed in the Near East today.
biorxiv genomics 0-100-users 2019
A molecular cell atlas of the human lung from single cell RNA sequencing, bioRxiv, 2019-08-27
Abstract: Although single cell RNA sequencing studies have begun providing compendia of cell expression profiles, it has proven more difficult to systematically identify and localize all molecular cell types in individual organs to create a full molecular cell atlas. Here we describe droplet- and plate-based single cell RNA sequencing applied to ∼70,000 human lung and blood cells, combined with a multi-pronged cell annotation approach, which have allowed us to define the gene expression profiles and anatomical locations of 58 cell populations in the human lung, including 41 of 45 previously known cell types or subtypes and 14 new ones. This comprehensive molecular atlas elucidates the biochemical functions of lung cell types and the cell-selective transcription factors and optimal markers for making and monitoring them; defines the cell targets of circulating hormones and predicts local signaling interactions including sources and targets of chemokines in immune cell trafficking and expression changes on lung homing; and identifies the cell types directly affected by lung disease genes. Comparison to mouse identified 17 molecular types that appear to have been gained or lost during lung evolution and others whose expression profiles have been substantially altered, revealing extensive plasticity of cell types and cell-type-specific gene expression during organ evolution including expression switches between cell types. This lung atlas provides the molecular foundation for investigating how lung cell identities, functions, and interactions are achieved in development and tissue engineering and altered in disease and evolution.
biorxiv genomics 100-200-users 2019
Deep learning at base-resolution reveals motif syntax of the cis-regulatory code, bioRxiv, 2019-08-22
Abstract: Genes are regulated through enhancer sequences, in which transcription factor binding motifs and their specific arrangements (syntax) form a cis-regulatory code. To understand the relationship between motif syntax and transcription factor binding, we train a deep learning model that uses DNA sequence to predict base-resolution binding profiles of four pluripotency transcription factors: Oct4, Sox2, Nanog, and Klf4. We interpret the model to accurately map hundreds of thousands of motifs in the genome, learn novel motif representations and identify rules by which motifs and syntax influence transcription factor binding. We find that instances of strict motif spacing are largely due to retrotransposons, but that soft motif syntax influences motif interactions at protein and nucleosome range. Most strikingly, Nanog binding is driven by motifs with a strong preference for ∼10.5 bp spacings corresponding to helical periodicity. Interpreting deep learning models applied to high-resolution binding data is a powerful and versatile approach to uncover the motifs and syntax of cis-regulatory sequences.
biorxiv genomics 100-200-users 2019
A curated database reveals trends in single-cell transcriptomics, bioRxiv, 2019-08-21
The more than 500 single-cell transcriptomics studies that have been published to date constitute a valuable and vast resource for biological discovery. While various “atlas” projects have collated some of the associated datasets, most questions related to specific tissue types, species, or other attributes of studies require identifying papers through a challenging manual literature search. To facilitate discovery with published single-cell transcriptomics data, we have assembled a near exhaustive, manually curated database of single-cell transcriptomics studies with key information: descriptions of the type of data and technologies used, along with descriptors of the biological systems studied. Additionally, the database contains summarized information about analysis in the papers, allowing for analysis of trends in the field. As an example, we show that the number of cell types identified in scRNA-seq studies is proportional to the number of cells analysed. The database is available at www.nxn.se/single-cell-studies/gui.
biorxiv genomics 200-500-users 2019