Cardelino: Integrating whole exomes and single-cell transcriptomes to reveal phenotypic impact of somatic variants, bioRxiv, 2018-09-12

Abstract: Decoding the clonal substructures of somatic tissues sheds light on cell growth, development and differentiation in health, ageing and disease. DNA-sequencing, either using bulk or using single-cell assays, has enabled the reconstruction of clonal trees from frequency and co-occurrence patterns of somatic variants. However, approaches to systematically characterize phenotypic and functional variations between individual clones are not established. Here we present cardelino (https://github.com/PMBio/cardelino), a computational method for inferring the clone of origin of individual cells that have been assayed using single-cell RNA-seq (scRNA-seq). After validating our model using simulations, we apply cardelino to matched scRNA-seq and exome sequencing data from 32 human dermal fibroblast lines, identifying hundreds of differentially expressed genes between cells from different somatic clones. These genes are frequently enriched for cell cycle and proliferation pathways, indicating a key role for cell division genes in non-neutral somatic evolution.

Key findings:
- A novel approach for integrating DNA-seq and single-cell RNA-seq data to reconstruct clonal substructure for single-cell transcriptomes.
- Evidence for non-neutral evolution of clonal populations in human fibroblasts.
- Proliferation and cell cycle pathways are commonly distorted in mutated clonal populations.

biorxiv genomics 100-200-users 2018

Resource: Scalable whole genome sequencing of 40,000 single cells identifies stochastic aneuploidies, genome replication states and clonal repertoires, bioRxiv, 2018-09-07

Summary: Essential features of cancer tissue cellular heterogeneity such as negatively selected genome topologies, sub-clonal mutation patterns and genome replication states can only effectively be studied by sequencing single-cell genomes at scale and high fidelity. Using an amplification-free single-cell genome sequencing approach implemented on commodity hardware (DLP+) coupled with a cloud-based computational platform, we define a resource of 40,000 single-cell genomes characterized by their genome states, across a wide range of tissue types and conditions. We show that shallow sequencing across thousands of genomes permits reconstruction of clonal genomes to single nucleotide resolution through aggregation analysis of cells sharing higher order genome structure. From large-scale population analysis over thousands of cells, we identify rare cells exhibiting mitotic mis-segregation of whole chromosomes. We observe that tissue-derived scWGS libraries exhibit lower rates of whole chromosome aneuploidy than cell lines, and that loss of p53 results in a shift in event type, but not overall prevalence, in breast epithelium. Finally, we demonstrate that the replication states of genomes can be identified, allowing the number and proportion of replicating cells, as well as the chromosomal pattern of replication, to be unambiguously identified in single-cell genome sequencing experiments. The combined annotated resource and approach provide a re-implementable large-scale platform for studying lineages and tissue heterogeneity.

biorxiv genomics 100-200-users 2018

 

Created with the audiences framework by Jedidiah Carlson
