UpSetR: An R Package for the Visualization of Intersecting Sets and their Properties, bioRxiv, 2017-03-26
Venn and Euler diagrams are a popular yet inadequate solution for quantitative visualization of set intersections. A scalable alternative to Venn and Euler diagrams for visualizing intersecting sets and their properties is needed. We developed UpSetR, an open source R package that employs a scalable matrix-based visualization to show intersections of sets, their size, and other properties. UpSetR is available at https://cran.r-project.org/package=UpSetR and released under the MIT License. A Shiny app is available at https://gehlenborglab.shinyapps.io/upsetr.
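The matrix-based layout mentioned in the abstract can be illustrated with a small sketch. This is not UpSetR's implementation (UpSetR is an R package); it is a minimal Python illustration under the assumption that each row of the matrix corresponds to one exact combination of sets and reports how many elements belong to precisely that combination. The function names `exclusive_intersections` and `render`, and the example sets, are hypothetical.

```python
from collections import Counter


def exclusive_intersections(sets):
    """Count elements by the exact combination of sets they belong to.

    `sets` maps a set name to a Python set of elements. The result maps a
    tuple of set names (sorted alphabetically) to the number of elements
    that are members of exactly those sets and no others.
    """
    names = sorted(sets)
    universe = set().union(*sets.values())
    counts = Counter()
    for element in universe:
        membership = tuple(name for name in names if element in sets[name])
        counts[membership] += 1
    return counts


def render(sets):
    """Return text rows: one membership matrix row per combination, plus its size.

    'x' marks that a set takes part in the combination, '.' that it does not.
    Rows are ordered by decreasing size, then by combination name.
    """
    names = sorted(sets)
    counts = exclusive_intersections(sets)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    for combo, size in ordered:
        marks = " ".join("x" if name in combo else "." for name in names)
        rows.append(f"{marks}  {size}")
    return rows


# Hypothetical example data, not taken from the paper.
example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 5, 6, 7}}

if __name__ == "__main__":
    print("  ".join(sorted(example)))
    for row in render(example):
        print(row)
```

Because every element falls into exactly one row, the row sizes sum to the size of the union, which is what lets this kind of layout scale to more sets than a Venn diagram can show legibly.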
biorxiv bioinformatics 200-500-users 2017
DroNc-Seq: Deciphering cell types in human archived brain tissues by massively-parallel single nucleus RNA-seq, bioRxiv, 2017-03-10
Single nucleus RNA-Seq (sNuc-Seq) profiles RNA from tissues that are preserved or cannot be dissociated, but does not provide the throughput required to analyse many cells from complex tissues. Here, we develop DroNc-Seq, massively parallel sNuc-Seq with droplet technology. We profile 29,543 nuclei from mouse and human archived brain samples to demonstrate sensitive, efficient and unbiased classification of cell types, paving the way for charting systematic cell atlases.
biorxiv genomics 200-500-users 2017
Genomic analysis of family data reveals additional genetic effects on intelligence and personality, bioRxiv, 2017-02-07
Pedigree-based analyses of intelligence have reported that genetic differences account for 50-80% of the phenotypic variation. For personality traits these effects are smaller, with 34-48% of the variance being explained by genetic differences. However, molecular genetic studies using unrelated individuals typically report a heritability estimate of around 30% for intelligence and between 0% and 15% for personality variables. Pedigree-based estimates and molecular genetic estimates may differ because current genotyping platforms are poor at tagging causal variants, variants with low minor allele frequency, copy number variants, and structural variants. Using ∼20 000 individuals in the Generation Scotland family cohort genotyped for ∼700 000 single nucleotide polymorphisms (SNPs), we exploit the high levels of linkage disequilibrium (LD) found in members of the same family to quantify the total effect of genetic variants that are not tagged in GWASs of unrelated individuals. In our models, genetic variants in low LD with genotyped SNPs explain over half of the genetic variance in intelligence, education, and neuroticism. By capturing these additional genetic effects our models closely approximate the heritability estimates from twin studies for intelligence and education, but not for neuroticism and extraversion. We then replicated our finding using imputed molecular genetic data from unrelated individuals to show that ∼50% of differences in intelligence, and ∼40% of the differences in education, can be explained by genetic effects when a larger number of rare SNPs are included. From an evolutionary genetic perspective, a substantial contribution of rare genetic variants to individual differences in intelligence and education is consistent with mutation-selection balance.
biorxiv genetics 200-500-users 2017
Comprehensive single cell transcriptional profiling of a multicellular organism by combinatorial indexing, bioRxiv, 2017-02-03
Conventional methods for profiling the molecular content of biological samples fail to resolve heterogeneity that is present at the level of single cells. In the past few years, single cell RNA sequencing has emerged as a powerful strategy for overcoming this challenge. However, its adoption has been limited by a paucity of methods that are at once simple to implement and cost effective to scale massively. Here, we describe a combinatorial indexing strategy to profile the transcriptomes of large numbers of single cells or single nuclei without requiring the physical isolation of each cell (Single cell Combinatorial Indexing RNA-seq or sci-RNA-seq). We show that sci-RNA-seq can be used to efficiently profile the transcriptomes of tens of thousands of single cells per experiment, and demonstrate that we can stratify cell types from these data. Key advantages of sci-RNA-seq over contemporary alternatives such as droplet-based single cell RNA-seq include sublinear cost scaling, a reliance on widely available reagents and equipment, the ability to concurrently process many samples within a single workflow, compatibility with methanol fixation of cells, cell capture based on DNA content rather than cell size, and the flexibility to profile either cells or nuclei. As a demonstration of sci-RNA-seq, we profile the transcriptomes of 42,035 single cells from C. elegans at the L2 stage, effectively 50-fold “shotgun cellular coverage” of the somatic cell composition of this organism at this stage. We identify 27 distinct cell types, including rare cell types such as the two distal tip cells of the developing gonad, estimate consensus expression profiles and define cell-type specific and selective genes. Given that C. elegans is the only organism with a fully mapped cellular lineage, these data represent a rich resource for future methods aimed at defining cell types and states. They will advance our understanding of developmental biology, and constitute a major step towards a comprehensive, single-cell molecular atlas of a whole animal.
biorxiv genomics 200-500-users 2017
Mass-spectrometry of single mammalian cells quantifies proteome heterogeneity during cell differentiation, bioRxiv, 2017-01-25
Cellular heterogeneity is important to biological processes, including cancer and development. However, proteome heterogeneity is largely unexplored because of the limitations of existing methods for quantifying protein levels in single cells. To alleviate these limitations, we developed Single Cell ProtEomics by Mass Spectrometry (SCoPE-MS), and validated its ability to identify distinct human cancer cell types based on their proteomes. We used SCoPE-MS to quantify over a thousand proteins in differentiating mouse embryonic stem (ES) cells. The single-cell proteomes enabled us to deconstruct cell populations and infer protein abundance relationships. Comparison between single-cell proteomes and transcriptomes indicated coordinated mRNA and protein covariation. Yet many genes exhibited functionally concerted and distinct regulatory patterns at the mRNA and the protein levels, suggesting that post-transcriptional regulatory mechanisms contribute to proteome remodeling during lineage specification, especially for developmental genes. SCoPE-MS is broadly applicable to measuring proteome configurations of single cells and linking them to functional phenotypes, such as cell type and differentiation potentials.
biorxiv genomics 200-500-users 2017