Reversed graph embedding resolves complex single-cell developmental trajectories, bioRxiv, 2017-02-22
Abstract: Organizing single cells along a developmental trajectory has emerged as a powerful tool for understanding how gene regulation governs cell fate decisions. However, learning the structure of complex single-cell trajectories with two or more branches remains a challenging computational problem. We present Monocle 2, which uses reversed graph embedding to reconstruct single-cell trajectories in a fully unsupervised manner. Monocle 2 learns an explicit principal graph to describe the data, greatly improving the robustness and accuracy of its trajectories compared to other algorithms. Monocle 2 uncovered a new, alternative cell fate in what we previously reported to be a linear trajectory for differentiating myoblasts. We also reconstruct branched trajectories for two studies of blood development, and show that loss-of-function mutations in key lineage transcription factors divert cells to alternative branches on the trajectory. Monocle 2 is thus a powerful tool for analyzing cell fate decisions with single-cell genomics.
biorxiv genomics 0-100-users 2017
Single-cell epigenomics maps the continuous regulatory landscape of human hematopoietic differentiation, bioRxiv, 2017-02-22
Abstract: Normal human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While epigenomic landscapes of this process have been explored in immunophenotypically-defined populations, the single-cell regulatory variation that defines hematopoietic differentiation has been hidden by ensemble averaging. We generated single-cell chromatin accessibility landscapes across 8 populations of immunophenotypically-defined human hematopoietic cell types. Using bulk chromatin accessibility profiles to scaffold our single-cell data analysis, we constructed an epigenomic landscape of human hematopoiesis and characterized epigenomic heterogeneity within phenotypically sorted populations to find epigenomic lineage-bias toward different developmental branches in multipotent stem cell states. We identify and isolate sub-populations within classically-defined granulocyte-macrophage progenitors (GMPs) and use ATAC-seq and RNA-seq to confirm that GMPs are epigenomically and transcriptomically heterogeneous. Furthermore, we identified transcription factors and cis-regulatory elements linked to changes in chromatin accessibility within cellular populations and across a continuous myeloid developmental trajectory, and observe that relatively simple TF motif dynamics give rise to a broad diversity of accessibility dynamics at cis-regulatory elements. Overall, this work provides a template for exploration of complex regulatory dynamics in primary human tissues at the ultimate level of granular specificity – the single cell.
One Sentence Summary: Single cell chromatin accessibility reveals a high-resolution, continuous landscape of regulatory variation in human hematopoiesis.
biorxiv genomics 100-200-users 2017
Cas9-Assisted Targeting of CHromosome segments (CATCH) for targeted nanopore sequencing and optical genome mapping, bioRxiv, 2017-02-21
Abstract: Variations in the genetic code, from single point mutations to large structural or copy number alterations, influence susceptibility, onset, and progression of genetic diseases and tumor transformation. Next-generation sequencing analysis is unable to reliably capture aberrations larger than the typical sequencing read length of several hundred bases. Long-read, single-molecule sequencing methods such as SMRT and nanopore sequencing can address larger variations, but require costly whole genome analysis. Here we describe a method for isolation and enrichment of a large genomic region of interest for targeted analysis based on Cas9 excision of two sites flanking the target region and isolation of the excised DNA segment by pulsed field gel electrophoresis. The isolated target remains intact and is ideally suited for optical genome mapping and long-read sequencing at high coverage. In addition, analysis is performed directly on native genomic DNA that retains genetic and epigenetic composition without amplification bias. This method enables detection of mutations and structural variants as well as detailed analysis by generation of hybrid scaffolds composed of optical maps and sequencing data at a fraction of the cost of whole genome sequencing.
biorxiv genomics 100-200-users 2017
Niche construction in evolutionary theory: the construction of an academic niche?, bioRxiv, 2017-02-20
Abstract: In recent years, fairly far-reaching claims have been repeatedly made about how niche construction, the modification by organisms of their environment, and that of other organisms, represents a vastly neglected phenomenon in ecological and evolutionary thought. The proponents of this view claim that the niche construction perspective greatly expands the scope of standard evolutionary theory and that niche construction deserves to be treated as a significant evolutionary process in its own right, almost at par with natural selection. Claims have also been advanced about how niche construction theory represents a substantial extension to, and re-orientation of, standard evolutionary theory, which is criticized as being narrowly gene-centric and ignoring the rich complexity and reciprocity of organism-environment interactions. We examine these claims in some detail and show that they do not stand up to scrutiny. We suggest that the manner in which niche construction theory is sought to be pushed in the literature is better viewed as an exercise in academic niche construction whereby, through incessant repetition of largely untenable claims, and the deployment of rhetorically appealing but logically dubious analogies, a receptive climate for a certain sub-discipline is sought to be manufactured within the scientific community. We see this as an unfortunate, but perhaps inevitable, nascent post-truth tendency within science.
biorxiv evolutionary-biology 100-200-users 2017
mixOmics: an R package for ‘omics feature selection and multiple data integration, bioRxiv, 2017-02-15
Abstract: The advent of high throughput technologies has led to a wealth of publicly available ‘omics data coming from different sources, such as transcriptomics, proteomics, and metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have been focusing on identifying small subsets of molecules (a ‘molecular signature’) to explain or predict biological conditions, but mainly for a single type of ‘omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a system biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous ‘omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple ‘omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of ‘omics data available from the package.
biorxiv bioinformatics 100-200-users 2017
Genes Affecting Vocal and Facial Anatomy Went Through Extensive Regulatory Divergence in Modern Humans, bioRxiv, 2017-02-09
Summary: Regulatory changes are broadly accepted as key drivers of phenotypic divergence. However, identifying regulatory changes that underlie human-specific traits has proven very challenging. Here, we use 63 DNA methylation maps of ancient and present-day humans, as well as of six chimpanzees, to detect differentially methylated regions that emerged in modern humans after the split from Neanderthals and Denisovans. We show that genes affecting the face and vocal tract went through particularly extensive methylation changes. Specifically, we identify widespread hypermethylation in a network of face- and voice-affecting genes (SOX9, ACAN, COL2A1, NFIX and XYLT1). We propose that these repression patterns appeared after the split from Neanderthals and Denisovans, and that they might have played a key role in shaping the modern human face and vocal tract.
biorxiv genomics 0-100-users 2017