Organization and Regulation of Chromatin by Liquid-Liquid Phase Separation, bioRxiv, 2019-01-18
Genomic DNA is highly compacted in the nucleus of eukaryotic cells as a nucleoprotein assembly called chromatin. The basic unit of chromatin is the nucleosome, where ~146 base pair increments of the genome are wrapped and compacted around the core histone proteins. Further genomic organization and compaction occur through higher order assembly of nucleosomes. This organization regulates many nuclear processes, and is controlled in part by histone post-translational modifications and chromatin-binding proteins. Mechanisms that regulate the assembly and compaction of the genome remain unclear. Here we show that in the presence of physiologic concentrations of mono- and divalent salts, histone tail-driven interactions drive liquid-liquid phase separation (LLPS) of nucleosome arrays, resulting in substantial condensation. Phase separation of nucleosomal arrays is inhibited by histone acetylation, whereas histone H1 promotes phase separation, further compaction, and decreased dynamics within droplets, mirroring the relationship between these modulators and the accessibility of the genome in cells. These results indicate that under physiologically relevant conditions, LLPS is an intrinsic behavior of the chromatin polymer, and suggest a model in which the condensed phase reflects a genomic 'ground state' that can produce chromatin organization and compaction in vivo. The dynamic nature of this state could enable known modulators of chromatin structure, such as post-translational modifications and chromatin-binding proteins, to act upon it and consequently control nuclear processes such as transcription and DNA repair. Our data suggest an important role for LLPS of chromatin in the organization of the eukaryotic genome.
biorxiv biophysics 100-200-users 2019
Transposable elements drive reorganisation of 3D chromatin during early embryogenesis, bioRxiv, 2019-01-18
Transposable elements are abundant genetic components of eukaryotic genomes with important regulatory features affecting transcription, splicing, and recombination, among others. Here we demonstrate that the Murine Endogenous Retroviral Element (MuERV-L/MERVL) family of transposable elements drives the 3D reorganisation of the genome in the early mouse embryo. By generating Hi-C data in 2-cell-like cells, we show that MERVL elements promote the formation of insulating domain boundaries throughout the genome in vivo and in vitro. The formation of these boundaries is coupled to the upregulation of directional transcription from MERVL, which results in the activation of a subset of the gene expression programme of the 2-cell stage embryo. Domain boundaries in the 2-cell stage embryo are transient and can be remodelled without undergoing cell division. Remarkably, we find extensive inter-strain MERVL variation, suggesting multiple non-overlapping rounds of recent genome invasion and a high regulatory plasticity of genome organisation. Our results demonstrate that MERVL elements drive chromatin organisation during early embryonic development, shedding light on how nuclear organisation emerges during zygotic genome activation in mammals.
biorxiv genomics 100-200-users 2019
A 3D-printed hand-powered centrifuge for molecular biology, bioRxiv, 2019-01-17
The centrifuge is an essential tool for many aspects of research and medical diagnostics. However, conventional centrifuges are often inaccessible outside of conventional laboratory settings, such as remote field sites, require a constant external power source, and can be prohibitively costly in resource-limited settings and STEM-focused programs. Here we present the 3D-Fuge, a 3D-printed hand-powered centrifuge, as a novel alternative to standard benchtop centrifuges. Based on the design principles of a paper-based centrifuge, this 3D-printed instrument increases the volume capacity to 2 mL and can reach hand-powered centrifugation speeds up to 6,000 rpm. The 3D-Fuge devices presented here are capable of centrifugation of a wide variety of different solutions such as spinning down samples for biomarker applications and performing nucleotide extractions as part of a portable molecular lab setup. We introduce the design and proof-of-principle trials that demonstrate the utility of low-cost 3D printed centrifuges for use in remote and educational settings.
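For readers comparing the quoted speed with benchtop protocols, rotational speed can be converted to relative centrifugal force (RCF) with the standard relation RCF = ω²r/g. The sketch below is illustrative only: the abstract does not state the rotor radius, so the 5 cm value used in the example is a hypothetical assumption, and the function name is ours rather than anything from the paper.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Return relative centrifugal force (in multiples of g).

    rpm       -- rotational speed in revolutions per minute
    radius_cm -- distance from the rotation axis to the sample, in cm
    """
    omega = 2 * math.pi * rpm / 60          # angular velocity in rad/s
    radius_m = radius_cm / 100              # convert cm to m
    return omega ** 2 * radius_m / STANDARD_GRAVITY


if __name__ == "__main__":
    # 6,000 rpm is the top speed quoted in the abstract;
    # the 5 cm radius is an ASSUMED value for illustration only.
    print(f"{rcf_from_rpm(6000, 5.0):.0f} x g")
```

Because RCF scales with the square of the speed, halving the achievable rpm cuts the force to a quarter, so the actual radius and sustained speed of a given print matter more than the headline figure.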
biorxiv bioengineering 100-200-users 2019
BEHST: genomic set enrichment analysis enhanced through integration of chromatin long-range interactions, bioRxiv, 2019-01-16
Transforming data from genome-scale assays into knowledge of affected molecular functions and pathways is a key challenge in biomedical research. Using vocabularies of functional terms and databases annotating genes with these terms, pathway enrichment methods can identify terms enriched in a gene list. With data that can refer to intergenic regions, however, one must first connect the regions to the terms, which are usually annotated only to genes. To make these connections, existing pathway enrichment approaches apply unwarranted assumptions such as annotating non-coding regions with the terms from adjacent genes. We developed a computational method that instead links genomic regions to annotations using data on long-range chromatin interactions. Our method, Biological Enrichment of Hidden Sequence Targets (BEHST), finds Gene Ontology (GO) terms enriched in genomic regions more precisely and accurately than existing methods. We demonstrate BEHST's ability to retrieve more pertinent and less ambiguous GO terms associated with results of in vivo mouse enhancer screens or enhancer RNA assays for multiple tissue types. BEHST will accelerate the discovery of affected pathways mediated through long-range interactions that explain non-coding hits in genome-wide association study (GWAS) or genome editing screens. BEHST is free software with a command-line interface for Linux or macOS and a web interface (http://behst.hoffmanlab.org).
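The general idea of linking non-coding regions to genes through chromatin contacts, then testing the linked genes for term enrichment, can be sketched as follows. This is not BEHST's implementation or interface: the abstract gives no algorithmic detail, so the function names, the interval representation, the toy coordinates, and the choice of a one-sided hypergeometric test are all assumptions made for illustration.

```python
from math import comb


def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def link_regions_to_genes(regions, interactions, promoters):
    """Return genes whose promoter lies at the far anchor of an
    interaction whose near anchor overlaps a query region."""
    genes = set()
    for region in regions:
        for anchor_a, anchor_b in interactions:
            # An interaction is undirected, so try both orientations.
            for near, far in ((anchor_a, anchor_b), (anchor_b, anchor_a)):
                if overlaps(region, near):
                    genes.update(
                        g for g, p in promoters.items() if overlaps(far, p)
                    )
    return genes


def enrichment_p(linked, term_genes, background):
    """One-sided hypergeometric P(X >= k) for a term among linked genes."""
    n_bg = len(background)
    n_term = len(term_genes & background)
    n_linked = len(linked & background)
    k = len(linked & term_genes & background)
    upper = min(n_term, n_linked)
    tail = sum(
        comb(n_term, i) * comb(n_bg - n_term, n_linked - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_bg, n_linked)


# Toy data (all coordinates and gene names are hypothetical).
promoters = {
    "geneA": ("chr1", 1000, 2000),
    "geneB": ("chr1", 50000, 51000),
    "geneC": ("chr2", 100, 900),
    "geneD": ("chr2", 5000, 6000),
}
interactions = [(("chr1", 10000, 11000), ("chr1", 50500, 51500))]
regions = [("chr1", 10200, 10400)]  # an intergenic query region

linked = link_regions_to_genes(regions, interactions, promoters)
p_value = enrichment_p(linked, {"geneB"}, set(promoters))
print(linked, p_value)
```

In this toy case the query region is nearest to geneA on the linear genome, yet the contact links it to geneB, which is the kind of distinction the abstract draws against annotating regions with terms from adjacent genes.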
biorxiv bioinformatics 100-200-users 2019
Killer whale genomes reveal a complex history of recurrent admixture and vicariance, bioRxiv, 2019-01-16
Reconstruction of the demographic and evolutionary history of populations assuming a consensus tree-like relationship can mask more complex scenarios, which are prevalent in nature. An emerging genomic toolset, which has been most comprehensively harnessed in the reconstruction of human evolutionary history, enables molecular ecologists to elucidate complex population histories. Killer whales have limited extrinsic barriers to dispersal and have radiated globally, and are therefore a good candidate model for the application of such tools. Here, we analyse a global dataset of killer whale genomes in a rare attempt to elucidate global population structure in a non-human species. We identify a pattern of genetic homogenisation at lower latitudes and the greatest differentiation at high latitudes, even between currently sympatric lineages. The processes underlying the major axis of structure include high drift at the edge of species' range, likely associated with founder effects and allelic surfing during post-glacial range expansion. Divergence between Antarctic and non-Antarctic lineages is further driven by ancestry segments with up to four-fold older coalescence time than the genome-wide average; relicts of a previous vicariance during an earlier glacial cycle. Our study further underpins that episodic gene flow is ubiquitous in natural populations, and can occur across great distances and after substantial periods of isolation between populations. Thus, understanding the evolutionary history of a species requires comprehensive geographic sampling and genome-wide data to sample the variation in ancestry within individuals.
biorxiv evolutionary-biology 100-200-users 2019
Probabilistic cell type assignment of single-cell transcriptomic data reveals spatiotemporal microenvironment dynamics in human cancers, bioRxiv, 2019-01-16
Single-cell RNA sequencing (scRNA-seq) has transformed biomedical research, enabling decomposition of complex tissues into disaggregated, functionally distinct cell types. For many applications, investigators wish to identify cell types with known marker genes. Typically, such cell type assignments are performed through unsupervised clustering followed by manual annotation based on these marker genes, or via mapping procedures to existing data. However, the manual interpretation required in the former case scales poorly to large datasets, which are also often prone to batch effects, while existing data for purified cell types must be available for the latter. Furthermore, unsupervised clustering can be error-prone, leading to under- and over-clustering of the cell types of interest. To overcome these issues we present CellAssign, a probabilistic model that leverages prior knowledge of cell type marker genes to annotate scRNA-seq data into pre-defined and de novo cell types. CellAssign automates the process of assigning cells in a highly scalable manner across large datasets while simultaneously controlling for batch and patient effects. We demonstrate the analytical advantages of CellAssign through extensive simulations and exemplify real-world utility to profile the spatial dynamics of high-grade serous ovarian cancer and the temporal dynamics of follicular lymphoma. Our analysis reveals subclonal malignant phenotypes and points towards an evolutionary interplay between immune and cancer cell populations with cancer cells escaping immune recognition.
biorxiv bioinformatics 100-200-users 2019