The Malaria Cell Atlas: a comprehensive reference of single parasite transcriptomes across the complete Plasmodium life cycle, bioRxiv, 2019-01-23
Malaria parasites adopt a remarkable variety of morphological life stages as they transition through multiple mammalian host and mosquito vector environments. Here we profile the single-cell transcriptomes of thousands of individual parasites, deriving the first high-resolution transcriptional atlas of the entire Plasmodium berghei life cycle. We then use our atlas to precisely define developmental stages of single cells from three different human malaria parasite species, including parasites isolated directly from infected individuals. The Malaria Cell Atlas provides both a comprehensive view of gene usage in a complex eukaryotic parasite and an open access reference data set for the study of malaria parasites.
biorxiv genomics 100-200-users 2019
Pericentromeric heterochromatin is hierarchically organized and spatially contacts H3K9me2 islands in euchromatin, bioRxiv, 2019-01-21
Abstract: Membraneless pericentromeric heterochromatin (PCH) domains play vital roles in chromosome dynamics and genome stability. However, our current understanding of 3D genome organization does not include PCH domains because of technical challenges associated with repetitive sequences enriched in PCH genomic regions. We investigated the 3D architecture of Drosophila melanogaster PCH domains and their spatial associations with the euchromatic genome by developing a novel analysis method that incorporates genome-wide Hi-C reads originating from PCH DNA. Combined with cytogenetic analysis, we reveal a hierarchical organization of the PCH domains into distinct “territories.” Strikingly, H3K9me2/3-enriched regions embedded in the euchromatic genome show prevalent 3D interactions with the PCH domain. These spatial contacts require H3K9me2/3 enrichment, are likely mediated by liquid-liquid phase separation, and may influence organismal fitness. Our findings have important implications for how PCH architecture influences the function and evolution of both repetitive heterochromatin and the gene-rich euchromatin.
Author summary: The three-dimensional (3D) organization of genomes in cell nuclei can influence a wide variety of genome functions. However, most of our understanding of this critical architecture has been limited to the gene-rich euchromatin, and largely ignores the gene-poor and repeat-rich pericentromeric heterochromatin, or PCH. PCH comprises a large part of most eukaryotic genomes, forms 3D PCH domains in nuclei, and plays a vital role in chromosome dynamics and genome stability. In this study, we developed a new method that overcomes the technical challenges imposed by the highly repetitive PCH DNA, and generated a comprehensive picture of its 3D organization. Combined with image analyses, we revealed a hierarchical organization of the PCH domains. Surprisingly, we showed that distant euchromatic regions enriched for repressive epigenetic marks also dynamically interact with the main PCH domains. These 3D interactions are mediated by liquid-liquid phase separation mechanisms, similar to how oil and vinegar separate in salad dressing, and can influence the fitness of individuals. Our discoveries have strong implications for how seemingly “junk” DNA could impact functions in the gene-rich euchromatin.
biorxiv genomics 100-200-users 2019
Pericentromeric heterochromatin is hierarchically organized and spatially contacts H3K9me2/3 islands located in euchromatic genome, bioRxiv, 2019-01-21
Membraneless pericentromeric heterochromatin (PCH) domains play vital roles in chromosome dynamics and genome stability. However, our current understanding of 3D genome organization does not include PCH domains because of technical challenges associated with repetitive sequences enriched in PCH genomic regions. We investigated the 3D architecture of Drosophila melanogaster PCH domains and their spatial associations with the euchromatic genome by developing a novel analysis method that incorporates genome-wide Hi-C reads originating from PCH DNA. Combined with cytogenetic analysis, we reveal a hierarchical organization of the PCH domains into distinct 'territories,' in which 'intra-arm' interactions are the most prevalent, followed by 3D contacts between specific PCH regions on different chromosomes. Strikingly, H3K9me2/3-enriched regions embedded in the euchromatic genome show prevalent 3D interactions with the PCH domain. These spatial contacts require H3K9me2/3 enrichment, are likely mediated by liquid-liquid phase separation, and influence organismal fitness. Our findings have important implications for how PCH architecture influences the function and evolution of both repetitive heterochromatin and the gene-rich euchromatin.
biorxiv genomics 100-200-users 2019
Identification and mitigation of pervasive off-target activity in CRISPR-Cas9 screens for essential non-coding elements, bioRxiv, 2019-01-19
Pooled CRISPR-Cas9 screens have recently emerged as a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we conducted a genome-scale screen for essential CTCF loop anchors in the K562 leukemia cell line. Surprisingly, the primary drivers of signal in this screen were single guide RNAs (sgRNAs) with low specificity scores. After removing these guides, we found that there were no CTCF loop anchors critical for cell growth. We also observed this effect in an independent screen fine-mapping the core motifs in enhancers of the GATA1 gene. We then conducted screens in parallel with CRISPRi and CRISPRa, which do not induce DNA damage, and found that an unexpected and distinct set of off-targets also caused strong confounding growth effects with these epigenome-editing platforms. Promisingly, strict filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and allowed for the identification of essential enhancers, which we validated extensively. Together, our results show off-target activity can severely limit identification of essential functional motifs by active Cas9, while strictly filtered CRISPRi screens can be reliably used for assaying larger regulatory elements.
biorxiv genomics 100-200-users 2019
Transposable elements drive reorganisation of 3D chromatin during early embryogenesis, bioRxiv, 2019-01-18
Transposable elements are abundant genetic components of eukaryotic genomes with important regulatory features affecting transcription, splicing, and recombination, among others. Here we demonstrate that the Murine Endogenous Retroviral Element (MuERV-L/MERVL) family of transposable elements drives the 3D reorganisation of the genome in the early mouse embryo. By generating Hi-C data in 2-cell-like cells, we show that MERVL elements promote the formation of insulating domain boundaries throughout the genome in vivo and in vitro. The formation of these boundaries is coupled to the upregulation of directional transcription from MERVL, which results in the activation of a subset of the gene expression programme of the 2-cell stage embryo. Domain boundaries in the 2-cell stage embryo are transient and can be remodelled without undergoing cell division. Remarkably, we find extensive inter-strain MERVL variation, suggesting multiple non-overlapping rounds of recent genome invasion and a high regulatory plasticity of genome organisation. Our results demonstrate that MERVL drives chromatin organisation during early embryonic development, shedding light on how nuclear organisation emerges during zygotic genome activation in mammals.
biorxiv genomics 100-200-users 2019
Highly-accurate long-read sequencing improves variant detection and assembly of a human genome, bioRxiv, 2019-01-13
The major DNA sequencing technologies in use today produce either highly-accurate short reads or noisy long reads. We developed a protocol based on single-molecule, circular consensus sequencing (CCS) to generate highly-accurate (99.8%) long reads averaging 13.5 kb and applied it to sequence the well-characterized human HG002/NA24385. We optimized existing tools to comprehensively detect variants, achieving precision and recall above 99.91% for SNVs, 95.98% for indels, and 95.99% for structural variants. We estimate that 2,434 discordances are correctable mistakes in the high-quality Genome in a Bottle benchmark. Nearly all (99.64%) variants are phased into haplotypes, which further improves variant detection. De novo assembly produces a highly contiguous and accurate genome with contig N50 above 15 Mb and concordance of 99.998%. CCS reads match short reads for small variant detection, while enabling structural variant detection and de novo assembly at similar contiguity and markedly higher concordance than noisy long reads.
biorxiv genomics 200-500-users 2019
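The long-read sequencing entry above reports contig N50 and variant-calling precision and recall. As a reading aid, here is a minimal Python sketch of how these summary statistics are conventionally defined. It is not the authors' pipeline: the function names `n50` and `precision_recall` and the example numbers are hypothetical, chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from variant-call counts.

    precision = TP / (TP + FP): fraction of called variants that are correct.
    recall    = TP / (TP + FN): fraction of benchmark variants that were called.
    """
    called = true_pos + false_pos
    truth = true_pos + false_neg
    precision = true_pos / called if called else 0.0
    recall = true_pos / truth if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative toy values only; not data from the preprint.
    print(n50([2, 3, 4, 5, 6]))
    print(precision_recall(90, 10, 30))
```

Under these definitions, a contig N50 above 15 Mb means that at least half of the assembled bases lie in contigs of 15 Mb or longer.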