WAPL maintains dynamic cohesin to preserve lineage specific distal gene regulation, bioRxiv, 2019-08-10
SUMMARY: The cohesin complex plays essential roles in sister chromatid cohesion, chromosome organization and gene expression. The role of cohesin in gene regulation is incompletely understood. Here, we report that the cohesin release factor WAPL is crucial for maintaining a pool of dynamic cohesin bound to regions that are associated with lineage-specific genes in mouse embryonic stem cells. These regulatory regions are enriched for active enhancer marks and transcription factor binding sites, but largely devoid of CTCF binding sites. Stabilization of cohesin, which leads to a loss of dynamic cohesin from these regions, does not affect transcription factor binding or active enhancer marks, but does result in changes in promoter-enhancer interactions and downregulation of genes. Acute cohesin depletion can phenocopy the effect of WAPL depletion, showing that cohesin plays a crucial role in maintaining expression of lineage-specific genes. The binding of dynamic cohesin to chromatin is dependent on the pluripotency transcription factor OCT4, but not NANOG. Finally, dynamic cohesin binding sites are also found in differentiated cells, suggesting that they represent a general regulatory principle. We propose that cohesin dynamically binding to regulatory sites creates a favorable spatial environment in which promoters and enhancers can communicate to ensure proper gene expression.
HIGHLIGHTS
1. The cohesin release factor WAPL is crucial for maintaining a pluripotency-specific phenotype.
2. Dynamic cohesin is enriched at lineage-specific loci and overlaps with binding sites of pluripotency transcription factors.
3. Expression of lineage-specific genes is maintained by dynamic cohesin binding through the formation of promoter-enhancer associated self-interaction domains.
4. CTCF-independent cohesin binding to chromatin is controlled by the pioneer factor OCT4.
biorxiv genomics 0-100-users 2019
Progressive alignment with Cactus: a multiple-genome aligner for the thousand-genome era, bioRxiv, 2019-08-09
Abstract: Cactus, a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequence. We describe progressive extensions to Cactus that enable reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We show that Cactus is capable of scaling to hundreds of genomes and beyond by describing results from an alignment of over 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment yet created. Further, we show improvements in orthology resolution leading to downstream improvements in annotation.
biorxiv genomics 100-200-users 2019
In Situ Transcriptome Accessibility Sequencing (INSTA-seq), bioRxiv, 2019-08-06
Subcellular RNA localization regulates spatially polarized cellular processes, but unbiased investigation of its control in vivo remains challenging. Current hybridization-based methods cannot differentiate small regulatory variants, while in situ sequencing is limited by short reads. We solved these problems using a bidirectional sequencing chemistry to efficiently image transcript-specific barcodes in situ, which are then extracted and assembled into longer reads using NGS. In the Drosophila retina, genes regulating eye development and cytoskeletal organization were enriched compared to methods using extracted RNA. We therefore named our method In Situ Transcriptome Accessibility sequencing (INSTA-seq). Sequencing reads terminated near 3’ UTR cis-motifs (e.g. Zip48C, stau), revealing RNA-protein interactions. Additionally, Act5C polyadenylation isoforms retaining zipcode motifs were selectively localized to the optical stalk, consistent with their biology. Our platform provides a powerful way to visualize any RNA variants or protein interactions in situ to study their regulation in animal development.
biorxiv genomics 100-200-users 2019
The Integrator complex terminates promoter-proximal transcription at protein-coding genes, bioRxiv, 2019-08-06
SUMMARY: The transition of RNA polymerase II (Pol II) from initiation to productive elongation is a central, regulated step in metazoan gene expression. At many genes, Pol II pauses stably in early elongation, remaining engaged with the 25-60 nucleotide-long nascent RNA for many minutes while awaiting signals for release into the gene body. However, a number of genes display highly unstable promoter Pol II, suggesting that paused polymerase might dissociate from template DNA at these promoters and release a short, non-productive mRNA. Here, we report that paused Pol II can be actively destabilized by the Integrator complex. Specifically, Integrator utilizes its RNA endonuclease activity to cleave nascent RNA and drive termination of paused Pol II. These findings uncover a previously unappreciated mechanism of metazoan gene repression, akin to bacterial transcription attenuation, wherein promoter-proximal Pol II is prevented from entering productive elongation through factor-regulated termination.
Highlights
- The Integrator complex inhibits transcription elongation at ∼15% of mRNA genes
- Integrator targets promoter-proximally paused Pol II for termination
- The RNA endonuclease of Integrator subunit 11 is critical for gene attenuation
- Integrator-repressed genes are enriched in signaling and growth-responsive pathways
biorxiv genomics 100-200-users 2019
Template plasmid integration in germline genome-edited cattle, bioRxiv, 2019-07-29
Abstract: We analyzed publicly available whole genome sequencing data from cattle that were germline genome-edited to introduce polledness. Our analysis discovered the unintended heterozygous integration of the plasmid and a second copy of the repair template sequence at the target site. Our finding underscores the importance of employing screening methods suited to reliably detect the unintended integration of plasmids and multiple template copies.
biorxiv genomics 100-200-users 2019
Evidence that APP gene copy number changes reflect recombinant vector contamination, bioRxiv, 2019-07-23
Abstract: Mutations that occur in cells of the body, called somatic mutations, cause human diseases including cancer and some neurological disorders [1]. In a recent study published in Nature, Lee et al. [2] (hereafter “the Lee study”) reported somatic copy number gains of the APP gene, a known risk locus of Alzheimer’s disease (AD), in the neurons of AD patients and controls (69% vs 25% of neurons with at least one APP copy gain on average). The authors argue that the mechanism of these copy number gains was somatic integration of APP mRNA into the genome, creating what they called genomic cDNA (gencDNA). We reanalyzed the data from the Lee study, revealing evidence that APP gencDNA originates mainly from contamination by exogenous APP recombinant vectors, rather than from true somatic retrotransposition of endogenous APP. Our reanalysis of two recent whole exome sequencing (WES) datasets, one by the authors of the Lee study [3] and the other by Park et al. [4], revealed that reads claimed to support APP gencDNA in AD samples resulted from contamination by PCR products and mRNA, respectively. Lastly, we present our own single-cell whole genome sequencing (scWGS) data that show no evidence for somatic APP retrotransposition in AD neurons or in neurons from normal individuals of various ages.
biorxiv genomics 0-100-users 2019