Structure of transcribing RNA polymerase II-nucleosome complex, bioRxiv, 2018-10-07
Transcription of eukaryotic protein-coding genes requires passage of RNA polymerase II (Pol II) through chromatin. Pol II passage is impaired by nucleosomes and requires elongation factors that help Pol II to efficiently overcome the nucleosomal barrier [1-4]. How the Pol II machinery transcribes through a nucleosome remains unclear because structural studies have been limited to Pol II elongation complexes formed on DNA templates lacking nucleosomes [5]. Here we report the cryo-electron microscopy (cryo-EM) structure of transcribing Pol II from the yeast Saccharomyces cerevisiae engaged with a downstream nucleosome core particle (NCP) at an overall resolution of 4.4 Å with resolutions ranging from 4-6 Å in Pol II and 6-8 Å in the NCP. Pol II and the NCP adopt a defined orientation that could not be predicted from modelling. Pol II contacts DNA of the incoming NCP on both sides of the nucleosomal dyad with its domains ‘clamp head’ and ‘lobe’. Comparison of the Pol II-NCP structure to known structures of Pol II complexes reveals that the elongation factors TFIIS, DSIF, NELF, PAF1 complex, and SPT6 can be accommodated on the Pol II surface in the presence of the oriented nucleosome. Further structural comparisons show that the chromatin remodelling enzyme Chd1, which is also required for efficient Pol II passage [6,7], could bind the oriented nucleosome with its motor domain. The DNA-binding region of Chd1 must however be released from DNA when Pol II approaches the nucleosome, and based on published data [8,9] this is predicted to stimulate Chd1 activity and to facilitate Pol II passage. Our results provide a starting point for a mechanistic analysis of chromatin transcription.
biorxiv biochemistry 100-200-users 2018
Microscopy-based chromosome conformation capture enables simultaneous visualization of genome organization and transcription in intact organisms, bioRxiv, 2018-10-04
Eukaryotic chromosomes are organized at multiple scales, from nucleosomes to chromosome territories. Recently, genome-wide methods identified an intermediate level of chromosome organization, topologically associating domains (TADs), which play key roles in transcriptional regulation. However, these methods cannot directly examine the interplay between transcriptional activation and chromosome architecture while maintaining spatial information. Here, we present a multiplexed, sequential imaging approach (Hi-M) that permits the simultaneous detection of chromosome organization and transcription in single nuclei. This allowed us to unveil the changes in 3D chromatin organization occurring upon transcriptional activation and homologous chromosome un-pairing during the awakening of the zygotic genome in intact Drosophila embryos. Excitingly, the ability of Hi-M to explore the multi-scale chromosome architecture with spatial resolution at different stages of development or during the cell cycle will be key to understanding the mechanisms and consequences of the 4D organization of the genome.
biorxiv cell-biology 100-200-users 2018
scRNA-seq mixology: towards better benchmarking of single cell RNA-seq analysis methods, bioRxiv, 2018-10-04
Single cell RNA sequencing (scRNA-seq) technology has undergone rapid development in recent years, bringing with it new challenges in data processing and analysis. This has led to an explosion of tailored analysis methods for scRNA-seq data to address various biological questions. However, the current lack of gold-standard benchmark datasets makes it difficult for researchers to systematically evaluate the performance of the many methods available. Here, we designed and carried out a realistic benchmark experiment that included mixtures of single cells or ‘pseudo cells’ created by sampling admixtures of cells or RNA from up to 5 distinct cancer cell lines. Altogether we generated 14 datasets using droplet and plate-based scRNA-seq protocols, and compared multiple data analysis methods in combination for tasks ranging from normalization and imputation to clustering, trajectory analysis and data integration. Evaluation across 3,913 analyses (methods × benchmark dataset combinations) revealed pipelines suited to different types of data for different tasks. Our dataset and analysis present a comprehensive comparison framework for benchmarking the most common scRNA-seq analysis tasks.
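The idea of a ‘pseudo cell’ with a known composition can be illustrated with a minimal sketch. It is not the authors' protocol: the function name `make_pseudo_cell`, the cell line labels `line_A` and `line_B`, and the toy count vectors are all hypothetical, and the sketch simply sums count vectors of cells drawn from each line in a chosen proportion.

```python
import random


def make_pseudo_cell(cells_by_line, composition, rng):
    """Sum the count vectors of randomly chosen cells.

    cells_by_line: dict mapping a cell line name to a list of count vectors.
    composition:   dict mapping a cell line name to the number of cells to draw.
    rng:           a random.Random instance, passed in for reproducibility.
    """
    n_genes = len(next(iter(cells_by_line.values()))[0])
    pseudo = [0] * n_genes
    for line, n_cells in composition.items():
        # Draw n_cells distinct cells from this line without replacement.
        for cell in rng.sample(cells_by_line[line], n_cells):
            pseudo = [a + b for a, b in zip(pseudo, cell)]
    return pseudo


# Hypothetical toy data: two genes, two cell lines with distinct profiles.
cells_by_line = {
    "line_A": [[1, 0], [1, 0], [1, 0]],
    "line_B": [[0, 1], [0, 1], [0, 1]],
}
pseudo_cell = make_pseudo_cell(
    cells_by_line, {"line_A": 2, "line_B": 1}, random.Random(0)
)
```

Because the mixing proportions are fixed by construction, the resulting profile has a known ground truth against which an analysis method could be scored.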
biorxiv bioinformatics 100-200-users 2018
NanoJ: a high-performance open-source super-resolution microscopy toolbox, bioRxiv, 2018-10-02
Super-resolution microscopy has become essential for the study of nanoscale biological processes. This type of imaging often requires the use of specialised image analysis tools to process a large volume of recorded data and extract quantitative information. In recent years, our team has built an open-source image analysis framework for super-resolution microscopy designed to combine high performance and ease of use. We named it NanoJ, a reference to the popular ImageJ software it was developed for. In this paper, we highlight the current capabilities of NanoJ for several essential processing steps: spatio-temporal alignment of raw data (NanoJ-Core), super-resolution image reconstruction (NanoJ-SRRF), image quality assessment (NanoJ-SQUIRREL), structural modelling (NanoJ-VirusMapper) and control of the sample environment (NanoJ-Fluidics). We expect to expand NanoJ in the future through the development of new tools designed to improve quantitative data analysis and measure the reliability of fluorescent microscopy studies.
biorxiv bioinformatics 100-200-users 2018
A spatial atlas of inhibitory cell types in mouse hippocampus, bioRxiv, 2018-10-01
Understanding the function of a tissue requires knowing the spatial organization of its constituent cell types. In the cerebral cortex, single-cell RNA sequencing (scRNA-seq) has revealed the genome-wide expression patterns that define its many, closely related cell types, but cannot reveal their spatial arrangement. Here we introduce probabilistic cell typing by in situ sequencing (pciSeq), an approach that leverages prior scRNA-seq classification to identify cell types using multiplexed in situ RNA detection. We applied this method to map the inhibitory neurons of hippocampal area CA1, a cell system critical for memory function, for which ground truth is available from extensive prior work identifying the laminar organization of subtly differing cell types. Our method confidently identified 16 interneuron classes, in a spatial arrangement closely matching ground truth. This method will allow identifying the spatial organization of fine cell types across the brain and other tissues.
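The general principle of assigning a cell to a class by combining prior expression profiles with in situ gene detections can be sketched as a small Bayesian calculation. This is a simplified illustration and not the pciSeq model itself: it assumes independent Poisson counts per gene, a uniform prior by default, and made-up mean expression values. The function name `class_posterior` is hypothetical.

```python
import math


def class_posterior(counts, class_means, priors=None):
    """Posterior probability of each class given per-gene detection counts.

    counts:      dict gene -> number of detections assigned to the cell.
    class_means: dict class -> dict gene -> expected count (must be > 0).
    priors:      optional dict class -> prior probability (uniform if None).
    """
    classes = list(class_means)
    if priors is None:
        priors = {c: 1.0 / len(classes) for c in classes}
    log_post = {}
    for c in classes:
        total = math.log(priors[c])
        for gene, k in counts.items():
            lam = class_means[c][gene]
            # Poisson log-likelihood: k*log(lam) - lam - log(k!)
            total += k * math.log(lam) - lam - math.lgamma(k + 1)
        log_post[c] = total
    # Normalise in log space to avoid underflow.
    peak = max(log_post.values())
    weights = {c: math.exp(v - peak) for c, v in log_post.items()}
    norm = sum(weights.values())
    return {c: w / norm for c, w in weights.items()}


# Illustrative numbers only; class names reuse three interneuron marker genes.
class_means = {
    "Pvalb": {"Pvalb": 5.0, "Sst": 0.2, "Vip": 0.1},
    "Sst": {"Pvalb": 0.2, "Sst": 5.0, "Vip": 0.1},
    "Vip": {"Pvalb": 0.1, "Sst": 0.2, "Vip": 5.0},
}
counts = {"Pvalb": 4, "Sst": 0, "Vip": 1}
posterior = class_posterior(counts, class_means)
best_class = max(posterior, key=posterior.get)
```

Returning a full posterior rather than a single label keeps the uncertainty visible, which matters when classes differ only subtly in expression.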
biorxiv neuroscience 100-200-users 2018
An introduction to MPEG-G, the new ISO standard for genomic information representation, bioRxiv, 2018-09-27
The MPEG-G standardization initiative is a coordinated international effort to specify a compressed data format that enables large scale genomic data to be processed, transported and shared. The standard consists of a set of specifications (i.e., a book) describing i) a normative format syntax, and ii) a normative decoding process to retrieve the information coded in a compliant file or bitstream. Such decoding process enables the use of leading-edge compression technologies that have exhibited significant compression gains over currently used formats for storage of unaligned and aligned sequencing reads. Additionally, the standard provides a wealth of much needed functionality, such as selective access, data aggregation, application programming interfaces to the compressed data, standard interfaces to support data protection mechanisms, support for streaming and a procedure to assess the conformance of implementations. ISO/IEC is engaged in supporting the maintenance and availability of the standard specification, which guarantees the perenniality of applications using MPEG-G. Finally, the standard ensures interoperability and integration with existing genomic information processing pipelines by providing support for conversion from the FASTQ/SAM/BAM file formats. In this paper we provide an overview of the MPEG-G specification, with particular focus on the main advantages and novel functionality it offers. As the standard only specifies the decoding process, encoding performance, both in terms of speed and compression ratio, can vary depending on specific encoder implementations, and will likely improve during the lifetime of MPEG-G. Hence, the performance statistics provided here are only indicative baseline examples of the technologies included in the standard.
biorxiv bioinformatics 100-200-users 2018