Bayesian multivariate reanalysis of large genetic studies identifies many new associations, bioRxiv, 2019-05-17
Abstract: Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely-related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly-available GWAS datasets that involve multiple closely-related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS.
Author Summary: Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that it is possible to increase your power to detect significant associations by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations. Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that there is often low-hanging fruit, in the form of new associations, to be gained by running these methods on data already collected.
biorxiv genomics 0-100-users 2019

OncoOmics approaches to reveal essential genes in breast cancer: a panoramic view from pathogenesis to precision medicine, bioRxiv, 2019-05-17
SUMMARY: Breast cancer (BC) is a heterogeneous disease where each OncoOmics approach needs to be fully understood as a part of a complex network. Therefore, the main objective of this study was to analyze genetic alterations, signaling pathways, protein-protein interaction networks, protein expression, dependency maps and enrichment maps in 230 previously prioritized genes by the Consensus Strategy, the Pan-Cancer Atlas, the Pharmacogenomics Knowledgebase and the Cancer Genome Interpreter, in order to reveal essential genes to accelerate the development of precision medicine in BC. The OncoOmics essential genes were rationally filtered to 144, 48 (33%) of which were hallmarks of cancer and 20 (14%) were significant in at least three OncoOmics approaches: RAC1, AKT1, CCND1, PIK3CA, ERBB2, CDH1, MAPK14, TP53, MAPK1, SRC, RAC3, PLCG1, GRB2, MED1, TOP2A, GATA3, BCL2, CTNNB1, EGFR and CDK2. According to the Open Targets Platform, there are 111 drugs that are currently being analyzed in 3151 clinical trials in 39 genes. Lastly, there are more than 800 clinical annotations associated with 94 genes in BC pharmacogenomics.
biorxiv genomics 0-100-users 2019

Resolving the 3D landscape of transcription-linked mammalian chromatin folding, bioRxiv, 2019-05-17
ABSTRACT: Chromatin folding below the scale of topologically associating domains (TADs) remains largely unexplored in mammals. Here, we used a high-resolution 3C-based method, Micro-C, to probe links between 3D-genome organization and transcriptional regulation in mouse stem cells. Combinatorial binding of transcription factors, cofactors, and chromatin modifiers spatially segregates TAD regions into “microTADs” with distinct regulatory features. Enhancer-promoter and promoter-promoter interactions extending from the edge of these domains predominantly link co-regulated loci, often independently of CTCF/Cohesin. Acute inhibition of transcription disrupts the gene-related folding features without altering higher-order chromatin structures. Intriguingly, we detect “two-start” zig-zag 30-nanometer chromatin fibers. Our work uncovers the finer-scale genome organization that establishes novel functional links between chromatin folding and gene regulation.
ONE SENTENCE SUMMARY: Transcriptional regulatory elements shape 3D genome architecture of microTADs.
biorxiv genomics 100-200-users 2019

Tracking of antibiotic resistance transfer and rapid plasmid evolution in a hospital setting by Nanopore sequencing, bioRxiv, 2019-05-17
Abstract
Background: Infection of patients with multidrug-resistant (MDR) bacteria often leaves very limited or no treatment options. The transfer of antimicrobial resistance gene (ARG)-carrying plasmids between bacterial species by horizontal gene transfer represents an important mode of expansion of ARGs. Here, we evaluated the application of Nanopore sequencing technology in a hospital setting for monitoring the transfer and rapid evolution of antibiotic resistance plasmids within and across multiple species.
Results: In 2009 we experienced an outbreak with an extensively multidrug-resistant P. aeruginosa harboring the carbapenemase enzyme blaIMP-8, and in 2012 the first Citrobacter freundii and Citrobacter werkmanii harboring the same enzyme were detected. Using Nanopore and Illumina sequencing we conducted a comparative analysis of all blaIMP-8 bacteria isolated in our hospital over a 6-year period (n = 54). We developed the computational platforms pathoLogic and plasmIDent for Nanopore-based characterization of clinical isolates and monitoring of ARG transfer, comprising de novo assembly of genomes and plasmids, polishing, QC, plasmid circularization, ARG annotation, comparative genome analysis of multiple isolates and visualization of results. Using plasmIDent we identified a 40 kb plasmid carrying blaIMP-8 in P. aeruginosa and C. freundii, verifying that plasmid transfer had occurred. Within C. freundii the plasmid underwent further evolution and plasmid fusion, resulting in a 164 kb mega-plasmid, which was transferred to C. werkmanii. Moreover, multiple rearrangements of the multidrug resistance gene cassette were detected in P. aeruginosa, including deletions and translocations of complete ARGs.
Conclusion: Plasmid transfer, plasmid fusion and rearrangement of the multidrug resistance gene cassette mediated the rapid evolution of opportunistic pathogens in our hospital. We demonstrated the feasibility of tracking plasmid evolution dynamics and ARG transfer in clinical settings in a timely manner. The approach will allow for successful countermeasures to contain not only clonal, but also plasmid-mediated outbreaks.
biorxiv genomics 100-200-users 2019

Ultrastructural details of mammalian chromosome architecture, bioRxiv, 2019-05-17
ABSTRACT: Over the past decade, 3C-related methods, complemented by increasingly detailed microscopic views of the nucleus, have provided unprecedented insights into chromosome folding in vivo. Here, to overcome the resolution limits inherent to the majority of genome-wide chromosome architecture mapping studies, we extend a recently-developed Hi-C variant, Micro-C, to map chromosome architecture at nucleosome resolution in human embryonic stem cells and fibroblasts. Micro-C maps robustly capture well-described features of mammalian chromosome folding including A/B compartment organization, topologically associating domains (TADs), and cis interaction peaks anchored at CTCF binding sites, while also providing a detailed 1-dimensional map of nucleosome positioning and phasing genome-wide. Compared to high-resolution in situ Hi-C, Micro-C exhibits substantially improved signal-to-noise with an order of magnitude greater dynamic range, enabling not only localization of domain boundaries with single-nucleosome accuracy, but also resolving more than 20,000 additional looping interaction peaks in each cell type. Intriguingly, many of these newly-identified peaks are localized along stripe patterns and form transitive grids, consistent with their anchors being pause sites impeding the process of cohesin-dependent loop extrusion. Together, our analyses provide the highest resolution maps of chromosome folding in human cells to date, and provide a valuable resource for studies of chromosome folding mechanisms.
biorxiv genomics 100-200-users 2019

Benchmarking Single-Cell RNA Sequencing Protocols for Cell Atlas Projects, bioRxiv, 2019-05-14
Abstract: Single-cell RNA sequencing (scRNA-seq) is the leading technique for charting the molecular properties of individual cells. The latest methods are scalable to thousands of cells, enabling in-depth characterization of sample composition without prior knowledge. However, there are important differences between scRNA-seq techniques, and it remains unclear which are the most suitable protocols for drawing cell atlases of tissues, organs and organisms. We have generated benchmark datasets to systematically evaluate techniques in terms of their power to comprehensively describe cell types and states. We performed a multi-center study comparing 13 commonly used single-cell and single-nucleus RNA-seq protocols using a highly heterogeneous reference sample resource. Comparative and integrative analysis at cell type and state level revealed marked differences in protocol performance, highlighting a series of key features for cell atlas projects. These should be considered when defining guidelines and standards for international consortia, such as the Human Cell Atlas project.
biorxiv genomics 500+-users 2019