Bayesian multivariate reanalysis of large genetic studies identifies many new associations, bioRxiv, 2019-05-17
Abstract
Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely-related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly-available GWAS datasets that involve multiple closely-related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS.
Author Summary
Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that power to detect significant associations can be increased by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations. Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that there is often low-hanging fruit, in the form of new associations, to be gained by running these methods on data already collected.
biorxiv genomics 0-100-users 2019
OncoOmics approaches to reveal essential genes in breast cancer: a panoramic view from pathogenesis to precision medicine, bioRxiv, 2019-05-17
SUMMARY
Breast cancer (BC) is a heterogeneous disease where each OncoOmics approach needs to be fully understood as a part of a complex network. Therefore, the main objective of this study was to analyze genetic alterations, signaling pathways, protein-protein interaction networks, protein expression, dependency maps and enrichment maps in 230 genes previously prioritized by the Consensus Strategy, the Pan-Cancer Atlas, the Pharmacogenomics Knowledgebase and the Cancer Genome Interpreter, in order to reveal essential genes to accelerate the development of precision medicine in BC. The OncoOmics essential genes were rationally filtered to 144, 48 (33%) of which were hallmarks of cancer and 20 (14%) of which were significant in at least three OncoOmics approaches: RAC1, AKT1, CCND1, PIK3CA, ERBB2, CDH1, MAPK14, TP53, MAPK1, SRC, RAC3, PLCG1, GRB2, MED1, TOP2A, GATA3, BCL2, CTNNB1, EGFR and CDK2. According to the Open Targets Platform, there are 111 drugs that are currently being analyzed in 3151 clinical trials in 39 genes. Lastly, there are more than 800 clinical annotations associated with 94 genes in BC pharmacogenomics.
biorxiv genomics 0-100-users 2019
Rare microbes from diverse Earth biomes dominate community activity, bioRxiv, 2019-05-17
Abstract
Microbes are the Earth’s most numerous organisms and are instrumental in driving major global biological and chemical processes. Microbial activity is a crucial component of all ecosystems, as microbes have the potential to control any major biochemical process. In recent years, considerable strides have been made in describing the community structure, i.e. diversity and abundance, of microbes from the Earth’s major biomes. In virtually all environments studied, a few highly abundant taxa dominate the structure of microbial communities. Still, microbial diversity is high and is concentrated in the less abundant, or rare, fractions of the community, i.e. the “long tail” of the abundance distribution. The relationship between microbial community structure and activity, specifically the role of rare microbes, and its connection to ecosystem function, is not fully understood. We analyzed 12.3 million metagenomic and metatranscriptomic sequence assemblies and their genes from environmental, human, and engineered microbiomes, and show that microbial activity is dominated by rare microbes (96% of total activity) across all measured biomes. Further, rare microbial activity was comprised of traits that are fundamental to ecosystem and organismal health, e.g. biogeochemical cycling and infectious disease. The activity of rare microbes was also tightly coupled to temperature, revealing a link between basic biological processes, e.g. reaction rates, and community activity. Our study provides a broadly applicable and predictable paradigm that implicates rare microbes as the main microbial drivers of ecosystem function and organismal health.
biorxiv microbiology 100-200-users 2019
Resolving the 3D landscape of transcription-linked mammalian chromatin folding, bioRxiv, 2019-05-17
ABSTRACT
Chromatin folding below the scale of topologically associating domains (TADs) remains largely unexplored in mammals. Here, we used a high-resolution 3C-based method, Micro-C, to probe links between 3D-genome organization and transcriptional regulation in mouse stem cells. Combinatorial binding of transcription factors, cofactors, and chromatin modifiers spatially segregates TAD regions into “microTADs” with distinct regulatory features. Enhancer-promoter and promoter-promoter interactions extending from the edge of these domains predominantly link co-regulated loci, often independently of CTCF/cohesin. Acute inhibition of transcription disrupts the gene-related folding features without altering higher-order chromatin structures. Intriguingly, we detect “two-start” zig-zag 30-nanometer chromatin fibers. Our work uncovers the finer-scale genome organization that establishes novel functional links between chromatin folding and gene regulation.
ONE SENTENCE SUMMARY
Transcriptional regulatory elements shape 3D genome architecture of microTADs.
biorxiv genomics 100-200-users 2019
Tracking of antibiotic resistance transfer and rapid plasmid evolution in a hospital setting by Nanopore sequencing, bioRxiv, 2019-05-17
Abstract
Background
Infection of patients with multidrug-resistant (MDR) bacteria often leaves very limited or no treatment options. The transfer of antimicrobial resistance gene (ARG)-carrying plasmids between bacterial species by horizontal gene transfer represents an important mode of expansion of ARGs. Here, we evaluated the application of Nanopore sequencing technology in a hospital setting for monitoring the transfer and rapid evolution of antibiotic resistance plasmids within and across multiple species.
Results
In 2009 we experienced an outbreak with an extensively multidrug resistant P. aeruginosa harboring the carbapenemase enzyme blaIMP-8, and in 2012 the first Citrobacter freundii and Citrobacter werkmanii harboring the same enzyme were detected. Using Nanopore and Illumina sequencing we conducted a comparative analysis of all blaIMP-8 bacteria isolated in our hospital over a 6-year period (n = 54). We developed the computational platforms pathoLogic and plasmIDent for Nanopore-based characterization of clinical isolates and monitoring of ARG transfer, comprising de-novo assembly of genomes and plasmids, polishing, QC, plasmid circularization, ARG annotation, comparative genome analysis of multiple isolates and visualization of results. Using plasmIDent we identified a 40 kb plasmid carrying blaIMP-8 in P. aeruginosa and C. freundii, verifying that plasmid transfer had occurred. Within C. freundii the plasmid underwent further evolution and plasmid fusion, resulting in a 164 kb mega-plasmid, which was transferred to C. werkmanii. Moreover, multiple rearrangements of the multidrug resistance gene cassette were detected in P. aeruginosa, including deletions and translocations of complete ARGs.
Conclusion
Plasmid transfer, plasmid fusion and rearrangement of the multidrug resistance gene cassette mediated the rapid evolution of opportunistic pathogens in our hospital. We demonstrated the feasibility of tracking plasmid evolution dynamics and ARG transfer in clinical settings in a timely manner. The approach will allow for successful countermeasures to contain not only clonal, but also plasmid-mediated outbreaks.
biorxiv genomics 100-200-users 2019
Ultrastructural details of mammalian chromosome architecture, bioRxiv, 2019-05-17
ABSTRACT
Over the past decade, 3C-related methods, complemented by increasingly detailed microscopic views of the nucleus, have provided unprecedented insights into chromosome folding in vivo. Here, to overcome the resolution limits inherent to the majority of genome-wide chromosome architecture mapping studies, we extend a recently-developed Hi-C variant, Micro-C, to map chromosome architecture at nucleosome resolution in human embryonic stem cells and fibroblasts. Micro-C maps robustly capture well-described features of mammalian chromosome folding including A/B compartment organization, topologically associating domains (TADs), and cis interaction peaks anchored at CTCF binding sites, while also providing a detailed 1-dimensional map of nucleosome positioning and phasing genome-wide. Compared to high-resolution in situ Hi-C, Micro-C exhibits substantially improved signal-to-noise with an order of magnitude greater dynamic range, enabling not only localization of domain boundaries with single-nucleosome accuracy, but also resolving more than 20,000 additional looping interaction peaks in each cell type. Intriguingly, many of these newly-identified peaks are localized along stripe patterns and form transitive grids, consistent with their anchors being pause sites impeding the process of cohesin-dependent loop extrusion. Together, our analyses provide the highest resolution maps of chromosome folding in human cells to date, and provide a valuable resource for studies of chromosome folding mechanisms.
biorxiv genomics 100-200-users 2019