Patterns of structural variation in human cancer, bioRxiv, 2017-08-28
ABSTRACT: A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments ranging in size from kilobases to whole chromosomes. We developed methods to group, classify and describe structural variants, applied to >2,500 cancer genomes. Nine signatures of structural variation emerged. Deletions have trimodal size distribution; assort unevenly across tumour types and patients; enrich in late-replicating regions; and correlate with inversions. Tandem duplications also have trimodal size distribution, but enrich in early-replicating regions, as do unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy number gains and frequent inverted rearrangements. One prominent structure consists of 1-7 templates copied from distinct regions of the genome strung together within one locus. Such ‘cycles of templated insertions’ correlate with tandem duplications, frequently activating the telomerase gene, TERT, in liver cancer. Cancers access many rearrangement processes, flexibly sculpting the genome to maximise oncogenic potential.
biorxiv cancer-biology 0-100-users 2017
An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues, bioRxiv, 2017-08-27
ABSTRACT: We present Omni-ATAC, an improved ATAC-seq protocol for chromatin accessibility profiling that works across multiple applications with substantial improvement of signal-to-background ratio and information content. The Omni-ATAC protocol enables chromatin accessibility profiling from archival frozen tissue samples and 50 μm sections, revealing the activities of disease-associated DNA elements in distinct human brain structures. The Omni-ATAC protocol enables the interrogation of personal regulomes in tissue context and translational studies.
biorxiv genomics 100-200-users 2017
A neural algorithm for a fundamental computing problem, bioRxiv, 2017-08-26
Similarity search, such as identifying similar images in a database or similar documents on the Web, is a fundamental computing problem faced by many large-scale information retrieval systems. We discovered that the fly’s olfactory circuit solves this problem using a novel variant of a traditional computer science algorithm (called locality-sensitive hashing). The fly’s circuit assigns similar neural activity patterns to similar input stimuli (odors), so that behaviors learned from one odor can be applied when a similar odor is experienced. The fly’s algorithm, however, uses three new computational ingredients that depart from traditional approaches. We show that these ingredients can be translated to improve the performance of similarity search compared to traditional algorithms when evaluated on several benchmark datasets. Overall, this perspective helps illuminate the logic supporting an important sensory function (olfaction), and it provides a conceptually new algorithm for solving a fundamental computational problem.
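For readers unfamiliar with locality-sensitive hashing, here is a minimal sketch of one traditional variant, random-hyperplane hashing for cosine similarity. It is not the fly's algorithm: the abstract does not name the three new ingredients, so none are reproduced here. The function names, the 3-dimensional "odor" vectors and the 64-bit signature length are all illustrative assumptions.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming_distance(sig_a, sig_b):
    """Number of differing bits; small for inputs separated by a small angle."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


if __name__ == "__main__":
    # Hypothetical toy inputs standing in for odor representations.
    planes = make_hyperplanes(dim=3, n_bits=64)
    odor = [1.0, 0.9, 0.1]
    similar_odor = [0.9, 1.0, 0.1]
    different_odor = [-1.0, 0.1, 0.9]

    sig = lsh_signature(odor, planes)
    print("similar:  ", hamming_distance(sig, lsh_signature(similar_odor, planes)))
    print("different:", hamming_distance(sig, lsh_signature(different_odor, planes)))
```

In this scheme, inputs separated by a small angle tend to share most signature bits, so candidate neighbours can be found by comparing short signatures instead of full vectors.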
biorxiv neuroscience 200-500-users 2017
The 10,000 Immunomes Project: A resource for human immunology, bioRxiv, 2017-08-26
Abstract: New immunological assays now enable rich measurements of human immune function, but difficulty attaining enough measurements across sufficiently large and diverse cohorts has hindered describing normal human immune physiology on a large scale. Here we present the 10,000 Immunomes Project (10KIP), a diverse human immunology reference derived from over 44,000 individuals across 242 studies from ImmPort, a publicly available resource of raw immunology study data and protocols. We carefully curated datasets, aggregating subjects from healthy control arms and harmonizing data across studies. We demonstrate 10KIP’s utility by describing variations in serum cytokines and leukocytes by age, race, and sex; defining a baseline cell-cytokine network; and using 10KIP as a common control to describe immunologic changes in pregnancy. Subject-level data is available for interactive visualization and download at http://10kImmunomes.org. We believe 10KIP can serve as a common control cohort and will accelerate hypothesis generation by clinical and basic immunologists across diverse populations. One Sentence Summary: An open online resource of human immunology data from more than 10,000 normal subjects including interactive data visualization and download enables a new look at immune system differences across age and sex, rapid hypothesis generation, and creation of custom control cohorts.
biorxiv immunology 200-500-users 2017
Rapid profiling of the preterm infant gut microbiota using nanopore sequencing aids pathogen diagnostics, bioRxiv, 2017-08-25
ABSTRACT: The Oxford Nanopore MinION sequencing platform offers near real time analysis of DNA reads as they are generated, which makes the device attractive for in-field or clinical deployment, e.g. rapid diagnostics. We used the MinION platform for shotgun metagenomic sequencing and analysis of gut-associated microbial communities; firstly, we used a 20-species human microbiota mock community to demonstrate how Nanopore metagenomic sequence data can be reliably and rapidly classified. Secondly, we profiled faecal microbiomes from preterm infants at increased risk of necrotising enterocolitis and sepsis. In single patient time course, we captured the diversity of the immature gut microbiota and observed how its complexity changes over time in response to interventions, i.e. probiotic, antibiotics and episodes of suspected sepsis. Finally, we performed ‘real-time’ runs from sample to analysis using faecal samples of critically ill infants and of healthy infants receiving probiotic supplementation. Real-time analysis was facilitated by our new NanoOK RT software package which analysed sequences as they were generated. We reliably identified potentially pathogenic taxa (i.e. Klebsiella pneumoniae and Enterobacter cloacae) and their corresponding antimicrobial resistance (AMR) gene profiles within as little as one hour of sequencing. Antibiotic treatment decisions may be rapidly modified in response to these AMR profiles, which we validated using pathogen isolation, whole genome sequencing and antibiotic susceptibility testing. Our results demonstrate that our pipeline can process clinical samples to a rich dataset able to inform tailored patient antimicrobial treatment in less than 5 hours.
biorxiv genomics 100-200-users 2017
A Large-Scale Binding and Functional Map of Human RNA Binding Proteins, bioRxiv, 2017-08-24
Genomes encompass all the information necessary to specify the development and function of an organism. In addition to genes, genomes also contain a myriad of functional elements that control various steps in gene expression. A major class of these elements function only when transcribed into RNA as they serve as the binding sites for RNA binding proteins (RBPs), which act to control post-transcriptional processes including splicing, cleavage and polyadenylation, RNA editing, RNA localization, stability, and translation. Despite the importance of these functional RNA elements encoded in the genome, they have been much less studied than genes and DNA elements. Here, we describe the mapping and characterization of RNA elements recognized by a large collection of human RBPs in K562 and HepG2 cells. These data expand the catalog of functional elements encoded in the human genome by addition of a large set of elements that function at the RNA level through interaction with RBPs.
biorxiv genomics 200-500-users 2017