Real-time analysis of nanopore-based metagenomic sequencing from orthopaedic device infection, bioRxiv, 2017-11-18
Abstract: Prosthetic joint infections are clinically difficult to diagnose and treat. Previously, we demonstrated metagenomic sequencing on an Illumina MiSeq replicates the findings of current gold standard microbiological diagnostic techniques. Nanopore sequencing offers advantages in speed of detection over MiSeq. Here, we compare direct-from-clinical-sample metagenomic Illumina sequencing with Nanopore sequencing, and report a real-time analytical pathway for Nanopore sequence data, designed for detecting bacterial composition of prosthetic joint infections. DNA was extracted from the sonication fluids of seven explanted orthopaedic devices, and additionally from two culture negative controls, and was sequenced on the Oxford Nanopore Technologies MinION platform. A specific analysis pipeline was assembled to overcome the challenges of identifying the true infecting pathogen, given high levels of host contamination and unavoidable background lab and kit contamination. The majority of DNA classified (>90%) was host contamination and discarded. Using negative control filtering thresholds, the species identified corresponded with both routine microbiological diagnosis and MiSeq results. By analysing sequences in real time, causes of infection were robustly detected within minutes from initiation of sequencing. We demonstrate initial proof of concept that metagenomic MinION sequencing can provide rapid, accurate diagnosis for prosthetic joint infections. We demonstrate a novel, scalable pipeline for real-time analysis of MinION sequence data. The high proportion of human DNA in extracts prevents full genome analysis from complete coverage, and methods to reduce this could increase genome depth and allow antimicrobial resistance profiling.
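The abstract does not spell out how the negative control filtering thresholds were applied. As a rough illustration only, the sketch below assumes per-species read counts for a sample and for each negative control, and keeps a species when its count clears a minimum and exceeds a multiple of the highest count seen in any control. The function name, the `fold` and `min_reads` parameters, and all read counts are hypothetical and are not taken from the paper.

```python
def filter_by_negative_controls(sample_counts, control_counts_list,
                                fold=10.0, min_reads=5):
    """Keep species whose read count clears thresholds derived from controls.

    Hypothetical rule (an assumption, not the authors' pipeline): a species is
    kept if it has at least `min_reads` reads in the sample and more than
    `fold` times the highest count observed for it in any negative control.
    """
    kept = {}
    for species, count in sample_counts.items():
        # Highest background level for this species across all controls.
        background = max(
            (control.get(species, 0) for control in control_counts_list),
            default=0,
        )
        if count >= min_reads and count > fold * background:
            kept[species] = count
    return kept


# Made-up read counts, for illustration only.
sample = {
    "Staphylococcus aureus": 480,
    "Cutibacterium acnes": 12,
    "Escherichia coli": 3,
}
controls = [
    {"Cutibacterium acnes": 4},
    {"Cutibacterium acnes": 2, "Escherichia coli": 1},
]

result = filter_by_negative_controls(sample, controls)
print(result)
```

With these made-up counts, only the species that is abundant in the sample and absent from both controls passes the filter.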
biorxiv microbiology 100-200-users 2017
Scaling accurate genetic variant discovery to tens of thousands of samples, bioRxiv, 2017-11-15
Abstract: Comprehensive disease gene discovery in both common and rare diseases will require the efficient and accurate detection of all classes of genetic variation across tens to hundreds of thousands of human samples. We describe here a novel assembly-based approach to variant calling, the GATK HaplotypeCaller (HC) and Reference Confidence Model (RCM), that determines genotype likelihoods independently per-sample but performs joint calling across all samples within a project simultaneously. We show by calling over 90,000 samples from the Exome Aggregation Consortium (ExAC) that, in contrast to other algorithms, the HC-RCM scales efficiently to very large sample sizes without loss in accuracy; and that the accuracy of indel variant calling is superior in comparison to other algorithms. More importantly, the HC-RCM produces a fully squared-off matrix of genotypes across all samples at every genomic position being investigated. The HC-RCM is a novel, scalable, assembly-based algorithm with abundant applications for population genetics and clinical studies.
biorxiv genomics 0-100-users 2017
The nature of nurture: effects of parental genotypes, bioRxiv, 2017-11-15
Abstract: Sequence variants in the parental genomes that are not transmitted to a child/proband are often ignored in genetic studies. Here we show that non-transmitted alleles can impact a child through their effects on the parents and other relatives, a phenomenon we call genetic nurture. Using results from a meta-analysis of educational attainment, the polygenic score computed for the non-transmitted alleles of 21,637 probands with at least one parent genotyped has an estimated effect on the educational attainment of the proband that is 29.9% (P = 1.6×10^−14) of that of the transmitted polygenic score. Genetic nurturing effects of this polygenic score extend to other traits. Paternal and maternal polygenic scores have similar effects on educational attainment, but mothers contribute more than fathers to nutrition/health-related traits. One Sentence Summary: Nurture has a genetic component, i.e. alleles in the parents affect the parents’ phenotypes and through that influence the outcomes of the child.
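To make the transmitted versus non-transmitted distinction concrete, the sketch below computes a polygenic score in its usual form, a weighted sum of allele counts, once for the alleles a proband inherited and once for the parental alleles that were not passed on. The variant weights and allele counts are invented for illustration, and the function name is hypothetical. The 29.9% figure in the abstract compares estimated effects of the two scores on educational attainment, which would require a regression on real data that is not attempted here.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts across variants."""
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical per-variant effect weights (e.g. taken from a meta-analysis).
weights = [0.5, -0.25, 1.0]

# Effect-allele counts per variant (0, 1 or 2), made up for illustration.
# transmitted: alleles the proband inherited, one from each parent.
# non_transmitted: the parental alleles that were not passed on.
transmitted = [1, 2, 0]
non_transmitted = [1, 0, 1]

pgs_transmitted = polygenic_score(transmitted, weights)
pgs_non_transmitted = polygenic_score(non_transmitted, weights)
print(pgs_transmitted, pgs_non_transmitted)
```

Both scores use the same weights; only the set of alleles differs, which is what allows the effect of the non-transmitted score to be interpreted as acting through the parents rather than through the child's own genome.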
biorxiv genetics 200-500-users 2017
Comprehensive analysis of mobile genetic elements in the gut microbiome reveals phylum-level niche-adaptive gene pools, bioRxiv, 2017-11-14
Abstract: Mobile genetic elements (MGEs) drive extensive horizontal transfer in the gut microbiome. This transfer could benefit human health by conferring new metabolic capabilities to commensal microbes, or it could threaten human health by spreading antibiotic resistance genes to pathogens. Despite their biological importance and medical relevance, MGEs from the gut microbiome have not been systematically characterized. Here, we present a comprehensive analysis of chromosomal MGEs in the gut microbiome using a method called Split Read Insertion Detection (SRID) that enables the identification of the exact mobilizable unit of MGEs. Leveraging the SRID method, we curated ImmeDB (Intestinal microbiome mobile element database), a database of 5600 putative MGEs encompassing seven MGE classes (https://immedb.mit.edu). We observed that many MGEs carry genes that confer an adaptive advantage to the gut environment, including gene families involved in antibiotic resistance, bile salt detoxification, mucus degradation, capsular polysaccharide biosynthesis, polysaccharide utilization, and sporulation. We find that antibiotic resistance genes are more likely to be spread by conjugation via integrative conjugative elements or integrative mobilizable elements than by transduction via prophages. Additionally, we observed that horizontal transfer of MGEs is extensive within phyla but rare across phyla. Taken together, our findings support phylum-level niche-adaptive gene pools in the gut microbiome. ImmeDB will be a valuable resource for future fundamental and translational studies on the gut microbiome and MGE communities.
biorxiv bioinformatics 100-200-users 2017
EpiGraph: an open-source platform to quantify epithelial organization, bioRxiv, 2017-11-14
Summary: During development, cells must coordinate their differentiation with their growth and organization to form complex multicellular structures such as tissues and organs. Healthy tissues must maintain these structures during homeostasis. Epithelia are packed ensembles of cells from which the different tissues of the organism will originate during embryogenesis. A large barrier to the analysis of the morphogenetic changes in epithelia is the lack of simple tools that enable the quantification of cell arrangements. Here we present EpiGraph, an image analysis tool that quantifies epithelial organization. Our method combines computational geometry and graph theory to measure the degree of order of any packed tissue. EpiGraph goes beyond the traditional polygon distribution analysis, capturing other organizational traits that improve the characterization of epithelia. EpiGraph can objectively compare the rearrangements of epithelial cells during development and homeostasis to quantify how the global ensemble is affected. Importantly, it has been implemented in the open-access platform FIJI. This makes EpiGraph very user friendly, with no programming skills required.
biorxiv developmental-biology 0-100-users 2017
Explanation implies causation?, bioRxiv, 2017-11-14
Abstract: Most researchers do not deliberately claim causal results in an observational study. But do we lead our readers to draw a causal conclusion unintentionally by explaining why significant correlations and relationships may exist? Here we perform a randomized study in a data analysis massive online open course to test the hypothesis that explaining an analysis will lead readers to interpret an inferential analysis as causal. We show that adding an explanation to the description of an inferential analysis leads to a 15.2% increase in readers interpreting the analysis as causal (95% CI 12.8% to 17.5%). We then replicate this finding in a second large-scale massive online open course. Nearly every scientific study, regardless of the study design, includes explanation for observed effects. Our results suggest that these explanations may be misleading to the audience of these data analyses.
biorxiv scientific-communication-and-education 100-200-users 2017