Comparison of long-read sequencing technologies in the hybrid assembly of complex bacterial genomes, bioRxiv, 2019-01-27
Illumina sequencing allows rapid, cheap and accurate whole-genome bacterial analyses, but short reads (<300 bp) do not usually enable complete genome assembly. Long-read sequencing greatly assists with resolving complex bacterial genomes, particularly when combined with short-read Illumina data (hybrid assembly); however, it is not clear how different long-read sequencing methods affect assembly accuracy. Relative automation of the assembly process is also crucial to facilitating high-throughput complete bacterial genome reconstruction, avoiding multiple bespoke filtering and data manipulation steps. In this study, we compared hybrid assemblies for 20 bacterial isolates, including two reference strains, using Illumina sequencing and long reads from either the Oxford Nanopore Technologies (ONT) or the Pacific Biosciences (PacBio) SMRT sequencing platform. We chose isolates from the Enterobacteriaceae family, as these frequently have highly plastic, repetitive genetic structures, and complete genome reconstruction for these species is relevant for a precise understanding of the epidemiology of antimicrobial resistance. We de novo assembled genomes using the hybrid assembler Unicycler and compared different read processing strategies. Both strategies facilitate high-quality genome reconstruction. Combining ONT and Illumina reads fully resolved most genomes without additional manual steps, and at a lower cost per isolate in our setting. Automated hybrid assembly is a powerful tool for complete and accurate bacterial genome assembly.
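As a rough illustration of what an automated hybrid assembly step can look like, the sketch below builds a Unicycler command line from Python. The file names, output directory and thread count are hypothetical placeholders, and the abstract does not state which options or read processing steps the authors actually used; only the standard short-read (`-1`, `-2`), long-read (`-l`), output (`-o`) and thread (`-t`) options are assumed here.

```python
from pathlib import Path
import shlex


def unicycler_hybrid_command(short_r1, short_r2, long_reads, out_dir, threads=8):
    """Build the argument list for a Unicycler hybrid assembly run.

    All paths passed in are hypothetical placeholders, not files from the study.
    """
    return [
        "unicycler",
        "-1", str(short_r1),    # Illumina forward reads
        "-2", str(short_r2),    # Illumina reverse reads
        "-l", str(long_reads),  # ONT or PacBio long reads
        "-o", str(out_dir),     # output directory
        "-t", str(threads),     # number of threads
    ]


cmd = unicycler_hybrid_command(
    Path("isolate01_R1.fastq.gz"),
    Path("isolate01_R2.fastq.gz"),
    Path("isolate01_ont.fastq.gz"),
    Path("isolate01_assembly"),
)
print(shlex.join(cmd))

# To run the assembly (requires Unicycler to be installed and on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Wrapping the call in a function like this makes it straightforward to loop over many isolates without per-sample manual steps, which is the kind of automation the abstract argues for.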
biorxiv bioinformatics 100-200-users 2019
Fast and accurate long-read assembly with wtdbg2, bioRxiv, 2019-01-27
Existing long-read assemblers require tens of thousands of CPU hours to assemble a human genome and are being outpaced by sequencing technologies in terms of both throughput and cost. We developed a novel long-read assembler, wtdbg2, that, for human data, is tens of times faster than published tools while achieving comparable contiguity and accuracy. It represents a significant algorithmic advance and paves the way for population-scale long-read assembly in the future.
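Contiguity of an assembly is commonly summarised with the N50 statistic. The abstract does not say which contiguity metric was used, so treating N50 as the measure is an assumption here, and the contig lengths below are made-up toy values. A minimal sketch:

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembled bases."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Two hypothetical assemblies with the same total size (100 units)
fragmented = [10] * 10
contiguous = [60, 30, 10]
print(n50(fragmented), n50(contiguous))
```

The more contiguous toy assembly yields the larger N50, even though both cover the same number of bases.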
biorxiv bioinformatics 200-500-users 2019
Genome wide meta-analysis identifies genomic relationships, novel loci, and pleiotropic mechanisms across eight psychiatric disorders, bioRxiv, 2019-01-27
Genetic influences on psychiatric disorders transcend diagnostic boundaries, suggesting substantial pleiotropy of contributing loci. However, the nature and mechanisms of these pleiotropic effects remain unclear. We performed a meta-analysis of 232,964 cases and 494,162 controls from genome-wide studies of anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, schizophrenia, and Tourette syndrome. Genetic correlation analyses revealed a meaningful structure within the eight disorders, identifying three groups of inter-related disorders. We detected 109 loci associated with at least two psychiatric disorders, including 23 loci with pleiotropic effects on four or more disorders and 11 loci with antagonistic effects on multiple disorders. The pleiotropic loci are located within genes that show heightened expression in the brain throughout the lifespan, beginning in the second trimester prenatally, and play prominent roles in a suite of neurodevelopmental processes. These findings have important implications for psychiatric nosology, drug development, and risk prediction.
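To make the idea of pooling per-study association results concrete, here is a minimal sketch of an inverse-variance-weighted fixed-effect meta-analysis for a single variant. This is a generic textbook model, not necessarily the one used in the study (the abstract does not specify it), and the effect sizes and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant.

    Returns the pooled effect, its standard error, and the z-score.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need matching, non-empty effect and SE lists")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-study log odds ratios and standard errors for one variant
beta, se, z = fixed_effect_meta([0.10, 0.06, 0.08], [0.02, 0.03, 0.04])
two_sided_p = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={two_sided_p:.2e}")
```

Studies with smaller standard errors receive more weight, so the pooled estimate is pulled towards the most precise study.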
biorxiv genomics 100-200-users 2019
Genome wide meta-analysis identifies genomic relationships, novel loci, and pleiotropic mechanisms across eight psychiatric disorders. Supplemental Tables 1 - 18, bioRxiv, 2019-01-27
Genetic influences on psychiatric disorders transcend diagnostic boundaries, suggesting substantial pleiotropy of contributing loci. However, the nature and mechanisms of these pleiotropic effects remain unclear. We performed a meta-analysis of 232,964 cases and 494,162 controls from genome-wide studies of anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, schizophrenia, and Tourette syndrome. Genetic correlation analyses revealed a meaningful structure within the eight disorders, identifying three groups of inter-related disorders. We detected 109 loci associated with at least two psychiatric disorders, including 23 loci with pleiotropic effects on four or more disorders and 11 loci with antagonistic effects on multiple disorders. The pleiotropic loci are located within genes that show heightened expression in the brain throughout the lifespan, beginning in the second trimester prenatally, and play prominent roles in a suite of neurodevelopmental processes. These findings have important implications for psychiatric nosology, drug development, and risk prediction.
biorxiv genomics 100-200-users 2019
Global phylogeography and ancient evolution of the widespread human gut virus crAssphage, bioRxiv, 2019-01-27
Microbiomes are vast communities of microbes and viruses that populate all natural ecosystems. Viruses have been considered the most variable component of microbiomes, as supported by virome surveys and examples of high genomic mosaicism. However, recent evidence suggests that the human gut virome is remarkably stable compared to other environments. Here we investigate the origin, evolution, and epidemiology of crAssphage, a widespread human gut virus. Through a global collaboratory, we obtained DNA sequences of crAssphage from over one-third of the world's countries, and showed that its phylogeography is locally clustered within countries, cities, and individuals. We also found colinear crAssphage-like genomes in both Old-World and New-World primates, challenging genomic mosaicism and suggesting that the association of crAssphage with primates may be millions of years old. We conclude that crAssphage is a benign globetrotter virus that may have co-evolved with the human lineage and is an integral part of the normal human gut virome.
biorxiv microbiology 0-100-users 2019
High-pass filtering artifacts in multivariate classification of neural time series data, bioRxiv, 2019-01-27
The application of time-resolved multivariate pattern classification analyses (MVPA) to EEG and MEG data has become increasingly popular. Traditionally, such time series data are high-pass filtered before analysis in order to remove slow drifts. Here we show that high-pass filtering should be applied with extreme caution in MVPA, as it may easily create artifacts that displace decoding accuracy in time, leading to statistically significant above-chance classification during time periods in which the source is clearly not brain activity. In both real and simulated EEG data, we show that spurious decoding may emerge with filter cut-off settings as modest as 0.1 Hz. We provide an alternative method of removing slow drift noise, referred to as robust detrending (de Cheveigné & Arzounian, 2018), which, when applied in concert with masking of cortical events, does not result in the temporal displacement of information. We show that temporal generalization may benefit from robust detrending, without any of the unwanted side effects introduced by filtering. However, we conclude that for sufficiently clean data sets, no filtering or detrending at all may work sufficiently well. Implications for other types of data are discussed, followed by a number of recommendations.
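The temporal displacement described above can be reproduced on a toy signal. The sketch below is not the authors' pipeline: it assumes a simple first-order high-pass filter applied forward and backward (zero-phase), a synthetic signal made of a linear drift plus a sustained response from 4 s to 7 s, and a masked linear fit as a much simplified stand-in for robust detrending. All names and parameter values are illustrative.

```python
import math


def highpass_first_order(signal, cutoff_hz, sample_rate_hz):
    """Causal first-order (RC) high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / sample_rate_hz)
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[n] - signal[n - 1]))
    return out


def highpass_zero_phase(signal, cutoff_hz, sample_rate_hz):
    """Forward-backward filtering: no phase shift, but non-causal."""
    forward = highpass_first_order(signal, cutoff_hz, sample_rate_hz)
    backward = highpass_first_order(forward[::-1], cutoff_hz, sample_rate_hz)
    return backward[::-1]


def masked_linear_detrend(times, signal, mask_start_s, mask_end_s):
    """Fit a line to samples outside the masked window and subtract it
    from the whole signal (simplified stand-in for robust detrending)."""
    keep = [(t, x) for t, x in zip(times, signal)
            if not (mask_start_s <= t < mask_end_s)]
    mean_t = sum(t for t, _ in keep) / len(keep)
    mean_x = sum(x for _, x in keep) / len(keep)
    slope = (sum((t - mean_t) * (x - mean_x) for t, x in keep)
             / sum((t - mean_t) ** 2 for t, _ in keep))
    intercept = mean_x - slope * mean_t
    return [x - (slope * t + intercept) for t, x in zip(times, signal)]


def window_mean(times, values, start_s, end_s):
    """Mean of the values whose time stamp lies in [start_s, end_s)."""
    picked = [v for t, v in zip(times, values) if start_s <= t < end_s]
    return sum(picked) / len(picked)


# Synthetic data: 10 s at 100 Hz, slow linear drift plus a sustained
# "response" of amplitude 1 between 4 s and 7 s (all values are made up).
fs = 100.0
times = [n / fs for n in range(1000)]
signal = [0.05 * t + (1.0 if 4.0 <= t < 7.0 else 0.0) for t in times]

filtered = highpass_zero_phase(signal, cutoff_hz=0.1, sample_rate_hz=fs)
detrended = masked_linear_detrend(times, signal, 4.0, 7.0)

# Compare the second before response onset, where only drift is present.
pre_filtered = window_mean(times, filtered, 3.0, 4.0)
pre_detrended = window_mean(times, detrended, 3.0, 4.0)
print(f"pre-onset mean after high-pass:       {pre_filtered:.3f}")
print(f"pre-onset mean after masked detrend:  {pre_detrended:.3f}")
```

In this toy case the zero-phase high-pass filter leaves a clear negative deflection before response onset, where the underlying signal contains only drift, whereas the masked detrend removes the drift without moving any of the response into the pre-onset window. A classifier trained on such filtered data could pick up that deflection, which is the kind of spurious pre-onset decoding the abstract warns about.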
biorxiv neuroscience 100-200-users 2019