Comparative assessment of long-read error-correction software applied to RNA-sequencing data, bioRxiv, 2018-11-23
Abstract. Motivation: Long-read sequencing technologies offer promising alternatives to high-throughput short read sequencing, especially in the context of RNA-sequencing. However, these technologies are currently hindered by high error rates in the output data that affect analyses such as the identification of isoforms, exon boundaries, open reading frames, and the creation of gene catalogues. Due to the novelty of such data, computational methods are still actively being developed and options for the error-correction of RNA-sequencing long reads remain limited. Results: In this article, we evaluate the extent to which existing long-read DNA error correction methods are capable of correcting cDNA Nanopore reads. We provide an automatic and extensive benchmark tool that not only reports classical error-correction metrics but also the effect of correction on gene families, isoform diversity, bias towards the major isoform, and splice site detection. We find that long read error-correction tools that were originally developed for DNA are also suitable for the correction of RNA-sequencing data, especially in terms of increasing base-pair accuracy. Yet investigators should be warned that the correction process perturbs gene family sizes and isoform diversity. This work provides guidelines on which (or whether) error-correction tools should be used, depending on the application type. Benchmarking software: https://gitlab.com/leoisl/LR_EC_analyser
biorxiv bioinformatics 0-100-users 2018
Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder, Nature Genetics, 2018-11-23
Attention deficit/hyperactivity disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 individuals diagnosed with ADHD and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, finding important new information about the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes and around brain-expressed regulatory marks. Analyses of three replication studies (a cohort of individuals diagnosed with ADHD, a self-reported ADHD sample and a meta-analysis of quantitative measures of ADHD symptoms in the population) support these findings while highlighting study-specific differences on genetic overlap with educational attainment. Strong concordance with GWAS of quantitative population measures of ADHD symptoms supports that clinical diagnosis of ADHD is an extreme expression of continuous heritable traits.
nature genetics genetics 200-500-users 2018
Nuclei multiplexing with barcoded antibodies for single-nucleus genomics, bioRxiv, 2018-11-23
Abstract. Single-nucleus RNA-Seq (snRNA-seq) enables the interrogation of cellular states in complex tissues that are challenging to dissociate, including frozen clinical samples. This opens the way, in principle, to large studies, such as those required for human genetics, clinical trials, or precise cell atlases of large organs. However, such applications are currently limited by batch effects, sequential processing, and costs. To address these challenges, we present an approach for multiplexing snRNA-seq, using sample-barcoded antibodies against the nuclear pore complex to uniquely label nuclei from distinct samples. Comparing human brain cortex samples profiled in multiplex with or without hashing antibodies, we demonstrate that nucleus hashing does not significantly alter the recovered transcriptome profiles. We further developed demuxEM, a novel computational tool that robustly detects inter-sample nucleus multiplets and assigns singlets to their samples of origin by antibody barcodes, and validated its accuracy using gender-specific gene expression, species-mixing and natural genetic variation. Nucleus hashing significantly reduces cost per nucleus, recovering up to about 5 times as many single nuclei per microfluidic channel. Our approach provides a robust technique for diverse studies including tissue atlases of isogenic model organisms or from a single larger human organ, multiple biopsies or longitudinal samples of one donor, and large-scale perturbation screens.
biorxiv genomics 0-100-users 2018
C. elegans pathogenic learning confers multigenerational pathogen avoidance, bioRxiv, 2018-11-22
Abstract. The ability to pass on learned information to progeny could present an evolutionary advantage for many generations. While apparently evolutionarily conserved [1–12], transgenerational epigenetic inheritance (TEI) is not well understood at the molecular or behavioral levels. Here we describe our discovery that C. elegans can pass on a learned pathogenic avoidance behavior to their progeny for several generations through epigenetic mechanisms. Although worms are initially attracted to the Gram-negative bacterium P. aeruginosa (PA14) [13], they can learn to avoid this pathogen [13]. We found that prolonged PA14 exposure results in transmission of avoidance behavior to progeny that have themselves never been exposed to PA14, and this behavior persists through the fourth generation. This form of transgenerational inheritance of bacterial avoidance is specific to pathogenic P. aeruginosa, requires physical contact and infection, and is distinct from CREB-dependent long-term associative memory and larval imprinting. The TGF-β ligand daf-7, whose expression increases in the ASJ upon initial exposure to PA14 [14], is highly expressed in the ASI neurons of progeny of trained mothers until the fourth generation, correlating with transgenerational avoidance behavior. Mutants of histone modifiers and small RNA mediators display defects in naïve PA14 attraction and aversive learning. By contrast, the germline-expressed PRG-1/Piwi homolog [15] is specifically required for transgenerational inheritance of avoidance behavior. Our results demonstrate a novel and natural paradigm of TEI that may optimize progeny decisions and subsequent survival in the face of changing environmental conditions.
biorxiv genetics 100-200-users 2018
Hunger for Knowledge: How the Irresistible Lure of Curiosity is Generated in the Brain, bioRxiv, 2018-11-22
Introductory paragraph. Curiosity is often portrayed as a desirable feature of human faculty. For example, a meta-analysis revealed that curiosity predicts academic performance above and beyond intelligence [1], corroborating findings that curiosity supported long-term consolidation of learning [2,3]. However, curiosity may come at a cost of strong seductive power that sometimes puts people in a harmful situation. Here, with a set of three behavioural and two neuroimaging experiments including novel stimuli that strongly trigger curiosity (i.e. magic tricks), we examined the psychological and neural mechanisms underlying the irresistible lure of curiosity. We consistently demonstrated that across different samples people were indeed willing to gamble to expose themselves to physical risks (i.e. electric shocks) in order to satisfy their curiosity for trivial knowledge that carries no apparent instrumental values. Also, underlying this seductive power of curiosity is its incentive salience properties, which share common neural mechanisms with extrinsic incentives (i.e. hunger for foods). In particular, the two independent fMRI experiments using different kinds of curiosity-stimulating stimuli found replicable results that acceptance (compared to rejection) of curiosity/incentive-driven gambles was accompanied with an enhanced activity in the striatum.
biorxiv neuroscience 100-200-users 2018
A primer on deep learning in genomics, Nature Genetics, 2018-11-21
Deep learning methods are a class of machine learning techniques capable of identifying highly complex patterns in large datasets. Here, we provide a perspective and primer on deep learning applications for genome analysis. We discuss successful applications in the fields of regulatory genomics, variant calling and pathogenicity scores. We include general guidance for how to effectively use deep learning methods as well as a practical guide to tools and resources. This primer is accompanied by an interactive online tutorial.
nature genetics genetics 500+-users 2018