Comparative assessment of long-read error-correction software applied to RNA-sequencing data, bioRxiv, 2018-11-23

Abstract. Motivation: Long-read sequencing technologies offer promising alternatives to high-throughput short-read sequencing, especially in the context of RNA-sequencing. However, these technologies are currently hindered by high error rates in the output data that affect analyses such as the identification of isoforms, exon boundaries, open reading frames, and the creation of gene catalogues. Due to the novelty of such data, computational methods are still actively being developed and options for the error-correction of RNA-sequencing long reads remain limited. Results: In this article, we evaluate the extent to which existing long-read DNA error-correction methods are capable of correcting cDNA Nanopore reads. We provide an automatic and extensive benchmark tool that not only reports classical error-correction metrics but also the effect of correction on gene families, isoform diversity, bias towards the major isoform, and splice site detection. We find that long-read error-correction tools that were originally developed for DNA are also suitable for the correction of RNA-sequencing data, especially in terms of increasing base-pair accuracy. Yet investigators should be warned that the correction process perturbs gene family sizes and isoform diversity. This work provides guidelines on which (or whether) error-correction tools should be used, depending on the application type. Benchmarking software: https://gitlab.com/leoisl/LR_EC_analyser

biorxiv bioinformatics 0-100-users 2018

Nuclei multiplexing with barcoded antibodies for single-nucleus genomics, bioRxiv, 2018-11-23

Abstract. Single-nucleus RNA-Seq (snRNA-seq) enables the interrogation of cellular states in complex tissues that are challenging to dissociate, including frozen clinical samples. This opens the way, in principle, to large studies, such as those required for human genetics, clinical trials, or precise cell atlases of large organs. However, such applications are currently limited by batch effects, sequential processing, and costs. To address these challenges, we present an approach for multiplexing snRNA-seq, using sample-barcoded antibodies against the nuclear pore complex to uniquely label nuclei from distinct samples. Comparing human brain cortex samples profiled in multiplex with or without hashing antibodies, we demonstrate that nucleus hashing does not significantly alter the recovered transcriptome profiles. We further developed demuxEM, a novel computational tool that robustly detects inter-sample nucleus multiplets and assigns singlets to their samples of origin by antibody barcodes, and validated its accuracy using gender-specific gene expression, species-mixing and natural genetic variation. Nucleus hashing significantly reduces cost per nucleus, recovering up to about 5 times as many single nuclei per microfluidic channel. Our approach provides a robust technique for diverse studies including tissue atlases of isogenic model organisms or from a single larger human organ, multiple biopsies or longitudinal samples of one donor, and large-scale perturbation screens.

biorxiv genomics 0-100-users 2018

 
