Cell “hashing” with barcoded antibodies enables multiplexing and doublet detection for single cell genomics, bioRxiv, 2017-12-22
ABSTRACT: Despite rapid developments in single cell sequencing technology, sample-specific batch effects, detection of cell doublets, and the cost of generating massive datasets remain outstanding challenges. Here, we introduce cell “hashing”, where oligo-tagged antibodies against ubiquitously expressed surface proteins are used to uniquely label cells from distinct samples, which can be subsequently pooled. By sequencing these tags alongside the cellular transcriptome, we can assign each cell to its sample of origin, and robustly identify doublets originating from multiple samples. We demonstrate our approach by pooling eight human PBMC samples on a single run of the 10x Chromium system, substantially reducing our per-cell costs for library generation. Cell “hashing” is inspired by, and complementary to, elegant multiplexing strategies based on genetic variation, which we also leverage to validate our results. We therefore envision that our approach will help to generalize the benefits of single cell multiplexing to diverse samples and experimental designs.
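The assignment step described in the abstract can be illustrated with a minimal sketch. It assumes a per-cell table of hashtag read counts and a single fixed cutoff; the threshold value, the sample names, and the counts are all hypothetical, and this is not the authors' classification procedure.

```python
from typing import Dict

# Hypothetical cutoff: a hashtag with at least this many reads counts as "present".
# The abstract gives no threshold; real data would need one chosen per hashtag.
DEFAULT_THRESHOLD = 50


def classify_cell(hto_counts: Dict[str, int], threshold: int = DEFAULT_THRESHOLD) -> str:
    """Return the sample label, "doublet", or "negative" for one cell barcode."""
    positive = [sample for sample, count in hto_counts.items() if count >= threshold]
    if not positive:
        return "negative"   # no hashtag detected above the cutoff
    if len(positive) == 1:
        return positive[0]  # singlet: exactly one sample of origin
    return "doublet"        # hashtags from two or more samples in one droplet


# Made-up counts for three cell barcodes across three pooled samples.
cells = {
    "AAAC": {"sampleA": 412, "sampleB": 3, "sampleC": 7},
    "AAAG": {"sampleA": 230, "sampleB": 198, "sampleC": 5},
    "AAAT": {"sampleA": 2, "sampleB": 4, "sampleC": 1},
}
calls = {barcode: classify_cell(counts) for barcode, counts in cells.items()}
```

A scheme like this can only flag doublets whose two cells come from different samples, which matches the abstract's wording; two cells from the same sample carry the same hashtag and look like a singlet.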
biorxiv genomics 100-200-users 2017
Cellular diversity in the Drosophila midbrain revealed by single-cell transcriptomics, bioRxiv, 2017-12-22
Abstract: To understand the brain, molecular details need to be overlaid onto neural wiring diagrams so that synaptic mode, neuromodulation and critical signaling operations can be considered. Single-cell transcriptomics provides a unique opportunity to collect this information. Here we present an initial analysis of thousands of individual cells from the Drosophila midbrain that were acquired using Drop-Seq. A number of approaches permitted the assignment of transcriptional profiles to several major brain regions and cell-types. Expression of biosynthetic enzymes and reuptake mechanisms allows all the neurons to be typed according to the neurotransmitter or neuromodulator that they produce and presumably release. Some neuropeptides are preferentially co-expressed in neurons using a particular fast-acting transmitter, or monoamine. Neuromodulatory and neurotransmitter receptor subunit expression illustrates the potential of these molecules in generating complexity in neural circuit function. This cell atlas dataset provides an important resource to link molecular operations to brain regions and complex neural processes.
biorxiv neuroscience 0-100-users 2017
Challenges in Using ctDNA to Achieve Early Detection of Cancer, bioRxiv, 2017-12-22
Abstract: Early detection of cancer is a significant unmet clinical need. Improved technical ability to detect circulating tumor-derived DNA (ctDNA) in the cell-free DNA (cfDNA) component of blood plasma via next-generation sequencing and established correlations between ctDNA load and tumor burden in cancer patients have spurred excitement about the possibilities of detecting cancer early by performing ctDNA mutation detection. We reanalyze published data on the expected ctDNA allele fraction in early-stage cancer and the population statistics of cfDNA concentration to show that under conservative technical assumptions, high-sensitivity cancer detection by ctDNA mutation detection will require either more blood volume (150-300 mL) than practical for a routine screen or variant filtering that may be impossible given our knowledge of cancer evolution, and will likely remain out of economic reach for routine population screening without multiple-order-of-magnitude decreases in sequencing cost. Instead, new approaches that integrate ctDNA mutations with multiple other blood-based analytes (such as exosomes, circulating tumor cells, ctDNA epigenetics, metabolites) as well as integration of these signals over time for each individual may be needed.
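The kind of sampling argument behind a blood-volume estimate can be sketched as follows. This is an illustration only, not the authors' model: it assumes a single tracked mutation, Poisson sampling of mutant fragments, perfect capture and error-free sequencing, and default values (5 ng cfDNA per mL plasma, plasma as 55% of blood volume) that are placeholders rather than figures from the paper. The function name and parameters are hypothetical.

```python
import math

HAPLOID_GENOME_PG = 3.3  # approximate mass of one haploid human genome, in picograms


def blood_volume_ml(detect_prob, allele_fraction,
                    cfdna_ng_per_ml_plasma=5.0, plasma_fraction=0.55):
    """Blood volume (mL) needed to sample at least one mutant fragment
    with probability detect_prob, under the stated assumptions."""
    if not 0.0 < detect_prob < 1.0:
        raise ValueError("detect_prob must be strictly between 0 and 1")
    if allele_fraction <= 0.0:
        raise ValueError("allele_fraction must be positive")
    # Haploid genome copies per mL of plasma (1 ng = 1000 pg).
    genomes_per_ml_plasma = cfdna_ng_per_ml_plasma * 1000.0 / HAPLOID_GENOME_PG
    # Expected mutant copies of the tracked locus per mL of whole blood.
    mutant_per_ml_blood = genomes_per_ml_plasma * plasma_fraction * allele_fraction
    # Poisson: P(at least one copy) = 1 - exp(-mean), solved for the mean.
    required_mean = -math.log(1.0 - detect_prob)
    return required_mean / mutant_per_ml_blood


# Example call with placeholder inputs: 95% detection at an allele fraction of 0.01%.
volume = blood_volume_ml(0.95, 1e-4)
```

Because the required volume scales inversely with allele fraction and cfDNA concentration, lower values of either, or any loss during extraction and library preparation, push the estimate upward.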
biorxiv cancer-biology 0-100-users 2017
Firefly genomes illuminate parallel origins of bioluminescence in beetles, bioRxiv, 2017-12-22
Abstract: Fireflies and their fascinating luminous courtships have inspired centuries of scientific study. Today firefly luciferase is widely used in biotechnology, but the evolutionary origin of their bioluminescence remains unclear. To shed light on this long-standing question, we sequenced the genomes of two firefly species that diverged over 100 million years ago: the North American Photinus pyralis and the Japanese Aquatica lateralis. We also sequenced the genome of a related click-beetle, the Caribbean Ignelater luminosus, with bioluminescent biochemistry near-identical to fireflies, but anatomically unique light organs, suggesting the intriguing but contentious hypothesis of parallel gains of bioluminescence. Our analyses support two independent gains of bioluminescence between fireflies and click-beetles, and provide new insights into the genes, chemical defenses, and symbionts that evolved alongside their luminous lifestyle.
One Sentence Summary: Comparative analyses of the first linkage-group-resolution genomes of fireflies and related bioluminescent beetles address long-standing questions of the origin and evolution of bioluminescence and its associated traits.
biorxiv genomics 200-500-users 2017
NanoPack: visualizing and processing long read sequencing data, bioRxiv, 2017-12-22
Abstract
Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences.
Availability and Implementation: The NanoPack tools are written in Python3 and released under the GNU GPL3.0 Licence. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools.
Contact: wouter.decoster@molgen.vib-ua.be
Supplementary information: Supplementary tables and figures are available at Bioinformatics online.
biorxiv bioinformatics 100-200-users 2017
The essential genome of Escherichia coli K-12, bioRxiv, 2017-12-22
ABSTRACT: Transposon-Directed Insertion-site Sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. To date, the TraDIS method has not been benchmarked against such libraries and therefore it remains unclear whether the two methodologies are comparable. To address this, a high density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false positive identification of essential gene candidates, statistical data analysis included corrections for both gene length and genome length. Through this analysis new essential genes and genes previously incorrectly designated as essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects and fine resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis datasets will assist genome annotation of less well characterized genomes and provide new insights into bacterial physiology and biochemistry.
IMPORTANCE: Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in E. coli, we constructed a very high density transposon mutant library. Initial automated analysis of the resulting data revealed many discrepancies when compared to the literature. We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high density TraDIS sequencing data for each putative essential gene for the model laboratory organism, Escherichia coli. This paper is important because it provides a better understanding of the essential genes of E. coli, reveals the limitations of relying on automated analysis alone and provides a new standard for the analysis of TraDIS data.
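One way to picture a correction for gene length and genome length is a length-normalized insertion density, sketched below. The abstract does not spell out its statistics, so the function names and the normalization are assumptions made for illustration, not the authors' method.

```python
def insertion_index(n_insertions: int, gene_length_bp: int) -> float:
    """Unique transposon insertion sites per base pair of a gene."""
    if gene_length_bp <= 0:
        raise ValueError("gene_length_bp must be positive")
    return n_insertions / gene_length_bp


def relative_insertion_density(n_insertions: int, gene_length_bp: int,
                               total_insertions: int, genome_length_bp: int) -> float:
    """Gene insertion index divided by the genome-wide insertion density.

    Values near 1 mean the gene is hit about as often as the genome average;
    values near 0 mark candidate essential genes (hypothetical interpretation).
    """
    if genome_length_bp <= 0 or total_insertions <= 0:
        raise ValueError("genome length and total insertions must be positive")
    genome_density = total_insertions / genome_length_bp
    return insertion_index(n_insertions, gene_length_bp) / genome_density


# Made-up example: a 1,000 bp gene with 2 insertions in a library that averages
# one insertion every 10 bp across a 100,000 bp toy genome.
score = relative_insertion_density(2, 1000, 10000, 100000)
```

A single low score is not proof of essentiality; as the abstract notes, short essential regions and orientation-dependent effects only show up when the insertion profile of each gene is inspected directly.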
biorxiv microbiology 100-200-users 2017