Comparative performance of the BGI and Illumina sequencing technology for single-cell RNA-sequencing, bioRxiv, 2019-02-17
The libraries generated by high-throughput single cell RNA-sequencing platforms such as the Chromium from 10x Genomics require considerable amounts of sequencing, typically due to the large number of cells. The ability to use this data to address biological questions is directly impacted by the quality of the sequence data. Here we have compared the performance of the Illumina NextSeq 500 and NovaSeq 6000 against the BGI MGISEQ-2000 platform using identical Single Cell libraries consisting of over 70,000 cells. Our results demonstrate a highly comparable performance between the NovaSeq 6000 and MGISEQ-2000 in sequencing quality, and cell, UMI, and gene detection. However, compared with the NextSeq 500, the MGISEQ-2000 platform performs consistently better, identifying more cells, genes, and UMIs at equalised read depth. We were able to call an additional 1,065,659 SNPs from sequence data generated by the BGI platform, enabling an additional 14% of cells to be assigned to the correct donor from a multiplexed library. However, both the NextSeq 500 and MGISEQ-2000 detected similar frequencies of gRNAs from a pooled CRISPR single cell screen. Our study provides a benchmark for high capacity sequencing platforms applied to high-throughput single cell RNA-seq libraries.
biorxiv genomics 100-200-users 2019
Evolutionary dynamics of phage resistance in bacterial biofilms, bioRxiv, 2019-02-17
Interactions among bacteria and their viral predators, the bacteriophages, are likely among the most common ecological phenomena on Earth. The constant threat of phage infection to bacterial hosts, and the imperative of achieving infection on the part of phages, drives an evolutionary contest in which phage-resistant bacteria emerge, often followed by phages with new routes of infection. This process has received abundant theoretical and experimental attention for decades and forms an important basis for molecular genetics and theoretical ecology and evolution. However, at present, we know very little about the nature of phage-bacteria interaction -- and the evolution of phage resistance -- inside the surface-bound communities that microbes usually occupy in natural environments. These communities, termed biofilms, are encased in a matrix of secreted polymers produced by their microbial residents. Biofilms are spatially constrained such that interactions become limited to neighbors or near-neighbors; diffusion of solutes and particulates is reduced; and there is pronounced heterogeneity in nutrient access and therefore physiological state. These factors can dramatically impact the way phage infections proceed even in simple, single-strain biofilms, but we still know little of their effect on phage resistance evolutionary dynamics. Here we explore this problem using a computational simulation framework customized for implementing phage infection inside multi-strain biofilms. Our simulations predict that it is far easier for phage-susceptible and phage-resistant bacteria to coexist inside biofilms relative to planktonic culture, where phages and hosts are well-mixed. We characterize the negative frequency dependent selection that underlies this coexistence, and we then test and confirm this prediction using an experimental model of biofilm growth measured with confocal microscopy at single-cell and single-phage resolution.
biorxiv microbiology 0-100-users 2019
High-Fidelity Nanopore Sequencing of Ultra-Short DNA Sequences, bioRxiv, 2019-02-17
Nanopore sequencing offers a portable and affordable alternative to sequencing-by-synthesis methods but suffers from lower accuracy and cannot sequence ultra-short DNA. This puts applications such as molecular diagnostics based on the analysis of cell-free DNA or single-nucleotide variants (SNV) out of reach. To overcome these limitations, we report a nanopore-based sequencing strategy in which short target sequences are first circularized and then amplified via rolling-circle amplification to produce long stretches of concatemeric repeats. These can be sequenced on the MinION platform from Oxford Nanopore Technologies (ONT), and the resulting repeat sequences aligned to produce a highly-accurate consensus that reduces the high error-rate present in the individual repeats. Using this approach, we demonstrate for the first time the ability to obtain unbiased and accurate nanopore data for target DNA sequences of < 100 bp. Critically, this approach is sensitive enough to achieve SNV discrimination in mixtures of sequences and even enables quantitative detection of specific variants present at ratios of < 10%. Our method is simple, cost-effective, and only requires well-established processes. It therefore expands the utility of nanopore sequencing for molecular diagnostics and other applications, especially in resource-limited settings.
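The consensus step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the concatemeric read has already been split into individual repeats and that those repeats are aligned to equal length (no insertions or deletions), and the function name `consensus` and the example sequences are hypothetical. Under those assumptions, a column-wise majority vote shows how errors scattered across individual repeats are outvoted.

```python
from collections import Counter


def consensus(repeats):
    """Column-wise majority vote over equal-length, pre-aligned repeats.

    Assumption: each string is one copy of the target taken from a single
    concatemeric read, already aligned so that position i matches across copies.
    """
    if not repeats:
        raise ValueError("need at least one repeat")
    length = len(repeats[0])
    if any(len(r) != length for r in repeats):
        raise ValueError("repeats must be pre-aligned to equal length")
    # For each alignment column, keep the most frequent base.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*repeats))


# Hypothetical repeats: three of the five copies carry one error each.
repeats = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC", "TCGTAC"]
print(consensus(repeats))
```

Real nanopore reads also contain insertions and deletions, so a practical implementation would need a proper multiple-sequence alignment before any voting step.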
biorxiv bioengineering 100-200-users 2019
Large, three-generation CEPH families reveal post-zygotic mosaicism and variability in germline mutation accumulation, bioRxiv, 2019-02-17
The number of de novo mutations (DNMs) found in an offspring’s genome increases with both paternal and maternal age. But does the rate of mutation accumulation in human gametes differ across families? Using sequencing data from 33 large, three-generation CEPH families, we observed significant variability in parental age effects on DNM counts across families, with estimates ranging from 0.19 to 3.24 DNMs per year. Additionally, we found that approximately 3% of DNMs originated following primordial germ cell specification (PGCS) in a parent, and differed from non-mosaic germline DNMs in their mutational spectra. We also discovered that nearly 10% of candidate DNMs in the second generation were post-zygotic, and present in both somatic and germ cells; these gonosomal mutations occurred at equivalent frequencies on both parental haplotypes. Our results demonstrate that the rate of germline mutation accumulation varies among families with similar ancestry, and confirm that post-zygotic mosaicism is a substantial source of de novo mutations in humans. Data and code availability: Code used for statistical analysis and figure generation has been deposited on GitHub as a collection of annotated Jupyter Notebooks (https://github.com/quinlan-lab/ceph-dnm-manuscript). Data files containing high-confidence de novo mutations, as well as the gonosomal and post-primordial germ cell specification (PGCS) mosaic mutations, are included with these Notebooks. To mitigate compatibility issues, we have also made all notebooks available in a Binder environment, accessible at the above GitHub repository.
biorxiv genetics 0-100-users 2019
SquiggleKit: A toolkit for manipulating nanopore signal data, bioRxiv, 2019-02-17
The management of raw nanopore sequencing data poses a challenge that must be overcome to accelerate the development of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a toolkit for manipulating and interrogating nanopore data that simplifies file handling, data extraction, visualisation, and signal processing. Its modular tools can be used to reduce file numbers and memory footprint; identify poly-A tails; target barcodes and adapters; and find nucleotide sequence motifs in raw nanopore signal, amongst other applications. SquiggleKit serves as a bioinformatics portal into signal space, for novice and experienced users alike. It is comprehensively documented, simple to use, cross-platform compatible and freely available from https://github.com/Psy-Fer/SquiggleKit.
biorxiv bioinformatics 100-200-users 2019
A high-resolution, chromosome-assigned Komodo dragon genome reveals adaptations in the cardiovascular, muscular, and chemosensory systems of monitor lizards, bioRxiv, 2019-02-16
Monitor lizards are unique among ectothermic reptiles in that they have a high aerobic capacity and distinctive cardiovascular physiology which resembles that of endothermic mammals. We have sequenced the genome of the Komodo dragon (Varanus komodoensis), the largest extant monitor lizard, and present a high resolution de novo chromosome-assigned genome assembly for V. komodoensis, generated with a hybrid approach of long-range sequencing and single molecule physical mapping. Comparing the genome of V. komodoensis with those of related species showed evidence of positive selection in pathways related to muscle energy metabolism, cardiovascular homeostasis, and thrombosis. We also found species-specific expansions of a chemoreceptor gene family related to pheromone and kairomone sensing in V. komodoensis and several other lizard lineages. Together, these evolutionary signatures of adaptation reveal genetic underpinnings of the unique Komodo sensory, cardiovascular, and muscular systems, and suggest that selective pressure altered thrombosis genes to help Komodo dragons evade the anticoagulant effects of their own saliva. As the only sequenced monitor lizard genome, the Komodo dragon genome is an important resource for understanding the biology of this lineage and of reptiles worldwide.
biorxiv genomics 100-200-users 2019