High accuracy DNA sequencing on a small, scalable platform via electrical detection of single base incorporations, bioRxiv, 2019-04-16
Abstract: High throughput DNA sequencing technologies have undergone tremendous development over the past decade. Although optical detection-based sequencing has constituted the majority of data output, it requires a large capital investment and aggregation of samples to achieve optimal cost per sample. We have developed a novel electronic detection-based platform capable of accurately detecting single base incorporations. The GenapSys technology with its electronic detection modality allows the system to be compact, accessible, and affordable. We demonstrate the performance of the system by sequencing several different microbial genomes with varying GC content. The platform is capable of generating 1.5 Gb of high-quality nucleic acid sequence in a single run. We routinely generate sequence data that exceeds 99% raw accuracy with read lengths of up to 175 bp. The utility of the platform is highlighted by targeted sequencing of the human genome. We show high concordance of SNP detection on the human NA12878 HapMap cell line with data generated on the Illumina sequencing platform. In addition, we sequenced a targeted panel of cancer-associated genes in a well characterized reference standard. With multiple library preparation approaches on this sample, we were able to identify low frequency mutations at expected allele frequencies.
biorxiv genomics 100-200-users 2019
nf-core: Community curated bioinformatics pipelines, bioRxiv, 2019-04-16
Abstract: The standardization, portability, and reproducibility of analysis pipelines is a renowned problem within the bioinformatics community. Most pipelines are designed for execution on-premise, and the associated software dependencies are tightly coupled with the local compute environment. This leads to poor pipeline portability and reproducibility of the ensuing results - both of which are fundamental requirements for the validation of scientific findings. Here, we introduce nf-core: a framework that provides a community-driven, peer-reviewed platform for the development of best practice analysis pipelines written in Nextflow. Key obstacles in pipeline development such as portability, reproducibility, scalability and unified parallelism are inherently addressed by all nf-core pipelines. We are also continually developing a suite of tools that assist in the creation and development of both new and existing pipelines. Our primary goal is to provide a platform for high-quality, reproducible bioinformatics pipelines that can be utilized across various institutions and research facilities.
biorxiv bioinformatics 100-200-users 2019
Scale-free Vertical Tracking Microscopy: Towards Bridging Scales in Biological Oceanography, bioRxiv, 2019-04-16
Abstract: Understanding key biophysical phenomena in the ocean often requires one to simultaneously focus on microscale entities, such as motile plankton and sedimenting particles, while maintaining the macroscale context of vertical transport in a highly stratified environment. This poses a conundrum: How to measure single organisms, at microscale resolution, in the lab, while allowing them to freely move hundreds of meters in the vertical direction? We present a solution in the form of a scale-free, vertical tracking microscope based on a circular “hydrodynamic-treadmill”. Our technology allows us to transcend physiological and ecological scales, tracking organisms from marine zooplankton to single-cells over vertical scales of meters while resolving microflows and behavioral processes. We demonstrate measurements of sinking particles, including marine snow as they sediment tens of meters while capturing sub-particle-scale phenomena. We also demonstrate depth-patterned virtual-reality environments for novel behavioral analyses of microscale plankton. This technique offers a new experimental paradigm in microscale ocean biophysics by combining physiological-scale imaging with free movement in an ecological-scale patterned environment.
One sentence summary: Scale-free vertical tracking microscopy captures, for the first time, untethered behavioral dynamics at cellular resolution for marine plankton.
biorxiv biophysics 100-200-users 2019
Rare variants contribute disproportionately to quantitative trait variation in yeast, bioRxiv, 2019-04-15
Abstract: A detailed understanding of the sources of heritable variation is a central goal of modern genetics. Genome-wide association studies (GWAS) in humans [1] have implicated tens of thousands of DNA sequence variants in disease risk and quantitative trait variation, but these variants fail to account for the entire heritability of diseases and traits. GWAS have by design focused on common DNA sequence variants; however, recent studies underscore the likely importance of the contribution of rare variants to heritable variation [2]. Further, finding the genes that underlie the GWAS signals remains a major challenge. Here, we use a unique model system to disentangle the contributions of common and rare variants to a large number of quantitative traits. We generated large crosses among 16 diverse yeast strains and identified thousands of quantitative trait loci (QTLs) that explain most of the heritable variation in 38 traits. We combined our results with sequencing data for 1,011 yeast isolates [3] to decouple variant effect size estimation from allele frequency and showed that rare variants make a disproportionate contribution to trait variation as a consequence of their larger effect sizes. Evolutionary analyses revealed that this contribution is driven by rare variants that arose recently, that such variants are more likely to decrease fitness, and that negative selection has shaped the relationship between variant frequency and effect size. Finally, we leveraged the structure of the crosses to resolve hundreds of QTLs to single genes. These results refine our understanding of trait variation at the population level and suggest that studies of rare variants are a fertile ground for discovery of genetic effects.
biorxiv genetics 100-200-users 2019
Automated Reconstruction of a Serial-Section EM Drosophila Brain with Flood-Filling Networks and Local Realignment, bioRxiv, 2019-04-12
Abstract: Reconstruction of neural circuitry at single-synapse resolution is an attractive target for improving understanding of the nervous system in health and disease. Serial section transmission electron microscopy (ssTEM) is among the most prolific imaging methods employed in pursuit of such reconstructions. We demonstrate how Flood-Filling Networks (FFNs) can be used to computationally segment a forty-teravoxel whole-brain Drosophila ssTEM volume. To compensate for data irregularities and imperfect global alignment, FFNs were combined with procedures that locally re-align serial sections and dynamically adjust image content. The proposed approach produced a largely merger-free segmentation of the entire ssTEM Drosophila brain, which we make freely available. As compared to manual tracing using an efficient skeletonization strategy, the segmentation enabled circuit reconstruction and analysis workflows that were an order of magnitude faster.
biorxiv neuroscience 100-200-users 2019
Exploring dimension-reduced embeddings with Sleepwalk, bioRxiv, 2019-04-12
Abstract: Dimension-reduction methods, such as t-SNE or UMAP, are widely used when exploring high-dimensional data describing many entities, e.g., RNA-Seq data for many single cells. However, dimension reduction is unavoidably prone to introducing artefacts, and we hence need means to see where a dimension-reduced embedding is a faithful representation of the local neighbourhood and where it is not. We present Sleepwalk, a simple but powerful tool that allows the user to interactively explore an embedding, using colour to depict “true” similarities of all points to the cell under the mouse cursor. We show how this approach not only highlights distortions, but also reveals otherwise hidden characteristics of the data, and how Sleepwalk’s comparative modes help integrate multi-sample data and understand differences between embedding and preprocessing methods. Sleepwalk is a versatile and intuitive tool that unlocks the full power of dimension reduction and will be of value not only in single-cell RNA-Seq but also in any other area with matrix-shaped big data.
biorxiv bioinformatics 100-200-users 2019