The Barcode, UMI, Set format and BUStools, bioRxiv, 2018-11-19
Abstract: We introduce the Barcode-UMI-Set format (BUS) for representing pseudoalignments of reads from single-cell RNA-seq experiments. The format can be used with all single-cell RNA-seq technologies, and we show that BUS files can be efficiently generated. BUStools is a suite of tools for working with BUS files and facilitates rapid quantification and analysis of single-cell RNA-seq data. The BUS format therefore makes possible the development of modular, technology-specific, and robust workflows for single-cell RNA-seq analysis.
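To make the barcode, UMI, set idea concrete, here is a minimal in-memory sketch in Python. It is not the BUS binary layout or the BUStools API: the `BusRecord` type, its field names, and the helpers `collapse` and `umis_per_cell_set` are hypothetical. It also assumes that the "set" is an integer id for the set of transcripts a read is compatible with.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class BusRecord(NamedTuple):
    """Hypothetical record: one pseudoaligned read (or a group of identical reads)."""
    barcode: str    # cell barcode sequence
    umi: str        # unique molecular identifier sequence
    ec: int         # assumed: id of the set of transcripts compatible with the read
    count: int = 1  # number of reads sharing this (barcode, umi, ec) triple


def collapse(records):
    """Merge identical (barcode, umi, ec) triples, summing counts; output is sorted."""
    totals = Counter()
    for r in records:
        totals[(r.barcode, r.umi, r.ec)] += r.count
    return [BusRecord(b, u, e, c) for (b, u, e), c in sorted(totals.items())]


def umis_per_cell_set(records):
    """Count distinct UMIs observed for each (barcode, ec) pair."""
    seen = defaultdict(set)
    for r in records:
        seen[(r.barcode, r.ec)].add(r.umi)
    return {key: len(umis) for key, umis in seen.items()}


# Toy input: four reads from two cells (made-up sequences).
reads = [
    BusRecord("AAAC", "GGT", 0),
    BusRecord("AAAC", "GGT", 0),
    BusRecord("AAAC", "TTA", 0),
    BusRecord("CCGT", "GGT", 3),
]
collapsed = collapse(reads)
counts = umis_per_cell_set(collapsed)
```

Keeping the records sorted and collapsed is one plausible reason such a representation lends itself to modular downstream tools: each step can stream over the triples without knowing which sequencing technology produced them.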
biorxiv bioinformatics 100-200-users 2018
Plant Extracellular Vesicles Contain Diverse Small RNA Species and Are Enriched in 10 to 17 Nucleotide “Tiny” RNAs, bioRxiv, 2018-11-17
Abstract: Small RNAs (sRNAs) that are 21 to 24 nucleotides (nt) in length are found in most eukaryotic organisms and regulate numerous biological functions, including transposon silencing, development, reproduction, and stress responses, typically via control of the stability and/or translation of target mRNAs. Major classes of sRNAs in plants include microRNAs (miRNAs) and small interfering RNAs (siRNAs); sRNAs are known to travel as a silencing signal from cell to cell, root to shoot, and even between host and pathogen. In mammals, sRNAs are transported inside extracellular vesicles (EVs), which are mobile lipid compartments that participate in intercellular communication. In addition to sRNAs, EVs carry proteins, lipids, metabolites, and potentially other types of nucleic acids. Here we report that plant EVs also contain diverse species of sRNA. We found that specific miRNAs and siRNAs are preferentially loaded into plant EVs. We also report a previously overlooked class of “tiny RNAs” (10 to 17 nt) that are highly enriched in EVs. This new RNA category of unknown function has a broad and very diverse genome origin and might correspond to degradation products.
biorxiv plant-biology 100-200-users 2018
Variability of bacterial behavior in the mammalian gut captured using a growth-linked single-cell synthetic gene oscillator, bioRxiv, 2018-11-17
Abstract: The dynamics of the bacterial population that comprises the gut microbiota play key roles in overall mammalian health. However, a detailed understanding of bacterial growth within the gut is limited by the inherent complexity and inaccessibility of the gut environment. Here, we deploy an improved synthetic genetic oscillator to investigate dynamics of bacterial colonization and growth in the mammalian gut under both healthy and disease conditions. The synthetic oscillator, when introduced into both Escherichia coli and Salmonella Typhimurium, maintains regular oscillations with a constant period in generations across growth conditions. We determine the phase of oscillation from individual bacteria using image analysis of resultant colonies and thereby infer the number of cell divisions elapsed. In doing so, we demonstrate robust functionality and controllability of the oscillator circuit’s activity during bacterial growth in vitro, in a simulated murine gut microfluidic environment, and in vivo within the mouse gut. We determine different dynamics of bacterial colonization and growth in the gut under normal and inflammatory conditions. Our results show that a precise genetic oscillator can function in a complex environment and reveal single cell behavior under diverse conditions where disease may create otherwise impossible-to-quantify variability in growth across the population.
biorxiv synthetic-biology 0-100-users 2018
Epigenetically reprogrammed methylation landscape drives the DNA self-assembly and serves as a universal cancer biomarker, Nature Communications, 2018-11-15
Epigenetic reprogramming in cancer genomes creates a distinct methylation landscape encompassing clustered methylation at regulatory regions separated by large intergenic tracks of hypomethylated regions. This methylation landscape, which we refer to as Methylscape, is displayed by most cancer types and thus may serve as a universal cancer biomarker. To date, most research has focused on the biological consequences of DNA Methylscape changes, whereas its impact on DNA physicochemical properties remains unexplored. Herein, we examine the effect of levels and genomic distribution of methylcytosines on the physicochemical properties of DNA to detect the Methylscape biomarker. We find that DNA polymeric behaviour is strongly affected by differential patterning of methylcytosine, leading to fundamental differences in DNA solvation and DNA-gold affinity between cancerous and normal genomes. We exploit these Methylscape differences to develop simple, highly sensitive and selective electrochemical or colorimetric one-step assays for the detection of cancer. These assays are quick, i.e., analysis time ≤10 minutes, and require minimal sample preparation and small DNA input.
nature communications genetics 200-500-users 2018
Scaling computational genomics to millions of individuals with GPUs, bioRxiv, 2018-11-14
Current genomics methods were designed to handle tens to thousands of samples, but will soon need to scale to millions to keep up with the pace of data and hypothesis generation in biomedical science. Moreover, costs associated with processing these growing datasets will become prohibitive without improving the computational efficiency and scalability of methods. Here, we show that recently developed machine-learning libraries (TensorFlow and PyTorch) facilitate implementation of genomics methods for GPUs and significantly accelerate computations. To demonstrate this, we re-implemented methods for two commonly performed computational genomics tasks: QTL mapping and Bayesian non-negative matrix factorization. Our implementations ran >200 times faster than current CPU-based versions, and these analyses are ~5-10 fold cheaper on GPUs due to the vastly shorter runtimes. We anticipate that the accessibility of these libraries and the improvements in run-time will lead to a transition to GPU-based implementations for a wide range of computational genomics methods.
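As a rough illustration of why a task like QTL mapping suits GPU tensor libraries, the sketch below computes a Pearson correlation for every variant-phenotype pair as a product of standardized matrices. It is not the authors' implementation. It assumes that correlation is the association statistic, uses plain Python so that it runs without extra packages, and the function names and toy data are hypothetical. In a tensor library the nested loop would become a single matrix multiplication.

```python
import math


def standardize(row):
    """Center a row and scale it to unit length, so dot products become correlations."""
    mean = sum(row) / len(row)
    centered = [x - mean for x in row]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("constant row has no defined correlation")
    return [c / norm for c in centered]


def correlation_matrix(genotypes, phenotypes):
    """Return r[i][j], the Pearson correlation of variant i with phenotype j.

    Both inputs are lists of rows with one value per sample. After
    standardization the result equals the matrix product G * P^T.
    """
    g = [standardize(row) for row in genotypes]
    p = [standardize(row) for row in phenotypes]
    return [[sum(a * b for a, b in zip(g_row, p_row)) for p_row in p] for g_row in g]


# Toy data (made up): 2 variants and 2 phenotypes measured in 6 samples.
genotypes = [
    [0, 1, 2, 1, 0, 2],
    [2, 2, 1, 0, 0, 1],
]
phenotypes = [
    [1.0, 2.0, 3.0, 2.0, 1.0, 3.0],  # equals variant 0 dosage + 1
    [5.0, 4.0, 4.5, 5.5, 4.0, 5.0],
]
r = correlation_matrix(genotypes, phenotypes)
```

Because every pair is scored by the same dense arithmetic, the work grows as variants × phenotypes × samples with no branching, which is the kind of workload that GPU-backed libraries are designed to parallelize.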
biorxiv genomics 200-500-users 2018