Common methods for fecal sample storage in field studies yield consistent signatures of individual identity in microbiome sequencing data, bioRxiv, 2016-02-05
Field studies of wild vertebrates are frequently associated with extensive collections of banked fecal samples, which are often collected from known individuals and sometimes also sampled longitudinally. Such collections represent unique resources for understanding ecological, behavioral, and phylogenetic effects on the gut microbiome, especially for species of particular conservation concern. However, we do not understand whether sample storage methods confound the ability to investigate interindividual variation in gut microbiome profiles. This uncertainty arises in part because comparisons across storage methods to date generally include only a few (≤5) individuals, or analyze pooled samples. Here, we used n=52 samples from 13 rhesus macaque individuals to compare immediate freezing, the gold standard of preservation, to three methods commonly used in vertebrate field studies: storage in ethanol, lyophilization following ethanol storage, and storage in RNAlater. We found that the signature of individual identity consistently outweighed storage effects: alpha diversity and beta diversity measures were significantly correlated across methods, and while samples often clustered by donor, they never clustered by storage method. Provided that all analyzed samples are stored the same way, banked fecal samples therefore appear highly suitable for investigating variation in gut microbiota. Our results open the door to a much-expanded perspective on variation in the gut microbiome across species and ecological contexts.
biorxiv genomics 0-100-users 2016
INC-Seq: Accurate single molecule reads using nanopore sequencing, bioRxiv, 2016-01-28
Nanopore sequencing provides a rapid, cheap and portable real-time sequencing platform with the potential to revolutionize genomics. Several applications, including RNA-seq, haplotype sequencing and 16S sequencing, are however limited by its relatively high single read error rate (>10%). We present INC-Seq (Intramolecular-ligated Nanopore Consensus Sequencing) as a strategy for obtaining long and accurate nanopore reads starting with low input DNA. Applying INC-Seq for 16S rRNA based bacterial profiling generated full-length amplicon sequences with median accuracy >97%. INC-Seq reads enable accurate species-level classification, identification of species at 0.1% abundance and robust quantification of relative abundances, providing a cheap and effective approach for pathogen detection and microbiome profiling on the MinION system.
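The abstract does not spell out how the consensus is built; the method's name suggests that several noisy reads of the same molecule are combined so that random errors cancel out. A minimal sketch of that general idea, assuming the copies have already been aligned to equal length (a simplification; the function name `majority_consensus` and the toy sequences are hypothetical and not taken from INC-Seq itself):

```python
from collections import Counter


def majority_consensus(copies):
    """Return a per-position majority-vote consensus of pre-aligned copies.

    Assumption: every copy has the same length and is already aligned,
    which a real pipeline would have to establish with an aligner first.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must have the same length")

    consensus = []
    # zip(*copies) yields one tuple of bases per alignment column.
    for column in zip(*copies):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Three noisy copies of one toy molecule, each with a single error
# at a different position.
copies = ["ACGTAC", "ACGAAC", "TCGTAC"]
result = majority_consensus(copies)
```

Because each error occurs in only one of the three copies, the vote at every column recovers the shared base. With independent errors, accuracy would be expected to improve as more copies are available.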
biorxiv genomics 0-100-users 2016
Fast and accurate single-cell RNA-Seq analysis by clustering of transcript-compatibility counts, bioRxiv, 2016-01-20
Current approaches to single-cell transcriptomic analysis are computationally intensive and require assay-specific modeling, which limits their scope and generality. We propose a novel method that departs from standard analysis pipelines, comparing and clustering cells based not on their transcript or gene quantifications but on their transcript-compatibility read counts. In a re-analysis of two landmark yet disparate single-cell RNA-Seq datasets, we show that our method is up to two orders of magnitude faster than previous approaches, provides accurate and in some cases improved results, and is directly applicable to data from a wide variety of assays.
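To make the idea of comparing cells by their transcript-compatibility counts concrete, the sketch below treats each cell as a vector of read counts over a fixed list of compatibility classes, normalizes it to a distribution, and measures how different two cells are. The use of Jensen-Shannon divergence, the function names, and the toy counts are assumptions for illustration; the abstract does not state which distance or clustering algorithm is used.

```python
import math


def normalize(counts):
    """Convert raw counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("a cell must have at least one read")
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    Ranges from 0 (identical) to 1 (no shared support).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


# Hypothetical counts for three cells over four compatibility classes.
cell_a = normalize([10, 0, 5, 5])
cell_b = normalize([20, 0, 10, 10])  # same profile, sequenced deeper
cell_c = normalize([0, 8, 0, 0])     # reads fall in a different class

d_ab = js_divergence(cell_a, cell_b)
d_ac = js_divergence(cell_a, cell_c)
```

Here `cell_a` and `cell_b` differ only in depth, so their divergence is zero, while `cell_c` shares no classes with `cell_a` and reaches the maximum. A pairwise matrix of such values could then be passed to any standard clustering routine.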
biorxiv genomics 0-100-users 2016
Comparative Analysis of Single-Cell RNA Sequencing Methods, bioRxiv, 2016-01-14
Background: Single-cell RNA sequencing (scRNA-seq) offers exciting possibilities to address biological and medical questions, but a systematic comparison of recently developed protocols is still lacking. Results: We generated data from 447 mouse embryonic stem cells using Drop-seq, SCRB-seq, Smart-seq (on Fluidigm C1) and Smart-seq2 and analyzed existing data from 35 mouse embryonic stem cells prepared with CEL-seq. We find that Smart-seq2 is the most sensitive method, as it detects the most genes per cell and across cells. However, it shows more amplification noise than CEL-seq, Drop-seq and SCRB-seq, as it cannot use unique molecular identifiers (UMIs). We use simulations to model how the observed combinations of sensitivity and amplification noise affect detection of differentially expressed genes and find that SCRB-seq reaches 80% power with the fewest cells. When considering cost-efficiency at different sequencing depths at 80% power, we find that Drop-seq is preferable when quantifying transcriptomes of large numbers of cells with low sequencing depth, SCRB-seq is preferable when quantifying transcriptomes of fewer cells, and Smart-seq2 is preferable when annotating and/or quantifying transcriptomes of fewer cells, as long as one can use in-house produced transposase. Conclusions: Our analyses allow an informed choice among five prominent scRNA-seq protocols and provide a solid framework for benchmarking future improvements in scRNA-seq methodologies.
biorxiv genomics 0-100-users 2016
Early farmers from across Europe directly descended from Neolithic Aegeans, bioRxiv, 2015-11-26
Farming and sedentism first appear in southwest Asia during the early Holocene and later spread to neighboring regions, including Europe, along multiple dispersal routes. Conspicuous uncertainties remain about the relative roles of migration, cultural diffusion and admixture with local foragers in the early Neolithisation of Europe. Here we present paleogenomic data for five Neolithic individuals from northwestern Turkey and northern Greece, spanning the time and region of the earliest spread of farming into Europe. We observe striking genetic similarity both among Aegean early farmers and with those from across Europe. Our study demonstrates a direct genetic link between Mediterranean and Central European early farmers and those of Greece and Anatolia, extending the European Neolithic migratory chain all the way back to southwestern Asia.
biorxiv genomics 0-100-users 2015
FecalSeq: methylation-based enrichment for noninvasive population genomics from feces, bioRxiv, 2015-11-26
Obtaining high-quality samples from wild animals is a major obstacle for genomic studies of many taxa, particularly at the population level, as collection methods for such samples are typically invasive. DNA from feces is easy to obtain noninvasively, but is dominated by bacterial and other non-host DNA. Because next-generation sequencing technology sequences DNA largely indiscriminately, the high proportion of exogenous DNA drastically reduces the efficiency of high-throughput sequencing for host animal genomics. To address this issue, we developed an inexpensive methylation-based capture method for enriching host DNA from noninvasively obtained fecal DNA samples. Our method exploits natural differences in CpG-methylation density between vertebrate and bacterial genomes to preferentially bind and isolate host DNA from majority-bacterial fecal DNA samples. We demonstrate that the enrichment is robust, efficient, and compatible with downstream library preparation methods useful for population studies (e.g., RADseq). Compared to other enrichment strategies, our method is quick and inexpensive, adding only a negligible cost to sample preparation for research that is often severely constrained by budgetary limitations. In combination with downstream methods such as RADseq, our approach allows for cost-effective and customizable genomic-scale genotyping that was previously feasible in practice only with invasive samples. Because feces are widely available and convenient to collect, our method empowers researchers to explore genomic-scale population-level questions in organisms for which invasive sampling is challenging or undesirable.
biorxiv genomics 0-100-users 2015