Global adaptation confounds the search for local adaptation, bioRxiv, 2019-08-21

Abstract. Spatially varying selection promotes variance in allele frequencies, increasing genetic differentiation between the demes of a metapopulation. For that reason, outliers in the genome-wide distribution of summary statistics measuring genetic differentiation, such as FST, are often interpreted as evidence for alleles that contribute to local adaptation. However, in spatially structured populations, the spread of beneficial mutations with spatially uniform effects can also induce transient genetic differentiation, and numerous theoretical studies have suggested that species-wide, or global, adaptation makes a substantial contribution to molecular evolution. In this study, we ask whether such global adaptation affects the genome-wide distribution of FST and generates statistical outliers that could be mistaken for local adaptation. Using forward-in-time population genetic simulations assuming parameters for the rate and strength of beneficial mutations similar to those that have been estimated for natural populations, we show that the spread of globally beneficial mutations in parapatric populations can readily generate FST outliers, which may be misinterpreted as evidence for local adaptation. The spread of beneficial mutations causes selective sweeps at flanking sites, so the effects of global versus local adaptation may be distinguished by examining patterns of nucleotide diversity along with FST. Our study suggests that global adaptation should be considered in the interpretation of genome-scan results and the design of future studies aimed at understanding the genetic basis of local adaptation.
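To make the idea of an FST outlier scan concrete, here is a minimal sketch in Python. It is not the authors' simulation or analysis pipeline. It assumes one biallelic locus per value, two equally sized demes, and the textbook definition FST = (HT - HS) / HT. The function names and the 1% cutoff are hypothetical choices made for illustration.

```python
def fst_two_demes(p1, p2):
    """Wright's FST at one biallelic locus, given the allele frequency
    in each of two demes (assumed here to be equally sized)."""
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity in the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within demes.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        # Locus is fixed for the same allele in both demes: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


def top_fraction_outliers(values, top_fraction=0.01):
    """Indices of the loci whose value falls in the upper tail of the
    empirical distribution (illustrative cutoff, ties are all included)."""
    n_top = max(1, round(len(values) * top_fraction))
    cutoff = sorted(values)[-n_top]
    return [i for i, v in enumerate(values) if v >= cutoff]
```

A scan of this kind only ranks loci by differentiation. As the abstract notes, an outlier by itself does not say whether local or global adaptation produced it, so nucleotide diversity around each flagged locus would need to be examined as well.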

biorxiv evolutionary-biology 0-100-users 2019

ATLAS a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data, bioRxiv, 2019-08-20

Abstract. Background: Metagenomics and metatranscriptomics studies provide valuable insight into the composition and function of microbial populations from diverse environments; however, the data processing pipelines that rely on mapping reads to gene catalogs or genome databases for cultured strains yield results that underrepresent the genes and functional potential of uncultured microbes. Recent improvements in sequence assembly methods have eased the reliance on genome databases, thereby allowing the recovery of genomes from uncultured microbes. However, configuring these tools, linking them with advanced binning and annotation tools, and maintaining provenance of the processing continues to be challenging for researchers. Results: Here we present ATLAS, a software package for customizable data processing from raw sequence reads to functional and taxonomic annotations using state-of-the-art tools to assemble, annotate, quantify, and bin metagenome and metatranscriptome data. Genome-centric resolution and abundance estimates are provided for each sample in a dataset. ATLAS is written in Python and the workflow implemented in Snakemake; it operates in a Linux environment, and is compatible with Python 3.5+ and Anaconda 3+ versions. The source code for ATLAS is freely available, distributed under a BSD-3 license. Conclusion: ATLAS provides a user-friendly, modular and customizable Snakemake workflow for metagenome and metatranscriptome data processing; it is easily installable with conda and maintained as open-source on GitHub at https://github.com/metagenome-atlas/atlas.

biorxiv bioinformatics 100-200-users 2019

A Single-Cell Transcriptome Atlas for Zebrafish Development, bioRxiv, 2019-08-19

Abstract. The ability to define cell types and how they change during organogenesis is central to our understanding of animal development and human disease. Despite the crucial nature of this knowledge, we have yet to fully characterize all distinct cell types and the gene expression differences that generate cell types during development. To address this knowledge gap, we produced an Atlas using single-cell RNA-sequencing methods to investigate gene expression from the pharyngula to early larval stages in developing zebrafish. Our single-cell transcriptome Atlas encompasses transcriptional profiles from 44,102 cells across four days of development using duplicate experiments that confirmed high reproducibility. We annotated 220 identified clusters and highlighted several strategies for interrogating changes in gene expression associated with the development of zebrafish embryos at single-cell resolution. Furthermore, we highlight the power of this analysis to assign new cell-type or developmental stage-specific expression information to many genes, including those that are currently known only by sequence and/or that lack expression information altogether. The resulting Atlas is a resource for biologists to generate hypotheses for genetic (mutant) or functional analysis, to launch an effort to define the diversity of cell types during zebrafish organogenesis, and to examine the transcriptional profiles that produce each cell type over developmental time.

biorxiv developmental-biology 100-200-users 2019

Prevalence estimate of blood doping in elite track and field at the introduction of the Athlete Biological Passport, bioRxiv, 2019-08-19

Abstract. In elite sport, the Athlete Biological Passport (ABP) was invented to tackle cheaters by closely monitoring changes in biological parameters and flagging atypical variations. The haematological module of the ABP was adopted in 2011 by the International Association of Athletics Federations (IAAF). This study estimates the prevalence of blood doping based on haematological parameters in a large cohort of track & field athletes measured at two major international events (2011 & 2013 IAAF World Championships), with a hypothesized decrease in prevalence due to the ABP introduction. A total of 3683 blood samples were collected and analysed from all participating athletes originating from 209 countries. The estimate of doping prevalence was obtained by using a Bayesian network with seven variables, as well as “doping” as a variable mimicking doping with low doses of recombinant human erythropoietin (rhEPO), to generate reference cumulative distribution functions (CDFs) for the Abnormal Blood Profile Score (ABPS) from the ABP. Our results from robust haematological parameters indicate an overall blood doping prevalence of 18% on average in endurance athletes (95% Confidence Interval (C.I.) 14-22%). A higher prevalence was observed in female athletes (22%, C.I. 16-28%) than in male athletes (15%, C.I. 9-20%). In conclusion, this study presents the first comparison of blood doping prevalence in elite athletes based on biological measurements from major international events that may help scientists and experts to use the ABP in a more efficient and deterrent way.

What are the new findings?

- This study presents the first comparison of blood doping prevalence in elite track & field athletes based on biological measurements from major international events.
- Our results from robust haematological parameters indicate an overall blood doping prevalence of 18% on average in endurance athletes.
- The confidence intervals for blood doping prevalence range from 9-28%, with wide discrepancies between certain countries.

How might it impact on clinical practice in the near future?

- The further development of the Athlete Biological Passport with a careful monitoring of biological parameters still represents the most consistent approach to thwart athletes using undetectable prohibited substances or methods.
- This study describes a method to define blood doping prevalence with the analysis of robust haematological parameters.
- Estimates of doping prevalence in subpopulations of athletes may represent a valuable tool for the antidoping authorities to perform a risk assessment in their sport.

biorxiv physiology 0-100-users 2019

Assessment of computational methods for the analysis of single-cell ATAC-seq data, bioRxiv, 2019-08-18

Abstract. Background: Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), leads to inherent data sparsity (1-10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (20-50% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level. Results: We present a benchmarking framework that was applied to 10 computational methods for scATAC-seq on 13 synthetic and real datasets from different assays, profiling cell types from diverse tissues and organisms. Methods for processing and featurizing scATAC-seq data were evaluated by their ability to discriminate cell types when combined with common unsupervised clustering approaches. We rank evaluated methods and discuss computational challenges associated with scATAC-seq analysis including inherently sparse data, determination of features, peak calling, the effects of sequencing coverage and noise, and clustering performance. Running times and memory requirements are also discussed. Conclusions: This reference summary of scATAC-seq methods offers recommendations for best practices with consideration for both the non-expert user and the methods developer. Despite variation across methods and datasets, SnapATAC, Cusanovich2018, and cisTopic outperform other methods in separating cell populations of different coverages and noise levels in both synthetic and real datasets. Notably, SnapATAC was the only method able to analyze a large dataset (> 80,000 cells).
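The sparsity figures quoted in the abstract (fraction of peaks detected per cell) can be illustrated with a small Python sketch. It assumes a dense cell-by-peak count matrix stored as a list of rows, which is a simplification chosen for illustration. It is not part of the benchmarking framework, and the function name is hypothetical.

```python
def fraction_detected_per_cell(count_matrix):
    """For each cell (row), return the fraction of peaks (columns)
    with at least one read. Assumes every row has the same,
    non-zero number of peaks."""
    return [
        sum(1 for count in row if count > 0) / len(row)
        for row in count_matrix
    ]


# Toy matrix: two cells, four peaks (made-up counts for illustration).
toy_counts = [
    [0, 1, 0, 0],
    [2, 0, 0, 3],
]
detected = fraction_detected_per_cell(toy_counts)
```

On real scATAC-seq data the matrix would normally be held in a sparse format, and the per-cell fractions would be expected to fall near the low range the abstract describes.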

biorxiv bioinformatics 200-500-users 2019

 

Created with the audiences framework by Jedidiah Carlson
