Assembly methods for nanopore-based metagenomic sequencing: a comparative study, bioRxiv, 2019-08-02
Abstract. Background: Metagenomic sequencing has led to the recovery of previously unexplored microbial genomes. However, short-read sequencing platforms often produce highly fragmented metagenomes, which complicates downstream analyses. Third-generation sequencing technologies, such as MinION, could lead to more contiguous assemblies because they generate long reads. Nevertheless, there is a lack of studies evaluating the suitability of the available assembly tools for this new type of data. Findings: We benchmarked the ability of different short-read and long-read tools to assemble two commercially available mock communities, and observed remarkable differences in the resulting assemblies depending on the software of choice. Short-read metagenomic assemblers proved unsuitable for MinION data. Among the long-read assemblers tested, Flye and Canu were the only ones that performed well on all the datasets. These tools were able to retrieve complete individual genomes directly from the metagenome, and assembled a bacterial genome in only two contigs in the best scenario. Despite the intrinsically high error rate of long-read technologies, Canu and Flye produced highly accurate assemblies (~99.4-99.8% accuracy). However, errors still had an impact on the prediction of biosynthetic gene clusters. Conclusions: MinION metagenomic sequencing data proved sufficient for assembling low-complexity microbial communities, leading to the recovery of highly complete and contiguous individual genomes. This work is the first systematic evaluation of the performance of different assembly tools on MinION data, and may help other researchers willing to use this technology to choose the most appropriate software depending on their goals. Future work is still needed to assess the performance of Oxford Nanopore MinION data on more complex microbiomes.
biorxiv bioinformatics 100-200-users 2019
Genetic tool development in marine protists: Emerging model organisms for experimental cell biology, bioRxiv, 2019-08-02
Abstract. Marine microbial eukaryotes underpin the largest food web on the planet and influence global biogeochemical cycles that maintain habitability. They are also remarkably diverse and provide insights into evolution, including the origins of complex life forms, as revealed through genome analyses. However, their genetic tractability has been limited to a few species that do not represent the broader diversity of eukaryotic life or some of the most environmentally relevant taxa. Here, we report on genetic systems developed as a community resource for experimental cell biology of aquatic protists from across the eukaryotic tree and primarily from marine habitats. We present evidence for foreign DNA delivery and expression in 14 species never before transformed, report on the advancement of genetic systems in 7 species, review an already published transformation protocol in 1 species, and discuss why the transformation of 17 additional species has not yet been achieved. For all protists studied in this community effort, we outline our methods, constructs, and genome-editing approaches in the context of published systems. The reported breakthroughs in genetic manipulation position the community to dissect cellular mechanisms from a breadth of protists, which will collectively provide insights into ancestral eukaryotic lifeforms, protein diversification and evolution of cellular pathways.
biorxiv ecology 100-200-users 2019
Negative selection on complex traits limits genetic risk prediction accuracy between populations, bioRxiv, 2019-08-02
Accurate genetic risk prediction is a key goal for medical genetics and great progress has been made toward identifying individuals with extreme risk across several traits and diseases (Collins and Varmus, 2015). However, many of these studies are done in predominantly European populations (Bustamante et al., 2011; Popejoy and Fullerton, 2016). Although GWAS effect sizes correlate across ancestries (Wojcik et al., 2019), risk scores show substantial reductions in accuracy when applied to non-European populations (Kim et al., 2018; Martin et al., 2019; Scutari et al., 2016). We use simulations to show that human demographic history and negative selection on complex traits result in population-specific genetic architectures. For traits under moderate negative selection, ~50% of the heritability can be accounted for by variants in Europe that are absent from Africa. We show that this directly leads to poor performance in risk prediction when using variants discovered in Europe to predict risk in African populations, especially in the tails of the risk distribution. To evaluate the impact of this effect in genomic data, we built a Bayesian model to stratify heritability between European-specific and shared variants and applied it to 43 traits and diseases in the UK Biobank. Across these phenotypes, we find ~50% of the heritability comes from European-specific variants, setting an upper bound on the accuracy of genetic risk prediction in non-European populations using effect sizes discovered in European populations. We conclude that genetic association studies need to include more diverse populations to enable the utility of genetic risk prediction in all populations.
biorxiv genetics 100-200-users 2019
Phylogenies of extant species are consistent with an infinite array of diversification histories, bioRxiv, 2019-08-01
Abstract. Time-calibrated molecular phylogenies of extant species (extant timetrees) are widely used for estimating the dynamics of diversification rates (1–6) and testing for associations between these rates and environmental factors (5, 7) or species traits (8). However, there has been considerable debate surrounding the reliability of these inferences in the absence of fossil data (9–13), and to date this critical question remains unresolved. Here we mathematically clarify the precise information that can be extracted from extant timetrees under the generalized birth-death model, which underlies the majority of existing estimation methods. We prove that for a given extant timetree and a candidate diversification scenario, there exists an infinite number of alternative diversification scenarios that are equally likely to have generated a given tree. These “congruent” scenarios cannot possibly be distinguished using extant timetrees alone, even in the presence of infinite data. Importantly, congruent diversification scenarios can exhibit markedly different and yet plausible diversification dynamics, suggesting that many previous studies may have over-interpreted phylogenetic evidence. We show that sets of congruent models can be uniquely described using composite variables, which contain all available information about past dynamics of diversification (14); this suggests an alternative paradigm for learning about the past from extant timetrees.
biorxiv evolutionary-biology 100-200-users 2019
Cross-species transcriptomic and epigenomic analysis reveals key regulators of injury response and neuronal regeneration in vertebrate retinas, bioRxiv, 2019-07-31
Abstract. Injury induces retinal Müller glia of cold-blooded, but not mammalian, vertebrates to generate neurons. To identify gene regulatory networks that control neuronal reprogramming in retinal glia, we comprehensively profiled injury-dependent changes in gene expression and chromatin conformation in Müller glia from zebrafish, chick and mice using bulk RNA and ATAC-Seq, as well as single-cell RNA-Seq. Cross-species integrative analysis of these data, together with functional validation of candidate genes, identified evolutionarily conserved and species-specific gene networks controlling glial quiescence, gliosis and neurogenesis. In zebrafish and chick, transition from quiescence to gliosis is a critical stage in acquisition of neurogenic competence, while in mice a dedicated network suppresses this transition and rapidly restores quiescence. Selective disruption of NFI family transcription factors, which maintain and restore quiescence, enables Müller glia to proliferate and robustly generate neurons in adult mice after retinal injury. These comprehensive resources and findings will facilitate the design of cell-based therapies aimed at restoring retinal neurons lost to degenerative disease.
biorxiv neuroscience 100-200-users 2019
Template plasmid integration in germline genome-edited cattle, bioRxiv, 2019-07-29
Abstract. We analyzed publicly available whole genome sequencing data from cattle which were germline genome-edited to introduce polledness. Our analysis discovered the unintended heterozygous integration of the plasmid and a second copy of the repair template sequence, at the target site. Our finding underscores the importance of employing screening methods suited to reliably detect the unintended integration of plasmids and multiple template copies.
biorxiv genomics 100-200-users 2019