Consistent metagenome-derived metrics verify and define bacterial species boundaries, bioRxiv, 2019-05-25
Abstract: Longstanding questions relate to the existence of naturally distinct bacterial species and genetic approaches to distinguish them. Bacterial genomes in public databases form distinct groups, but these databases are subject to isolation and deposition biases. We compared 5,203 bacterial genomes from 1,457 environmental metagenomic samples to test for distinct clouds of diversity, and evaluated metrics that could be used to define the species boundary. Bacterial genomes from the human gut, soil, and the ocean all exhibited gaps in whole-genome average nucleotide identities (ANI) near the previously suggested species threshold of 95% ANI. While genome-wide ratios of non-synonymous and synonymous nucleotide differences (dN/dS) decrease until ANI values approach ∼98%, estimates for homologous recombination approached zero at ∼95% ANI, supporting breakdown of recombination due to sequence divergence as a species-forming force. We evaluated 107 genome-based metrics for their ability to distinguish species when full genomes are not recovered. Full length 16S rRNA genes were least useful because they were under-recovered from metagenomes, but many ribosomal proteins displayed both high metagenomic recoverability and species-discrimination power. Taken together, our results verify the existence of sequence-discrete microbial species in metagenome-derived genomes and highlight the usefulness of ribosomal genes for gene-level species discrimination.
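To make the 95% ANI boundary concrete, the following is a minimal sketch of how genomes might be grouped into species-level clusters once pairwise ANI values are available. It is not the authors' pipeline: the function name `cluster_by_ani`, the single-linkage grouping rule, and the ANI values in the usage example are all assumptions chosen for illustration.

```python
def cluster_by_ani(genomes, ani_pairs, threshold=95.0):
    """Group genomes linked by pairwise ANI >= threshold (single linkage).

    genomes: list of genome identifiers (hypothetical names).
    ani_pairs: dict mapping (genome_a, genome_b) -> ANI in percent.
    """
    # Union-find: every genome starts as its own cluster root.
    parent = {g: g for g in genomes}

    def find(g):
        # Walk to the root, shortening the path along the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    # Merge the clusters of any pair at or above the threshold.
    for (a, b), ani in ani_pairs.items():
        if ani >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Made-up ANI values, for illustration only.
genomes = ["g1", "g2", "g3", "g4"]
ani_pairs = {
    ("g1", "g2"): 98.7,
    ("g2", "g3"): 96.1,
    ("g1", "g4"): 83.0,
    ("g3", "g4"): 82.5,
}
species = cluster_by_ani(genomes, ani_pairs)
```

With these made-up values, `g1`, `g2`, and `g3` fall into one cluster and `g4` stands alone. Single linkage is only one possible rule; a gap in the ANI distribution near 95%, as the abstract reports, is what would make the resulting clusters insensitive to the exact threshold chosen.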
biorxiv microbiology 100-200-users 2019
A single bacterial genus maintains root development in a complex microbiome, bioRxiv, 2019-05-24
Abstract: Plants grow within a complex web of species interacting with each other and with the plant. Many of these interactions are governed by a wide repertoire of chemical signals, and the resulting chemical landscape of the rhizosphere can strongly affect root health and development. To understand how microbe-microbe interactions influence root development in Arabidopsis, we established a model system for plant-microbe-microbe-environment interactions. We inoculated seedlings with a 185-member bacterial synthetic community (SynCom), manipulated the abiotic environment, and measured bacterial colonization of the plant. This enabled classification of the SynCom into four modules of co-occurring strains. We deconstructed the SynCom based on these modules, identifying microbe-microbe interactions that determine root phenotypes. These interactions primarily involve a single bacterial genus, Variovorax, which completely reverts severe root growth inhibition (RGI) induced by a wide diversity of bacterial strains as well as by the entire 185-member community. We demonstrate that Variovorax manipulate plant hormone levels to balance this ecologically realistic root community’s effects on root development. We identify a novel auxin degradation operon in the Variovorax genome that is necessary and sufficient for RGI reversion. Therefore, metabolic signal interference shapes bacteria-plant communication networks and is essential for maintaining the root’s developmental program. Optimizing the feedbacks that shape chemical interaction networks in the rhizosphere provides a promising new ecological strategy towards the development of more resilient and productive crops.
biorxiv microbiology 100-200-users 2019
Can education be personalised using pupils’ genetic data?, bioRxiv, 2019-05-24
Abstract: The predictive power of polygenic scores for some traits now rivals that of more classical phenotypic measures, and as such they have been promoted as a potential tool for genetically informed policy. However, how predictive polygenic scores are conditional on other easily available phenotypic data is not well understood. Using data from a UK cohort study, the Avon Longitudinal Study of Parents and Children, we investigated how well polygenic scores for education predict individuals’ realised attainment over and above phenotypic data available to schools. Across our sample, children’s polygenic scores predicted their educational outcomes almost as well as parents’ socioeconomic position or education. There was high overlap between the polygenic score and attainment distributions, leading to weak predictive accuracy at the individual level. Furthermore, conditional on prior attainment, the polygenic score was not predictive of later attainment. Our results suggest that polygenic scores are informative for identifying group level differences, but they currently have limited use in predicting individual attainment.
biorxiv genetics 100-200-users 2019
Clonal replacement of tumor-specific T cells following PD-1 blockade, bioRxiv, 2019-05-24
Abstract: Immunotherapies that block inhibitory checkpoint receptors on T cells have transformed the clinical care of cancer patients. However, which tumor-specific T cells are mobilized following checkpoint blockade remains unclear. Here, we performed paired single-cell RNA- and T cell receptor (TCR)-sequencing on 79,046 cells from site-matched tumors from patients with basal cell carcinoma (BCC) or squamous cell carcinoma (SCC) pre- and post-anti-PD-1 therapy. Tracking TCR clones and transcriptional phenotypes revealed a coupling of tumor-recognition, clonal expansion, and T cell dysfunction: the T cell response to treatment was accompanied by clonal expansions of CD8+CD39+ T cells, which co-expressed markers of chronic T cell activation and exhaustion. However, this expansion did not derive from pre-existing tumor infiltrating T cell clones; rather, it comprised novel clonotypes, which were not previously observed in the same tumor. Clonal replacement of T cells was preferentially observed in exhausted CD8+ T cells, compared to other distinct T cell phenotypes, and was evident in BCC and SCC patients. These results, enabled by single-cell multi-omic profiling of clinical samples, demonstrate that pre-existing tumor-specific T cells may be limited in their capacity for re-invigoration, and that the T cell response to checkpoint blockade relies on the expansion of a distinct repertoire of T cell clones that may have just recently entered the tumor.
biorxiv immunology 100-200-users 2019
Benchmarking principal component analysis for large-scale single-cell RNA-sequencing, bioRxiv, 2019-05-20
Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but for large-scale scRNA-seq datasets the computation takes a long time and consumes a large amount of memory. In this work, we review the existing fast and memory-efficient PCA algorithms and implementations and evaluate their practical application to large-scale scRNA-seq datasets. Our benchmark showed that some PCA algorithms based on Krylov subspace methods and randomized singular value decomposition are faster, more memory-efficient, and more accurate than the other algorithms. Considering the differences in the computational environments of users and developers, we also developed guidelines for selecting appropriate PCA implementations.
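As a rough illustration of the randomized singular value decomposition family mentioned above, the sketch below approximates the top principal components with NumPy. It is not one of the benchmarked implementations: the function name `randomized_pca`, its parameters, and the dense in-memory centering step are assumptions made for brevity, and a memory-efficient implementation for real scRNA-seq data would avoid densifying and centering the full matrix.

```python
import numpy as np


def randomized_pca(X, n_components, n_oversamples=10, n_iter=4, seed=0):
    """Approximate PCA of X (cells x genes) via randomized SVD.

    Returns (scores, components, singular_values) for the top components.
    """
    rng = np.random.default_rng(seed)
    # PCA is the SVD of the column-centered matrix.
    Xc = X - X.mean(axis=0)
    # Sketch a slightly larger subspace than requested for accuracy.
    k = min(n_components + n_oversamples, min(Xc.shape))
    # Project onto random directions to capture the dominant range of Xc.
    Q = np.linalg.qr(Xc @ rng.standard_normal((Xc.shape[1], k)))[0]
    # Power iterations sharpen the subspace when singular values decay slowly;
    # re-orthonormalizing each step keeps the computation numerically stable.
    for _ in range(n_iter):
        Q = np.linalg.qr(Xc.T @ Q)[0]
        Q = np.linalg.qr(Xc @ Q)[0]
    # Exact SVD of the small projected matrix (k x genes).
    B = Q.T @ Xc
    U_small, S, Vt = np.linalg.svd(B, full_matrices=False)
    scores = (Q @ U_small)[:, :n_components] * S[:n_components]
    return scores, Vt[:n_components], S[:n_components]


# Synthetic low-rank data standing in for an expression matrix.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
scores, components, singular_values = randomized_pca(X, n_components=3)
```

The expensive step is the pair of matrix products with `Xc`; the SVD itself runs on a matrix with only `k` rows, which is where the time and memory savings over a full decomposition come from.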
biorxiv bioinformatics 100-200-users 2019
DeepFly3D: A deep learning-based approach for 3D limb and appendage tracking in tethered, adult Drosophila, bioRxiv, 2019-05-20
Abstract: Studying how neural circuits orchestrate limbed behaviors requires the precise measurement of the positions of each appendage in 3-dimensional (3D) space. Deep neural networks can estimate 2-dimensional (2D) pose in freely behaving and tethered animals. However, the unique challenges associated with transforming these 2D measurements into reliable and precise 3D poses have not been addressed for small animals including the fly, Drosophila melanogaster. Here we present DeepFly3D, software that infers the 3D pose of tethered, adult Drosophila (or other animals) using multiple camera images. DeepFly3D does not require manual calibration, uses pictorial structures to automatically detect and correct pose estimation errors, and uses active learning to iteratively improve performance. We demonstrate more accurate unsupervised behavioral embedding using 3D joint angles rather than commonly used 2D pose data. Thus, DeepFly3D enables the automated acquisition of behavioral measurements at an unprecedented level of resolution for a variety of biological applications.
biorxiv animal-behavior-and-cognition 100-200-users 2019