Systematic comparative analysis of single cell RNA-sequencing methods, bioRxiv, 2019-05-10
Abstract
A multitude of single-cell RNA sequencing methods have been developed in recent years, with dramatic advances in scale and power, enabling major discoveries and large-scale cell mapping efforts. However, these methods have not been systematically and comprehensively benchmarked. Here, we directly compare seven methods for single-cell and/or single-nucleus profiling from three types of samples – cell lines, peripheral blood mononuclear cells and brain tissue – generating 36 libraries in six separate experiments in a single center. To analyze these datasets, we developed and applied scumi, a flexible computational pipeline that can be used for any scRNA-seq method. We evaluated the methods for both basic performance and for their ability to recover known biological information in the samples. Our study will help guide experiments with the methods in this study as well as serve as a benchmark for future studies and for computational algorithm development.
biorxiv genomics 100-200-users 2019

A megaplasmid family responsible for dissemination of multidrug resistance in Pseudomonas, bioRxiv, 2019-05-08
Abstract
Multidrug resistance (MDR) represents a global threat to health. Although plasmids can play an important role in the dissemination of MDR, they have not been commonly linked to the emergence of antimicrobial resistance in the pathogen Pseudomonas aeruginosa. We used whole genome sequencing to characterize a collection of P. aeruginosa clinical isolates from a hospital in Thailand. Using long-read sequence data we obtained complete sequences of two closely related megaplasmids (>420 kb) carrying large arrays of antibiotic resistance genes located in discrete, complex and dynamic resistance regions, and revealing evidence of extensive duplication and recombination events. A comprehensive pangenomic and phylogenomic analysis indicated that 1) these large plasmids comprise a family present in different members of the Pseudomonas genus and associated with multiple sources (geographical, clinical or environmental); 2) the megaplasmids encode diverse niche-adaptive accessory traits, including multidrug resistance; 3) the pangenome of the megaplasmid family is highly flexible and diverse, comprising a substantial core genome (average of 48% of plasmid genes), but with individual members carrying large numbers of unique genes. The history of the megaplasmid family, inferred from our analysis of the available database, suggests that members carrying multiple resistance genes date back to at least the 1970s.
Funding
This work was supported by the International Pseudomonas Genomics Consortium, funded by Cystic Fibrosis Canada [RCL]; and the Secretaría de Educación, Ciencia, Tecnología e Innovación (SECTEI), Mexico [AC].
biorxiv microbiology 0-100-users 2019

Single-cell RNA-sequencing of differentiating iPS cells reveals dynamic genetic effects on gene expression, bioRxiv, 2019-05-08
Abstract
Recent developments in stem cell biology have enabled the study of cell fate decisions in early human development that are impossible to study in vivo. However, how development varies across individuals and, in particular, the influence of common genetic variants during this process have not been characterised. Here, we exploit human iPS cell lines from 125 donors, a pooled experimental design, and single-cell RNA-sequencing to study population variation of endoderm differentiation. We identify molecular markers that are predictive of differentiation efficiency, and utilise heterogeneity in the genetic background across individuals to map hundreds of expression quantitative trait loci that influence expression dynamically during differentiation and across cellular contexts.
biorxiv genomics 100-200-users 2019

The neural basis of tadpole transport in poison frogs, bioRxiv, 2019-05-08
Abstract
Parental care has evolved repeatedly and independently across animals. While the ecological and evolutionary significance of parental behaviour is well recognized, underlying mechanisms remain poorly understood. We took advantage of behavioural diversity across closely related species of South American poison frogs (Family Dendrobatidae) to identify neural correlates of parental behaviour shared across sexes and species. We characterized differences in neural induction, gene expression in active neurons, and activity of specific neuronal types in three species with distinct parental care patterns: male uniparental, female uniparental, and biparental. We identified the medial pallium and preoptic area as core brain regions associated with parental care, independent of sex and species. Identification of neurons active during parental care confirms a role for neuropeptides associated with care in other vertebrates as well as identifying novel candidates. Our work is the first to explore neural and molecular mechanisms of parental care in amphibians and highlights the potential for mechanistic studies in closely related but behaviourally variable species to build a more complete understanding of how shared principles and species-specific diversity govern parental care and other social behaviour.
biorxiv animal-behavior-and-cognition 0-100-users 2019

A Fast and Flexible Algorithm for Solving the Lasso in Large-scale and Ultrahigh-dimensional Problems, bioRxiv, 2019-05-07
Abstract
Since its first proposal in statistics (Tibshirani, 1996), the lasso has been an effective method for simultaneous variable selection and estimation. A number of packages have been developed to solve the lasso efficiently. However, as large datasets become more prevalent, many algorithms are constrained by efficiency or memory bounds. In this paper, we propose a meta-algorithm, batch screening iterative lasso (BASIL), that can take advantage of any existing lasso solver and build a scalable lasso solution for large datasets. We also introduce snpnet, an R package that implements the proposed algorithm on top of glmnet (Friedman et al., 2010a) for large-scale single nucleotide polymorphism (SNP) datasets that are widely studied in genetics. We demonstrate results on a large genotype-phenotype dataset from the UK Biobank, where we achieve state-of-the-art heritability estimation on quantitative and qualitative traits including height, body mass index, asthma and high cholesterol.
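The abstract gives only the outline of the meta-algorithm, so the following is a minimal sketch of the general batch-screening idea rather than the authors' BASIL or the snpnet implementation. It assumes a single penalty value `lam` (not a full path), uses a plain coordinate-descent routine as a stand-in for "any existing lasso solver", and grows the working set by the variables that most strongly violate the lasso optimality (KKT) condition. All function names, the batch size, and the toy data are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(cols, y, lam, sweeps=2000, tol=1e-12):
    """Stand-in base solver (assumption): coordinate descent for
    (1/2n)*||y - X b||^2 + lam*||b||_1, with X given as a list of columns."""
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    norms = [sum(v * v for v in c) / n for c in cols]
    for _ in range(sweeps):
        biggest = 0.0
        for j, c in enumerate(cols):
            if norms[j] == 0.0:
                continue
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            step = new - beta[j]
            if step != 0.0:
                resid = [ri - ci * step for ci, ri in zip(c, resid)]
                beta[j] = new
                biggest = max(biggest, abs(step))
        if biggest < tol:
            break
    return beta


def batch_screened_lasso(cols, y, lam, batch=3):
    """Hypothetical screening loop: fit on a small working set, check the
    KKT condition on the remaining variables, and add the worst violators."""
    n, p = len(y), len(cols)
    beta = [0.0] * p
    working = []
    while True:
        resid = [y[i] - sum(cols[j][i] * beta[j] for j in working) for i in range(n)]
        grad = [abs(sum(ci * ri for ci, ri in zip(cols[j], resid))) / n for j in range(p)]
        # A variable left out of the fit is acceptable only if |gradient| <= lam.
        violators = sorted(
            (j for j in range(p) if j not in working and grad[j] > lam + 1e-9),
            key=lambda j: -grad[j],
        )
        if not violators:
            return beta, sorted(working)
        working.extend(violators[:batch])
        sub = lasso_cd([cols[j] for j in working], y, lam)
        beta = [0.0] * p
        for j, b in zip(working, sub):
            beta[j] = b


# Toy data (made up for illustration): two true signals among 12 variables.
random.seed(0)
n, p = 60, 12
cols = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
y = [2.0 * cols[0][i] - 1.5 * cols[3][i] + 0.1 * random.gauss(0.0, 1.0) for i in range(n)]

beta_screened, working = batch_screened_lasso(cols, y, lam=0.1)
beta_full = lasso_cd(cols, y, lam=0.1)
```

In this sketch the base solver only ever sees the columns in the working set, which is the property that would matter when the full matrix does not fit in memory; the KKT check is what justifies stopping, since it confirms the restricted fit also solves the full problem at that `lam`.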
biorxiv bioinformatics 0-100-users 2019

Variable prediction accuracy of polygenic scores within an ancestry group, bioRxiv, 2019-05-07
Abstract
Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group, the prediction accuracy of polygenic scores depends on characteristics such as the age or sex composition of the individuals in which the GWAS and the prediction were conducted, and on the GWAS study design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
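As a rough illustration of the quantity being compared, a polygenic score is commonly computed as a weighted sum of allele dosages, and prediction accuracy can be summarised as the squared correlation between score and phenotype within each group. The sketch below assumes that simple definition; the function names, group labels, and numbers are hypothetical and are not taken from the study or from UK Biobank.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2) using per-variant GWAS weights."""
    return sum(w * d for w, d in zip(weights, dosages))


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def accuracy_by_group(scores, phenotypes, groups):
    """Prediction accuracy (r^2) computed separately within each group label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = r_squared([scores[i] for i in idx], [phenotypes[i] for i in idx])
    return result


# Made-up toy values: the same scores predict the phenotype well in group "a"
# and poorly in group "b".
scores = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
phenotypes = [2.0, 4.0, 6.0, 8.0, 2.0, 1.0, 4.0, 3.0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
acc = accuracy_by_group(scores, phenotypes, groups)
```

Stratifying the same score by a characteristic such as age band or sex, as in `accuracy_by_group`, is one simple way to see whether accuracy differs between subsets of a single ancestry group.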
biorxiv genetics 200-500-users 2019