Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data, bioRxiv, 2019-05-20
Abstract: We present a comprehensive evaluation of state-of-the-art algorithms for inferring gene regulatory networks (GRNs) from single-cell gene expression data. We develop a systematic framework called BEELINE for this purpose. We use synthetic networks with predictable cellular trajectories as well as curated Boolean models to serve as the ground truth for evaluating the accuracy of GRN inference algorithms. We develop a strategy to simulate single-cell gene expression data from these two types of networks that avoids the pitfalls of previously used methods. We selected 12 representative GRN inference algorithms. We found that the accuracy of these methods (measured in terms of AUROC and AUPRC) was moderate, by and large, although the methods were better in recovering interactions in the synthetic networks than the Boolean models. Techniques that did not require pseudotime-ordered cells were more accurate, in general. The observation that the endpoints of many false positive edges were connected by paths of length two in the Boolean models suggested that indirect effects may be predominant in the outputs of the algorithms we tested. The predicted networks were considerably inconsistent with each other, indicating that combining GRN inference algorithms using ensembles is likely to be challenging. Based on the results, we present some recommendations to users of GRN inference algorithms, including suggestions on how to create simulated gene expression datasets for testing them. BEELINE, which is available at http://github.com/murali-group/BEELINE under an open-source license, will aid in the future development of GRN inference algorithms for single-cell transcriptomic data.
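The two accuracy measures named in the abstract, and the check for false positive edges whose endpoints are joined by a path of length two, can be illustrated with a minimal sketch. This is not BEELINE's code: the toy network, the edge scores, and the function names (`auroc`, `average_precision`, `has_two_step_path`) are all hypothetical, and AUPRC is approximated here by average precision.

```python
# Toy evaluation of a ranked edge list against a ground-truth network.
# All names and numbers below are illustrative assumptions, not BEELINE's API.

def auroc(scores, labels):
    """Rank-based AUROC: chance that a true edge outscores a non-edge (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Average precision, a common estimate of AUPRC (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank
    return total / hits

def has_two_step_path(edge, true_edges, genes):
    """True if the edge's endpoints are linked by a length-two path in the true network."""
    src, dst = edge
    return any((src, mid) in true_edges and (mid, dst) in true_edges for mid in genes)

genes = ["A", "B", "C"]
true_edges = {("A", "B"), ("B", "C")}  # hypothetical ground truth
predicted = {("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.4, ("C", "A"): 0.1}

# Score every ordered gene pair (no self-loops); unpredicted pairs get 0.0.
candidates = [(a, b) for a in genes for b in genes if a != b]
scores = [predicted.get(e, 0.0) for e in candidates]
labels = [e in true_edges for e in candidates]

print("AUROC:", auroc(scores, labels))
print("Average precision:", average_precision(scores, labels))
print("A->C explained by a two-step path:", has_two_step_path(("A", "C"), true_edges, genes))
```

In this toy case the highly ranked false positive A->C is exactly the kind of indirect effect the abstract describes, since A regulates B and B regulates C in the assumed ground truth.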
biorxiv bioinformatics 0-100-users 2019
High-throughput microcircuit analysis of individual human brains through next-generation multineuron patch-clamp, bioRxiv, 2019-05-18
Abstract: Comparing neuronal microcircuits across different brain regions, species and individuals can reveal common and divergent principles of network computation. Simultaneous patch-clamp recordings from multiple neurons offer the highest temporal and subthreshold resolution to analyse local synaptic connectivity. However, establishing such recordings is technically complex, and experimental performance is limited by high failure rates, long experimental times and small sample sizes. We introduce an in vitro multipatch setup with an automated pipette pressure and cleaning system facilitating recordings of up to 10 neurons simultaneously and sequential patching of additional neurons. We present hardware and software solutions that increase the usability, speed and data throughput of multipatch experiments, which allowed probing of 150 synaptic connections between 17 neurons in one human cortical slice and screening of over 600 connections in tissue from a single patient. This method will facilitate the systematic analysis of microcircuits and allow unprecedented comparisons at the level of individuals.
biorxiv neuroscience 0-100-users 2019
Assortative mating in hybrid zones is remarkably ineffective in promoting speciation, bioRxiv, 2019-05-17
Abstract: Assortative mating and other forms of partial prezygotic isolation are often viewed as being more important than partial postzygotic isolation (low fitness of hybrids) early in the process of speciation. Here we simulate secondary contact between two populations (‘species’) to examine effects of pre- and postzygotic isolation in preventing blending. A small reduction in hybrid fitness (e.g., 10%) produces a narrower hybrid zone than a strong but imperfect mating preference (e.g., 10x stronger preference for conspecific over heterospecific mates). This is because, in the latter case, rare F1 hybrids find each other attractive (due to assortative mating), leading to the gradual buildup of a full continuum of intermediates between the two species. The cline is narrower than would result from purely neutral diffusion over the same number of generations, but this effect is due to the frequency-dependent mating disadvantage of individuals of rare mating types. Hybrids tend to pay this cost of rarity more than pure individuals, meaning there is an induced postzygotic isolation effect of assortative mating. When this induced mating disadvantage is removed, partial assortative mating does not prevent eventual blending of the species. These results prompt a questioning of the concept of partial prezygotic isolation, since it is not very isolating unless there is also postzygotic isolation.
biorxiv evolutionary-biology 0-100-users 2019
Bayesian multivariate reanalysis of large genetic studies identifies many new associations, bioRxiv, 2019-05-17
Abstract: Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly available GWAS datasets that involve multiple closely related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS.
Author Summary: Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that it is possible to increase the power to detect significant associations by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations. Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that there is often low-hanging fruit, in the form of new associations, to be gained by running these methods on data already collected.
biorxiv genomics 0-100-users 2019
OncoOmics approaches to reveal essential genes in breast cancer: a panoramic view from pathogenesis to precision medicine, bioRxiv, 2019-05-17
Summary: Breast cancer (BC) is a heterogeneous disease where each OncoOmics approach needs to be fully understood as a part of a complex network. Therefore, the main objective of this study was to analyze genetic alterations, signaling pathways, protein-protein interaction networks, protein expression, dependency maps and enrichment maps in 230 genes previously prioritized by the Consensus Strategy, the Pan-Cancer Atlas, the Pharmacogenomics Knowledgebase and the Cancer Genome Interpreter, in order to reveal essential genes to accelerate the development of precision medicine in BC. The OncoOmics essential genes were rationally filtered to 144, 48 (33%) of which were hallmarks of cancer and 20 (14%) of which were significant in at least three OncoOmics approaches: RAC1, AKT1, CCND1, PIK3CA, ERBB2, CDH1, MAPK14, TP53, MAPK1, SRC, RAC3, PLCG1, GRB2, MED1, TOP2A, GATA3, BCL2, CTNNB1, EGFR and CDK2. According to the Open Targets Platform, there are 111 drugs that are currently being analyzed in 3151 clinical trials in 39 genes. Lastly, there are more than 800 clinical annotations associated with 94 genes in BC pharmacogenomics.
biorxiv genomics 0-100-users 2019
Gene knock-ins in Drosophila using homology-independent insertion of universal donor plasmids, bioRxiv, 2019-05-16
Abstract: Site-specific insertion of DNA into endogenous genes (knock-in) is a powerful method to study gene function. However, traditional methods for knock-in require laborious cloning of long homology arms for homology-directed repair. Here, we report a simplified method in Drosophila melanogaster to insert large DNA elements into any gene using homology-independent repair. This method, known as CRISPaint, employs CRISPR-Cas9 and non-homologous end joining (NHEJ) to linearize and insert donor plasmid DNA into a target genomic cut site. The inclusion of commonly used elements such as GFP on donor plasmids makes them universal, abolishing the need to create gene-specific homology arms and greatly reducing user workload. Using this method, we show robust gene-specific integration of donor plasmids in cultured cells and the fly germ line. Furthermore, we use this method to analyze gene function by fluorescently tagging endogenous proteins, disrupting gene function, and generating reporters of gene expression. Finally, we assemble a collection of donor plasmids for germ line knock-in that contain commonly used insert sequences. This method simplifies the generation of site-specific large DNA insertions in Drosophila cell lines and fly strains, and better enables researchers to dissect gene function in vivo.
Summary: We report a new homology-independent genomic knock-in method in Drosophila to insert large DNA elements into any target gene. Using CRISPR-Cas9 and non-homologous end joining (NHEJ), an entire donor plasmid is inserted into the genome without the need for homology arms. This approach eliminates the burden associated with designing and constructing traditional donor plasmids. We demonstrate its usefulness in cultured cells and in vivo to fluorescently tag endogenous proteins, generate reporters of gene expression, and disrupt gene function.
biorxiv genetics 0-100-users 2019