Estimating the prevalence of missing experiments in a neuroimaging meta-analysis, bioRxiv, 2017-11-28
AbstractCoordinate-based meta-analyses (CBMA) allow researchers to combine the results from multiple fMRI experiments with the goal of obtaining results that are more likely to generalise. However, the interpretation of CBMA findings can be impaired by the file drawer problem, a type of publications bias that refers to experiments that are carried out but are not published. Using foci per contrast count data from the BrainMap database, we propose a zero-truncated modelling approach that allows us to estimate the prevalence of non-significant experiments. We validate our method with simulations and real coordinate data generated from the Human Connectome Project. Application of our method to the data from BrainMap provides evidence for the existence of a file drawer effect, with the rate of missing experiments estimated as at least 6 per 100 reported.
biorxiv neuroscience 0-100-users 2017High-throughput ANI Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries, bioRxiv, 2017-11-28
A fundamental question in microbiology is whether there is a continuum of genetic diversity among genomes or clear species boundaries prevail instead. Answering this question requires robust measurement of whole-genome relatedness among thousands of genomes and from diverge phylogenetic lineages. Whole-genome similarity metrics such as Average Nucleotide Identity (ANI) can provide the resolution needed for this task, overcoming several limitations of traditional techniques used for the same purposes. Although the number of genomes currently available may be adequate, the associated bioinformatics tools for analysis are lagging behind these developments and cannot scale to large datasets. Here, we present a new method, FastANI, to compute ANI using alignment-free approximate sequence mapping. Our analyses demonstrate that FastANI produces an accurate ANI estimate and is up to three orders of magnitude faster when compared to an alignment (e.g., BLAST)-based approach. We leverage FastANI to compute pairwise ANI values among all prokaryotic genomes available in the NCBI database. Our results reveal a clear genetic discontinuity among the database genomes, with 99.8% of the total 8 billion genome pairs analyzed showing either >95% intra-species ANI or <83% inter-species ANI values. We further show that this discontinuity is recovered with or without the most frequently represented species in the database and is robust to historic additions in the public genome databases. Therefore, 95% ANI represents an accurate threshold for demarcating almost all currently named prokaryotic species, and wide species boundaries may exist for prokaryotes.
biorxiv bioinformatics 200-500-users 2017Prioritized memory access explains planning and hippocampal replay, bioRxiv, 2017-11-28
AbstractTo make decisions, animals must evaluate outcomes of candidate choices by accessing memories of relevant experiences. Yet little is known about which experiences are considered or ignored during deliberation, which ultimately governs choice. Here, we propose a normative theory to predict which memories should be accessed at each moment to optimize future decisions. Using nonlocal “replay” of spatial locations in hippocampus as a window into memory access, we simulate a spatial navigation task where an agent accesses memories of locations sequentially, ordered by utility how much extra reward would be earned due to the computation enabling better choices. This prioritization balances two desiderata the need to evaluate imminent choices, vs. the gain from propagating newly encountered information to predecessor states. We show that this theory offers a unifying account of a range of hitherto disconnected findings in the place cell literature such as the balance of forward and reverse replay, biases in the replayed content, and effects of experience. Accordingly, various types of nonlocal events during behavior and rest are re-interpreted as instances of a single choice evaluation operation, unifying seemingly disparate proposed functions of replay including planning, learning and consolidation, and whose dysfunction may underlie pathologies like rumination and craving.
biorxiv neuroscience 0-100-users 2017A suite of transgenic driver and reporter mouse lines with enhanced brain cell type targeting and functionality, bioRxiv, 2017-11-26
SUMMARYModern genetic approaches are powerful in providing access to diverse types of neurons within the mammalian brain and greatly facilitating the study of their function. We here report a large set of driver and reporter transgenic mouse lines, including 23 new driver lines targeting a variety of cortical and subcortical cell populations and 26 new reporter lines expressing an array of molecular tools. In particular, we describe the TIGRE2.0 transgenic platform and introduce Cre-dependent reporter lines that enable optical physiology, optogenetics, and sparse labeling of genetically-defined cell populations. TIGRE2.0 reporters broke the barrier in transgene expression level of single-copy targeted-insertion transgenesis in a wide range of neuronal types, along with additional advantage of a simplified breeding strategy compared to our first-generation TIGRE lines. These novel transgenic lines greatly expand the repertoire of high-precision genetic tools available to effectively identify, monitor, and manipulate distinct cell types in the mouse brain.
biorxiv neuroscience 200-500-users 2017Common risk variants identified in autism spectrum disorder, bioRxiv, 2017-11-26
AbstractAutism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 ASD cases and 27,969 controls that identifies five genome-wide significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), seven additional loci shared with other traits are identified at equally strict significance levels. Dissecting the polygenic architecture we find both quantitative and qualitative polygenic heterogeneity across ASD subtypes, in contrast to what is typically seen in other complex disorders. These results highlight biological insights, particularly relating to neuronal function and corticogenesis and establish that GWAS performed at scale will be much more productive in the near term in ASD, just as it has been in a broad range of important psychiatric and diverse medical phenotypes.
biorxiv genetics 200-500-users 2017Deciphering eukaryotic cis-regulatory logic with 100 million random promoters, bioRxiv, 2017-11-26
AbstractDeciphering cis-regulation, the code by which transcription factors (TFs) interpret regulatory DNA sequence to control gene expression levels, is a long-standing challenge. Previous studies of native or engineered sequences have remained limited in scale. Here, we use random sequences as an alternative, allowing us to measure the expression output of over 100 million synthetic yeast promoters. Random sequences yield a broad range of reproducible expression levels, indicating that the fortuitous binding sites in random DNA are functional. From these data we learn models of transcriptional regulation that predict over 94% of the expression driven from independent test data and nearly 89% from sequences from yeast promoters. These models allow us to characterize the activity of TFs and their interactions with chromatin, and help refine cis-regulatory motifs. We find that strand, position, and helical face preferences of TFs are widespread and depend on interactions with neighboring chromatin. Such massive-throughput regulatory assays of random DNA provide the diverse examples necessary to learn complex models of cis-regulatory logic.
biorxiv genomics 200-500-users 2017