Towards an integration of deep learning and neuroscience, bioRxiv, 2016-06-14
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) these cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
biorxiv neuroscience 200-500-users 2016
Thanatotranscriptome: genes actively expressed after organismal death, bioRxiv, 2016-06-11
A continuing enigma in the study of biological systems is what happens to highly ordered structures, far from equilibrium, when their regulatory systems suddenly become disabled. In life, genetic and epigenetic networks precisely coordinate the expression of genes, but in death it is not known whether gene expression diminishes gradually or stops abruptly, or whether specific genes are involved. We investigated the unwinding of the clock by identifying upregulated genes, assessing their functions, and comparing their transcriptional profiles through postmortem time in two species, mouse and zebrafish. We found that the transcriptional abundance profiles of 1,063 genes were significantly changed after death of healthy adult animals in a time series spanning from life to 48 or 96 h postmortem. Ordination plots revealed non-random patterns in profiles by time. While most thanatotranscriptome (thanatos-, Greek for death) transcript levels increased within 0.5 h postmortem, some increased only at 24 and 48 h. Functional characterization of the most abundant transcripts revealed the following categories: stress, immunity, inflammation, apoptosis, transport, development, epigenetic regulation, and cancer. The increase in transcript abundance was presumably due to thermodynamic and kinetic controls, such as the activation of epigenetic modification genes responsible for unraveling the nucleosomes, which enabled transcription of previously silenced genes (e.g., development genes). The fact that new molecules were synthesized at 48 to 96 h postmortem suggests sufficient energy and resources to maintain self-organizing processes. A step-wise shutdown occurs in organismal death that is manifested by the apparent upregulation of genes with various abundance maxima and durations. The results are of significance to transplantology and molecular biology.
biorxiv systems-biology 200-500-users 2016
Detection of human adaptation during the past 2,000 years, bioRxiv, 2016-05-08
Detection of recent natural selection is a challenging problem in population genetics, as standard methods generally integrate over long timescales. Here we introduce the Singleton Density Score (SDS), a powerful measure to infer very recent changes in allele frequencies from contemporary genome sequences. When applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past 2,000 years. We see strong signals of selection at lactase and HLA, and in favor of blond hair and blue eyes. Turning to signals of polygenic adaptation we find, remarkably, that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we report suggestive new evidence for polygenic shifts affecting many other complex traits. Our results suggest that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
biorxiv genetics 200-500-users 2016
Prevalence, phenotype and architecture of developmental disorders caused by de novo mutation The Deciphering Developmental Disorders Study, bioRxiv, 2016-04-21
Individuals with severe, undiagnosed developmental disorders (DDs) are enriched for damaging de novo mutations (DNMs) in developmentally important genes. We exome sequenced 4,293 families with individuals with DDs, and meta-analysed these data with published data on 3,287 individuals with similar disorders. We show that the most significant factors influencing the diagnostic yield of de novo mutations are the sex of the affected individual, the relatedness of their parents and the age of both father and mother. We identified 94 genes enriched for damaging de novo mutation at genome-wide significance (P < 7 × 10^-7), including 14 genes for which compelling data for causation was previously lacking. We have characterised the phenotypic diversity among these genetic disorders. We demonstrate that, at current cost differentials, exome sequencing has much greater power than genome sequencing for novel gene discovery in genetically heterogeneous disorders. We estimate that 42% of our cohort carry pathogenic DNMs (single nucleotide variants and indels) in coding sequences, with approximately half operating by a loss-of-function mechanism, and the remainder resulting in altered function (e.g. activating, dominant negative). We established that most haploinsufficient developmental disorders have already been identified, but that many altered-function disorders remain to be discovered. Extrapolating from the DDD cohort to the general population, we estimate that developmental disorders caused by DNMs have an average birth prevalence of 1 in 213 to 1 in 448 (0.22-0.47% of live births), depending on parental age.
Abbreviations: PTV, Protein-Truncating Variant; DNM, De Novo Mutation; DD, Developmental Disorder; DDD, Deciphering Developmental Disorders study.
biorxiv genetics 200-500-users 2016
The Prevalence of Inappropriate Image Duplication in Biomedical Research Publications, bioRxiv, 2016-04-21
Inaccurate data in scientific papers can result from honest error or intentional falsification. This study attempted to determine the percentage of published papers containing inappropriate image duplication, a specific type of inaccurate data. The images from a total of 20,621 papers in 40 scientific journals from 1995-2014 were visually screened. Overall, 3.8% of published papers contained problematic figures, with at least half exhibiting features suggestive of deliberate manipulation. The prevalence of papers with problematic images rose markedly during the past decade. Additional papers written by authors of papers with problematic images had an increased likelihood of containing problematic images as well. As this analysis focused only on one type of data, it is likely that the actual prevalence of inaccurate data in the published literature is higher. The marked variation in the frequency of problematic images among journals suggests that journal practices, such as pre-publication image screening, influence the quality of the scientific literature.
biorxiv scientific-communication-and-education 200-500-users 2016
Impact of knowledge accumulation on pathway enrichment analysis, bioRxiv, 2016-04-20
Pathway-based interpretation of gene lists is a staple of genome analysis. It depends on frequently updated gene annotation databases. We analyzed the evolution of gene annotations over the past seven years and found that the vocabulary of pathways and processes has doubled. This strongly impacts practical analysis of genes: 80% of publications we surveyed in 2015 used outdated software that only captured 20% of pathway enrichments apparent in current annotations.
biorxiv bioinformatics 200-500-users 2016
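For readers unfamiliar with the pathway enrichment analysis discussed in the last abstract, the sketch below shows one common way to score a single pathway: a one-sided hypergeometric test on the overlap between a gene list and a pathway's annotated genes. This is an illustration under stated assumptions, not the method or software evaluated in the preprint; the function name `enrichment_p_value` and all numbers in the usage example are hypothetical.

```python
from math import comb


def enrichment_p_value(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value for pathway over-representation.

    universe: total number of annotated genes considered
    pathway:  number of those genes annotated to the pathway
    selected: size of the gene list being interpreted
    overlap:  number of listed genes that fall in the pathway

    Returns P(X >= overlap), where X is the overlap expected when
    `selected` genes are drawn at random from the universe.
    """
    largest_possible = min(pathway, selected)
    total = comb(universe, selected)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, largest_possible + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 annotated genes, a 100-gene pathway,
    # a 200-gene list, 10 of which are in the pathway.
    print(enrichment_p_value(20000, 100, 200, 10))
```

Both `universe` and `pathway` come from the annotation database in use, so the same gene list can yield different p-values, or miss a pathway entirely, when the annotations are out of date.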