A neural algorithm for a fundamental computing problem, bioRxiv, 2017-08-26
Similarity search, such as identifying similar images in a database or similar documents on the Web, is a fundamental computing problem faced by many large-scale information retrieval systems. We discovered that the fly’s olfactory circuit solves this problem using a novel variant of a traditional computer science algorithm (called locality-sensitive hashing). The fly’s circuit assigns similar neural activity patterns to similar input stimuli (odors), so that behaviors learned from one odor can be applied when a similar odor is experienced. The fly’s algorithm, however, uses three new computational ingredients that depart from traditional approaches. We show that these ingredients can be translated to improve the performance of similarity search compared to traditional algorithms when evaluated on several benchmark datasets. Overall, this perspective helps illuminate the logic supporting an important sensory function (olfaction), and it provides a conceptually new algorithm for solving a fundamental computational problem.
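For readers unfamiliar with the traditional algorithm the abstract names, the following is a minimal sketch of classical locality-sensitive hashing using random hyperplanes. It is not the fly's variant, and the abstract does not spell out the three new ingredients, so none of them are modeled here. The function names, the vector dimension, the number of bits, and the noise level are all illustrative assumptions.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions (assumed setup)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its positive side, else 0."""
    return [
        1 if sum(w * x for w, x in zip(plane, vector)) > 0 else 0
        for plane in hyperplanes
    ]

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical inputs: a base vector, a slightly perturbed copy, and an unrelated vector.
rng = random.Random(1)
dim, n_bits = 50, 128
base = [rng.gauss(0.0, 1.0) for _ in range(dim)]
near = [x + 0.05 * rng.gauss(0.0, 1.0) for x in base]
far = [rng.gauss(0.0, 1.0) for _ in range(dim)]

planes = make_hyperplanes(dim, n_bits)
d_near = hamming(lsh_code(base, planes), lsh_code(near, planes))
d_far = hamming(lsh_code(base, planes), lsh_code(far, planes))
print("Hamming distance to similar input:", d_near)
print("Hamming distance to unrelated input:", d_far)
```

Similar inputs fall on the same side of most hyperplanes, so their codes differ in few bits. This is the property the abstract describes as assigning similar activity patterns to similar stimuli.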
biorxiv neuroscience 200-500-users 2017
The 10,000 Immunomes Project: A resource for human immunology, bioRxiv, 2017-08-26
Abstract: New immunological assays now enable rich measurements of human immune function, but difficulty attaining enough measurements across sufficiently large and diverse cohorts has hindered describing normal human immune physiology on a large scale. Here we present the 10,000 Immunomes Project (10KIP), a diverse human immunology reference derived from over 44,000 individuals across 242 studies from ImmPort, a publicly available resource of raw immunology study data and protocols. We carefully curated datasets, aggregating subjects from healthy control arms and harmonizing data across studies. We demonstrate 10KIP’s utility by describing variations in serum cytokines and leukocytes by age, race, and sex; defining a baseline cell-cytokine network; and using 10KIP as a common control to describe immunologic changes in pregnancy. Subject-level data is available for interactive visualization and download at http://10kImmunomes.org. We believe 10KIP can serve as a common control cohort and will accelerate hypothesis generation by clinical and basic immunologists across diverse populations.
One Sentence Summary: An open online resource of human immunology data from more than 10,000 normal subjects including interactive data visualization and download enables a new look at immune system differences across age and sex, rapid hypothesis generation, and creation of custom control cohorts.
biorxiv immunology 200-500-users 2017
A Large-Scale Binding and Functional Map of Human RNA Binding Proteins, bioRxiv, 2017-08-24
Genomes encompass all the information necessary to specify the development and function of an organism. In addition to genes, genomes also contain a myriad of functional elements that control various steps in gene expression. A major class of these elements function only when transcribed into RNA as they serve as the binding sites for RNA binding proteins (RBPs), which act to control post-transcriptional processes including splicing, cleavage and polyadenylation, RNA editing, RNA localization, stability, and translation. Despite the importance of these functional RNA elements encoded in the genome, they have been much less studied than genes and DNA elements. Here, we describe the mapping and characterization of RNA elements recognized by a large collection of human RBPs in K562 and HepG2 cells. These data expand the catalog of functional elements encoded in the human genome by addition of a large set of elements that function at the RNA level through interaction with RBPs.
biorxiv genomics 200-500-users 2017
Genome-wide association studies of brain structure and function in the UK Biobank, bioRxiv, 2017-08-22
Summary: The genetic basis of brain structure and function is largely unknown. We carried out genome-wide association studies of 3,144 distinct functional and structural brain imaging derived phenotypes in UK Biobank (discovery dataset: 8,428 subjects). We show that many of these phenotypes are heritable. We identify 148 clusters of SNP-imaging associations with lead SNPs that replicate at p<0.05, when we would expect 21 to replicate by chance. Notable significant and interpretable associations include iron transport and storage genes, related to changes in T2* in subcortical regions; extracellular matrix and the epidermal growth factor genes, associated with white matter micro-structure and lesion volume; genes regulating mid-line axon guidance development associated with pontine crossing tract organisation; and overall 17 genes involved in development, pathway signalling and plasticity. Our results provide new insight into the genetic architecture of the brain with relevance to complex neurological and psychiatric disorders, as well as brain development and aging. The full set of results is available on the interactive Oxford Brain Imaging Genetics (BIG) web browser.
biorxiv genetics 200-500-users 2017
Bioinformatics Core Competencies for Undergraduate Life Sciences Education, bioRxiv, 2017-08-04
Abstract: Bioinformatics is becoming increasingly central to research in the life sciences. However, despite its importance, bioinformatics skills and knowledge are not well integrated in undergraduate biology education. This curricular gap prevents biology students from harnessing the full potential of their education, limiting their career opportunities and slowing genomic research innovation. To advance the integration of bioinformatics into life sciences education, a framework of core bioinformatics competencies is needed. To that end, we here report the results of a survey of life sciences faculty in the United States about teaching bioinformatics to undergraduate life scientists. Responses were received from 1,260 faculty representing institutions in all fifty states with a combined capacity to educate hundreds of thousands of students every year. Results indicate strong, widespread agreement that bioinformatics knowledge and skills are critical for undergraduate life scientists, as well as considerable agreement about which skills are necessary. Perceptions of the importance of some skills varied with the respondent’s degree of training, time since degree earned, and/or the Carnegie classification of the respondent’s institution. To assess which skills are currently being taught, we analyzed syllabi of courses with bioinformatics content submitted by survey respondents. Finally, we used the survey results, the analysis of syllabi, and our collective research and teaching expertise to develop a set of bioinformatics core competencies for undergraduate life sciences students. These core competencies are intended to serve as a guide for institutions as they work to integrate bioinformatics into their life sciences curricula.
Significance Statement: Bioinformatics, an interdisciplinary field that uses techniques from computer science and mathematics to store, manage, and analyze biological data, is becoming increasingly central to modern biology research. Given the widespread use of bioinformatics and its impacts on societal problem-solving (e.g., in healthcare, agriculture, and natural resources management), there is a growing need for the integration of bioinformatics competencies into undergraduate life sciences education. Here, we present a set of bioinformatics core competencies for undergraduate life scientists developed using the results of a large national survey and the expertise of our working group of bioinformaticians and educators. We also present results from the survey on the importance of bioinformatics skills and the current state of integration of bioinformatics into biology education.
biorxiv bioinformatics 200-500-users 2017
Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depressive disorder, bioRxiv, 2017-07-25
Major depressive disorder (MDD) is a notably complex illness with a lifetime prevalence of 14% [1]. It is often chronic or recurrent and is thus accompanied by considerable morbidity, excess mortality, substantial costs, and heightened risk of suicide [2-7]. MDD is a major cause of disability worldwide [8]. We conducted a genome-wide association (GWA) meta-analysis in 130,664 MDD cases and 330,470 controls, and identified 44 independent loci that met criteria for statistical significance. We present extensive analyses of these results which provide new insights into the nature of MDD. The genetic findings were associated with clinical features of MDD, and implicated prefrontal and anterior cingulate cortex in the pathophysiology of MDD (regions exhibiting anatomical differences between MDD cases and controls). Genes that are targets of antidepressant medications were strongly enriched for MDD association signals (P = 8.5 × 10^−10), suggesting the relevance of these findings for improved pharmacotherapy of MDD. Sets of genes involved in gene splicing and in creating isoforms were also enriched for smaller MDD GWA P-values, and these gene sets have also been implicated in schizophrenia and autism. Genetic risk for MDD was correlated with that for many adult and childhood onset psychiatric disorders. Our analyses suggested important relations of genetic risk for MDD with educational attainment, body mass, and schizophrenia: the genetic basis of lower educational attainment and higher body mass were putatively causal for MDD, whereas MDD and schizophrenia reflected a partly shared biological etiology. All humans carry lesser or greater numbers of genetic risk factors for MDD, and a continuous measure of risk underlies the observed clinical phenotype. MDD is not a distinct entity that neatly demarcates normalcy from pathology but rather a useful clinical construct associated with a range of adverse outcomes and the end result of a complex process of intertwined genetic and environmental effects. These findings help refine and define the fundamental basis of MDD.
biorxiv genetics 200-500-users 2017