Analyses of Neanderthal introgression suggest that Levantine and southern Arabian populations have a shared population history, bioRxiv, 2018-10-09
Abstract
Objectives: Modern humans are thought to have interbred with Neanderthals in the Near East soon after modern humans dispersed out of Africa. This introgression event likely took place in either the Levant or southern Arabia, depending on which dispersal route out of Africa was followed. In this study, we compare Neanderthal introgression in contemporary Levantine and southern Arabian populations to investigate Neanderthal introgression and to study Near Eastern population history.
Materials and Methods: We analyzed genotyping data on >400,000 autosomal SNPs from seven Levantine and five southern Arabian populations and compared those data to populations from around the world, including Neanderthal and Denisovan genomes. We used f4 and D statistics to estimate and compare levels of Neanderthal introgression between Levantine, southern Arabian, and comparative global populations. We also identified 1,581 putative Neanderthal-introgressed SNPs within our dataset and analyzed their allele frequencies as a means to compare introgression patterns in Levantine and southern Arabian genomes.
Results: We find that Levantine and southern Arabian populations have similar levels of Neanderthal introgression to each other but lower levels than other non-Africans. Furthermore, we find that introgressed SNPs have very similar allele frequencies in the Levant and southern Arabia, which indicates that Neanderthal introgression is similarly distributed in Levantine and southern Arabian genomes.
Discussion: We infer that the ancestors of contemporary Levantine and southern Arabian populations received Neanderthal introgression prior to separating from each other and that there has been extensive gene flow between these populations.
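For readers unfamiliar with the f4 and D statistics named in the methods, the following is a minimal sketch of their frequency-based forms. It is not the authors' pipeline: the function names and the toy frequencies are hypothetical, and it assumes each list holds per-SNP derived-allele frequencies for one population, in matching SNP order.

```python
def d_statistic(p1, p2, p3, p_out):
    """Frequency-based D (ABBA-BABA) statistic for populations P1, P2, P3
    and an outgroup. Inputs are per-SNP derived-allele frequencies."""
    abba = 0.0
    baba = 0.0
    for a, b, c, o in zip(p1, p2, p3, p_out):
        abba += (1 - a) * b * c * (1 - o)   # P2 shares the derived allele with P3
        baba += a * (1 - b) * c * (1 - o)   # P1 shares the derived allele with P3
    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / total


def f4_statistic(pa, pb, pc, pd):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    if not products:
        raise ValueError("no sites")
    return sum(products) / len(products)


# Hypothetical toy frequencies, for illustration only.
african = [0.0, 0.1, 0.0]
levantine = [0.2, 0.3, 0.1]
neanderthal = [1.0, 1.0, 1.0]
outgroup = [0.0, 0.0, 0.0]
d = d_statistic(african, levantine, neanderthal, outgroup)
```

In this arrangement a positive D means P2 shares more derived alleles with P3 than P1 does. In practice, significance is usually assessed with a block jackknife over the genome, which this sketch omits.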
biorxiv genomics 0-100-users 2018
Building gene regulatory networks from scATAC-seq and scRNA-seq using Linked Self-Organizing Maps, bioRxiv, 2018-10-09
Abstract: Rapid advances in single-cell assays have outpaced methods for analysis of those data types. Different single-cell assays show extensive variation in sensitivity and signal-to-noise levels. In particular, scATAC-seq generates extremely sparse and noisy datasets. Existing methods developed to analyze these data require cells amenable to pseudo-time analysis or require datasets with drastically different cell types. We describe a novel approach using self-organizing maps (SOM) to link scATAC-seq and scRNA-seq data that overcomes these challenges and can generate draft regulatory networks. Our SOMatic package generates chromatin and gene expression SOMs separately and combines them using a linking function. We applied SOMatic on a mouse pre-B cell differentiation time-course using controlled Ikaros over-expression to recover gene ontology enrichments, identify motifs in genomic regions showing similar single-cell profiles, and generate a gene regulatory network that both recovers known interactions and predicts new Ikaros targets during the differentiation process. The ability of linked SOMs to detect emergent properties from multiple types of high-dimensional genomic data with very different signal properties opens new avenues for integrative analysis of single cells.
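As background on the self-organizing maps mentioned above, here is a minimal sketch of generic SOM training on a rectangular grid. It does not reproduce SOMatic or its linking function; the function names, grid size, decay schedule, and toy data are all assumptions chosen for illustration.

```python
import math
import random


def best_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM with a Gaussian neighbourhood.
    Learning rate and neighbourhood width decay linearly (an assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)
        sigma = max(sigma0 * (1 - frac), 0.5)
        for x in rng.sample(data, len(data)):  # visit samples in random order
            bmu = best_unit(weights, x)
            for unit, w in weights.items():
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))  # neighbourhood strength
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical toy profiles scaled to [0, 1]; real inputs would be
# per-region accessibility or per-gene expression vectors.
toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(toy, rows=2, cols=2)
```

Each trained unit holds a prototype profile, and `best_unit` assigns a new profile to its nearest prototype. Two such maps, one per assay, would then need a separate linking step of the kind the abstract describes.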
biorxiv genomics 0-100-users 2018
Genetic variability in response to Aβ deposition influences Alzheimer’s risk, bioRxiv, 2018-10-09
Abstract: Genetic analysis of late-onset Alzheimer’s disease risk has previously identified a set of largely microglial genes that form a transcriptional network. In transgenic mouse models of amyloid deposition we have previously shown that the expression of many of the mouse orthologs of these genes is co-ordinately up-regulated by amyloid deposition. Here we investigate whether systematic analysis of other members of this mouse amyloid-responsive network predicts other Alzheimer’s risk loci. This statistical comparison of the mouse amyloid-response network with Alzheimer’s disease genome-wide association studies identifies 5 other genetic risk loci for the disease (OAS1, CXCL10, LAPTM5, ITGAM and LILRB4). This work suggests that genetic variability in the microglial response to amyloid deposition is a major determinant for Alzheimer’s risk.
One Sentence Summary: Identification of 5 new risk loci for Alzheimer’s by statistical comparison of mouse Aβ microglial response with gene-based SNPs from human GWAS.
biorxiv neuroscience 0-100-users 2018
Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions, bioRxiv, 2018-10-09
Abstract: Major depression is a debilitating psychiatric illness that is typically associated with low mood, anhedonia and a range of comorbidities. Depression has a heritable component that has remained difficult to elucidate with current sample sizes due to the polygenic nature of the disorder. To maximise sample size, we meta-analysed data on 807,553 individuals (246,363 cases and 561,190 controls) from the three largest genome-wide association studies of depression. We identified 102 independent variants, 269 genes, and 15 gene-sets associated with depression, including both genes and gene-pathways associated with synaptic structure and neurotransmission. Further evidence of the importance of prefrontal brain regions in depression was provided by an enrichment analysis. In an independent replication sample of 1,306,354 individuals (414,055 cases and 892,299 controls), 87 of the 102 associated variants were significant following multiple testing correction. Based on the putative genes associated with depression this work also highlights several potential drug repositioning opportunities. These findings advance our understanding of the complex genetic architecture of depression and provide several future avenues for understanding aetiology and developing new treatment approaches.
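The abstract does not state how the per-study results were combined. One common approach for a single variant is fixed-effect, inverse-variance-weighted meta-analysis, sketched below as an assumption rather than as the authors' method. The function name and the effect sizes are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis of one variant across studies.
    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   their standard errors
    Returns (pooled beta, pooled standard error, z-score)."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]  # more precise studies weigh more
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical per-study estimates for a single variant across three studies.
pooled_beta, pooled_se, z = inverse_variance_meta(
    [0.030, 0.025, 0.035], [0.010, 0.008, 0.012]
)
```

The pooled standard error is never larger than the smallest per-study standard error, which is why combining studies raises power to detect small polygenic effects.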
biorxiv genetics 200-500-users 2018
Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Supplementary Information, bioRxiv, 2018-10-09
Major depression is a debilitating psychiatric illness that is typically associated with low mood, anhedonia and a range of comorbidities. Depression has a heritable component that has remained difficult to elucidate with current sample sizes due to the polygenic nature of the disorder. To maximise sample size, we meta-analysed data on 807,553 individuals (246,363 cases and 561,190 controls) from the three largest genome-wide association studies of depression. We identified 102 independent variants, 269 genes, and 15 gene-sets associated with depression, including both genes and gene-pathways associated with synaptic structure and neurotransmission. Further evidence of the importance of prefrontal brain regions in depression was provided by an enrichment analysis. In an independent replication sample of 1,306,354 individuals (414,055 cases and 892,299 controls), 87 of the 102 associated variants were significant following multiple testing correction. Based on the putative genes associated with depression this work also highlights several potential drug repositioning opportunities. These findings advance our understanding of the complex genetic architecture of depression and provide several future avenues for understanding aetiology and developing new treatment approaches.
biorxiv genetics 200-500-users 2018
MetaCell analysis of single cell RNA-seq data using k-NN graph partitions, bioRxiv, 2018-10-09
Abstract: Single cell RNA-seq (scRNA-seq) has become the method of choice for analyzing mRNA distributions in heterogeneous cell populations. scRNA-seq only partially samples the cells in a tissue and the RNA in each cell, resulting in sparse data that challenge analysis. We develop a methodology that addresses scRNA-seq’s sparsity through partitioning the data into metacells: disjoint, homogenous and highly compact groups of cells, each exhibiting only sampling variance. Metacells constitute local building blocks for clustering and quantitative analysis of gene expression, while not enforcing any global structure on the data, thereby maintaining statistical control and minimizing biases. We illustrate the MetaCell framework by re-analyzing cell type and transcriptional gradients in peripheral blood and whole organism scRNA-seq maps. Our algorithms are implemented in the new MetaCell R/C++ software package.
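To make the k-NN graph in the title concrete, here is a minimal sketch of building a k-nearest-neighbour graph over cells and keeping only reciprocal edges. It is not the MetaCell algorithm: the Euclidean distance, the function names, and the toy profiles are assumptions, and the partitioning step that would follow is omitted.

```python
def knn_graph(points, k):
    """Map each cell index to the indices of its k nearest cells
    (squared Euclidean distance, an assumed similarity measure)."""
    graph = {}
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


def mutual_edges(graph):
    """Keep only pairs (i, j) where each cell lists the other as a neighbour."""
    return {
        (i, j)
        for i, nbrs in graph.items()
        for j in nbrs
        if i < j and i in graph[j]
    }


# Hypothetical two-gene expression profiles for four cells.
cells = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
graph = knn_graph(cells, k=1)
edges = mutual_edges(graph)
```

Restricting to mutual edges is one simple way to make the graph symmetric before partitioning. This brute-force search is quadratic in the number of cells, so real datasets would need an indexed or approximate neighbour search.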
biorxiv bioinformatics 0-100-users 2018