Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depressive disorder, bioRxiv, 2017-07-25
Major depressive disorder (MDD) is a notably complex illness with a lifetime prevalence of 14%.1 It is often chronic or recurrent and is thus accompanied by considerable morbidity, excess mortality, substantial costs, and heightened risk of suicide.2-7 MDD is a major cause of disability worldwide.8 We conducted a genome-wide association (GWA) meta-analysis in 130,664 MDD cases and 330,470 controls, and identified 44 independent loci that met criteria for statistical significance. We present extensive analyses of these results, which provide new insights into the nature of MDD. The genetic findings were associated with clinical features of MDD, and implicated prefrontal and anterior cingulate cortex in the pathophysiology of MDD (regions exhibiting anatomical differences between MDD cases and controls). Genes that are targets of antidepressant medications were strongly enriched for MDD association signals (P=8.5×10^-10), suggesting the relevance of these findings for improved pharmacotherapy of MDD. Sets of genes involved in gene splicing and in creating isoforms were also enriched for smaller MDD GWA P-values, and these gene sets have also been implicated in schizophrenia and autism. Genetic risk for MDD was correlated with that for many adult and childhood onset psychiatric disorders. Our analyses suggested important relations of genetic risk for MDD with educational attainment, body mass, and schizophrenia: the genetic bases of lower educational attainment and higher body mass were putatively causal for MDD, whereas MDD and schizophrenia reflected a partly shared biological etiology. All humans carry lesser or greater numbers of genetic risk factors for MDD, and a continuous measure of risk underlies the observed clinical phenotype.
MDD is not a distinct entity that neatly demarcates normalcy from pathology but rather a useful clinical construct associated with a range of adverse outcomes and the end result of a complex process of intertwined genetic and environmental effects. These findings help refine and define the fundamental basis of MDD.
biorxiv genetics 200-500-users 2017
Profiling of accessible chromatin regions across multiple plant species and cell types reveals common gene regulatory principles and new control modules, bioRxiv, 2017-07-25
ABSTRACT: The transcriptional regulatory structure of plant genomes remains poorly defined relative to animals. It is unclear how many cis-regulatory elements exist, where these elements lie relative to promoters, and how these features are conserved across plant species. We employed the Assay for Transposase-Accessible Chromatin (ATAC-seq) in four plant species (Arabidopsis thaliana, Medicago truncatula, Solanum lycopersicum, and Oryza sativa) to delineate open chromatin regions and transcription factor (TF) binding sites across each genome. Despite 10-fold variation in intergenic space among species, the majority of open chromatin regions lie within 3 kb upstream of a transcription start site in all species. We find a common set of four TFs that appear to regulate conserved gene sets in the root tips of all four species, suggesting that TF-gene networks are generally conserved. Comparative ATAC-seq profiling of Arabidopsis root hair and non-hair cell types revealed extensive similarity as well as many cell type-specific differences. Analyzing TF binding sites in differentially accessible regions identified a MYB-driven regulatory module unique to the hair cell, which appears to control both cell fate regulators and abiotic stress responses. Our analyses revealed common regulatory principles among species and shed light on the mechanisms producing cell type-specific transcriptomes during development.
biorxiv plant-biology 0-100-users 2017
Polygenic Adaptation has Impacted Multiple Anthropometric Traits, bioRxiv, 2017-07-24
Abstract: Our understanding of the genetic basis of human adaptation is biased toward loci of large phenotypic effect. Genome-wide association studies (GWAS) now enable the study of genetic adaptation in polygenic phenotypes. We test for polygenic adaptation among 187 world-wide human populations using polygenic scores constructed from GWAS of 34 complex traits. We identify signals of polygenic adaptation for anthropometric traits including height, infant head circumference (IHC), hip circumference and waist-to-hip ratio (WHR). Analysis of ancient DNA samples indicates that a north-south cline of height within Europe and a west-east cline across Eurasia can be traced to selection for increased height in two late Pleistocene hunter-gatherer populations living in western and west-central Eurasia. Our observation that IHC and WHR follow a latitudinal cline in Western Eurasia supports the role of natural selection in driving Bergmann’s Rule in humans, consistent with thermoregulatory adaptation in response to latitudinal temperature variation.
Author’s Note on Failure to Replicate: After this preprint was posted, the UK Biobank dataset was released, providing a new and open GWAS resource. When attempting to replicate the height selection results from this preprint using GWAS data from the UK Biobank, we discovered that we could not. In subsequent analyses, we determined that both the GIANT consortium height GWAS data, as well as another dataset that was used for replication, were impacted by stratification issues that created or at a minimum substantially inflated the height selection signals reported here.
The results of this second investigation, written together with additional coauthors, have now been published (https://elifesciences.org/articles/39725, along with another paper by a separate group of authors showing similar issues: https://elifesciences.org/articles/39702). A preliminary investigation shows that the other non-height-based results may suffer from similar issues. We stand by the theory and statistical methods reported in this paper, and the paper can be cited for these results. However, we have shown that the data on which the major empirical results were based are not sound, and so these results should be treated with caution until replicated.
biorxiv evolutionary-biology 200-500-users 2017
Systematic mapping of chromatin state landscapes during mouse development, bioRxiv, 2017-07-22
SUMMARY: Embryogenesis requires epigenetic information that allows each cell to respond appropriately to developmental cues. Histone modifications are core components of a cell’s epigenome, giving rise to chromatin states that modulate genome function. Here, we systematically profile histone modifications in a diverse panel of mouse tissues at 8 developmental stages from 10.5 days post conception until birth, performing a total of 1,128 ChIP-seq assays across 72 distinct tissue-stages. We combine these histone modification profiles into a unified set of chromatin state annotations, and track their activity across developmental time and space. Through integrative analysis we identify dynamic enhancers, reveal key transcriptional regulators, and characterize the role of chromatin-based repression in developmental gene regulation. We also leverage these data to link enhancers to putative target genes, revealing connections between coding and non-coding sequence variation in disease etiology. Our study provides a compendium of resources for biomedical researchers, and achieves the most comprehensive view of embryonic chromatin states to date.
biorxiv genomics 100-200-users 2017
Genome-wide genetic data on ~500,000 UK Biobank participants, bioRxiv, 2017-07-21
Abstract: The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged 40-69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offers novel opportunities for assessing quality issues, although the wide range of ancestries of the individuals in the cohort also creates particular challenges. We also conducted a set of analyses that reveal properties of the genetic data, such as population structure and relatedness, that can be important for downstream analyses. In addition, we phased and imputed genotypes into the dataset, using computationally efficient methods combined with the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. This increases the number of testable variants by over 100-fold to ~96 million variants. We also imputed classical allelic variation at 11 human leukocyte antigen (HLA) genes, and as a quality control check of this imputation, we replicate signals of known associations between HLA alleles and many common diseases. We describe tools that allow efficient genome-wide association studies (GWAS) of multiple traits and fast phenome-wide association studies (PheWAS), which work together with a new compressed file format that has been used to distribute the dataset. As a further check of the genotyped and imputed datasets, we performed a test-case genome-wide association scan on a well-studied human trait, standing height.
biorxiv genetics 200-500-users 2017
The cis-regulatory dynamics of embryonic development at single cell resolution, bioRxiv, 2017-07-21
ABSTRACT: Single cell measurements of gene expression are providing new insights into lineage commitment, yet the regulatory changes underlying individual cell trajectories remain elusive. Here, we profiled chromatin accessibility in over 20,000 single nuclei across multiple stages of Drosophila embryogenesis. Our data reveal heterogeneity in the regulatory landscape prior to gastrulation that reflects anatomical position, a feature that aligns with future cell fate. During mid-embryogenesis, tissue granularity emerges such that cell types can be inferred by their chromatin accessibility, while maintaining a signature of their germ layer of origin. We identify over 30,000 distal elements with tissue-specific accessibility. Using transgenic embryos, we tested the germ layer specificity of a subset of predicted enhancers, achieving near-perfect accuracy. Overall, these data demonstrate the power of shotgun single cell profiling of embryos to resolve dynamic changes in open chromatin during development, and to uncover the cis-regulatory programs of germ layers and cell types.
biorxiv genomics 200-500-users 2017