Recovery of trait heritability from whole genome sequence data, bioRxiv, 2019-03-26
Abstract: Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data 1, but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs 2–5. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as over-estimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be fully recovered from whole-genome sequence (WGS) data on 21,620 unrelated individuals of European ancestry. We assigned 47.1 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned variation accordingly. The estimated heritability was 0.79 (SE 0.09) for height and 0.40 (SE 0.09) for BMI, consistent with pedigree estimates. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein altering variants, consistent with negative selection thereon. Cumulatively, variants in the MAF range of 0.0001 to 0.1 explained 0.54 (SE 0.05) and 0.51 (SE 0.11) of heritability for height and BMI, respectively. Our results imply that the still missing heritability of complex traits and disease is accounted for by rare variants, in particular those in regions of low LD.
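As a rough illustration of what partitioning heritability across MAF/LD groups means arithmetically, the sketch below sums per-group genetic variance components and divides by the phenotypic variance. The function name, the group labels, and all numeric values are hypothetical and chosen only for illustration; they are not the study's estimates, and the study's actual estimation procedure is not reproduced here.

```python
def partition_heritability(bin_variances, phenotypic_variance):
    """Return total heritability and each bin's share of it.

    bin_variances: dict mapping a (hypothetical) MAF/LD bin label to the
    genetic variance attributed to that bin.
    phenotypic_variance: total phenotypic variance of the trait.
    """
    total_genetic = sum(bin_variances.values())
    h2 = total_genetic / phenotypic_variance
    # Share of the total genetic variance contributed by each bin.
    shares = {name: v / total_genetic for name, v in bin_variances.items()}
    return h2, shares


# Hypothetical variance components, NOT taken from the preprint.
example_bins = {
    "rare_low_ld": 0.20,
    "rare_high_ld": 0.10,
    "common_low_ld": 0.15,
    "common_high_ld": 0.25,
}
h2, shares = partition_heritability(example_bins, phenotypic_variance=1.0)
for name, share in shares.items():
    print(f"{name}: {share:.2f} of heritability")
print(f"total h2 = {h2:.2f}")
```

Under these made-up inputs the bin shares sum to one by construction, so a statement such as "variants in a given MAF range explain a fraction of heritability" corresponds to adding the shares of the bins in that range.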
biorxiv genetics 500+-users 2019
Extraordinary claims require extraordinary evidence in the case of asserted mtDNA biparental inheritance, bioRxiv, 2019-03-25
Abstract: A breakthrough article published in PNAS by Luo et al. (2018) challenges a central dogma in biology which states that the mitochondrial DNA (mtDNA) is inherited exclusively from the mother. By sequencing the mitogenomes of several members of three independent families, the authors inferred an unprecedented pattern of biparental inheritance that requires the participation of an autosomal nuclear factor in the molecular process. However, a comprehensive analysis of their data reveals a number of issues that must be carefully addressed before challenging the current paradigm. Unfortunately, the methods section lacks any description of sample management, validation of their results in independent laboratories was deficient, and the reported findings have been observed at a frequency at complete variance with established evidence. Moreover, the remarkably high (and unusually homogeneous) levels of heteroplasmy reported can be readily detected using classical techniques for DNA sequencing. By reassessing the raw sequencing data with an alternative computational pipeline, we report strong correlation to the NextGENe results provided by the authors on a per-sample basis. However, the sequencing replicates from the same donors show aberrations in the variants detected that need further investigation to exclude contributions from other sources or methodological artifacts. Finally, applying the principle of reductio ad absurdum, we demonstrate that the nuclear factor invoked by the authors would need to be extraordinarily complex and precise in order to preclude linear accumulation of mtDNA lineages across generations. We discuss alternate scenarios that explain findings of the same nature as reported by Luo et al., in the context of in-vitro fertilization and therapeutic mtDNA replacement ooplasmic transplantation.
biorxiv genetics 100-200-users 2019
Changes in gene expression shift and switch genetic interactions, bioRxiv, 2019-03-15
Summary: An important goal in disease genetics and evolutionary biology is to understand how mutations combine to alter phenotypes and fitness. Non-additive interactions between mutations occur extensively and change across conditions, cell types, and species, making genetic prediction a difficult challenge. To understand the reasons for this, we reduced the problem to a minimal system where we combined mutations in a single protein performing a single function (a transcriptional repressor inhibiting a target gene). Even in this minimal system, a change in gene expression altered both the strength and type of genetic interactions. These seemingly complicated changes could, however, be predicted by a mathematical model that propagates the effects of mutations on protein folding to the cellular phenotype. We show that similar changes will be observed for many genes. These results provide fundamental insights into genotype-phenotype maps and illustrate how changes in genetic interactions can be predicted using hierarchical mechanistic models.
One-sentence summary: Deep mutagenesis of the lambda repressor reveals that changes in gene expression will alter the strength and direction of genetic interactions between mutations in many genes.
Highlights:
- Deep mutagenesis of the lambda repressor at two expression levels reveals extensive changes in mutational effects and genetic interactions
- Genetic interactions can switch from positive (suppressive) to negative (enhancing) as the expression of a gene changes
- A mathematical model that propagates the effects of mutations on protein folding to the cellular phenotype accurately predicts changes in mutational effects and interactions
- Changes in expression will alter mutational effects and interactions for many genes
- For some genes, perfect mechanistic models will never be able to predict how mutations of known effect combine without measurements of intermediate phenotypes
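To make the idea of propagating folding effects to a cellular phenotype concrete, here is a minimal sketch of a generic two-layer model: mutations add to the folding free energy, the folded fraction scales the amount of active repressor, and a Hill function maps active repressor to target-gene expression. This is not the authors' model. The functional forms, the function names, and every parameter value (`RT`, `k`, `hill`, the free energies, the expression levels) are assumptions chosen for illustration.

```python
import math

RT = 0.593  # kcal/mol near 25 degrees C; an assumed constant for this sketch


def fraction_folded(dg):
    """Two-state folding: fraction of protein folded for folding free energy dg.

    Negative dg means a stable fold; dg = 0 gives exactly one half folded.
    """
    return 1.0 / (1.0 + math.exp(dg / RT))


def target_expression(dg, expression, k=1.0, hill=2.0):
    """Target-gene output under repression by the folded (active) repressor.

    expression: relative repressor expression level (hypothetical units).
    k, hill: assumed Hill-function parameters.
    """
    active = expression * fraction_folded(dg)
    return 1.0 / (1.0 + (active / k) ** hill)


def epistasis(dg_wt, ddg_a, ddg_b, expression):
    """Double-mutant effect minus the sum of single-mutant effects.

    Mutational effects are assumed additive in free energy; any non-zero
    result comes from the non-linear folding and repression steps.
    """
    wt = target_expression(dg_wt, expression)
    a = target_expression(dg_wt + ddg_a, expression)
    b = target_expression(dg_wt + ddg_b, expression)
    ab = target_expression(dg_wt + ddg_a + ddg_b, expression)
    return (ab - wt) - ((a - wt) + (b - wt))


# Same two hypothetical destabilising mutations at two expression levels.
for level in (1.0, 10.0):
    print(level, epistasis(dg_wt=-2.0, ddg_a=1.5, ddg_b=1.5, expression=level))
```

Even though the two mutations combine additively at the free-energy level, the computed interaction at the phenotype level depends on the expression level, which is the qualitative point made in the summary.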
biorxiv genetics 0-100-users 2019
Population histories of the United States revealed through fine-scale migration and haplotype analysis, bioRxiv, 2019-03-14
Abstract: The population of the United States is shaped by centuries of migration, isolation, growth, and admixture between ancestors of global origins. Here, we assemble a comprehensive view of recent population history by studying the ancestry and population structure of over 32,000 individuals in the US using genetic, ancestral birth origin, and geographic data from the National Geographic Genographic Project. We identify migration routes and barriers that reflect historical demographic events. We also uncover the spatial patterns of relatedness in subpopulations through the combination of haplotype clustering, ancestral birth origin analysis, and local ancestry inference. These patterns include substantial substructure and heterogeneity in Hispanics/Latinos, isolation-by-distance in African Americans, elevated levels of relatedness and homozygosity in Asian immigrants, and fine-scale structure in European descents. Taken together, our results provide detailed insights into the genetic structure and demographic history of the diverse US population.
biorxiv genetics 100-200-users 2019
The genetic architecture of sporadic and recurrent miscarriage, bioRxiv, 2019-03-13
Miscarriage is a common complex trait that affects 10-25% of clinically confirmed pregnancies 1,2. Here we present the first large-scale genetic association analyses with 69,118 cases from five different ancestries for sporadic miscarriage and 750 cases of European ancestry for recurrent miscarriage, and up to 359,469 female controls. We identify one genome-wide significant association on chromosome 13 (rs146350366, minor allele frequency (MAF) 1.2%, Pmeta=3.2×10⁻⁸ [...] (CI) 1.2-1.6) for sporadic miscarriage in our European ancestry meta-analysis (50,060 cases and 174,109 controls), located near FGF9 involved in pregnancy maintenance 3 and progesterone production 4. Additionally, we identified three genome-wide significant associations for recurrent miscarriage, including a signal on chromosome 9 (rs7859844, MAF=6.4%, Pmeta=1.3×10⁻⁸ [...] in controlling extravillous trophoblast motility 5. We further investigate the genetic architecture of miscarriage with biobank-scale Mendelian randomization, heritability, and genetic correlation analyses. Our results suggest that miscarriage etiopathogenesis is partly driven by genetic variation related to gonadotropin regulation, placental biology and progesterone production.
biorxiv genetics 0-100-users 2019
Genetic analysis identifies molecular systems and biological pathways associated with household income, bioRxiv, 2019-03-12
Abstract: Socio-economic position (SEP) is a multi-dimensional construct reflecting (and influencing) multiple socio-cultural, physical, and environmental factors. Previous genome-wide association studies (GWAS) using household income as a marker of SEP have shown that common genetic variants account for 11% of its variation. Here, in a sample of 286,301 participants from UK Biobank, we identified 30 independent genome-wide significant loci, 29 novel, that are associated with household income. Using a recently developed method to meta-analyze data that leverages power from genetically correlated traits, we identified an additional 120 income-associated loci. These loci showed clear evidence of functional enrichment, with transcriptional differences identified across multiple cortical tissues, in addition to links with GABAergic and serotonergic neurotransmission. We identified neurogenesis and the components of the synapse as candidate biological systems that are linked with income. By combining our GWAS on income with data from eQTL studies and chromatin interactions, 24 genes were prioritized for follow up, 18 of which were previously associated with cognitive ability. Using Mendelian Randomization, we identified cognitive ability as one of the causal, partly heritable phenotypes that bridge the gap between molecular genetic inheritance and phenotypic consequence in terms of income differences. Significant differences between genetic correlations indicated that the genetic variants associated with income are related to better mental health than those linked to educational attainment (another commonly used marker of SEP). Finally, we were able to predict 2.5% of income differences using genetic data alone in an independent sample. These results are important for understanding the observed socioeconomic inequalities in Great Britain today.
biorxiv genetics 200-500-users 2019