Distinguishing genetic correlation from causation across 52 diseases and complex traits, bioRxiv, 2017-10-19

AbstractMendelian randomization (MR) is widely used to identify causal relationships among heritable traits, but it can be confounded by genetic correlations reflecting shared etiology. We propose a model in which a latent causal variable mediates the genetic correlation between two traits. Under the latent causal variable (LCV) model, trait 1 is fully genetically causal for trait 2 if it is perfectly genetically correlated with the latent causal variable, implying that the entire genetic component of trait 1 is causal for trait 2; it is partially genetically causal for trait 2 if it has a high genetic correlation with the latent variable, implying that part of the genetic component of trait 1 is causal for trait 2. To quantify the degree of partial genetic causality, we define the genetic causality proportion (gcp). We fit this model using mixed fourth moments E(<jatsinline-formula><jatsinline-graphic xmlnsxlink=httpwww.w3.org1999xlink xlinkhref=205435_inline1.gif ><jatsinline-formula>α1α2) and E(<jatsinline-formula><jatsinline-graphic xmlnsxlink=httpwww.w3.org1999xlink xlinkhref=205435_inline2.gif ><jatsinline-formula>α1α2) of marginal effect sizes for each trait, exploiting the fact that if trait 1 is causal for trait 2 then SNPs affecting trait 1 (large <jatsinline-formula><jatsinline-graphic xmlnsxlink=httpwww.w3.org1999xlink xlinkhref=205435_inline3.gif ><jatsinline-formula>) will have correlated effects on trait 2 (large α1α2), but not vice versa. We performed simulations under a wide range of genetic architectures and determined that LCV, unlike state-of-the-art MR methods, produced well-calibrated false positive rates and reliable gcp estimates in the presence of genetic correlations and asymmetric genetic architectures; we also determined that LCV is well-powered to detect a causal effect. We applied LCV to GWAS summary statistics for 52 traits (average N=331k), identifying partially or fully genetically causal effects (1% FDR) for 59 pairs of traits, including 30 pairs of traits with high gcp estimates (gĉp &gt; 0.6). Results consistent with the published literature included genetically causal effects on myocardial infarction (MI) for LDL, triglycerides and BMI. Novel findings included a genetically causal effect of LDL on bone mineral density, consistent with clinical trials of statins in osteoporosis. These results demonstrate that it is possible to distinguish between genetic correlation and causation using genetic data.

biorxiv genetics 200-500-users 2017

Amplification-free, CRISPR-Cas9 Targeted Enrichment and SMRT Sequencing of Repeat-Expansion Disease Causative Genomic Regions, bioRxiv, 2017-10-17

AbstractTargeted sequencing has proven to be an economical means of obtaining sequence information for one or more defined regions of a larger genome. However, most target enrichment methods require amplification. Some genomic regions, such as those with extreme GC content and repetitive sequences, are recalcitrant to faithful amplification. Yet, many human genetic disorders are caused by repeat expansions, including difficult to sequence tandem repeats.We have developed a novel, amplification-free enrichment technique that employs the CRISPR-Cas9 system for specific targeting multiple genomic loci. This method, in conjunction with long reads generated through Single Molecule, Real-Time (SMRT) sequencing and unbiased coverage, enables enrichment and sequencing of complex genomic regions that cannot be investigated with other technologies. Using human genomic DNA samples, we demonstrate successful targeting of causative loci for Huntington’s disease (HTT; CAG repeat), Fragile X syndrome (FMR1; CGG repeat), amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (C9orf72; GGGGCC repeat), and spinocerebellar ataxia type 10 (SCA10) (ATXN10; variable ATTCT repeat). The method, amenable to multiplexing across multiple genomic loci, uses an amplification-free approach that facilitates the isolation of hundreds of individual on-target molecules in a single SMRT Cell and accurate sequencing through long repeat stretches, regardless of extreme GC percent or sequence complexity content. Our novel targeted sequencing method opens new doors to genomic analyses independent of PCR amplification that will facilitate the study of repeat expansion disorders.

biorxiv genomics 0-100-users 2017

 

Created with the audiences framework by Jedidiah Carlson

Powered by Hugo