A zombie LIF gene in elephants is up-regulated by TP53 to induce apoptosis in response to DNA damage, bioRxiv, 2017-09-13
Abstract: Large-bodied organisms have more cells that can potentially turn cancerous than small-bodied organisms with fewer cells, imposing an increased risk of developing cancer. This expectation predicts a positive correlation between body size and cancer risk; however, there is no correlation between body size and cancer risk across species (‘Peto’s Paradox’). Here we show that elephants and their extinct relatives (Proboscideans) may have resolved Peto’s Paradox in part through re-functionalizing a leukemia inhibitory factor pseudogene (LIF6) with pro-apoptotic functions. The LIF6 gene is transcriptionally up-regulated by TP53 in response to DNA damage, and its product translocates to the mitochondria, where it induces apoptosis. Phylogenetic analyses of living and extinct Proboscidean LIF6 genes indicate that its TP53 response element evolved coincident with the evolution of large body sizes in the Proboscidean stem-lineage. These results suggest that re-functionalizing of a pro-apoptotic LIF pseudogene may have been permissive (though not sufficient) for the evolution of large body sizes in Proboscideans.
biorxiv evolutionary-biology 100-200-users 2017
No major flaws in “Identification of individuals by trait prediction using whole-genome sequencing data”, bioRxiv, 2017-09-12
Abstract: In a recently published PNAS article, we studied the identifiability of genomic samples using machine learning methods [Lippert et al., 2017]. In a response, Erlich [2017] argued that our work contained major flaws. The main technical critique of Erlich [2017] builds on a simulation experiment showing that our proposed algorithm, which uses only a genomic sample for identification, performed no better than a strategy that uses demographic variables. Below, we show why this comparison is misleading and provide a detailed discussion of the key critical points in our analyses that have been brought up in Erlich [2017] and in the media. Further, not only faces but also a wide range of phenotypes and demographic variables may be derived from DNA. In this light, the main contribution of Lippert et al. [2017] is an algorithm that identifies genomes of individuals by combining multiple DNA-based predictive models for a myriad of traits.
biorxiv genomics 100-200-users 2017
GWAS meta-analysis (N=279,930) identifies new genes and functional links to intelligence, bioRxiv, 2017-09-07
Intelligence is highly heritable [1] and a major determinant of human health and well-being [2]. Recent genome-wide meta-analyses have identified 24 genomic loci linked to intelligence [3–7], but much about its genetic underpinnings remains to be discovered. Here, we present the largest genetic association study of intelligence to date (N=279,930), identifying 206 genomic loci (191 novel) and implicating 1,041 genes (963 novel) via positional mapping, expression quantitative trait locus (eQTL) mapping, chromatin interaction mapping, and gene-based association analysis. We find enrichment of genetic effects in conserved and coding regions and identify 89 nonsynonymous exonic variants. Associated genes are strongly expressed in the brain and specifically in striatal medium spiny neurons and cortical and hippocampal pyramidal neurons. Gene-set analyses implicate pathways related to neurogenesis, neuron differentiation and synaptic structure. We confirm previous strong genetic correlations with several neuropsychiatric disorders, and Mendelian Randomization results suggest protective effects of intelligence for Alzheimer’s dementia and ADHD, and bidirectional causation with strong pleiotropy for schizophrenia. These results are a major step forward in understanding the neurobiology of intelligence as well as genetically associated neuropsychiatric traits.
biorxiv genetics 200-500-users 2017
Major flaws in “Identification of individuals by trait prediction using whole-genome sequencing data”, bioRxiv, 2017-09-07
Summary: Genetic privacy is an area of active research. While it is important to identify new risks, it is equally crucial to supply policymakers with accurate information based on scientific evidence. Recently, Lippert et al. (PNAS, 2017) investigated the status of genetic privacy using trait predictions from whole-genome sequencing. The authors sequenced a cohort of about 1000 individuals and collected a range of demographic, visible, and digital traits such as age, sex, height, face morphology, and a voice signature. They attempted to use the genetic features to predict those traits and to re-identify the individuals from a small pool using the trait predictions. Here, I report major flaws in the Lippert et al. manuscript. In short, the authors’ technique performs similarly to a simple baseline procedure, does not utilize the power of whole-genome markers, uses technically wrong metrics, and finally does not really identify anyone.
biorxiv genomics 500+-users 2017
Genomic basis for RNA alterations revealed by whole-genome analyses of 27 cancer types, bioRxiv, 2017-09-04
Abstract: We present the most comprehensive catalogue of cancer-associated gene alterations through characterization of tumor transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes project. Using matched whole-genome sequencing data, we attributed RNA alterations to germline and somatic DNA alterations, revealing likely genetic mechanisms. We identified 444 associations of gene expression with somatic non-coding single-nucleotide variants. We found 1,872 splicing alterations associated with somatic mutation in intronic regions, including novel exonization events associated with Alu elements. Somatic copy number alterations were the major driver of total gene and allele-specific expression (ASE) variation. Additionally, 82% of gene fusions had structural variant support, including 75 of a novel class called “bridged” fusions, in which a third genomic location bridged two different genes. Globally, we observe transcriptomic alteration signatures that differ between cancer types and have associations with DNA mutational signatures. Given this unique dataset of RNA alterations, we also identified 1,012 genes significantly altered through both DNA and RNA mechanisms. Our study represents an extensive catalog of RNA alterations and reveals new insights into the heterogeneous molecular mechanisms of cancer gene alterations.
biorxiv genomics 100-200-users 2017
What exactly is ‘N’ in cell culture and animal experiments?, bioRxiv, 2017-09-03
Abstract: Biologists establish the existence of experimental effects by applying treatments or interventions to biological entities or units, such as people, animals, slice preparations, or cells. When done appropriately, independent replication of the entity-intervention pair contributes to the sample size (N) and forms the basis of statistical inference. However, sometimes the appropriate entity-intervention pair may not be obvious, and the wrong choice can make an experiment worthless. We surveyed a random sample of published animal experiments from 2011 to 2016 where interventions were applied to parents but effects examined in the offspring, as regulatory authorities have provided clear guidelines on replication with such designs. We found that only 22% of studies (95% CI = 17% to 29%) replicated the correct entity-intervention pair and thus made valid statistical inferences. Approximately half of the studies (46%, 95% CI = 38% to 53%) had pseudoreplication while 32% (95% CI = 26% to 39%) provided insufficient information to make a judgement. Pseudoreplication artificially inflates the sample size, leading to more false positive results and inflating the apparent evidence supporting a scientific claim. It is hard for science to advance when so many experiments are poorly designed and analysed. We argue that distinguishing between biological units, experimental units, and observational units clarifies where replication should occur, describe the criteria for genuine replication, and provide guidelines for designing and analysing in vitro, ex vivo, and in vivo experiments.
biorxiv neuroscience 100-200-users 2017
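The distinction the last abstract draws between experimental units and observational units can be illustrated with a minimal sketch. It assumes a design where the treatment is given to dams and measurements are taken on their offspring, so the dam is the replicated entity-intervention pair. The data, group names, and function names below are hypothetical and are not taken from the paper.

```python
from statistics import mean

# Hypothetical data (not from the paper): treatment is applied to dams,
# outcomes are measured on each dam's offspring.
litters = {
    "control": {"dam1": [10.1, 9.8, 10.3], "dam2": [11.0, 10.7, 10.9]},
    "treated": {"dam3": [12.2, 12.0, 12.5], "dam4": [10.9, 11.2, 11.1]},
}


def naive_n(group):
    """Pseudoreplicated N: treats every offspring as an independent unit."""
    return sum(len(pups) for pups in group.values())


def correct_n(group):
    """N as the number of dams, the units that received the intervention."""
    return len(group)


def litter_means(group):
    """Collapse offspring to one value per dam before any group comparison."""
    return [mean(pups) for pups in group.values()]


summary = {
    name: {
        "naive_n": naive_n(group),
        "correct_n": correct_n(group),
        "litter_means": litter_means(group),
    }
    for name, group in litters.items()
}

for name, stats in summary.items():
    print(name, stats["naive_n"], stats["correct_n"], stats["litter_means"])
```

Averaging within each litter is only one simple way to analyse such data at the level of the experimental unit; a mixed-effects model with a per-dam random effect is a common alternative.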