Resetting the yeast epigenome with human nucleosomes, bioRxiv, 2017-06-08
Summary: Humans and yeast are separated by a billion years of evolution, yet their conserved core histones retain central roles in gene regulation. Here, we "reset" yeast to use core human nucleosomes in lieu of their own, an exceedingly rare event which initially took twenty days. The cells adapt, however, by acquiring suppressor mutations in cell-division genes, or by acquiring certain aneuploidy states. Robust growth was also restored by converting five histone residues back to their yeast counterparts. We reveal that humanized nucleosomes in yeast are positioned according to the endogenous yeast DNA sequence and chromatin-remodeling network, as judged by a yeast-like nucleosome repeat length. However, human nucleosomes have higher DNA occupancy and reduce RNA content. Adaptation to new biological conditions presented a special challenge for these cells due to slower chromatin remodeling. This humanized yeast poses many fundamental new questions about the nature of chromatin and how it is linked to many cell processes, and provides a platform to study histone variants via yeast epigenome reprogramming.
Highlights:
- Only 1 in 10^7 yeast survive with fully human nucleosomes, but they rapidly evolve
- Nucleosome positioning and nucleosome repeat length are not influenced by histone type
- Human nucleosomes remodel slowly and delay yeast environmental adaptation
- Human core nucleosomes are more repressive and globally reduce transcription in yeast
biorxiv synthetic-biology 0-100-users 2017
Ancient genomes from southern Africa pushes modern human divergence beyond 260,000 years ago, bioRxiv, 2017-06-06
Southern Africa is consistently placed as one of the potential regions for the evolution of Homo sapiens. To examine the region's human prehistory prior to the arrival of migrants from East and West Africa or Eurasia in the last 1,700 years, we generated and analyzed genome sequence data from seven ancient individuals from KwaZulu-Natal, South Africa. Three Stone Age hunter-gatherers date to ~2,000 years ago, and we show that they were related to current-day southern San groups such as the Karretjie People. Four Iron Age farmers (300-500 years old) have genetic signatures similar to present-day Bantu-speakers. The genome sequence (13x coverage) of a juvenile boy from Ballito Bay, who lived ~2,000 years ago, demonstrates that southern African Stone Age hunter-gatherers were not impacted by recent admixture; however, we estimate that all modern-day Khoekhoe and San groups have been influenced by 9-22% genetic admixture from East African/Eurasian pastoralist groups arriving >1,000 years ago, including the Ju|'hoansi San, previously thought to have very low levels of admixture. Using traditional and new approaches, we estimate the population divergence time between the Ballito Bay boy and other groups to be beyond 260,000 years ago. These estimates dramatically increase the deepest divergence among modern humans, coincide with the onset of the Middle Stone Age in sub-Saharan Africa, and coincide with anatomical developments of archaic humans into modern humans as represented in the local fossil record. Cumulatively, cross-disciplinary records increasingly point to southern Africa as a potential (not necessarily exclusive) 'hot spot' for the evolution of our species.
biorxiv evolutionary-biology 200-500-users 2017
Detecting polygenic adaptation in admixture graphs, bioRxiv, 2017-06-06
Abstract: An open question in human evolution is the importance of polygenic adaptation: adaptive changes in the mean of a multifactorial trait due to shifts in allele frequencies across many loci. In recent years, several methods have been developed to detect polygenic adaptation using loci identified in genome-wide association studies (GWAS). Though powerful, these methods suffer from limited interpretability: they can detect which sets of populations have evidence for polygenic adaptation, but are unable to reveal where in the history of multiple populations these processes occurred. To address this, we created a method to detect polygenic adaptation in an admixture graph, which is a representation of the historical divergences and admixture events relating different populations through time. We developed a Markov chain Monte Carlo (MCMC) algorithm to infer branch-specific parameters reflecting the strength of selection in each branch of a graph. Additionally, we developed a set of summary statistics that are fast to compute and can indicate which branches are most likely to have experienced polygenic adaptation. We show via simulations that this method, which we call PolyGraph, has good power to detect polygenic adaptation, and we applied it to human population genomic data from around the world. We also provide evidence that variants associated with several traits, including height, educational attainment, and self-reported unibrow, have been influenced by polygenic adaptation in different populations during human evolution.
biorxiv evolutionary-biology 100-200-users 2017
Distinct neuronal activity patterns induce different gene expression programs, bioRxiv, 2017-06-06
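The definition of polygenic adaptation in the abstract above (a change in a trait's mean caused by small allele-frequency shifts at many loci) can be illustrated with a minimal sketch. It assumes a purely additive diploid model in which a population's expected genetic value is 2 * sum(beta_i * p_i); this is a textbook simplification, not the PolyGraph method, and the function name, effect sizes, and frequencies are hypothetical.

```python
def mean_polygenic_score(effect_sizes, allele_freqs):
    """Expected additive genetic value of a diploid population.

    effect_sizes: per-locus effect of the trait-associated allele (e.g. GWAS betas)
    allele_freqs: frequency of that allele in the population, one per locus
    """
    if len(effect_sizes) != len(allele_freqs):
        raise ValueError("need one allele frequency per effect size")
    # Each individual carries two copies per locus, hence the factor of 2.
    return 2.0 * sum(b * p for b, p in zip(effect_sizes, allele_freqs))


# Hypothetical toy data: four loci, not taken from any real study.
betas = [0.10, -0.05, 0.08, 0.02]
ancestral_freqs = [0.50, 0.50, 0.50, 0.50]
# Every locus moves by 0.05 in the trait-increasing direction:
# up where beta is positive, down where beta is negative.
derived_freqs = [0.55, 0.45, 0.55, 0.55]

shift = (mean_polygenic_score(betas, derived_freqs)
         - mean_polygenic_score(betas, ancestral_freqs))
print(f"shift in mean genetic value: {shift:.3f}")
```

No single locus changes much here, yet the coordinated direction of the shifts moves the trait mean, which is the signal that methods of this kind look for.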
SUMMARY: Brief and sustained neuronal activity patterns can have opposite effects on synaptic strength that both require activity-regulated gene (ARG) expression. However, whether distinct patterns of activity induce different sets of ARGs is unknown. In genome-scale experiments, we reveal that a neuron's activity-pattern history can be predicted from the ARGs it expresses. Surprisingly, brief activity selectively induces a small subset of the ARG program that corresponds precisely to the first of three temporal waves of genes induced by sustained activity. These first-wave genes are distinguished by an open chromatin state, proximity to rapidly activated enhancers, and a requirement for MAPK/ERK signaling for their induction. MAPK/ERK mediates rapid RNA polymerase recruitment to promoters, as well as enhancer RNA induction but not histone acetylation at enhancers. Thus, the same mechanisms that establish the multi-wave temporal structure of ARG induction also enable different sets of genes to be induced by distinct activity patterns.
biorxiv neuroscience 0-100-users 2017
Discovery of the first genome-wide significant risk loci for ADHD, bioRxiv, 2017-06-04
Abstract: Attention-Deficit/Hyperactivity Disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of school-age children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no individual variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 ADHD cases and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, revealing new and important information on the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes, as well as around brain-expressed regulatory marks. These findings, based on clinical interviews and/or medical records, are supported by additional analyses of a self-reported ADHD sample and a study of quantitative measures of ADHD symptoms in the population. Meta-analyzing these data with our primary scan yielded a total of 16 genome-wide significant loci. The results support the hypothesis that clinical diagnosis of ADHD is an extreme expression of one or more continuous heritable traits.
biorxiv genetics 200-500-users 2017
Improving the value of public RNA-seq expression data by phenotype prediction, bioRxiv, 2017-06-04
Abstract
Background: Publicly available genomic data are a valuable resource for studying normal human variation and disease, but these data are often not well labeled or annotated. The lack of phenotype information for public genomic data severely limits their utility for addressing targeted biological questions.
Results: We develop an in silico phenotyping approach for predicting critical missing annotation directly from genomic measurements, using well-annotated genomic and phenotypic data produced by consortia like TCGA and GTEx as training data. We apply in silico phenotyping to a set of 70,000 RNA-seq samples we recently processed on a common pipeline as part of the recount2 project (https://jhubiostatistics.shinyapps.io/recount). We use gene expression data to build and evaluate predictors for both biological phenotypes (sex, tissue, sample source) and experimental conditions (sequencing strategy). We demonstrate how these predictions can be used to study cross-sample properties of public genomic data, select genomic projects with specific characteristics, and perform downstream analyses using predicted phenotypes. The methods to perform phenotype prediction are available in the phenopredict R package (https://github.com/leekgroup/phenopredict) and the predictions for recount2 are available from the recount R package (https://bioconductor.org/packages/release/bioc/html/recount.html).
Conclusion: Having leveraged massive public data sets to generate a well-phenotyped set of expression data for more than 70,000 human samples, we make expression data available for use on a scale that was not previously feasible.
biorxiv bioinformatics 100-200-users 2017