Polygenic Adaptation has Impacted Multiple Anthropometric Traits, bioRxiv, 2017-07-24
Abstract: Our understanding of the genetic basis of human adaptation is biased toward loci of large phenotypic effect. Genome-wide association studies (GWAS) now enable the study of genetic adaptation in polygenic phenotypes. We test for polygenic adaptation among 187 world-wide human populations using polygenic scores constructed from GWAS of 34 complex traits. We identify signals of polygenic adaptation for anthropometric traits including height, infant head circumference (IHC), hip circumference and waist-to-hip ratio (WHR). Analysis of ancient DNA samples indicates that a north-south cline of height within Europe and a west-east cline across Eurasia can be traced to selection for increased height in two late Pleistocene hunter-gatherer populations living in western and west-central Eurasia. Our observation that IHC and WHR follow a latitudinal cline in Western Eurasia supports the role of natural selection in driving Bergmann’s Rule in humans, consistent with thermoregulatory adaptation in response to latitudinal temperature variation.
Author’s Note on Failure to Replicate: After this preprint was posted, the UK Biobank dataset was released, providing a new and open GWAS resource. When we attempted to replicate the height selection results from this preprint using GWAS data from the UK Biobank, we discovered that we could not. In subsequent analyses, we determined that both the GIANT consortium height GWAS data and another dataset that was used for replication were impacted by stratification issues that created, or at a minimum substantially inflated, the height selection signals reported here.
The results of this second investigation, written together with additional coauthors, have now been published (https://elifesciences.org/articles/39725), along with another paper by a separate group of authors showing similar issues (https://elifesciences.org/articles/39702). A preliminary investigation shows that the other non-height-based results may suffer from similar issues. We stand by the theory and statistical methods reported in this paper, and the paper can be cited for these results. However, we have shown that the data on which the major empirical results were based are not sound, and so these results should be treated with caution until replicated.
biorxiv evolutionary-biology 200-500-users 2017
Genome-wide genetic data on ~500,000 UK Biobank participants, bioRxiv, 2017-07-21
Abstract: The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offers novel opportunities for assessing quality issues, although the wide range of ancestries of the individuals in the cohort also creates particular challenges. We also conducted a set of analyses that reveal properties of the genetic data – such as population structure and relatedness – that can be important for downstream analyses. In addition, we phased and imputed genotypes into the dataset, using computationally efficient methods combined with the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. This increases the number of testable variants by over 100-fold to ~96 million variants. We also imputed classical allelic variation at 11 human leukocyte antigen (HLA) genes, and as a quality control check of this imputation, we replicate signals of known associations between HLA alleles and many common diseases. We describe tools that allow efficient genome-wide association studies (GWAS) of multiple traits and fast phenome-wide association studies (PheWAS), which work together with a new compressed file format that has been used to distribute the dataset. As a further check of the genotyped and imputed datasets, we performed a test-case genome-wide association scan on a well-studied human trait, standing height.
biorxiv genetics 200-500-users 2017
The cis-regulatory dynamics of embryonic development at single cell resolution, bioRxiv, 2017-07-21
ABSTRACT: Single cell measurements of gene expression are providing new insights into lineage commitment, yet the regulatory changes underlying individual cell trajectories remain elusive. Here, we profiled chromatin accessibility in over 20,000 single nuclei across multiple stages of Drosophila embryogenesis. Our data reveal heterogeneity in the regulatory landscape prior to gastrulation that reflects anatomical position, a feature that aligns with future cell fate. During mid embryogenesis, tissue granularity emerges such that cell types can be inferred by their chromatin accessibility, while maintaining a signature of their germ layer of origin. We identify over 30,000 distal elements with tissue-specific accessibility. Using transgenic embryos, we tested the germ layer specificity of a subset of predicted enhancers, achieving near-perfect accuracy. Overall, these data demonstrate the power of shotgun single cell profiling of embryos to resolve dynamic changes in open chromatin during development, and to uncover the cis-regulatory programs of germ layers and cell types.
biorxiv genomics 200-500-users 2017
The evolutionary history of 2,658 cancers, bioRxiv, 2017-07-12
Summary: Cancer develops through a process of somatic evolution. Here, we use whole-genome sequencing of 2,778 tumour samples from 2,658 donors to reconstruct the life history, evolution of mutational processes, and driver mutation sequences of 39 cancer types. The early phases of oncogenesis are driven by point mutations in a small set of driver genes, often including biallelic inactivation of tumour suppressors. Early oncogenesis is also characterised by specific copy number gains, such as trisomy 7 in glioblastoma or isochromosome 17q in medulloblastoma. By contrast, increased genomic instability, a nearly four-fold diversification of driver genes, and an acceleration of point mutation processes are features of later stages. Copy-number alterations often occur in mitotic crises leading to simultaneous gains of multiple chromosomal segments. Timing analysis suggests that driver mutations often precede diagnosis by many years, and in some cases decades, providing a window of opportunity for early cancer detection.
biorxiv cancer-biology 200-500-users 2017
Speed breeding: a powerful tool to accelerate crop research and breeding, bioRxiv, 2017-07-10
The growing human population and a changing environment have raised significant concern for global food security, with the current improvement rate of several important crops inadequate to meet future demand [1]. This slow improvement rate is attributed partly to the long generation times of crop plants. Here we present a method called ‘speed breeding’, which greatly shortens generation time and accelerates breeding and research programs. Speed breeding can be used to achieve up to 6 generations per year for spring wheat (Triticum aestivum), durum wheat (T. durum), barley (Hordeum vulgare), chickpea (Cicer arietinum), and pea (Pisum sativum) and 4 generations for canola (Brassica napus), instead of 2-3 under normal glasshouse conditions. We demonstrate that speed breeding in fully-enclosed controlled-environment growth chambers can accelerate plant development for research purposes, including phenotyping of adult plant traits, mutant studies, and transformation. The use of supplemental lighting in a glasshouse environment allows rapid generation cycling through single seed descent and potential for adaptation to larger-scale crop improvement programs. Cost-saving through LED supplemental lighting is also outlined. We envisage great potential for integrating speed breeding with other modern crop breeding technologies, including high-throughput genotyping, genome editing, and genomic selection, accelerating the rate of crop improvement.
biorxiv plant-biology 200-500-users 2017
Robust and Bright Genetically Encoded Fluorescent Markers for Highlighting Structures and Compartments in Mammalian Cells, bioRxiv, 2017-07-07
To increase our understanding of cells, there is a need for specific markers to identify biomolecules, cellular structures and compartments. One type of marker comprises genetically encoded fluorescent probes that are linked with protein domains, peptides and/or signal sequences. These markers are encoded on a plasmid, and they allow straightforward, convenient labeling of cultured mammalian cells by introducing the plasmid into the cells. Ideally, the fluorescent marker combines favorable spectroscopic properties (brightness, photostability) with specific labeling of the structure or compartment of interest. Here, we report on our ongoing efforts to generate robust and bright genetically encoded fluorescent markers for highlighting structures and compartments in living cells.
biorxiv cell-biology 200-500-users 2017