Cas9-Assisted Targeting of CHromosome segments (CATCH) for targeted nanopore sequencing and optical genome mapping, bioRxiv, 2017-02-21
Abstract: Variations in the genetic code, from single point mutations to large structural or copy number alterations, influence susceptibility, onset, and progression of genetic diseases and tumor transformation. Next-generation sequencing analysis is unable to reliably capture aberrations larger than the typical sequencing read length of several hundred bases. Long-read, single-molecule sequencing methods such as SMRT and nanopore sequencing can address larger variations, but require costly whole genome analysis. Here we describe a method for isolation and enrichment of a large genomic region of interest for targeted analysis based on Cas9 excision of two sites flanking the target region and isolation of the excised DNA segment by pulsed field gel electrophoresis. The isolated target remains intact and is ideally suited for optical genome mapping and long-read sequencing at high coverage. In addition, analysis is performed directly on native genomic DNA that retains genetic and epigenetic composition without amplification bias. This method enables detection of mutations and structural variants as well as detailed analysis by generation of hybrid scaffolds composed of optical maps and sequencing data at a fraction of the cost of whole genome sequencing.
biorxiv genomics 100-200-users 2017
Niche construction in evolutionary theory: the construction of an academic niche?, bioRxiv, 2017-02-20
Abstract: In recent years, fairly far-reaching claims have been repeatedly made about how niche construction, the modification by organisms of their environment, and that of other organisms, represents a vastly neglected phenomenon in ecological and evolutionary thought. The proponents of this view claim that the niche construction perspective greatly expands the scope of standard evolutionary theory and that niche construction deserves to be treated as a significant evolutionary process in its own right, almost on par with natural selection. Claims have also been advanced about how niche construction theory represents a substantial extension to, and re-orientation of, standard evolutionary theory, which is criticized as being narrowly gene-centric and ignoring the rich complexity and reciprocity of organism-environment interactions. We examine these claims in some detail and show that they do not stand up to scrutiny. We suggest that the manner in which niche construction theory is sought to be pushed in the literature is better viewed as an exercise in academic niche construction whereby, through incessant repetition of largely untenable claims, and the deployment of rhetorically appealing but logically dubious analogies, a receptive climate for a certain sub-discipline is sought to be manufactured within the scientific community. We see this as an unfortunate, but perhaps inevitable, nascent post-truth tendency within science.
biorxiv evolutionary-biology 100-200-users 2017
mixOmics: an R package for ‘omics feature selection and multiple data integration, bioRxiv, 2017-02-15
Abstract: The advent of high throughput technologies has led to a wealth of publicly available ‘omics data coming from different sources, such as transcriptomics, proteomics, metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have been focusing on identifying small subsets of molecules (a ‘molecular signature’) to explain or predict biological conditions, but mainly for a single type of ‘omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a system biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous ‘omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple ‘omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of ‘omics data available from the package.
biorxiv bioinformatics 100-200-users 2017
Genes Affecting Vocal and Facial Anatomy Went Through Extensive Regulatory Divergence in Modern Humans, bioRxiv, 2017-02-09
Summary: Regulatory changes are broadly accepted as key drivers of phenotypic divergence. However, identifying regulatory changes that underlie human-specific traits has proven very challenging. Here, we use 63 DNA methylation maps of ancient and present-day humans, as well as of six chimpanzees, to detect differentially methylated regions that emerged in modern humans after the split from Neanderthals and Denisovans. We show that genes affecting the face and vocal tract went through particularly extensive methylation changes. Specifically, we identify widespread hypermethylation in a network of face- and voice-affecting genes (SOX9, ACAN, COL2A1, NFIX and XYLT1). We propose that these repression patterns appeared after the split from Neanderthals and Denisovans, and that they might have played a key role in shaping the modern human face and vocal tract.
biorxiv genomics 0-100-users 2017
Salmonella enterica genomes recovered from victims of a major 16th century epidemic in Mexico, bioRxiv, 2017-02-09
Abstract: Indigenous populations of the Americas experienced high mortality rates during the early contact period as a result of infectious diseases, many of which were introduced by Europeans. Most of the pathogenic agents that caused these outbreaks remain unknown. Using a metagenomic tool called MALT to search for traces of ancient pathogen DNA, we were able to identify Salmonella enterica in individuals buried in an early contact era epidemic cemetery at Teposcolula-Yucundaa, Oaxaca in southern Mexico. This cemetery is linked to the 1545-1550 CE epidemic locally known as “cocoliztli”, the cause of which has been debated for over a century. Here we present two reconstructed ancient genomes for Salmonella enterica subsp. enterica serovar Paratyphi C, a bacterial cause of enteric fever. We propose that S. Paratyphi C contributed to the population decline during the 1545 cocoliztli outbreak in Mexico.
One Sentence Summary: Genomic evidence of enteric fever identified in an indigenous population from early contact period Mexico.
biorxiv genomics 0-100-users 2017
Quantitative analysis of population-scale family trees using millions of relatives, bioRxiv, 2017-02-08
Abstract: Family trees have vast applications in multiple fields from genetics to anthropology and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. Here, we collected 86 million profiles from publicly-available online data from genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of longevity by inspecting millions of relative pairs and to provide insights to population genetics theories on the dispersion of families. We also report a simple digital procedure to overlay other datasets with our resource in order to empower studies with population-scale genealogical data.
One Sentence Summary: Using massive crowd-sourced genealogy data, we created a population-scale family tree resource for scientific studies.
biorxiv genomics 100-200-users 2017