Flexible analysis of transcriptome assemblies with Ballgown, bioRxiv, 2014-03-31
We have built a statistical package called Ballgown for estimating differential expression of genes, transcripts, or exons from RNA sequencing experiments. Ballgown is designed to work with the popular Cufflinks transcript assembly software and uses well-motivated statistical methods to provide estimates of changes in expression. It permits statistical analysis at the transcript level for a wide variety of experimental designs, allows adjustment for confounders, and handles studies with continuous covariates. Ballgown provides improved statistical significance estimates as compared to the Cuffdiff differential expression tool included with Cufflinks. We demonstrate the flexibility of the Ballgown package by re-analyzing 667 samples from the GEUVADIS study to identify transcript-level eQTLs and identify non-linear artifacts in transcript data. Our package is freely available from https://github.com/alyssafrazee/ballgown.
biorxiv bioinformatics 0-100-users 2014
Towards a new history and geography of human genes informed by ancient DNA, bioRxiv, 2014-03-22
Genetic information contains a record of the history of our species, and technological advances have transformed our ability to access this record. Many studies have used genome-wide data from populations today to learn about the peopling of the globe and subsequent adaptation to local conditions. Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture and population replacement have been the rule rather than the exception in human history. In light of this, we argue that it is time to critically re-evaluate current views of the peopling of the globe and the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically-known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.
biorxiv evolutionary-biology 0-100-users 2014
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2, bioRxiv, 2014-02-20
In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq data, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data. DESeq2 uses shrinkage estimation for dispersions and fold changes to improve stability and interpretability of the estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression and facilitates downstream tasks such as gene ranking and visualization. DESeq2 is available as an R/Bioconductor package.
biorxiv bioinformatics 0-100-users 2014
Analysis of the study of the cerebellar pinceau by Korn and Axelrad, bioRxiv, 2013-12-04
The axon initial segment of each cerebellar Purkinje cell is ensheathed by basket cell axons in a structure called the pinceau, which is largely devoid of chemical synapses and gap junctions. These facts and ultrastructural similarities with the axon cap of the teleost Mauthner cell led to the conjecture that the pinceau mediates ephaptic (via the extracellular field) inhibition. Korn and Axelrad published a study in 1980 in which they reported confirmation of this conjecture. We have analysed their results and show that most are likely to be explained by an artefactual signal arising from the massive stimulation of parallel fibres they employed. We reproduce their experiments and confirm that all of their results are consistent with this artefact. Their data therefore provide no evidence regarding the operation of the pinceau.
biorxiv neuroscience 0-100-users 2013
Human genetics and clinical aspects of neurodevelopmental disorders, bioRxiv, 2013-11-30
There are ~12 billion nucleotides in every cell of the human body, and there are ~25-100 trillion cells in each human body. Given somatic mosaicism, epigenetic changes and environmental differences, no two human beings are the same, particularly as there are only ~7 billion people on the planet. One of the next great challenges for studying human genetics will be to acknowledge and embrace complexity. Every human is unique, and the study of human disease phenotypes (and phenotypes in general) will be greatly enriched by moving from a deterministic to a more stochastic/probabilistic model. The dichotomous distinction between simple and complex diseases is completely artificial, and we argue instead for a model that considers a spectrum of diseases that are variably manifesting in each person. The rapid adoption of whole genome sequencing (WGS) and the Internet-mediated networking of people promise to yield more insight into this century-old debate. Comprehensive ancestry tracking and detailed family history data, when combined with WGS or at least cascade-carrier screening, might eventually facilitate a degree of genetic prediction for some diseases in the context of their familial and ancestral etiologies. However, it is important to remain humble, as our current state of knowledge is not yet sufficient, and in principle, any number of nucleotides in the genome, if mutated or modified in a certain way and at a certain time and place, might influence some phenotype during embryogenesis or postnatal life.
biorxiv genetics 0-100-users 2013