Cell cycle dynamics of chromosomal organisation at single-cell resolution, bioRxiv, 2016-12-16
Summary: Chromosomes in proliferating metazoan cells undergo dramatic structural metamorphoses every cell cycle, alternating between a highly condensed mitotic structure facilitating chromosome segregation, and a decondensed interphase structure accommodating transcription, gene silencing and DNA replication. These cyclical structural transformations have been evident under the microscope for over a century, but their molecular-level analysis is still lacking. Here we use single-cell Hi-C to study chromosome conformations in thousands of individual cells, and discover a continuum of cis-interaction profiles that finely position individual cells along the cell cycle. We show that chromosomal compartments, topological domains (TADs), contact insulation and long-range loops, all defined by ensemble Hi-C maps, are governed by distinct cell cycle dynamics. In particular, DNA replication correlates with build-up of compartments and reduction in TAD insulation, while loops are generally stable from G1 through S and G2. Analysing whole genome 3D structural models using haploid cell data, we discover a radial architecture of chromosomal compartments with distinct epigenomic signatures. Our single-cell data creates an essential new paradigm for the re-interpretation of chromosome conformation maps through the prism of the cell cycle.
biorxiv genomics 100-200-users 2016

De novo Identification of DNA Modifications Enabled by Genome-Guided Nanopore Signal Processing, bioRxiv, 2016-12-16
Abstract: Advances in nanopore sequencing technology have enabled investigation of the full catalogue of covalent DNA modifications. We present the first algorithm for the identification of modified nucleotides without the need for prior training data along with the open source software implementation, nanoraw. Nanoraw accurately assigns contiguous raw nanopore signal to genomic positions, enabling novel data visualization, and increasing power and accuracy for the discovery of covalently modified bases in native DNA. Ground truth case studies utilizing synthetically methylated DNA show the capacity to identify three distinct methylation marks, 4mC, 5mC, and 6mA, in seven distinct sequence contexts without any changes to the algorithm. We demonstrate quantitative reproducibility simultaneously identifying 5mC and 6mA in native E. coli across biological replicates processed in different labs. Finally we propose a pipeline for the comprehensive discovery of DNA modifications in any genome without a priori knowledge of their chemical identities.
biorxiv bioinformatics 100-200-users 2016

Creating a universal SNP and small indel variant caller with deep neural networks, bioRxiv, 2016-12-15
Abstract: Next-generation sequencing (NGS) is a rapidly evolving set of technologies that can be used to determine the sequence of an individual’s genome [1] by calling genetic variants present in an individual using billions of short, errorful sequence reads [2]. Despite more than a decade of effort and thousands of dedicated researchers, the hand-crafted and parameterized statistical models used for variant calling still produce thousands of errors and missed variants in each genome [3,4]. Here we show that a deep convolutional neural network [5] can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships (likelihoods) between images of read pileups around putative variant sites and ground-truth genotype calls. This approach, called DeepVariant, outperforms existing tools, even winning the “highest performance” award for SNPs in an FDA-administered variant calling challenge. The learned model generalizes across genome builds and even to other mammalian species, allowing non-human sequencing projects to benefit from the wealth of human ground truth data. We further show that, unlike existing tools which perform well on only a specific technology, DeepVariant can learn to call variants in a variety of sequencing technologies and experimental designs, from deep whole genomes from 10X Genomics to Ion Ampliseq exomes. DeepVariant represents a significant step from expert-driven statistical modeling towards more automatic deep learning approaches for developing software to interpret biological instrumentation data.
biorxiv genomics 200-500-users 2016

Ten simple rules for structuring papers, bioRxiv, 2016-12-15
Abstract: Good scientific writing is essential to career development and to the progress of science. A well-structured manuscript allows readers and reviewers to get excited about the subject matter, to understand and verify the paper’s contributions, and to integrate these contributions into a broader context. However, many scientists struggle with producing high-quality manuscripts and typically get little training in paper writing. Focusing on how readers consume information, we present a set of 10 simple rules to help you get across the main idea of your paper. These rules are designed to make your paper more influential and the process of writing more efficient and pleasurable.
biorxiv scientific-communication-and-education 500+-users 2016

C. elegans discriminate colors without eyes or opsins, bioRxiv, 2016-12-09
Abstract: Here we establish that, contrary to expectations, Caenorhabditis elegans nematode worms possess a color discrimination system despite lacking any opsin or other known visible light photoreceptor genes. We found that white light guides C. elegans foraging decisions away from harmful bacteria that secrete a blue pigment toxin. Absorption of amber light by this blue pigment toxin alters the color of light sensed by the worm, and thereby triggers an increase in avoidance. By combining narrow-band blue and amber light sources, we demonstrated that detection of the specific blue-to-amber ratio by the worm guides its foraging decision. These behavioral and psychophysical studies thus establish the existence of a color detection system that is distinct from those of other animals.
biorxiv neuroscience 200-500-users 2016

Population history of the Sardinian people inferred from whole-genome sequencing, bioRxiv, 2016-12-09
Abstract: The population of the Mediterranean island of Sardinia has made important contributions to genome-wide association studies of traits and diseases. The history of the Sardinian population has also been the focus of much research, and in recent ancient DNA (aDNA) studies, Sardinia has provided unique insight into the peopling of Europe and the spread of agriculture. In this study, we analyze whole-genome sequences of 3,514 Sardinians to address hypotheses regarding the founding of Sardinia and its relation to the peopling of Europe, including examining fine-scale substructure, population size history, and signals of admixture. We find the population of the mountainous Gennargentu region shows elevated genetic isolation with higher levels of ancestry associated with mainland Neolithic farmers and depleted ancestry associated with more recent Bronze Age Steppe migrations on the mainland. Notably, the Gennargentu region also has elevated levels of pre-Neolithic hunter-gatherer ancestry and increased affinity to Basque populations. Further, allele sharing with pre-Neolithic and Neolithic mainland populations is larger on the X chromosome compared to the autosome, providing evidence for a sex-biased demographic history in Sardinia. These results give new insight into the demography of ancestral Sardinians and help further the understanding of sharing of disease risk alleles between Sardinia and mainland populations.
biorxiv genetics 0-100-users 2016