Rapid de novo assembly of the European eel genome from nanopore sequencing reads, bioRxiv, 2017-01-21
We have sequenced the genome of the endangered European eel using the MinION by Oxford Nanopore, and assembled these data using a novel algorithm specifically designed for large eukaryotic genomes. For this 860 Mbp genome, the entire computational process takes two days on a single CPU. The resulting genome assembly significantly improves on a previous draft based on short reads only, both in terms of contiguity (N50 1.2 Mbp) and structural quality. This combination of affordable nanopore sequencing and light-weight assembly promises to make high-quality genomic resources accessible for many non-model plants and animals.
biorxiv genomics 100-200-users 2017
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples, bioRxiv, 2017-01-10
Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples without isolation remains challenging for viruses such as Zika, where metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence complete genomes comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimised library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved starting with clinical samples in 1-2 days following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas.
biorxiv genomics 100-200-users 2017
Whole genome sequencing and assembly of a Caenorhabditis elegans genome with complex genomic rearrangements using the MinION sequencing device, bioRxiv, 2017-01-09
Advances in 3rd generation sequencing have opened new possibilities for ‘benchtop’ whole genome sequencing. The MinION is a portable device that uses nanopore technology and can sequence long DNA molecules. MinION long reads are well suited for sequencing and de novo assembly of complex genomes with large repetitive elements. Long reads also facilitate the identification of complex genomic rearrangements such as those observed in tumor genomes. To assess the feasibility of the de novo assembly of large complex genomes using both MinION and Illumina platforms, we sequenced the genome of a Caenorhabditis elegans strain that contains a complex acetaldehyde-induced rearrangement and a biolistic bombardment-mediated insertion of a GFP containing plasmid. Using ∼5.8 gigabases of MinION sequence data, we were able to assemble a C. elegans genome containing 145 contigs (N50 contig length = 1.22 Mb) that covered >99% of the 100,286,401 bp reference genome. In contrast, using ∼8.04 gigabases of Illumina sequence data, we were able to assemble a C. elegans genome in 38,645 contigs (N50 contig length = ∼26 kb) containing 117 Mb. From the MinION genome assembly we identified the complex structures of both the acetaldehyde-induced mutation and the biolistic-mediated insertion. To date, this is the largest genome to be assembled exclusively from MinION data and is the first demonstration that the long reads of MinION sequencing can be used for whole genome assembly of large (100 Mb) genomes and the elucidation of complex genomic rearrangements.
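Note on the N50 figures quoted above: N50 is conventionally the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below is only an illustration of that standard definition, not the authors' pipeline; the function name `n50` and the toy contig lengths are hypothetical and are not taken from either assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (standard definition)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2            # half of the assembly length
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half the total
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # total is 300; 80 + 70 reaches 150, so this prints 70
```

Because the metric depends only on the length distribution, a few very long contigs can give a high N50 even when many short contigs remain, which is why contig counts are usually reported alongside it.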
biorxiv genomics 100-200-users 2017
Targeted degradation of CTCF decouples local insulation of chromosome domains from higher-order genomic compartmentalization, bioRxiv, 2016-12-22
The molecular mechanisms underlying folding of mammalian chromosomes remain poorly understood. The transcription factor CTCF is a candidate regulator of chromosomal structure. Using the auxin-inducible degron system in mouse embryonic stem cells, we show that CTCF is absolutely and dose-dependently required for looping between CTCF target sites and segmental organization into topologically associating domains (TADs). Restoring CTCF reinstates proper architecture on altered chromosomes, indicating a powerful instructive function for CTCF in chromatin folding, and CTCF remains essential for TAD organization in non-dividing cells. Surprisingly, active and inactive genome compartments remain properly segregated upon CTCF depletion, revealing that compartmentalization of mammalian chromosomes emerges independently of proper insulation of TADs. Further, our data supports that CTCF mediates transcriptional insulator function through enhancer-blocking but not direct chromatin barrier activity. These results define the functions of CTCF in chromosome folding, and provide new fundamental insights into the rules governing mammalian genome organization.
biorxiv genomics 200-500-users 2016
Improved maize reference genome with single molecule technologies, bioRxiv, 2016-12-20
Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate elucidation of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here, we report the assembly and annotation of maize, a genetic and agricultural model species, using Single Molecule Real-Time (SMRT) sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and significant improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed over 130,000 intact transposable elements (TEs), allowing us to identify TE lineage expansions unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by SMRT sequencing. In addition, comparative optical mapping of two other inbreds revealed a prevalence of deletions in the low gene density region and maize lineage-specific genes.
biorxiv genomics 100-200-users 2016
Cell cycle dynamics of chromosomal organisation at single-cell resolution, bioRxiv, 2016-12-16
Chromosomes in proliferating metazoan cells undergo dramatic structural metamorphoses every cell cycle, alternating between a highly condensed mitotic structure facilitating chromosome segregation, and a decondensed interphase structure accommodating transcription, gene silencing and DNA replication. These cyclical structural transformations have been evident under the microscope for over a century, but their molecular-level analysis is still lacking. Here we use single-cell Hi-C to study chromosome conformations in thousands of individual cells, and discover a continuum of cis-interaction profiles that finely position individual cells along the cell cycle. We show that chromosomal compartments, topological domains (TADs), contact insulation and long-range loops, all defined by ensemble Hi-C maps, are governed by distinct cell cycle dynamics. In particular, DNA replication correlates with build-up of compartments and reduction in TAD insulation, while loops are generally stable from G1 through S and G2. Analysing whole genome 3D structural models using haploid cell data, we discover a radial architecture of chromosomal compartments with distinct epigenomic signatures. Our single-cell data creates an essential new paradigm for the re-interpretation of chromosome conformation maps through the prism of the cell cycle.
biorxiv genomics 100-200-users 2016