Diversification and collapse of a telomere elongation mechanism, bioRxiv, 2018-10-18

Abstract: In virtually all eukaryotes, telomerase counteracts chromosome erosion by adding repetitive sequence to terminal ends. Drosophila melanogaster instead relies on specialized retrotransposons that insert preferentially at telomeres. This exchange of goods between host and mobile element (wherein the mobile element provides an essential genome service and the host provides a hospitable niche for mobile element propagation) has been called a ‘genomic symbiosis’. However, these telomere-specialized, ‘jockey’ family elements may actually evolve to selfishly over-replicate in the genomes that they ostensibly serve. Under this intra-genomic conflict model, we expect rapid diversification of telomere-specialized retrotransposon lineages and, possibly, the breakdown of this tenuous relationship. Here we report data consistent with both predictions. Searching the raw reads of the 15-million-year-old ‘melanogaster species group’, we generated de novo jockey retrotransposon consensus sequences and used phylogenetic tree-building to delineate four distinct telomere-associated lineages. Recurrent gains, losses, and replacements account for this striking retrotransposon lineage diversity. Moreover, an ancestrally telomere-specialized element has ‘escaped,’ residing now throughout the genome of D. rhopaloa. In D. biarmipes, telomere-specialized elements have disappeared completely. De novo assembly of long-reads and cytogenetics confirmed this species-specific collapse of retrotransposon-dependent telomere elongation. Instead, telomere-restricted satellite DNA and DNA transposon fragments occupy its terminal ends. We infer that D. biarmipes relies instead on a recombination-based mechanism conserved from yeast to flies to humans.
Combined with previous reports of adaptive evolution at host proteins that regulate telomere length, telomere-associated retrotransposon diversification and disappearance offer compelling evidence that intra-genomic conflict shapes Drosophila telomere evolution.

biorxiv evolutionary-biology 0-100-users 2018

Mutation detection in thousands of acute myeloid leukemia cells using single cell RNA-sequencing, bioRxiv, 2018-10-18

Abstract: Virtually all tumors are genetically heterogeneous, containing subclonal populations of cells that are defined by distinct mutations [1]. Subclones can have unique phenotypes that influence disease progression [2], but these phenotypes are difficult to characterize because subclones usually cannot be physically purified, and bulk gene expression measurements obscure interclonal differences. Single-cell RNA-sequencing has revealed transcriptional heterogeneity within a variety of tumor types, but it is unclear how this expression heterogeneity relates to subclonal genetic events, for example whether particular expression clusters correspond to mutationally defined subclones [3,4,5,6-9]. To address this question, we developed an approach that integrates enhanced whole genome sequencing (eWGS) with the 10x Genomics Chromium Single Cell 5’ Gene Expression workflow (scRNA-seq) to directly link expressed mutations with transcriptional profiles at single cell resolution. Using bone marrow samples from five cases of primary human Acute Myeloid Leukemia (AML), we generated WGS and scRNA-seq data for each case. Duplicate single cell libraries representing a median of 20,474 cells per case were generated from the bone marrow of each patient. Although the libraries were 5’ biased, we detected expressed mutations in cDNAs at distances up to 10 kbp from the 5’ ends of well-expressed genes, allowing us to identify hundreds to thousands of cells with AML-specific somatic mutations in every case. These data made it possible to distinguish AML cells (including normal-karyotype AML cells) from surrounding normal cells, to study tumor differentiation and intratumoral expression heterogeneity, to identify expression signatures associated with subclonal mutations, and to find cell surface markers that could be used to purify subclones for further study.
The data also revealed transcriptional heterogeneity that occurred independently of subclonal mutations, suggesting that additional factors drive epigenetic heterogeneity. This integrative approach for connecting genotype to phenotype in AML cells is broadly applicable for analysis of any sample that is phenotypically and genetically heterogeneous.

biorxiv cancer-biology 100-200-users 2018

A computational framework for systematic exploration of biosynthetic diversity from large-scale genomic data, bioRxiv, 2018-10-17

Abstract: Genome mining has become a key technology to explore and exploit natural product diversity through the identification and analysis of biosynthetic gene clusters (BGCs). Initially, this was performed on a single-genome basis; currently, the process is being scaled up to large-scale mining of pan-genomes of entire genera, complete strain collections and metagenomic datasets from which thousands of bacterial genomes can be extracted at once. However, no bioinformatic framework is currently available for the effective analysis of datasets of this size and complexity. Here, we provide a streamlined computational workflow, tightly integrated with antiSMASH and MIBiG, that consists of two new software tools, BiG-SCAPE and CORASON. BiG-SCAPE facilitates rapid calculation and interactive visual exploration of BGC sequence similarity networks, grouping gene clusters at multiple hierarchical levels, and includes a ‘glocal’ alignment mode that accurately groups both complete and fragmented BGCs. CORASON employs a phylogenomic approach to elucidate the detailed evolutionary relationships between gene clusters by computing high-resolution multi-locus phylogenies of all BGCs within and across gene cluster families (GCFs), and allows researchers to comprehensively identify all genomic contexts in which particular biosynthetic gene cassettes are found. We validate BiG-SCAPE by correlating its GCF output to metabolomic data across 403 actinobacterial strains. Furthermore, we demonstrate the discovery potential of the platform by using CORASON to comprehensively map the phylogenetic diversity of the large detoxin/rimosamide gene cluster clan, prioritizing three new detoxin families for subsequent characterization of six new analogs using isotopic labeling and analysis of tandem mass spectrometric data.

biorxiv bioinformatics 100-200-users 2018

 
