Genome-wide DNA methylation and gene expression patterns reflect genetic ancestry and environmental differences across the Indonesian archipelago, bioRxiv, 2019-07-16
Abstract: Indonesia is the world’s fourth most populous country, host to striking levels of human diversity, regional patterns of admixture, and varying degrees of introgression from both Neanderthals and Denisovans. However, it has been largely excluded from the human genomics sequencing boom of the last decade. To serve as a benchmark dataset of molecular phenotypes across the region, we generated genome-wide CpG methylation and gene expression measurements in over 100 individuals from three locations that capture the major genomic and geographical axes of diversity across the Indonesian archipelago. Investigating between- and within-island differences, we find up to 10% of tested genes are differentially expressed between the islands of Mentawai (Sumatra) and New Guinea. Variation in gene expression is closely associated with DNA methylation, with expression levels of 9.7% of genes strongly correlating with nearby CpG methylation, and many of these genes being differentially expressed between islands. Genes identified in our differential expression and methylation analyses are enriched in pathways involved in immunity, highlighting Indonesia’s tropical role as a source of infectious disease diversity and the strong selective pressures these diseases have exerted on humans. Finally, we identify robust within-island variation in DNA methylation and gene expression, likely driven by very local environmental differences across sampling sites. Together, these results strongly suggest complex relationships between DNA methylation, transcription, archaic hominin introgression and immunity, all jointly shaped by the environment. This has implications for the application of genomic medicine, both in critically understudied Indonesia and globally, and will allow a better understanding of the interacting roles of genomic and environmental factors shaping molecular and complex phenotypes.
biorxiv genomics 0-100-users 2019
A single cell framework for multi-omic analysis of disease identifies malignant regulatory signatures in mixed phenotype acute leukemia, bioRxiv, 2019-07-10
Abstract: In order to identify the molecular determinants of human diseases, such as cancer, that arise from a diverse range of tissues, it is necessary to accurately distinguish normal and pathogenic cellular programs [1–3]. Here we present a novel approach for single-cell multi-omic deconvolution of healthy and pathological molecular signatures within phenotypically heterogeneous malignant cells. By first creating immunophenotypic, transcriptomic and epigenetic single-cell maps of hematopoietic development from healthy peripheral blood and bone marrow mononuclear cells, we identify cancer-specific transcriptional and chromatin signatures from single cells in a cohort of mixed phenotype acute leukemia (MPAL) clinical samples. MPALs are a high-risk subtype of acute leukemia characterized by a heterogeneous malignant cell population expressing both myeloid and lymphoid lineage-specific markers [4, 5]. Our results reveal widespread heterogeneity in the pathogenetic gene regulatory and expression programs across patients, yet relatively consistent changes within patients even across malignant cells occupying diverse portions of the hematopoietic lineage. An integrative analysis of transcriptomic and epigenetic maps identifies 91,601 putative gene-regulatory interactions and classifies a number of transcription factors that regulate leukemia-specific genes, including RUNX1-linked regulatory elements proximal to CD69. This work provides a template for integrative, multi-omic analysis for the interpretation of pathogenic molecular signatures in the context of developmental origin.
biorxiv genomics 100-200-users 2019
Recent evolutionary history of tigers highlights contrasting roles of genetic drift and selection, bioRxiv, 2019-07-09
Abstract: Tigers are among the most charismatic of endangered species, yet little is known about their evolutionary history. We sequenced 65 individual genomes representing the extant tiger geographic range. We found strong genetic differentiation between putative tiger subspecies, divergence within the last 10,000 years, and demographic histories dominated by population bottlenecks. Indian tigers have substantial genetic variation and substructure stemming from population isolation and intense recent bottlenecks here. Despite high genetic diversity across India, individual tigers host longer runs of homozygosity, potentially suggesting recent inbreeding here. Amur tiger genomes revealed the strongest signals of selection and over-representation of gene ontology categories potentially involved in metabolic adaptation to cold. Novel insights highlight the antiquity of northeast Indian tigers. Our results demonstrate recent evolution, with differential isolation, selection and drift in extant tiger populations, providing insights for conservation and future survival.
biorxiv genomics 0-100-users 2019
Transcriptome assembly from long-read RNA-seq alignments with StringTie2, bioRxiv, 2019-07-08
Abstract: RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new computational methods to handle the high error rate of long-read sequencing technology, which previous assemblers could not tolerate. It also offers the ability to work with full-length super-reads assembled from short reads, which further improves the quality of assemblies. On 33 short-read datasets from humans and two plant species, StringTie2 is 47.3% more precise and 3.9% more sensitive than Scallop. On multiple long-read datasets, StringTie2 on average correctly assembles 8.3 and 2.6 times as many transcripts as FLAIR and Traphlor, respectively, with substantially higher precision. StringTie2 is also faster and has a smaller memory footprint than all comparable tools.
biorxiv genomics 100-200-users 2019
Linking transcriptome and chromatin accessibility in nanoliter droplets for single-cell sequencing, bioRxiv, 2019-07-04
Linked profiling of transcriptome and chromatin accessibility from single cells can provide unprecedented insights into cellular status. Here we developed a droplet-based Single-Nucleus chromatin Accessibility and mRNA Expression sequencing (SNARE-seq) assay, which we used to profile neonatal and adult mouse cerebral cortices. To demonstrate the strength of single-cell dual-omics profiling, we reconstructed transcriptome and epigenetic landscapes of cell types, uncovered lineage-specific accessible sites, and connected dynamics of promoter accessibility with transcription during neurogenesis.
biorxiv genomics 100-200-users 2019
Mutational signatures are jointly shaped by DNA damage and repair, bioRxiv, 2019-06-29
Summary: Mutations arise when DNA lesions escape DNA repair. To delineate the contributions of DNA damage and DNA repair deficiency to mutagenesis, we sequenced 2,721 genomes of 54 C. elegans strains, each either wild-type or deficient for a specific DNA repair gene, upon exposure to 12 different genotoxins. Combining genotoxins and repair deficiency leads to differential mutation rates or new mutational signatures in more than one third of experiments. Translesion synthesis polymerase deficiencies show dramatic and diverging effects. Knockout of Polκ dramatically exacerbates the mutagenicity of alkylating agents; conversely, Polζ deficiency reduces alkylation- and UV-induced substitution rates. Examples of DNA damage-repair deficiency interactions are also found in cancer genomes, although cases of hypermutation are surprisingly rare despite signs of positive selection in a number of DNA repair genes. Nevertheless, cancer risk may be substantially elevated even by small increases in mutagenicity according to evolutionary multi-hit theory. Overall, our data underscore how mutagenesis is a joint product of DNA damage and DNA repair, implying that mutational signatures may be more variable than currently anticipated.
Highlights:
- Combining exposure to DNA damaging agents and DNA repair deficiency in C. elegans leads to altered mutation rates and new mutational signatures
- Mutagenic effects of genotoxic exposures are generally exacerbated by DNA repair deficiency
- Mutagenesis of UVB and alkylating agents is reduced in translesion synthesis polymerase deficiencies
- Human cancer genomes contain examples of DNA damage-repair interactions, but mutations in DNA repair genes usually only associate with moderate mutator phenotypes, in line with evolutionary theory
biorxiv genomics 100-200-users 2019