
Template plasmid integration in germline genome-edited cattle, bioRxiv, 2019-07-29

Abstract: We analyzed publicly available whole genome sequencing data from cattle which were germline genome-edited to introduce polledness. Our analysis discovered the unintended heterozygous integration of the plasmid and a second copy of the repair template sequence, at the target site. Our finding underscores the importance of employing screening methods suited to reliably detect the unintended integration of plasmids and multiple template copies.

biorxiv genomics 100-200-users 2019

Biological Plasticity Rescues Target Activity in CRISPR Knockouts, bioRxiv, 2019-07-27

Abstract: Gene knockouts (KOs) are efficiently engineered through CRISPR-Cas9-induced frameshift mutations. While DNA editing efficiency is readily verified by DNA sequencing, a systematic understanding of the efficiency of protein elimination has been lacking. Here, we devised an experimental strategy combining RNA-seq and triple-stage mass spectrometry (SPS-MS3) to characterize 193 genetically verified deletions targeting 136 distinct genes generated by CRISPR-induced frameshifts in HAP1 cells. We observed residual protein expression for about one third of the quantified targets, at variable levels from low to original, and identified two causal mechanisms: translation reinitiation leading to N-terminally truncated target proteins, or skipping of the edited exon leading to protein isoforms with internal sequence deletions. Detailed analysis of three truncated targets, BRD4, DNMT1 and NGLY1, revealed partial preservation of protein function. Our results imply that systematic characterization of residual protein expression or function in CRISPR-Cas9 generated KO lines is necessary for phenotype interpretation.

biorxiv bioengineering 100-200-users 2019

Efficient de novo assembly of eleven human genomes using PromethION sequencing and a novel nanopore toolkit, bioRxiv, 2019-07-26

Abstract: Present workflows for producing human genome assemblies from long-read technologies have cost and production time bottlenecks that prohibit efficient scaling to large cohorts. We demonstrate an optimized PromethION nanopore sequencing method for eleven human genomes. The sequencing, performed on one machine in nine days, achieved an average 63x coverage, 42 Kb read N50, 90% median read identity and 6.5x coverage in 100 Kb+ reads using just three flow cells per sample. To assemble these data we introduce new computational tools Shasta - a de novo long read assembler, and MarginPolish & HELEN - a suite of nanopore assembly polishing algorithms. On a single commercial compute node Shasta can produce a complete human genome assembly in under six hours, and MarginPolish & HELEN can polish the result in just over a day, achieving 99.9% identity (QV30) for haploid samples from nanopore reads alone. We evaluate assembly performance for diploid, haploid and trio-binned human samples in terms of accuracy, cost, and time and demonstrate improvements relative to current state-of-the-art methods in all areas. We further show that addition of proximity ligation (Hi-C) sequencing yields near chromosome-level scaffolds for all eleven genomes.
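
Note on the metrics: the abstract quotes "99.9% identity (QV30)" and a "42 Kb read N50". The sketch below shows how such figures are conventionally computed, assuming the standard Phred-scaled definition of QV and the usual length-weighted definition of N50. The function names and toy read lengths are illustrative only and are not taken from Shasta, MarginPolish or HELEN.

```python
import math


def identity_to_qv(identity):
    """Phred-scaled quality value: QV = -10 * log10(error rate)."""
    error_rate = 1.0 - identity
    if error_rate <= 0.0:
        raise ValueError("identity must be below 1.0 for a finite QV")
    return -10.0 * math.log10(error_rate)


def read_n50(lengths):
    """Largest length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # 99.9% identity corresponds to roughly QV30.
    print(round(identity_to_qv(0.999), 2))
    # Hypothetical read lengths (in bases), for illustration only.
    print(read_n50([2, 3, 4, 5, 10]))
```

Because N50 is weighted by bases rather than by read count, a few very long reads can raise it substantially even when most reads are short.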

biorxiv bioinformatics 200-500-users 2019

On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data, bioRxiv, 2019-07-26

Abstract: Single-cell RNA sequencing (scRNA-seq) has quickly become an empowering technology to profile the transcriptomes of individual cells on a large scale. Many early analyses of differential expression have aimed at identifying differences between subpopulations, and thus are focused on finding subpopulation markers either in a single sample or across multiple samples. More generally, such methods can compare expression levels in multiple sets of cells, thus leading to cross-condition analyses. However, given the emergence of replicated multi-condition scRNA-seq datasets, an area of increasing focus is making sample-level inferences, termed here as differential state analysis. For example, one could investigate the condition-specific responses of cell subpopulations measured from patients from each condition; however, it is not clear which statistical framework best handles this situation. In this work, we surveyed the methods available to perform cross-condition differential state analyses, including cell-level mixed models and methods based on aggregated “pseudobulk” data. We developed a flexible simulation platform that mimics both single and multi-sample scRNA-seq data and provide robust tools for multi-condition analysis within the muscat R package.
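
Note on "pseudobulk" aggregation: the idea mentioned in the abstract can be sketched as summing per-cell counts within each (sample, subpopulation) pair, so that downstream tests treat samples rather than cells as replicates. The following is a toy illustration under that assumption. It is written in Python for brevity and is not the muscat API; the function name, input layout and example values are all hypothetical.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum gene counts over all cells sharing a (sample, subpopulation) pair.

    `cells` is an iterable of (sample_id, subpopulation, {gene: count}) tuples.
    Returns {(sample_id, subpopulation): {gene: summed_count}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, subpopulation, counts in cells:
        for gene, count in counts.items():
            totals[(sample_id, subpopulation)][gene] += count
    # Convert nested defaultdicts to plain dicts for a stable return type.
    return {key: dict(genes) for key, genes in totals.items()}


if __name__ == "__main__":
    # Hypothetical cells: two samples, one subpopulation each.
    toy_cells = [
        ("patient1", "T cell", {"GENE_A": 3, "GENE_B": 0}),
        ("patient1", "T cell", {"GENE_A": 1, "GENE_B": 2}),
        ("patient2", "T cell", {"GENE_A": 5}),
    ]
    for key, profile in pseudobulk(toy_cells).items():
        print(key, profile)
```

Each aggregated profile then behaves like one bulk RNA-seq sample, which is what allows sample-level comparison between conditions.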

biorxiv bioinformatics 100-200-users 2019

Advances in epigenetics link genetics to the environment and disease, Nature, 2019-07-24

Epigenetic research has accelerated rapidly in the twenty-first century, generating justified excitement and hope, but also a degree of hype. Here we review how the field has evolved over the last few decades and reflect on some of the recent advances that are changing our understanding of biology. We discuss the interplay between epigenetics and DNA sequence variation as well as the implications of epigenetics for cellular memory and plasticity. We consider the effects of the environment and both intergenerational and transgenerational epigenetic inheritance on biology, disease and evolution. Finally, we present some new frontiers in epigenetics with implications for human health.

nature genetics 500+-users 2019

Innovations in Primate Interneuron Repertoire, bioRxiv, 2019-07-24

Abstract: Primates and rodents, which descended from a common ancestor more than 90 million years ago, exhibit profound differences in behavior and cognitive capacity. Modifications, specializations, and innovations to brain cell types may have occurred along each lineage. We used Drop-seq to profile RNA expression in more than 184,000 individual telencephalic interneurons from humans, macaques, marmosets, and mice. Conserved interneuron types varied significantly in abundance and RNA expression between mice and primates, but varied much more modestly among primates. In adult primates, the expression patterns of dozens of genes exhibited spatial expression gradients among neocortical interneurons, suggesting that adult neocortical interneurons are imprinted by their local cortical context. In addition, we found that an interneuron type previously associated with the mouse hippocampus—the “ivy cell”, which has neurogliaform characteristics—has become abundant across the neocortex of humans, macaques, and marmosets. The most striking innovation was subcortical: we identified an abundant striatal interneuron type in primates that had no molecularly homologous cell population in mouse striatum, cortex, thalamus, or hippocampus. These interneurons, which expressed a unique combination of transcription factors, receptors, and neuropeptides, including the neuropeptide TAC3, constituted almost 30% of striatal interneurons in marmosets and humans. Understanding how gene and cell-type attributes changed or persisted over the evolutionary divergence of primates and rodents will guide the choice of models for human brain disorders and mutations and help to identify the cellular substrates of expanded cognition in humans and other primates.

biorxiv neuroscience 100-200-users 2019