A Chromosome-Scale Assembly of the Enormous (32 Gb) Axolotl Genome, bioRxiv, 2018-07-20
ABSTRACT: The axolotl (Ambystoma mexicanum) provides critical models for studying regeneration, evolution and development. However, its large genome (~32 gigabases) presents a formidable barrier to genetic analyses. Recent efforts have yielded genome assemblies consisting of thousands of unordered scaffolds that resolve gene structures, but do not yet permit large-scale analyses of genome structure and function. We adapted an established mapping approach to leverage dense SNP typing information and for the first time assemble the axolotl genome into 14 chromosomes. Moreover, we used fluorescence in situ hybridization to verify the structure of these 14 scaffolds and assign each to its corresponding physical chromosome. This new assembly covers 27.3 gigabases and encompasses 94% of annotated gene models on chromosomal scaffolds. We show the assembly’s utility by resolving genome-wide orthologies between the axolotl and other vertebrates, identifying the footprints of historical introgression events that occurred during the development of axolotl genetic stocks, and precisely mapping several phenotypes including a large deletion underlying the cardiac mutant. This chromosome-scale assembly will greatly facilitate studies of the axolotl in biological research.
biorxiv genomics 0-100-users 2018
Combined quantification of intracellular (phospho-)proteins and transcriptomics from fixed single cells, bioRxiv, 2018-06-27
Abstract: Environmental stimuli often lead to heterogeneous cellular responses and transcriptional output. We developed single-cell RNA and Immunodetection (RAID) to allow combined analysis of the transcriptome and intracellular (phospho-)proteins from fixed single cells. RAID successfully recapitulated differentiation-state changes at the protein and mRNA level in human keratinocytes. Furthermore, we show that differentiated keratinocytes that retain high phosphorylated FAK levels, a feature associated with stem cells, also express a selection of stem cell-associated transcripts. Our data demonstrate that RAID allows investigation of heterogeneous cellular responses to environmental signals at the mRNA and phospho-proteome level.
biorxiv genomics 100-200-users 2018
Up, down, and out: optimized libraries for CRISPRa, CRISPRi, and CRISPR-knockout genetic screens, bioRxiv, 2018-06-27
ABSTRACT: Advances in CRISPR-Cas9 technology have enabled the flexible modulation of gene expression at large scale. In particular, the creation of genome-wide libraries for CRISPR knockout (CRISPRko), CRISPR interference (CRISPRi), and CRISPR activation (CRISPRa) has allowed gene function to be systematically interrogated. Here, we evaluate numerous CRISPRko libraries and show that our recently described CRISPRko library (Brunello) is more effective than previously published libraries at distinguishing essential and non-essential genes, providing approximately the same perturbation-level performance improvement over GeCKO libraries as GeCKO provided over RNAi. Additionally, we developed genome-wide libraries for CRISPRi (Dolcetto) and CRISPRa (Calabrese). Negative selection screens showed that Dolcetto substantially outperforms existing CRISPRi libraries with fewer sgRNAs per gene and achieves comparable performance to CRISPRko in the detection of gold-standard essential genes. We also conducted positive selection CRISPRa screens and show that Calabrese outperforms the SAM library approach at detecting vemurafenib resistance genes. We further compare CRISPRa to genome-scale libraries of open reading frames (ORFs). Together, these libraries represent a suite of genome-wide tools to efficiently interrogate gene function with multiple modalities.
biorxiv genomics 0-100-users 2018
Re-identification of genomic data using long range familial searches, bioRxiv, 2018-06-18
Abstract: Consumer genomics databases have reached the scale of millions of individuals. Recently, law enforcement investigators have started to exploit some of these databases to find distant familial relatives, which can lead to a complete re-identification. Here, we leveraged genomic data of 600,000 individuals tested with consumer genomics to investigate the power of such long-range familial searches. We project that half of the searches with European-descent individuals will result in a third cousin or closer match and will provide a search space small enough to permit re-identification using common demographic identifiers. Moreover, in the near future, virtually any European-descent US person could be implicated by this technique. We propose a potential mitigation strategy based on cryptographic signatures that can resolve the issue and discuss policy implications for human subject research.
biorxiv genomics 500+-users 2018
Antimicrobial exposure in sexual networks drives divergent evolution in modern gonococci, bioRxiv, 2018-05-31
Abstract: The sexually transmitted pathogen Neisseria gonorrhoeae is regarded as being on the way to becoming an untreatable superbug. Despite its clinical importance, little is known about its emergence and evolution, and how this corresponds with the introduction of antimicrobials. We present a genome-based phylogeographic analysis of 419 gonococcal isolates from across the globe. Results indicate that modern gonococci originated in Europe or Africa as late as the 16th century and subsequently disseminated globally. We provide evidence that the modern gonococcal population has been shaped by antimicrobial treatment of sexually transmitted and other infections, leading to the emergence of two major lineages with different evolutionary strategies. The well-described multi-resistant lineage is associated with high rates of homologous recombination and infection in high-risk sexual networks where antimicrobial treatment is frequent. A second, multi-susceptible lineage associated with heterosexual networks, where asymptomatic infection is more common, was also identified, with potential implications for infection control.
biorxiv genomics 0-100-users 2018
Thousands of large-scale RNA sequencing experiments yield a comprehensive new human gene list and reveal extensive transcriptional noise, bioRxiv, 2018-05-28
Abstract: We assembled the sequences from 9,795 RNA sequencing experiments, collected from 31 human tissues and hundreds of subjects as part of the GTEx project, to create a new, comprehensive catalog of human genes and transcripts. The new human gene database contains 43,162 genes, of which 21,306 are protein-coding and 21,856 are noncoding, and a total of 323,824 transcripts, for an average of 7.5 transcripts per gene. Our expanded gene list includes 4,998 novel genes (1,178 coding and 3,819 noncoding) and 97,511 novel splice variants of protein-coding genes as compared to the most recent human gene catalogs. We detected over 30 million additional transcripts at more than 650,000 sites, nearly all of which are likely to be nonfunctional, revealing a heretofore unappreciated amount of transcriptional noise in human cells.
biorxiv genomics 500+-users 2018