A revised model for promoter competition based on multi-way chromatin interactions, bioRxiv, 2019-04-18
Abstract: Specific communication between gene promoters and enhancers is critical for accurate regulation of gene expression. However, it remains unclear how specific interactions between multiple regulatory elements and genes contained within a single chromatin domain are coordinated. Recent technological advances allow for the investigation of multi-way chromatin interactions at single alleles in individual nuclei. This can provide insights into how multiple regulatory elements cooperate or compete for transcriptional activation. We have used these techniques in a mouse model in which the α-globin domain is extended to include several additional genes. This allows us to determine how the interactions of the α-globin super-enhancer are distributed between multiple promoters in a single domain. Our data show that gene promoters do not form mutually exclusive interactions with the super-enhancer, but all interact simultaneously in a single complex. These findings show that promoters within the same domain do not structurally compete for interactions with enhancers, but form a regulatory hub structure, consistent with the recent model of transcriptional activation in phase-separated nuclear condensates.
biorxiv genomics 100-200-users 2019

Droplet-based combinatorial indexing for massive scale single-cell epigenomics, bioRxiv, 2019-04-18
Abstract: While recent technical advancements have facilitated the mapping of epigenomes at single-cell resolution, the throughput and quality of these methods have limited the widespread adoption of these technologies. Here, we describe a droplet microfluidics platform for single-cell assay for transposase accessible chromatin (scATAC-seq) for high-throughput single-cell profiling of chromatin accessibility. We use this approach for the unbiased discovery of cell types and regulatory elements within the mouse brain. Further, we extend the throughput of this approach by pairing combinatorial indexing with droplet microfluidics, enabling single-cell studies at a massive scale. With this approach, we measure chromatin accessibility across resting and stimulated human bone marrow-derived cells to reveal changes in the cis- and trans-regulatory landscape across cell types and upon stimulation conditions at single-cell resolution. Altogether, we describe a total of 502,207 single-cell profiles, demonstrating the scalability and flexibility of this droplet-based platform.
biorxiv genomics 200-500-users 2019

Massively parallel single-cell chromatin landscapes of human immune cell development and intratumoral T cell exhaustion, bioRxiv, 2019-04-18
Abstract: Understanding complex tissues requires single-cell deconstruction of gene regulation with precision and scale. Here we present a massively parallel droplet-based platform for mapping transposase-accessible chromatin in tens of thousands of single cells per sample (scATAC-seq). We obtain and analyze chromatin profiles of over 200,000 single cells in two primary human systems. In blood, scATAC-seq allows marker-free identification of cell type-specific cis- and trans-regulatory elements, mapping of disease-associated enhancer activity, and reconstruction of trajectories of differentiation from progenitors to diverse and rare immune cell types. In basal cell carcinoma, scATAC-seq reveals regulatory landscapes of malignant, stromal, and immune cell types in the tumor microenvironment. Moreover, scATAC-seq of serial tumor biopsies before and after PD-1 blockade allows identification of chromatin regulators and differentiation trajectories of therapy-responsive intratumoral T cell subsets, revealing a shared regulatory program driving CD8+ T cell exhaustion and CD4+ T follicular helper cell development. We anticipate that droplet-based single-cell chromatin accessibility will provide a broadly applicable means of identifying regulatory factors and elements that underlie cell type and function.
biorxiv genomics 200-500-users 2019

Extensive impact of low-frequency variants on the phenotypic landscape at population-scale, bioRxiv, 2019-04-16
Abstract: Genome-wide association studies (GWAS) allow the genetic basis of complex traits to be dissected at the population level [1]. However, despite the extensive number of trait-associated loci found, they often fail to explain a large part of the observed phenotypic variance [2–4]. One potential source of this discrepancy could be the preponderance of undetected low-frequency genetic variants in natural populations [5,6]. To increase the allele frequency of those variants and assess their phenotypic effects at the population level, we generated a diallel panel consisting of 3,025 hybrids, derived from pairwise crosses between a subset of natural isolates from a completely sequenced population of 1,011 Saccharomyces cerevisiae isolates. We examined each hybrid across a large number of growth traits, resulting in a total of 148,225 cross/trait combinations. Parental versus hybrid regression analysis showed that while most phenotypic variance is explained by additivity, a significant proportion (29%) is governed by non-additive effects. This is confirmed by the fact that a majority of complete dominance is observed in 25% of the traits. By performing GWAS on the diallel panel, we detected 1,723 significantly associated genetic variants, with 16.3% of them being low-frequency variants in the initial population. These variants, which would not be detected using classical GWAS, explain 21% of the phenotypic variance on average. Altogether, our results demonstrate that low-frequency variants should be accounted for as they contribute to a large part of the phenotypic variation observed in a population.
biorxiv genomics 100-200-users 2019

High accuracy DNA sequencing on a small, scalable platform via electrical detection of single base incorporations, bioRxiv, 2019-04-16
Abstract: High-throughput DNA sequencing technologies have undergone tremendous development over the past decade. Although optical detection-based sequencing has constituted the majority of data output, it requires a large capital investment and aggregation of samples to achieve optimal cost per sample. We have developed a novel electronic detection-based platform capable of accurately detecting single base incorporations. The GenapSys technology with its electronic detection modality allows the system to be compact, accessible, and affordable. We demonstrate the performance of the system by sequencing several different microbial genomes with varying GC content. The platform is capable of generating 1.5 Gb of high-quality nucleic acid sequence in a single run. We routinely generate sequence data that exceeds 99% raw accuracy with read lengths of up to 175 bp. The utility of the platform is highlighted by targeted sequencing of the human genome. We show high concordance of SNP detection on the human NA12878 HapMap cell line with data generated on the Illumina sequencing platform. In addition, we sequenced a targeted panel of cancer-associated genes in a well-characterized reference standard. With multiple library preparation approaches on this sample, we were able to identify low-frequency mutations at expected allele frequencies.
biorxiv genomics 100-200-users 2019

Loss-of-function tolerance of enhancers in the human genome, bioRxiv, 2019-04-14
Abstract: Previous studies have surveyed the potential impact of loss-of-function (LoF) variants and identified LoF-tolerant protein-coding genes. However, the tolerance of human genomes to losing enhancers has not yet been evaluated. Here we present the catalog of LoF-tolerant enhancers using structural variants from whole-genome sequences. Using a conservative approach, we estimate that each individual human genome possesses at least 28 LoF-tolerant enhancers on average. We assessed the properties of LoF-tolerant enhancers in a unified regulatory network constructed by integrating tissue-specific enhancers and gene-gene interactions. We find that LoF-tolerant enhancers are more tissue-specific and regulate fewer and more dispensable genes. They are enriched in immune-related cells while LoF-intolerant enhancers are enriched in kidney and brain/neuronal stem cells. We developed a supervised learning approach to predict the LoF-tolerance of enhancers, which achieved an AUROC of 96%. We predict that 5,677 more enhancers would likely be tolerant to LoF and that 75 enhancers would be highly LoF-intolerant. Our predictions are supported by a known set of disease enhancers and novel deletions from PacBio sequencing. The LoF-tolerance scores provided here will serve as an important reference for disease studies.
biorxiv genomics 0-100-users 2019