Determinants of transcription factor regulatory range, bioRxiv, 2019-03-19
To characterize the genomic distances over which transcription factors (TFs) influence gene expression, we examined thousands of TF and histone modification ChIP-seq datasets and thousands of gene expression profiles. A model integrating these data revealed two classes of TF: one with short-range regulatory influence, the other with long-range regulatory influence. The two TF classes also had distinct chromatin-binding preferences and auto-regulatory properties. The regulatory range of a single TF bound within different topologically associating domains (TADs) depended on intrinsic TAD properties such as local gene density and GC content, but also on the TAD chromatin state in specific cell types. Our results provide evidence that most TFs belong to one of these two functional classes, and that the regulatory range of long-range TFs is chromatin-state dependent. Thus, consideration of TF type, distance-to-target, and chromatin context is likely important in identifying TF regulatory targets and interpreting GWAS and eQTL SNPs.
biorxiv genomics 100-200-users 2019
Droplet scRNA-seq is not zero-inflated, bioRxiv, 2019-03-19
Potential users of single-cell RNA-sequencing often encounter a choice between high-throughput droplet-based methods and high-sensitivity plate-based methods. In particular, there is a widespread belief that single-cell RNA-sequencing will often fail to generate measurements for particular gene-cell pairs due to molecular inefficiencies, causing data to have an overabundance of zero values. Investigation of published data from technical controls in droplet-based single-cell RNA-seq experiments demonstrates that the number of zeros in the data is consistent with count statistics, indicating that overabundances of zero values in biological data are likely due to biological variation as opposed to technical shortcomings.
biorxiv bioinformatics 200-500-users 2019
Evolutionary pathways to antibiotic resistance are dependent upon environmental structure and bacterial lifestyle, bioRxiv, 2019-03-19
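As a rough illustration of what "consistent with count statistics" can mean, the sketch below computes the fraction of zeros expected under two common count models. The choice of Poisson and negative binomial distributions, the function names, and the mean/dispersion parameterisation are assumptions made here for illustration; the abstract does not specify which models were used.

```python
import math

def poisson_zero_fraction(mean):
    """Expected fraction of zero counts under a Poisson model."""
    return math.exp(-mean)

def negbin_zero_fraction(mean, dispersion):
    """Expected fraction of zero counts under a negative binomial model.

    Assumed parameterisation: variance = mean + dispersion * mean**2.
    As dispersion approaches 0 this converges to the Poisson value.
    """
    if dispersion == 0:
        return poisson_zero_fraction(mean)
    # P(0) = (1 + dispersion * mean) ** (-1 / dispersion), computed via log1p
    return math.exp(-math.log1p(dispersion * mean) / dispersion)

# Hypothetical gene with mean 0.5 counts per droplet
expected_poisson = poisson_zero_fraction(0.5)
expected_negbin = negbin_zero_fraction(0.5, 0.5)
```

Comparing the observed fraction of zeros for a gene against such an expectation is one way to judge whether the zeros are already explained by the count distribution, without invoking a separate zero-inflation term.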
Bacterial populations vary in their stress tolerance and population structure depending upon whether growth occurs in well-mixed or structured environments. We hypothesized that evolution in biofilms would generate greater genetic diversity than well-mixed environments and lead to different pathways of antibiotic resistance. We used experimental evolution and whole genome sequencing to test how the biofilm lifestyle influenced the rate, genetic mechanisms, and pleiotropic effects of resistance to ciprofloxacin in Acinetobacter baumannii populations. Both evolutionary dynamics and the identities of mutations differed between lifestyles. Planktonic populations experienced selective sweeps of mutations including the primary topoisomerase drug targets, whereas biofilm-adapted populations acquired mutations in regulators of efflux pumps. An overall trade-off between fitness and resistance level emerged, wherein biofilm-adapted clones were less resistant than planktonic clones but more fit in the absence of drug. However, biofilm populations developed collateral sensitivity to cephalosporins, demonstrating the clinical relevance of lifestyle on the evolution of resistance.
biorxiv microbiology 0-100-users 2019
Gene regulatory network reconstruction using single-cell RNA sequencing of barcoded genotypes in diverse environments, bioRxiv, 2019-03-19
Understanding how gene expression programs are controlled requires identifying regulatory relationships between transcription factors and target genes. Gene regulatory networks are typically constructed from gene expression data acquired following genetic perturbation or environmental stimulus. Single-cell RNA sequencing (scRNAseq) captures the gene expression state of thousands of individual cells in a single experiment, offering advantages in combinatorial experimental design, large numbers of independent measurements, and access to the interaction between the cell cycle and environmental responses that is hidden by population-level analysis of gene expression. To leverage these advantages, we developed a method for transcriptionally barcoding gene deletion mutants and performing scRNAseq in budding yeast (Saccharomyces cerevisiae). We pooled diverse genotypes in 11 different environmental conditions and determined their expression state by sequencing 38,285 individual cells. We developed and benchmarked a framework for learning gene regulatory networks from scRNAseq data that incorporates multitask learning, and constructed a global gene regulatory network comprising 12,018 interactions. Our study establishes a general approach to gene regulatory network reconstruction from scRNAseq data that can be employed in any organism.
biorxiv genomics 0-100-users 2019
Predicting the effects of SNPs on transcription factor binding affinity, bioRxiv, 2019-03-19
GWAS have revealed that 88% of disease-associated SNPs reside in noncoding regions. However, noncoding SNPs remain understudied, partly because they are challenging to prioritize for experimental validation. To address this deficiency, we developed the SNP effect matrix pipeline (SEMpl). SEMpl estimates transcription factor binding affinity by observing differences in ChIP-seq signal intensity for SNPs within functional transcription factor binding sites genome-wide. By cataloging the effects of every possible mutation within the transcription factor binding site motif, SEMpl can predict the consequences of SNPs for transcription factor binding. This knowledge can be used to identify potential disease-causing regulatory loci.
biorxiv bioinformatics 0-100-users 2019
Ribosome profiling at isoform level reveals an evolutionary conserved impact of differential splicing on the proteome, bioRxiv, 2019-03-19
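The idea of a per-position, per-base catalog of mutation effects can be pictured with a small lookup table. The sketch below is not SEMpl's implementation or file format: the matrix values are made up, and the names `toy_matrix` and `snp_effect` are hypothetical. It only assumes that each motif position carries one score per base and that a SNP's predicted effect is the difference between the alternate and reference scores.

```python
# Toy SNP effect matrix: one dict per motif position, one score per base.
# All values are invented for illustration (for example, a log2 signal
# relative to the best-scoring base at that position).
toy_matrix = [
    {"A": 0.0, "C": -1.2, "G": -0.8, "T": -1.5},
    {"A": -0.9, "C": 0.0, "G": -1.1, "T": -0.4},
    {"A": -1.3, "C": -0.7, "G": 0.0, "T": -1.0},
]

def snp_effect(matrix, position, ref, alt):
    """Predicted change in binding score when ref is replaced by alt.

    position is a 0-based index into the motif. Negative values suggest
    weaker predicted binding, positive values stronger predicted binding.
    """
    row = matrix[position]
    return row[alt] - row[ref]

# Hypothetical SNP: A -> C at the first motif position
effect = snp_effect(toy_matrix, 0, "A", "C")
```

Under these assumptions, scoring a SNP is a constant-time lookup once the matrix for a transcription factor has been built.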
The differential production of transcript isoforms from gene loci is a key cellular mechanism. Yet, its impact on protein production remains an open question. Here, we describe ORQAS (ORF quantification pipeline for alternative splicing), a new pipeline for the translation quantification of individual transcript isoforms using ribosome-protected mRNA fragments (ribosome profiling). We found evidence of translation for 40-50% of the expressed transcript isoforms in human and mouse, with 53% of the expressed genes having more than one translated isoform in human and 33% in mouse. Differential analysis revealed that about 40% of the splicing changes at the RNA level were concordant with changes in translation, with 21.7% of changes at the RNA level and 17.8% at the translational level conserved between human and mouse. Furthermore, orthologous cassette exons preserving the directionality of the change were found enriched in microexons in a comparison between glia and glioma, and were conserved between human and mouse. ORQAS leverages ribosome profiling to uncover a widespread and evolutionarily conserved impact of differential splicing on the translation of isoforms and, in particular, of microexon-containing ones. ORQAS is available at https://github.com/comprna/orqas
biorxiv genomics 100-200-users 2019