The interaction landscape between transcription factors and the nucleosome, bioRxiv, 2017-12-29
Nucleosomes cover most of the genome and are thought to be displaced by transcription factors (TFs) in regions that direct gene expression. However, the modes of interaction between TFs and nucleosomal DNA remain largely unknown. Here, we use nucleosome consecutive affinity-purification systematic evolution of ligands by exponential enrichment (NCAP-SELEX) to systematically explore interactions between the nucleosome and 220 TFs representing diverse structural families. Consistent with earlier observations, we find that the vast majority of TFs have less access to nucleosomal DNA than to free DNA. The motifs recovered from TFs bound to nucleosomal and free DNA are generally similar; however, steric hindrance and scaffolding by the nucleosome result in specific positioning and orientation of the motifs. Many TFs preferentially bind close to the end of nucleosomal DNA, or to periodic positions at its solvent-exposed side. TFs often also bind nucleosomal DNA in a particular orientation, because the nucleosome breaks the local rotational symmetry of DNA. Some TFs also specifically interact with DNA located at the dyad position where only one DNA gyre is wound, whereas other TFs prefer sites spanning two DNA gyres and bind specifically to each of them. Our work reveals striking differences in TF binding to free and nucleosomal DNA, and uncovers a rich interaction landscape between the TFs and the nucleosome.
biorxiv systems-biology 100-200-users 2017
Rethinking dopamine as generalized prediction error, bioRxiv, 2017-12-26
Midbrain dopamine neurons are commonly thought to report a reward prediction error, as hypothesized by reinforcement learning theory. While this theory has been highly successful, several lines of evidence suggest that dopamine activity also encodes sensory prediction errors unrelated to reward. Here we develop a new theory of dopamine function that embraces a broader conceptualization of prediction errors. By signaling errors in both sensory and reward predictions, dopamine supports a form of reinforcement learning that lies between model-based and model-free algorithms. This account remains consistent with current canon regarding the correspondence between dopamine transients and reward prediction errors, while also accounting for new data suggesting a role for these signals in phenomena such as sensory preconditioning and identity unblocking, which ostensibly draw upon knowledge beyond reward predictions.
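As a rough illustration of what an error signal over both sensory and reward predictions could look like, the sketch below computes a temporal-difference error separately for each component of an observation vector, with reward treated as just one component. This is an assumption made for illustration, not the model from the preprint; the function name `td_errors`, the feature ordering, and the discount value are all hypothetical.

```python
def td_errors(observed, predicted_current, predicted_next, gamma=0.9):
    """Per-feature temporal-difference error (illustrative only).

    For each feature j: delta_j = o_j + gamma * P_j(next) - P_j(current),
    where o_j is the observed value of feature j (reward or a sensory
    feature) and P_j is the learned prediction for that feature.
    """
    return [
        o + gamma * p_next - p_cur
        for o, p_cur, p_next in zip(observed, predicted_current, predicted_next)
    ]


# Hypothetical feature order: [reward, tone].
# An unexpected tone with no reward gives an error only on the tone feature.
sensory_surprise = td_errors(
    observed=[0.0, 1.0],
    predicted_current=[0.0, 0.2],
    predicted_next=[0.0, 0.0],
)
```

In this toy setting a purely reward-based error would be zero for the unexpected tone, whereas the vector-valued error is nonzero on the sensory component.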
biorxiv neuroscience 100-200-users 2017
The null additivity of multi-drug combinations, bioRxiv, 2017-12-26
From natural ecology [1–4] to clinical therapy [5–8], cells are often exposed to mixtures of multiple drugs. Two competing null models are used to predict the combined effect of drugs: response additivity (Bliss) and dosage additivity (Loewe) [9–11]. Here, noting that these models diverge as the number of drugs increases, we contrast their predictions with measurements of Escherichia coli growth under combinations of up to 10 different antibiotics. As the number of drugs increases, Bliss maintains accuracy while Loewe systematically loses its predictive power. The total dosage required for growth inhibition, which Loewe predicts should be fixed, steadily increases with the number of drugs, following a square root scaling. This scaling is explained by an approximation to Bliss where, inspired by RA Fisher’s classical geometric model [12], dosages of independent drugs add up as orthogonal vectors rather than linearly. This dose-orthogonality approximation provides results similar to Bliss, yet uses the same dosage language as Loewe and is hence easier to implement and intuit. The rejection of dosage additivity in favor of effect additivity and dosage orthogonality provides a framework for understanding how multiple drugs and stressors add up in nature and the clinic.
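The square root scaling can be seen in a small numerical sketch. It assumes, purely for illustration, that each dose is expressed in units of the dose at which that drug alone reaches the target inhibition, and that all drugs in a mixture are given at the same normalized dose. The function names are hypothetical and the code is not taken from the preprint.

```python
import math


def loewe_total_dose(n_drugs: int) -> float:
    """Dosage additivity: normalized doses add linearly, sum(x_i) = 1.

    The total normalized dose needed is therefore 1 for any number of drugs.
    """
    return 1.0


def orthogonal_total_dose(n_drugs: int) -> float:
    """Dose orthogonality: doses add as orthogonal vectors, sqrt(sum(x_i^2)) = 1.

    With equal doses, x = 1 / sqrt(N), so the total N * x equals sqrt(N).
    """
    x = 1.0 / math.sqrt(n_drugs)
    return n_drugs * x


def bliss_combined_survival(survivals):
    """Response additivity (Bliss independence): fractional survivals multiply."""
    result = 1.0
    for s in survivals:
        result *= s
    return result


# Total normalized dose needed under each dosage model for 1 to 10 drugs.
table = [(n, loewe_total_dose(n), orthogonal_total_dose(n)) for n in range(1, 11)]
```

Under these assumptions the Loewe total stays at 1 regardless of the number of drugs, while the orthogonal total grows as the square root of the number of drugs, which is the qualitative pattern the abstract describes.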
biorxiv systems-biology 100-200-users 2017
Contact-dependent cell-cell communications drive morphological invariance during ascidian embryogenesis, bioRxiv, 2017-12-25
Canalization of developmental processes ensures the reproducibility and robustness of embryogenesis within each species. In its extreme form, found in ascidians, early embryonic cell lineages are invariant between embryos within and between species, despite rapid genomic divergence. To resolve this paradox, we used live light-sheet imaging to quantify individual cell behaviors in digitalized embryos and explore the forces that canalize their development. This quantitative approach revealed that individual cell geometries and cell contacts are strongly constrained, and that these constraints are tightly linked to the control of fate specification by local cell inductions. While in vertebrates ligand concentration usually controls cell inductions, we found that this role is fulfilled in ascidians by the area of contacts between signaling and responding cells. We propose that the duality between geometric and genetic control of inductions contributes to the counterintuitive inverse correlation between geometric and genetic variability during embryogenesis.
biorxiv developmental-biology 100-200-users 2017
DeepMHC: Deep Convolutional Neural Networks for High-performance peptide-MHC Binding Affinity Prediction, bioRxiv, 2017-12-25
Convolutional neural networks (CNNs) have been shown to outperform conventional methods in DNA-protein binding specificity prediction. However, whether we can transfer this success to protein-peptide binding affinity prediction depends on appropriate design of the CNN architecture, which calls for a thorough understanding of how to match the architecture to the problem. Here we propose DeepMHC, a deep convolutional neural network (CNN) based protein-peptide binding prediction algorithm for achieving better performance in MHC-I peptide binding affinity prediction than conventional algorithms. Our model takes only raw binding peptide sequences as input, without needing any human-designed features or other physicochemical or evolutionary information about the amino acids. Our CNN models are shown to be able to learn non-linear relationships among the amino acid positions of the peptides to achieve highly competitive performance on most of the IEDB benchmark datasets with a single model architecture and without using any consensus or composite ensemble classifier models. By systematically exploring the best CNN architecture, we identified critical design considerations in CNN architecture development for peptide-MHC binding prediction.
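To make "raw peptide sequences as input" concrete, the sketch below one-hot encodes a peptide over the 20 standard amino acids and applies a single 1D convolution filter followed by ReLU and global max pooling. This is a minimal pure-Python illustration of the general building blocks of a sequence CNN, not the DeepMHC architecture; the helper names, the hand-set filter, and the choice of example peptide are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def one_hot(peptide):
    """Encode a peptide as a list of 20-dimensional one-hot rows."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in peptide:
        row = [0.0] * len(AMINO_ACIDS)
        row[index[aa]] = 1.0
        rows.append(row)
    return rows


def conv1d_relu(encoded, kernel):
    """Slide one filter along the sequence and apply ReLU.

    `kernel` is a list of `width` rows, each holding 20 weights
    (one per amino acid channel).
    """
    width = len(kernel)
    outputs = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel, value in enumerate(encoded[start + offset]):
                total += value * kernel[offset][channel]
        outputs.append(max(0.0, total))
    return outputs


def global_max_pool(features):
    """Keep the strongest filter response across all positions."""
    return max(features)


# Hand-set width-2 filter that responds to isoleucine at both positions.
# In a trained network these weights would be learned from binding data.
ile = AMINO_ACIDS.index("I")
kernel = [[0.0] * 20 for _ in range(2)]
kernel[0][ile] = 1.0
kernel[1][ile] = 1.0

features = conv1d_relu(one_hot("SIINFEKL"), kernel)
score = global_max_pool(features)
```

A real model would stack many learned filters and feed the pooled features into further layers to predict affinity; the point here is only that no hand-designed amino acid descriptors enter the input.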
biorxiv bioinformatics 100-200-users 2017
Discovery and characterization of coding and non-coding driver mutations in more than 2,500 whole cancer genomes, bioRxiv, 2017-12-24
Discovery of cancer drivers has traditionally focused on the identification of protein-coding genes. Here we present a comprehensive analysis of putative cancer driver mutations in both protein-coding and non-coding genomic regions across >2,500 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We developed a statistically rigorous strategy for combining significance levels from multiple driver discovery methods and demonstrate that the integrated results overcome limitations of individual methods. We combined this strategy with careful filtering and applied it to protein-coding genes, promoters, untranslated regions (UTRs), distal enhancers and non-coding RNAs. These analyses redefine the landscape of non-coding driver mutations in cancer genomes, confirming a few previously reported elements and raising doubts about others, while identifying novel candidate elements across 27 cancer types. Novel recurrent events were found in the promoters or 5’UTRs of TP53, RFTN1, RNF34, and MTG2, in the 3’UTRs of NFKBIZ and TOB1, and in the non-coding RNA RMRP. We provide evidence that the previously reported non-coding RNAs NEAT1 and MALAT1 may be subject to a localized mutational process. Perhaps the most striking finding is the relative paucity of point mutations driving cancer in non-coding genes and regulatory elements. Though we have limited power to discover infrequent non-coding drivers in individual cohorts, combined analysis of promoters of known cancer genes shows little excess of mutations beyond TERT.
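The abstract does not say which procedure is used to combine significance levels across methods. As a generic illustration of the idea, the sketch below implements Fisher's method, which assumes the input p-values are independent. That assumption is unlikely to hold for several driver discovery methods run on the same mutation data, so this should be read as a textbook example rather than the consortium's strategy. The function name `fisher_combine` is hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has the closed form
        P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no external statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2

    # Accumulate the series sum_{i=0}^{k-1} (X/2)^i / i! term by term.
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i
        series += term

    return min(1.0, math.exp(-half_stat) * series)


# Example: p-values for one genomic element from three hypothetical methods.
combined = fisher_combine([0.01, 0.04, 0.20])
```

When the inputs are correlated, Fisher's method tends to overstate significance, which is why approaches that account for dependence between methods are generally preferred in that setting.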
biorxiv genomics 100-200-users 2017