The rust fungus Melampsora larici-populina expresses a conserved genetic program and distinct sets of secreted protein genes during infection of its two host plants, larch and poplar, bioRxiv, 2017-12-07

Summary: Mechanisms required for broad-spectrum or specific host colonization by plant parasites are poorly understood. As a perfect illustration, heteroecious rust fungi require two alternate host plants to complete their life cycle. Melampsora larici-populina infects two taxonomically unrelated plants: larch, on which sexual reproduction is achieved, and poplar, on which clonal multiplication occurs, leading to severe epidemics in plantations. High-depth RNA sequencing was applied to three key developmental stages of M. larici-populina infection on larch: basidia, pycnia and aecia. Comparative transcriptomics of infection on poplar and larch hosts was performed using available expression data. Secreted proteins were the only significantly over-represented category among M. larici-populina genes differentially expressed between basidia, pycnia and aecia, highlighting their probable involvement in the infection process. Comparison of fungal transcriptomes in larch and poplar revealed that a majority of rust genes are expressed on both hosts, while a fraction exhibit host-specific expression. In particular, gene families encoding small secreted proteins showed striking expression profiles that highlight probable candidate effectors specialized on each host. Our results bring valuable new information about the biological cycle of rust fungi and identify genes that may contribute to host specificity.

biorxiv microbiology 0-100-users 2017

k-mer grammar uncovers maize regulatory architecture, bioRxiv, 2017-12-06

Abstract: Only a small percentage of the genome sequence is involved in the regulation of gene expression, but identifying this portion biochemically is expensive and laborious. In species like maize, with diverse intergenic regions and many repetitive elements, this is an especially challenging problem. While regulatory regions are rare, they do have characteristic chromatin contexts and sequence organization (the grammar) by which they can be identified. We developed a computational framework to exploit this sequence arrangement. The models learn to classify regulatory regions based on sequence features (k-mers). To do this, we borrowed two approaches from the field of natural language processing: (1) “bag-of-words”, which is commonly used for differentially weighting key words in tasks like sentiment analysis, and (2) a vector-space model using word2vec (vector-k-mers), which captures semantic and linguistic relationships between words. We built “bag-of-k-mers” and “vector-k-mers” models that distinguish between regulatory and non-regulatory regions with an accuracy above 90%. Our “bag-of-k-mers” models achieved higher overall accuracy, while the “vector-k-mers” models were more useful in highlighting key groups of sequences within the regulatory regions. These models now provide powerful tools to annotate regulatory regions in other maize lines beyond the reference, at low cost and with high accuracy.
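The “bag-of-k-mers” idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmer_counts` and `bag_of_kmers`, the choice of k, and the handling of ambiguous bases are all assumptions made for illustration. The sketch only shows how a DNA sequence might be turned into a fixed-length vector of k-mer counts, which is the kind of feature vector a classifier could then be trained on.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a DNA sequence.

    Hypothetical helper: windows containing anything other than
    A, C, G or T (for example N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def bag_of_kmers(sequence, k):
    """Return (vocabulary, vector): counts over all 4**k possible k-mers.

    The order of k-mers within the sequence is discarded, which is what
    makes this a "bag" representation.
    """
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = kmer_counts(sequence, k)
    return vocabulary, [counts[kmer] for kmer in vocabulary]


# Toy example with an assumed sequence and k=2.
vocab, vector = bag_of_kmers("ACGTACGN", k=2)
```

In practice the raw counts would typically be reweighted (for example by a TF-IDF-style scheme) before training, but the abstract does not specify those details, so they are left out here.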

biorxiv plant-biology 0-100-users 2017

 
