The essential genome of Escherichia coli K-12, bioRxiv, 2017-12-22

ABSTRACT

Transposon-Directed Insertion-site Sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. The TraDIS method has not yet been benchmarked against such libraries, so it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false positive identification of essential gene candidates, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated as essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis datasets will assist genome annotation of less well characterized genomes and provide new insights into bacterial physiology and biochemistry.

IMPORTANCE

Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in E. coli, we constructed a very high-density transposon mutant library. Initial automated analysis of the resulting data revealed many discrepancies when compared to the literature. We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the model laboratory organism, Escherichia coli. This paper is important because it provides a better understanding of the essential genes of E. coli, reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data.
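The abstract mentions correcting for gene length and genome length when calling essential gene candidates. As a rough illustration only, the sketch below computes a per-gene insertion density (insertions divided by gene length) and scales it by the genome-wide insertion density. This is an assumed, simplified scheme and not the authors' actual statistical method; the gene names, counts, genome size, and the `cutoff` value are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    length_bp: int
    insertions: int  # unique transposon insertion sites inside the gene


def insertion_index(gene: Gene) -> float:
    """Unique insertion sites per base pair (accounts for gene length)."""
    return gene.insertions / gene.length_bp


def normalized_index(gene: Gene, genome_insertions: int, genome_length_bp: int) -> float:
    """Gene insertion density relative to the genome-wide insertion density."""
    genome_density = genome_insertions / genome_length_bp
    return insertion_index(gene) / genome_density


def flag_candidates(genes, genome_insertions, genome_length_bp, cutoff=0.1):
    """Return names of genes whose normalized index falls below the cutoff.

    The cutoff is an arbitrary placeholder, not a value from the paper.
    """
    return [
        g.name
        for g in genes
        if normalized_index(g, genome_insertions, genome_length_bp) < cutoff
    ]


# Hypothetical toy data: a 1 Mbp genome carrying 200,000 unique insertions.
GENOME_LENGTH_BP = 1_000_000
GENOME_INSERTIONS = 200_000
genes = [
    Gene("geneA", length_bp=1000, insertions=2),    # almost no insertions
    Gene("geneB", length_bp=1000, insertions=200),  # matches genome average
]

candidates = flag_candidates(genes, GENOME_INSERTIONS, GENOME_LENGTH_BP)
print(candidates)
```

A gene that tolerates almost no insertions relative to the genome average is flagged as a candidate. As the abstract stresses, such automated calls still need manual inspection, for example to catch genes in which only a short region is essential.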


 
