Minimum epistasis interpolation for sequence-function relationships, bioRxiv, 2019-06-02

Abstract: Massively parallel phenotyping assays have provided unprecedented insight into how multiple mutations combine to determine biological function. While these assays can measure phenotypes for thousands to millions of genotypes in a single experiment, in practice these measurements are not exhaustive, so there is a need for techniques to impute values for genotypes whose phenotypes are not directly assayed. Here we present a method based on the idea of inferring the least epistatic possible sequence-function relationship compatible with the data. In particular, we infer the reconstruction in which mutational effects change as little as possible across adjacent genetic backgrounds. Although this method is highly conservative and has no tunable parameters, it also makes no assumptions about the form that genetic interactions take, resulting in predictions that can behave in a very complicated manner where the data require it but that are nearly additive where data are sparse or absent. We apply this method to analyze a fitness landscape for protein G, showing that our technique can provide a substantially less epistatic fit to the landscape than standard methods with little loss in predictive power. Moreover, our analysis reveals that the complex structure of epistasis observed in this dataset can be well understood in terms of a simple qualitative model consisting of three fitness peaks, where the landscape is locally additive in the vicinity of each peak.
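
The idea of choosing the reconstruction in which mutational effects change as little as possible across adjacent backgrounds can be illustrated with a toy sketch. The abstract does not give the objective function, so the code below makes its own assumptions: sites are biallelic, "epistasis" is the sum of squared coefficients f(00) - f(01) - f(10) + f(11) over every pair of sites in every background, and observed phenotypes are treated as exact. The names `epistasis_matrix` and `interpolate` are hypothetical and are not taken from the authors' software.

```python
import itertools
import numpy as np

def epistasis_matrix(n_sites):
    """Build one row per local epistatic coefficient (assumed form).

    Each row encodes f(..0..0..) - f(..0..1..) - f(..1..0..) + f(..1..1..)
    for one pair of sites in one fixed genetic background.
    """
    genotypes = list(itertools.product([0, 1], repeat=n_sites))
    index = {g: k for k, g in enumerate(genotypes)}
    rows = []
    for i, j in itertools.combinations(range(n_sites), 2):
        for g in genotypes:
            if g[i] == 0 and g[j] == 0:  # visit each face of the hypercube once
                row = np.zeros(len(genotypes))
                for a, b in itertools.product([0, 1], repeat=2):
                    h = list(g)
                    h[i], h[j] = a, b
                    row[index[tuple(h)]] = 1.0 if a == b else -1.0
                rows.append(row)
    return genotypes, np.array(rows)

def interpolate(n_sites, observed):
    """Fill in unobserved genotypes by minimizing total squared epistasis.

    `observed` maps genotype tuples to measured phenotypes, which are held fixed.
    """
    genotypes, E = epistasis_matrix(n_sites)
    C = E.T @ E  # quadratic form: f @ C @ f is the total squared epistasis
    known = np.array([g in observed for g in genotypes])
    f = np.zeros(len(genotypes))
    f[known] = [observed[g] for g in genotypes if g in observed]
    # Setting the gradient with respect to the missing entries to zero gives
    # C_mm f_m = -C_mo f_o; least squares tolerates a singular C_mm.
    rhs = -C[np.ix_(~known, known)] @ f[known]
    f[~known] = np.linalg.lstsq(C[np.ix_(~known, ~known)], rhs, rcond=None)[0]
    return dict(zip(genotypes, f))

# Toy data: wild type plus the three single mutants of a 3-site landscape.
demo = interpolate(3, {(0, 0, 0): 0.0, (1, 0, 0): 1.0,
                       (0, 1, 0): 2.0, (0, 0, 1): 3.0})
```

With only the wild type and single mutants observed, nothing in the data demands epistasis, so the imputed values in `demo` are the additive predictions. This mirrors the abstract's statement that predictions are nearly additive where data are sparse or absent.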

biorxiv bioinformatics 0-100-users 2019

The Mastery Rubric for Bioinformatics supporting design and evaluation of career-spanning education and training, bioRxiv, 2019-06-02

Abstract: As the life sciences have become more data intensive, the pressure to incorporate the requisite training into life-science education and training programs has increased. To facilitate curriculum development, various sets of (bio)informatics competencies have been articulated; however, these have proved difficult to implement in practice. Addressing this issue, we have created a curriculum-design and -evaluation tool to support the development of specific Knowledge, Skills and Abilities (KSAs) that reflect the scientific method and promote both bioinformatics practice and the achievement of competencies. Twelve KSAs were extracted via formal analysis, and stages along a developmental trajectory, from uninitiated student to independent practitioner, were identified. Demonstration of each KSA by a performer at each stage was initially described (Performance Level Descriptors, PLDs), evaluated, and revised at an international workshop. This work was subsequently extended and further refined to yield the Mastery Rubric for Bioinformatics (MR-Bi). The MR-Bi was validated by demonstrating alignment between the KSAs and competencies, and its consistency with principles of adult learning. The MR-Bi tool provides a formal framework to support curriculum building, training, and self-directed learning. It prioritizes the development of independence and scientific reasoning, and is structured to allow individuals (regardless of career stage, disciplinary background, or skill level) to locate themselves within the framework. The KSAs and their PLDs promote scientific problem formulation and problem solving, lending the MR-Bi durability and flexibility. With its explicit developmental trajectory, the tool can be used by developing or practicing scientists to direct their (and their team’s) acquisition of new, or to deepen existing, bioinformatics KSAs. The MR-Bi can thereby contribute to the cultivation of a next generation of bioinformaticians who are able to design reproducible and rigorous research, and to critically analyze results from their own, and others’, work.

biorxiv scientific-communication-and-education 200-500-users 2019

3D RNA-seq - a powerful and flexible tool for rapid and accurate differential expression and alternative splicing analysis of RNA-seq data for biologists, bioRxiv, 2019-06-01

Abstract: RNA-sequencing (RNA-seq) analysis of gene expression and alternative splicing should be routine and robust but is often a bottleneck for biologists because of different and complex analysis programs and reliance on skilled bioinformaticians to perform the analysis. To overcome these issues, we have developed the “3D RNA-seq” App, an R shiny App which provides an easy-to-use, flexible and powerful tool for the three-way differential analysis of RNA-seq data: Differential Expression (DE), Differential Alternative Splicing (DAS) and Differential Transcript Usage (DTU). The full analysis is extremely rapid and can be done within hours. The program integrates Limma, a state-of-the-art, highly rated differential expression analysis tool, and adopts best practice for RNA-seq analysis. It runs the analysis through a user-friendly graphical interface, can handle complex experimental designs, allows user setting of statistical parameters, visualizes the results through graphics and tables, and generates publication-quality figures such as heat-maps, expression profiles and GO enrichment plots. The utility of 3D RNA-seq is illustrated by analysis of Arabidopsis and mouse RNA-seq data. The program is designed to be run by biologists with minimal bioinformatics experience (or by bioinformaticians), allowing lab scientists to take control of the analysis of their RNA-seq data.

biorxiv bioinformatics 100-200-users 2019

 

Created with the audiences framework by Jedidiah Carlson
