Evaluating the clinical validity of gene-disease associations: an evidence-based framework developed by the Clinical Genome Resource, bioRxiv, 2017-02-23

Abstract. With advances in genomic sequencing technology, the number of reported gene-disease relationships has rapidly expanded. However, the evidence supporting these claims varies widely, confounding accurate evaluation of genomic variation in a clinical setting. Despite the critical need to differentiate clinically valid relationships from less well-substantiated relationships, standard guidelines for such evaluation do not currently exist. The NIH-funded Clinical Genome Resource (ClinGen) has developed a framework to define and evaluate the clinical validity of gene-disease pairs across a variety of Mendelian disorders. In this manuscript we describe a proposed framework to evaluate relevant genetic and experimental evidence supporting or contradicting a gene-disease relationship, and the subsequent validation of this framework using a set of representative gene-disease pairs. The framework provides a semi-quantitative measurement of the strength of evidence for a gene-disease relationship, which correlates to a qualitative classification: “Definitive”, “Strong”, “Moderate”, “Limited”, “No Reported Evidence” or “Conflicting Evidence.” Within the ClinGen structure, classifications derived using this framework are reviewed and confirmed or adjusted based on the clinical expertise of appropriate disease experts. Detailed guidance for utilizing this framework and access to the curation interface are available on our website. This evidence-based, systematic method to assess the strength of gene-disease relationships will facilitate more knowledgeable utilization of genomic variants in clinical and research settings.

biorxiv genetics 0-100-users 2017

Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists, bioRxiv, 2017-02-23

Abstract. Background: Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq have limited accessibility to bench scientists as they require significant bioinformatics skills. Results: We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene filtering, gene expression normalization, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein-network interaction visualization, and pseudo-time cell series construction. Conclusions: Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app

biorxiv bioinformatics 0-100-users 2017

W2RAP: a pipeline for high quality, robust assemblies of large complex genomes from short read data, bioRxiv, 2017-02-23

Abstract. Producing high-quality whole-genome shotgun de novo assemblies from plant and animal species with large and complex genomes using low-cost short read sequencing technologies remains a challenge. But when the right sequencing data, with appropriate quality control, is assembled using approaches focused on robustness of the process rather than maximization of a single metric such as the usual contiguity estimators, good quality assemblies with informative value for comparative analyses can be produced. Here we present a complete method, described from data generation and QC all the way up to scaffolding of complex genomes using Illumina short reads, and its application to plant and human datasets. We show how to use the w2rap pipeline following a metric-guided approach to produce cost-effective assemblies. The assemblies are highly accurate, provide good coverage of the genome and show good short range contiguity. Our pipeline has already enabled the rapid, cost-effective generation of de novo genome assemblies from large, polyploid crop species with a focus on comparative genomics. Availability: w2rap is available under the MIT license, with some subcomponents under GPL licenses. A ready-to-run docker with all software pre-requisites and example data is also available. http://github.com/bioinfologics/w2rap http://github.com/bioinfologics/w2rap-contigger

biorxiv bioinformatics 0-100-users 2017

Single-cell epigenomics maps the continuous regulatory landscape of human hematopoietic differentiation, bioRxiv, 2017-02-22

Abstract. Normal human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While epigenomic landscapes of this process have been explored in immunophenotypically-defined populations, the single-cell regulatory variation that defines hematopoietic differentiation has been hidden by ensemble averaging. We generated single-cell chromatin accessibility landscapes across 8 populations of immunophenotypically-defined human hematopoietic cell types. Using bulk chromatin accessibility profiles to scaffold our single-cell data analysis, we constructed an epigenomic landscape of human hematopoiesis and characterized epigenomic heterogeneity within phenotypically sorted populations to find epigenomic lineage-bias toward different developmental branches in multipotent stem cell states. We identify and isolate sub-populations within classically-defined granulocyte-macrophage progenitors (GMPs) and use ATAC-seq and RNA-seq to confirm that GMPs are epigenomically and transcriptomically heterogeneous. Furthermore, we identified transcription factors and cis-regulatory elements linked to changes in chromatin accessibility within cellular populations and across a continuous myeloid developmental trajectory, and observe that relatively simple TF motif dynamics give rise to a broad diversity of accessibility dynamics at cis-regulatory elements. Overall, this work provides a template for exploration of complex regulatory dynamics in primary human tissues at the ultimate level of granular specificity: the single cell. One Sentence Summary: Single cell chromatin accessibility reveals a high-resolution, continuous landscape of regulatory variation in human hematopoiesis.

biorxiv genomics 100-200-users 2017

 

Created with the audiences framework by Jedidiah Carlson