Resolving the Full Spectrum of Human Genome Variation using Linked-Reads, bioRxiv, 2017-12-09

Abstract: Large-scale population-based analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole genome sequencing. However, standard short-read approaches, used primarily because of their accuracy, throughput and cost, fail to give a complete picture of a genome. They struggle to identify large, balanced structural events, cannot access repetitive regions of the genome and fail to resolve the human genome into its two haplotypes. Here we describe an approach that retains long-range information while harnessing the advantages of short reads. Starting from only ∼1 ng of DNA, we produce barcoded short-read libraries. Novel informatic approaches allow the barcoded short reads to be associated with their long molecules of origin, producing a novel datatype known as ‘Linked-Reads’. This approach allows for simultaneous detection of small and large variants from a single Linked-Read library. We have previously demonstrated the utility of whole genome Linked-Reads (lrWGS) for performing diploid, de novo assembly of individual genomes (Weisenfeld et al. 2017). In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. We demonstrate the ability of Linked-Reads to reconstruct megabase-scale haplotypes and to recover parts of the genome that are typically inaccessible to short reads, including phenotypically important genes such as STRC, SMN1 and SMN2. We demonstrate the ability of both lrWGS and Linked-Read Whole Exome Sequencing (lrWES) to identify complex structural variations, including balanced events, single exon deletions, and single exon duplications. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.

biorxiv genomics 0-100-users 2017

The rust fungus Melampsora larici-populina expresses a conserved genetic program and distinct sets of secreted protein genes during infection of its two host plants, larch and poplar, bioRxiv, 2017-12-07

Summary: Mechanisms required for broad-spectrum or specific host colonization by plant parasites are poorly understood. As a perfect illustration, heteroecious rust fungi require two alternate host plants to complete their life cycle. Melampsora larici-populina infects two taxonomically unrelated plants: larch, on which sexual reproduction is achieved, and poplar, on which clonal multiplication occurs, leading to severe epidemics in plantations. High-depth RNA sequencing was applied to three key developmental stages of M. larici-populina infection on larch: basidia, pycnia and aecia. Comparative transcriptomics of infection on poplar and larch hosts was performed using available expression data. Secreted proteins formed the only significantly over-represented category among differentially expressed M. larici-populina genes in basidia, pycnia and aecia compared together, highlighting their probable involvement in the infection process. Comparison of fungal transcriptomes in larch and poplar revealed that a majority of rust genes are commonly expressed on the two hosts, while a fraction exhibit host-specific expression. More particularly, gene families encoding small secreted proteins presented striking expression profiles that highlight probable candidate effectors specialized on each host. Our results bring valuable new information about the biological cycle of rust fungi and identify genes that may contribute to host specificity.

biorxiv microbiology 0-100-users 2017

 
