A complete Cannabis chromosome assembly and adaptive admixture for elevated cannabidiol (CBD) content, bioRxiv, 2018-10-31

Abstract: Cannabis has been cultivated for millennia with distinct cultivars providing either fiber and grain or tetrahydrocannabinol. Recent demand for cannabidiol rather than tetrahydrocannabinol has favored the breeding of admixed cultivars with extremely high cannabidiol content. Despite several draft Cannabis genomes, the genomic structure of cannabinoid synthase loci has remained elusive. A genetic map derived from a tetrahydrocannabinol/cannabidiol segregating population and a complete chromosome assembly from a high-cannabidiol cultivar together resolve the linkage of cannabidiolic and tetrahydrocannabinolic acid synthase gene clusters, which are associated with transposable elements. High-cannabidiol cultivars appear to have been generated by integrating hemp-type cannabidiolic acid synthase gene clusters into a background of marijuana-type cannabis. Quantitative trait locus mapping suggests that overall drug potency, however, is associated with other genomic regions needing additional study. Resources available online at http://cannabisgenome.org. Summary: A complete chromosome assembly and an ultra-high-density linkage map together identify the genetic mechanism responsible for the ratio of tetrahydrocannabinol (THC) to cannabidiol (CBD) in Cannabis cultivars, allowing paradigms for the evolution and inheritance of drug potency to be evaluated.

biorxiv genomics 100-200-users 2018

A practical guide to methods controlling false discoveries in computational biology, bioRxiv, 2018-10-31

In high-throughput studies, hundreds to millions of hypotheses are typically tested. Statistical methods that control the false discovery rate (FDR) have emerged as popular and powerful tools for error rate control. While classic FDR methods use only p-values as input, more modern FDR methods have been shown to increase power by incorporating complementary information as informative covariates to prioritize, weight, and group hypotheses. However, there is currently no consensus on how the modern methods compare to one another. We investigated the accuracy, applicability, and ease of use of two classic and six modern FDR-controlling methods by performing a systematic benchmark comparison using simulation studies as well as six case studies in computational biology. Methods that incorporate informative covariates were modestly more powerful than classic approaches, and did not underperform classic approaches, even when the covariate was completely uninformative. The majority of methods were successful at controlling the FDR, with the exception of two modern methods under certain settings. Furthermore, we found the improvement of the modern FDR methods over the classic methods increased with the informativeness of the covariate, total number of hypothesis tests, and proportion of truly non-null hypotheses. Modern FDR methods that use an informative covariate provide advantages over classic FDR-controlling procedures, with the relative gain dependent on the application and informativeness of available covariates. We present our findings as a practical guide and provide recommendations to aid researchers in their choice of methods to correct for false discoveries.
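As an illustration of what a classic FDR method that uses only p-values looks like, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not name the two classic methods that were benchmarked, so the choice of Benjamini-Hochberg is an assumption made for illustration, and the function name `benjamini_hochberg` and its parameters are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where the null is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up procedure; it is
    not taken from the benchmark described in the abstract above.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

The covariate-aware methods discussed in the abstract differ in that they take additional per-hypothesis information as input; the sketch above has no such input and treats all hypotheses as exchangeable.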

biorxiv bioinformatics 200-500-users 2018

Immediate visualization of recombination events and chromosome segregation defects in fission yeast meiosis, bioRxiv, 2018-10-31

Abstract: Schizosaccharomyces pombe, also known as fission yeast, is an established model for studying chromosome biological processes. Over the years research employing fission yeast has made important contributions to our knowledge about chromosome segregation during meiosis, as well as meiotic recombination and its regulation. Quantification of meiotic recombination frequency is not a straightforward undertaking, either requiring viable progeny for a genetic plating assay, or relying on laborious Southern blot analysis of recombination intermediates. Neither of these methods lends itself to high-throughput screens to identify novel meiotic factors. Here, we establish visual assays novel to Sz. pombe for characterizing chromosome segregation and meiotic recombination phenotypes. Genes expressing red, yellow, and/or cyan fluorophores from spore-autonomous promoters have been integrated into the fission yeast genomes, either close to the centromere of chromosome I to monitor chromosome segregation, or on the arm of chromosome III to form a genetic interval at which recombination frequency can be determined. The visual recombination assay allows straightforward and immediate assessment of the genetic outcome of a single meiosis by epi-fluorescence microscopy without requiring tetrad dissection. We also demonstrate that the recombination frequency analysis can be automated by utilizing imaging flow cytometry to enable high-throughput screens. These assays have several advantages over traditional methods for analysing meiotic phenotypes.
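To make the idea of reading recombination frequency from spore fluorescence concrete, the following is a minimal sketch. It assumes a configuration the abstract does not spell out: two fluorophore markers flanking the interval, one contributed by each parent (in trans), so that parental spores show exactly one color and recombinant spores show both or neither. The function name and the input encoding are hypothetical.

```python
def recombination_frequency(spores):
    """Fraction of scored spores that are recombinant across the interval.

    `spores` is an iterable of (has_marker_a, has_marker_b) booleans, one
    pair per scored spore. Assumption: the two markers entered the cross on
    different parental chromosomes, so a spore with exactly one marker is
    parental and a spore with both or neither marker is recombinant.
    """
    spores = list(spores)
    if not spores:
        raise ValueError("no spores scored")

    # Both markers present or both absent means a crossover in the interval.
    recombinant = sum(1 for has_a, has_b in spores if has_a == has_b)
    return recombinant / len(spores)


if __name__ == "__main__":
    scored = (
        [(True, False)] * 40    # parental: marker A only
        + [(False, True)] * 40  # parental: marker B only
        + [(True, True)] * 10   # recombinant: both markers
        + [(False, False)] * 10 # recombinant: neither marker
    )
    print(recombination_frequency(scored))
```

With imaging flow cytometry, the same classification would be applied to per-spore fluorescence calls produced by the instrument; how those calls are thresholded is outside the scope of this sketch.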

biorxiv genetics 0-100-users 2018

Personalized and graph genomes reveal missing signal in epigenomic data, bioRxiv, 2018-10-31

Abstract. Background: Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesized that using a generic reference could lead to incorrectly mapped reads and bias downstream results. Results: We show that accounting for genetic variation using a modified reference genome (MPG) or a de novo assembled genome (DPG) can alter histone H3K4me1 and H3K27ac ChIP-seq peak calls by either creating new personal peaks or by the loss of reference peaks. MPGs are found to alter approximately 1% of peak calls while DPGs alter up to 5% of peaks. We also show statistically significant differences in the number of reads observed in regions associated with the new, altered and unchanged peaks. We report that short insertions and deletions (indels), followed by single nucleotide variants (SNVs), have the highest probability of modifying peak calls. A counter-balancing factor is peak width, with wider calls being less likely to be altered. Next, because high-quality DPGs remain hard to obtain, we show that using a graph personalized genome (GPG) represents a reasonable compromise between MPGs and DPGs and alters about 2.5% of peak calls. Finally, we demonstrate that altered peaks have a genomic distribution typical of other peaks. For instance, for H3K4me1, 518 personal-only peaks were replicated using at least two of three approaches, 394 of which were inside or within 10 kb of a gene. Conclusions: Analysing epigenomic datasets with personalized and graph genomes allows the recovery of new peaks enriched for indels and SNVs. These altered peaks are more likely to differ between individuals and, as such, could be relevant in the study of various human phenotypes.

biorxiv bioinformatics 100-200-users 2018

 

Created with the audiences framework by Jedidiah Carlson
