Whole-genome sequencing analysis of copy number variation (CNV) using low-coverage and paired-end strategies is efficient and outperforms array-based CNV analysis, bioRxiv, 2017-11-05

Abstract. Background: CNV analysis is an integral component of the study of human genomes in both research and clinical settings. Array-based CNV analysis is the current first-tier approach in clinical cytogenetics. Decreasing costs in high-throughput sequencing and cloud computing have opened doors for the development of sequencing-based CNV analysis pipelines with fast turnaround times. We carry out a systematic and quantitative comparative analysis of several low-coverage whole-genome sequencing (WGS) strategies to detect CNV in the human genome. Methods: We compared the CNV detection capabilities of WGS strategies (short-insert, 3kb-, and 5kb-insert mate-pair), each at 1x, 3x, and 5x coverage, relative to each other and to 17 currently used high-density oligonucleotide arrays. For benchmarking, we used a set of Gold Standard (GS) CNVs generated for the 1000-Genomes-Project CEU subject NA12878. Results: Overall, low-coverage WGS strategies detect drastically more GS CNVs than arrays and are accompanied by smaller percentages of CNV calls without validation. Furthermore, we show that WGS (at ≥1x coverage) is able to detect all seven GS deletion-CNVs >100 kb in NA12878, whereas only one is detected by most arrays. Lastly, we show that the much larger 15 Mbp Cri-du-chat deletion can be readily detected with short-insert paired-end WGS at even just 1x coverage. Conclusions: CNV analysis using low-coverage WGS is efficient and outperforms the array-based analysis that is currently used for clinical cytogenetics.

biorxiv genomics 100-200-users 2017

Germline determinants of the somatic mutation landscape in 2,642 cancer genomes, bioRxiv, 2017-11-02

Abstract. Cancers develop through somatic mutagenesis; however, germline genetic variation can markedly contribute to tumorigenesis via diverse mechanisms. We discovered and phased 88 million germline single nucleotide variants, short insertions/deletions, and large structural variants in whole genomes from 2,642 cancer patients, and employed this genomic resource to study genetic determinants of somatic mutagenesis across 39 cancer types. Our analyses implicate damaging germline variants in a variety of cancer predisposition and DNA damage response genes with specific somatic mutation patterns. Mutations in the MBD4 DNA glycosylase gene showed association with elevated C>T mutagenesis at CpG dinucleotides, a ubiquitous mutational process acting across tissues. Analysis of somatic structural variation exposed complex rearrangement patterns, involving cycles of templated insertions and tandem duplications, in BRCA1-deficient tumours. Genome-wide association analysis implicated common genetic variation at the APOBEC3 gene cluster with reduced basal levels of somatic mutagenesis attributable to APOBEC cytidine deaminases across cancer types. We further inferred over a hundred polymorphic L1/LINE elements with somatic retrotransposition activity in cancer. Our study highlights the major impact of rare and common germline variants on mutational landscapes in cancer.

biorxiv genomics 0-100-users 2017

 

Created with the audiences framework by Jedidiah Carlson
