cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate

doi:10.1093/NAR/GKS003

Günter Klambauer, +6 more

- 01 May 2012 -

Nucleic Acids Research

- Vol. 40, Iss: 9

TL;DR: ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data, outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark data sets.

Abstract:

Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark data sets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
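
The core idea in the abstract, modeling read counts across samples at one genomic position as a mixture of Poisson distributions whose components correspond to integer copy numbers, can be illustrated with a small sketch. This is not the cn.MOPS implementation: it uses plain maximum-likelihood EM, omits the Bayesian prior and the noise-based filtering the abstract describes, and the function names, the copy number range 0 to 4, and the `eps` stand-in for copy number 0 are all assumptions made for illustration.

```python
import math


def poisson_logpmf(x, rate):
    """Log of the Poisson probability mass function."""
    return x * math.log(rate) - rate - math.lgamma(x + 1)


def fit_poisson_mixture(counts, copy_numbers=(0, 1, 2, 3, 4), eps=0.05, n_iter=50):
    """Fit p(x) = sum_i alpha_i * Poisson(x; s_i * lam) with plain EM.

    s_i = i / 2, so copy number 2 has expected read count lam.
    eps is an assumed small scale for copy number 0 that keeps the rate positive.
    Returns (alpha, lam, calls), where calls holds the most probable
    copy number for each sample.
    """
    scales = [max(cn / 2.0, eps) for cn in copy_numbers]
    k = len(copy_numbers)
    alpha = [1.0 / k] * k
    # Start lam at the median count, assuming most samples are copy number 2.
    lam = max(float(sorted(counts)[len(counts) // 2]), 1e-3)

    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each copy number for each sample.
        resp = []
        for x in counts:
            logp = [math.log(a) + poisson_logpmf(x, s * lam)
                    for a, s in zip(alpha, scales)]
            top = max(logp)
            weights = [math.exp(v - top) for v in logp]
            total = sum(weights)
            resp.append([w / total for w in weights])
        # M-step: closed-form updates for the mixture weights and lam.
        alpha = [max(sum(r[j] for r in resp) / len(counts), 1e-12)
                 for j in range(k)]
        expected_scale = sum(r[j] * scales[j] for r in resp for j in range(k))
        lam = max(sum(counts) / expected_scale, 1e-3)

    calls = [copy_numbers[max(range(k), key=lambda j: r[j])] for r in resp]
    return alpha, lam, calls


# Hypothetical read counts for one genomic segment across ten samples.
counts = [100, 98, 103, 101, 99, 150, 97, 102, 51, 100]
alpha, lam, calls = fit_poisson_mixture(counts)
print(calls)
```

Because every sample shares the same `lam` at a given position, a position with unusually high or low coverage in all samples shifts `lam` rather than producing a call, which is the property the abstract attributes to modeling across samples instead of along the chromosome.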

Citations

Sequencing depth and coverage: key considerations in genomic analyses

David Sims, +4 more

- 01 Feb 2014 -

Nature Reviews Genetics

TL;DR: The issue of sequencing depth in the design of next-generation sequencing experiments is discussed and current guidelines and precedents on the issue of coverage are reviewed for four major study designs, including de novo genome sequencing, genome resequencing, transcriptome sequencing and genomic location analyses.

CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing

Eric Talevich, +3 more

- 21 Apr 2016 -

PLOS Computational Biology

TL;DR: CNVkit implements a copy number detection method that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome; it successfully inferred copy number at a resolution equivalent to 100 kilobases genome-wide from a platform targeting as few as 293 genes.

Mosdepth: quick coverage calculation for genomes and exomes

Brent S. Pedersen, +1 more

- 01 Mar 2018 -

Bioinformatics

TL;DR: Mosdepth is a new command‐line tool for rapidly calculating genome‐wide sequencing coverage that uses a simple algorithm that is computationally efficient and enables it to quickly produce coverage summaries.

Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives

Min Zhao, +4 more

- 13 Sep 2013 -

BMC Bioinformatics

TL;DR: The recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data are reviewed to discuss their strengths and weaknesses and suggest directions for future development.

A structural variation reference for medical and population genetics

Ryan L. Collins, +65 more

- 28 May 2020 -

Nature

TL;DR: A large empirical assessment of sequence-resolved structural variants from 14,891 genomes across diverse global populations in the Genome Aggregation Database (gnomAD) provides a reference map for disease-association studies, population genetics, and diagnostic screening.

References

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome

Ben Langmead, +3 more

- 04 Mar 2009 -

Genome Biology

TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches; multiple processor cores can be used simultaneously to achieve even greater alignment speeds.

A framework for variation discovery and genotyping using next-generation DNA sequencing data

Mark A. DePristo, +22 more

- 01 May 2011 -

Nature Genetics

TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.

A Map of Human Genome Variation From Population-Scale Sequencing

Gonçalo R. Abecasis, +8 more

- 28 Oct 2010 -

Nature

TL;DR: The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. This paper presents the results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms.

Accurate whole human genome sequencing using reversible terminator chemistry

David R. Bentley, +201 more

- 06 Nov 2008 -

Nature

TL;DR: An approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost is reported, effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.

The cancer genome

Michael R. Stratton, +4 more

- 09 Apr 2009 -

Nature

TL;DR: Obtaining the complete DNA sequence of large numbers of cancer genomes is becoming possible and will provide a detailed and comprehensive perspective on how individual cancers have developed.
