Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data.

doi:10.1093/BIOINFORMATICS/BTS690

Yongchao Liu, Jan Schröder, Bertil Schmidt

01 Feb 2013, Bioinformatics (Oxford University Press), Vol. 29, Iss. 3, pp. 308-315

TL;DR: Musket uses the k-mer spectrum approach and combines three correction techniques in a multistage workflow: two-sided conservative correction, one-sided aggressive correction and voting-based refinement. The evaluation shows that Musket is consistently one of the top-performing correctors for Illumina short-read data.

Abstract:

Motivation: The imperfect sequence data produced by next-generation sequencing technologies have motivated the development of a number of short-read error correctors in recent years. The majority of methods focus on the correction of substitution errors, which are the dominant error source in data produced by Illumina sequencing technology. Existing tools score high in either recall or precision, but not consistently high in both measures.

Results: In this paper, we present Musket, an efficient multistage k-mer-based corrector for Illumina short-read data. We employ the k-mer spectrum approach and introduce three correction techniques in a multistage workflow: two-sided conservative correction, one-sided aggressive correction and voting-based refinement. Our performance evaluation results, in terms of correction quality and de novo genome assembly measures, reveal that Musket is consistently one of the top-performing correctors. In addition, Musket is multithreaded using a master-slave model and demonstrates superior parallel scalability compared to all other evaluated correctors, as well as a highly competitive overall execution time.

Availability: Musket is available at http://musket.sourceforge.net.

Contact: liuy@uni-mainz.de; bertil.schmidt@uni-mainz.de

Supplementary information: available at Bioinformatics online.
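To make the k-mer spectrum idea concrete, here is a minimal Python sketch. It is not Musket's implementation: the function names, the fixed `min_count` threshold and the single correction rule are assumptions chosen for illustration. The sketch counts every k-mer in the reads, treats k-mers seen at least `min_count` times as trusted, and changes a base only when exactly one alternative base makes all k-mers covering that position trusted. That rule is a simplified stand-in for the conservative flavour of correction the abstract mentions; the aggressive and voting-based stages are not modelled.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def covering_trusted(seq, pos, counts, k, min_count):
    """True if every k-mer that covers position `pos` is trusted."""
    first = max(0, pos - k + 1)
    last = min(pos, len(seq) - k)
    return all(
        counts["".join(seq[i:i + k])] >= min_count
        for i in range(first, last + 1)
    )


def correct_read(read, counts, k, min_count):
    """Illustrative conservative rule (an assumption, not Musket's code):
    substitute a base only if exactly one alternative makes all
    k-mers covering that position trusted."""
    seq = list(read)
    for pos in range(len(seq)):
        if covering_trusted(seq, pos, counts, k, min_count):
            continue  # nothing suspicious around this position
        candidates = []
        for base in "ACGT":
            if base == seq[pos]:
                continue
            trial = seq[:pos] + [base] + seq[pos + 1:]
            if covering_trusted(trial, pos, counts, k, min_count):
                candidates.append(base)
        if len(candidates) == 1:  # ambiguous positions are left untouched
            seq[pos] = candidates[0]
    return "".join(seq)


# Toy data (hypothetical): three error-free copies of a short sequence
# plus one copy carrying a single substitution at index 12.
genome = "ACGTTGCATGCCGATAGCTAGGCTTAACG"
bad_read = genome[:12] + "T" + genome[13:]
spectrum = kmer_spectrum([genome] * 3 + [bad_read], k=5)
print(correct_read(bad_read, spectrum, k=5, min_count=2))
```

In practice the trust threshold would be derived from the data (for example from the k-mer coverage histogram) rather than fixed by hand, and a real corrector would also have to handle reverse complements and ambiguous bases, which this sketch ignores.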
