Journal ArticleDOI

Error correction of next-generation sequencing data and reliable estimation of HIV quasispecies

01 Nov 2010-Nucleic Acids Research (Oxford University Press)-Vol. 38, Iss: 21, pp 7400-7409
TL;DR: It is concluded that pyrosequencing can be used to investigate genetically diverse samples with high accuracy if technical errors are properly treated, and that probabilistic haplotype inference outperforms the counting-based calling method in both precision and recall.
Abstract: Next-generation sequencing technologies can be used to analyse genetically heterogeneous samples at unprecedented detail. The high coverage achievable with these methods enables the detection of many low-frequency variants. However, sequencing errors complicate the analysis of mixed populations and result in inflated estimates of genetic diversity. We developed a probabilistic Bayesian approach to minimize the effect of errors on the detection of minority variants. We applied it to pyrosequencing data obtained from a 1.5-kb fragment of the HIV-1 gag/pol gene in two control and two clinical samples. The effect of PCR amplification was analysed. Error correction resulted in a two- and five-fold decrease of the pyrosequencing base substitution rate, from 0.05% to 0.03% and from 0.25% to 0.05% in the non-PCR and PCR-amplified samples, respectively. We were able to detect viral clones as rare as 0.1% with perfect sequence reconstruction. Probabilistic haplotype inference outperforms the counting-based calling method in both precision and recall. Genetic diversity observed within and between two clinical samples resulted in various patterns of phenotypic drug resistance and suggests a close epidemiological link. We conclude that pyrosequencing can be used to investigate genetically diverse samples with high accuracy if technical errors are properly treated.
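A minimal sketch of the statistical idea behind probabilistic minority-variant calling, not the paper's actual Bayesian clustering model: a candidate variant is accepted only if its read count is unlikely under the technical error rate alone. The function name, binomial error model, and cutoffs below are illustrative assumptions.

```python
# Toy minority-variant caller: accept a variant only if observing k
# mismatches among n reads is improbable under the per-base technical
# error rate eps alone. A sketch, not the paper's clustering algorithm.
from scipy.stats import binom

def call_variant(k: int, n: int, eps: float, alpha: float = 1e-6) -> bool:
    # Tail probability P(X >= k) if all mismatches were sequencing errors.
    p_errors_only = binom.sf(k - 1, n, eps)
    return p_errors_only < alpha

# Example: a 0.5% variant at 10,000x coverage against the corrected
# 0.05% error rate is comfortably called.
print(call_variant(k=50, n=10_000, eps=0.0005))  # True
```

The point of error correction is precisely to shrink eps so that this kind of test gains power for rare variants.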


Citations
Journal ArticleDOI
TL;DR: It is determined that Duplex Sequencing has a theoretical background error rate of less than one artifactual mutation per billion nucleotides sequenced and that detection of mutations present in only one of the two strands of duplex DNA can be used to identify sites of DNA damage.
Abstract: Next-generation DNA sequencing promises to revolutionize clinical medicine and basic research. However, while this technology has the capacity to generate hundreds of billions of nucleotides of DNA sequence in a single experiment, the error rate of ∼1% results in hundreds of millions of sequencing mistakes. These scattered errors can be tolerated in some applications but become extremely problematic when “deep sequencing” genetically heterogeneous mixtures, such as tumors or mixed microbial populations. To overcome limitations in sequencing accuracy, we have developed a method termed Duplex Sequencing. This approach greatly reduces errors by independently tagging and sequencing each of the two strands of a DNA duplex. As the two strands are complementary, true mutations are found at the same position in both strands. In contrast, PCR or sequencing errors result in mutations in only one strand and can thus be discounted as technical error. We determine that Duplex Sequencing has a theoretical background error rate of less than one artifactual mutation per billion nucleotides sequenced. In addition, we establish that detection of mutations present in only one of the two strands of duplex DNA can be used to identify sites of DNA damage. We apply the method to directly assess the frequency and pattern of random mutations in mitochondrial DNA from human cells.
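A minimal sketch of the duplex-consensus logic described above, assuming equal-length reads already grouped by molecular tag and oriented to the same strand coordinates; toy code, not the published Duplex Sequencing pipeline.

```python
# Toy duplex consensus: a base is kept only where the consensus of the
# top-strand family agrees with that of the bottom-strand family, so
# single-strand errors (PCR, sequencing, DNA damage) are masked.
from collections import Counter

def strand_consensus(reads: list[str], min_reads: int = 3) -> str | None:
    """Per-position majority base within one strand family."""
    if len(reads) < min_reads:
        return None
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))

def duplex_consensus(top_reads: list[str], bottom_reads: list[str]) -> str:
    top = strand_consensus(top_reads)
    bottom = strand_consensus(bottom_reads)
    if top is None or bottom is None:
        return ""  # too few reads on one strand to form a duplex call
    return "".join(t if t == b else "N" for t, b in zip(top, bottom))
```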

944 citations

Journal ArticleDOI
TL;DR: The understanding of viruses as quasispecies has led to new antiviral designs, such as lethal mutagenesis, whose aim is to drive viruses toward low fitness values with limited chances of fitness recovery.
Abstract: Summary: Evolution of RNA viruses occurs through disequilibria of collections of closely related mutant spectra or mutant clouds termed viral quasispecies. Here we review the origin of the quasispecies concept and some biological implications of quasispecies dynamics. Two main aspects are addressed: (i) mutant clouds as reservoirs of phenotypic variants for virus adaptability and (ii) the internal interactions that are established within mutant spectra that render a virus ensemble the unit of selection. The understanding of viruses as quasispecies has led to new antiviral designs, such as lethal mutagenesis, whose aim is to drive viruses toward low fitness values with limited chances of fitness recovery. The impact of quasispecies for three salient human pathogens, human immunodeficiency virus and the hepatitis B and C viruses, is reviewed, with emphasis on antiviral treatment strategies. Finally, extensions of quasispecies to nonviral systems are briefly mentioned to emphasize the broad applicability of quasispecies theory.
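For context, the mutation-selection dynamics underlying the quasispecies concept are conventionally written as Eigen's replicator-mutator equation; the standard textbook form is sketched below (not reproduced from this abstract).

```latex
% Eigen's quasispecies equation: x_i is the frequency of sequence i,
% f_j the fitness (replication rate) of sequence j, q_{ij} the
% probability that replication of j yields i, and phi the mean fitness,
% which keeps the frequencies normalized to sum to one.
\[
  \dot{x}_i = \sum_j q_{ij} f_j x_j - \phi(t)\, x_i,
  \qquad
  \phi(t) = \sum_j f_j x_j .
\]
```

In this formalism, raising the mutation rate past the error threshold delocalizes the population from the fittest (master) sequence, which is the intuition behind lethal mutagenesis.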

852 citations

Journal ArticleDOI
Abstract: Kepa Ruiz-Mirazo, Carlos Briones, and Andrés de la Escosura. Biophysics Unit (CSIC-UPV/EHU), Leioa, and Department of Logic and Philosophy of Science, University of the Basque Country, Avenida de Tolosa 70, 20080 Donostia-San Sebastián, Spain; Department of Molecular Evolution, Centro de Astrobiología (CSIC-INTA, associated to the NASA Astrobiology Institute), Carretera de Ajalvir, Km 4, 28850 Torrejón de Ardoz, Madrid, Spain; Organic Chemistry Department, Universidad Autónoma de Madrid, Cantoblanco, 28049 Madrid, Spain

616 citations

Journal ArticleDOI
TL;DR: This work identified major and minor polymorphisms at coding and noncoding positions in the HIV-1 protease (pro) gene and observed dynamic genetic changes within the population during intermittent drug exposure, including the emergence of multiple resistant alleles.
Abstract: Viruses can create complex genetic populations within a host, and deep sequencing technologies allow extensive sampling of these populations. Limitations of these technologies, however, potentially bias this sampling, particularly when a PCR step precedes the sequencing protocol. Typically, an unknown number of templates are used in initiating the PCR amplification, and this can lead to unrecognized sequence resampling creating apparent homogeneity; also, PCR-mediated recombination can disrupt linkage, and differential amplification can skew allele frequency. Finally, misincorporation of nucleotides during PCR and errors during the sequencing protocol can inflate diversity. We have solved these problems by including a random sequence tag in the initial primer such that each template receives a unique Primer ID. After sequencing, repeated identification of a Primer ID reveals sequence resampling. These resampled sequences are then used to create an accurate consensus sequence for each template, correcting for recombination, allelic skewing, and misincorporation/sequencing errors. The resulting population of consensus sequences directly represents the initial sampled templates. We applied this approach to the HIV-1 protease (pro) gene to view the distribution of sequence variation of a complex viral population within a host. We identified major and minor polymorphisms at coding and noncoding positions. In addition, we observed dynamic genetic changes within the population during intermittent drug exposure, including the emergence of multiple resistant alleles. These results provide an unprecedented view of a complex viral population in the absence of PCR resampling.
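A minimal sketch of the Primer ID consensus step described above, assuming (tag, read) pairs of equal-length aligned reads; the family-size cutoff and names are illustrative, not the authors' pipeline.

```python
# Toy Primer ID collapse: reads sharing a tag derive from one starting
# template, so a per-tag majority consensus removes PCR resampling and
# averages out misincorporation and sequencing errors.
from collections import Counter, defaultdict

def primer_id_consensus(tagged_reads: list[tuple[str, str]],
                        min_family_size: int = 3) -> dict[str, str]:
    families: defaultdict[str, list[str]] = defaultdict(list)
    for tag, read in tagged_reads:
        families[tag].append(read)
    consensus = {}
    for tag, reads in families.items():
        if len(reads) < min_family_size:
            continue  # tiny families often reflect errors in the tag itself
        consensus[tag] = "".join(
            Counter(col).most_common(1)[0][0] for col in zip(*reads))
    return consensus  # one consensus sequence per sampled template
```

Each surviving consensus then counts as exactly one template, which is what makes the resulting allele frequencies directly interpretable.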

475 citations

Journal ArticleDOI
TL;DR: The discovery of novel C > A/G > T transversion artifacts found at low allelic fractions in targeted capture data is described and informatics methods are presented to confidently filter these artifacts from sequencing data sets.
Abstract: As researchers begin probing deep coverage sequencing data for increasingly rare mutations and subclonal events, the fidelity of next generation sequencing (NGS) laboratory methods will become increasingly critical. Although error rates for sequencing and polymerase chain reaction (PCR) are well documented, the effects that DNA extraction and other library preparation steps could have on downstream sequence integrity have not been thoroughly evaluated. Here, we describe the discovery of novel C > A/G > T transversion artifacts found at low allelic fractions in targeted capture data. Characteristics such as sequencer read orientation and presence in both tumor and normal samples strongly indicated a non-biological mechanism. We identified the source as oxidation of DNA during acoustic shearing in samples containing reactive contaminants from the extraction process. We show generation of 8-oxoguanine (8-oxoG) lesions during DNA shearing, present analysis tools to detect oxidation in sequencing data and suggest methods to reduce DNA oxidation through the introduction of antioxidants. Further, informatics methods are presented to confidently filter these artifacts from sequencing data sets. Though only seen in a low percentage of reads in affected samples, such artifacts could have profoundly deleterious effects on the ability to confidently call rare mutations, and eliminating other possible sources of artifacts should become a priority for the research community.
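A minimal sketch of one common way to exploit the read-orientation signature mentioned above: real variants are supported by ALT reads in both read-pair orientations, whereas oxidation artifacts are strongly one-sided. The counts, cutoff, and function below are illustrative assumptions, not the paper's published filter.

```python
# Toy orientation-bias filter for C>A/G>T artifacts: test whether ALT
# support is balanced between the two read-pair orientations (F1R2 vs
# F2R1); heavily one-sided support suggests an 8-oxoG artifact.
from scipy.stats import binomtest

def is_oxog_artifact(alt_f1r2: int, alt_f2r1: int,
                     p_cutoff: float = 0.01) -> bool:
    n = alt_f1r2 + alt_f2r1
    if n == 0:
        return False
    # Under a true variant, each orientation is roughly equally likely.
    return binomtest(alt_f1r2, n, 0.5).pvalue < p_cutoff

print(is_oxog_artifact(40, 0))   # True: all ALT reads in one orientation
print(is_oxog_artifact(22, 18))  # False: balanced support
```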

456 citations

References
Book
28 Jul 2013
TL;DR: In this book, the authors describe the important ideas in these areas in a common conceptual framework; the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Abstract: During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for 'wide' data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

19,261 citations

Journal ArticleDOI
01 Jan 1949-Nature
TL;DR: In this article, the authors define and examine a measure of concentration in terms of population constants, and examine the relationship between the characteristic and the index of diversity when both are applied to a logarithmic distribution.
Abstract: The 'characteristic' defined by Yule [1] and the 'index of diversity' defined by Fisher [2] are two measures of the degree of concentration or diversity achieved when the individuals of a population are classified into groups. Both are defined as statistics to be calculated from sample data and not in terms of population constants. The index of diversity has so far been used chiefly with the logarithmic distribution. It cannot be used everywhere, as it does not always give values which are independent of sample size; it cannot do so, for example, when applied to an infinite population of individuals classified into a finite number of groups. Williams [3] has pointed out a relationship between the characteristic and the index of diversity when both are applied to a logarithmic distribution. The present purpose is to define and examine a measure of concentration in terms of population constants.
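The measure defined here is what is now called Simpson's index. In standard notation (the usual formulation, reconstructed rather than quoted from this abstract):

```latex
% Simpson's concentration: pi_i is the population proportion of group i
% and lambda the probability that two randomly drawn individuals belong
% to the same group; the second expression is its unbiased estimate
% from sample counts n_i with N = sum_i n_i.
\[
  \lambda = \sum_i \pi_i^{2},
  \qquad
  \hat{\lambda} = \frac{\sum_i n_i (n_i - 1)}{N (N - 1)} .
\]
```

In the quasispecies setting of the citing paper, the groups are haplotypes, so a quantity such as 1 - lambda can serve as a measure of the genetic diversity of the viral population.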

10,077 citations

Journal ArticleDOI
TL;DR: The European Molecular Biology Open Software Suite is a mature package of software tools developed for the molecular biology community; it includes a comprehensive set of applications for molecular sequence analysis and other tasks and integrates popular third-party software packages under a consistent interface.

9,493 citations


"Error correction of next-generation..." refers methods in this paper

  • ...Sequence manipulation was performed using Biopython (30) and EMBOSS (31)....

    [...]
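As a minimal illustration of the kind of sequence manipulation the paper attributes to Biopython and EMBOSS (the input filename below is hypothetical):

```python
# Read a FASTA file with Biopython and print basic per-record info;
# "reads.fasta" is a placeholder filename, not taken from the paper.
from Bio import SeqIO

for record in SeqIO.parse("reads.fasta", "fasta"):
    print(record.id, len(record.seq), record.seq.reverse_complement()[:20])
```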

Journal ArticleDOI
15 Sep 2005-Nature
TL;DR: A scalable, highly parallel sequencing system is described, with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments, achieving 96% coverage at 99.96% accuracy in one run of the machine.
Abstract: The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.
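A toy sketch of the flow-space base calling this platform relies on: each nucleotide flow produces a light signal roughly proportional to the length of the homopolymer incorporated, so calling bases amounts to rounding flow values. The signal values below are hypothetical, and real 454 base calling is considerably more involved.

```python
# Toy pyrosequencing base caller: flows cycle through T,A,C,G and each
# signal is rounded to an integer homopolymer length; rounding mistakes
# are why homopolymer indels dominate this platform's error profile.
FLOW_ORDER = "TACG"

def call_flowgram(signals: list[float]) -> str:
    bases = []
    for i, s in enumerate(signals):
        bases.append(FLOW_ORDER[i % 4] * round(s))
    return "".join(bases)

print(call_flowgram([1.02, 0.05, 2.10, 0.97]))  # -> "TCCG"
```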

8,434 citations


"Error correction of next-generation..." refers background in this paper

  • ...Recent technological advances have drastically decreased the time and the cost required to obtain DNA sequences (1)....

    [...]

Journal ArticleDOI
TL;DR: A technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments is presented.
Abstract: Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.

7,023 citations
