
Haplotype reconstruction from SNP fragments by minimum error correction

Rui-Sheng Wang, +3 more
15 May 2005, Vol. 21, Iss. 10, pp. 2456-2462
Abstract
Motivation: Haplotype reconstruction from aligned single nucleotide polymorphism (SNP) fragments aims to infer a pair of haplotypes from localized polymorphism data gathered through short genome fragment assembly. An important computational model of this problem is the minimum error correction (MEC) model, which has been discussed in several papers. The model retrieves a pair of haplotypes by correcting the minimum number of SNPs in the given genome fragments, which come from an individual's DNA.

Results: In the first part of this paper, an exact algorithm for the MEC model is presented. Because the MEC model is NP-hard, we also design a genetic algorithm (GA). The GA is intended for large problems and performs very well. The strengths and weaknesses of the MEC model are shown using experimental results on real and simulated data. In the second part, to improve the MEC model for haplotype reconstruction, a new computational model is proposed that also uses the genotype information of an individual during SNP correction; it is called MEC with genotype information (MEC/GI for short). Computational results on extensive datasets show that the new model reconstructs haplotypes much more accurately than the pure MEC model.

Contact: wangrsh@amss.ac.cn
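
To make the MEC objective concrete, the following is a minimal sketch and not the exact algorithm or the GA from the paper. It assumes fragments are strings over `0`/`1` with `-` marking uncovered SNP sites, and it assumes a genotype encoding in which `0` and `1` mean homozygous and `2` means heterozygous; all function names are hypothetical. The exhaustive search is exponential in the number of SNP sites and is meant only for toy inputs.

```python
from itertools import product

GAP = "-"  # assumed symbol for a SNP site that a fragment does not cover


def distance(fragment, haplotype):
    """Number of covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_cost(fragments, h1, h2):
    """MEC objective: each fragment is corrected to its closer haplotype."""
    return sum(min(distance(f, h1), distance(f, h2)) for f in fragments)


def consistent(h1, h2, genotype):
    """Check a haplotype pair against a genotype (assumed encoding: 0, 1, 2)."""
    return all(
        (a != b) if g == "2" else (a == b == g)
        for a, b, g in zip(h1, h2, genotype)
    )


def brute_force_mec(fragments, genotype=None):
    """Try every haplotype pair; with a genotype, keep only consistent pairs."""
    n = len(fragments[0])
    haplotypes = ["".join(bits) for bits in product("01", repeat=n)]
    best_cost, best_pair = None, None
    for h1 in haplotypes:
        for h2 in haplotypes:
            if genotype is not None and not consistent(h1, h2, genotype):
                continue
            cost = mec_cost(fragments, h1, h2)
            if best_cost is None or cost < best_cost:
                best_cost, best_pair = cost, (h1, h2)
    return best_cost, best_pair


# Toy data: four clean fragments plus one fragment ("0111") with a single error.
fragments = ["001-", "-011", "110-", "-100", "0111"]

cost, pair = brute_force_mec(fragments)
cost_gi, pair_gi = brute_force_mec(fragments, genotype="2222")
print(cost, pair)
print(cost_gi, pair_gi)
```

Passing a genotype restricts the search space to haplotype pairs that agree with it, which is the idea behind MEC/GI as described in the abstract; how the paper itself combines the two sources of information is not specified here.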

