Haplotype reconstruction from SNP fragments by minimum error correction
TLDR
To improve the MEC model for haplotype reconstruction, a new computational model is proposed that also uses the genotype information of the individual during SNP correction. It is called MEC with genotype information (MEC/GI for short).
Abstract:
Motivation: Haplotype reconstruction from aligned single nucleotide polymorphism (SNP) fragments aims to infer a pair of haplotypes from localized polymorphism data gathered through short genome fragment assembly. An important computational model of this problem is the minimum error correction (MEC) model, which has been discussed in several papers. The model retrieves a pair of haplotypes by correcting the minimum number of SNPs in the given genome fragments, all of which come from one individual's DNA.
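As a rough illustration of the MEC objective described above, the following Python sketch scores a candidate haplotype pair and finds an optimal pair by exhaustive search. The fragment encoding (strings over `0`, `1`, with `-` for uncovered sites), the function names `mec_cost` and `mec_brute_force`, and the toy data are assumptions made for this example; the exhaustive search is not the exact algorithm of the paper and is only feasible for a handful of SNP sites.

```python
from itertools import product

def mismatches(fragment, haplotype):
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)

def mec_cost(fragments, h1, h2):
    """Corrections needed so that every fragment agrees with h1 or h2."""
    return sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)

def mec_brute_force(fragments):
    """Try every haplotype pair; exponential in the number of SNP sites."""
    n = len(fragments[0])
    best = None
    for h1 in product('01', repeat=n):
        for h2 in product('01', repeat=n):
            cost = mec_cost(fragments, h1, h2)
            if best is None or cost < best[0]:
                best = (cost, ''.join(h1), ''.join(h2))
    return best

# Hypothetical fragments over 3 SNP sites; the last one carries one error.
fragments = ["01-", "-11", "10-", "-00", "110"]
print(mec_brute_force(fragments))
```

On this toy input a single correction (turning `110` into `100`) makes all fragments consistent with the pair `011` and `100`.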
Results: In the first part of this paper, an exact algorithm for the MEC model is presented. Because the MEC model is NP-hard, we also design a genetic algorithm (GA). The GA is intended for large problem instances and performs very well on them. The strengths and weaknesses of the MEC model are shown using experimental results on real and simulated data. In the second part of this paper, to improve the MEC model for haplotype reconstruction, a new computational model is proposed that also uses the genotype information of the individual during SNP correction. It is called MEC with genotype information (MEC/GI for short). Computational results on extensive datasets show that the new model reconstructs haplotypes much more accurately than the pure MEC model.
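One plausible reading of the MEC/GI idea is that the haplotype pair must be compatible with the known genotype at every site, which shrinks the search space to the phasing of heterozygous sites. The Python sketch below illustrates that reading under stated assumptions: the genotype encoding (0 = homozygous 0/0, 1 = heterozygous, 2 = homozygous 1/1), the name `mec_gi_brute_force`, and the toy data are all hypothetical, and the exhaustive search stands in for the algorithms of the paper, which are not reproduced here.

```python
from itertools import product

def mismatches(fragment, haplotype):
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)

def mec_gi_brute_force(fragments, genotype):
    """Minimize corrections over haplotype pairs compatible with the genotype."""
    het = [j for j, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed by the genotype in both haplotypes.
    base = ['1' if g == 2 else '0' for g in genotype]
    best = None
    for bits in product('01', repeat=len(het)):
        h1, h2 = base[:], base[:]
        for j, b in zip(het, bits):
            h1[j] = b
            h2[j] = '1' if b == '0' else '0'  # heterozygous: opposite alleles
        cost = sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)
        if best is None or cost < best[0]:
            best = (cost, ''.join(h1), ''.join(h2))
    return best

# Hypothetical data: sites 0 and 1 heterozygous, site 2 homozygous 0/0.
gi_fragments = ["01-", "10-", "-10", "-00", "011"]
gi_genotype = [1, 1, 0]
print(mec_gi_brute_force(gi_fragments, gi_genotype))
```

Here the genotype rules out any haplotype carrying allele 1 at the third site, so the erroneous last entry of `011` is corrected rather than propagated into a haplotype.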
Contact: wangrsh@amss.ac.cn