Book Chapter

SNPs Problems, Complexity, and Algorithms

TLDR
The general SNPs Haplotyping Problem is shown to be NP-hard for mate-pairs assembly data, polynomial-time algorithms are designed for fragment assembly data, and the Minimum SNPs Removal problem is shown to amount to finding the largest independent set in a weakly triangulated graph.
Abstract
Single nucleotide polymorphisms (SNPs) are the most frequent form of human genetic variation. They are of fundamental importance for a variety of applications, including medical diagnostics and drug design. They also provide the highest-resolution genomic fingerprint for tracking disease genes. This paper is devoted to algorithmic problems related to computational SNPs validation based on genome assembly of diploid organisms. In diploid genomes, there are two copies of each chromosome. A description of the SNPs sequence information from one of the two chromosomes is called a SNPs haplotype. The basic problem addressed here is Haplotyping, i.e., given a set of SNPs prospects inferred from the assembly alignment of a genomic region of a chromosome, find the maximally consistent pair of SNPs haplotypes by removing data "errors" related to DNA sequencing errors, repeats, and paralogous recruitment. In this paper, we introduce several versions of the problem from a computational point of view. We show that the general SNPs Haplotyping Problem is NP-hard for mate-pairs assembly data, and design polynomial-time algorithms for fragment assembly data. We give a network-flow based polynomial algorithm for the Minimum Fragment Removal Problem, and we show that the Minimum SNPs Removal problem amounts to finding the largest independent set in a weakly triangulated graph.
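To make the notion of a "consistent pair of SNPs haplotypes" concrete, here is a minimal sketch in Python. It is not the chapter's algorithm. It assumes a common modelling convention that the abstract does not spell out: each fragment is a string over the two alleles `A` and `B`, with `-` for SNPs the fragment does not cover; two fragments conflict if they disagree at a SNP both cover; and the data are error-free exactly when the fragments can be split into two groups with no conflict inside either group. The function names `conflict` and `split_into_haplotypes` are hypothetical.

```python
from collections import deque
from itertools import combinations


def conflict(f, g):
    """True if fragments f and g call different alleles at a SNP both cover."""
    return any(a != "-" and b != "-" and a != b for a, b in zip(f, g))


def split_into_haplotypes(fragments):
    """Assign each fragment to haplotype 0 or 1 so that conflicting fragments
    land on different haplotypes. Returns the list of labels, or None when no
    such split exists (the conflict graph contains an odd cycle)."""
    n = len(fragments)
    # Build the conflict graph: one vertex per fragment, one edge per conflict.
    adj = [[] for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if conflict(fragments[i], fragments[j]):
            adj[i].append(j)
            adj[j].append(i)

    # Two-colour the graph by breadth-first search, component by component.
    side = [None] * n
    for start in range(n):
        if side[start] is not None:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if side[v] is None:
                    side[v] = 1 - side[u]
                    queue.append(v)
                elif side[v] == side[u]:
                    return None  # odd cycle: some data must be removed
    return side


consistent = split_into_haplotypes(["AB-", "-BA", "BA-", "-AB"])
inconsistent = split_into_haplotypes(["AA", "AB", "BB"])
```

When the function returns `None`, the input contains errors in the sense of the abstract, and problems such as Minimum Fragment Removal or Minimum SNPs Removal ask for the smallest amount of data to discard so that a valid split exists. This sketch only performs the feasibility check, not the removal.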


Citations
Journal Article

HapCUT: an efficient and accurate algorithm for the haplotype assembly problem

TL;DR: Presents a novel combinatorial approach to inferring haplotypes, based on computing max-cuts in certain graphs derived from the sequenced fragments of a human individual, and demonstrates that the haplotypes inferred using HapCUT are significantly more accurate than those of the greedy heuristic and a previously published method, Fast Hare.
Journal Article

WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads

TL;DR: WhatsHap is the first approach that yields provably optimal solutions to the weighted minimum error correction problem in runtime linear in the number of SNPs. It is demonstrated to handle datasets of coverage up to 20×, and 15× is generally enough for reliably phasing long reads, even at significantly elevated sequencing error rates.
Journal Article

Reinforcement learning for combinatorial optimization: A survey

TL;DR: This survey explores the synergy between the CO and RL frameworks, which can become a promising direction for solving combinatorial problems.
Journal Article

Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem

TL;DR: Algorithmic considerations in a new approach for haplotype determination: inferring haplotypes from localised polymorphism data gathered from short genome 'fragments' are presented.
Journal Article

Clique-detection models in computational biochemistry and genomics

TL;DR: The article includes an introduction to the underlying biochemistry and genomic aspects of the problems as well as to the graph-theoretic aspects of the solution approaches, describes a particular type of problem, and gives an example to show how the graph model can be derived.