Book Chapter
RISOTTO: fast extraction of motifs with mismatches
Nadia Pisanti, Alexandra M. Carvalho, Laurent Marsan, Marie-France Sagot
- Vol. 3887, pp 757-768
TL;DR
An exact, suffix-tree-based algorithm for motif extraction is presented. An average-case complexity analysis shows a gain over the best known exact algorithm, and experiments show it to be more than two times faster.
Abstract:
We present an exact algorithm for motif extraction. Its efficiency comes from an improvement to the algorithm and data structures that applies to the whole class of motif inference algorithms based on suffix trees. An average-case complexity analysis shows a gain over the best known exact algorithm for motif extraction. A full implementation was developed and made available online. Experimental results show that the proposed algorithm is more than two times faster than the best known exact algorithm for motif extraction.
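To make the problem concrete, the following is a minimal brute-force sketch of motif extraction with mismatches. It is not the RISOTTO algorithm and uses no suffix tree; it only illustrates the kind of output an exact method has to produce. The function names, the Hamming-distance mismatch model, and the parameters `k` (motif length), `e` (maximum mismatches), and `q` (minimum number of sequences that must contain an occurrence) are assumptions introduced for illustration and do not come from the abstract.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_with_mismatches(motif, seq, e):
    """True if some window of seq is within e mismatches of motif."""
    k = len(motif)
    return any(
        hamming(motif, seq[i:i + k]) <= e
        for i in range(len(seq) - k + 1)
    )


def extract_motifs(seqs, k, e, q, alphabet="ACGT"):
    """Exhaustively test every length-k word over the alphabet.

    A word is reported when it occurs, with at most e mismatches,
    in at least q of the input sequences. The cost grows as
    len(alphabet) ** k, so this is only usable for very small k.
    """
    motifs = []
    for candidate in product(alphabet, repeat=k):
        motif = "".join(candidate)
        support = sum(occurs_with_mismatches(motif, s, e) for s in seqs)
        if support >= q:
            motifs.append(motif)
    return motifs


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTACGA", "GGACTT"]
    print(extract_motifs(seqs, k=3, e=0, q=2))
```

The exhaustive enumeration is exact, since it never misses a valid motif, but its cost is exponential in the motif length. Suffix-tree-based methods such as the one described in the abstract exist to avoid exploring candidates that cannot reach the required support.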
Citations
Journal Article
Fast and Practical Algorithms for Planted (l, d) Motif Search
TL;DR: A sequence of practical algorithms is proposed that builds on the ideas of PMS1 and can tackle challenging instances with larger d, while taking less time on the instances already reported solved by exact algorithms.
Proceedings Article
Finding Motifs in Biological Sequences Using the Micron Automata Processor
Indranil Roy, Srinivas Aluru
TL;DR: This paper proposes a novel algorithm for the (l, d) motif search problem that uses streaming execution over a large set of Non-deterministic Finite Automata (NFA). It is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFA in parallel.
Journal Article
PMS5: an efficient exact algorithm for the (ℓ, d)-motif finding problem
TL;DR: A fast algorithm is proposed that can solve the well-known challenging instances of PMS: (21, 8) and (23, 9).
Journal Article
Efficient and Accurate Discovery of Patterns in Sequence Data Sets
TL;DR: This paper presents a new algorithm called FLAME, a flexible suffix-tree-based algorithm that can be used to find frequent patterns with a variety of definitions of motif (pattern) models, and demonstrates that FLAME is fast, scalable, and outperforms existing algorithms on a range of performance metrics.
Journal Article
qPMS7: A Fast Algorithm for Finding (ℓ, d)-Motifs in DNA and Protein Sequences
TL;DR: A novel algorithm named qPMS7 is proposed that tackles the qPMS problem on real data as well as on challenging instances. Experimental results show that qPMS7 is on average 5 times faster than the state-of-the-art algorithm.