Book Chapter

RISOTTO: fast extraction of motifs with mismatches

TL;DR
An exact algorithm for motif extraction based on suffix trees is presented. An average-case complexity analysis shows a gain over the best known exact algorithm, and experiments show it to be more than two times faster.
Abstract
We present an exact algorithm for motif extraction. Efficiency is achieved through an improvement to the algorithm and data structures that applies to the whole class of motif inference algorithms based on suffix trees. An average-case complexity analysis shows a gain over the best known exact algorithm for motif extraction. A full implementation was developed and made available online. Experimental results show that the proposed algorithm is more than two times faster than the best known exact algorithm for motif extraction.
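To make the problem statement concrete, the following is a minimal brute-force sketch of motif extraction with mismatches. It is not the suffix-tree algorithm described in the abstract. It only illustrates what an exact method has to return. The function names, the `quorum` parameter (minimum number of sequences a motif must occur in), and the DNA alphabet are assumptions made for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_with_mismatches(model, seq, e):
    """True if some window of seq matches model with at most e mismatches."""
    k = len(model)
    return any(
        hamming(model, seq[i:i + k]) <= e
        for i in range(len(seq) - k + 1)
    )


def extract_motifs(seqs, k, e, quorum, alphabet="ACGT"):
    """Enumerate every length-k model over the alphabet (hypothetical helper).

    A model is reported when it occurs, with at most e mismatches,
    in at least `quorum` of the input sequences. The cost grows as
    len(alphabet) ** k, so this is only usable for very small k.
    """
    motifs = []
    for letters in product(alphabet, repeat=k):
        model = "".join(letters)
        support = sum(occurs_with_mismatches(model, s, e) for s in seqs)
        if support >= quorum:
            motifs.append(model)
    return motifs


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTCGTA", "GGCGAA"]
    # With no mismatches allowed, no 3-mer is shared by all three sequences.
    print(extract_motifs(seqs, k=3, e=0, quorum=3))
    # Allowing one mismatch admits models such as CGT (seen as CGA in the third).
    print("CGT" in extract_motifs(seqs, k=3, e=1, quorum=3))
```

Exhaustive enumeration is exact but exponential in the motif length. Suffix-tree-based methods, the class the abstract refers to, avoid testing every model against every window independently.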


Citations
Journal Article

Fast and Practical Algorithms for Planted (l, d) Motif Search

TL;DR: A sequence of practical algorithms is proposed, building on the ideas of PMS1; they tackle challenging instances with larger d and take less time on the instances reported solved by exact algorithms.
Proceedings Article

Finding Motifs in Biological Sequences Using the Micron Automata Processor

TL;DR: This paper proposes a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of Non-deterministic Finite Automata (NFA), designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel.
Journal Article

PMS5: an efficient exact algorithm for the (ℓ, d)-motif finding problem

TL;DR: A fast algorithm is proposed that can solve the well-known challenging instances of PMS: (21, 8) and (23, 9).
Journal Article

Efficient and Accurate Discovery of Patterns in Sequence Data Sets

TL;DR: This paper presents a new algorithm called FLAME, a flexible suffix-tree-based algorithm that can be used to find frequent patterns with a variety of definitions of motif (pattern) models, and demonstrates that FLAME is fast, scalable, and outperforms existing algorithms on a range of performance metrics.
Journal Article

qPMS7: A Fast Algorithm for Finding (ℓ, d)-Motifs in DNA and Protein Sequences

TL;DR: A novel algorithm named qPMS7 is proposed that tackles the qPMS problem on real data as well as challenging instances; experimental results show that qPMS7 is on average 5 times faster than the state-of-the-art algorithm.