Journal Article
Discovering Motifs in Biological Sequences Using the Micron Automata Processor
Indranil Roy, Srinivas Aluru
TL;DR
This paper proposes a novel algorithm for the $(l,d)$ motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA), designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFA in parallel.
Abstract:
Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the $(l,d)$ motif search problem of identifying one or more motifs of length $l$ present in at least $q$ of the $n$ given sequences, with each occurrence differing from the motif by at most $d$ substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is $(26,11)$. We propose a novel algorithm for the $(l,d)$ motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the $(l,d)$ motif search problem using the resources available within a single Automata Processor board, by estimating run-times for problem instances $(39,18)$ and $(40,17)$. The paper serves as a useful guide to solving problems using this new accelerator technology.
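To make the problem statement concrete, the following is a minimal brute-force sketch of the $(l,d)$ motif search definition given in the abstract. It is not the paper's NFA-based algorithm: it simply enumerates every candidate string of length $l$ and counts the sequences that contain an occurrence within $d$ substitutions. The function names (`hamming`, `occurs_in`, `find_motifs`) and the DNA alphabet default are illustrative assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_in(motif, seq, d):
    """True if some length-l window of seq is within d substitutions of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def find_motifs(seqs, l, d, q, alphabet="ACGT"):
    """Exhaustive (l, d) motif search (hypothetical helper, not the paper's method).

    Returns every string of length l over `alphabet` that occurs, with at most
    d substitutions, in at least q of the given sequences.
    The cost grows as len(alphabet) ** l, so this is only usable for tiny l.
    """
    motifs = []
    for candidate in product(alphabet, repeat=l):
        motif = "".join(candidate)
        support = sum(occurs_in(motif, s, d) for s in seqs)
        if support >= q:
            motifs.append(motif)
    return motifs


if __name__ == "__main__":
    sequences = ["ACGTAC", "TTACGA", "GGGGGG"]
    # Exact 3-mers (d = 0) shared by at least 2 of the 3 sequences.
    print(find_motifs(sequences, l=3, d=0, q=2))
```

The exponential candidate space is what makes instances such as $(26,11)$ hard for a sketch like this; the abstract's approach instead evaluates many candidates in parallel as NFA over streamed input.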
Citations
Proceedings Article
GenAx: a genome sequencing accelerator
Daichi Fujiki, Arun Subramaniyan, Tianjun Zhang, Yu Zeng, Reetuparna Das, David Blaauw, Satish Narayanasamy
TL;DR: GenAx, an accelerator for read alignment (a time-consuming step in genome sequencing), achieves a 31.7× speedup over the standard BWA-MEM sequence aligner running on a 56-thread dual-socket 14-core Xeon E5 server processor, while reducing power consumption and area.
Proceedings Article
Cache automaton
Arun Subramaniyan, Jingcheng Wang, Ezhil R. M. Balasubramanian, David Blaauw, Dennis Sylvester, Reetuparna Das
TL;DR: Cache Automaton extends a conventional last-level cache architecture with components that accelerate two phases of NFA processing: state-match and state-transition. State-match is made efficient using a sense-amplifier cycling technique that exploits spatial locality in symbol matches.
Proceedings Article
Sequential pattern mining with the Micron automata processor
TL;DR: A hardware-accelerated solution to sequential pattern mining (SPM) is proposed using Micron's Automata Processor (AP), a hardware implementation of non-deterministic finite automata (NFAs). A generalized automaton structure, obtained by flattening sequential patterns to simple strings, reduces compilation time and minimizes reconfiguration overhead.
Journal Article
Computer Science Education for Primary and Lower Secondary School Students: Teaching the Concept of Automata
TL;DR: A puzzle game is designed that players can answer correctly if they understand the fundamental concepts of automata theory; the study suggests that primary and lower secondary school students can understand these concepts.
Proceedings Article
Parallel Automata Processor
Arun Subramaniyan, Reetuparna Das
TL;DR: This paper explores the FSM parallelization problem in the context of the Micron Automata Processor and proposes solutions that leverage both the unique properties of the NFAs and unique features in the AP to realize parallel NFA execution on the AP.