Open Access Journal Article

Searching gene and protein sequence databases.

Barsalou T, +1 more
01 May 1991 - Vol. 8, Iss: 3, pp 144-149
TLDR
The article describes the classic algorithms for similarity searching and sequence alignment and analyzes their running times, because good performance is critical to searching very large and growing databases.
Abstract
A large-scale effort to map and sequence the human genome is now under way. Crucial to the success of this research is a group of computer programs that analyze and compare data on molecular sequences. This article describes the classic algorithms for similarity searching and sequence alignment. Because good performance of these algorithms is critical to searching very large and growing databases, we analyze the running times of the algorithms and discuss recent improvements in this area.
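The abstract does not name the algorithms it covers. As an illustration of the dynamic programming family of alignment methods and of why running time matters, here is a minimal Python sketch of local alignment scoring in the style of Smith-Waterman. The function name, the scoring values (`match=2`, `mismatch=-1`, `gap=-2`), and the linear gap penalty are assumptions chosen for the example, not taken from the article.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative sketch with a linear gap penalty (an assumption).
    Time is O(len(a) * len(b)); memory is O(len(b)) because only the
    previous row of the score matrix is kept.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Each cell is the best of: start fresh (0), diagonal step,
            # gap in b (step down), gap in a (step right).
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Every cell of the score matrix is visited once, so comparing a query of length m against a database holding n residues in total costs time proportional to m times n. That product is the reason performance becomes critical as databases grow.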


Citations
Proceedings Article

FLASH: A Fast Look-Up Algorithm for String Homology

TL;DR: A probabilistic indexing framework is proposed for homology detection: the sequences of interest generate a highly redundant set of very descriptive tuples, which are then used as indices in a table look-up.
Patent

Method for finding a reference token sequence in an original token string within a database of token strings using appended non-contiguous substrings

TL;DR: A method is proposed for creating a large number of indexes by partitioning strings of tokens into substrings, appending non-contiguous substrings together to form tuples, and creating indexes from the tuples.
Proceedings Article

FLASH: a fast look-up algorithm for string homology

TL;DR: The algorithm presented is based on a probabilistic indexing framework that requires minimal access to the database for each match, and it is shown to scale well to databases containing billions of nucleotides, with performance orders of magnitude better than the fastest current techniques.
Patent

Gene database retrieval system where a key sequence is compared to database sequences by a dynamic programming device

TL;DR: A dynamic programming operation unit determines the degree of similarity between target data and key data, where the target data are the base sequences of genes drawn from the gene database.
Journal Article

BLAZE™: An implementation of the Smith-Waterman sequence comparison algorithm on a massively parallel computer

TL;DR: The Smith and Waterman dynamic programming algorithm is implemented on the massively parallel MP1104 computer from MasPar, and its ability to detect remote protein sequence homologies is compared with that of other commonly used database search algorithms.
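
Several of the citing works above describe table look-up indexing with tuples built from non-contiguous parts of a sequence. The following toy Python sketch shows the general idea only; it is not the FLASH algorithm or the patented method. The function names, the `window` and `k` parameters, and the voting scheme over (sequence, offset) pairs are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def tuples_at(seq, pos, window=5, k=3):
    """Yield index keys anchored at pos: the letter at pos plus k-1 later
    letters from the window, not necessarily adjacent to one another."""
    end = min(len(seq), pos + window)
    for rest in combinations(range(pos + 1, end), k - 1):
        offsets = tuple(r - pos for r in rest)
        yield offsets, seq[pos] + "".join(seq[r] for r in rest)


def build_index(database, window=5, k=3):
    """Map every tuple key to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for pos in range(len(seq)):
            for key in tuples_at(seq, pos, window, k):
                index[key].append((seq_id, pos))
    return index


def search(index, query, window=5, k=3):
    """Look up the query's tuples and return the (sequence id, offset) pair
    with the most votes, together with its vote count, or None if no hit."""
    votes = Counter()
    for qpos in range(len(query)):
        for key in tuples_at(query, qpos, window, k):
            for seq_id, pos in index.get(key, ()):
                votes[(seq_id, pos - qpos)] += 1
    return votes.most_common(1)[0] if votes else None
```

Because each position contributes many overlapping tuples, a single substitution in the query removes only some of the votes for the true location. For example, with `index = build_index({"s1": "ACGTTGCAAGCT", "s2": "TTTTGGGGCCCC"})`, both `search(index, "TTGCAA")` and the one-mismatch query `search(index, "TTGGAA")` rank offset 3 in `s1` first. The price of this redundancy is a much larger index than a table of contiguous k-tuples.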