Open Access, Journal Article

Semi-local String Comparison: Algorithmic Techniques and Applications

Alexander Tiskin
23 May 2008, Vol. 1, Iss. 4, pp. 571-603
Abstract
Given two strings, the longest common subsequence (LCS) problem asks for the length of the longest string that is a subsequence of both input strings. Its generalisation, the all semi-local LCS problem, asks for the LCS length of each string against all substrings of the other string, and of all prefixes of each string against all suffixes of the other string. We survey a number of algorithmic techniques related to the all semi-local LCS problem. We then present a number of algorithmic applications of these techniques, both existing and new. In particular, we obtain a new all semi-local LCS algorithm whose asymptotic running time matches (in the case of an unbounded alphabet) that of the fastest known global LCS algorithm, by Masek and Paterson. We conclude that semi-local string comparison is a useful algorithmic plug-in, which unifies, and often improves on, a number of previous approaches to various substring- and subsequence-related problems.
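To make the problem statement concrete, the following is a minimal brute-force sketch of the definitions only. It is not the algorithm of the paper and does not achieve its running time: it simply calls a textbook quadratic LCS routine on every substring, or on every prefix-suffix pair. The function names `lcs_length`, `string_substring_lcs` and `prefix_suffix_lcs` are illustrative assumptions, not names from the paper.

```python
def lcs_length(a: str, b: str) -> int:
    """Textbook dynamic programme for the global LCS length, O(len(a) * len(b))."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for ch in a:
        curr = [0]  # LCS against the empty prefix of b is 0
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def string_substring_lcs(a: str, b: str) -> dict:
    """Brute force (illustration only): LCS length of the whole string a
    against every substring b[i:j], keyed by the pair (i, j)."""
    n = len(b)
    return {(i, j): lcs_length(a, b[i:j])
            for i in range(n + 1)
            for j in range(i, n + 1)}


def prefix_suffix_lcs(a: str, b: str) -> dict:
    """Brute force (illustration only): LCS length of every prefix a[:k]
    against every suffix b[l:], keyed by the pair (k, l)."""
    return {(k, l): lcs_length(a[:k], b[l:])
            for k in range(len(a) + 1)
            for l in range(len(b) + 1)}
```

The all semi-local problem, as stated in the abstract, also includes the two symmetric components obtained by exchanging the roles of the two strings; under this sketch they correspond to `string_substring_lcs(b, a)` and `prefix_suffix_lcs(b, a)`.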


Citations
Journal Article

Conserved Noncoding Sequences Highlight Shared Components of Regulatory Networks in Dicotyledonous Plants

TL;DR: This study detects hundreds of CNSs upstream of Arabidopsis genes, identifying regions of noncoding DNA in dicot plants that are likely to facilitate complex regulation of genes by binding multiple transcription factors.
Proceedings Article

Fast distance multiplication of unit-Monge matrices

TL;DR: In this article, the authors give an algorithm for finding a maximum clique in a circle graph in time $\mathcal{O}(n \log^2 n)$ and a surprisingly efficient algorithm for comparing compressed strings.
Journal Article

Evolutionary analysis of regulatory sequences (EARS) in plants.

TL;DR: A robust and highly sensitive in silico method is demonstrated for identifying evolutionarily conserved regions within non-coding DNA that contain clusters of transcription binding sites, often described as regulatory modules.
Book Chapter

Faster algorithm for computing the edit distance between SLP-Compressed strings

TL;DR: Given two strings described by SLPs of total size n, it is shown how to compute their edit distance in $\mathcal{O}(nN\sqrt{\log\frac{N}{n}})$ time, where N is the sum of the strings' lengths.
Journal Article

Faster subsequence recognition in compressed strings

TL;DR: This work considers local subsequence recognition problems on strings compressed by straight-line programs (SLP), a compression model closely related to Lempel–Ziv compression.
References
Journal Article

A general method applicable to the search for similarities in the amino acid sequence of two proteins

TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed; with it, one can determine whether significant homology exists between the proteins and trace their possible evolutionary development.
Journal Article

Identification of common molecular subsequences.

TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal Article

EMBOSS: The European Molecular Biology Open Software Suite

TL;DR: The European Molecular Biology Open Software Suite is a mature package of software tools developed for the molecular biology community that includes a comprehensive set of applications for molecular sequence analysis and other tasks and integrates popular third-party software packages under a consistent interface.
Book

The Design and Analysis of Computer Algorithms

TL;DR: This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.
Journal Article

A universal algorithm for sequential data compression

TL;DR: The compression ratio achieved by the proposed universal code uniformly approaches the lower bounds on the compression ratios attainable by block-to-variable codes and variable-to-block codes designed to match a completely specified source.