Book ChapterDOI
Computing similarity between RNA strings
Vineet Bafna, S. Muthukrishnan, R. Ravi +2 more
- pp 1-16
TLDR
This paper defines a notion of alignment between two RNA strings and presents a method for optimally aligning a given RNA sequence with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known.
Abstract:
Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A,C,G,U} with a secondary structure of base-pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure, and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of the secondary structure of RNA strings. In this paper, we address several notions of similarity between two RNA strings that take into account both the primary sequence and secondary base-pairing structure of the strings. We present efficient algorithms for exact matching and approximate matching between two RNA strings. We define a notion of alignment between two RNA strings and devise algorithms based on dynamic programming. We then present a method for optimally aligning a given RNA sequence with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known. The techniques employed to prove our results include reductions to well-known string matching problems, allowing wild cards and ranges, and speeding up dynamic programming by using the tree structures implicit in the secondary structure of RNA strings.
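The tree-like view of a noncrossing secondary structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes the structure is given in dot-bracket notation (an input convention not mentioned in the abstract), and the function names `parse_pairs`, `is_noncrossing`, and `parent_of_each_pair` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[int, int]

# Only A-U and C-G pairings are allowed, as stated in the abstract.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def parse_pairs(seq: str, structure: str) -> List[Pair]:
    """Return base pairs (i, j) with i < j from a dot-bracket string.

    Assumption: '(' opens a pair, ')' closes it, '.' is an unpaired base.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    stack: List[int] = []
    pairs: List[Pair] = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            i = stack.pop()
            if (seq[i], seq[j]) not in VALID_PAIRS:
                raise ValueError(f"{seq[i]}-{seq[j]} is not an A-U or C-G pair")
            pairs.append((i, j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_noncrossing(pairs: List[Pair]) -> bool:
    """True if no two pairs (i, j) and (k, l) satisfy i < k < j < l."""
    return not any(i < k < j < l for i, j in pairs for k, l in pairs)


def parent_of_each_pair(pairs: List[Pair]) -> Dict[Pair, Optional[Pair]]:
    """Map each pair to the innermost pair enclosing it (None at top level).

    Assumes the pairs are noncrossing, so nesting defines a forest.
    """
    parents: Dict[Pair, Optional[Pair]] = {}
    open_pairs: List[Pair] = []  # chain of pairs enclosing the current one
    for pair in sorted(pairs):
        # Drop pairs that close before the current pair opens.
        while open_pairs and open_pairs[-1][1] < pair[0]:
            open_pairs.pop()
        parents[pair] = open_pairs[-1] if open_pairs else None
        open_pairs.append(pair)
    return parents


pairs = parse_pairs("GAAAUCAAGC", "((..)(..))")
print(pairs)                       # [(0, 9), (1, 4), (5, 8)]
print(is_noncrossing(pairs))       # True
print(parent_of_each_pair(pairs))  # (1, 4) and (5, 8) both nest inside (0, 9)
```

In this toy example the outer G-C pair is the root and the two inner pairs are its children. That nesting is the kind of tree structure the abstract says can be used to speed up dynamic programming.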
Citations
Journal ArticleDOI
A general edit distance between RNA structures.
TL;DR: The notion of edit distance is proposed to measure the similarity between two RNA secondary and tertiary structures, by incorporating various edit operations performed on both bases and arcs (i.e., base-pairs).
Proceedings ArticleDOI
Algorithmic aspects of protein structure similarity
TL;DR: These are the first approximation algorithms with guaranteed error bounds, and NP-completeness results in the literature in the area of protein structure alignment/fold recognition for measures of structure similarity of practical interest.
Journal ArticleDOI
Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization
TL;DR: A graph-based representation for sequence-structure alignments is presented, which is modeled as an integer linear program (ILP) using methods from combinatorial optimization, and results on a recently published benchmark set for RNA alignments are presented.
Book ChapterDOI
Finding Common Subsequences with Arcs and Pseudoknots
TL;DR: The problem of finding the longest common subsequence, on which pairwise sequence comparison algorithms are frequently based, is modified to require common subsequences to preserve the arcs induced by the selected symbol positions, and the modified problem is analyzed using classical and parameterized complexity.
Dissertation
Algorithms and complexity for annotated sequence analysis
TL;DR: This research describes schemes to combinatorially annotate information onto sequences so that it can be analyzed in tandem with the sequence and the overall result reflects both types of information about the sequence.
References
Journal ArticleDOI
A general method applicable to the search for similarities in the amino acid sequence of two proteins
TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed, making it possible to determine whether significant homology exists between the proteins and to trace their possible evolutionary development.
Journal ArticleDOI
Identification of common molecular subsequences.
TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal ArticleDOI
Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information
Michael Zuker, Patrick Stiegler +1 more
TL;DR: A dynamic programming algorithm from applied mathematics is proposed for folding an RNA molecule; it finds a conformation of minimum free energy using published values of stacking and destabilizing energies.
Journal ArticleDOI
Fast Pattern Matching in Strings
TL;DR: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings; a theoretical application shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
Journal ArticleDOI
On finding all suboptimal foldings of an RNA molecule
TL;DR: The mathematical problem of determining how well defined a minimum energy folding is can now be solved and all predicted base pairs that can participate in suboptimal structures may be displayed and analyzed graphically.
Related Papers (5)
Comparing multiple RNA secondary structures using tree comparisons
Bruce A. Shapiro, Kaizhong Zhang +1 more