
Showing papers by "Wing-Kin Sung" published in 2001


Journal ArticleDOI
TL;DR: An algorithm for comparing trees that are labeled in an arbitrary manner is presented; it is faster than the previous algorithms. An efficient algorithm is also obtained for a new matching problem called the hierarchical bipartite matching problem.

55 citations


Journal ArticleDOI
04 Mar 2001
TL;DR: It is proved that allowing pseudoknots makes it NP-hard to maximize the number of stacking pairs in a planar secondary structure.
Abstract: In this paper we investigate the computational problem of predicting RNA secondary structures that allow any kind of pseudoknot. The general belief is that allowing pseudoknots makes the problem very difficult. Existing polynomial-time algorithms, which aim at structures that optimize some energy functions, can only handle certain types of pseudoknots. In this paper we initiate the study of approximation algorithms for handling all kinds of pseudoknots. We focus on predicting RNA secondary structures with a maximum number of stacking pairs and obtain two approximation algorithms with worst-case approximation ratios of 1/2 and 1/3 for planar and general secondary structures, respectively. Furthermore, we prove that allowing pseudoknots makes the problem of maximizing the number of stacking pairs in a planar secondary structure NP-hard. This result should be contrasted with the recent NP-hardness results on pseudoknots, which are based on optimizing some peculiar energy functions.

55 citations


Posted Content
TL;DR: In this article, the authors present an algorithm for comparing trees that are labeled in an arbitrary manner, which is faster than the previous algorithms. At the core of their maximum agreement subtree algorithm is an efficient algorithm for a new matching problem called the hierarchical bipartite matching problem.
Abstract: A widely used method for determining the similarity of two labeled trees is to compute a maximum agreement subtree of the two trees. Previous work on this similarity measure is only concerned with the comparison of labeled trees of two special kinds, namely, uniformly labeled trees (i.e., trees with all their nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled trees with distinct symbols for distinct leaves). This paper presents an algorithm for comparing trees that are labeled in an arbitrary manner. In addition to this generality, this algorithm is faster than the previous algorithms. Another contribution of this paper is on maximum weight bipartite matchings. We show how to speed up the best known matching algorithms when the input graphs are node-unbalanced or weight-unbalanced. Based on these enhancements, we obtain an efficient algorithm for a new matching problem called the hierarchical bipartite matching problem, which is at the core of our maximum agreement subtree algorithm.

51 citations
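
To make the matching problem mentioned in the abstract above concrete, here is a minimal sketch of maximum weight bipartite matching on a node-unbalanced graph (few left nodes, many right nodes). It is an exhaustive search for illustration only, not the paper's algorithm; the function name `max_weight_matching`, the weight-matrix encoding, and the convention that a weight of 0 means "no edge" are all assumptions made for this example.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Brute-force maximum weight bipartite matching (illustrative only).

    weights[u][v] is the non-negative weight of the edge between left node u
    and right node v; 0 means there is no edge (an assumption of this sketch).
    Requires len(left side) <= len(right side).
    """
    n_left = len(weights)
    n_right = len(weights[0]) if weights else 0
    best_value, best_pairs = 0, []
    # Try every injective assignment of left nodes to right nodes.
    for assignment in permutations(range(n_right), n_left):
        # Keep only the assigned pairs that correspond to real edges.
        pairs = [(u, v) for u, v in enumerate(assignment) if weights[u][v] > 0]
        value = sum(weights[u][v] for u, v in pairs)
        if value > best_value:
            best_value, best_pairs = value, pairs
    return best_value, best_pairs


# Two left nodes, four right nodes: a small node-unbalanced instance.
example_weights = [
    [5, 1, 0, 2],
    [4, 3, 6, 0],
]
value, pairs = max_weight_matching(example_weights)
print(value, pairs)
```

The search is exponential in the number of left nodes, so it only serves to pin down what is being optimized; the abstract's contribution is precisely about doing this much faster when the graph is unbalanced.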


Posted Content
TL;DR: This paper focuses on predicting RNA secondary structures with a maximum number of stacking pairs and obtains two approximation algorithms with worst-case approximation ratios of 1/2 and 1/3 for planar and general secondary structures, respectively.
Abstract: The paper investigates the computational problem of predicting RNA secondary structures. The general belief is that allowing pseudoknots makes the problem hard. Existing polynomial-time algorithms are heuristic algorithms with no performance guarantee and can only handle limited types of pseudoknots. In this paper we initiate the study of predicting RNA secondary structures with a maximum number of stacking pairs while allowing arbitrary pseudoknots. We obtain two approximation algorithms with worst-case approximation ratios of 1/2 and 1/3 for planar and general secondary structures, respectively. For an RNA sequence of $n$ bases, the approximation algorithm for planar secondary structures runs in $O(n^3)$ time while that for the general case runs in linear time. Furthermore, we prove that allowing pseudoknots makes it NP-hard to maximize the number of stacking pairs in a planar secondary structure. This result is in contrast with the recent NP-hardness results on pseudoknots, which are based on optimizing some general and complicated energy functions.

13 citations
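
The objective in the abstract above, the number of stacking pairs, can be illustrated with a short sketch. It assumes the usual definition, in which a stacking pair consists of two consecutive base pairs (i, j) and (i+1, j-1); the abstract itself does not spell this out, so treat the definition as an assumption. The function name `count_stacking_pairs` and the 0-based position encoding are also hypothetical. The code only evaluates a given structure and is not one of the paper's approximation algorithms.

```python
def count_stacking_pairs(pairs):
    """Count stacking pairs in a collection of base pairs.

    Each base pair is a tuple of two sequence positions. A stacking pair is
    assumed to be two base pairs (i, j) and (i + 1, j - 1) that are both
    present, with i + 1 < j - 1.
    """
    # Normalize so that every pair is stored as (smaller, larger).
    normalized = {(min(i, j), max(i, j)) for i, j in pairs}
    count = 0
    for i, j in normalized:
        # (i, j) stacks on its immediate inner neighbor, if that pair exists.
        if i + 1 < j - 1 and (i + 1, j - 1) in normalized:
            count += 1
    return count


# A three-pair helix: (0, 9) stacks on (1, 8), and (1, 8) stacks on (2, 7).
helix = [(0, 9), (1, 8), (2, 7)]
print(count_stacking_pairs(helix))
```

Because the count depends only on adjacent pairs, it is the same whether or not the structure contains pseudoknots (crossing pairs), which is why the objective remains well defined in the general case.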


Posted Content
TL;DR: A new approach to constructing physical maps of DNA sequences, called the enhanced double digest problem, is presented; although it is also NP-hard, it can be solved in linear time in certain theoretically interesting cases.
Abstract: The double digest problem is a common NP-hard approach to constructing physical maps of DNA sequences. This paper presents a new approach called the enhanced double digest problem. Although this new problem is also NP-hard, it can be solved in linear time in certain theoretically interesting cases.

8 citations
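
For readers unfamiliar with the classic double digest problem that the abstract above builds on, here is a small exhaustive-search sketch. It assumes the standard formulation: given fragment lengths `a` and `b` from two single-enzyme digests and `c` from the combined digest, find orderings of `a` and `b` whose merged cut sites produce exactly the fragments in `c`. The function names are hypothetical, and the sketch does not implement the enhanced variant proposed in the paper.

```python
from itertools import permutations


def fragments_from_orders(order_a, order_b):
    """Return the sorted fragment lengths obtained by merging both cut sets."""
    total = sum(order_a)
    cuts = set()
    for order in (order_a, order_b):
        position = 0
        # The last fragment ends at the end of the molecule, not at a cut.
        for length in order[:-1]:
            position += length
            cuts.add(position)
    points = [0] + sorted(cuts) + [total]
    return sorted(right - left for left, right in zip(points, points[1:]))


def solve_double_digest(a, b, c):
    """Try all orderings of a and b; return one consistent pair or None."""
    if sum(a) != sum(b) or sum(a) != sum(c):
        return None
    target = sorted(c)
    for order_a in set(permutations(a)):
        for order_b in set(permutations(b)):
            if fragments_from_orders(order_a, order_b) == target:
                return list(order_a), list(order_b)
    return None


# Molecule of length 5: enzyme A yields {2, 3}, enzyme B yields {1, 4},
# and the double digest yields {1, 1, 3}.
print(solve_double_digest([2, 3], [1, 4], [1, 1, 3]))
```

The search space grows factorially with the number of fragments, which is consistent with the NP-hardness noted in the abstract.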


Book ChapterDOI
04 Sep 2001
TL;DR: This paper models an online catalog organization as a decision tree structure and proposes a metric, based on the popularity of products and the relative importance of product attribute values, to evaluate the quality of a catalog organization. A greedy algorithm (GENCAT) is developed that produces better catalog organizations under this metric.
Abstract: The organization of a web site is important to help users get the most out of the site. Designing such an organization, however, is a complicated problem. Traditionally, this design is mainly done by hand. To what extent this can be automated is a challenging problem. Recently, there have been investigations on how to reorganize an existing web site based on some criteria. But none of them has addressed the problem of organizing a web site automatically from scratch. In this paper, we attempt to tackle this problem by restricting the domain to online catalog organization. We model an online catalog organization as a decision tree structure and propose a metric, based on the popularity of products and the relative importance of product attribute values, to evaluate the quality of a catalog organization. The problem is then formulated as a decision tree construction problem. Although traditional decision tree algorithms, such as C4.5, can be used to generate online catalog organization, the catalog constructed is generally not good based on our metric. An efficient greedy algorithm (GENCAT) is thus developed and the experimental results show that GENCAT produces better catalog organizations based on our metric.

6 citations


01 Jan 2001
TL;DR: The Pervasive Multimedia Markup Language (PMML), an XML based notation for specifying rich media content without making any assumption on the capability of the viewing devices is proposed.
Abstract: With the rapid development of Internet-based connections to different devices such as PDAs, WAP phones, and pagers, one-document-many-presentation has become a converging issue in the development of various markup languages for the description of content and presentation. Much work has been done in this area, but little of it considers the issue of rich media. To address this, the paper proposes the Pervasive Multimedia Markup Language (PMML), an XML-based notation for specifying rich media content without making any assumption about the capability of the viewing devices.

Posted Content
TL;DR: In this article, the authors present an algorithm for computing a maximum agreement subtree of two unrooted evolutionary trees in O(n^{1.5} log n) time for trees with unbounded degrees.
Abstract: We present an algorithm for computing a maximum agreement subtree of two unrooted evolutionary trees. It takes O(n^{1.5} log n) time for trees with unbounded degrees, matching the best known time complexity for the rooted case. Our algorithm allows the input trees to be mixed trees, i.e., trees that may contain directed and undirected edges at the same time. Our algorithm adopts a recursive strategy exploiting a technique called label compression. The backbone of this technique is an algorithm that computes the maximum weight matchings over many subgraphs of a bipartite graph as fast as it takes to compute a single matching.