
Showing papers by "Costas S. Iliopoulos" published in 2009


Journal Article
TL;DR: The current result on the different matching problems is extended to handle the presence of “don't care” symbols, and efficient algorithms are presented that calculate Iδ, Iγ, and I(δ,γ) = Iδ ∩ Iγ, for a pattern P with occurrences of “don't cares”.
Abstract: Here we consider string matching problems that arise naturally in applications to music retrieval. The δ-Matching problem calculates, for a given text $T_{1..n}$ and a pattern $P_{1..m}$ on an alphabet of integers, the list of all indices $I_\delta = \{1 \le i \le n-m+1 : \max_{j=1}^{m} |P_j - T_{i+j-1}| \le \delta\}$. The γ-Matching problem computes, for given T and P, the list of all indices $I_\gamma = \{1 \le i \le n-m+1 : \sum_{j=1}^{m} |P_j - T_{i+j-1}| \le \gamma\}$. In this paper, we extend the current result on the different matching problems to handle the presence of “don't care” symbols. We present efficient algorithms that calculate $I_\delta$, $I_\gamma$, and $I_{(\delta,\gamma)} = I_\delta \cap I_\gamma$, for a pattern P with occurrences of “don't cares”.
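
For concreteness, a minimal brute-force sketch in Python of the two matching conditions, without “don't care” handling and without the paper's efficiency guarantees (the function name and the quadratic scan are ours, not the paper's):

def delta_gamma_match(T, P, delta, gamma):
    # Naive O(nm) scan; T and P are sequences of integers.
    n, m = len(T), len(P)
    I_delta, I_gamma = [], []
    for i in range(n - m + 1):                     # 0-based candidate positions
        diffs = [abs(P[j] - T[i + j]) for j in range(m)]
        if max(diffs) <= delta:
            I_delta.append(i)
        if sum(diffs) <= gamma:
            I_gamma.append(i)
    I_dg = sorted(set(I_delta) & set(I_gamma))     # I_(delta,gamma) = intersection
    return I_delta, I_gamma, I_dg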

53 citations


Journal ArticleDOI
10 Jun 2009
TL;DR: This paper presents a new and efficient algorithm for solving the Longest Common Subsequence problem for two strings in O(ℛ log log n + n) time, where ℛ is the total number of ordered pairs of positions at which the two strings match.
Abstract: The Longest Common Subsequence (LCS) problem is a classic and well-studied problem in computer science. The LCS problem is a common task in DNA sequence analysis with many applications to genetics and molecular biology. In this paper, we present a new and efficient algorithm for solving the LCS problem for two strings. Our algorithm runs in O(ℛ log log n + n) time, where ℛ is the total number of ordered pairs of positions at which the two strings match.
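
The match-pair formulation can be made concrete with the classic Hunt–Szymanski reduction of LCS to a longest increasing subsequence over the ℛ matching pairs. The sketch below runs in O((ℛ + n) log n); the paper's algorithm sharpens the log factor to log log n using faster predecessor structures. This is our illustration of the underlying reduction, not the paper's code:

from bisect import bisect_left
from collections import defaultdict

def lcs_via_match_pairs(A, B):
    # Positions of each symbol of B, so matching pairs are enumerated cheaply.
    pos_in_B = defaultdict(list)
    for j, c in enumerate(B):
        pos_in_B[c].append(j)
    # tails[k] = smallest B-index that ends a common subsequence of length k+1.
    tails = []
    for a in A:
        # Descending j prevents chaining two matches that share an A-position.
        for j in reversed(pos_in_B[a]):
            k = bisect_left(tails, j)
            if k == len(tails):
                tails.append(j)
            else:
                tails[k] = j
    return len(tails)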

45 citations


Book ChapterDOI
10 Nov 2009
TL;DR: Efficient algorithms are presented for storing past segments of a text in the LPF table, computed from two previously computed read-only arrays composing the Suffix Array of the text, including an O(n log n) strong in-place computation of the LPF table.
Abstract: We present efficient algorithms for storing past segments of a text. They are computed using two previously computed read-only arrays (SUF and LCP) composing the Suffix Array of the text. They compute the maximal length of the previous factor (subword) occurring at each position of the text in a table called LPF. This notion is central both in many conservative text compression techniques and in the most efficient algorithms for detecting motifs and repetitions occurring in a text. The main results are: a linear-time algorithm that computes explicitly the permutation that transforms the LCP table into the LPF table; a time-space optimal computation of the LPF table; and an O(n log n) strong in-place computation of the LPF table.
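
The LPF table itself is easy to state. The following quadratic Python sketch only illustrates the definition; the paper's algorithms derive the table from the read-only SUF and LCP arrays in linear time, or strong in-place in O(n log n):

def lpf_naive(t):
    # LPF[i] = maximal length of a factor starting at i that also starts
    # at some earlier position j < i (overlapping copies allowed).
    n = len(t)
    lpf = [0] * n
    for i in range(n):
        for j in range(i):
            l = 0
            while i + l < n and t[j + l] == t[i + l]:
                l += 1
            lpf[i] = max(lpf[i], l)
    return lpf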

34 citations


Book ChapterDOI
08 Dec 2009
TL;DR: Two new tables storing different types of previous factors (past segments) of a string are computed efficiently in linear time on any integer alphabet, helpful to improve, for example, gapped palindrome detection and text compression using reverse factors.
Abstract: Suffix arrays provide a powerful data structure to solve several questions related to the structure of all the factors of a string. We show how they can be used to compute efficiently two new tables storing different types of previous factors (past segments) of a string. The concept of a longest previous factor is inherent to Ziv-Lempel factorization of strings in text compression, as well as in statistics of repetitions and symmetries. The longest previous reverse factor for a given position i is the longest factor starting at i, such that its reverse copy occurs before, while the longest previous non-overlapping factor is the longest factor v starting at i which has an exact copy occurring before. The previous copies of the factors are required to occur in the prefix ending at position i − 1. We design algorithms computing the table of longest previous reverse factors (LPrF table) and the table of longest previous non-overlapping factors (LPnF table). The latter table is useful to compute repetitions while the former is a useful tool for extracting symmetries. These tables are computed, using two previously computed read-only arrays (SUF and LCP) composing the suffix array, in linear time on any integer alphabet. The tables have not been explicitly considered before, but they have several applications and they are natural extensions of the LPF table which has been studied thoroughly before. Our results improve on the previous ones in several ways. The running time of the computation no longer depends on the size of the alphabet, which drops a log factor. Moreover the newly introduced tables store additional information on the structure of the string, helpful to improve, for example, gapped palindrome detection and text compression using reverse factors.
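
A brute-force Python sketch of the two definitions (the paper computes both tables in linear time from the SUF and LCP arrays; this slow version only pins down what is stored):

def lprf_lpnf_naive(t):
    # LPrF[i]: longest factor starting at i whose *reverse* occurs entirely
    #          inside the prefix t[0:i].
    # LPnF[i]: longest factor starting at i with an exact earlier copy
    #          ending at or before position i-1 (non-overlapping).
    n = len(t)
    lprf, lpnf = [0] * n, [0] * n
    for i in range(n):
        for l in range(1, n - i + 1):
            factor = t[i:i + l]
            if factor[::-1] in t[:i]:
                lprf[i] = l
            if factor in t[:i]:
                lpnf[i] = l
    return lprf, lpnf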

28 citations


Journal ArticleDOI
TL;DR: A new improved indexing scheme for the gapped-factors is presented, which generalizes the indexing data structure in the sense that, unlike GFT, it is independent of the parameters k and k′.
Abstract: Indexing of factors or substrings is a widely used and useful technique in stringology and can be seen as a tool in solving diverse text algorithmic problems. A gapped-factor is a concatenation of a factor of length k, a gap of length d and another factor of length k′. Such a gapped-factor is called a (k−d−k′)-gapped-factor. The problem of indexing the gapped-factors was considered recently by Peterlongo et al. (In: Stringology, pp. 182–196, 2006). In particular, Peterlongo et al. devised a data structure, namely a gapped factor tree (GFT), to index the gapped-factors. Given a text $\mathcal{T}$ of length n over the alphabet Σ and the values of the parameters k, d and k′, the construction of GFT requires O(n|Σ|) time. Once GFT is constructed, a given (k−d−k′)-gapped-factor can be reported in O(k+k′+Occ) time, where Occ is the number of occurrences of that factor in $\mathcal{T}$. In this paper, we present a new improved indexing scheme for the gapped-factors. The improvements we achieve come from two aspects. Firstly, we generalize the indexing data structure in the sense that, unlike GFT, it is independent of the parameters k and k′. Secondly, our data structure can be constructed in $O(n\log^{1+\varepsilon} n)$ time and space, where 0 < ε < 1.
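
The query such an index answers can be stated with a naive Python scan (our illustration; the indexing structures report the occurrences in O(k + k′ + Occ) time after preprocessing):

def gapped_occurrences(text, u, v, d):
    # Occurrences of the (k-d-k')-gapped-factor: u of length k at position i,
    # a gap of length d, then v of length k' at position i + k + d.
    k, k2 = len(u), len(v)
    total = k + d + k2
    return [i for i in range(len(text) - total + 1)
            if text[i:i + k] == u and text[i + k + d:i + total] == v]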

27 citations


Proceedings ArticleDOI
01 Nov 2009
TL;DR: This paper defines and solves the Massive Exact Unique Pattern Matching problem in genomes, and presents a practical algorithm for efficiently mapping uniquely occurring short reads to a reference genome.
Abstract: Novel high throughput sequencing technology methods have redefined the way genome sequencing is performed. They are able to produce tens of millions of short sequences (reads) in a single experiment and with a much lower cost than previous sequencing methods. Due to this massive amount of data generated by the above systems, efficient algorithms for mapping short sequences to a reference genome are in great demand. In this paper, we present a practical algorithm for addressing the problem of efficiently mapping uniquely occurring short reads to a reference genome. This requires the classification of these short reads into unique and duplicate matches. In particular, we define and solve the Massive Exact Unique Pattern Matching problem in genomes.
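
A minimal sketch of the classification task, using a Python hash table over all genome windows of the read length; this stands in for the paper's algorithm and is practical only for moderate genome sizes (names and representation are our assumptions):

from collections import Counter

def classify_reads(genome, reads):
    # Reads are assumed equal-length; count every genome window of that length.
    m = len(reads[0])
    window_counts = Counter(genome[i:i + m] for i in range(len(genome) - m + 1))
    unique    = [r for r in reads if window_counts[r] == 1]
    duplicate = [r for r in reads if window_counts[r] > 1]
    unmatched = [r for r in reads if window_counts[r] == 0]
    return unique, duplicate, unmatched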

20 citations


24 Jun 2009
TL;DR: A new, combinatorial model for analyzing and interpreting an electrocardiogram (ECG) is presented and an application of the model is QRS peak detection, demonstrated with an online algorithm, which is shown to be space as well as time efficient.
Abstract: A new, combinatorial model for analyzing and interpreting an electrocardiogram (ECG) is presented. An application of the model is QRS peak detection. This is demonstrated with an online algorithm, which is shown to be space as well as time efficient. Experimental results on the MIT-BIH Arrhythmia database show that this novel approach is promising. Further uses for this approach are discussed, such as taking advantage of its small memory requirements and interpreting large amounts of pre-recorded ECG data.
Keywords: Combinatorics, ECG analysis, MIT-BIH Arrhythmia Database, QRS Detection, String Algorithms

9 citations


Journal ArticleDOI
TL;DR: A family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints are described, which are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem.
Abstract: A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least $m_{\min}$ times ($m_{\min} \ge 2$) in each of at least $q \ge 1$ strings in a given set of strings. In this paper, we describe a family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints. Our algorithms are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem. The results extend recent work by two of the authors computing all maximal repeats in a single string.
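
The multirepeat condition itself can be checked naively (a Python illustration of the definition, not the paper's suffix-array machinery):

def is_multirepeat(w, strings, m_min, q):
    # w qualifies if it occurs (overlaps allowed) at least m_min times
    # in each of at least q of the given strings.
    def occurrences(s):
        return sum(1 for i in range(len(s) - len(w) + 1) if s[i:i + len(w)] == w)
    return sum(occurrences(s) >= m_min for s in strings) >= q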

8 citations


Journal ArticleDOI
TL;DR: This paper proposes a general framework for polyphonic music using the substitution score scheme set for monophonic music, which allows new operations by extending the operations proposed by Mongeau and Sankoff [15].
Abstract: Existing symbolic music comparison systems generally consider monophonic music or monophonic reductions of polyphonic music. Adapting alignment algorithms to music leads to accurate systems, but their extension to polyphonic music raises new problems. Indeed, a chord may match several consecutive notes, or the difference between two similar motifs may be a few swapped notes. Moreover, the substitution scores between chords are difficult to set up. In this paper, we propose a general framework for polyphonic music that uses the substitution score scheme set for monophonic music and allows new operations, extending the operations proposed by Mongeau and Sankoff [15]. From a practical point of view, limiting the chord sizes and the number of notes that can be merged consecutively keeps the complexity quadratic.

7 citations


Journal ArticleDOI
TL;DR: This paper addresses the problem of efficiently mapping and classifying millions of short sequences to a reference genome, based on whether they occur exactly once in the genome or not, and by taking into consideration probability scores.
Abstract: Novel high-throughput (Deep) sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences in a single experiment and with a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping and classifying millions of short sequences to a reference genome, based on whether they occur exactly once in the genome or not, and by taking into consideration probability scores. In particular, we design algorithms for Massive Exact and Approximate Pattern Matching of short degenerate and weighted sequences, derived from Deep sequencing technologies, to a reference genome.
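
For the weighted case, one common formulation scores a genome window by the product of the per-position probabilities of its letters in the weighted read, accepting windows whose score reaches a cutoff. The Python sketch below assumes that formulation and a representation as a list of letter-to-probability dicts; both are our modeling assumptions, not the paper's data structures:

def weighted_match_positions(genome, weighted_read, cutoff):
    # weighted_read[j] maps each letter to its probability at position j.
    m = len(weighted_read)
    hits = []
    for i in range(len(genome) - m + 1):
        p = 1.0
        for j in range(m):
            p *= weighted_read[j].get(genome[i + j], 0.0)
            if p < cutoff:        # probabilities only shrink, so prune early
                break
        if p >= cutoff:
            hits.append(i)
    return hits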

6 citations


Book ChapterDOI
07 Jul 2009
TL;DR: This paper shows how to implement the algorithms of Iliopoulos and Rytter in the MPI environment, adapting them to the lack of shared memory, the small number of processors, and the communication costs between processors.
Abstract: Suffix trees and suffix arrays are two well-known index data structures for strings. It is known that the latter can be easily transformed into the former: Iliopoulos and Rytter [5] showed two simple transformation algorithms on the CREW PRAM model. However, the PRAM model is a theoretical one and we need a practical parallel model. The Message Passing Interface (MPI) is a standard widely used on both massively parallel machines and on clusters. In this paper, we show how to implement the algorithms of Iliopoulos and Rytter in the MPI environment. Our contribution includes adapting the algorithms to the lack of shared memory, the small number of processors, and the communication costs between processors.
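
Sequentially, the transformation being parallelized is the classic stack-based scan that builds the suffix tree from the suffix array and its LCP table in linear time. A Python sketch under the usual conventions (s ends with a unique sentinel; lcp[i] is the length of the longest common prefix of the suffixes at suf[i−1] and suf[i]); this is our rendering of the standard sequential algorithm, not the paper's MPI code:

def suffix_tree_from_sa(s, suf, lcp):
    n = len(s)
    nodes = [{'depth': 0, 'children': []}]          # node 0 is the root
    def new_node(depth):
        nodes.append({'depth': depth, 'children': []})
        return len(nodes) - 1
    stack = [0]                                      # rightmost path of the tree
    leaf = new_node(n - suf[0])
    nodes[0]['children'].append(leaf)
    stack.append(leaf)
    for i in range(1, n):
        l, last = lcp[i], None
        while nodes[stack[-1]]['depth'] > l:         # pop the too-deep part
            last = stack.pop()
        top = stack[-1]
        if nodes[top]['depth'] < l:                  # split: new internal node
            mid = new_node(l)
            nodes[top]['children'].remove(last)
            nodes[top]['children'].append(mid)
            nodes[mid]['children'].append(last)
            stack.append(mid)
            top = mid
        leaf = new_node(n - suf[i])                  # leaf for suffix suf[i]
        nodes[top]['children'].append(leaf)
        stack.append(leaf)
    return nodes                    # edge labels are implicit in the depths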

Proceedings Article
01 Dec 2009
TL;DR: This paper addresses the problem of efficiently mapping and classifying millions of degenerate and weighted sequences to a reference genome, based on whether they occur exactly once in the genome or not, and by taking into consideration probability scores.
Abstract: Novel high throughput sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences in a single experiment and with a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping and classifying millions of degenerate and weighted sequences to a reference genome, based on whether they occur exactly once in the genome or not, and by taking into consideration probability scores. In particular, we design parallel algorithms for Massive Exact and Approximate Unique Pattern Matching for degenerate and weighted sequences derived from high throughput sequencing technologies.

Proceedings Article
01 May 2009
TL;DR: This work introduces combinatorial problems involving overlays (non-overlapping substrings) and the covering of a text t by them and shows that decision problems of this type can be solved using an Aho-Corasick keyword automaton.
Abstract: Motivated by the identification of the musical structure of pop songs, we introduce combinatorial problems involving overlays (non-overlapping substrings) and the covering of a text t by them. We present four problems and suggest solutions based on string pattern matching techniques. We show that decision problems of this type can be solved using an Aho-Corasick keyword automaton. We conjecture that one general optimization problem of this type is NP-complete and introduce a simpler, more pragmatic optimization problem. We solve the latter using suffix trees and, finally, we suggest other open problems for further investigation.
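
One decision problem of this flavor — can a text t be exactly covered by non-overlapping occurrences of patterns from a set? — admits a simple dynamic program over prefix lengths. The Python sketch below is our illustration of that decision question under an exact-tiling reading of covering; the paper's solutions use an Aho-Corasick keyword automaton instead:

def tileable(t, patterns):
    # cover[i] is True iff the prefix t[0:i] can be tiled by the patterns.
    n = len(t)
    cover = [True] + [False] * n
    for i in range(1, n + 1):
        cover[i] = any(len(p) <= i and cover[i - len(p)] and t.endswith(p, 0, i)
                       for p in patterns)
    return cover[n]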

Proceedings ArticleDOI
03 Aug 2009
TL;DR: This paper defines and solves the Massive Exact and Approximate Unique Pattern Matching problem for degenerate and weighted sequences derived from high throughput sequencing technologies.
Abstract: High-throughput (or next-generation) sequencing technologies have opened new and exciting opportunities in the use of DNA sequences. The new emerging technologies mark the beginning of a new era of high throughput short read sequencing: they have the potential to assemble a bacterial genome during a single experiment and at a moderate cost. In this paper, we address the problem of efficiently mapping millions of degenerate and weighted sequences to a reference genome with respect to whether they occur exactly once in the genome or not, and by taking probability scores into consideration. In particular, we define and solve the Massive Exact and Approximate Unique Pattern Matching problem for degenerate and weighted sequences derived from high throughput sequencing technologies.
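
For the degenerate case, each read position admits a set of letters (e.g. IUPAC codes expanded to letter sets). A naive Python scan illustrates the exact-matching component (representation and names are our assumptions, not the paper's):

def degenerate_match_positions(genome, degen_read):
    # degen_read[j] is the set of letters allowed at position j.
    m = len(degen_read)
    return [i for i in range(len(genome) - m + 1)
            if all(genome[i + j] in degen_read[j] for j in range(m))]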

01 Jan 2009
TL;DR: In this article, it was shown that any digraph with out-degree at most d ≥ 2, diameter k ≥ 2 and order one or two less than the Moore bound must have all vertices of out-degree d.
Abstract: Since Moore digraphs do not exist for k ≠ 1 and d ≠ 1, the problem of finding digraphs of out-degree d ≥ 2, diameter k ≥ 2 and order close to the Moore bound becomes an interesting problem. To prove the non-existence of such digraphs or to assist in their construction (if they exist), we first may wish to establish some properties that such digraphs must possess. In this paper we consider the diregularity of such digraphs. It is easy to show that any digraph with out-degree at most d ≥ 2, diameter k ≥ 2 and order one or two less than the Moore bound must have all vertices of out-degree d. However, establishing the regularity or otherwise of the in-degree of such a digraph is not easy. In this paper we prove that all digraphs of defect two are either diregular or almost diregular. Additionally, in the case of defect one we present a new, simpler and shorter, proof that a digraph of defect one must be diregular, and in the case of defect two and for d = 2 and k ≥ 3, we present an alternative proof that a digraph of defect two must be diregular. This research was partly supported by the Leverhulme Visiting Professorship of the second author.
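
For reference (a standard fact, not stated in the abstract): the directed Moore bound for out-degree d and diameter k is $M_{d,k} = 1 + d + d^2 + \cdots + d^k$, and digraphs of defect one and defect two are those of order $M_{d,k} - 1$ and $M_{d,k} - 2$, respectively.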

Proceedings Article
31 Aug 2009
TL;DR: The algorithm presented here validates a one-dimensional image x of length n, over a given set of objects all of equal length and each composed of two parts separated by a transparent hole.
Abstract: A partially occluded image consists of a set of objects where some may be partially occluded by others. Validating occluded images distinguishes whether a given image can be covered by the members of a finite set of objects, where both the image and the objects range over an identical alphabet. The algorithm presented here validates a one-dimensional image x of length n, over a given set of objects all of equal length and each composed of two parts separated by a transparent hole.

Posted Content
TL;DR: An upper bound of 0.5n on the maximal number of highly periodic runs in a string of length n is shown, and a sequence of words achieving a lower bound of 0.406n is constructed.
Abstract: A run is a maximal occurrence of a repetition $v$ with a period $p$ such that $2p \le |v|$. The maximal number of runs in a string of length $n$ was studied by several authors and it is known to be between $0.944 n$ and $1.029 n$. We investigate highly periodic runs, in which the shortest period $p$ satisfies $3p \le |v|$. We show the upper bound $0.5n$ on the maximal number of such runs in a string of length $n$ and construct a sequence of words for which we obtain the lower bound $0.406 n$.
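
The definitions can be checked with a brute-force Python scan that, for each candidate period p, extends maximal periodic fragments and keeps those of length at least ratio · p (ratio 2 for runs, 3 for highly periodic runs); this is our illustration, not the paper's combinatorial argument:

def runs(s, ratio=2):
    # Returns the set of intervals [start, end) that are maximal fragments
    # with some period p and length >= ratio * p; for ratio=2 these are
    # exactly the runs, for ratio=3 the highly periodic runs.
    n, found = len(s), set()
    for p in range(1, n // ratio + 1):
        k = p
        while k < n:
            if s[k] == s[k - p]:
                start = k - p
                while k < n and s[k] == s[k - p]:
                    k += 1
                if k - start >= ratio * p:
                    found.add((start, k))
            k += 1
    return found

For example, runs("aabaabaa") yields the three squares of period 1 together with the interval (0, 8) of period 3, while runs(s, 3) restricts the output to the highly periodic runs studied here.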