Similarity relations and cover automata

RAIRO-Inf. Theor. Appl. 39 (2005) 115-123
DOI: 10.1051/ita:2005006
SIMILARITY RELATIONS AND COVER AUTOMATA
Jean-Marc Champarnaud$^{1}$, Franck Guingne$^{1,2}$ and Georges Hansel$^{1}$
Abstract. Cover automata for finite languages were much studied a few years ago. It turns out that a simple mathematical structure, namely similarity relations over a finite set of words, underlies these studies. In the present work, we investigate the properties of these relations in their own right, beyond the scope of finite languages. New results with straightforward proofs are obtained in this generalized framework, and previous results concerning cover automata are obtained as immediate consequences.
Mathematics Subject Classification. 68Q25, 68Q45, 68W01,
68W10.
Keywords and phrases. Finite automata, cover automaton for a finite language, similarity relation.
$^{1}$ LIFAR, Université de Rouen, France; jean-marc.champarnaud@univ-rouen.fr & franck.guingne@univ-rouen.fr & georges.hansel@univ-rouen.fr
$^{2}$ XRCE, Xerox, 38240 Meylan, France; franck.guingne@xrce.xerox.com
© EDP Sciences 2005

1. Introduction
Let $\Sigma$ be an alphabet and $\Sigma^{\le l}$ be the subset of words of $\Sigma^*$ whose length is not greater than the integer $l$. A relation $\sim$ over $\Sigma^{\le l}$ is semi-transitive if, given three words $x, y, z \in \Sigma^{\le l}$ such that $|x| \le |y| \le |z|$, transitivity holds when $x \sim y \wedge y \sim z$ or $y \sim x \wedge x \sim z$. In this paper, we present a general study of similarity relations over $\Sigma^{\le l}$, i.e. relations that are reflexive, symmetrical and semi-transitive. We show in particular that right invariant similarity relations are recognized by semiautomata and we characterize minimal semiautomata recognizing a given relation. We use these general properties to study cover automata for a finite language.
Cover automata were introduced by Câmpeanu, Sântean and Yu in [1]. A finite language $L$ is said to be of order $l$ if the length of a longest word in $L$ is equal to $l$. A cover automaton for a language $L$ of order $l$ is a deterministic automaton $A$ such that $L(A) \cap \Sigma^{\le l} = L$. Checking the membership of a word in $L$ with a cover automaton for $L$ only requires an additional test on the length of the word. Since covering generally reduces the size of an automaton [7], it is of practical interest to be able to compute a minimal cover automaton for $L$, that is, a cover automaton with a minimal number of states. It is shown in [1] that a minimal cover automaton can be obtained from any cover automaton for $L$ by merging states according to a state relation involving the right languages of the states. Minimality with respect to $L$ comes from the properties of the similarity relation over $\Sigma^{\le l}$ that underlies the state relation. This word relation, called $L$-similarity, was introduced by Kaneps and Freivalds [5] and Dwork and Stockmeyer [3].
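The membership test mentioned above (run the automaton, then check the length) can be illustrated on a toy instance. The following is only a sketch under stated assumptions: the automaton `DELTA`, the language `LANG = {ε, a, aaa}` and the helper names `accepts` and `in_language` are hypothetical and chosen for illustration, not taken from the paper.

```python
# Hypothetical 3-state deterministic automaton A over the one-letter alphabet {a}.
# State i is reached after reading a word whose length is i modulo 3.
DELTA = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 0}
START, FINALS = 0, {0, 1}

def accepts(word):
    """Run A on `word` and report whether it ends in a final state."""
    state = START
    for letter in word:
        state = DELTA[(state, letter)]
    return state in FINALS

def in_language(word, l):
    """Membership in L through a cover automaton: length test, then run A."""
    return len(word) <= l and accepts(word)

LANG, ORDER = {"", "a", "aaa"}, 3                   # assumed finite language of order 3
SHORT_WORDS = ["a" * n for n in range(ORDER + 1)]   # all words of length <= 3 over {a}

# A is a cover automaton for LANG: among words of length <= 3 it accepts exactly LANG.
covers = {w for w in SHORT_WORDS if accepts(w)} == LANG
```

In this instance the automaton also accepts longer words such as `aaaa`, which is why `in_language` needs the extra length test.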
In this paper, we show how a semiautomaton recognizing the $L$-similarity relation can be equipped with final states to yield a cover automaton for $L$. This leads to a characterization of minimal cover automata for a finite language.
Notice that several efficient algorithms have been designed for computing a minimal cover automaton, either from a deterministic automaton recognizing $L$, or from an arbitrary cover automaton for $L$. In [1], Câmpeanu, Sântean and Yu present an $O(n^4)$ time and space algorithm to minimize an $n$-state cover automaton for $L$. In [2], Câmpeanu, Păun and Yu provide an $O(n^2)$ time and space algorithm whose input is an $n$-state deterministic automaton recognizing $L$. In [6], Körner describes a Hopcroft-like algorithm with $O(n \log n)$ time and $O(n)$ space complexity that works on both types of input.
Section 2 is devoted to a general study of similarity relations over $\Sigma^{\le l}$ and Section 3 addresses the right invariance property. The connection between similarity relations and semiautomata is investigated in Section 4. The application of the study of similarity relations to the computation of a minimal cover automaton for a finite language is developed in Section 5.
2. Similarity relations over $\Sigma^{\le l}$
Let $l$ be an integer. In the following, $\Sigma^{\le l}$ denotes the subset of $\Sigma^*$ of words having a length not greater than $l$.
A relation $\sim$ over $\Sigma^{\le l}$ is semi-transitive iff for all $x, y, z$ in $\Sigma^{\le l}$ such that $|x| \le |y| \le |z|$, the following implications hold:
(i) $x \sim y$ and $y \sim z$ $\Rightarrow$ $x \sim z$,
(ii) $x \sim y$ and $x \sim z$ $\Rightarrow$ $y \sim z$.
A reflexive, symmetrical and semi-transitive relation is a similarity relation. In the following, the relation $\sim$ is supposed to be a similarity relation over $\Sigma^{\le l}$.
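The definition can be checked mechanically on a small instance. The sketch below is an illustration under stated assumptions: the helper names (`words_up_to`, `l_similar`, `is_similarity`) are hypothetical, the alphabet is the one-letter alphabet {a} with l = 3, and the relation tested is the L-similarity of the cover automata literature (x and y are L-similar when xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l), for the arbitrarily chosen language L = {ε, a, aaa}.

```python
from itertools import product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def is_similarity(words, sim):
    """True iff `sim` is reflexive, symmetrical and semi-transitive on `words`."""
    if not all(sim(x, x) for x in words):
        return False
    if not all(sim(x, y) == sim(y, x) for x in words for y in words):
        return False
    for x, y, z in product(words, repeat=3):
        if len(x) <= len(y) <= len(z):
            if sim(x, y) and sim(y, z) and not sim(x, z):  # condition (i)
                return False
            if sim(x, y) and sim(x, z) and not sim(y, z):  # condition (ii)
                return False
    return True

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

similarity_ok = is_similarity(WORDS, sim)
# Plain transitivity fails: "" ~ "aaa" and "aaa" ~ "a", yet "" and "a" are dissimilar.
transitive = all(sim(x, z) for x, y, z in product(WORDS, repeat=3)
                 if sim(x, y) and sim(y, z))
```

Semi-transitivity only constrains triples ordered by length, which is why the longest word `aaa` may be similar to two shorter words that are dissimilar to each other.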
Two words $x$ and $y$ are similar (resp. dissimilar) if $x \sim y$ (resp. $x \not\sim y$). A similarity set (resp. a dissimilarity set) is a subset of pairwise similar (resp. pairwise dissimilar) elements of $\Sigma^{\le l}$. A dissimilarity set is maximal if its cardinality is maximal among dissimilarity sets. A partition of $\Sigma^{\le l}$ all of whose classes are similarity sets is called a similarity partition. A similarity partition is minimal if its cardinality is minimal among similarity partitions. Two similarity sets $S$ and $T$ are said to be mergeable if $S \cup T$ is a similarity set. Hence the partition resulting from merging two mergeable classes of a similarity partition is again a similarity partition.
An element $x \in \Sigma^{\le l}$ is minimal if for all $y \in \Sigma^{\le l}$, we have
$$y \sim x \Rightarrow |y| \ge |x|.$$
We denote by $M$ the set of all minimal elements of $\Sigma^{\le l}$.
Proposition 2.1.
1) The restriction of the relation $\sim$ to $M$ is an equivalence relation.
2) For all $x \in \Sigma^{\le l}$, there exists at least one minimal element similar to $x$.
Proof.
1) It follows from the very definition of minimal elements that two similar minimal elements have the same length. Consequently, by Condition (i), when restricted to $M$, the relation $\sim$ is transitive.
2) Let $x \in \Sigma^{\le l}$. Let $y$ be an element of smallest length among all elements similar to $x$. It follows from Condition (i) that $y$ is a minimal element: if some $w$ with $|w| < |y|$ were similar to $y$, then $|w| \le |y| \le |x|$ would give $w \sim x$, contradicting the choice of $y$.
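As a quick sanity check of Proposition 2.1, the minimal elements can be computed by brute force on a toy instance. This is only a sketch: the helper names are hypothetical, and the relation used is the L-similarity of the cover automata literature (assumed definition: xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l) for the arbitrarily chosen language L = {ε, a, aaa} over {a}, with l = 3.

```python
from itertools import product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def minimal_elements(words, sim):
    """Words x such that every y similar to x satisfies |y| >= |x|."""
    return [x for x in words if all(len(y) >= len(x) for y in words if sim(y, x))]

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

# "aaa" is not minimal here because it is similar to the shorter word "".
M = minimal_elements(WORDS, sim)

# Proposition 2.1(1): restricted to M, the relation is transitive.
transitive_on_M = all(sim(x, z) for x, y, z in product(M, repeat=3)
                      if sim(x, y) and sim(y, z))
# Proposition 2.1(2): every word is similar to at least one minimal element.
every_word_covered = all(any(sim(m, x) for m in M) for x in WORDS)
```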
Let us fix some notation. We denote by $\pi_M = \{M_1, \ldots, M_k\}$ the partition of $M$ into equivalence classes and by $C = \{c_1, \ldots, c_k\}$ a cross-section of $\pi_M$, i.e. $c_i \in M_i$ for all $i = 1, \ldots, k$. For all $x \in M$, let us denote by $S_x$ the similarity set of all the elements similar to $x$. Finally, for all $i = 1, \ldots, k$, let us set
$$T_i = S_{c_i} \setminus \bigcup_{j=1}^{i-1} S_{c_j} \quad \text{and} \quad T'_i = S_{c_i} \setminus \bigcup_{j \ne i} S_{c_j}.$$
Remark 2.2. It follows from Condition (i) that if $x$ and $x'$ are similar minimal elements, then $S_x = S_{x'}$. Moreover it follows from Proposition 2.1(2) that
$$\bigcup_{x \in M} S_x = \Sigma^{\le l}.$$
Proposition 2.3.
1) The set $C$ is a maximal dissimilarity set.
2) Any minimal similarity partition has $k$ elements and $\{T_1, \ldots, T_k\}$ is such a minimal similarity partition.
Proof.
1) Being a cross-section of $\pi_M$, the set $C$ is a dissimilarity set. Let $D$ be any dissimilarity set. Suppose that $|D| > |C|$. Then it follows from Proposition 2.1(2) and Remark 2.2, by the pigeonhole principle, that there exist two distinct elements $y$ and $z$ in $D$ similar to a same element $c$ of $C$. Since $c$ is a minimal element, $|y| \ge |c|$ and $|z| \ge |c|$ and therefore, by Condition (ii), we get that $y$ and $z$ are similar, a contradiction.
2) Let $\pi$ be a similarity partition of $\Sigma^{\le l}$. Different elements of $C$ belong to different elements of $\pi$. Hence $\pi$ has at least $k$ elements. It remains only to observe that $\{T_1, \ldots, T_k\}$ is a similarity partition (cf. Rem. 2.2).
The following proposition gives a complete characterization of maximal dissimilarity sets.
Proposition 2.4. Let $D$ be a subset of $\Sigma^{\le l}$. The following conditions are equivalent:
1) $D$ is a maximal dissimilarity set.
2) $|D| = k$ and, for all $i = 1, \ldots, k$, there exists one and only one element $d_i \in D$ such that $d_i \in T'_i$.
Proof. 1) $\Rightarrow$ 2) Since $D$ is a maximal dissimilarity set, it follows from Proposition 2.3(1) that $|D| = |C| = k$. By Proposition 2.1(2), we can choose for all $d \in D$ a minimal element $f(d) \in C$ such that $f(d) \sim d$. Let $d, d'$ be two elements of $D$ and suppose that $f(d) = f(d')$. It follows from Condition (ii) that $d \sim d'$ and, since $D$ is a dissimilarity set, we get that $d = d'$. Hence the mapping $d \mapsto f(d)$ is one-to-one and onto. Let $d_i = f^{-1}(c_i)$, $i = 1, \ldots, k$. Then $d_i \sim c_i$ and $d_i \not\sim c_j$ for $j \ne i$ (otherwise we would get $d_i \sim d_j$). Hence $d_i \in T'_i$, $i = 1, \ldots, k$, and 2) is satisfied.
2) $\Rightarrow$ 1) It suffices to observe that according to the definition of the sets $T'_i$, $i = 1, \ldots, k$, we get that $D = \{d_1, \ldots, d_k\}$ is a dissimilarity set.
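Proposition 2.4 can be cross-checked by brute force on a toy instance: the dissimilarity sets of largest cardinality should be exactly the sets obtained by picking one element in each set S_{c_i} minus the union of the S_{c_j} with j ≠ i. The sketch below is an illustration under stated assumptions: all function names are hypothetical, and the relation is the L-similarity of the cover automata literature (assumed definition: xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l) for the arbitrarily chosen language L = {ε, a, aaa} over {a}, with l = 3.

```python
from itertools import combinations, product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def maximum_dissimilarity_sets(words, sim):
    """Brute force: all dissimilarity sets of the largest possible cardinality."""
    for size in range(len(words), 0, -1):
        found = {frozenset(D) for D in combinations(words, size)
                 if all(not sim(x, y) for x, y in combinations(D, 2))}
        if found:
            return found
    return set()

def strict_classes(words, C, sim):
    """For each c_i: S_{c_i} minus the union of the S_{c_j} with j != i."""
    S = {c: {x for x in words if sim(x, c)} for c in C}
    return [S[c] - set().union(*(S[d] for d in C if d != c)) for c in C]

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

# Minimal elements, then one representative per class (a cross-section C).
M = [x for x in WORDS if all(len(y) >= len(x) for y in WORDS if sim(y, x))]
C = []
for m in M:
    if not any(sim(m, c) for c in C):
        C.append(m)

T_STRICT = strict_classes(WORDS, C, sim)
predicted = {frozenset(choice) for choice in product(*T_STRICT)}
brute_force = maximum_dissimilarity_sets(WORDS, sim)
```

The brute-force enumeration is exponential in the number of words and is only meant for very small instances.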
Corollary 2.5. Let $D$ be a dissimilarity set. The following conditions are equivalent:
1) $D$ is a maximal dissimilarity set.
2) $D$ is a cross-section of a similarity partition of $\Sigma^{\le l}$.
Proof. 1) $\Rightarrow$ 2) According to Proposition 2.4, $D = \{d_1, \ldots, d_k\}$, with $d_i \in T'_i$ for all $i = 1, \ldots, k$. Since $T'_i \subseteq T_i$, $D$ is a cross-section of the similarity partition $\{T_1, \ldots, T_k\}$.
2) $\Rightarrow$ 1) Let $\pi = \{U_1, \ldots, U_p\}$ be a similarity partition of $\Sigma^{\le l}$ of which $D$ is a cross-section. Since $\pi$ is a similarity partition, we have $p \ge k$ and since $D$ is a dissimilarity set, we have $p \le k$. Hence $|D| = p = k$. We can assume that $c_i \in U_i$ for all $i = 1, \ldots, k$ and denote by $d_i$ the unique element of $D \cap U_i$. Since $D$ is a dissimilarity set, we get that $d_i \sim c_j$ if and only if $i = j$. Hence $d_i \in T'_i$ for all $i = 1, \ldots, k$ and $D$ is a maximal dissimilarity set (cf. Prop. 2.4).
Lemma 2.6. Let $S$ and $T$ be two similarity sets. Let $s$ (resp. $t$) be one of the smallest elements of $S$ (resp. $T$). The following conditions are equivalent:
1) $S$ and $T$ are mergeable;
2) $s$ and $t$ are similar.
Proof. 1) $\Rightarrow$ 2) is obvious. Let us prove that 2) $\Rightarrow$ 1). Suppose that $|s| \le |t|$. Let $y$ be an element of $S$ and $z$ be an element of $T$. Since $|s| \le |t| \le |z|$, by Condition (i) we get that $s \sim z$. Consequently, since $|s| \le |y|$ and $|s| \le |z|$, by Condition (ii) we get that $y \sim z$. Hence $S$ and $T$ are mergeable.
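Lemma 2.6 reduces the mergeability test to a single comparison between shortest elements. The sketch below compares this shortcut with the definition on every pair of similarity sets of a toy instance. It is only an illustration: the function names are hypothetical, and the relation is the L-similarity of the cover automata literature (assumed definition: xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l) for the arbitrarily chosen language L = {ε, a, aaa} over {a}, with l = 3.

```python
from itertools import combinations, product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def is_similarity_set(S, sim):
    """All elements of S are pairwise similar."""
    return all(sim(x, y) for x in S for y in S)

def mergeable(S, T, sim):
    """Criterion 2) of Lemma 2.6: compare one shortest element of each set.

    Only meaningful when S and T are themselves similarity sets.
    """
    return sim(min(S, key=len), min(T, key=len))

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

# Every non-empty similarity set of the instance (brute force).
SIM_SETS = [set(S) for n in range(1, len(WORDS) + 1)
            for S in combinations(WORDS, n) if is_similarity_set(S, sim)]

# The shortcut agrees with the definition "S union T is a similarity set".
agrees = all(mergeable(S, T, sim) == is_similarity_set(S | T, sim)
             for S in SIM_SETS for T in SIM_SETS)
```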

Theorem 2.7. Let $\pi$ be a similarity partition of $\Sigma^{\le l}$. The following conditions are equivalent:
1) $\pi$ is a minimal similarity partition.
2) $\pi$ admits a maximal dissimilarity cross-section.
3) $\pi$ admits a dissimilarity cross-section.
4) $\pi$ cannot be reduced by merging elements.
Proof. 1) $\Rightarrow$ 2) Since $\pi$ is minimal, it has $k$ elements (cf. Prop. 2.3) and consequently the set $C$ is a maximal dissimilarity cross-section of $\pi$.
2) $\Rightarrow$ 3) is obvious.
3) $\Rightarrow$ 1) Let $D$ be a dissimilarity cross-section of $\pi$. It follows from Corollary 2.5 that $D$ is a maximal dissimilarity set. Hence $D$ has $k$ elements and $\pi$ is minimal.
Thus we have already shown that 1) $\Leftrightarrow$ 2) $\Leftrightarrow$ 3). The implication 1) $\Rightarrow$ 4) is obvious and it follows from Lemma 2.6 that 4) $\Rightarrow$ 3): if no two classes are mergeable, choosing one smallest element in each class yields a dissimilarity cross-section. The proof is complete.
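Condition 4) suggests a simple procedure: start from the partition into singletons and merge classes until no two classes are mergeable; by the theorem the result is a minimal similarity partition. The sketch below is a naive illustration of that idea, not one of the algorithms cited in the introduction. The function names are hypothetical, and the relation is the L-similarity of the cover automata literature (assumed definition: xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l) for the arbitrarily chosen language L = {ε, a, aaa} over {a}, with l = 3.

```python
from itertools import combinations, product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def merge_until_stuck(words, sim):
    """Merge classes pairwise (test of Lemma 2.6) until no merge is possible."""
    parts = [{w} for w in words]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(parts)), 2):
            if sim(min(parts[i], key=len), min(parts[j], key=len)):
                parts[i] |= parts[j]   # i < j, so deleting index j is safe
                del parts[j]
                merged = True
                break                  # restart the scan on the new partition
    return parts

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

PARTS = merge_until_stuck(WORDS, sim)

# k = number of classes of minimal elements, computed independently.
M = [x for x in WORDS if all(len(y) >= len(x) for y in WORDS if sim(y, x))]
C = []
for m in M:
    if not any(sim(m, c) for c in C):
        C.append(m)
K = len(C)

# Condition 3): the shortest elements of the classes are pairwise dissimilar.
REPS = [min(p, key=len) for p in PARTS]
reps_dissimilar = all(not sim(x, y) for x, y in combinations(REPS, 2))
```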
3. Right invariant similarity relations
A similarity relation $\sim$ over $\Sigma^{\le l}$ is right invariant if $x \sim y \Rightarrow xz \sim yz$, for all $z \in \Sigma^*$ such that $|xz|, |yz| \le l$. A similarity partition $(U_i)$ is right invariant with respect to $\sim$ if the conditions $x, y \in U_i$, $|xz| \le l$, $|yz| \le l$, and $xz \in U_j$ imply $yz \in U_j$.
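Both notions of right invariance can be tested by brute force on small instances. The sketch below is an illustration under stated assumptions: the function names are hypothetical; the positive example uses the L-similarity of the cover automata literature (assumed definition: xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l) for the arbitrarily chosen language L = {ε, a, aaa} over {a} with l = 3, and the negative example is an ad hoc relation `toy` on words of length at most 2 in which only ε and a are similar.

```python
from itertools import product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def relation_is_right_invariant(words, sim, alphabet, l):
    """x ~ y must imply xz ~ yz for every z with |xz|, |yz| <= l."""
    for x, y in product(words, repeat=2):
        if sim(x, y):
            room = l - max(len(x), len(y))
            if not all(sim(x + z, y + z) for z in words_up_to(alphabet, room)):
                return False
    return True

def partition_is_right_invariant(parts, alphabet, l):
    """x, y in U_i, |xz|, |yz| <= l and xz in U_j must imply yz in U_j."""
    block_of = {x: i for i, U in enumerate(parts) for x in U}
    for U in parts:
        for x, y in product(U, repeat=2):
            room = l - max(len(x), len(y))
            for z in words_up_to(alphabet, room):
                if block_of[x + z] != block_of[y + z]:
                    return False
    return True

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

def toy(x, y):
    """Ad hoc similarity relation on words of length <= 2: only "" and "a" are similar."""
    return x == y or {x, y} == {"", "a"}

good_relation = relation_is_right_invariant(WORDS, sim, ALPHABET, ORDER)
good_partition = partition_is_right_invariant([{"", "aaa"}, {"a"}, {"aa"}], ALPHABET, ORDER)
bad_relation = relation_is_right_invariant(words_up_to("a", 2), toy, "a", 2)
bad_partition = partition_is_right_invariant([{"", "a"}, {"aa"}], "a", 2)
```

The relation `toy` fails because ε and a are similar while their extensions a and aa are not.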
Proposition 3.1. Let $\sim$ be a right invariant similarity relation over $\Sigma^{\le l}$. Then there exists a minimal right invariant similarity partition.
Proof. First we define a mapping $(c, a) \mapsto c \cdot a$ from $C \times \Sigma$ to $C$ by defining $c \cdot a$ as any element $c' \in C$ that is similar to the word $ca$. This mapping is then inductively extended to a mapping $(c, x) \mapsto c \cdot x$ from $C \times \Sigma^*$ to $C$ by setting
$$c \cdot x = \begin{cases} c & \text{if } x = \varepsilon, \\ (c \cdot y) \cdot a & \text{if } x = ya. \end{cases}$$
Now we construct a partition of $\Sigma^{\le l}$ denoted $\{U_1, \ldots, U_k\}$, with $k = |C|$, by fixing, for all $x \in \Sigma^{\le l}$, to which set $U_i$ it belongs. Remark that the empty word $\varepsilon$ belongs to $C$ and we can suppose that $\varepsilon = c_1$. Then we set
$$x \in U_i \Leftrightarrow \varepsilon \cdot x = c_i.$$
Let us first inductively check that $x \in U_i \Rightarrow x \sim c_i$ and hence that $(U_i)$ is a similarity partition. By definition $\varepsilon \in U_1$ and trivially $\varepsilon = c_1 \sim c_1$. Suppose that $x = ya$ with $y \in U_j$ and $x \in U_i$. By the induction hypothesis, $y \sim c_j$. Since the relation $\sim$ is right invariant we get that $x \sim c_j a$. On the other hand
$$\varepsilon \cdot x = (\varepsilon \cdot y) \cdot a = c_j \cdot a = c_i.$$
By definition of $c_j \cdot a$, we have $c_j a \sim c_j \cdot a$. Thus we have $x \sim c_j a$ and $c_j a \sim c_j \cdot a$. But $|x| \ge |c_j a|$ and $|c_j a| \ge |c_j \cdot a|$. Hence $x \sim c_j \cdot a$, i.e. $x \sim c_i$.
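The construction used in this proof (follow the mapping c · a letter by letter from the empty word and group words by the element of C that is reached) can be sketched as follows. This is an illustration only: the function names are hypothetical, "any element of C similar to ca" is resolved by taking the first match, and the relation is the L-similarity of the cover automata literature (assumed definition: xz ∈ L ⇔ yz ∈ L for every z with |xz|, |yz| ≤ l) for the arbitrarily chosen language L = {ε, a, aaa} over {a}, with l = 3.

```python
from itertools import product

def words_up_to(alphabet, l):
    """All words over `alphabet` of length at most l, shortest first."""
    return ["".join(p) for n in range(l + 1) for p in product(alphabet, repeat=n)]

def l_similar(x, y, L, alphabet, l):
    """Assumed L-similarity: xz in L <=> yz in L whenever |xz|, |yz| <= l."""
    room = l - max(len(x), len(y))
    return all((x + z in L) == (y + z in L) for z in words_up_to(alphabet, room))

def classes_from_cross_section(alphabet, l, C, sim):
    """Group each word x of length <= l under the element of C reached by eps . x."""
    def step(c, a):
        # c . a: an element of C similar to the word ca (first match chosen here).
        return next(d for d in C if sim(d, c + a))

    reached = {"": ""}                       # eps . eps = eps, and eps belongs to C
    for x in words_up_to(alphabet, l)[1:]:   # shortest first: x[:-1] is already known
        reached[x] = step(reached[x[:-1]], x[-1])
    classes = {c: set() for c in C}
    for x, c in reached.items():
        classes[c].add(x)
    return classes

ALPHABET, ORDER, LANG = "a", 3, {"", "a", "aaa"}
WORDS = words_up_to(ALPHABET, ORDER)

def sim(x, y):
    return l_similar(x, y, LANG, ALPHABET, ORDER)

# Minimal elements, then one representative per class (a cross-section C).
M = [x for x in WORDS if all(len(y) >= len(x) for y in WORDS if sim(y, x))]
C = []
for m in M:
    if not any(sim(m, c) for c in C):
        C.append(m)

CLASSES = classes_from_cross_section(ALPHABET, ORDER, C, sim)

# Every word is similar to the representative of its class, as the induction shows.
each_word_similar_to_rep = all(sim(x, c) for c, U in CLASSES.items() for x in U)
```

In this instance the letter a sends ε to a, a to aa and aa back to ε, so the classes are indexed by the three elements of C.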

References
A time complexity gap for two-way probabilistic finite-state automata (journal article).
Minimal cover-automata for finite languages (journal article).
Running Time to Recognize Nonregular Languages by 2-Way Probabilistic Automata (book chapter).
A time and space efficient algorithm for minimizing cover automata for finite languages (journal article).