
Grammars Based on the Shuffle Operation

TLDR
Six classes of generative mechanisms producing languages by starting from a finite set of words and shuffling the current words with words in given sets are considered, with most of the corresponding six families of languages found to be incomparable.
Abstract
We consider generative mechanisms producing languages by starting from a finite set of words and shuffling the current words with words in given sets, depending on certain conditions. Namely, regular and finite sets are given for controlling the shuffling: strings are shuffled only to strings in associated sets. Six classes of such grammars are considered, with the shuffling being done on a leftmost position, on a prefix, arbitrarily, globally, in parallel, or using a maximal selector. Most of the corresponding six families of languages, obtained for finite, respectively for regular selection, are found to be incomparable. The relations of these families with Chomsky language families are briefly investigated.


Journal of Universal Computer Science, vol. 1, no. 1 (1995), 67-82
submitted: 8/11/94, accepted: 10/1/95, appeared: 28/1/95, Springer Pub. Co.
Grammars Based on the Shuffle Operation
Gheorghe Paun
Institute of Mathematics of the Romanian Academy of Sciences
PO Box 1-764, Bucuresti, Romania
Grzegorz Rozenberg
University of Leiden, Department of Computer Science
Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
Arto Salomaa
Academy of Finland and University of Turku
Department of Mathematics, 20500 Turku, Finland
Abstract: We consider generative mechanisms producing languages by starting from a finite set of words and shuffling the current words with words in given sets, depending on certain conditions. Namely, regular and finite sets are given for controlling the shuffling: strings are shuffled only to strings in associated sets. Six classes of such grammars are considered, with the shuffling being done on a leftmost position, on a prefix, arbitrarily, globally, in parallel, or using a maximal selector. Most of the corresponding six families of languages, obtained for finite, respectively for regular selection, are found to be incomparable. The relations of these families with Chomsky language families are briefly investigated.
Key Words: Shuffle operation, Chomsky grammars, L systems
Categories: F4.2 [Mathematical Logic and Formal Languages]: Grammars and other Rewriting Systems: Grammar types, F4.3 [Mathematical Logic and Formal Languages]: Formal Languages: Operations on languages
1 Introduction
In formal language theory, besides the two basic types of grammars, the Chomsky grammars and the Lindenmayer systems, there are many "exotic" classes of generative devices, based not on the process of rewriting symbols (or strings) by strings, but using various operations of adjoining strings. We quote here the string adjunct grammars in [7], the semi-contextual grammars in [3], the -grammars in [10], the pattern grammars in [2], and the contextual grammars [8]. The starting point of the present paper is this last class of grammars, via the paper [9], where a variant of contextual grammars based on the shuffle operation was introduced. Basically, one gives two finite sets of strings, $B$ and $C$, over some alphabet, and one considers the set of strings obtained by starting from $B$ and iteratively shuffling strings from $C$, without any restriction. This corresponds to the simple contextual grammars in [8].
We consider here a sort of counterpart of the contextual grammars with choice, where the adjoining is controlled by a selection mapping. In fact, we proceed in a way similar to that used in conditional contextual grammars [12], [13], [14] and in modular contextual grammars [15]: we start with several pairs of the form $(R_i, C_i)$, with $R_i$ a language (we consider here only the cases when $R_i$ is regular or finite) and $C_i$ a finite set of strings, and allow the strings in $C_i$ to be shuffled only to strings in $R_i$. Depending on the place of the string in $R_i$ in the current string generated by our grammar, we can distinguish several types of grammars: prefix (the string in $R_i$ appears at the left-hand end of the processed string, as a prefix of it), leftmost (we look for the leftmost possible occurrence of a string in $R_i$), arbitrary (no condition on the place where the string in $R_i$ appears), global (the whole current string is in $R_i$), and parallel (the current string is partitioned into strings in $R_i$). An interesting variant is to use a substring as base for shuffling only when it is maximal. Twelve families of languages are obtained in this way. Their study is the subject of this paper.
It is worth noting that the shuffle operation appears in various contexts in algebra and in formal language theory; we quote only [1], [4], [5], [6] (and [11], for applications). In [1], [6] the operation is used in a generative-like way, for identifying families of languages of the form $FAM(\cup, \cdot, \mathrm{Shuf}, FIN)$, the smallest family of languages containing the finite languages and closed under union, concatenation and shuffle. (From this point of view, the family investigated in [9] is a particular case, $FAM(\mathrm{Shuf}, FIN)$.)

The shuffle of the symbols of two words is also related to the concurrent execution of two processes described by these words, hence our models can be interpreted in terms of concurrent processes, too. For instance, the prefix mode of work corresponds to the concurrent execution of a process strictly at the beginning of another process, the "beginning" being defined modulo a regular language; in the global case the elementary actions of the two processes can be freely intercalated.
As expected, the languages generated by shuffle grammars are mostly incomparable with Chomsky languages (due to the fact that we do not use nonterminals). Somewhat surprising is the fact that in the regular selection case the five modes of work described above, with only one exception, cannot simulate each other (the corresponding families of languages are incomparable).
2 Classes of shuffle grammars
As usual, $V^*$ denotes the set of all words over the alphabet $V$, the empty word is denoted by $\lambda$, the length of $x \in V^*$ by $|x|$, and the set of non-empty words over $V$ is identified by $V^+$. The number of occurrences of a symbol $a$ in a string $x$ will be denoted by $|x|_a$. For basic notions in formal language theory (Chomsky grammars and L systems) we refer to [16], [17]. We only mention that $FIN$, $REG$, $CF$, $CS$, $0L$ are the families of finite, regular, context-free, context-sensitive, and 0L languages, respectively.
For $x, y \in V^*$ we define the shuffle (product) of $x, y$, denoted $x \sqcup\!\sqcup y$, as
$$x \sqcup\!\sqcup y = \{ x_1 y_1 x_2 y_2 \ldots x_n y_n \mid x = x_1 x_2 \ldots x_n,\ y = y_1 y_2 \ldots y_n,\ x_i, y_i \in V^*,\ 1 \le i \le n,\ n \ge 1 \}.$$
Various properties of this operation, such as commutativity and associativity, will be implicitly used in the sequel. The operation $\sqcup\!\sqcup$ is extended in the natural way to languages,
$$L_1 \sqcup\!\sqcup L_2 = \{ z \mid z \in x \sqcup\!\sqcup y,\ x \in L_1,\ y \in L_2 \},$$
and iterated,
$$L^{(0)} = \{\lambda\}, \qquad L^{(i+1)} = L^{(i)} \sqcup\!\sqcup L,\ i \ge 0, \qquad L^{\sqcup\!\sqcup} = \bigcup_{i \ge 0} L^{(i)}.$$
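The shuffle of two words, its extension to languages, and a finite truncation of the iterated shuffle can be sketched in Python as follows. This is a minimal illustration, not part of the original paper: the function names `shuffle`, `shuffle_lang` and `iterated_shuffle` and the cut-off parameter `max_iter` are assumptions introduced here, and the empty word is represented by the empty string.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shuffle(x: str, y: str) -> frozenset:
    """All interleavings of x and y that preserve the letter order of each word."""
    if not x:
        return frozenset({y})
    if not y:
        return frozenset({x})
    # The first letter of the result comes either from x or from y.
    return frozenset({x[0] + z for z in shuffle(x[1:], y)} |
                     {y[0] + z for z in shuffle(x, y[1:])})


def shuffle_lang(l1, l2):
    """Shuffle of two finite languages, given as sets of strings."""
    return {z for x in l1 for y in l2 for z in shuffle(x, y)}


def iterated_shuffle(lang, max_iter):
    """Union of L^(0), ..., L^(max_iter): a finite truncation of the full iteration."""
    level = {""}            # L^(0) contains only the empty word
    result = set(level)
    for _ in range(max_iter):
        level = shuffle_lang(level, lang)   # L^(i+1) = L^(i) shuffled with L
        result |= level
    return result
```

For instance, `shuffle("ab", "c")` yields the three words `abc`, `acb`, `cab`. The recursive formulation is equivalent to the factorization-based definition because the factors $x_i, y_i$ are allowed to be empty.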
The grammars considered in [9] are triples of the form $G = (V, B, C)$, where $V$ is an alphabet and $B$ and $C$ are finite languages over $V$. The language generated by $G$ is defined as the smallest language $L$ over $V$ containing $B$ and having the property that if $x \in L$ and $u \in C$, then $x \sqcup\!\sqcup u \subseteq L$. (Therefore, this language is equal to $B \sqcup\!\sqcup C^{\sqcup\!\sqcup}$.)
There is no restriction in [9] on the shuffling of elements of $C$ to current strings. Such a control of the grammar work can be done in various ways of introducing a context-dependency. We use here the following natural idea:
Definition 1. A shuffle grammar is a construct
$$G = (V, B, (R_1, C_1), \ldots, (R_n, C_n)),$$
where $V$ is an alphabet, $B$ is a finite language over $V$, $R_i$ are languages over $V$ and $C_i$ are finite languages over $V$, $1 \le i \le n$.
The parameter $n \ge 1$ is called the degree of $G$. If $R_i$ are languages in a given family $F$, then we say that $G$ is with $F$ choice. Here we consider only the cases $F = FIN$ and $F = REG$.
The idea is to allow the strings in $C_i$ to be shuffled only to strings in the corresponding set $R_i$. The sets $R_i$ are called selectors.

Definition 2. For a shuffle grammar $G$ as above, a constant $i$, $1 \le i \le n$, and two strings $x, y$ in $V^*$, we define the following derivation relations:
$x \Rightarrow^{arb}_i y$ iff $x = x_1 x_2 x_3$, $x_1, x_3 \in V^*$, $x_2 \in R_i$, $y = x_1 x_2' x_3$, for some $x_2' \in x_2 \sqcup\!\sqcup u$, $u \in C_i$;
$x \Rightarrow^{pr}_i y$ iff $x = x_1 x_2$, $x_2 \in V^*$, $x_1 \in R_i$, $y = x_1' x_2$, for some $x_1' \in x_1 \sqcup\!\sqcup u$, $u \in C_i$;
$x \Rightarrow^{lm}_i y$ iff $x = x_1 x_2 x_3$, $x_1, x_3 \in V^*$, $x_2 \in R_i$, $y = x_1 x_2' x_3$, for some $x_2' \in x_2 \sqcup\!\sqcup u$, $u \in C_i$, and there is no $j$, $1 \le j \le n$, such that $x = v_1 v_2 v_3$, $|v_1| < |x_1|$, $v_2 \in R_j$;
$x \Rightarrow^{gl}_i y$ iff $x \in R_i$, $y \in x \sqcup\!\sqcup u$, for some $u \in C_i$.
These derivation relations are called arbitrary, prefix, leftmost, and global derivations, respectively. Moreover, we define the parallel derivation as
$x \Rightarrow^{pl} y$ iff $x = x_1 x_2 \ldots x_k$, $k \ge 1$, $y = x_1' x_2' \ldots x_k'$, for $x_i \in R_{j_i}$, $x_i' \in x_i \sqcup\!\sqcup u_i$, $u_i \in C_{j_i}$, $1 \le j_i \le n$, $1 \le i \le k$.
We denote $M = \{arb, pr, lm, gl, pl\}$.
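To make the five derivation modes concrete, here is a minimal Python sketch of a single derivation step in each mode. It is an illustration under stated assumptions, not the authors' formalism in code: selectors are restricted to finite sets of strings (finite choice), the parallel step additionally assumes that no selector contains the empty word, and all function names (`interleavings`, `occurrences`, `rewrite`, `step_arb`, `step_pr`, `step_lm`, `step_gl`, `step_pl`) are hypothetical. A grammar's components are passed as a list `comps` of pairs `(R, C)`, indexed from 0.

```python
def interleavings(x, y):
    """The shuffle of two words: all order-preserving interleavings."""
    if not x or not y:
        return {x + y}
    return ({x[0] + z for z in interleavings(x[1:], y)} |
            {y[0] + z for z in interleavings(x, y[1:])})


def occurrences(x, R):
    """All (start, end) such that the factor x[start:end] is in the finite selector R."""
    return [(s, e) for s in range(len(x) + 1)
            for e in range(s, len(x) + 1) if x[s:e] in R]


def rewrite(x, s, e, C):
    """Shuffle each u in C into the factor x[s:e], leaving the rest of x unchanged."""
    return {x[:s] + z + x[e:] for u in C for z in interleavings(x[s:e], u)}


def step_arb(x, comps, i):
    """Arbitrary mode: any occurrence of a selector string may be used."""
    R, C = comps[i]
    return set().union(*(rewrite(x, s, e, C) for s, e in occurrences(x, R)))


def step_pr(x, comps, i):
    """Prefix mode: the selector string must be a prefix of x."""
    R, C = comps[i]
    return set().union(*(rewrite(x, s, e, C)
                         for s, e in occurrences(x, R) if s == 0))


def step_lm(x, comps, i):
    """Leftmost mode: no selector of any component may occur further to the left."""
    R, C = comps[i]
    starts = [s for Rj, _ in comps for s, _ in occurrences(x, Rj)]
    if not starts:
        return set()
    first = min(starts)
    return set().union(*(rewrite(x, s, e, C)
                         for s, e in occurrences(x, R) if s == first))


def step_gl(x, comps, i):
    """Global mode: the whole current string must be in the selector."""
    R, C = comps[i]
    return rewrite(x, 0, len(x), C) if x in R else set()


def step_pl(x, comps):
    """Parallel mode: split x into selector strings and rewrite every factor at once.

    Assumes no selector contains the empty word, so all factors are non-empty.
    """
    def go(pos):
        if pos == len(x):
            return {""}
        out = set()
        for R, C in comps:
            for e in range(pos + 1, len(x) + 1):
                if x[pos:e] in R:
                    heads = {z for u in C for z in interleavings(x[pos:e], u)}
                    out |= {h + t for h in heads for t in go(e)}
        return out
    return go(0) if x else set()
```

With `comps = [({"a"}, {"", "a"}), ({"b"}, {"", "b"})]`, the arbitrary step from `aa` with the first component gives `aa` and `aaa`, while the parallel step gives `aa`, `aaa` and `aaaa`, since both factors `a` are rewritten simultaneously.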
Definition 3. The language generated by a shuffle grammar $G$ in the mode $f \in M - \{pl\}$ is defined as follows:
$$L_f(G) = B \cup \{ x \in V^* \mid w \Rightarrow^{f}_{i_1} w_1 \Rightarrow^{f}_{i_2} \ldots \Rightarrow^{f}_{i_m} w_m = x,\ w \in B,\ 1 \le i_j \le n,\ 1 \le j \le m,\ m \ge 1 \}.$$
For the parallel mode of derivation we define
$$L_{pl}(G) = B \cup \{ x \in V^* \mid w \Rightarrow^{pl} w_1 \Rightarrow^{pl} \ldots \Rightarrow^{pl} w_m = x,\ w \in B,\ m \ge 1 \}.$$
The corresponding families of languages generated by shuffle grammars with $F$ choice, $F \in \{FIN, REG\}$, are denoted by $ARB(F)$, $PR(F)$, $LM(F)$, $GL(F)$, $PL(F)$, respectively; by $SL$ we denote the family of languages generated by grammars as in [9], without choice. Subscripts $n$ can be added to $ARB$, $PR$, $LM$, $GL$, $PL$ when only languages generated by grammars of degree at most $n$, $n \ge 1$, are considered.
3 The generative capacity of shuffle grammars
From the definitions we have the inclusions $X_n(F) \subseteq X_{n+1}(F)$, $X_n(FIN) \subseteq X_n(REG)$, for all $n \ge 1$, $X \in \{ARB, PR, LM, GL, PL\}$ and $F \in \{FIN, REG\}$.
Every family $ARB(FIN)$, $PR(FIN)$, $LM(FIN)$, $GL(FIN)$, $PL(FIN)$ contains each finite language. This is obvious, because for $G = (V, B, (B, \{\lambda\}))$ we have $L_f(G) = B$ for all $f$. In fact, we have
Theorem 1.
(i) Every family $ARB_1(REG)$, $PR_1(REG)$, $LM_1(REG)$, $GL_1(REG)$, $PL_1(REG)$ includes strictly the family $SL$.
(ii) The family $SL$ is incomparable with each $X_n(FIN)$, $n \ge 1$, $X \in \{ARB, PR, LM, PL\}$.
(iii) $GL(FIN) = GL_1(FIN) = FIN$.
Proof. (i) If $G = (V, B, C)$ is a simple shuffle grammar, then $L(G) = L_f(G')$ for each $f \in M$ and
$$G' = (V, B, (V^*, C)).$$

(The only point which needs some discussion is the fact that a parallel derivation $x \Rightarrow^{pl} y$ in $G'$, with $x = x_1 x_2 \ldots x_k$ and $y = x_1' x_2' \ldots x_k'$, $k \ge 2$, can be simulated by a $k$-step derivation in $G$, because the strings shuffled into $x_1, \ldots, x_k$ do not overlap each other.) Consequently, $SL \subseteq X_1(REG)$, for all $X$.
The inclusion is proper in view of the following necessary condition for a language to be in $SL$ (Lemma 10 in [9]): if $L \in SL$ and
$$V' = \{ a \in V \mid \text{for every } n \ge 1 \text{ there is } x \in L \text{ with } |x|_a \ge n \},$$
then each string in $V'^*$ is a subword of a string in $L$. This condition rejects languages such as
$$L = a^+ \cup b^+;$$
this language can be generated by the shuffle grammar with finite choice
$$G = (\{a, b\}, \{a, b\}, (\{a\}, \{\lambda, a\}), (\{b\}, \{\lambda, b\}))$$
in all modes of derivation except the global one. For $gl$ we take
$$G' = (\{a, b\}, \{a, b\}, (a^*, \{a\}), (b^*, \{b\})).$$
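The behaviour of the global mode on this example can be checked with a small bounded enumeration. The sketch below is only an illustration: the name `global_language` and the bound `max_len` are assumptions introduced here, regular selectors are modelled by Python regular expressions matched against the whole string, and the two regular selectors are assumed to be $a^*$ and $b^*$.

```python
import re


def interleavings(x, y):
    """The shuffle of two words: all order-preserving interleavings."""
    if not x or not y:
        return {x + y}
    return ({x[0] + z for z in interleavings(x[1:], y)} |
            {y[0] + z for z in interleavings(x, y[1:])})


def global_language(B, comps, max_len):
    """Words of length <= max_len generated in the global mode.

    B is the finite set of axioms (assumed to respect max_len); comps is a
    list of (pattern, C) pairs, where pattern is a regular expression for the
    selector and C is a finite set of strings.
    """
    seen = set(B)
    frontier = set(B)
    while frontier:
        new = set()
        for x in frontier:
            for pattern, C in comps:
                # Global mode: the whole current string must match the selector.
                if re.fullmatch(pattern, x):
                    for u in C:
                        if len(x) + len(u) <= max_len:
                            new |= interleavings(x, u)
        frontier = new - seen
        seen |= frontier
    return seen


# Regular selectors a* and b*: every word of a+ and b+ up to the bound appears.
regular = global_language({"a", "b"}, [("a*", {"a"}), ("b*", {"b"})], 4)

# Finite selectors {a}, {b}: the derivation stops after one step.
finite = global_language({"a", "b"}, [("a", {"", "a"}), ("b", {"", "b"})], 4)
```

Here `regular` contains exactly the words $a^k$ and $b^k$ with $1 \le k \le 4$, whereas `finite` stops at `aa` and `bb`, which is consistent with the finite-choice grammar failing in the global mode. Using a selector that matches every word over $\{a, b\}$ in place of $a^*$ would also produce mixed words such as `ab`.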
(ii) We have already seen that $X_1(FIN) - SL \ne \emptyset$ for $X \in \{ARB, PR, LM, PL\}$. Consider now the simple shuffle grammar
$$G = (\{a, b, c\}, \{abc\}, \{abc\}).$$
We have
$$L(G) \cap a^+ b^+ c^+ = \{ a^m b^m c^m \mid m \ge 1 \}.$$
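This equality can be verified for small word lengths by brute force. The following sketch is an illustration only: the helper names `simple_language` and the bound `max_len` are assumptions, and the check covers words of length at most 9, not the general statement.

```python
import re


def interleavings(x, y):
    """The shuffle of two words: all order-preserving interleavings."""
    if not x or not y:
        return {x + y}
    return ({x[0] + z for z in interleavings(x[1:], y)} |
            {y[0] + z for z in interleavings(x, y[1:])})


def simple_language(B, C, max_len):
    """Words of length <= max_len in the smallest language that contains B
    and is closed under shuffling with the strings of C."""
    seen = {w for w in B if len(w) <= max_len}
    frontier = set(seen)
    while frontier:
        new = set()
        for x in frontier:
            for u in C:
                if len(x) + len(u) <= max_len:
                    new |= interleavings(x, u)
        frontier = new - seen
        seen |= frontier
    return seen


# The simple shuffle grammar G = ({a, b, c}, {abc}, {abc}), truncated at length 9.
lang = simple_language({"abc"}, {"abc"}, 9)

# Intersection with the regular language a+ b+ c+.
bounded = {w for w in lang if re.fullmatch(r"a+b+c+", w)}
```

Up to length 9 the intersection consists of `abc`, `aabbcc` and `aaabbbccc` only. The reason is that every word of $L(G)$ contains equally many occurrences of $a$, $b$ and $c$, while each $a^m b^m c^m$ is obtained from $a^{m-1} b^{m-1} c^{m-1}$ by inserting one letter of $abc$ into each block.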
Assume that $L(G) = L_f(G')$ for some shuffle grammar $G' = (\{a, b, c\}, B, (R_1, C_1), \ldots, (R_n, C_n))$ with finite $R_1, \ldots, R_n$ and $f \in \{arb, pr, lm\}$.
Take a string $z = a^m b^m c^m$ in $L(G)$ with arbitrarily large $m$. A derivation $w \Rightarrow^{f}_{i} z$ must be possible in $G'$, with $w = a^p b^p c^p$, $p < m$, for some $1 \le i \le n$. We must have $w = w_1 w_2 w_3$, $w_2 \in R_i$, and $z = w_1 w_2' w_3$ for $w_2' \in w_2 \sqcup\!\sqcup u$, $u \in C_i$. If $u = \lambda$, this derivation step can be omitted, hence we may assume that $u \ne \lambda$.
The sets $R_i, C_i$ are finite. Denote
$$r = \max\{ |x| \mid x \in R_i,\ 1 \le i \le n \}, \qquad q = \max\{ |x| \mid x \in C_i,\ 1 \le i \le n \}.$$
Therefore $|w_2| \le r$, $|u| \le q$, that is $p \ge m - q$. For $m > r + q$ we have $p \ge m - q > r$, hence either $w_2 \in sub(a^p b^{p-1})$ or $w_2 \in sub(b^{p-1} c^p)$. In both cases, the shuffling with $u$ modifies at most two of the three subwords $a^p, b^p, c^p$, hence a parasitic string is obtained. (In the prefix derivation case we precisely know that only $a^p$ is modified.)
Consider now the parallel case. Take $G' = (\{a, b, c\}, B, (R_1, C_1), \ldots, (R_n, C_n))$ such that $L_{pl}(G') = L(G)$, with $R_i$ finite sets. For obtaining a string $a^m b^m c^m$ with large enough $m$ we need a derivation $a^p b^p c^p \Rightarrow^{pl} a^m b^m c^m$, $p < m$. This implies that we have sets $R_i$ containing strings $a^r$, $r \ge 1$, with the corresponding sets $C_i$ containing strings $a^q$, $q \ge 1$ (similarly for $b$ and $c$).
Assume that we find in the sets $R_i$ two strings $a^{r_1}, a^{r_2}$ and in the associated sets $C_i$ we find two strings $a^{q_1}, a^{q_2}$. Similarly, we have pairs $(b^{r_3}, b^{q_3})$ and $(c^{r_4}, c^{q_4})$. The string $a^t b^t c^t$ for $t = r_1 r_2 r_3 r_4$ is in $L(G)$ and it can be rewritten using the same pairs of strings containing the symbols $b$ and $c$ and different pairs for $a$:
$$a^t b^t c^t \Rightarrow^{pl} a^{(t/r_1)(r_1 + q_1)} b^s c^s, \qquad a^t b^t c^t \Rightarrow^{pl} a^{(t/r_2)(r_2 + q_2)} b^s c^s, \quad \text{for some } s.$$
Consequently, we must have
$$\frac{t}{r_1}(r_1 + q_1) = \frac{t}{r_2}(r_2 + q_2),$$
which implies
$$\frac{q_1}{r_1} = \frac{q_2}{r_2}.$$

For all such pairs $(a^r, a^q)$ we obtain the same value for $\frac{q}{r}$. Denote it by $\alpha$. Rewriting some $a^s b^s c^s$ using such pairs we obtain
$$\frac{s}{r}(r + q) = s(1 + \alpha)$$
occurrences of $a$. Continuing $k \ge 1$ steps, we get $s(1 + \alpha)^k$ occurrences of $a$, hence a geometrical progression, starting at $a^s b^s c^s$.
Symbols $a$ can also be introduced in a derivation $a^p b^p c^p \Rightarrow^{pl} a^m b^m c^m$ by using pairs $(a^i b^j, z)$ with $z$ containing occurrences of $a$. At most one such pair can be used in a derivation step. Let $h$ be the largest number of symbols $a$ in strings $z$ as above (the number of such strings is finite). Thus, in a derivation $a^p b^p c^p \Rightarrow^{pl} a^m b^m c^m$, the number $m$ can be modified by such pairs in an interval $[m_1, m_2]$ with $m_2 - m_1 \le h$.
We start from at most $g = card(B)$ strings of the form $a^s b^s c^s$, hence we have at most $g$ geometrical progressions $s(1 + \alpha)^k$, $k \ge 1$. The difference $\delta_k = s(1 + \alpha)^{k+1} - s(1 + \alpha)^k$ can be arbitrarily large. When $\delta_k > gh$, the at most $g$ progressions can have an element between $s(1 + \alpha)^k$ and $s(1 + \alpha)^{k+1}$; each such element can have at most $h$ values. Consequently, at least one natural number $t$ between $s(1 + \alpha)^k$ and $s(1 + \alpha)^{k+1}$ is not reached, and the corresponding string $a^t b^t c^t$, although in $L(G)$, is not in $L_{pl}(G')$. This contradiction concludes the proof of point (ii).
(iii) As only strings in the sets $R_i$, $1 \le i \le n$, of a grammar $G = (V, B, (R_1, C_1), \ldots, (R_n, C_n))$ can be derived, we have $GL(FIN) \subseteq FIN$. The inclusion $FIN \subseteq GL_1(FIN)$ has been pointed out at the beginning of this section. $\Box$
Corollary. The families $X_n(REG)$, $n \ge 1$, $X \in \{ARB, LM, PR, PL, GL\}$, contain non-context-free languages.
For $PL(FIN)$, $PL(REG)$ and $GL(REG)$ this assertion can be strengthened.
Theorem 2. Every propagating unary 0L language belongs to the family $PL_1(FIN)$.
Proof. For a unary 0L system $G = (\{a\}, w, P)$ with $w \in V^*$ and
$$P = \{ a \to a^{i_1}, a \to a^{i_2}, \ldots, a \to a^{i_r} \},$$
with $i_j \ge 1$, $1 \le j \le r$, we construct the shuffle grammar
$$G' = (\{a\}, \{w\}, (\{a\}, \{ a^{i_1 - 1}, a^{i_2 - 1}, \ldots, a^{i_r - 1} \})).$$
It is easy to see that $L(G) = L_{pl}(G')$. $\Box$
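The construction in this proof can be illustrated by comparing word lengths on both sides, since over a one-letter alphabet a word is determined by its length. The sketch below is a bounded check under stated assumptions, not a proof: the names `sums_of_choices`, `reachable`, `ol_lengths`, `pl_lengths` and the cut-off `bound` are introduced here for illustration, and all rule exponents are assumed to be at least 1 (the propagating case).

```python
def sums_of_choices(n, values, bound):
    """Totals <= bound obtained by choosing one value for each of n symbols.

    Pruning partial sums is safe because every value is assumed to be >= 1.
    """
    totals = {0}
    for _ in range(n):
        totals = {t + v for t in totals for v in values if t + v <= bound}
    return totals


def reachable(axiom_len, growth, bound):
    """Lengths <= bound reachable when every symbol grows by a value in growth."""
    seen, frontier = {axiom_len}, {axiom_len}
    while frontier:
        new = set()
        for n in frontier:
            new |= sums_of_choices(n, growth, bound)
        frontier = new - seen
        seen |= frontier
    return seen


def ol_lengths(axiom_len, exponents, bound):
    """Unary 0L system with rules a -> a^i: each a is replaced by a^i."""
    return reachable(axiom_len, list(exponents), bound)


def pl_lengths(axiom_len, exponents, bound):
    """Shuffle grammar with selector {a} and contexts a^(i-1), parallel mode.

    The word a^n splits only into n factors a, and shuffling a with a^(i-1)
    always gives a^(1 + (i-1)).
    """
    contexts = [i - 1 for i in exponents]
    return reachable(axiom_len, [1 + c for c in contexts], bound)
```

For the single rule $a \to a^2$ and axiom $a$, both functions return the powers of two up to the bound, which is a one-letter non-regular language.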
This implies that $PL(FIN)$, $PL(REG)$ contain one-letter non-regular languages. This is not true for the other modes of derivation.
Theorem 3. A one-letter language is in $ARB(F)$, $PR(F)$, $LM(F)$, $F \in \{FIN, REG\}$, or in $GL(REG)$ if and only if it is regular.
Proof. Take a shuffle grammar $G = (\{a\}, B, (R_1, C_1), \ldots, (R_n, C_n))$ with regular sets $R_i$, $1 \le i \le n$. Clearly, $L_{arb}(G) = L_{pr}(G) = L_{lm}(G)$. Because we work with one-letter strings, a string $x$ can be derived using a component $(R_i, C_i)$ if (and only if) the shortest string in $R_i$ is contained in $x$, hence is of length at most $|x|$. Therefore we can replace each $R_i$ by $a^{k_i}$, where
$$k_i = \min\{ |x| \mid x \in R_i \}, \quad 1 \le i \le n,$$
without modifying the generated language. Denote
$$K = \max\{ k_i \mid 1 \le i \le n \}$$