
Showing papers on "Chomsky hierarchy published in 2004"


Proceedings ArticleDOI
01 Jan 2004
TL;DR: PEGs address frequently felt expressiveness limitations of CFGs and REs, simplifying syntax definitions and making it unnecessary to separate their lexical and hierarchical components; the two minimal recognition schemas to which PEGs reduce, TS/TDPL and gTS/GTDPL, are here proven equivalent in effective recognition power.
Abstract: For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ambiguity is crucial to their original purpose of modelling natural languages, but this very power makes it unnecessarily difficult both to express and to parse machine-oriented languages using CFGs. Parsing Expression Grammars (PEGs) provide an alternative, recognition-based formal foundation for describing machine-oriented syntax, which solves the ambiguity problem by not introducing ambiguity in the first place. Where CFGs express nondeterministic choice between alternatives, PEGs instead use prioritized choice. PEGs address frequently felt expressiveness limitations of CFGs and REs, simplifying syntax definitions and making it unnecessary to separate their lexical and hierarchical components. A linear-time parser can be built for any PEG, avoiding both the complexity and fickleness of LR parsers and the inefficiency of generalized CFG parsing. While PEGs provide a rich set of operators for constructing grammars, they are reducible to two minimal recognition schemas developed around 1970, TS/TDPL and gTS/GTDPL, which are here proven equivalent in effective recognition power.
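The prioritized choice described above can be illustrated with a minimal sketch of PEG-style combinators (illustrative only; the paper's formalism is richer, and the parser names and grammar here are invented for the example):

```python
# Minimal sketch of PEG-style prioritized choice. Each parser is a function
# (text, pos) -> new position, or None on failure.

def lit(s):
    """Match a literal string."""
    def parse(text, pos):
        end = pos + len(s)
        return end if text[pos:end] == s else None
    return parse

def choice(*alts):
    """PEG prioritized choice: try alternatives in order, commit to the first match."""
    def parse(text, pos):
        for alt in alts:
            result = alt(text, pos)
            if result is not None:
                return result
        return None
    return parse

def seq(*parts):
    """Sequence: match each part in turn, threading the position."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse

# Dangling-else, the classic CFG ambiguity: with prioritized choice the longer
# alternative is tried first, so each input has exactly one parse.
stmt = choice(seq(lit("if"), lit("(c)"), lit("s"), lit("else"), lit("s")),
              seq(lit("if"), lit("(c)"), lit("s")),
              lit("s"))

print(stmt("if(c)selses", 0))  # 11: the else-bearing alternative wins
print(stmt("if(c)s", 0))       # 6: falls through to the second alternative
```

Because choice commits to the first successful alternative, no sentence ever has two parses, which is the sense in which PEGs "solve the ambiguity problem by not introducing ambiguity in the first place."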

467 citations


Journal ArticleDOI
01 Apr 2004
TL;DR: It is argued that some preliminary “good properties” obtained may plead in favour of the use of analogy in the study of formal languages in relationship with natural language.
Abstract: In this paper, we advocate a study of analogies between strings of symbols for their own sake. We show how some sets of strings, i.e., some formal languages, may be characterized by use of analogies. We argue that some preliminary “good properties” obtained may plead in favour of the use of analogy in the study of formal languages in relationship with natural language.

92 citations


01 Jan 2004
TL;DR: This chapter simplifies the analysis of implementations of the table ADT by treating each query or update of a list element or tree node, or comparison of two of them, as an elementary operation.
Abstract: Abstractly, a table is a mapping (function) from keys to values. Given a search key k, table search has to find the table entry (k, v) containing that key. The found entry may be retrieved, or removed (deleted) from the table, or its value, v, may be updated. If the table has no such entry, a new entry with key k may be created and inserted in the table. Operations on a table also initialize a table to the empty one or indicate that an entry with the given key is absent. Insertions and deletions modify the mapping of keys onto values specified by the table.

Example 3.2. Table 3.1 presents a very popular (at least in textbooks on algorithms and data structures) table having three-letter identifiers of airports as keys and associated data, such as airport locations, as values. Each identifier has a unique integer representation k = 26^2·c0 + 26·c1 + c2, where the ci, i = 0, 1, 2, are ordinal numbers of letters in the English alphabet (A corresponds to 0, B to 1, ..., Z to 25). For example, AKL corresponds to 26^2·0 + 26·10 + 11 = 271. In total, there are 26^3 = 17576 possible different keys and entries.

Table 3.1: A map between airport codes and locations.

Code  k      City         Country      State / Place
AKL   271    Auckland     New Zealand
DCA   2080   Washington   USA          District of Columbia (D.C.)
FRA   3822   Frankfurt    Germany      Rheinland-Pfalz
GLA   4342   Glasgow      UK           Scotland
HKG   4998   Hong Kong    China
LAX   7459   Los Angeles  USA          California
SDF   12251  Louisville   USA          Kentucky
ORY   9930   Paris        France

As can be seen from this example, we may map the keys to integers. We deal with both static (where the database is fixed in advance and no insertions, deletions or updates are done) and dynamic (where insertions, deletions or updates are allowed) implementations of the table ADT. In all our implementations of the table ADT, we may simplify the analysis as follows. We use lists and trees as our basic containers.
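The key mapping of Example 3.2 is a short computation; a sketch (function name is ours, not the chapter's):

```python
# Sketch of the key mapping above: a three-letter code c0c1c2 maps to
# k = 26^2*c0 + 26*c1 + c2, with A = 0, B = 1, ..., Z = 25.

def airport_key(code):
    """Convert a three-letter uppercase airport code to its integer key."""
    assert len(code) == 3 and code.isalpha() and code.isupper()
    c0, c1, c2 = (ord(ch) - ord('A') for ch in code)
    return 26**2 * c0 + 26 * c1 + c2

print(airport_key("AKL"))  # 271, as in the example
print(airport_key("LAX"))  # 7459
```

The mapping is injective, so it can serve directly as an integer key into the table's container.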
We treat each query or update of a list element or tree node, or comparison of two of them, as an elementary operation. The following lemma summarizes some obvious relationships.

Lemma 3.3. Suppose that a table is built up from empty by successive insertions, and we then search for a key k uniformly at random. Let Tss(k) (respectively Tus(k)) be the time to perform successful (respectively unsuccessful) search for k. Then
• the time taken to retrieve, delete, or update an element with key k is at least Tss(k);
• the time taken to insert an element with key k is at least Tus(k);
• Tss(k) ≤ Tus(k).
In addition,
• the worst-case value for Tss(k) equals the worst-case value for Tus(k);
• the average value of Tss(k) equals 1 plus the average of the times for the unsuccessful searches undertaken while building the table.

Proof. To insert a new element, we first try to find where it would be if it were contained in the data structure, and then perform a single insert operation into the container. To delete an element, we first find it, and then perform a delete operation on the container. Analogous statements hold for updating and retrieval. Thus for a given state of the table formed by insertions from an empty table, the time for successful search for a given element is the time that it took for unsuccessful search for that element, as we built the table, plus 1. This means that the time for unsuccessful search is always at least the time for successful search for a given element (the same in the worst case), and the average time for successful search for an element in a table is 1 more than the average of all the times for unsuccessful searches.

If the data structure used to implement a table arranges the records in a list, the efficiency of searching depends on whether the list is sorted. In the case of the telephone book, we quickly find the desired phone number (data record) by name (key).
But it is almost hopeless to search directly for a phone number unless we have a special reverse directory where the phone number serves as a key. We discuss unsorted lists in the exercises below, and sorted lists in the next section.

Exercises

Exercise 3.1.1. The sequential search algorithm simply starts at the head of a list and examines elements in order until it finds the desired key or reaches the end of the list. An array-based version is shown in Figure 3.1.

algorithm sequentialSearch
Input: array a[0..n−1]; key k
begin
  for i ← 0 while i < n step i ← i + 1 do
    if a[i] = k then return i
  end for
  return not found
end

Figure 3.1: A sequential search algorithm.

Show that both successful and unsuccessful sequential search in a list of size n have worst-case and average-case time complexity Θ(n).

Exercise 3.1.2. Show that sequential search is slightly more efficient for sorted lists than unsorted ones. What is the time complexity of successful and unsuccessful search?

3.2 Sorted lists and binary search

A sorted list implementation allows for a much better search method that uses the divide-and-conquer paradigm. The basic idea of binary search is simple. Let k be the desired key for which we want to search.
• If the list is empty, return "not found". Otherwise:
• Choose the key m of the middle element of the list. If m = k, return its record; if m > k, make a recursive call on the head sublist; if m < k, make a recursive call on the tail sublist.

Example 3.4. Figure 3.2 illustrates binary search for the key k = 42 in the sorted list 7, 14, 27, 33, 42, 49, 51, 53, 67, 70, 77, 81, 89, 94, 95, 99 of size 16 (positions 0 to 15). At the first iteration, the search key 42 is compared to the key a[7] = 53 in the middle position m = (0 + 15)/2 = 7.
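The binary search steps above can be sketched as a runnable iterative version (a standard formulation, not the chapter's own code), using the array from Example 3.4:

```python
# Iterative binary search following the description above: probe the middle
# element, then continue in the head or tail sublist.

def binary_search(a, k):
    """Return an index i with a[i] == k in sorted list a, or None if absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        m = (lo + hi) // 2
        if a[m] == k:
            return m
        elif a[m] > k:
            hi = m - 1   # continue in the head sublist
        else:
            lo = m + 1   # continue in the tail sublist
    return None

a = [7, 14, 27, 33, 42, 49, 51, 53, 67, 70, 77, 81, 89, 94, 95, 99]
print(binary_search(a, 42))  # 4: first probe is a[7] = 53, then the head sublist
```

Each iteration halves the remaining sublist, giving the Θ(log n) behaviour that motivates sorted-list implementations of the table ADT.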

22 citations


Journal ArticleDOI
Yun Wang1
TL;DR: The grammatical tool mainly deals with systems composed of structured entities, called entity grammar systems (EGSs), which have the general form of objects in the physical world; this means EGSs could be used as a tool to study complex systems composed of many objects with different structures, such as biological systems.

17 citations


Journal ArticleDOI
TL;DR: This paper presents an overview of computational biology approaches and surveys some of the natural computing models using, in both cases, a formal language-based approach.
Abstract: This paper presents an overview of computational biology approaches and surveys some of the natural computing models using, in both cases, a formal language-based approach.

16 citations


Journal ArticleDOI
TL;DR: In this article, a new type of shift dynamics is introduced, called functional shift, which is defined by a set of bi-infinite sequences of some functions on a given set of symbols.
Abstract: We introduce a new type of shift dynamics as an extended model of symbolic dynamics, and investigate the characteristics of shift spaces from the viewpoints of both dynamics and computation. This shift dynamics is called a functional shift, which is defined by a set of bi-infinite sequences of some functions on a set of symbols. To analyse the complexity of functional shifts, we measure them in terms of topological entropy, and locate their languages in the Chomsky hierarchy. Through this study, we argue that considering functional shifts from the viewpoints of both dynamics and computation gives us opposite results about the complexity of systems. We also describe a new class of shift spaces whose languages are not recursively enumerable.

10 citations


Book ChapterDOI
13 Dec 2004
TL;DR: In this article, the only remaining step in the Chomsky hierarchy is to consider those groups with a context-sensitive word problem and prove some results about these groups, and also establish some results for other context sensitive decision problems in groups.
Abstract: There already exist classifications of those groups which have regular, context-free or recursively enumerable word problem. The only remaining step in the Chomsky hierarchy is to consider those groups with a context-sensitive word problem. In this paper we consider this problem and prove some results about these groups. We also establish some results about other context-sensitive decision problems in groups.

7 citations


01 Jan 2004
TL;DR: It is proved that the languages of dependency nets coding rigid CDGs have finite elasticity, and a learning algorithm is shown that leads to the learnability of rigid or k-valued CDGs (without optional and iterative types) from strings.
Abstract: This paper is concerned with learning, in the model of Gold, the Categorial Dependency Grammars (CDG), which express discontinuous (non-projective) dependencies. We show that rigid and k-valued CDG (without optional and iterative types) are learnable from strings. In fact, we prove that the languages of dependency nets coding rigid CDGs have finite elasticity, and we show a learning algorithm. As a standard corollary, this result leads to the learnability of rigid or k-valued CDGs (without optional and iterative types) from strings.

6 citations


Journal Article
TL;DR: Active P automata compute with the structure of the membrane system, using operations like membrane creation, division and dissolution; the model is applied to the parsing of (natural language) sentences into dependency trees.
Abstract: New classes of P automata are introduced corresponding to the basic classes of languages in the Chomsky hierarchy. Unlike the previously defined P automata, active P automata are computing with the structure of the membrane systems, using operations like membrane creation, division and dissolution. The model is applied to the parsing of (natural language) sentences into dependency trees.

5 citations


Book ChapterDOI
01 Jan 2004
TL;DR: The topics presented in this paper are in some sense modifications of the classical notion of a rewriting system, introduced by Axel Thue at the beginning of the 20th century.
Abstract: This is an overview of context-sensitive grammars. The paper also contains an appendix about Chomsky type-0 grammars (also called phrase-structure grammars). These grammars and families of languages arise in classical language theory. Most of the topics presented in this paper are in some sense modifications of the classical notion of a rewriting system, introduced by Axel Thue at the beginning of the 20th century [44]. A rewriting system is a (finite) set of rules u → v, where u and v are words, indicating that an occurrence of u (as a subword) can be replaced by v. A rewriting system only transforms words into other words, languages into other languages. After supplementing it with some mechanism for "squeezing out" a language, a rewriting system can be used as a device for defining languages. This is what Chomsky did, with linguistic goals in mind, when he introduced different types of grammars [3, 4, 5], see also [6]. At the beginning, the classification was not very clear, but by the mid-1960s the four classes of the Chomsky hierarchy of grammars and languages had become fairly standard: recursively enumerable, or of type 0; context-sensitive, or of type 1; context-free, or of type 2; regular, or of type 3.

5 citations



Posted Content
TL;DR: This paper shows the existence and coexistence of different notions of equivalence by extending the notion of oracles used in formal languages, which allows distinctions to be made between the trustworthy oracles assumed by formal languages and the untrustworthy oracles used by natural languages.
Abstract: Design methods in information systems frequently create software descriptions using formal languages. Nonetheless, most software designers prefer to describe software using natural languages. This distinction is not simply a matter of convenience. Natural languages are not the same as formal languages; in particular, natural languages do not follow the notions of equivalence used by formal languages. In this paper, we show both the existence and coexistence of different notions of equivalence by extending the notion of oracles used in formal languages. This allows distinctions to be made between the trustworthy oracles assumed by formal languages and the untrustworthy oracles used by natural languages. By examining the notion of equivalence, we hope to encourage designers of software to rethink the place of ambiguity in software design.

Journal ArticleDOI
TL;DR: The grammatical complexity of the symbol sequences generated from the Henon map and the Lozi map is calculated using the recently developed methods to construct the pruning front and it is found that the complexity exhibits a self-similar structure as a function of the system parameter.
Abstract: We calculate the grammatical complexity of the symbol sequences generated from the Henon map and the Lozi map using the recently developed methods to construct the pruning front. When the map is hyperbolic, the language of symbol sequences is regular in the sense of the Chomsky hierarchy and the corresponding grammatical complexity takes finite values. It is found that the complexity exhibits a self-similar structure as a function of the system parameter, and the similarity of the pruning fronts is discussed as an origin of such self-similarity. For non-hyperbolic cases, it is observed that the complexity monotonically increases as we increase the resolution of the pruning front.
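Generating a symbol sequence from the Henon map, as described above, can be sketched as follows (the sign-of-x partition is a common illustrative choice; it is not necessarily the pruning-front construction used in the paper, and the parameter values are the standard ones, assumed rather than taken from the paper):

```python
# Sketch: symbol sequences from the Henon map (x, y) -> (1 - a*x^2 + y, b*x),
# recording symbol 1 when x > 0 and 0 otherwise. Illustrative partition only.

def henon_symbols(a=1.4, b=0.3, n=20, x=0.1, y=0.1, transient=100):
    """Return n symbols from the Henon orbit after discarding a transient."""
    for _ in range(transient):          # let the orbit settle onto the attractor
        x, y = 1 - a * x * x + y, b * x
    symbols = []
    for _ in range(n):
        x, y = 1 - a * x * x + y, b * x
        symbols.append(1 if x > 0 else 0)
    return symbols

print(henon_symbols())
```

The set of all such symbol sequences forms the language whose place in the Chomsky hierarchy, and whose grammatical complexity, the paper analyses.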

Journal ArticleDOI
TL;DR: It is shown here how very similar hierarchies can be obtained for families of sets of piecewise continuous functions, using systems of ordinary differential equations in the same way that automata are used in establishing the traditional Chomsky hierarchy.
Abstract: The time-honored Chomsky hierarchy has long shown its value as a structural tool in formal languages and automata theory, and gained followers in various areas. We show here how very similar hierarchies can be obtained for families of sets of piecewise continuous functions. We use systems of ordinary differential equations in the same way that automata are used in establishing the traditional Chomsky hierarchy. A functional memory is provided by state-dependent delays which are used in a novel way, paired with certain state components, giving memory structures similar to push-down stores and Turing machine tapes. The resulting machine model may be viewed as a “functional computing machine’’, with functional input, functional memory and, though this is not emphasized here, functional output.

Journal Article
TL;DR: In this paper, it was shown that the disjunctivity problem for Chomsky type-0 grammars is Π^0_3-complete while the corresponding problem for linear, context-free, or context-sensitive grammars is Π^0_2-complete, which implies that for any language class C which contains the linear languages, the class of languages in C corresponding to disjunctive sequences is not recursively presentable.
Abstract: An infinite binary sequence is disjunctive if every binary word occurs as a subword in the sequence. For a computational analysis of disjunctive sequences we can identify an infinite 0-1-sequence either with its prefix set or with its corresponding set, where a set A of binary words corresponds to a sequence a if a is the characteristic sequence of A. Most of the previous investigations of disjunctive sequences have dealt with prefix sets. Here, following the more common point of view in computability theory, we focus our investigations on the sets corresponding to disjunctive sequences. We analyze the computational complexity and the Chomsky complexity of sets corresponding to disjunctive sequences. In particular, we show that no such set is regular but that there are linear languages with disjunctive characteristic sequences. Moreover, we discuss decidability questions. Here our main results are that the disjunctivity problem for Chomsky type-0 grammars is Π^0_3-complete while the corresponding problem for linear, context-free, or context-sensitive grammars is Π^0_2-complete. The latter implies that, for any language class C which contains the linear languages, the class of the languages in C corresponding to disjunctive sequences is not recursively presentable.
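A concrete disjunctive sequence can be built by concatenating all binary words in length-lexicographic order, a Champernowne-style construction (this standard example is ours, not the paper's; function names are illustrative):

```python
from itertools import product

# Sketch: a prefix of a disjunctive binary sequence, obtained by concatenating
# all binary words of length 1, 2, ..., max_len, then a check that every word
# up to a given length really occurs as a subword.

def champernowne_prefix(max_len):
    """Concatenate all binary words of length 1..max_len in order."""
    return "".join("".join(w) for n in range(1, max_len + 1)
                   for w in product("01", repeat=n))

def contains_all_words(seq, max_len):
    """True iff every binary word of length <= max_len occurs in seq."""
    return all("".join(w) in seq
               for n in range(1, max_len + 1)
               for w in product("01", repeat=n))

prefix = champernowne_prefix(6)
print(contains_all_words(prefix, 6))  # True: each word appears as its own block
```

Extending the construction to all lengths yields an infinite sequence in which every binary word occurs, i.e., a disjunctive sequence in the sense of the abstract.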


Proceedings ArticleDOI
01 Dec 2004
TL;DR: The Chomsky hierarchy automata are taken into account as sufficient models to represent the features of living beings' memory, and a new robotic control architecture is proposed, which has its building blocks in the modular and hierarchical functioning of the brain.
Abstract: Behavior-based robotics has its foundations in the emergence of robotic behaviors, and aims to provide intelligence and autonomy to actions performed by agents in search of their goals. Although there is a great variety of research in this field, there are not many works on the formalization of the concepts related to the cognitive entity of autonomous agents (AAs). This paper makes an effort to establish a parallel with natural models in order to gain insight into how robotic behaviors can be represented and set up. The Chomsky hierarchy automata are taken into account as sufficient models to represent the features of living beings' memory. In this way, we consider that a pushdown automaton (PDA) can represent short-term memory, and a Turing machine (TM) can work as long-term memory. Besides, in order to implement in AAs the concepts concerning these automata, a new robotic control architecture is proposed, which has its building blocks in the modular and hierarchical functioning of the brain.
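The pushdown automaton the paper associates with short-term memory can be illustrated with a toy example (a standard balanced-brackets recognizer; this sketch is ours and is not the robotic architecture proposed in the paper):

```python
# Minimal sketch of pushdown-automaton behaviour: last-in-first-out memory.
# This toy PDA accepts exactly the balanced bracket strings over {(, )}.

def balanced(s):
    """Accept iff the bracket string is balanced, using an explicit stack."""
    stack = []
    for ch in s:
        if ch == '(':
            stack.append(ch)   # push on an opening bracket
        elif ch == ')':
            if not stack:
                return False   # pop attempted on empty stack: reject
            stack.pop()
        else:
            return False       # reject symbols outside the input alphabet
    return not stack           # accept iff the stack is empty at the end

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
```

The stack gives the bounded, recency-ordered recall that motivates the paper's analogy between a PDA and short-term memory, while a Turing machine's unbounded rewritable tape plays the role of long-term memory.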

Journal ArticleDOI
TL;DR: The decidability of the equivalence of grammars with respect to the differentiation function and the structure function is discussed, and the decidability of the k-narrowness of context-free grammars is proved.
Abstract: We introduce the notion of a differentiation function of a context-free grammar, which gives the number of terminal words that can be derived in a certain number of steps. A grammar is called narrow (or k-narrow) iff its differentiation function is bounded by a constant (by k). We present the basic properties of differentiation functions; in particular, we relate them to the structure function of context-free languages and narrow grammars to slender languages. We discuss the decidability of the equivalence of grammars with respect to the differentiation function and structure function and prove the decidability of the k-narrowness of context-free grammars. Furthermore, we introduce languages representing the graph of the differentiation and structure function and relate these languages to those of the Chomsky hierarchy.
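The differentiation function can be computed directly for a small grammar by enumerating leftmost derivations (a sketch under our own conventions; the grammar and function names are illustrative, not from the paper):

```python
# Sketch of the differentiation function: d(n) = number of distinct terminal
# words derivable from the start symbol in exactly n leftmost rewriting steps.

GRAMMAR = {            # S -> aSb | ab, generating { a^n b^n : n >= 1 }
    "S": [["a", "S", "b"], ["a", "b"]],
}

def differentiation(grammar, start, n):
    """Count terminal words derivable in exactly n leftmost steps."""
    forms = {(start,)}
    for _ in range(n):
        nxt = set()
        for form in forms:
            # rewrite the leftmost nonterminal, if the form still has one
            for i, sym in enumerate(form):
                if sym in grammar:
                    for rhs in grammar[sym]:
                        nxt.add(form[:i] + tuple(rhs) + form[i + 1:])
                    break
        forms = nxt
    terminal = {f for f in forms if all(s not in grammar for s in f)}
    return len(terminal)

# a^n b^n takes exactly n steps, so d(n) = 1 for all n >= 1: the grammar is
# narrow (1-narrow), matching the abstract's link to slender languages.
print([differentiation(GRAMMAR, "S", n) for n in range(1, 5)])  # [1, 1, 1, 1]
```

A grammar whose d(n) grows without bound would fail the k-narrowness test for every k, which is the property the paper proves decidable.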

Journal Article
TL;DR: It is shown that a context-free string-graph grammar (one hyperedge is replaced at a time) can be used to model discontinuous constituents in natural languages.
Abstract: Discontinuous constituents are a frequent problem in natural language analysis. A constituent is called discontinuous if it is interrupted by other constituents. In German they can appear with separable verb prefixes or relative clauses in the Nachfeld. They cannot be captured by a context-free Chomsky grammar. A subset of hypergraph grammars are string-graph grammars, where the result of a derivation must be formed like a string, i.e., terminal edges are connected to two nodes and are lined up in a row. Nonterminal edges do not have to fulfill this property. In this paper it is shown that a context-free string-graph grammar (one hyperedge is replaced at a time) can be used to model discontinuous constituents in natural languages.