
Showing papers on "Context-sensitive grammar published in 2003"


Journal ArticleDOI
01 Dec 2003
TL;DR: It is demonstrated how the combined formalism, Circular Reference Attributed Grammars (CRAGs), can take advantage of the strengths of both RAGs and CAGs, making it possible to express solutions to many problems in an easy way.
Abstract: This paper presents a combination of Reference Attributed Grammars (RAGs) and Circular Attribute Grammars (CAGs). While RAGs allow the direct and easy specification of non-locally dependent information, CAGs allow iterative fixed-point computations to be expressed directly using recursive (circular) equations. We demonstrate how the combined formalism, Circular Reference Attributed Grammars (CRAGs), can take advantage of both these strengths, making it possible to express solutions to many problems in an easy way. We exemplify with the specification and computation of the nullable, first, and follow sets used in parser construction, a problem which is highly recursive and normally programmed by hand using an iterative algorithm. We also present a general demand-driven evaluation algorithm for CRAGs and some optimizations of it. The approach has been implemented and experimental results include computations on a series of grammars including Java 1.2. We also revisit some of the classical examples of CAGs and show how their solutions are facilitated by CRAGs.

65 citations
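
The paper's running example, the nullable/first/follow computation, is traditionally written by hand as an iterative fixed-point loop. The sketch below (plain Python, over a hypothetical toy grammar) shows that hand-written style, i.e. the kind of code a CRAG specification replaces with recursive attribute equations.

```python
# A hand-written, iterative fixed-point computation of the nullable,
# first, and follow sets (the paper's running example). The tiny
# grammar below is hypothetical; [] denotes the empty production.

GRAMMAR = {                     # nonterminal -> list of right-hand sides
    "S": [["A", "b"], ["B"]],
    "A": [["a", "A"], []],
    "B": [["b"]],
}
START = "S"
NONTERMINALS = set(GRAMMAR)

def is_terminal(sym):
    return sym not in NONTERMINALS

def analyze(grammar):
    nullable = set()
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    follow[START].add("$")                      # end-of-input marker

    changed = True
    while changed:                              # iterate to a fixed point
        changed = False

        def extend(target, symbols):
            nonlocal changed
            if not symbols <= target:
                target |= symbols
                changed = True

        for lhs, rhss in grammar.items():
            for rhs in rhss:
                # nullable(lhs) if every right-hand-side symbol is nullableable
                if lhs not in nullable and all(
                        not is_terminal(s) and s in nullable for s in rhs):
                    nullable.add(lhs)
                    changed = True
                # first(lhs): scan up to the first non-nullable symbol
                for sym in rhs:
                    extend(first[lhs], {sym} if is_terminal(sym) else first[sym])
                    if is_terminal(sym) or sym not in nullable:
                        break
                # follow: propagate firsts of the tail, then follow(lhs)
                for i, sym in enumerate(rhs):
                    if is_terminal(sym):
                        continue
                    tail_nullable = True
                    for nxt in rhs[i + 1:]:
                        extend(follow[sym], {nxt} if is_terminal(nxt) else first[nxt])
                        if is_terminal(nxt) or nxt not in nullable:
                            tail_nullable = False
                            break
                    if tail_nullable:
                        extend(follow[sym], follow[lhs])
    return nullable, first, follow

print(analyze(GRAMMAR))
```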


Journal ArticleDOI
TL;DR: It is shown that there effectively exists an algorithm that identifies a very simple grammar G equivalent to G* in the limit from positive data, satisfying the property that the time for updating a conjecture is bounded by O(m) and the total number of prediction errors made by the algorithm is bounded by O(n), where n is the size of G*, m = Max{N|Σ|+1, |Σ|^3}, and N is the total length of all positive data provided.

64 citations


Proceedings Article
07 Jul 2003
TL;DR: A new class of formal grammars is introduced that allows the use of all set-theoretic operations as an integral part of the formalism of rules, with generalizations of classical context-free techniques suggesting the practical applicability of the new concept.
Abstract: As a direct continuation of the earlier research on conjunctive grammars - context-free grammars equipped with intersection - this paper introduces a new class of formal grammars, which allow the use of all set-theoretic operations as an integral part of the formalism of rules. Rigorous semantics for such grammars is defined by language equations in a way that allows some techniques from the theory of context-free grammars to be generalized, including Chomsky normal form, the Cocke-Kasami-Younger recognition algorithm and a limited extension of the notion of a parse tree, which together suggest the practical applicability of the new concept.

33 citations


Journal ArticleDOI
James Rogers
TL;DR: The use of weak monadic second-order languages over structures of varying dimension as specification languages for grammars and automata is explored, focusing on the extension of the longstanding results characterizing the regular and context-free languages in terms of definability in wS1S and wSnS to a characterization of the Tree-Adjoining Languages.

33 citations


Journal ArticleDOI
TL;DR: A cubic-time recognition and parsing algorithm for conjunctive grammars is presented; it is applicable to an arbitrary conjunctive grammar without any initial transformations and can be modified to work in quadratic time and use linear space.

33 citations
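
This is not the paper's transformation-free algorithm, but a simpler CYK-style recognizer for a conjunctive grammar already in binary normal form; it only illustrates how a tabular parser handles conjuncts, namely that every conjunct of a rule must derive the same substring. The toy grammar (strings over {a, b} that start with a and end with b) is hypothetical and chosen only to keep the conjunct handling visible, not to showcase extra expressive power.

```python
# Not the paper's transformation-free algorithm: a simpler CYK-style
# recognizer for a conjunctive grammar already in binary normal form.
# A binary rule maps a nonterminal to a list of conjuncts (B, C), and
# every conjunct must derive the same substring.

TERMINAL_RULES = {                      # A -> a
    "A": {"a"}, "B": {"b"},
    "X": {"a", "b"}, "W": {"a", "b"},
    "S": set(),
}
BINARY_RULES = {                        # A -> B1 C1 & B2 C2 & ...
    "S": [[("A", "W"), ("W", "B")]],    # S -> A W & W B
    "W": [[("X", "W")]],                # W -> X W
    "A": [], "B": [], "X": [],
}

def recognize(word, start="S"):
    n = len(word)
    if n == 0:
        return False
    # table[i][j] = nonterminals deriving word[i:j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for nt, letters in TERMINAL_RULES.items():
            if ch in letters:
                table[i][i].add(nt)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for nt, rules in BINARY_RULES.items():
                for conjuncts in rules:
                    # every conjunct (B, C) needs a split of the SAME span
                    if all(any(b in table[i][k] and c in table[k + 1][j]
                               for k in range(i, j))
                           for (b, c) in conjuncts):
                        table[i][j].add(nt)
    return start in table[0][n - 1]

# S demands both "starts with a" (A W) and "ends with b" (W B):
assert recognize("ab") and recognize("aab")
assert not recognize("aba") and not recognize("ba")
```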


Proceedings ArticleDOI
23 Jun 2003
TL;DR: A new description of subdivision surfaces based on a graph grammar formalism is presented; it gives an effective representation that allows simple implementation and is suitable for adaptive computations.
Abstract: In this paper we develop a new description for subdivision surfaces based on a graph grammar formalism. Subdivision schemes are specified by a context sensitive grammar in which production rules represent topological and geometrical transformations to the surface's control mesh. This methodology can be used for all known subdivision surface schemes. Moreover, it gives an effective representation that allows simple implementation and is suitable for adaptive computations.

29 citations
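
As a rough illustration of the abstract's idea of productions acting on a control mesh, the sketch below applies a single hypothetical rule, the 1-to-4 split of a triangle with new vertices at edge midpoints. Real schemes such as Loop or Catmull-Clark also apply smoothing weights, which are omitted here; all names are illustrative.

```python
# One global application of a hypothetical production: rewrite each
# triangle of a control mesh into four, placing new vertices at edge
# midpoints (topological refinement only, no smoothing weights).

def midpoint(p, q):
    return tuple((a + b) / 2.0 for a, b in zip(p, q))

def refine(vertices, triangles):
    """vertices: list of (x, y, z); triangles: list of index triples."""
    vertices = list(vertices)
    midpoint_index = {}                 # undirected edge -> new vertex index

    def edge_vertex(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            vertices.append(midpoint(vertices[i], vertices[j]))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_triangles = []
    for (i, j, k) in triangles:
        a, b, c = edge_vertex(i, j), edge_vertex(j, k), edge_vertex(k, i)
        # the right-hand side of the production: four smaller triangles
        new_triangles += [(i, a, c), (a, j, b), (c, b, k), (a, b, c)]
    return vertices, new_triangles

verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tris = [(0, 1, 2)]
verts, tris = refine(verts, tris)
print(len(verts), len(tris))            # 6 vertices, 4 triangles
```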


Journal ArticleDOI
TL;DR: The closure properties of the language family generated by linear conjunctive grammars are investigated; the main result is its closure under complement, which implies that it is closed under all set-theoretic operations.

27 citations


Journal ArticleDOI
TL;DR: This work introduces and studies a natural extension of Marcus external contextual grammars, and attempts to fill a gap regarding the linguistic relevance of these mechanisms by defining a tree structure on the strings generated by many-dimensional external contextual grammars.
Abstract: We introduce and study a natural extension of Marcus external contextual grammars. This mathematically simple mechanism, which generates a proper subclass of simple matrix languages, known to be mildly context-sensitive ones, is still mildly context-sensitive. Furthermore, we get an infinite hierarchy of mildly context-sensitive families of languages. Then we attempt to fill a gap regarding the linguistic relevance of these mechanisms, which consists in defining a tree structure on the strings generated by many-dimensional external contextual grammars, and investigate some related issues. Several open problems are finally discussed.

25 citations
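
For readers unfamiliar with the formalism, a minimal sketch of a one-dimensional external contextual grammar follows: a finite set of axioms plus contexts (u, v) guarded by selectors, with a derivation step wrapping the whole current word, x => uxv. The toy grammar is hypothetical and generates { a^n c b^n }.

```python
# An illustrative generator for a one-dimensional external contextual
# grammar: axioms plus selector-guarded contexts (u, v); a step rewrites
# the whole current word x into u x v. The toy grammar is hypothetical.

AXIOMS = {"c"}
# (selector predicate, left context, right context)
CONTEXTS = [
    (lambda w: w.startswith("a") or w == "c", "a", "b"),
]

def generate(max_len):
    """All derivable words whose length does not exceed max_len."""
    words, frontier = set(AXIOMS), set(AXIOMS)
    while frontier:
        nxt = set()
        for w in frontier:
            for selector, u, v in CONTEXTS:
                if selector(w):
                    new = u + w + v
                    if len(new) <= max_len and new not in words:
                        nxt.add(new)
        words |= nxt
        frontier = nxt
    return words

print(sorted(generate(7), key=len))     # c, acb, aacbb, aaacbbb
```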


Proceedings ArticleDOI
07 Jul 2003
TL;DR: A MetaGrammar is introduced, which allows the grammar writer to specify in a compact manner syntactic properties that are potentially framework- and to some extent language-independent, from which grammars for several frameworks and languages are automatically generated offline.
Abstract: We introduce a MetaGrammar, which allows us to automatically generate, from a single and compact MetaGrammar hierarchy, parallel Lexical Functional Grammars (LFG) and Tree-Adjoining Grammars (TAG) for French and for English: the grammar writer specifies in a compact manner syntactic properties that are potentially framework- and, to some extent, language-independent (such as subcategorization, valency alternations and realization of syntactic functions), from which grammars for several frameworks and languages are automatically generated offline.

19 citations


Proceedings Article
01 Jan 2003
TL;DR: An algorithm for the inference of context-free graph grammars from examples is presented; it builds on an earlier system for frequent substructure discovery and is biased toward grammars that minimize description length.
Abstract: We present an algorithm for the inference of context-free graph grammars from examples. The algorithm builds on an earlier system for frequent substructure discovery, and is biased toward grammars that minimize description length. Grammar features include recursion, variables and relationships. We present an illustrative example, demonstrate the algorithm’s ability to learn in the presence of noise, and show real-world examples.

19 citations


Journal ArticleDOI
TL;DR: It is proved that every recursively enumerable language can be generated by a scattered context grammar with a reduced number of both nonterminals and context-sensitive productions.

Journal ArticleDOI
TL;DR: This work considerably improves a previous result, proving that five components suffice in order to generate any recursively enumerable language.

Journal ArticleDOI
TL;DR: A variant of P systems is defined, namely probabilistic rewriting P systems, in which the selection of rewriting rules is probabilistic; results with non-zero cut-point are shown.
Abstract: In this paper we define a variant of P systems, namely, probabilistic rewriting P systems, where the selection of rewriting rules is probabilistic. We show that, with non-zero cut-point, probabilis...

Journal ArticleDOI
01 Apr 2003-Grammars
TL;DR: The aim of this paper is to give prospective PhD students in the area hints on where to start promising research, and to supplement earlier reference lists on parallel grammars, trying to cover recent papers as well as "older" papers which were somehow neglected in other reviews.
Abstract: The aim of this paper is at least twofold: to give prospective PhD students in the area hints on where to start promising research, and to supplement earlier reference lists on parallel grammars, trying to cover recent papers as well as "older" papers which were somehow neglected in other reviews. Together with the nowadays classical book on L systems by G. Rozenberg and A. Salomaa and with the articles on L systems in the Handbook of Formal Languages, researchers will be equipped with a hopefully comprehensive list of references and ideas around parallel grammars.

Journal ArticleDOI
TL;DR: It is proved that every recursively enumerable language can be generated by a scattered context grammar with no more than two context-sensitive productions.

Proceedings ArticleDOI
22 Sep 2003
TL;DR: Techniques for the formal specification and efficient incremental implementation of spreadsheet-like tools using strong attribute grammars are presented, together with first incremental results.
Abstract: This paper presents techniques for the formal specification and efficient incremental implementation of spreadsheet-like tools. The spreadsheets are specified by strong attribute grammars. In this style of attribute grammar programming every single inductive computation is expressed within the attribute grammar formalism. Well-known attribute grammar techniques are used to reason about such grammars. For example, ordered scheduling algorithms can be used to statically guarantee termination of the attribute grammars and to derive efficient implementations. A strong attribute grammar for a spreadsheet is defined and the first incremental results are presented.
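The sketch below is not the paper's machinery (statically scheduled, strong attribute grammars with statically guaranteed termination); it is only a minimal demand-driven evaluator with memoization that conveys the underlying idea of a spreadsheet whose cells are attributes defined by equations. Cell names and formulas are hypothetical.

```python
# A minimal demand-driven sketch of a spreadsheet whose cells are
# attributes defined by equations; memoization stands in for proper
# incremental evaluation, and circularities are simply rejected.

class Sheet:
    def __init__(self, equations):
        self.equations = equations      # cell -> function(lookup) -> value
        self.cache = {}
        self.in_progress = set()

    def value(self, cell):
        if cell in self.cache:
            return self.cache[cell]     # memoized result
        if cell in self.in_progress:
            raise ValueError(f"circular reference through {cell}")
        self.in_progress.add(cell)
        try:
            result = self.equations[cell](self.value)
        finally:
            self.in_progress.discard(cell)
        self.cache[cell] = result
        return result

    def set(self, cell, equation):
        """Change a cell and conservatively invalidate everything."""
        self.equations[cell] = equation
        self.cache.clear()   # a real incremental evaluator would be finer-grained

sheet = Sheet({
    "A1": lambda get: 2,
    "A2": lambda get: 3,
    "B1": lambda get: get("A1") + get("A2"),
    "B2": lambda get: 10 * get("B1"),
})
print(sheet.value("B2"))                # 50
sheet.set("A1", lambda get: 7)
print(sheet.value("B2"))                # 100
```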

Journal Article
TL;DR: A fundamental framework of automata and grammar theory based on quantum logic is preliminarily established, and it is shown that the language generated by any l-valued regular grammar is equivalent to one recognized by some automaton with ε-moves based on quantum logic.
Abstract: In this paper, a fundamental framework of automata and grammar theory based on quantum logic is preliminarily established. First, quantum grammars, called l-valued grammars, are introduced. It is shown in particular that the language (called a quantum language) generated by any l-valued regular grammar is equivalent to that recognized by some automaton with ε-moves based on quantum logic (an l-valued automaton), and conversely, any quantum language recognized by an l-valued automaton is also equivalent to that generated by some l-valued grammar. Afterwards, the l-valued pumping lemma is established, and a decision characterization of quantum languages is presented. Finally, the relationship between regular grammars and quantum grammars (l-valued regular grammars) is briefly discussed. In summary, this work lays a foundation for further studies on more complicated quantum automata and quantum grammars, such as quantum pushdown automata and Turing machines as well as quantum context-free and context-sensitive grammars.

Dissertation
14 Nov 2003
TL;DR: The main conclusions are: - formal learning theory is relevant to linguistics, - identification in the limit is feasible for non-trivial classes, and the `Shinohara approach' can lead to a learnable class, but this completely depends on the specific nature of the formalism and the notion of complexity.
Abstract: In 1967 E. M. Gold published a paper in which the language classes from the Chomsky hierarchy were analyzed in terms of learnability, in the technical sense of identification in the limit. His results were mostly negative, and perhaps because of this his work had little impact on linguistics. In the early eighties there was renewed interest in the paradigm, mainly because of work by Angluin and Wright. Around the same time, Arikawa and his co-workers refined the paradigm by applying it to so-called Elementary Formal Systems. By making use of this approach Takeshi Shinohara was able to come up with an impressive result: any class of context-sensitive grammars with a bound on its number of rules is learnable. Some linguistically motivated work on learnability also appeared from this point on, most notably Wexler & Culicover 1980 and Kanazawa 1994. The latter investigates the learnability of various classes of categorial grammar, inspired by work by Buszkowski and Penn, and raises some interesting questions. We follow up on this work by exploring complexity issues relevant to learning these classes, answering an open question from Kanazawa 1994, and applying the same kind of approach to obtain (non)learnable classes of Combinatory Categorial Grammars, Tree Adjoining Grammars, Minimalist grammars, Generalized Quantifiers, and some variants of Lambek Grammars. We also discuss work on learning tree languages and its application to learning Dependency Grammars. Our main conclusions are: formal learning theory is relevant to linguistics; identification in the limit is feasible for non-trivial classes; the 'Shinohara approach' (placing a numerical bound on the complexity of a grammar) can lead to a learnable class, but this completely depends on the specific nature of the formalism and the notion of complexity, and we give examples of natural classes of commonly used linguistic formalisms that resist this kind of approach; learning is hard work, in that our results indicate that learning even 'simple' classes of languages requires a lot of computational effort; and dealing with structure (derivation, dependency) languages instead of string languages offers a useful and promising approach to learnability in a linguistic context.
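As a toy illustration of identification in the limit, the sketch below enumerates a small, hypothetical, finite class of hypotheses and always conjectures the first one consistent with the positive data seen so far; the classes studied in the dissertation are of course infinite and far richer.

```python
# A toy Gold-style learner: enumerate a (hypothetical, finite) class of
# hypotheses and conjecture the first one consistent with all positive
# examples seen so far. On any presentation of a language in the class,
# the conjecture eventually stabilizes on a correct hypothesis.

HYPOTHESES = [                        # enumerated class, most restrictive first
    ("a+", lambda w: w != "" and set(w) <= {"a"}),
    ("(ab)+", lambda w: w != "" and len(w) % 2 == 0
     and all(w[i] == "ab"[i % 2] for i in range(len(w)))),
    ("(a|b)+", lambda w: w != "" and set(w) <= {"a", "b"}),
]

def learner(positive_data):
    """Yields a conjecture after each new positive example."""
    seen = []
    for example in positive_data:
        seen.append(example)
        for name, accepts in HYPOTHESES:
            if all(accepts(w) for w in seen):
                yield name
                break
        else:
            yield None                # no hypothesis in the class fits

# A presentation of (ab)+ : the conjectures converge to "(ab)+".
print(list(learner(["ab", "abab", "ab", "ababab"])))
```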

Journal ArticleDOI
TL;DR: An abductive model based on Constraint Handling Rule Grammars (CHRGs) for detecting and correcting errors in problem domains that can be described in terms of strings of words accepted by a logic grammar is proposed.
Abstract: We propose an abductive model based on Constraint Handling Rule Grammars (CHRGs) for detecting and correcting errors in problem domains that can be described in terms of strings of words accepted by a logic grammar. We provide a proof of concept for the specific problem of detecting and repairing natural language errors, in particular, those concerning feature agreement. Our methodology relies on grammar and string transformation in accordance with a user-defined dictionary of possible repairs. This transformation also serves as top-down guidance for our essentially bottom-up parser. With respect to previous approaches to error detection and repair, including those that also use constraints and/or abduction, our methodology is surprisingly simple while far-reaching and efficient.

Journal ArticleDOI
01 Apr 2003-Grammars
TL;DR: Two new variants of Marcus contextual grammars are introduced: total Marcus contextual grammars with total leftmost derivation, and total Marcus contextual grammars with total leftmost derivation constrained by maximal use of selectors.
Abstract: We propose and study a model of the graphical syllable using Marcus contextual grammars. For this purpose we introduce two new variants of Marcus contextual grammars: total Marcus contextual grammar with total leftmost derivation, and total Marcus contextual grammar with total leftmost derivation constrained by maximal use of selectors.

Journal ArticleDOI
TL;DR: It is demonstrated that some well-known relationships concerning the language families resulting from ordinary ET0L grammars do not hold in terms of forbidding ET0L grammars, and that their forbidding versions with conditions of length one are equally powerful.

Journal ArticleDOI
TL;DR: This paper explores the behavior of range concatenation grammars in counting, a domain in which the bad reputation of other classical syntactic formalisms is well known, and arrives at some surprising results.

22 Sep 2003
TL;DR: The improved version of Synapse employs incremental learning based on the rule generation mechanism called the inductive CYK algorithm, which generates the minimum production rules required for parsing positive samples, and the form of production rules is extended to include not only A → βγ but also A → β, called extended Chomsky normal form.
Abstract: This paper describes recent improvements in the Synapse system [5, 6] for inductive inference of context-free grammars from sample strings. For effective inference of grammars, Synapse employs incremental learning based on the rule generation mechanism called the inductive CYK algorithm, which generates the minimum production rules required for parsing positive samples. In the improved version, the form of production rules is extended to include not only A → βγ but also A → β, called extended Chomsky normal form, where each of β and γ is either a terminal or a nonterminal symbol. By this extension and other improvements, Synapse can synthesize both ambiguous and unambiguous grammars with less computation time than the previous system.
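A sketch of the parsing side only: a CYK-style recognizer for rules in the extended Chomsky normal form described above (A → βγ and A → β, where β and γ may each be a terminal or a nonterminal). Synapse's rule-generation loop (the inductive CYK algorithm) is not shown, and the sample grammar is hypothetical.

```python
# CYK-style recognition for rules in extended Chomsky normal form:
# right-hand sides of length 1 or 2 whose symbols may be terminals or
# nonterminals. Unit rules A -> X are handled by a closure per cell.

def closure(symbols, rules):
    """Add left-hand sides of unit rules A -> X until a fixed point."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if len(rhs) == 1 and rhs[0] in symbols and lhs not in symbols:
                symbols.add(lhs)
                changed = True
    return symbols

def recognize(word, rules, start):
    n = len(word)
    if n == 0:
        return False
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][i] = closure({ch}, rules)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cell = set()
            for lhs, rhs in rules:
                if len(rhs) == 2:
                    b, g = rhs
                    if any(b in table[i][k] and g in table[k + 1][j]
                           for k in range(i, j)):
                        cell.add(lhs)
            table[i][j] = closure(cell, rules)
    return start in table[0][n - 1]

# Hypothetical grammar in extended CNF for { a^n b^n }:
RULES = [
    ("S", ("a", "T")),    # S -> a T
    ("S", ("a", "b")),    # S -> a b
    ("T", ("S", "b")),    # T -> S b
    ("Z", ("S",)),        # unit rule; Z is the start symbol
]
assert recognize("aabb", RULES, "Z") and not recognize("abab", RULES, "Z")
```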

22 Sep 2003
TL;DR: This paper explores the learnability of context-free grammars given positive examples and lexical semantics, that is, where the learner has a representation of the meaning of each lexical item.
Abstract: Context-free grammars cannot be identified in the limit from positive examples (Gold, 1967), yet natural language grammars are more powerful than context-free grammars and humans learn them with remarkable ease from positive examples (Marcus, 1993). Identifiability results for formal languages ignore a potentially powerful source of information available to learners of natural languages, namely, meanings. This paper explores the learnability of context-free grammars given positive examples and lexical semantics. That is, the learner has a representation of the meaning of each lexical item.

01 Jan 2003
TL;DR: Conjunctive grammars were introduced in 2000 as a generalization of context-free grammars that allows the use of an explicit intersection operation in rules; several theoretical results on their properties have been obtained, and numerous open problems are proposed.
Abstract: Conjunctive grammars were introduced in 2000 as a generalization of context-free grammars that allows the use of an explicit intersection operation in rules. Several theoretical results on their properties have been obtained since then, and a number of efficient parsing algorithms that justify the practical value of the concept have been developed. This article reviews these results and proposes numerous open problems. The generative power of context-free grammars is generally considered to be insufficient for denoting many languages that arise in practice: it has often been observed that all natural languages contain non-context-free constructs, while the non-context-freeness of programming languages was proved already in the early 1960s. A review of several widely different subject areas led the authors of [5] to the noteworthy conclusion that “the world seems to be non-context-free”. This leaves the aforementioned world with the question of finding an adequate tool for denoting formal languages. As the descriptive means of context-free grammars are not sufficient but necessary for practical use, the attempts at developing new generative devices have usually been made by generalizing context-free grammars in this or that way. However, most of the time an extension that appears to be minor leads to a substantial increase in the generative power (context-sensitive and indexed grammars being good examples), which is usually accompanied by strong and very undesirable complexity hardness results. The ability to encode hard problems makes a formalism, in effect, a peculiar low-level programming language, where writing a grammar resembles coding in assembly

Journal ArticleDOI
15 Jun 2003
TL;DR: It is proved that the context-splittable normal form for rewriting systems defining Church-Rosser languages can be achieved for each Church-Rosser language and that the construction is effective.
Abstract: In this paper the context-splittable normal form for rewriting systems defining Church-Rosser languages is introduced. Context-splittable rewriting rules look like rules of context-sensitive grammars with swapped sides. To be more precise, they have the form uvw → uxw with u, v, w being words, v being nonempty and x being a single letter or the empty word. It is proved that this normal form can be achieved for each Church-Rosser language and that the construction is effective. Some interesting consequences of this characterization are given, too.
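A small illustrative checker for the context-splittable form just described: a rule uvw → uxw qualifies when the two sides share a prefix u and a suffix w, the replaced part v is nonempty, and x is a single letter or empty. (The paper is about transforming whole rewriting systems into this form, not about checking single rules; this is only to make the definition concrete.)

```python
# An illustrative checker for the context-splittable form: a rule
# lhs -> rhs qualifies if it can be written as u v w -> u x w with
# v nonempty and x a single letter or the empty word.

def is_context_splittable(lhs, rhs):
    for u_len in range(len(rhs) + 1):
        for w_len in range(len(rhs) - u_len + 1):
            u = rhs[:u_len]
            w = rhs[len(rhs) - w_len:] if w_len else ""
            x = rhs[u_len:len(rhs) - w_len]
            if len(x) > 1:
                continue                      # x must be a letter or empty
            # lhs must be u v w with the same context and a nonempty v
            if (lhs.startswith(u) and lhs.endswith(w)
                    and len(lhs) > u_len + w_len):
                return True
    return False

print(is_context_splittable("aXb", "acb"))    # True:  u=a, v=X, x=c, w=b
print(is_context_splittable("ab", "ba"))      # False: no common context
```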

Proceedings ArticleDOI
07 Jul 2003
TL;DR: It is shown that the class of rigid and k-valued NL grammars is unlearnable from strings, for each k; this result is obtained by a specific construction of a limit point in the considered class that does not use the product operator.
Abstract: This paper is concerned with learning categorial grammars in Gold's model. In contrast to k-valued classical categorial grammars, k-valued Lambek grammars are not learnable from strings. This result was shown for several variants, but the question was left open for the weakest one, the non-associative variant NL. We show that the class of rigid and k-valued NL grammars is unlearnable from strings, for each k; this result is obtained by a specific construction of a limit point in the considered class that does not use the product operator. Another interest of our construction is that it provides limit points for the whole hierarchy of Lambek grammars, including the recent pregroup grammars. Such a result aims at clarifying the possible directions for future learning algorithms: it expresses the difficulty of learning categorial grammars from strings and the need for an adequate structure on examples.

Journal ArticleDOI
TL;DR: The k-valued non-associative Lambek grammars learned from function-argument sentences lie at the frontier between learnable and unlearnable classes of languages.


01 Jan 2003
TL;DR: It is shown that some center-embedded phenomena which cannot be generated by a CG with regular selectors according to the recursion definition Mg belong to CL_Mg(CF).
Abstract: Contextual Grammars (CGs) provide an appropriate description of natural languages. Unfortunately, no parser which runs in polynomial time was known for some linguistically relevant classes. In this paper, an intertwined two-level Earley-based parser for CGs with finite, regular and context-free selectors is presented. In both phases context-free grammars are defined which identify individual selectors and contexts in the input string and which find context-selector pairs. The Earley algorithm provides an efficient data structure to exchange the results between the two phases (i.e. phase one -> phase two: contexts and selectors; phase two -> phase one: erased contexts according to a production) and to reuse intermediate results effectively when repeatedly new contexts and selectors are predicted for each removed context in the input string and consequently new productions are identified. As we deploy a polynomial parser for CGs with context-free selectors, the linguistic relevance of this family of languages (i.e. CL_α(CF)) is discussed here. We show that some center-embedded phenomena which cannot be generated by a CG with regular selectors according to the recursion definition Mg belong to CL_Mg(CF).
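
Since the parser above is built from intertwined Earley phases over constructed context-free grammars, the sketch below shows only the standard context-free Earley recognizer such a phase relies on, not the two-level machinery itself; the demo grammar is hypothetical and epsilon productions are not handled.

```python
# A standard context-free Earley recognizer (the building block of each
# phase), kept minimal: no epsilon productions, recognition only.

def earley_recognize(grammar, start, tokens):
    """grammar: dict nonterminal -> list of right-hand sides (tuples)."""
    # An item is (lhs, rhs, dot, origin).
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot == len(rhs):                          # completer
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, o2)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
            else:
                sym = rhs[dot]
                if sym in grammar:                       # predictor
                    for r in grammar[sym]:
                        new = (sym, r, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < len(tokens) and tokens[i] == sym:   # scanner
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])

# Hypothetical demo grammar for nested a...b pairs.
G = {"S": [("a", "S", "b"), ("a", "b")]}
print(earley_recognize(G, "S", list("aabb")),    # True
      earley_recognize(G, "S", list("abb")))     # False
```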