
Showing papers on "Context-sensitive grammar published in 1999"


Proceedings ArticleDOI
20 Jun 1999
TL;DR: Two computationally-tractable ways of estimating the parameters of Stochastic "Unification-Based" Grammars from a training corpus of syntactic analyses are described and applied to estimate a stochastic version of Lexical-Functional Grammar.
Abstract: Log-linear models provide a statistically sound framework for Stochastic "Unification-Based" Grammars (SUBGs) and stochastic versions of other kinds of grammars. We describe two computationally-tractable ways of estimating the parameters of such grammars from a training corpus of syntactic analyses, and apply these to estimate a stochastic version of Lexical-Functional Grammar.
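The log-linear framework the abstract refers to can be sketched in a few lines: the probability of an analysis is proportional to the exponential of a weighted sum of its feature values. The sketch below is illustrative only; the feature vectors and weights are hypothetical, and the paper's actual estimation procedures are not reproduced here.

```python
import math

def loglinear_probs(feature_vectors, weights):
    """Conditional log-linear distribution over the candidate analyses
    of one sentence: P(y) is proportional to exp(sum_k w_k * f_k(y))."""
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feature_vectors]
    z = sum(math.exp(s) for s in scores)          # partition function
    return [math.exp(s) / z for s in scores]

# Three candidate parses, each described by two hypothetical features
# (e.g. "uses rule R", "has deep attachment"); weights are made up.
probs = loglinear_probs([[1, 0], [0, 1], [1, 1]], [1.0, -0.5])
print(probs)
```

The distribution sums to one by construction, and the parse with the highest weighted feature score receives the highest probability.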

213 citations


Proceedings ArticleDOI
20 Jun 1999
TL;DR: The precise relationship between probabilistic context-free grammars and shift-reduce probabilistic pushdown automata is investigated, showing that, while they define the same classes of probabilistic languages, they appear to impose different inductive biases.
Abstract: Both probabilistic context-free grammars (PCFGs) and shift-reduce probabilistic pushdown automata (PPDAs) have been used for language modeling and maximum likelihood parsing. We investigate the precise relationship between these two formalisms, showing that, while they define the same classes of probabilistic languages, they appear to impose different inductive biases.
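In both formalisms a derivation's probability is a product of step probabilities; for a PCFG it is simply the product of the probabilities of the rules used. A minimal sketch with a hypothetical toy grammar:

```python
# Toy PCFG: each rule (lhs, rhs) maps to a probability; the
# probabilities of the alternatives for each nonterminal sum to 1.
pcfg = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.6,
    ("NP", ("fish",)): 0.4,
    ("VP", ("eats", "NP")): 0.7,
    ("VP", ("sleeps",)): 0.3,
}

def derivation_prob(rules):
    """Probability of a derivation = product of its rule probabilities."""
    p = 1.0
    for r in rules:
        p *= pcfg[r]
    return p

# A leftmost derivation of "she eats fish"
d = [("S", ("NP", "VP")), ("NP", ("she",)),
     ("VP", ("eats", "NP")), ("NP", ("fish",))]
print(derivation_prob(d))  # 1.0 * 0.6 * 0.7 * 0.4
```

A shift-reduce PPDA would instead attach probabilities to shift and reduce moves; the paper's point is that the two parameterizations, though weakly equivalent, favor different distributions when trained.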

92 citations


Journal ArticleDOI
TL;DR: In this paper, six different types of shape grammars are defined by considering different kinds of restrictions on rule format and rule ordering, and the effects that these different restrictions have on the generative power, practicality, pedagogical value, and other characteristics of a shape grammar are discussed.
Abstract: The issue of decidability in relation to shape grammars is considered here. Decidability concerns, first, the identification of different types of grammars and, second, the answerability or solvability of questions about these types of grammars. In this paper, the first of these two topics is examined. Six different types of shape grammars are defined by considering different kinds of restrictions on rule format and rule ordering. The effects that these different restrictions have on the generative power, practicality, pedagogical value, and other characteristics of a shape grammar are discussed. In a subsequent paper, “Shape grammars: five questions” (Knight, 1998), the answerabilities of various questions about the types of shape grammars outlined here are explored. The decidability issues addressed in this paper and the subsequent one are key to the practical use of shape grammars in design projects where specific goals and constraints need to be satisfied.

55 citations


01 Jan 1999
TL;DR: The recognizers presented in this dissertation will recognize languages generated by Minimalist Grammars as defined in [Sta97] and will be used to rigorously explore the computational consequences of psycholinguistic theories of human sentence processing.
Abstract: In this paper I will present a formal specification of a recognizer for languages generated by Minimalist Grammars (Stabler, 1997). Minimalist Grammars are simple, formal grammars modeling some important aspects of the kind of grammars developed in the framework of Chomsky's Minimalist Program (Chomsky, 1995). A Minimalist Grammar is defined by a set of lexical items, which varies from language to language, and two universal structure building functions, which are defined on trees or configurations: merge and move. Michaelis (1998) has shown that the set of languages generated by Minimalist Grammars is mildly context-sensitive: it falls properly between the set of context-sensitive languages and the set of context-free languages. Mildly context-sensitive languages are assumed to be appropriately powerful for the description of natural languages (Joshi, 1985). Minimalist Grammars can move material from positions arbitrarily deep inside a sentence. This property contributes to the non-context-free nature of these grammars. Michaelis' equivalence result (Michaelis, 1998) permits a representation of Minimalist Grammars in which the operations of the grammar are characterized in such a way that they are strictly local. In this representation, configurations are reduced to those properties that determine their behavior with regard to the structure building functions merge and move. This representation will be used in this paper to derive a top-down recognizer for languages generated by Minimalist Grammars. The recognizer starts from the assumption that the sentence to be parsed is actually a grammatical sentence, and then tries to disassemble it into lexical items by repeatedly applying the structure building functions merge and move in reverse. 
The recognizer presented in this paper has the correct-prefix property: it goes through the input sentence from left to right and, in case of an ungrammatical sentence, it will halt at the first word that does not fit into a grammatical structure, i.e., the recognizer will not go beyond a prefix that cannot be extended to a grammatical sentence in the language. Similarly, the parser of the human sentence processor detects an ungrammaticality at the first word that makes a sentence ungrammatical, and for garden path sentences it will generally hesitate at the first word that does not fit into the structure that has been hypothesized for the sentence. This is a computationally advantageous property for a recognizer to have, because it prevents the recognizer from spending any effort on a sentence once it is known that it is ungrammatical. Besides contributing to a deeper understanding of Minimalist Grammars and theories that can be formalized in a similar fashion, e.g. Strict Asymmetry Grammars (e.g. Di Sciullo, 1999), the work reported in this paper may also be relevant for psycholinguistic inquiries. In most psycholinguistic proposals, the operations of the human parser are informally sketched in the context of a small set of example sentences, leaving open the question whether a parser with the desired properties covering the entire language actually exists. Another drawback of some psycholinguistic work is that it is based on simplistic and outdated conceptions of syntactic structure. Having a formal, sound and complete parsing model for Minimalist Grammars may help to remedy these problems.

55 citations



Book ChapterDOI
01 Sep 1999
TL;DR: In this paper, the authors introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free grammar, rather than over standard context free grammar, and characterize the expressiveness of extended AGs in terms of monadic second-order logic.
Abstract: Document specification languages like XML model documents using extended context-free grammars. These differ from standard context-free grammars in that they allow arbitrary regular expressions on the right-hand side of productions. To query such documents, we introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free grammars rather than over standard context-free grammars. Viewed as a query language, extended AGs are particularly relevant as they can take into account the inherent order of the children of a node in a document. We show that two key properties of standard attribute grammars carry over to extended AGs: efficiency of evaluation and decidability of well-definedness. We further characterize the expressiveness of extended AGs in terms of monadic second-order logic and establish the complexity of their non-emptiness and equivalence problem to be complete for EXPTIME. As an application we show that the Region Algebra expressions can be efficiently translated into extended AGs. This translation drastically improves the known upper bound on the complexity of the emptiness and equivalence test for Region Algebra expressions.
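An extended context-free grammar allows a regular expression over child labels on each right-hand side, which is exactly the content-model style of XML DTDs. A minimal sketch, with hypothetical element names, that validates one node's child sequence against such a production:

```python
import re

# Extended CFG: the right-hand side of each production is a regular
# expression over child labels (element names here are hypothetical).
productions = {
    "book":    r"title(,author)+(,chapter)*",  # one title, 1+ authors, 0+ chapters
    "title":   r"",                            # leaves have empty content
    "author":  r"",
    "chapter": r"",
}

def valid_children(label, children):
    """Check one node: does its child-label sequence match the RHS regex?"""
    return re.fullmatch(productions[label], ",".join(children)) is not None

print(valid_children("book", ["title", "author", "author", "chapter"]))
print(valid_children("book", ["author", "title"]))
```

A standard CFG would need extra helper nonterminals to express the `+` and `*` repetitions; the extended form states them directly.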

42 citations


Book ChapterDOI
30 Aug 1999
TL;DR: The model of generalized P-systems (GP-systems for short), a variant of the membrane-structure computation model recently introduced by Gheorghe Păun, is considered; it allows for the simulation of graph controlled grammars of arbitrary type based on productions working on single objects.
Abstract: We consider a variant of P-systems, a new model for computations using membrane structures recently introduced by Gheorghe Păun. Using the membranes as a kind of filter for specific objects when transferring them into an inner compartment turns out to be a very powerful mechanism in combination with suitable rules to be applied within the membranes. The model of generalized P-systems, GP-systems for short, considered in this paper allows for the simulation of graph controlled grammars of arbitrary type based on productions working on single objects; for example, the general results we establish in this paper can immediately be applied to the graph controlled versions of context-free string grammars, n-dimensional #-context-free array grammars, and elementary graph grammars.

36 citations


01 Jan 1999
TL;DR: This work presents in detail a method for constructing local grammars from large corpora, using a particular representation by graphs that is well-adapted to the syntax of natural languages.
Abstract: Local grammars are finite-state grammars or finite-state automata that represent sets of utterances of a natural language. Local grammars have been used to describe a wide variety of sets of strings, ranging from finite sets of words related by prefixation and suffixation to sets of sentences syntactically and semantically related. Formal representations of finite-state grammars are quite varied, although equivalent. We have chosen a particular representation by graphs which is well-adapted to the syntax of natural languages. Using a particular example, we will present in detail a method for constructing local grammars using large corpora.
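A local grammar is, at bottom, a finite-state automaton whose transitions are labeled with words, with one accepting path per utterance. A minimal sketch with a hypothetical toy automaton for a few time expressions:

```python
# A local grammar as a finite-state automaton: transitions map
# (state, word) to the next state (the utterances here are made up).
fsa = {
    (0, "on"): 1, (1, "Monday"): 3, (1, "Tuesday"): 3,
    (0, "next"): 2, (2, "week"): 3,
}
FINAL = {3}

def accepts(words):
    """Run the automaton over a word sequence; accept on a final state."""
    state = 0
    for w in words:
        if (state, w) not in fsa:
            return False
        state = fsa[(state, w)]
    return state in FINAL

print(accepts("on Monday".split()), accepts("on week".split()))
```

The graph representation the abstract mentions draws exactly such automata as box-and-arrow diagrams, which makes them easy to build and maintain by hand from corpus evidence.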

28 citations


Journal ArticleDOI
TL;DR: It is proved that all recursively enumerable languages can be generated by context-free returning parallel communicating grammar systems by showing how the parallel communicating grammars can simulate two-counter machines.

26 citations


Proceedings Article
01 Jan 1999
TL;DR: This work presents a necessary and sufficient, but in general undecidable, criterion for exponential ambiguity, and makes it possible to distinguish infinitely ambiguous context-free grammars by the growth rate of their ambiguity functions.

25 citations


01 Jan 1999
TL;DR: The aim of this work is to show that "lean" programs are possible for grammatical inference; it is shown that rational unification allows such grammars to be inferred.
Abstract: We propose to set the grammatical inference problem in a logical framework. The search for admissible solutions in a given class of languages is reduced to the problem of unifying a set of terms. This point of view has already been developed in the particular context of categorial grammars, a type of lexicalized grammar. We present the state of the art in this domain and propose several improvements. The case of regular grammars is studied in a second part. We show that rational unification allows such grammars to be inferred. We give corresponding Prolog programs in both cases. Indeed, one of the aims of this work is to show that "lean" programs are possible for grammatical inference. This approach has been successful in the field of automated theorem proving and we expect to observe the same benefits in grammatical inference: efficiency and extendibility.

01 Jan 1999
TL;DR: This paper demonstrates that the multimodal categorial grammars are in fact Turing-complete in their weak generative capacity, and discusses a restriction to the so-called weak Sahlqvist lexical rules, for which decidability can be ensured.
Abstract: In this paper, we demonstrate that the multimodal categorial grammars are in fact Turing-complete in their weak generative capacity. The result follows from a straightforward reduction of generalized rewriting systems to a mixed associative and modal categorial calculus. We conclude with a discussion of a restriction to the so-called weak Sahlqvist lexical rules, for which we can ensure decidability.

Proceedings ArticleDOI
08 Jun 1999
TL;DR: It is shown that the positive version of RCGs, like simple LMGs or integer-indexing LFPs, exactly covers the class PTIME of languages recognizable in deterministic polynomial time; this formalism, which extends CFGs, aims at being a convincing challenger as a syntactic base for various tasks, especially in natural language processing.
Abstract: The notion of mild context-sensitivity was formulated in an attempt to express the formal power which is both necessary and sufficient to define the syntax of natural languages. However, some linguistic phenomena such as Chinese numbers and German word scrambling lie beyond the realm of mildly context-sensitive formalisms. On the other hand, the class of range concatenation grammars provides added power w.r.t. mildly context-sensitive grammars while keeping a polynomial parse time behavior. In this report, we show that this increased power can be used to define the above-mentioned linguistic phenomena with a polynomial parse time of a very low degree.

Patent
Robert C. Moore1
16 Nov 1999
TL;DR: In this paper, a method is presented for transforming a first set of rule expressions forming a first grammar into a second set of rule expressions forming a second grammar, identifying at least one left-recursive category of the first grammar and applying a left-corner transform to substantially only the left-recursive rule expressions in forming the second grammar.
Abstract: A method for transforming a first set of rule expressions forming a first grammar to a second set of rule expressions forming a second grammar includes identifying at least one left-recursive category of the first grammar; and applying a left-corner transform to substantially only the left-recursive rule expressions of the first grammar in forming the second grammar.
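The patent's left-corner transform itself is not reproduced here; as a related illustration of why left-recursive rules need special treatment (a top-down parser loops on them), the classic elimination of direct left recursion rewrites A → A α | β into A → β A′, A′ → α A′ | ε:

```python
def eliminate_direct_left_recursion(nt, rhss):
    """Replace A -> A a | b with A -> b A', A' -> a A' | eps.
    This is the classic textbook transform, shown for illustration;
    the left-corner transform in the patent is related but distinct."""
    rec = [r[1:] for r in rhss if r and r[0] == nt]   # A -> A alpha
    base = [r for r in rhss if not r or r[0] != nt]   # A -> beta
    if not rec:
        return {nt: rhss}                             # nothing to do
    new = nt + "'"
    return {
        nt: [b + [new] for b in base],
        new: [a + [new] for a in rec] + [[]],         # [] is epsilon
    }

# Expr -> Expr + Term | Term
print(eliminate_direct_left_recursion("Expr", [["Expr", "+", "Term"], ["Term"]]))
```

The patent's selectivity ("substantially only the left-recursive rule expressions") matters because transforming every rule, as naive whole-grammar transforms do, blows up grammar size unnecessarily.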

01 Sep 1999
TL;DR: The design and use of the Simple Language Generator (SLG) is introduced, which allows the user to construct small but interesting stochastic context free languages with relative ease.
Abstract: This paper introduces the design and use of the Simple Language Generator (SLG). SLG allows the user to construct small but interesting stochastic context free languages with relative ease. Although context free grammars are convenient for representing natural language syntax, they do not easily support the semantic and pragmatic constraints that make certain combinations of words or structures more likely than others. Context free grammars for languages involving many interacting constraints can become extremely complex and cannot reasonably be written by hand. SLG allows the basic syntax of a grammar to be specified in context free form and constraints to be applied atop this framework in a relatively natural fashion. This combination of grammar and constraints is then converted into a standard stochastic context free grammar for use in generating sentences or in making context dependent likelihood predictions of the sequence of words in a sentence.
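Sentence generation from a stochastic CFG of the kind SLG outputs can be sketched as a recursive sampler: expand each nonterminal by drawing a right-hand side in proportion to its probability. The toy grammar below is hypothetical and is not SLG's actual input format:

```python
import random

# A tiny stochastic CFG: each nonterminal maps to (rhs, probability)
# alternatives; symbols absent from the dict are terminals.
grammar = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["the", "cat"], 0.5), (["a", "dog"], 0.5)],
    "VP": [(["sleeps"], 0.7), (["sees", "NP"], 0.3)],
}

def generate(symbol, rng):
    """Sample one yield of `symbol` by recursive weighted expansion."""
    if symbol not in grammar:                 # terminal: emit as-is
        return [symbol]
    alts, weights = zip(*grammar[symbol])
    rhs = rng.choices(alts, weights=weights)[0]
    return [w for s in rhs for w in generate(s, rng)]

print(" ".join(generate("S", random.Random(0))))
```

SLG's contribution is upstream of this step: it compiles a context-free skeleton plus constraints into such a grammar, so that the plain sampler above already respects the semantic and pragmatic constraints.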

Journal ArticleDOI
TL;DR: The main point of this paper is the systematic study of all possibilities of defining leftmost derivation in matrix grammars; a characterization of the recursively enumerable languages is found for matrix grammars with the leftmost restriction defined on classes of a given partition of the nonterminal alphabet.
Abstract: Matrix grammars are one of the classical topics of formal languages, more specifically, regulated rewriting. Although this type of control on the work of context-free grammars is one of the earliest, matrix grammars still raise interesting questions (not to speak about old open problems in this area). One such class of problems concerns the leftmost derivation (in grammars without appearance checking). The main point of this paper is the systematic study of all possibilities of defining leftmost derivation in matrix grammars. Twelve types of such a restriction are defined, only four of which have been discussed in the literature. For seven of them, we find a proof of a characterization of recursively enumerable languages (by matrix grammars with arbitrary context-free rules but without appearance checking). Three other cases characterize the recursively enumerable languages modulo a morphism and an intersection with a regular language. In this way, we solve nearly all problems listed as open on page 67 of the monograph [7], which can be seen as the main contribution of this paper. Moreover, we find a characterization of the recursively enumerable languages for matrix grammars with the leftmost restriction defined on classes of a given partition of the nonterminal alphabet.

Journal ArticleDOI
TL;DR: A hypergraph-generating system, called HRNCE grammars, is presented; it is structurally simple and descriptively powerful, and can generate all recursively enumerable languages.

Journal ArticleDOI
TL;DR: In this article, six different types of shape grammars were defined in terms of different restrictions on rule format and rule ordering, and the generative power, practicality, pedagogical value, and other characteristics of each type of shape grammar were discussed.
Abstract: In the paper “Shape grammars: six types”, the issue of decidability in relation to shape grammars was introduced. Decidability concerns, first, the identification of different types of grammars, and, second, the answerability or solvability of questions about these types of grammars. The first of these two topics was explored in “Six types”. Six different types of shape grammars were defined in terms of different restrictions on rule format and rule ordering. The generative power, practicality, pedagogical value, and other characteristics of each type of shape grammar were discussed. In this paper, the second of the two topics in decidability is addressed. Five questions about the different types of shape grammars defined in “Six types” are posed. These questions are formulated for their practical value in design applications of shape grammars, as well as their theoretical interest. The answerability of each question is examined in detail for each type of shape grammar.

Journal ArticleDOI
TL;DR: It is shown that context-free picture grammars are strictly weaker than both random permitting and random forbidding context picture grammars, and also that random permitting context is strictly weaker than random context.
Abstract: We use random context picture grammars to generate pictures through successive refinement. The productions of such a grammar are context-free, but their application is regulated — "permitted" or "forbidden" — by context randomly distributed in the developing picture. Grammars using this relatively weak context often succeed where context-free grammars fail, e.g. in generating the Sierpinski carpets. On the other hand it proved possible to develop iteration theorems for three subclasses of these grammars, namely a pumping–shrinking, a pumping and a shrinking lemma for context-free, random permitting and random forbidding context picture grammars, respectively. Finding necessary conditions is problematic in the case of most models of context-free grammars with context-sensing ability, since they consider a variable and its context as a finite connected array. We have already shown that context-free picture grammars are strictly weaker than both random permitting and random forbidding context picture grammars, and also that random permitting context is strictly weaker than random context. We now show that grammars which use forbidding context only are strictly weaker than random context picture grammars.

Proceedings Article
Brian J. Ross1
13 Jul 1999
TL;DR: The DCTG-GP system improves on other grammar-based GP systems by permitting nontrivial semantic aspects of the language to be defined with the grammar; it also automatically analyzes grammar rules to determine their minimal depth and termination characteristics, which are required when generating random program trees of varied shapes and sizes.
Abstract: DCTG-GP is a genetic programming system that uses definite clause translation grammars. A DCTG is a logical version of an attribute grammar that supports the definition of context-free languages, and it allows semantic information associated with a language to be easily accommodated by the grammar. This is useful in genetic programming for defining the interpreter of a target language, or incorporating both syntactic and semantic problem-specific constraints into the evolutionary search. The DCTG-GP system improves on other grammar-based GP systems by permitting nontrivial semantic aspects of the language to be defined with the grammar. It also automatically analyzes grammar rules in order to determine their minimal depth and termination characteristics, which are required when generating random program trees of varied shapes and sizes. An application using DCTG-GP is described.
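The minimal-depth analysis mentioned above can be sketched as a fixpoint computation over the grammar: the minimal depth of a nonterminal is the least derivation-tree depth needed to reach all-terminal leaves, and a generator uses it to avoid starting subtrees that cannot terminate within the remaining depth budget. This is an illustrative reconstruction, not the DCTG-GP code:

```python
# Fixpoint computation of minimal derivation depth per nonterminal.
def minimal_depths(grammar):
    depth = {nt: float("inf") for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                # depth of a rule = 1 + max over its nonterminal children
                d = 1 + max((depth[s] for s in rhs if s in grammar), default=0)
                if d < depth[nt]:
                    depth[nt], changed = d, True
    return depth

g = {"E": [["E", "+", "T"], ["T"]], "T": [["x"], ["(", "E", ")"]]}
print(minimal_depths(g))  # T terminates at depth 1, E at depth 2
```

A nonterminal whose depth stays infinite can never terminate, which is the termination check the abstract alludes to.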


Journal ArticleDOI
TL;DR: The main aim of this paper is to provide an approach to the parallel composition of graph grammars, formalizing the intuitive idea of ‘divide and conquer’ described above.
Abstract: The specification of complex systems is usually done by the ‘divide and conquer’ idea: the system is divided into smaller, less complex components that are developed separately and then merged in some way to form the specification of the whole system. The main aim of this paper is to provide an approach to the parallel composition of graph grammars, formalizing the intuitive idea of ‘divide and conquer’ described above. This parallel composition of graph grammars provides a suitable formalism for the specification of concurrent systems based on the specifications of their components. ‘Dividing’ is formalized by special graph grammar morphisms, called specialization morphisms. These morphisms also describe structural and behavioural compatibilities between graph grammars. As a main result, we characterize the parallel composition as the pullback in the category of graph grammars.

Journal Article
TL;DR: If |w| ≠ |x|, it is proved that there exists a G separating w from x of size O(log log n), and this bound is best possible.

Abstract: We study the following problem: given two words w and x, with |w|, |x| ≤ n, what is the size of the smallest context-free grammar G which generates exactly one of {w, x}? If |w| ≠ |x|, then we prove there exists a G separating w from x of size O(log log n), and this bound is best possible. If |w| = |x|, then we get an upper bound on the size of G of O(log n), and a lower bound of Ω(log n / log log n).
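The reason a grammar far smaller than the words themselves can single one of them out is repeated squaring: with k + 1 rules a grammar generates the single word a^(2^k), so rule count grows only logarithmically in word length. The sketch below illustrates that size/length gap; it is not the paper's O(log log n) construction, which additionally reasons about the lengths modulo small numbers:

```python
# A straight-line grammar ("A0", "A1", ... are hypothetical names):
# A0 -> a, and each Ai doubles A(i-1), so Ak derives a^(2^k).
def power_word_grammar(k):
    rules = {"A0": "a"}
    for i in range(1, k + 1):
        rules[f"A{i}"] = (f"A{i-1}", f"A{i-1}")
    return rules

def expand(rules, nt):
    """Derive the unique word generated from nonterminal nt."""
    rhs = rules[nt]
    if isinstance(rhs, str):
        return rhs                                  # terminal rule
    return "".join(expand(rules, s) for s in rhs)   # concatenate children

g = power_word_grammar(4)
print(len(g), len(expand(g, "A4")))  # 5 rules, word of length 16
```

Such a grammar generates a word of one fixed length, which is the basic lever for separating two words whose lengths differ.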

Book
01 Jan 1999
TL;DR: Contributions include a proof-theoretic analysis of intonation, a linear logic treatment of phrase structure grammars for unbounded dependencies, and mathematical vernacular and conceptual well-formedness in mathematical language.
Abstract: Invited papers.- Type Grammar Revisited.- Optimal Parameters.- Selected papers.- Strong Equivalence of Generalized Ajdukiewicz and Lambek Grammars.- Linguistic, Philosophical, and Pragmatic Aspects of Type-Directed Natural Language Parsing.- Derivational and Representational Views of Minimalist Transformational Grammar.- The MSO Logic-Automaton Connection in Linguistics.- The Logic of Tune: A Proof-Theoretic Analysis of Intonation.- A Linear Logic Treatment of Phrase Structure Grammars for Unbounded Dependencies.- Underspecification in Type-Logical Grammars.- On Fibring Feature Logics with Concatenation Logics.- An Operational Model for Parsing Definite Clause Grammars with Infinite Terms.- Mathematical Vernacular and Conceptual Well-Formedness in Mathematical Language.

Journal ArticleDOI
TL;DR: This model is a generalization of multi-bracketed contextual grammars, which possess an induced Dyck-structure to control the derivation process and to provide derivation trees.
Abstract: We study the generative capacity of multi-bracketed contextual rewriting grammars. This model is a generalization of multi-bracketed contextual grammars, which were studied in [Kap98a]. They possess an induced Dyck-structure to control the derivation process and to provide derivation trees. The generative capacity of this class is investigated and compared to Chomsky grammars and to tree adjoining grammars with local constraints. It will be shown that this class of grammars covers the basic natural language constructions such as duplication, multiple agreement and crossed-serial dependencies. Furthermore, two natural variants of the derivation relation, namely top-down and bottom-up derivation modes are examined.

Journal Article
TL;DR: A new representation scheme for extended context-free grammars (the symbol-threaded expression forest), a new normal form for these grammars (dot normal form), and new regular expression algorithms are introduced.
Abstract: We investigate the complexity of a variety of normal-form transformations for extended context-free grammars, where by extended we mean that the set of right-hand sides for each nonterminal in such a grammar is a regular set. The study is motivated by the implementation project GraMa which will provide a C++ toolkit for the symbolic manipulation of context-free objects just as Grail does for regular objects. The results are that all transformations of interest take time linear in the size of the given grammar giving resulting grammars that are larger by a constant factor than the original grammar. Our results generalize known bounds for context-free grammars but do so in nontrivial ways. Specifically, we introduce a new representation scheme for extended context-free grammars (the symbol-threaded expression forest), a new normal form for these grammars (dot normal form) and new regular expression algorithms.


Journal ArticleDOI
01 Dec 1999-Grammars
TL;DR: It is shown that the hypergraph-generating power of (remote-free) C-hNCE grammars properly includes that of HR and S-HH grammars together, which indicates that confluent node rewriting plays as important a role in generating sets of hypergraphs as it does in generating sets of graphs.
Abstract: Context-free hypergraph grammars allow sets of hypergraphs to be defined in a recursive way. In the literature, three main approaches can be found: hyperedge rewriting (HR), separated handle rewriting (S-HH), and confluent node rewriting (C-hNCE). With respect to their graph-generating power, S-HH grammars and so-called remote-free C-hNCE grammars characterize confluent node rewriting in graphs, which in turn is more powerful than hyperedge rewriting. With respect to their hypergraph-generating power, HR and S-HH grammars have been shown to be incomparable. In this paper, we show that the hypergraph-generating power of (remote-free) C-hNCE grammars properly includes that of HR and S-HH grammars together. This indicates that confluent node rewriting plays as important a role in generating sets of hypergraphs as it does in generating sets of graphs.

Journal ArticleDOI
TL;DR: In this paper, a new way of generating mildly context-sensitive languages is introduced, in which the contexts are adjoined by shuffling them on certain trajectories.
Abstract: We introduce and investigate a new way of generating mildly context-sensitive languages. The main idea is that the contexts are adjoined by shuffling them on certain trajectories. In this way we also obtain a very general class of contextual grammars, such that most of the fundamental classes of contextual grammars, for instance, internal contextual grammars, external contextual grammars, n-contextual grammars, are particular cases of contextual grammars with contexts shuffled on trajectories. The approach is very flexible, able to model various aspects from linguistics.
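A trajectory is a word over {l, r} that dictates, position by position, from which of the two words the next letter is taken. A minimal sketch of the shuffle operation itself:

```python
def shuffle_on_trajectory(u, v, t):
    """Shuffle words u and v along trajectory t over {'l', 'r'}:
    'l' takes the next letter of u, 'r' the next letter of v.
    Defined only when t contains len(u) l's and len(v) r's."""
    assert t.count("l") == len(u) and t.count("r") == len(v)
    iu, iv, out = iter(u), iter(v), []
    for step in t:
        out.append(next(iu) if step == "l" else next(iv))
    return "".join(out)

print(shuffle_on_trajectory("abc", "XY", "lrllr"))  # "aXbcY"
```

Restricting the set of admissible trajectories to a particular language is what recovers the classical contextual-grammar variants as special cases: for instance, trajectories of the form l*r* insert the context entirely after the base word.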

Journal ArticleDOI
TL;DR: It is shown that there is a regular language which cannot be generated by context-free evolutionary grammars, thus disproving a conjecture from Dassow et al. (BioSystems 43 (1997) 169–177).