
Showing papers on "Context-sensitive grammar published in 1998"


Book
01 Jun 1998
TL;DR: The dissertation investigates learnability of various classes of classical categorial grammars within the Gold paradigm of identification in the limit from positive data and proves that finite elasticity is preserved under the inverse image of a finite-valued relation, extending results of Wright's and of Moriyama and Sato's.
Abstract: The dissertation investigates learnability of various classes of classical categorial grammars within the Gold paradigm of identification in the limit from positive data. Both learning from functor-argument structures and learning from flat strings are considered. The class of rigid grammars, the class of k-valued grammars (k = 2, 3, ...), the class of least-valued grammars, and the class of least-cardinality grammars are shown to be learnable from structures, and the class of rigid grammars and the class of k-valued grammars (k = 2, 3, ...) are also shown to be learnable from strings. An interesting class that is not learnable even from structures is treated as well. In proving learnability results, I make essential use of the concept known as finite elasticity, which is a property of language classes. I prove that finite elasticity is preserved under the inverse image of a finite-valued relation, extending results of Wright's and of Moriyama and Sato's. I use this theorem to 'transfer' finite elasticity from the class of rigid structure languages to the class of k-valued structure languages, and then to the class of k-valued string languages. The learning algorithms used incorporate Buszkowski and Penn's algorithms for determining categorial grammars from input consisting of functor-argument structures. Some of the learnability results are extended to such loosely 'categorial' formalisms as combinatory grammar and Montague grammar. The appendix presents Prolog implementations of some of the learning algorithms used in the dissertation.

161 citations


Book ChapterDOI
14 Dec 1998
TL;DR: It is shown that this type of minimalist grammar constitutes a subclass of mildly context-sensitive grammars in the sense that for each MG there is a weakly equivalent linear context-free rewriting system (LCFRS).
Abstract: The change within the linguistic framework of transformational grammar from GB-theory to minimalism brought up a particular type of formal grammar as well. We show that this type of minimalist grammar (MG) constitutes a subclass of mildly context-sensitive grammars in the sense that for each MG there is a weakly equivalent linear context-free rewriting system (LCFRS). Moreover, an infinite hierarchy of MGs is established in relation to a hierarchy of LCFRSs.

138 citations


Proceedings ArticleDOI
Mark Johnson
10 Aug 1998
TL;DR: This paper describes how to construct a finite-state machine (FSM) approximating a 'unification-based' grammar using a left-corner grammar transform; the approximation is exact for left-linear and right-linear CFGs, and for trees up to a user-specified depth of center-embedding.
Abstract: This paper describes how to construct a finite-state machine (FSM) approximating a 'unification-based' grammar using a left-corner grammar transform. The approximation is presented as a series of grammar transforms, and is exact for left-linear and right-linear CFGs, and for trees up to a user-specified depth of center-embedding.
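The left-corner grammar transform at the heart of this construction can be sketched on a plain CFG. The code below is an illustrative implementation of the standard left-corner rule schemata, not the paper's exact scheme; the function name and grammar encoding are hypothetical. The new nonterminal "A-X" reads "an A whose left corner X has already been found".

```python
def left_corner_transform(rules, terminals):
    """Left-corner transform of a CFG (a sketch, not the paper's exact scheme).

    rules: list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    Returns the rules of the transformed grammar.
    """
    nonterminals = {lhs for lhs, _ in rules}
    out = []
    for A in nonterminals:
        for a in terminals:
            out.append((A, (a, f"{A}-{a}")))          # shift a terminal left corner
        out.append((f"{A}-{A}", ()))                  # the left corner is A itself
        for B, rhs in rules:
            if rhs:
                X, beta = rhs[0], rhs[1:]
                out.append((f"{A}-{X}", beta + (f"{A}-{B}",)))  # project B -> X beta
    return out
```

For the left-recursive grammar S → S a | b, the transform yields, among others, S → b S-b and S-S → a S-S, so the left recursion becomes right recursion, which is the property the FSM approximation exploits.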

63 citations


Proceedings ArticleDOI
01 May 1998
TL;DR: This work shows that the queries expressible by RAGs are precisely those definable by first-order inductions of linear depth, or, equivalently, those computable in linear time on a parallel machine with polynomially many processors, and shows that RAGs are more expressive than monadic second-order logic for queries of any arity.
Abstract: Structured document databases can be naturally viewed as derivation trees of a context-free grammar. Under this view, the classical formalism of attribute grammars becomes a formalism for structured document query languages. From this perspective, we study the expressive power of BAGs: Boolean-valued attribute grammars with propositional logic formulas as semantic rules, and RAGs: relation-valued attribute grammars with first-order logic formulas as semantic rules. BAGs can express only unary queries; RAGs can express queries of any arity. We first show that the (unary) queries expressible by BAGs are precisely those definable in monadic second-order logic. We then show that the queries expressible by RAGs are precisely those definable by first-order inductions of linear depth, or, equivalently, those computable in linear time on a parallel machine with polynomially many processors. Further, we show that RAGs that only use synthesized attributes are strictly weaker than RAGs that use both synthesized and inherited attributes. We show that RAGs are more expressive than monadic second-order logic for queries of any arity. Finally, we discuss relational attribute grammars in the context of BAGs and RAGs. We show that in the case of BAGs this does not increase the expressive power, while different semantics for relational RAGs capture the complexity classes NP, coNP and UP ∩ coUP.
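The BAG idea can be made concrete with a toy evaluator. The code below is an illustration, not from the paper: each production carries a propositional semantic rule computing the parent's synthesized Boolean attribute from the children's attributes, and the example query asks whether the derived word contains the letter a. All names are hypothetical, and inherited attributes are omitted for brevity.

```python
def evaluate(tree, sem_rules):
    """Evaluate a synthesized Boolean attribute bottom-up over a derivation tree.

    tree: either a terminal string, or (production_name, [subtrees]).
    sem_rules: maps a production name to a propositional function of the
    children's attribute values.
    """
    if isinstance(tree, str):              # leaf attribute: "is this the letter a?"
        return tree == "a"
    prod, children = tree
    child_vals = [evaluate(c, sem_rules) for c in children]
    return sem_rules[prod](child_vals)

# Propositional semantic rules for a toy grammar S -> S S | a | b.
sem_rules = {
    "S->SS": lambda v: v[0] or v[1],   # an 'a' occurs in either half
    "S->t":  lambda v: v[0],           # copy the leaf's attribute
}
```

In the BAG setting such a Boolean attribute defines a unary query: a node is selected iff its attribute evaluates to true.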

54 citations


Journal ArticleDOI
TL;DR: It is proved here that each recursively enumerable language can be written as the weak coding of the image by an inverse morphism of a language generated by an insertion grammar (with the maximal length of strings u, v as above equal to seven).

47 citations


Journal Article
TL;DR: The paper discusses some classes of contextual grammars---mainly those with "maximal use of selectors"---giving some arguments that these grammars can be considered a good model for natural language syntax, and some ideas for associating a structure to the generated words, in the form of a tree, or of a dependence relation.
Abstract: The paper discusses some classes of contextual grammars---mainly those with "maximal use of selectors"---giving some arguments that these grammars can be considered a good model for natural language syntax. A contextual grammar produces a language starting from a finite set of words and iteratively adding contexts to the currently generated words, according to a selection procedure: each context has associated with it a selector, a set of words; the context is adjoined to any occurrence of such a selector in the word to be derived. In grammars with maximal use of selectors, a context is adjoined only to selectors for which no superword is a selector. Maximality can be defined either locally or globally (with respect to all selectors in the grammar). The obtained families of languages are incomparable with that of Chomsky context-free languages (and with other families of languages that contain linear languages and that are not "too large"; see Section 5) and have a series of properties supporting the assertion that these grammars are a possible adequate model for the syntax of natural languages. They are able to straightforwardly describe all the usual restrictions appearing in natural (and artificial) languages, which lead to the non-context-freeness of these languages: reduplication, crossed dependencies, and multiple agreements; however, there are center-embedded constructions that cannot be covered by these grammars. While these assertions concern only the weak generative capacity of contextual grammars, some ideas are also proposed for associating a structure to the generated words, in the form of a tree, or of a dependence relation (as considered in descriptive linguistics and also similar to that in link grammars).
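A minimal sketch of the basic derivation relation may help: a context (u, v) is wrapped around an occurrence of one of its selector words. The code and the a^n b^n example below are illustrative, not from the paper, and the maximal-use-of-selectors condition is deliberately omitted.

```python
def derive_step(word, contexts):
    # contexts: list of (u, v, selectors), selectors a set of words;
    # adjoin u...v around any occurrence of a selector in the word
    results = set()
    for u, v, sels in contexts:
        for s in sels:
            start = word.find(s)
            while start != -1:
                results.add(word[:start] + u + s + v + word[start + len(s):])
                start = word.find(s, start + 1)
    return results

def generate(axioms, contexts, max_len):
    # close the axioms under derivation, up to a length bound
    seen, frontier = set(axioms), set(axioms)
    while frontier:
        new = set()
        for w in frontier:
            for w2 in derive_step(w, contexts):
                if len(w2) <= max_len and w2 not in seen:
                    new.add(w2)
        seen |= new
        frontier = new
    return seen
```

With axiom ab and the single context (a, b) selected by ab, the grammar generates the non-regular language a^n b^n, a first hint of the power the abstract describes.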

38 citations


Proceedings ArticleDOI
10 Aug 1998
TL;DR: An efficient algorithm is described for compiling into weighted finite automata an interesting class of weighted context-free grammars that represent regular languages; the resulting automata can be combined with other speech recognition components.
Abstract: Weighted context-free grammars are a convenient formalism for representing grammatical constructions and their likelihoods in a variety of language-processing applications. In particular, speech understanding applications require appropriate grammars both to constrain speech recognition and to help extract the meaning of utterances. In many of those applications, the actual languages described are regular, but context-free representations are much more concise and easier to create. We describe an efficient algorithm for compiling into weighted finite automata an interesting class of weighted context-free grammars that represent regular languages. The resulting automata can then be combined with other speech recognition components. Our method allows the recognizer to dynamically activate or deactivate grammar rules and substitute a new regular language for some terminal symbols, depending on previously recognized inputs, all without recompilation. We also report experimental results showing the practicality of the approach.
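For the strictly right-linear fragment the compilation is immediate, which makes the idea easy to see; the sketch below (hypothetical encoding, tropical weights, not the paper's algorithm for its more general class) reads each weighted rule A → a B or A → a as a weighted automaton transition.

```python
def compile_right_linear(rules):
    """Compile weighted right-linear rules into weighted transitions.

    rules: list of (lhs, terminal, rhs_nonterminal_or_None, weight),
    weights in the tropical semiring (add along a path, min over paths).
    """
    trans = {}
    for lhs, a, rhs, w in rules:
        dest = rhs if rhs is not None else "FINAL"
        key = (lhs, a, dest)
        trans[key] = min(w, trans.get(key, float("inf")))
    return trans

def string_weight(trans, s, start="S"):
    # single-source shortest-distance over the automaton, one symbol at a time
    costs = {start: 0.0}
    for ch in s:
        nxt = {}
        for (q, a, r), w in trans.items():
            if a == ch and q in costs:
                nxt[r] = min(nxt.get(r, float("inf")), costs[q] + w)
        costs = nxt
    return costs.get("FINAL", float("inf"))
```

The grammar S → a S / 1.0, S → b / 0.5 assigns weight n + 0.5 to a^n b, and infinite weight to strings outside the language.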

38 citations


Proceedings Article
01 Jan 1998
TL;DR: A process for low-temperature separation of air, wherein liquefied nitrogen and liquefied air enriched with oxygen obtained from preliminary rectification are subjected to secondary rectification to produce gaseous nitrogen containing less than 0.3 vol.% of oxygen and argon impurities.
Abstract: A process for low-temperature separation of air, wherein liquefied nitrogen and liquefied air enriched with oxygen obtained from preliminary rectification are subjected to secondary rectification to produce gaseous nitrogen containing less than 0.3 vol.% of oxygen and argon impurities, gaseous oxygen-argon mixture and liquefied oxygen-argon mixture containing up to 4.5 vol.% of argon. Subsequently, the gaseous oxygen-argon mixture is subjected to further rectification to produce argon-oxygen mixture containing argon with impurities of 3-0.1 vol.% of oxygen and less than 0.1 vol.% of nitrogen, as well as oxygen with a concentration of from 99.7 to 99.99 vol.%.

30 citations


Book ChapterDOI
16 Nov 1998
TL;DR: A characterization is given of the class of tree languages which can be generated by context-free hyperedge replacement (HR) graph grammars, in terms of macro tree transducers (MTTs), which yields a normal form for tree generating HR graph Grammars.
Abstract: A characterization is given of the class of tree languages which can be generated by context-free hyperedge replacement (HR) graph grammars, in terms of macro tree transducers (MTTs). This characterization yields a normal form for tree generating HR graph grammars. Moreover, two natural, structured ways of generating trees with HR graph grammars are considered and an inclusion diagram of the corresponding classes of tree languages is proved. Finally, the MSO definable tree transductions are characterized in terms of MTTs.

26 citations


Journal ArticleDOI
TL;DR: A new systematic approach for the uniform random generation of combinatorial objects based on the notion of object grammars, which give recursive descriptions of objects and generalize context-free grammars, is presented.
Abstract: This paper presents a new systematic approach for the uniform random generation of combinatorial objects. The method is based on the notion of object grammars which give recursive descriptions of objects and generalize context-free grammars. The application of particular valuations to these grammars leads to enumeration and random generation of objects according to non-algebraic parameters.
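The flavour of counting-driven uniform generation can be shown on one tiny object grammar, binary trees (B = leaf | (B, B)). This is the classical recursive method as a generic illustration, not the paper's construction; names are hypothetical.

```python
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def count(n):
    # number of binary trees with n internal nodes (the Catalan numbers),
    # read off directly from the recursive grammar B = leaf | (B, B)
    if n == 0:
        return 1
    return sum(count(i) * count(n - 1 - i) for i in range(n))

def random_tree(n, rng=random):
    # draw uniformly among the count(n) trees of size n: choose the size
    # of the left subtree with probability proportional to its tree count
    if n == 0:
        return "leaf"
    r = rng.randrange(count(n))
    for i in range(n):
        block = count(i) * count(n - 1 - i)
        if r < block:
            return (random_tree(i, rng), random_tree(n - 1 - i, rng))
        r -= block
```

Replacing the size parameter by other valuations is what lets the paper's method handle non-algebraic parameters.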

24 citations


Journal ArticleDOI
TL;DR: The paper contains some results on the generative capacity of Categorial Grammars based on L♦, and the set of tree languages generated by L♦-grammars neither contains nor is contained in the class of context free tree languages.
Abstract: In Moortgat (1996) the Lambek Calculus L (Lambek, 1958) is extended by a pair of residuation modalities ♦ and □↓. Categorial Grammars based on the resulting logic L♦ are attractive for the purpose of modelling linguistic phenomena since they offer a compromise between the strict constituent structures imposed by context free grammars and related formalisms on the one hand, and the complete absence of hierarchical information in Lambek grammars on the other hand. The paper contains some results on the generative capacity of Categorial Grammars based on L♦. First it is shown that adding residuation modalities does not extend the weak generative capacity. This is proved by extending the proof for the context freeness of L-grammars from Pentus (1993) to L♦. Second, the strong generative capacity of L♦-grammars is compared to context free grammars. The results are mainly negative; the set of tree languages generated by L♦-grammars neither contains nor is contained in the class of context free tree languages.

Proceedings ArticleDOI
10 Aug 1998
TL;DR: In this article, a precompilation technique for wide-coverage lexicalized grammars is described, which allows some of the computation associated with different structures to be shared.
Abstract: In wide-coverage lexicalized grammars many of the elementary structures have substructures in common. This means that in conventional parsing algorithms some of the computation associated with different structures is duplicated. In this paper we describe a precompilation technique for such grammars which allows some of this computation to be shared. In our approach the elementary structures of the grammar are transformed into finite state automata which can be merged and minimised using standard algorithms, and then parsed using an automaton-based parser. We present algorithms for constructing automata from elementary structures, merging and minimising them, and string recognition and parse recovery with the resulting grammar.

Book ChapterDOI
01 Jun 1998
TL;DR: Most results from LR parsing can be extended to positional grammars while preserving its well-known efficiency, thanks to this analogy.
Abstract: Positional grammars naturally extend context-free grammars for string languages to grammars for visual languages by considering new relations in addition to string concatenation. Thanks to this analogy, most results from LR parsing can be extended to positional grammars while preserving its well-known efficiency. The positional grammar model is the underlying formalism of the VLCC (Visual Language Compiler-Compiler) system for the automatic generation of visual programming environments. VLCC inherits, and extends to the visual field, concepts and techniques of compiler generation tools like YACC. Due to their nature, positional grammars are a very suitable formalism for processing languages integrating visual and textual constructs.

Journal ArticleDOI
01 Jan 1998-Grammars
TL;DR: First, the idea of associating a tree to a derivation in such a grammar is considered; this can be done in a natural way, by associating parentheses to the contexts of the grammar, which yields a restriction on the derivations in a contextual grammar, as well as a direct manner of defining the ambiguity of contextual grammars.
Abstract: The aim of this paper is to start investigations on the possibilities of introducing a structure in the strings generated by internal contextual grammars. First, we consider the idea of associating a tree to a derivation in such a grammar. This can be done in a natural way, by associating parentheses to the contexts of the grammar. In this way we obtain a restriction on the derivation in a contextual grammar, as well as a direct manner of defining the ambiguity of contextual grammars. Then, we consider a relation on the set of symbols appearing in a string, in the sense already used in descriptive linguistics. By starting from a set of axioms which are structured strings and adjoining to them contexts as usual in contextual grammars, but having prescribed dependences between their symbols, we obtain a set of structured strings. By imposing conditions on the structure of the strings (crossed-noncrossed dependences, a tree structure, a link structure in the sense of link grammars, etc), we obtain a restriction on the derivation in a contextual grammar, as well as a direct manner of defining the structure of languages generated by contextual grammars. The linguistic relevance of these structures associated to strings generated by contextual grammars remains to be further explored.

Journal ArticleDOI
TL;DR: This paper describes an efficient procedure to compute the relative entropy between two stochastic deterministic regular tree grammars.

Book ChapterDOI
16 Nov 1998
TL;DR: In this article, a concurrent semantics for DPO graph grammars has been provided by showing that each graph grammar can be unfolded into an acyclic branching structure, that is itself a (nondeterministic occurrence) graph grammar describing all the possible computations of the original grammar.
Abstract: In a recent paper, mimicking Winskel’s construction for Petri nets, a concurrent semantics for (double-pushout) DPO graph grammars has been provided by showing that each graph grammar can be unfolded into an acyclic branching structure, that is itself a (nondeterministic occurrence) graph grammar describing all the possible computations of the original grammar.

Journal ArticleDOI
TL;DR: Different algebraic structures which are natural algebraic frames for categorial grammars are discussed, including absolutely free algebras of functor-argument structures and phrase structures together with power-set algebras of types, to provide algorithms for equivalence problems and related questions.

Journal ArticleDOI
TL;DR: The use of preference clauses for resolution of prepositional phrase attachment ambiguities are illustrated, and the growing consensus in the literature on the need to explicitly specify preference criteria for ambiguity resolution is pointed out.

Journal ArticleDOI
TL;DR: This paper explains how to transform grammars so that they generate their languages in a uniform way and describes analogous transformations for EIL systems.
Abstract: This paper explains how to transform grammars so that they generate their languages in a uniform way. Specifically, it transforms any phrase-structure grammar to an equivalent three-nonterminal phrase-structure grammar in which every sentential form consists of a terminal word followed by the concatenation of several permutations of a word over {0, 1}, where 0 and 1 are two of the three nonterminals. Then, it converts any phrase-structure grammar to an equivalent three-nonterminal phrase-structure grammar in which every sentential form consists of a terminal word preceded by the concatenation of several permutations of a binary word. In addition, it describes analogous transformations for EIL systems. In its conclusion, this paper discusses some open problems and applications.

Book ChapterDOI
01 Jan 1998
TL;DR: It is concluded that an attribute grammar oriented algorithm development may be a fruitful one, and may go hand in hand with a more algebraic style of program development.
Abstract: For a long time, attribute grammars have formed an isolated programming formalism. We show how we may embed the attribute grammar approach in a modern functional programming language. The advantages of both sides reinforce each other: the former provides compositionality and the latter naming abstraction and higher-orderness. Through a sequence of program transformations we show different aspects of the techniques involved. We conclude with the observation that an attribute grammar oriented algorithm development may be a fruitful one, and may go hand in hand with a more algebraic style of program development.

01 Jan 1998
TL;DR: This paper proposes a polynomial parse time method for tree adjoining grammars and multicomponent tree adjoining grammars, using range concatenation grammars as a high-level intermediate definition formalism; the approach aims both at giving a new insight into the multicomponent adjunction mechanism and at providing a practical implementation scheme.
Abstract: The notion of mild context-sensitivity is an attempt to express the formal power needed to define the syntax of natural languages. However, not all incarnations of mildly context-sensitive formalisms are equivalent. On the one hand, near the bottom of the hierarchy, we find tree adjoining grammars and, on the other hand, near the top of the hierarchy, we find multicomponent tree adjoining grammars. This paper proposes a polynomial parse time method for these two tree rewriting formalisms. This method uses range concatenation grammars as a high-level intermediate definition formalism, and yields several algorithms. Range concatenation grammar is a syntactic formalism which is both powerful, in so far as it extends linear context-free rewriting systems, and efficient, in so far as its sentences can be parsed in polynomial time. We show that any unrestricted tree adjoining grammar can be transformed into an equivalent range concatenation grammar which can be parsed in O(n^6) time, and, moreover, if the input tree adjoining grammar has some restricted form, its parse time decreases to O(n^5). We generalize one of these algorithms in order to process multicomponent tree adjoining grammars. We show some upper bounds on their parse times, and we introduce a hierarchy of restricted forms which can be parsed more efficiently. Our approach aims both at giving a new insight into the multicomponent adjunction mechanism and at providing a practical implementation scheme.

Book ChapterDOI
14 Dec 1998
TL;DR: This paper shows that a formalism fitting better to linguistic structures can be obtained by using a sequence of pushdowns instead of one pushdown for the storage of the indices in a derivation, and argues that the corresponding restriction on writing is more natural from a linguistic point of view.
Abstract: Linear indexed grammars (LIGs) can be used to describe nonlocal dependencies. The indexing mechanism, however, can only account for dependencies that are nested. In natural languages one can easily find examples to which this simple model cannot be applied straightforwardly. In this paper I will show that a formalism fitting better to linguistic structures can be obtained by using a sequence of pushdowns instead of one pushdown for the storage of the indices in a derivation. Crucially, we have to avoid unwanted interactions between the pushdowns that would make possible the simulation of a Turing machine. [1] solves this problem for multi-pushdown automata by restricting reading to the first nonempty pushdown. I will argue that the corresponding restriction on writing is more natural from a linguistic point of view. I will show that, under either restriction, grammars with a sequence of n pushdowns give rise to a subclass of the nth member of the hierarchy defined by [15,16], and therefore are mildly context-sensitive.
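The effect of the "read only the first nonempty pushdown" restriction can be sketched with a hand-coded two-pushdown acceptor for the crossed-dependency language a^n b^n c^n. This is an illustration of the machine model only, not of the grammar formalism, and the function is hypothetical code written for this summary.

```python
def accept_anbncn(s):
    """Accept a^n b^n c^n with two pushdowns, popping only the first
    nonempty pushdown (a sketch of the restricted multi-pushdown idea)."""
    p1, p2 = [], []
    i, n = 0, len(s)
    # phase 1: each 'a' pushes one index symbol onto BOTH pushdowns
    while i < n and s[i] == "a":
        p1.append("I"); p2.append("I"); i += 1
    # phase 2: each 'b' pops the first nonempty pushdown, which is p1
    while i < n and s[i] == "b":
        if not p1:
            return False
        p1.pop(); i += 1
    if p1:
        return False
    # phase 3: p1 is now empty, so each 'c' pops the next pushdown, p2
    while i < n and s[i] == "c":
        if not p2:
            return False
        p2.pop(); i += 1
    return i == n and not p2
```

Because reading is forced through the pushdowns in order, the two counters stay synchronized without giving the device full Turing power.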

Journal ArticleDOI
01 May 1998-Grammars
TL;DR: It will be shown that some subclasses of such grammars are strictly included in the context-free languages and that there are regular languages which cannot be generated by any bracketed contextual grammar.
Abstract: Bracketed contextual grammars are contextual grammars with an induced Dyck-structure to control the derivation process and to provide derivation trees. In this paper, we study the generative capacity and closure properties of bracketed and fully bracketed contextual grammars. It will be shown that some subclasses of such grammars are strictly included in the context-free languages and that there are regular languages which cannot be generated by any bracketed contextual grammar.

Journal ArticleDOI
Pavel Martinek
TL;DR: The grammatical inference problem is solved for a class of languages which can be generated by pure grammars with non-shortening productions and an algorithm for assigning a pure grammar to any language from the class is described.
Abstract: The grammatical inference problem is solved for a class of languages which can be generated by pure grammars with non-shortening productions. A necessary and sufficient condition for determining whether a language belongs to this class is formulated and proved. Finally, an algorithm for assigning a pure grammar to any language from the class is described.

01 Jan 1998
TL;DR: This article presents an on-line bottom-up parsing algorithm for stochastic context-free grammars that is able to find the N-most probable parses of the input sequence; to deal with multiple interpretations of sentences containing compound words; and to deal with "out of vocabulary" words.

Abstract: This article presents an on-line bottom-up parsing algorithm for stochastic context-free grammars that, in addition to the usual functionalities of standard SCFG parsing algorithms, is able (1) to find the N-most probable parses of the input sequence; (2) to deal with multiple interpretations of sentences containing compound words; and (3) to deal with "out of vocabulary" words. Furthermore, the presented algorithm appears to be particularly suitable for speech applications and is proved to be at least as efficient as the corresponding Earley-like or CYK-like algorithms. In terms of space complexity, even in the case where the number of parse trees associated with a given input is exponential in its number of words (n), the chart used by the algorithm for their representation remains O(n^2).
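The 1-best core of any such parser is the probabilistic CYK recursion. The minimal sketch below is generic illustration, not the paper's algorithm: it assumes a hypothetical CNF grammar encoding and omits the N-best, compound-word, and out-of-vocabulary machinery.

```python
def viterbi_cyk(words, lex, bin_rules, start="S"):
    """Probability of the most probable parse under a CNF SCFG.

    lex: {(A, word): p} lexical rules; bin_rules: {(A, B, C): p} binary rules.
    """
    n = len(words)
    best = {}                                   # (i, k, A) -> best inside prob
    for i, w in enumerate(words):
        for (A, word), p in lex.items():
            if word == w:
                key = (i, i + 1, A)
                best[key] = max(best.get(key, 0.0), p)
    for span in range(2, n + 1):                # widen spans bottom-up
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):           # split point
                for (A, B, C), p in bin_rules.items():
                    pb = best.get((i, j, B), 0.0)
                    pc = best.get((j, k, C), 0.0)
                    if pb and pc:
                        key = (i, k, A)
                        best[key] = max(best.get(key, 0.0), p * pb * pc)
    return best.get((0, n, start), 0.0)
```

Keeping the N best entries per chart cell instead of one is the usual route from this recursion to N-best parsing.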

Book ChapterDOI
16 Nov 1998
TL;DR: A new kind of grammar is introduced in which the right side of the production is simply appended to the intermediate structure in such a way that the left side becomes its "neighborhood" in the new structure, which permits the grammatical definition of many different kinds of "n-dimensional" discrete structures.
Abstract: Phrase structure grammars, in which non-terminal symbols on the left side of a production can be rewritten by the string on the right side, together with their Chomsky hierarchy classification, are familiar to computer scientists. But these grammars are most effective only to generate, and parse, strings. In this report, we introduce a new kind of grammar in which the right side of the production is simply appended to the intermediate structure in such a way that the left side becomes its "neighborhood" in the new structure. This permits the grammatical definition of many different kinds of "n-dimensional" discrete structures. Several examples are given. Moreover, these grammars yield a formal theory grounded in antimatroid closure spaces. For example, we show that restricted neighborhood expansion grammars capture the essence of finite state and context free phrase structure grammars.

Book ChapterDOI
16 Nov 1998
TL;DR: It is shown that the generated classes of line-drawing languages are incomparable, but that chain-code grammars can simulate those collage grammars which use only similarity transformations.
Abstract: Collage grammars and context-free chain-code grammars are compared with respect to their generative power. It is shown that the generated classes of line-drawing languages are incomparable, but that chain-code grammars can simulate those collage grammars which use only similarity transformations.
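A chain-code grammar generates ordinary strings over {u, d, l, r} that are then interpreted as drawing instructions; a tiny interpreter (illustrative code, not from the paper) makes the line-drawing semantics concrete.

```python
def chain_code_points(word, start=(0, 0)):
    """Interpret a chain-code word over {u, d, l, r} as a set of unit lines,
    each represented as a frozenset of its two endpoints."""
    moves = {"u": (0, 1), "d": (0, -1), "l": (-1, 0), "r": (1, 0)}
    x, y = start
    lines = set()
    for c in word:
        dx, dy = moves[c]
        nx, ny = x + dx, y + dy
        lines.add(frozenset([(x, y), (nx, ny)]))   # draw one unit line
        x, y = nx, ny
    return lines
```

The drawn picture is the set of lines, not the string, which is why comparing this model with collage grammars is a question about picture languages rather than string languages.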



Book ChapterDOI
21 Nov 1998
TL;DR: Various concepts of leftmost derivation in grammars controlled by bicoloured digraphs are investigated, especially regarding their descriptive capacity, to unify the presentation of known results and to obtain new results concerning grammars with regular control and periodically time-variant grammars.
Abstract: In this paper, we investigate various concepts of leftmost derivation in grammars controlled by bicoloured digraphs, especially regarding their descriptive capacity. This approach allows us to unify the presentation of known results regarding especially programmed grammars and matrix grammars, and to obtain new results concerning grammars with regular control, and periodically time-variant grammars. Moreover, we get new results on leftmost derivations in conditional grammars.