
Showing papers on "Tree-adjoining grammar published in 1995"


Posted Content
TL;DR: This article developed a formal grammatical system called a link grammar, showed how English grammar can be encoded in such a system, and gave algorithms for efficiently parsing with a link grammar.
Abstract: We develop a formal grammatical system called a link grammar, show how English grammar can be encoded in such a system, and give algorithms for efficiently parsing with a link grammar. Although the expressive power of link grammars is equivalent to that of context free grammars, encoding natural language grammars appears to be much easier with the new system. We have written a program for general link parsing and written a link grammar for the English language. The performance of this preliminary system -- both in the breadth of English phenomena that it captures and in the computational resources used -- indicates that the approach may have practical uses as well as linguistic significance. Our program is written in C and may be obtained through the internet.
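
As a rough illustration only (not the authors' C parser), the sketch below captures the core idea the abstract describes: each word carries connector requirements to its left and right, and a valid linkage must satisfy them with non-crossing links. The toy dictionary, sentence and function names are invented, and the connectedness requirement of real link grammars is omitted.

    # Toy illustration of the link-grammar idea (invented dictionary; not the authors' parser).
    # Each word lists the connector labels it must satisfy on its left and on its right; a
    # candidate linkage is accepted only if the links are non-crossing and every word's
    # requirements are met exactly.  (Connectedness of the linkage is not checked here.)

    TOY_DICT = {
        "the":  {"left": [],    "right": ["D"]},   # determiner links rightward to a noun
        "cat":  {"left": ["D"], "right": ["S"]},   # noun needs D on its left, S to the verb
        "runs": {"left": ["S"], "right": []},      # verb needs its subject link on the left
    }

    def linkage_ok(words, links):
        """links: list of (i, j, label) with i < j, connecting word i to word j."""
        for (i1, j1, _) in links:                  # planarity: no two links may cross
            for (i2, j2, _) in links:
                if i1 < i2 < j1 < j2:
                    return False
        for k, w in enumerate(words):              # every connector must be used exactly once
            got_left = sorted(lab for (i, j, lab) in links if j == k)
            got_right = sorted(lab for (i, j, lab) in links if i == k)
            if got_left != sorted(TOY_DICT[w]["left"]) or got_right != sorted(TOY_DICT[w]["right"]):
                return False
        return True

    print(linkage_ok(["the", "cat", "runs"], [(0, 1, "D"), (1, 2, "S")]))  # True
    print(linkage_ok(["the", "cat", "runs"], [(0, 2, "D")]))               # False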

839 citations


Journal ArticleDOI
TL;DR: A constructive procedure is presented for converting a CFG into a left-anchored LTIG that preserves ambiguity and generates the same trees, and it was possible to parse more quickly with the resulting LTIGs than with the original CFGs.

180 citations


Journal ArticleDOI
TL;DR: A solid representation formalism is defined on the basis of the parsable family of IE-graphs, and multiaspect taxonomies defined for features allow one to solve problems of constructing unique and unambiguous descriptions of modelled solids.
Abstract: A solid representation formalism is defined on the basis of the parsable family of IE-graphs. edNLC-type graph grammars are used for the dynamic building and manipulation of such representations. The other two syntactic pattern recognition schemes, namely programmed edNLC grammars and a syntax-directed scheme translating graph languages, are introduced for reasoning over structural descriptions and an automatic modification of a modelling process. Multiaspect taxonomies defined for features allow one to solve problems of constructing unique and unambiguous descriptions of modelled solids.

32 citations


Journal ArticleDOI
TL;DR: Various non-context-free sets of arrays which can be generated in a simple way by cooperating array grammar systems are presented and show the power of the mechanism of cooperation for picture description.

Abstract: The aim of this paper is to elaborate the power of cooperation in generating pictures by array grammars. As expected, the generative capacity of cooperating array grammar systems (with a fixed number, with a number greater than a given threshold, or with the maximal number of derivation steps in each component when it is enabled) is strictly greater than that of context-free array grammars. Yet the same result is also obtained in the case of systems with regular components, which contradicts the corresponding result for string grammar systems. In fact, some more results for array grammar systems are obtained which either contradict the results for the corresponding string grammar systems or are not even known for these string grammar systems. Various non-context-free sets of arrays which can be generated in a simple way by cooperating array grammar systems are presented and show the power of the mechanism of cooperation for picture description.

26 citations


Journal ArticleDOI
TL;DR: The use of Tree Adjoining Grammar as a formal mechanism for syntactic composition is explored; the TAG approach avoids the need for distinguishing two modes of phrase structure composition, ‘GT’ and ‘adjunction’, and allows the elimination of the ‘extension requirement’.

Abstract: The compositional approach to phrase structure proposed in Chomsky's ‘Minimalist Program’ paper (1992) can be implemented in various ways. This paper explores the use of Tree Adjoining Grammar (Kroch & Joshi 1985, 1986; Frank 1992, inter alia) as a formal mechanism for syntactic composition and compares it to the operations proposed in Chomsky (1992). The TAG approach avoids the need for distinguishing two modes of phrase structure composition, ‘GT’ and ‘adjunction’, and allows the elimination of the ‘extension requirement’. Further, it permits a purely local interpretation of transformational movement and a simplification of the conception of derivational economy. It also has important consequences for the proper treatment of the binding and scope effects of successive cyclic movement.

20 citations


Proceedings ArticleDOI
26 Feb 1995
TL;DR: There are two significant results: first, the problem is solvable in applications in which the string length is bounded, which covers most applications; second, the ambiguity search algorithm developed can solve the PCP problem with bounded string length.
Abstract: This paper presents a solution to the CFG ambiguity problem, which arose in recent automatic language acquisition research. The ambiguity problem of a grammar has not been addressed in prior language acquisition research. A well-known theorem classifies the CFG ambiguity problem as undecidable; many researchers hence simply ignore the ambiguity problem or generate only unambiguous grammars. However, grammatical ambiguity lurks behind every language acquisition scheme, and it is undesirable because different derivation trees represent different meanings. This paper investigates the CFG ambiguity problem and reports two significant results. First, the problem is solvable in applications in which the string length is bounded, which covers most applications. Second, the ambiguity search algorithm developed can solve the PCP problem with bounded string length.
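
To see why bounding the string length makes the problem tractable, here is a brute-force sketch (not the paper's ambiguity search algorithm): it enumerates every leftmost derivation of an epsilon-free, unit-cycle-free toy CFG whose sentential forms stay within the bound, and reports strings with two or more distinct leftmost derivations. The grammar and all names are invented for illustration.

    # Brute-force bounded-length ambiguity check (illustrative; not the paper's algorithm).
    # Assumes an epsilon-free grammar with no unit-production cycles, so bounding the
    # sentential-form length bounds the search.

    from collections import Counter

    GRAMMAR = {"S": [["S", "S"], ["a"]]}     # a deliberately ambiguous toy grammar
    NONTERMINALS = set(GRAMMAR)

    def ambiguous_strings(start, max_len):
        yields = Counter()                   # terminal string -> number of leftmost derivations
        def derive(form):
            if len(form) > max_len:
                return
            i = next((k for k, s in enumerate(form) if s in NONTERMINALS), None)
            if i is None:                    # all terminals: one complete leftmost derivation
                yields["".join(form)] += 1
                return
            for rhs in GRAMMAR[form[i]]:     # expand the leftmost nonterminal in every way
                derive(form[:i] + rhs + form[i + 1:])
        derive([start])
        return sorted(w for w, n in yields.items() if n >= 2)

    print(ambiguous_strings("S", 4))         # ['aaa', 'aaaa'] -- witnesses of ambiguity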

18 citations



Journal ArticleDOI
TL;DR: Colimits (pushouts) are used to model composition, and (reverse) graph grammar morphisms to describe refinements, of typed graph grammars; composition that preserves the structure of common subgrammars is shown to be compatible with the semantics.

14 citations


01 Jan 1995
TL;DR: This paper claims that Attribute Grammars can be used to describe computations on structures that are not just trees, but also on abstractions allowing for infinite structures, and introduces two new notions: scheme productions and conditional productions.
Abstract: Although Attribute Grammars were introduced thirty years ago, their lack of expressiveness has resulted in limited use outside the domain of static language processing. In this paper we show that it is possible to extend this expressiveness. We claim that Attribute Grammars can be used to describe computations on structures that are not just trees, but also on abstractions allowing for infinite structures. To gain this expressiveness, we introduce two new notions: scheme productions and conditional productions. The result is a language that is comparable in power to most first-order functional languages, with a distinctive declarative character. Our extensions deal with a different part of the Attribute Grammars formalism than what is used in most works on Attribute Grammars, including global analysis and evaluator generation. Hence, most existing results are directly applicable to our extended Attribute Grammars, including efficient implementation (in our case, using the FNC-2 system, http://www-rocq.inria.fr/charme/FNC-2/). The major contribution of this approach is to restore and re-emphasize the intrinsic power of Attribute Grammars. Furthermore, our extensions call for new studies on applying to functional programming the analysis and implementation techniques developed for Attribute Grammars.

13 citations


Proceedings ArticleDOI
27 Mar 1995
TL;DR: The decidability of the generation problem for those unification grammars which are based on context-free phrase structure rule skeletons, like e.g. LFG and PATR-II is proved.
Abstract: In this paper, we prove the decidability of the generation problem for those unification grammars which are based on context-free phrase structure rule skeletons, like e.g. LFG and PATR-II. The result shows a perhaps unexpected asymmetry, since it is valid also for those unification grammars whose parsing problem is undecidable, e.g. grammars which do not satisfy the off-line parsability constraint. The general proof is achieved by showing that the space of the derivations which have to be considered in order to decide the problem for a given input is always restricted to derivations whose length is limited by some fixed upper bound which is determined relative to the "size" of the input.

12 citations


Journal ArticleDOI
TL;DR: It is demonstrated that the number of nonterminals can be decreased by one in the present characterizations if scattered context grammars start their derivations from a word rather than a single symbol.
Abstract: The syntactic complexity of scattered context grammars with respect to the number of nonterminals is investigated. First, the family of the recursively enumerable languages is characterized by some basic operations, such as quotient and coding, over the languages generated by propagating scattered context grammars with four nonterminals. Then, a new method of achieving the characterization of the family of recursively enumerable languages by scattered context grammars is given; in fact, this family is characterized by scattered context grammars with only five nonterminals and a single erasing production. Finally, it is demonstrated that the number of nonterminals can be decreased by one in the present characterizations if scattered context grammars start their derivations from a word rather than a single symbol.
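
As a reminder of the rule format whose nonterminal complexity the paper studies, a scattered context rule (A1, ..., An) -> (x1, ..., xn) rewrites, in a single step, occurrences of A1, ..., An appearing in that left-to-right order. The sketch below applies such rules using a standard textbook grammar for a^n b^n c^n; the grammar and code are illustrative, not taken from the paper.

    # Illustrative scattered context derivation (textbook grammar for a^n b^n c^n).
    RULES = [
        (("S",),          ("ABC",)),            # (S) -> (ABC)
        (("A", "B", "C"), ("aA", "bB", "cC")),  # grow the three blocks in lockstep
        (("A", "B", "C"), ("a", "b", "c")),     # terminate
    ]

    def apply_rule(form, rule):
        """Rewrite the leftmost occurrence, in order, of the rule's left-hand symbols."""
        lhs, rhs = rule
        out, pos = [], 0
        for sym, repl in zip(lhs, rhs):
            i = form.find(sym, pos)
            if i < 0:
                return None                     # rule not applicable to this sentential form
            out.append(form[pos:i] + repl)
            pos = i + 1
        out.append(form[pos:])
        return "".join(out)

    form = "S"
    for step in (0, 1, 1, 2):                   # one derivation of a^3 b^3 c^3
        form = apply_rule(form, RULES[step])
    print(form)                                 # aaabbbccc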

Posted Content
Miles Osborne
TL;DR: This thesis concentrates upon automatic grammar correction (or machine learning of grammar) as a solution to the problem of undergeneration by hypothesising that the combined use of data-driven and model-based learning would allow data-driven learning to compensate for model-based learning's incompleteness, whilst model-based learning would compensate for data-driven learning's unsoundness.
Abstract: When parsing unrestricted language, wide-covering grammars often undergenerate. Undergeneration can be tackled either by sentence correction, or by grammar correction. This thesis concentrates upon automatic grammar correction (or machine learning of grammar) as a solution to the problem of undergeneration. Broadly speaking, grammar correction approaches can be classified as being either data-driven, or model-based. Data-driven learners use data-intensive methods to acquire grammar. They typically use grammar formalisms unsuited to the needs of practical text processing and cannot guarantee that the resulting grammar is adequate for subsequent semantic interpretation. That is, data-driven learners acquire grammars that generate strings that humans would judge to be grammatically ill-formed (they overgenerate) and fail to assign linguistically plausible parses. Model-based learners are knowledge-intensive and are reliant for success upon the completeness of a model of grammaticality. But in practice, the model will be incomplete. Given that in this thesis we deal with undergeneration by learning, we hypothesise that the combined use of data-driven and model-based learning would allow data-driven learning to compensate for model-based learning's incompleteness, whilst model-based learning would compensate for data-driven learning's unsoundness. We describe a system that we have used to test the hypothesis empirically. The system combines data-driven and model-based learning to acquire unification-based grammars that are more suitable for practical text parsing. Using the Spoken English Corpus as data, and by quantitatively measuring undergeneration, overgeneration and parse plausibility, we show that this hypothesis is correct.

Posted Content
TL;DR: This article motivates a variant of Datalog grammars which allows for a meta-grammatical treatment of coordination, which improves in some respects over previous work on coordination in logic grammars, although more research is needed for testing it in other respects.
Abstract: In previous work we studied a new type of DCGs, Datalog grammars, which are inspired on database theory. Their efficiency was shown to be better than that of their DCG counterparts under (terminating) OLDT-resolution. In this article we motivate a variant of Datalog grammars which allows us a meta-grammatical treatment of coordination. This treatment improves in some respects over previous work on coordination in logic grammars, although more research is needed for testing it in other respects.

Journal ArticleDOI
TL;DR: An algorithm is presented for checking whether an infinite transition system, defined by a graph grammar of a restricted kind, is a model of a formula of the temporal logic CTL, and it is shown how to adapt the formalism of graph grammars for expressing such infinite transition systems.

Journal ArticleDOI
TL;DR: Tree insertion grammar (TIG) as mentioned in this paper is a tree-based formalism that makes use of tree substitution and tree adjunction; it is related to tree adjoining grammar, but the adjunction permitted in TIG is more restricted than that of TAG.
Abstract: Tree insertion grammar (TIG) is a tree-based formalism that makes use of tree substitution and tree adjunction. TIG is related to tree adjoining grammar. However, the adjunction permitted in TIG is...
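
The abstract is cut off above. As a generic, hedged illustration of the adjunction operation that TIG shares with TAG (without modelling TIG's specific restrictions on auxiliary trees), the sketch below splices an invented auxiliary tree into an invented initial tree at a chosen node.

    # Toy adjunction on trees written as nested lists (invented example; only the generic
    # splice-at-a-node operation is shown, not TIG's restrictions on auxiliary trees).

    FOOT = "*"                                   # marks the foot node of an auxiliary tree

    def splice(aux, subtree):
        """Return `aux` with its foot node replaced by `subtree`."""
        if aux == FOOT:
            return subtree
        if isinstance(aux, list):
            return [aux[0]] + [splice(c, subtree) for c in aux[1:]]
        return aux                               # a terminal leaf

    def adjoin(tree, path, aux):
        """Adjoin `aux` at the node reached by following child indices in `path`."""
        if not path:
            return splice(aux, tree)
        i, rest = path[0], path[1:]
        children = list(tree[1:])
        children[i] = adjoin(children[i], rest, aux)
        return [tree[0]] + children

    # initial tree for "cats sleep"; auxiliary tree adding "often", with its foot at the right edge
    initial = ["S", ["NP", "cats"], ["VP", ["V", "sleep"]]]
    aux_vp = ["VP", ["Adv", "often"], FOOT]

    print(adjoin(initial, [1], aux_vp))
    # ['S', ['NP', 'cats'], ['VP', ['Adv', 'often'], ['VP', ['V', 'sleep']]]]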



01 Jan 1995
TL;DR: The implementation of LCS as a TAG can be done, providing the full power of the well-defined mathematical properties of TAGs as a basis for describing the formal properties of LCS.
Abstract: We are interested in building a lexicon for interlingual machine translation (MT) and in examining the formal properties of an interlingua as a language in its own right. As such it should be possible to define a lexicalized grammar for the representation of lexical entries and a set of operations over that grammar that can be used to both analyze and generate interlingua representations. The interlingua we discuss in this paper is Lexical Conceptual Structure (LCS) as formulated by Dorr (1993) based on work by Jackendoff (1983, 1990). This is described in the next section, and is followed by the presentation of a grammar for LCS as a representation language. The grammar formalism whose operations we examine with respect to their ability to compose LCS representations is Feature-Based Lexicalized Tree Adjoining Grammar (FB-LTAG), a version of Tree Adjoining Grammar (TAG) (Joshi et al. (1975), Schabes (1990), Vijay-Shanker (1987)), and its description, along with example TAG structures, forms our final section. What we find is that the implementation of LCS as a TAG, although not completely straightforward, can be done, providing the full power of the well-defined mathematical properties of TAGs as a basis for describing the formal properties of LCS.

Journal ArticleDOI
TL;DR: A generalization of the context-free LR(k) notion is presented, which characterizes, for each language class in the hierarchy generated by coupled context-free grammars (and therefore for TAGs, too), a subclass that can be parsed in linear time.

Journal ArticleDOI
Akira Nakamura
TL;DR: Some properties of parallel coordinate grammars are discussed and a relationship between the sequential coordinate grammars and parallel ones is examined.
Abstract: In a coordinate grammar, the rewriting rules replace sets of symbols having given coordinates by sets of symbols whose coordinates are given functions of the coordinates of the original symbols. Usually, at each step of a derivation, only one rule is applied and only one instance of its left-hand side is rewritten. This type is referred to as a sequential grammar. As a counterpart of this grammar, parallel coordinate grammars are defined as generalized parallel isometric grammars. In parallel grammars, the rewriting rules are used in parallel in a derivation step. The paper discusses some properties of parallel coordinate grammars and examines a relationship between the sequential coordinate grammars and parallel ones.
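
To make the sequential versus parallel distinction concrete, here is a rough sketch with an invented rule (not from the paper): symbols sit at integer coordinates, a rule maps a symbol at (x, y) to symbols whose coordinates are functions of (x, y), and in the parallel mode the rule is applied to every symbol in the same step.

    # Toy coordinate-grammar step (invented rule): an 'A' at (x, y) is rewritten into a
    # terminal 'a' at (x, y) plus a fresh 'A' at (x + 1, y); other symbols are unchanged.

    def rule(sym, x, y):
        if sym == "A":
            return [("a", x, y), ("A", x + 1, y)]
        return [(sym, x, y)]

    def parallel_step(picture):
        """Apply the rule to every symbol of the picture in the same derivation step."""
        return [out for (s, x, y) in picture for out in rule(s, x, y)]

    pic = [("A", 0, 0), ("A", 0, 2)]             # two active symbols on different rows
    print(parallel_step(pic))
    # [('a', 0, 0), ('A', 1, 0), ('a', 0, 2), ('A', 1, 2)] -- both rewritten at once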

Journal ArticleDOI
TL;DR: The aim of this study is to bring to the fore some of the constraints which have to be taken into account for the realisation of a parser for non-lexicalised nominal compounds.
Abstract: Irregular nominal compounds can be defined as noun phrases having a regular syntactic construction but having restrictions on their syntactic variations and specific semantic behaviours. The aim of our study is to bring to the fore some of the constraints which have to be taken into account for the realisation of a parser for non-lexicalised nominal compounds. Therefore, two nominal compounds are studied from a linguistic standpoint. The first one, verre à vin (wineglass), can be classified as a true compound noun although accepting several modifications. The second one, verre de vin (glass of wine), is a compositional noun phrase although having idiosyncratic characteristics. The features drawn from the observation of variations and meaning construction of these two compounds are used to evaluate four unification formalisms in their ability to represent and parse precisely such sequences: PATR-II, Lexicalised Tree Adjoining Grammar, OLMES and Acceptability Controlled Grammar. The first two are general grammar formalisms whereas the last two are dedicated to idioms and compound parsing. The conclusions of this evaluation yield a set of principles which should govern the construction of a parser better suited for compound noun parsing and interpretation.

01 Jan 1995
TL;DR: An attractive general-purpose grammar formalism, concatenative predicate grammar, in which all the mentioned formalisms can be represented, is presented; this results in both a more readable notation and an elegant hierarchical classification of the grammar formalisms.
Abstract: Linear Context Free Rewriting Systems (LCFRS, [Wei88]) are a general class of trans-context-free grammar systems; they are the largest well-known class of mildly context-sensitive grammars, and the languages recognized by LCFRS strictly include those generated by the HG, TAG, LIG, CCG family. (Parallel) Multiple Context-Free Grammar (PMCFG, [KNSK92]) is a straightforward extension of LCFRS. Literal Movement Grammars, introduced by the author of this paper in [Gro95c], are a form of CFG augmented with inherited string-valued attributes. LCFRS, PMCFG and LMG are primarily aimed at the analysis of natural language. String Attributed Grammars are the concatenative variant of the attribute grammar formalism, which is widely used in programming language semantics. The properties of the class of attribute output languages OUT(SAG) are studied in [Eng86]. We present an attractive general-purpose grammar formalism, concatenative predicate grammar, in which all the mentioned formalisms can be represented. This results in both a more readable notation, and an elegant hierarchical classification of the grammar formalisms.
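
A tiny, hand-rolled example may make the shared "predicates over string tuples" idea concrete; this is a standard textbook MCFG/LCFRS-style grammar for the non-context-free language a^n b^n c^n d^n, not notation from the paper, and the Python encoding is only illustrative.

    # Illustrative string-tuple derivation in the LCFRS/MCFG style (textbook example).
    # A(x1, x2):  A('', '')   or   A('a' + x1 + 'b', 'c' + x2 + 'd')  <-  A(x1, x2)
    # S(x1 + x2)  <-  A(x1, x2)

    def A(n):
        if n == 0:
            return ("", "")
        x1, x2 = A(n - 1)
        return ("a" + x1 + "b", "c" + x2 + "d")

    def S(n):
        x1, x2 = A(n)
        return x1 + x2

    print([S(n) for n in range(4)])   # ['', 'abcd', 'aabbccdd', 'aaabbbcccddd']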


Journal ArticleDOI
TL;DR: The addition of run-time semantics via circular attribute grammars permits automatically generated environments to be complete, in that incremental static semantic checking and fast incremental execution are now available within a single framework.
Abstract: Attribute grammars are traditionally constrained to be noncircular. In using attribute grammars to specify the semantics of programming languages, this noncircularity limitation has restricted attribute grammars to compile-time or static semantics. Inductive attribute grammars add a general form of circularity to this standard approach. Inductive attribute grammars have the expressiveness required to describe the full semantics of programming languages, while at the same time maintaining the declarative character of standard attribute grammars. This expanded view of attribute grammars proves to be useful in interactive language-based programming environments, as inductive attribute grammars allow the environment to provide an interpreter for incremental re-evaluation of programs after small changes to the code. The addition of run-time semantics via circular attribute grammars permits automatically generated environments to be complete, in that incremental static semantic checking and fast incremental execution are now available within a single framework.
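
The evaluation idea behind such circular (inductive) attribute grammars is fixpoint iteration: attributes start at a bottom value and monotone equations are re-applied until nothing changes. The sketch below shows only that bare mechanism on two invented, mutually dependent boolean attributes; it is not the paper's environment generator.

    # Minimal fixpoint evaluation of circularly dependent attributes (invented equations).
    # Attributes start at a bottom value; monotone equations are re-applied until stable.

    def fixpoint(equations, bottom):
        """equations: dict attr -> function(env) -> value; iterate until nothing changes."""
        env = {a: bottom for a in equations}
        changed = True
        while changed:
            changed = False
            for attr, eq in equations.items():
                new = eq(env)
                if new != env[attr]:
                    env[attr], changed = new, True
        return env

    eqs = {                              # two mutually dependent boolean attributes
        "x": lambda e: True or e["y"],   # x is true unconditionally
        "y": lambda e: e["x"],           # y copies x, closing the cycle
    }
    print(fixpoint(eqs, False))          # {'x': True, 'y': True}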

Journal ArticleDOI
TL;DR: Terminally coded (TC) grammars are introduced, which generalize parenthesis grammars in the sense that from each word ω generated by a TC grammar the authors can recover the unlabeled tree t underlying its derivation tree(s), and there is a length-preserving homomorphism that maps ω to an encoding of t.
Abstract: We introduce terminally coded (TC) grammars, which generalize parenthesis grammars in the sense that from each word ω generated by a TC grammar we can recover the unlabeled tree t underlying its derivation tree(s). More precisely, there is a length-preserving homomorphism that maps ω to an encoding of t. Basic properties of TC grammars are established. For backwards deterministic TC grammars we give a shift-reduce precedence parsing method without look-ahead, which implies that TC languages can be recognized in linear time. The class of TC languages contains all parenthesis languages, and is contained in the classes of simple precedence languages and NTS languages.
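
For the parenthesis-grammar special case mentioned above, the encoding of the unlabeled tree is simply the bracket structure of the word itself; the short sketch below (names invented) reads such a tree back in one left-to-right pass.

    # Recover the unlabeled tree shape from a fully bracketed word (parenthesis-grammar case).

    def tree_of(word):
        """Return the tree (as nested lists) encoded by a balanced bracket string."""
        stack, current = [], []
        for ch in word:
            if ch == "(":
                stack.append(current)
                current = []
            elif ch == ")":
                finished, current = current, stack.pop()
                current.append(finished)
            else:
                current.append(ch)       # a terminal leaf
        return current[0]                # the word encodes a single tree

    print(tree_of("((ab)(c))"))          # [['a', 'b'], ['c']]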

Posted Content
TL;DR: A class of unification grammars is defined that exactly describes the class of indexed languages.
Abstract: Indexed languages are interesting in computational linguistics because they are the least class of languages in the Chomsky hierarchy that has not been shown not to be adequate to describe the string set of natural language sentences. We here define a class of unification grammars that exactly describe the class of indexed languages.

Proceedings ArticleDOI
23 May 1995
TL;DR: A new framework to represent circuit structures is presented, in which circuit structures are defined by a logic grammar called DCSG (definite clause set grammars), originally developed for analyzing free word-order languages.
Abstract: We have developed a new framework to represent circuit structures. A circuit is viewed as a sentence, and its elements as words. Circuit structures are defined by a logic grammar called DCSG (definite clause set grammars), developed for analyzing free word-order language. A set of grammar rules itself forms a logic program which implements top-down parsing. Using the grammar rules, given circuits are decomposed into a parse tree. The parse tree shows hierarchical structures of functional blocks composing the circuits.
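
DCSG itself is a Prolog-based formalism; purely as a hedged illustration of the "circuit as an unordered sentence of element words" idea, the Python sketch below, with an invented circuit and without real unification, recognizes one functional block and returns the element words it did not consume.

    # Much-simplified sketch of circuit-as-sentence recognition (invented circuit; the paper
    # uses DCSG/Prolog with full unification, which this stand-in does not model).

    circuit = {
        ("nmos", "in", "out", "gnd"),    # (type, gate, drain, source)
        ("pmos", "in", "out", "vdd"),
        ("res", "out", "load"),
    }

    def match_inverter(elems):
        """Recognize an nmos/pmos pair sharing gate and drain nodes as a CMOS inverter."""
        for n in (e for e in elems if e[0] == "nmos"):
            for p in (e for e in elems if e[0] == "pmos"):
                if n[1] == p[1] and n[2] == p[2]:          # same gate node, same drain node
                    block = ("inverter", n[1], n[2], [n, p])
                    return block, elems - {n, p}
        return None, elems

    block, rest = match_inverter(circuit)
    print(block)   # ('inverter', 'in', 'out', [ ...the two transistor words... ])
    print(rest)    # {('res', 'out', 'load')} -- element words not yet consumed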