
Showing papers on "Tree-adjoining grammar published in 2000"


Journal Article
TL;DR: An object-oriented extension to canonical attribute grammars is described, permitting attributes to be references to arbitrary nodes in the syntax tree, and attributes to be accessed via the reference attributes.
Abstract: An object-oriented extension to canonical attribute grammars is described, permitting attributes to be references to arbitrary nodes in the syntax tree, and attributes to be accessed via the reference attributes. Important practical problems such as name and type analysis for object-oriented languages can be expressed in a concise and modular manner in these grammars, and an optimal evaluation algorithm is available. An extensive example is given, capturing all the key constructs in object-oriented languages including block structure, classes, inheritance, qualified use, and assignment compatibility in the presence of subtyping. The formalism and algorithm have been implemented in APPLAB, an interactive language development tool.

192 citations
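A minimal sketch of the reference-attribute idea from the abstract above, in a toy name-analysis setting. The `Decl`/`Use` classes and the dictionary scope are invented for illustration and are not APPLAB's implementation: a use site's `decl` attribute evaluates to another node in the tree, and further attributes such as `type` are accessed through that reference.

```python
# Hypothetical illustration of reference attributes: an attribute whose
# value is a reference to another syntax-tree node, with further
# attributes accessed through that reference.

class Decl:
    def __init__(self, name, type_):
        self.name, self.type = name, type_

class Use:
    def __init__(self, name, scope):
        self.name, self.scope = name, scope

    @property
    def decl(self):
        # Reference attribute: evaluates to a node elsewhere in the tree.
        return self.scope[self.name]

    @property
    def type(self):
        # Type analysis expressed via the reference attribute.
        return self.decl.type

scope = {'x': Decl('x', 'int')}
u = Use('x', scope)
print(u.type)  # int
```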


Proceedings ArticleDOI
03 Oct 2000
TL;DR: This work describes the induction of a probabilistic LTAG model from the Penn Treebank and finds that this induction method is an improvement over the EM-based method of Hwa (1998), and that the induced model yields results comparable to lexicalized PCFG.
Abstract: We discuss the advantages of lexicalized tree-adjoining grammar as an alternative to lexicalized PCFG for statistical parsing, describing the induction of a probabilistic LTAG model from the Penn Treebank and evaluating its parsing performance. We find that this induction method is an improvement over the EM-based method of Hwa (1998), and that the induced model yields results comparable to lexicalized PCFG.

180 citations


Book ChapterDOI
01 Jan 2000
TL;DR: This chapter introduces weighted bilexical grammars, a formalism in which individual lexical items, such as verbs and their arguments, can have idiosyncratic selectional influences on each other.
Abstract: This chapter introduces weighted bilexical grammars, a formalism in which individual lexical items, such as verbs and their arguments, can have idiosyncratic selectional influences on each other. Such ‘bilexicalism’ has been a theme of much current work in parsing. The new formalism can be used to describe bilexical approaches to both dependency and phrase-structure grammars, and a slight modification yields link grammars. Its scoring approach is compatible with a wide variety of probability models.

155 citations
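The "idiosyncratic selectional influences" between lexical items can be pictured with a toy weight table; all words and weights below are invented for illustration, not taken from the chapter. Each head word carries its own scores over possible dependents, and a dependency tree is scored by summing over its arcs.

```python
# Toy bilexical weights: the head 'eat' prefers 'pizza' over 'idea'
# as a dependent (all values are illustrative assumptions).
bilexical_weight = {
    ('eat', 'we'): 1.0,
    ('eat', 'pizza'): 2.0,
    ('eat', 'idea'): -3.0,
}

def tree_score(dependencies):
    # Score a dependency tree as the sum of its head-dependent weights;
    # a probability model can plug in log-probabilities here instead.
    return sum(bilexical_weight.get(arc, 0.0) for arc in dependencies)

print(tree_score([('eat', 'we'), ('eat', 'pizza')]))  # 3.0
```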


Book ChapterDOI
Pat Langley1, Sean Stromsten1
31 May 2000
TL;DR: A rational reconstruction of Wolff's SNPR - the GRIDS system - is presented which incorporates a bias toward grammars that minimize description length, and the algorithm alternates between merging existing nonterminal symbols and creating new symbols, using a beam search to move from complex to simpler grammars.
Abstract: We examine the role of simplicity in directing the induction of context-free grammars from sample sentences. We present a rational reconstruction of Wolff's SNPR - the GRIDS system - which incorporates a bias toward grammars that minimize description length. The algorithm alternates between merging existing nonterminal symbols and creating new symbols, using a beam search to move from complex to simpler grammars. Experiments suggest that this approach can induce accurate grammars and that it scales reasonably to more difficult domains.

101 citations


Book ChapterDOI
11 Sep 2000
TL;DR: It is proved that the problem of parsing a given string or its most probable parse with stochastic regular grammars is NP-hard and does not allow for a polynomial time approximation scheme.
Abstract: Determinism plays an important role in grammatical inference. In practice, however, ambiguous grammars (and nondeterministic grammars in particular) are used more often than deterministic grammars. Computing the probability of parsing a given string, or its most probable parse, with stochastic regular grammars can be performed in linear time. However, the problem of finding the most probable string has not yet received a satisfactory answer. In this paper we prove that the problem is NP-hard and does not allow for a polynomial-time approximation scheme. The result extends to stochastic regular syntax-directed translation schemes.

86 citations
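The linear-time computation the abstract contrasts with can be sketched via the forward algorithm over a probabilistic finite automaton, which is equivalent in generative power to a stochastic regular grammar. The toy automaton below, encoding the grammar S -> a S (0.5) | b (0.5), is an invented example.

```python
# Forward algorithm: P(string) under a probabilistic finite automaton,
# in time linear in the string length (for a fixed automaton).

def string_probability(arcs, final, start, string):
    """arcs: {(state, symbol): [(next_state, prob), ...]};
    final: {state: stopping probability}."""
    forward = {start: 1.0}
    for sym in string:
        nxt = {}
        for state, p in forward.items():
            for dest, q in arcs.get((state, sym), []):
                nxt[dest] = nxt.get(dest, 0.0) + p * q
        forward = nxt
    return sum(p * final.get(s, 0.0) for s, p in forward.items())

# Toy stochastic regular grammar S -> a S (0.5) | b (0.5) as an automaton.
arcs = {(0, 'a'): [(0, 0.5)], (0, 'b'): [(1, 0.5)]}
final = {1: 1.0}
print(string_probability(arcs, final, 0, 'aab'))  # 0.5**3 = 0.125
```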


Patent
Mark E. Epstein1
25 Oct 2000
TL;DR: This paper applied a context free grammar to the text input to determine substrings and corresponding parse trees, and examined each possible substring using an inventory of queries corresponding to the CFG.
Abstract: A method and system for use in a natural language understanding system for including grammars within a statistical parser. The method involves a series of steps. The invention receives a text input. The invention applies a first context free grammar to the text input to determine substrings and corresponding parse trees, wherein the substrings and corresponding parse trees further correspond to the first context free grammar. Additionally, the invention can examine each possible substring using an inventory of queries corresponding to the CFG.

74 citations


Proceedings Article
23 Feb 2000
TL;DR: The authors extract different LTAGs from the Penn Treebank and show that certain strategies yield an improved extracted LTAG in terms of compactness, broad coverage, and supertagging accuracy.
Abstract: The accuracy of statistical parsing models can be improved with the use of lexical information. Statistical parsing using lexicalized tree-adjoining grammar (LTAG), a kind of lexicalized grammar, has remained relatively unexplored. We believe this is largely due to the absence of large corpora accurately bracketed in terms of a perspicuous yet broad-coverage LTAG. Our work attempts to alleviate this difficulty. We extract different LTAGs from the Penn Treebank and show that certain strategies yield an improved extracted LTAG in terms of compactness, broad coverage, and supertagging accuracy. Furthermore, we perform a preliminary investigation into smoothing these grammars by means of an external linguistic resource, namely the tree families of an XTAG grammar, a hand-built grammar of English.

68 citations


Proceedings ArticleDOI
Rens Bod1
31 Jul 2000
TL;DR: It is shown that the common wisdom is wrong for stochastic grammars that use elementary trees instead of context-free rules, such as the Stochastic Tree-Substitution Grammars used by Data-Oriented Parsing models, and that a non-probabilistic metric based on the shortest derivation outperforms a probabilistic metric on the ATIS and OVIS corpora.
Abstract: Common wisdom has it that the bias of stochastic grammars in favor of shorter derivations of a sentence is harmful and should be redressed. We show that the common wisdom is wrong for stochastic grammars that use elementary trees instead of context-free rules, such as Stochastic Tree-Substitution Grammars used by Data-Oriented Parsing models. For such grammars a non-probabilistic metric based on the shortest derivation outperforms a probabilistic metric on the ATIS and OVIS corpora, while it obtains competitive results on the Wall Street Journal (WSJ) corpus. This paper also contains the first published experiments with DOP on the WSJ.

48 citations


Journal ArticleDOI
TL;DR: It is shown that the class of string languages generated by spine grammars coincides with that of tree adjoining grammars.
Abstract: In this paper we introduce a restricted model of context-free tree grammars called spine grammars, and study their formal properties including considerably simple normal forms. Recent research on natural languages has suggested that formalisms for natural languages need to generate a slightly larger class of languages than context-free grammars, and for that reason tree adjoining grammars have been widely studied relating them to natural languages. It is shown that the class of string languages generated by spine grammars coincides with that of tree adjoining grammars. We also introduce acceptors called linear pushdown tree automata, and show that linear pushdown tree automata accept exactly the class of tree languages generated by spine grammars. Linear pushdown tree automata are obtained from pushdown tree automata with a restriction on duplicability for the pushdown stacks.

46 citations


Book ChapterDOI
11 Sep 2000
TL;DR: A technique to infer finite-state transducers is proposed in this work, based on the formal relations between finite-state transducers and regular grammars.
Abstract: A technique to infer finite-state transducers is proposed in this work. This technique is based on the formal relations between finite-state transducers and regular grammars. The technique consists of: 1) building a corpus of training strings from the corpus of training pairs; 2) inferring a regular grammar and 3) transforming the grammar into a finite-state transducer.

41 citations
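The three steps can be sketched as below. Two simplifications are mine, not the paper's: the regular-grammar inference step is replaced by a trivial prefix-tree acceptor, and a one-to-one alignment between input and output symbols is assumed.

```python
# Step 1: encode each (input, output) training pair as one string over
# composite symbols such as 'a/x' (1:1 alignment assumed).
def encode_pairs(pairs):
    return [['%s/%s' % (a, b) for a, b in zip(x, y)] for x, y in pairs]

# Step 2: stand-in for regular-grammar inference - the prefix-tree
# automaton accepting exactly the training strings.
def prefix_tree_acceptor(strings):
    arcs, finals, fresh = {}, set(), [0]
    for s in strings:
        state = 0
        for sym in s:
            if (state, sym) not in arcs:
                fresh[0] += 1
                arcs[(state, sym)] = fresh[0]
            state = arcs[(state, sym)]
        finals.add(state)
    return arcs, finals

# Step 3: split each composite label back into an input/output arc,
# yielding a finite-state transducer.
def to_transducer(arcs):
    return {(q, sym.split('/')[0]): (dest, sym.split('/')[1])
            for (q, sym), dest in arcs.items()}

arcs, finals = prefix_tree_acceptor(encode_pairs([('ab', 'xy')]))
fst = to_transducer(arcs)  # {(0, 'a'): (1, 'x'), (1, 'b'): (2, 'y')}
```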


Book ChapterDOI
TL;DR: An efficient algorithm is proposed to solve one of the problems associated with the use of weighted and stochastic Context-Free Grammars: the problem of computing the N best parse trees of a given string.
Abstract: Context-Free Grammars are the object of increasing interest in the pattern recognition research community in an attempt to overcome the limited modeling capabilities of the simpler regular grammars, and have application in a variety of fields such as language modeling, speech recognition, optical character recognition, computational biology, etc. This paper proposes an efficient algorithm to solve one of the problems associated with the use of weighted and stochastic Context-Free Grammars: the problem of computing the N best parse trees of a given string. After the best parse tree has been computed using the CYK algorithm, a large number of alternative parse trees are obtained, in order by weight (or probability), in a small fraction of the time required by the CYK algorithm to find the best parse tree. This is confirmed by experimental results using grammars from two different domains: a chromosome grammar, and a grammar modeling natural language sentences from the Wall Street Journal corpus.
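For context, the CYK (Viterbi) pass that the N-best step starts from might look like the sketch below for a stochastic CFG in Chomsky normal form; the rule encoding and the toy grammar are assumptions of this sketch, not the paper's.

```python
from collections import defaultdict

def viterbi_cyk(tokens, lexical, binary, start='S'):
    """Best parse probability. lexical: {(A, word): prob};
    binary: {(A, B, C): prob} for rules A -> B C."""
    n = len(tokens)
    best = defaultdict(float)  # (i, j, A) -> best inside probability
    for i, w in enumerate(tokens):
        for (A, word), p in lexical.items():
            if word == w and p > best[(i, i + 1, A)]:
                best[(i, i + 1, A)] = p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    cand = p * best[(i, k, B)] * best[(k, j, C)]
                    if cand > best[(i, j, A)]:
                        best[(i, j, A)] = cand
    return best[(0, n, start)]

# Toy grammar (invented): S -> N V; 'fish' is lexically ambiguous.
lexical = {('N', 'people'): 0.5, ('N', 'fish'): 0.5, ('V', 'fish'): 0.5}
binary = {('S', 'N', 'V'): 1.0}
print(viterbi_cyk(['people', 'fish'], lexical, binary))  # 0.25
```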

Book ChapterDOI
11 Sep 2000
TL;DR: This paper describes a method of synthesizing context-free grammars from positive and negative sample strings, implemented in a grammatical inference system called Synapse, based on incremental learning from positive samples and a rule generation method called the “inductive CYK algorithm,” which generates the minimal production rules required for parsing positive samples.
Abstract: This paper describes a method of synthesizing context-free grammars from positive and negative sample strings, which is implemented in a grammatical inference system called Synapse. The method is based on incremental learning from positive samples and a rule generation method called the “inductive CYK algorithm,” which generates the minimal production rules required for parsing positive samples. Synapse can generate unambiguous grammars as well as ambiguous grammars. Experiments showed that Synapse can synthesize several simple context-free grammars in a fairly short time.

Proceedings Article
29 Apr 2000
TL;DR: This paper describes a method for estimating conditional probability distributions over the parses of "unification-based" grammars which can utilize auxiliary distributions that are estimated by other means, and applies this estimator to a Stochastic Lexical-Functional Grammar.
Abstract: This paper describes a method for estimating conditional probability distributions over the parses of "unification-based" grammars which can utilize auxiliary distributions that are estimated by other means. We show how this can be used to incorporate information about lexical selectional preferences gathered from other sources into Stochastic "Unification-based" Grammars (SUBGs). While we apply this estimator to a Stochastic Lexical-Functional Grammar, the method is general, and should be applicable to stochastic versions of HPSGs, categorial grammars and transformational grammars.

Journal ArticleDOI
TL;DR: It is shown that random context grammars are strictly weaker than the non-erasing random context grammars, and a shrinking lemma is proved for their languages.

Book ChapterDOI
01 Sep 2000
TL;DR: A formal approach for the specification of mobile code systems is introduced, based on graph grammars, a formal description technique that is suitable for describing highly parallel systems and intuitive even for non-theoreticians.
Abstract: In this paper we introduce a formal approach for the specification of mobile code systems. This approach is based on graph grammars, a formal description technique that is suitable for describing highly parallel systems and is intuitive even for non-theoreticians. We define a special class of graph grammars using the concepts of object-based systems and include location information explicitly. Aspects of modularity and execution in an open environment are discussed.

Journal ArticleDOI
01 May 2000, Grammars
TL;DR: A generalization of context-free grammars is presented which nonetheless still has cubic parse time complexity; the languages it defines belong to an extension of mildly context-sensitive languages in which the constant growth property is relaxed, and can thus potentially be used in natural language processing.
Abstract: Context-free grammars and cubic parse time are so related in people's minds that they often think that parsing any extension of context-free grammars must need some extra time. Of course, this is not necessarily true and this paper presents a generalization of context-free grammars which nonetheless still has a cubic parse time complexity. This extension, which defines a subclass of context-sensitive languages, has both a theoretical and a practical interest. The class of languages defined by these grammars is closed under both intersection and complement (in fact this class contains both the intersection and the complement of context-free languages). Moreover, these languages belong to an extension of mildly context-sensitive languages in which the constant growth property is relaxed and which can thus potentially be used in natural language processing.

Journal ArticleDOI
TL;DR: It is proved that the three-nonterminal scattered context grammars characterize the family of recursively enumerable languages.


01 May 2000
TL;DR: This work has used a Lexicalized Tree Adjoining Grammar to capture the syntax associated with each verb class and has added semantic predicates to each tree, which allow for a compositional interpretation.
Abstract: We present a class-based approach to building a verb lexicon that makes explicit the close relation between syntax and semantics for Levin classes. We have used a Lexicalized Tree Adjoining Grammar to capture the syntax associated with each verb class and have added semantic predicates to each tree, which allow for a compositional interpretation.

Book ChapterDOI
31 Oct 2000
TL;DR: In this article, the authors address the issue of how to associate frequency information with lexicalized grammar formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework, and evaluate their adequacy from both a theoretical and empirical perspective using data from existing large treebanks.
Abstract: We address the issue of how to associate frequency information with lexicalized grammar formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework. We consider systematically a number of alternative probabilistic frameworks, evaluating their adequacy from both a theoretical and empirical perspective using data from existing large treebanks. We also propose three orthogonal approaches for backing off probability estimates to cope with the large number of parameters involved.

01 Jan 2000
TL;DR: The work reported here is a first step towards the development of an implemented TAG grammar for Korean, which is continuously updated with the addition of new analyses and modification of old ones.
Abstract: This document describes an on-going project of developing a grammar of Korean, the Korean XTAG grammar, written in the TAG formalism and implemented for use with the XTAG system enriched with a Korean morphological analyzer. The Korean XTAG grammar described in this report is based on the TAG formalism (Joshi et al., 1975), which has been extended to include lexicalization (Schabes et al., 1988) and unification-based feature structures (Vijay-Shanker and Joshi, 1991). The document first describes the modifications that we have made to the XTAG system (The XTAG Group, 1998) to handle rich inflectional morphology in Korean. Then various syntactic phenomena that can currently be handled are described, including adverb modification, relative clauses, complex noun phrases, auxiliary verb constructions, gerunds and adjunct clauses. The work reported here is a first step towards the development of an implemented TAG grammar for Korean, which is continuously updated with the addition of new analyses and modification of old ones.

Journal ArticleDOI
TL;DR: An efficient, O(n), parsing algorithm for languages generated by dynamically programmed grammars, so-called DPLL(k) grammars, is presented; it can be used for the analysis of complex trend functions describing the behaviour of industrial equipment.

Proceedings ArticleDOI
31 Jul 2000
TL;DR: This paper produces a transformed grammar which simulates left-corner recognition of a user-specified set of the original productions, and top-down recognition of the others, and combined with two factorizations produces non-left-recursive grammars that are not much larger than the original.
Abstract: The left-corner transform removes left-recursion from (probabilistic) context-free grammars and unification grammars, permitting simple top-down parsing techniques to be used. Unfortunately the grammars produced by the standard left-corner transform are usually much larger than the original. The selective left-corner transform described in this paper produces a transformed grammar which simulates left-corner recognition of a user-specified set of the original productions, and top-down recognition of the others. Combined with two factorizations, it produces non-left-recursive grammars that are not much larger than the original.
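A sketch of the plain (non-selective) left-corner transform that the paper refines, under an assumed rule encoding: rules are pairs (lhs, rhs-tuple), and a transformed nonterminal "A with left corner X already recognized" is encoded as the pair (A, X). On the left-recursive toy grammar S -> S a | b, the output contains no left-recursive cycle.

```python
def left_corner_transform(rules, nonterminals, terminals):
    out = set()
    # A -> a (A, a): shift a terminal left corner.
    for a in terminals:
        for A in nonterminals:
            out.add((A, (a, (A, a))))
    # (A, X) -> beta (A, Y): for each original rule Y -> X beta,
    # project the recognized left corner X up to Y.
    for (Y, rhs) in rules:
        X, beta = rhs[0], rhs[1:]
        for A in nonterminals:
            out.add(((A, X), beta + ((A, Y),)))
    # (A, A) -> epsilon: the left corner has reached A itself.
    for A in nonterminals:
        out.add(((A, A), ()))
    return out

# Left-recursive toy grammar: S -> S 'a' | 'b'
rules = {('S', ('S', 'a')), ('S', ('b',))}
lc = left_corner_transform(rules, {'S'}, {'a', 'b'})
```

Note the blowup the abstract mentions: the transform emits an A -> a (A, a) rule for every terminal and nonterminal pair, useful or not, which is what the selective transform and the two factorizations address.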

Journal ArticleDOI
TL;DR: Any language accepted by a Turing machine may be written as a translation of a regular set performed by a generalised stream X-machine with underlying distributed grammars based on context-free rules, under = k derivation strategy.
Abstract: Stream X-machines are a general and powerful computational model. By coupling the control structure of a stream X-machine with a set of formal grammars a new machine called a generalised stream X-machine with underlying distributed grammars, acting as a translator, is obtained. By introducing this new mechanism a hierarchy of computational models is provided. If the grammars are of a particular class, say regular or context-free, then finite sets are translated into finite sets, when ≤ k, = k derivation strategies are used, and regular or context-free sets, respectively, are obtained for ≥ k, * and terminal derivation strategies. In both cases, regular or context-free grammars, the regular sets are translated into non-context-free languages. Moreover, any language accepted by a Turing machine may be written as a translation of a regular set performed by a generalised stream X-machine with underlying distributed grammars based on context-free rules, under the = k derivation strategy. On the other hand, the languages generated by some classes of cooperating distributed grammar systems may be obtained as images of regular sets through some X-machines with underlying distributed grammars. Other relations of the families of languages computed by generalised stream X-machines with the families of languages generated by cooperating distributed grammar systems are established. At the end, an example dealing with the specification of a scanner system illustrates the use of the introduced mechanism as a formal specification model.

01 Jan 2000
TL;DR: Investigation of whether the notion of locality inherent in Tree Adjoining Grammar (TAG) will allow for an efficient approach to automatic extraction of predicate-argument structure of Chinese Treebank parse trees.
Abstract: Working on Information Extraction and management of the TIDES (“Translingual Information Detection, Extraction and Summarization”) program at Penn. Helped develop Penn’s first contribution to the Automatic Content Extraction evaluation (pipelined statistical system). Current research is investigation of whether the notion of locality inherent in Tree Adjoining Grammar (TAG) will allow for an efficient approach to automatic extraction of predicate-argument structure. Developing software for tagging predicate argument structure of Chinese Treebank parse trees.

Proceedings ArticleDOI
31 Jul 2000
TL;DR: The authors describe a series of experiments which investigate the question empirically, by incrementally constructing a grammar and discovering what problems emerge when successively larger versions are compiled into finite state graph representations and used as language models for a medium-vocabulary recognition task.
Abstract: Systems now exist which are able to compile unification grammars into language models that can be included in a speech recognizer, but it is so far unclear whether non-trivial linguistically principled grammars can be used for this purpose. We describe a series of experiments which investigate the question empirically, by incrementally constructing a grammar and discovering what problems emerge when successively larger versions are compiled into finite state graph representations and used as language models for a medium-vocabulary recognition task.


Proceedings ArticleDOI
13 Sep 2000
TL;DR: The method, applicable to any unification grammar with a phrase-structure backbone, is shown to be effective in specializing a broad-coverage LFG for French.
Abstract: Broad-coverage grammars tend to be highly ambiguous. When such grammars are used in a restricted domain, it may be desirable to specialize them, in effect trading some coverage for a reduction in ambiguity. Grammar specialization is here given a novel formulation as an optimization problem, in which the search is guided by a global measure combining coverage, ambiguity and grammar size. The method, applicable to any unification grammar with a phrase-structure backbone, is shown to be effective in specializing a broad-coverage LFG for French.

Journal ArticleDOI
01 Jan 2000, Grammars
TL;DR: This work investigates various concepts of leftmost derivation in grammars controlled by bicoloured digraphs, paying specific attention to their descriptive capacity, in order to unify the presentation of known results, especially regarding programmed and matrix grammars, and to obtain new results concerning grammars with regular control and periodically time-variant grammars.
Abstract: In this paper, we investigate various concepts of leftmost derivation in grammars controlled by bicoloured digraphs, paying specific attention to their descriptive capacity. This approach allows us to unify the presentation of known results regarding especially programmed and matrix grammars, and to obtain new results concerning grammars with regular control, and periodically time-variant grammars. Moreover, we consider leftmost derivations in grammars with (regular) context conditions.

Proceedings Article
29 Apr 2000
TL;DR: Evidence is provided that left-to-right parsing cannot be realised within acceptable time bounds if the so-called correct-prefix property is to be ensured.
Abstract: We compare the asymptotic time complexity of left-to-right and bidirectional parsing techniques for bilexical context-free grammars, a grammar formalism that is an abstraction of language models used in several state-of-the-art real-world parsers. We provide evidence that left-to-right parsing cannot be realised within acceptable time bounds if the so-called correct-prefix property is to be ensured. Our evidence is based on complexity results for the representation of regular languages.