scispace - formally typeset

Showing papers on "Tree-adjoining grammar published in 2003"


Journal ArticleDOI
TL;DR: This paper presents an implementation of a discourse parsing system for a lexicalized Tree-Adjoining Grammar for discourse, specifying the integration of sentence and discourse level processing, based on the assumption that the compositional aspects of semantics at the discourse level parallel those at the sentence level.
Abstract: We present an implementation of a discourse parsing system for a lexicalized Tree-Adjoining Grammar for discourse, specifying the integration of sentence and discourse level processing. Our system is based on the assumption that the compositional aspects of semantics at the discourse level parallel those at the sentence level. This coupling is achieved by factoring away inferential semantics and anaphoric features of discourse connectives. Computationally, this parallelism is achieved because both the sentence and discourse grammar are LTAG-based and the same parser works at both levels. The approach to an LTAG for discourse has been developed by Webber and colleagues in some recent papers. Our system takes a discourse as input, parses the sentences individually, extracts the basic discourse constituent units from the sentence derivations, and reparses the discourse with reference to the discourse grammar while using the same parser used at the sentence level.

71 citations


Proceedings ArticleDOI
12 Apr 2003
TL;DR: A semantic construction method for Feature-Based Tree Adjoining Grammar which is based on the derived tree is proposed and compared with related proposals.
Abstract: We propose a semantic construction method for Feature-Based Tree Adjoining Grammar which is based on the derived tree, compare it with related proposals, and briefly discuss some implementation possibilities.

67 citations


Journal ArticleDOI
01 Dec 2003
TL;DR: It is demonstrated how the combined formalism, Circular Reference Attributed Grammars (CRAGs), can take advantage of the strengths of both RAGs and CAGs, making it possible to express solutions to many problems in an easy way.
Abstract: This paper presents a combination of Reference Attributed Grammars (RAGs) and Circular Attribute Grammars (CAGs). While RAGs allow the direct and easy specification of non-locally dependent information, CAGs allow iterative fixed-point computations to be expressed directly using recursive (circular) equations. We demonstrate how the combined formalism, Circular Reference Attributed Grammars (CRAGs), can take advantage of both these strengths, making it possible to express solutions to many problems in an easy way. We exemplify with the specification and computation of the nullable, first, and follow sets used in parser construction, a problem which is highly recursive and normally programmed by hand using an iterative algorithm. We also present a general demand-driven evaluation algorithm for CRAGs and some optimizations of it. The approach has been implemented and experimental results include computations on a series of grammars including Java 1.2. We also revisit some of the classical examples of CAGs and show how their solutions are facilitated by CRAGs.
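The nullable, first, and follow computation mentioned above is the classic circular (fixed-point) problem the paper uses as its running example. A minimal iterative sketch in Python, assuming a simple grammar encoding invented here for illustration (the paper's own demand-driven CRAG evaluator is more general):

```python
def nullable_first_follow(rules, start):
    """rules: list of (lhs, rhs) with rhs a tuple of symbols;
    left-hand sides are the nonterminals, everything else is terminal.
    Iterates the recursive (circular) set equations to a fixed point."""
    nts = {lhs for lhs, _ in rules}
    nullable = set()
    first = {A: set() for A in nts}
    follow = {A: set() for A in nts}
    follow[start].add('$')                     # end-of-input marker
    changed = True
    while changed:                             # stop when no set grows
        changed = False
        for lhs, rhs in rules:
            # nullable(lhs) if every symbol of rhs is nullable
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
            # first(lhs): scan rhs up to the first non-nullable symbol
            for s in rhs:
                add = first[s] if s in nts else {s}
                if not add <= first[lhs]:
                    first[lhs] |= add
                    changed = True
                if s not in nullable:
                    break
            # follow(s): first of what can come after s, plus follow(lhs)
            # when the rest of rhs can derive the empty string
            for i, s in enumerate(rhs):
                if s not in nts:
                    continue
                trailer, rest_nullable = set(), True
                for t in rhs[i + 1:]:
                    trailer |= first[t] if t in nts else {t}
                    if t not in nullable:
                        rest_nullable = False
                        break
                if rest_nullable:
                    trailer |= follow[lhs]
                if not trailer <= follow[s]:
                    follow[s] |= trailer
                    changed = True
    return nullable, first, follow
```

For the grammar S → A B, A → a | ε, B → b, the fixed point gives nullable = {A}, first(S) = {a, b}, and follow(A) = {b}.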

65 citations


Journal ArticleDOI
TL;DR: It is shown that there effectively exists an algorithm that identifies a very simple grammar G equivalent to G* in the limit from positive data, satisfying the property that the time for updating a conjecture is bounded by O(m) and the total number of prediction errors made by the algorithm is bounded by O(n), where n is the size of G*, m = max{N|Σ|+1, |Σ|³}, and N is the total length of all positive data provided.

64 citations


Book ChapterDOI
14 Apr 2003
TL;DR: A new grammar guided genetic programming system called TAG3P+ is introduced, where tree-adjoining grammars (TAGs) are used as means to set language bias for genetic programming.
Abstract: In this paper, we introduce a new grammar guided genetic programming system called tree-adjoining grammar guided genetic programming (TAG3P+), where tree-adjoining grammars (TAGs) are used as means to set language bias for genetic programming. We show that the capability of TAGs in handling context-sensitive information and categories can be useful to set a language bias that cannot be specified in grammar guided genetic programming. Moreover, we bias the genetic operators to preserve the language bias during the evolutionary process. The results pave the way towards a better understanding of the importance of bias in genetic programming.

46 citations


Proceedings Article
07 Jul 2003
TL;DR: A new class of formal grammars is introduced that allows the use of all set-theoretic operations as an integral part of the formalism of rules; generalizations of techniques from context-free grammar theory support the practical applicability of the new concept.
Abstract: As a direct continuation of the earlier research on conjunctive grammars - context-free grammars equipped with intersection - this paper introduces a new class of formal grammars, which allow the use of all set-theoretic operations as an integral part of the formalism of rules. Rigorous semantics for such grammars is defined by language equations in a way that allows one to generalize some techniques from the theory of context-free grammars, including Chomsky normal form, the Cocke-Kasami-Younger recognition algorithm, and some limited extension of the notion of a parse tree, which together suggest the practical applicability of the new concept.
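The Cocke-Kasami-Younger generalization mentioned above carries over directly from conjunctive grammars: a rule with several conjuncts derives a string only if every conjunct does, each possibly with a different split point. A minimal recognizer sketch, using Okhotin's standard conjunctive grammar for {aⁿbⁿcⁿ} as the example (the grammar encoding here is invented for illustration):

```python
def conjunctive_cyk(w, terminal_rules, conj_rules, start):
    """CYK-style recognition for a conjunctive grammar in binary normal form.
    terminal_rules: list of (A, a); conj_rules: list of (A, [(B1, C1), ...]).
    A derives w[i:j] by a conjunctive rule iff EVERY conjunct (B, C)
    admits some split k with B deriving w[i:k] and C deriving w[k:j]."""
    n = len(w)
    T = {(i, i + 1): {A for A, a in terminal_rules if a == w[i]}
         for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            T[(i, j)] = {
                A for A, conjuncts in conj_rules
                if all(any(B in T[(i, k)] and C in T[(k, j)]
                           for k in range(i + 1, j))
                       for B, C in conjuncts)
            }
    return n > 0 and start in T[(0, n)]

# {a^n b^n c^n : n >= 1} via S -> AB & DC, binarized:
# A = a+, B = b^n c^n, D = a^n b^n, C = c+.
terminal_rules = [('Xa', 'a'), ('Xb', 'b'), ('Xc', 'c'),
                  ('A', 'a'), ('C', 'c')]
conj_rules = [
    ('S', [('A', 'B'), ('D', 'C')]),   # the only rule with two conjuncts
    ('A', [('Xa', 'A')]),
    ('B', [('Xb', 'T')]), ('B', [('Xb', 'Xc')]),
    ('T', [('B', 'Xc')]),
    ('D', [('Xa', 'U')]), ('D', [('Xa', 'Xb')]),
    ('U', [('D', 'Xb')]),
    ('C', [('Xc', 'C')]),
]
```

With a single conjunct per rule this degenerates to ordinary CYK; the intersection shows up only in the `all(...)` over conjuncts.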

33 citations


Journal ArticleDOI
James Rogers
TL;DR: The use of weak monadic second-order languages over structures of varying dimension as specification languages for grammars and automata is explored, focusing on the extension of the longstanding results characterizing the regular and context-free languages in terms of definability in wS1S and wSnS to a characterization of the Tree-Adjoining Languages.

33 citations


Journal ArticleDOI
TL;DR: A cubic-time recognition and parsing algorithm for conjunctive grammars is presented, applicable to an arbitrary conjunctive grammar without any initial transformations, which can be modified to work in quadratic time and use linear space.

33 citations


01 Jan 2003
TL;DR: This paper represents a dependency grammar in terms of elementary dependency trees anchored on lexical items, which allows one to transfer all the key insights from TAG to dependency grammars.
Abstract: In this paper, we present a formalism for dependency grammar based on some key ideas from Tree-Adjoining Grammars. We represent a dependency grammar in terms of elementary dependency trees anchored on lexical items. These elementary trees correctly capture the dependencies associated with the lexical anchor. These trees may also include nodes that represent items on which the lexical anchor depends. These nodes are well motivated. We also describe operations that combine elementary or derived dependency trees, which are analogous to “substitution” and “adjoining” in TAG. This characterization of a dependency grammar allows one to transfer all the key insights from TAG to dependency grammars.
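The two TAG combination operations referred to above have a compact tree-level description: substitution fills a marked frontier node with an initial tree, while adjunction splices an auxiliary tree into an interior node, moving that node's subtree under the auxiliary tree's foot. A minimal sketch on plain trees, with the `Node` class and the example trees invented here for illustration:

```python
from copy import deepcopy

class Node:
    def __init__(self, label, children=None, kind='internal'):
        self.label = label              # category symbol or word
        self.children = children or []  # empty list for leaves
        self.kind = kind                # 'internal', 'subst' (↓), or 'foot' (*)

def find(node, pred):
    # First node (depth-first, left-to-right) satisfying pred, or None.
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None

def substitute(host, initial):
    # Fill the first matching substitution site with a copy of the initial tree.
    site = find(host, lambda n: n.kind == 'subst' and n.label == initial.label)
    filled = deepcopy(initial)
    site.children, site.kind = filled.children, 'internal'
    return host

def adjoin(host, aux):
    # Splice a copy of the auxiliary tree in at the first internal node
    # matching its root label; that node's subtree moves under the foot.
    target = find(host, lambda n: n.kind == 'internal'
                  and n.label == aux.label and n.children)
    aux = deepcopy(aux)
    foot = find(aux, lambda n: n.kind == 'foot')
    foot.children, foot.kind = target.children, 'internal'
    target.children = aux.children
    return host

def words(node):
    # Left-to-right yield of the tree's leaves.
    return [node.label] if not node.children else \
           [w for c in node.children for w in words(c)]

# Elementary tree anchored on "sleeps", with an NP substitution site:
sleeps = Node('S', [Node('NP', kind='subst'),
                    Node('VP', [Node('V', [Node('sleeps')])])])
john = Node('NP', [Node('John')])                # initial tree
soundly = Node('VP', [Node('VP', kind='foot'),   # auxiliary tree
                      Node('ADV', [Node('soundly')])])

tree = adjoin(substitute(sleeps, john), soundly)
```

Substituting "John" and adjoining "soundly" at the VP node yields the string "John sleeps soundly"; the dependency-grammar analogues in the paper combine elementary dependency trees in the same two ways.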

33 citations



Journal ArticleDOI
TL;DR: The closure properties of the language family generated by linear conjunctive grammars are investigated; the main result is its closure under complement, which implies that it is closed under all set-theoretic operations.

Book
01 Jan 2003
TL;DR: This book presents a unified formal approach to various contemporary linguistic formalisms such as Government & Binding, Minimalism or Tree Adjoining Grammar and features a complete and well illustrated introduction to the connection between declarative approaches formalized in monadic second-order logic (MSO) and generative ones formalized in various forms of automata as well as of tree grammars.
Abstract: This book presents a unified formal approach to various contemporary linguistic formalisms such as Government & Binding, Minimalism or Tree Adjoining Grammar. Through a careful introduction of mathematical techniques from logic, automata theory and universal algebra, the book aims at graduate students and researchers who want to learn more about tightly constrained logical approaches to natural language syntax. Therefore it features a complete and well illustrated introduction to the connection between declarative approaches formalized in monadic second-order logic (MSO) and generative ones formalized in various forms of automata as well as of tree grammars. Since MSO logic (on trees) yields only context-free languages, and at least the last two of the formalisms mentioned above clearly belong to the class of mildly context-sensitive formalisms, it becomes necessary to deal with the problem of the descriptive complexity of the formalisms involved in another way. The proposed genuinely new two-step approach overcomes this limitation of MSO logic while still retaining the desired tightly controlled formal properties.

Proceedings ArticleDOI
07 Jul 2003
TL;DR: A MetaGrammar is introduced, which allows the grammar writer to specify in a compact manner syntactic properties that are potentially framework- and to some extent language-independent, from which grammars for several frameworks and languages are automatically generated offline.
Abstract: We introduce a MetaGrammar, which allows us to automatically generate, from a single and compact MetaGrammar hierarchy, parallel Lexical Functional Grammars (LFG) and Tree-Adjoining Grammars (TAG) for French and for English: the grammar writer specifies in a compact manner syntactic properties that are potentially framework-, and to some extent language-independent (such as subcategorization, valency alternations and realization of syntactic functions), from which grammars for several frameworks and languages are automatically generated offline.

Proceedings Article
01 Jan 2003
TL;DR: An algorithm for the inference of context-free graph grammars from examples, which builds on an earlier system for frequent substructure discovery and is biased toward grammars that minimize description length.
Abstract: We present an algorithm for the inference of context-free graph grammars from examples. The algorithm builds on an earlier system for frequent substructure discovery, and is biased toward grammars that minimize description length. Grammar features include recursion, variables and relationships. We present an illustrative example, demonstrate the algorithm’s ability to learn in the presence of noise, and show real-world examples.

Journal ArticleDOI
TL;DR: A variant of P systems is defined, namely, probabilistic rewriting P systems, where the selection of rewriting rules is probabilistic, and results concerning recognition with non-zero cut-point are shown.
Abstract: In this paper we define a variant of P systems, namely, probabilistic rewriting P systems, where the selection of rewriting rules is probabilistic. We show that, with non-zero cut-point, probabilis...

Journal ArticleDOI
01 Apr 2003-Grammars
TL;DR: The aim of this paper is to give prospective PhD students in the area hints on where to start promising research, and to supplement earlier reference lists on parallel grammars, trying to cover recent papers as well as "older" papers which were somehow neglected in other reviews.
Abstract: The aim of this paper is at least twofold: to give prospective PhD students in the area hints on where to start promising research, and to supplement earlier reference lists on parallel grammars, trying to cover recent papers as well as "older" papers which were somehow neglected in other reviews. Together with the now classical book on L systems by G. Rozenberg and A. Salomaa and with the articles on L systems in the Handbook of Formal Languages, researchers will be equipped with a hopefully comprehensive list of references and ideas around parallel grammars.
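The defining feature of the L systems surveyed here is parallel rewriting: every symbol of the current string is rewritten in the same derivation step, unlike the one-symbol-at-a-time rewriting of Chomsky grammars. A minimal D0L-system sketch, using Lindenmayer's classic algae system as the example:

```python
def d0l_step(word, rules):
    # Parallel rewriting: every symbol is replaced simultaneously;
    # symbols without a rule are left unchanged (identity rule).
    return ''.join(rules.get(s, s) for s in word)

def derive(axiom, rules, steps):
    # Iterate the parallel rewriting step from the axiom.
    word = axiom
    for _ in range(steps):
        word = d0l_step(word, rules)
    return word

# Lindenmayer's algae system: A -> AB, B -> A.
# The word lengths grow as the Fibonacci numbers 1, 2, 3, 5, 8, ...
algae = {'A': 'AB', 'B': 'A'}
```

For instance, `derive('A', algae, 3)` produces `ABAAB`, and after six steps the word has length 21, the eighth Fibonacci number.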

Proceedings ArticleDOI
22 Sep 2003
TL;DR: This paper presents techniques for the formal specification and efficient incremental implementation of spreadsheet-like tools using strong attribute grammars and first incremental results are presented.
Abstract: This paper presents techniques for the formal specification and efficient incremental implementation of spreadsheet-like tools. The spreadsheets are specified by strong attribute grammars. In this style of attribute grammar programming every single inductive computation is expressed within the attribute grammar formalism. Well-known attribute grammar techniques are used to reason about such grammars. For example, ordered scheduling algorithms can be used to statically guarantee termination of the attribute grammars and to derive efficient implementations. A strong attribute grammar for a spreadsheet is defined and the first incremental results are presented.
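The demand-driven flavor of attribute evaluation behind such spreadsheet tools can be sketched compactly: a cell's value is computed only when first needed, memoized, and cyclic references are rejected. The `Sheet` class and the cell formulas below are invented for illustration and are not the paper's system, which additionally derives incremental and statically scheduled evaluators:

```python
class Sheet:
    def __init__(self, formulas):
        # formulas: cell name -> constant, or a function of the sheet
        self.formulas = formulas
        self.cache = {}           # memoized attribute values
        self.in_progress = set()  # detects (unsupported) cyclic references

    def value(self, cell):
        # Demand-driven: each cell is evaluated at most once, when needed.
        if cell in self.cache:
            return self.cache[cell]
        if cell in self.in_progress:
            raise ValueError(f"cyclic reference through {cell}")
        self.in_progress.add(cell)
        f = self.formulas[cell]
        v = f(self) if callable(f) else f
        self.in_progress.discard(cell)
        self.cache[cell] = v
        return v

sheet = Sheet({
    'A1': 2,
    'A2': 3,
    'B1': lambda s: s.value('A1') + s.value('A2'),
    'B2': lambda s: 10 * s.value('B1'),
})
```

Asking for `B2` pulls in `B1`, which in turn pulls in `A1` and `A2`; each value is cached, so a second request performs no recomputation.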

Journal Article
TL;DR: A fundamental framework of automata and grammar theory based on quantum logic is preliminarily established, and it is shown that the language generated by any l-valued regular grammar is equivalent to that recognized by some automaton with ε-moves based on quantum logic.
Abstract: In this paper, a fundamental framework of automata and grammar theory based on quantum logic is preliminarily established. First, quantum grammars, called l-valued grammars, are introduced. It is shown in particular that the language (called a quantum language) generated by any l-valued regular grammar is equivalent to that recognized by some automaton with ε-moves based on quantum logic (called an l-valued automaton), and conversely, any quantum language recognized by an l-valued automaton is also equivalent to that generated by some l-valued grammar. Afterwards, the l-valued pumping lemma is established, and a decision characterization of quantum languages is presented. Finally, the relationship between regular grammars and quantum grammars (l-valued regular grammars) is briefly discussed. In summary, this work lays a foundation for further studies on more complicated quantum automata and quantum grammars, such as quantum pushdown automata and Turing machines as well as quantum context-free grammars and context-sensitive grammars.

Dissertation
14 Nov 2003
TL;DR: The main conclusions are: - formal learning theory is relevant to linguistics, - identification in the limit is feasible for non-trivial classes, and the `Shinohara approach' can lead to a learnable class, but this completely depends on the specific nature of the formalism and the notion of complexity.
Abstract: In 1967 E. M. Gold published a paper in which the language classes from the Chomsky-hierarchy were analyzed in terms of learnability, in the technical sense of identification in the limit. His results were mostly negative, and perhaps because of this his work had little impact on linguistics. In the early eighties there was renewed interest in the paradigm, mainly because of work by Angluin and Wright. Around the same time, Arikawa and his co-workers refined the paradigm by applying it to so-called Elementary Formal Systems. By making use of this approach Takeshi Shinohara was able to come up with an impressive result; any class of context-sensitive grammars with a bound on its number of rules is learnable. Some linguistically motivated work on learnability also appeared from this point on, most notably Wexler & Culicover 1980 and Kanazawa 1994. The latter investigates the learnability of various classes of categorial grammar, inspired by work by Buszkowski and Penn, and raises some interesting questions. We follow up on this work by exploring complexity issues relevant to learning these classes, answering an open question from Kanazawa 1994, and applying the same kind of approach to obtain (non)learnable classes of Combinatory Categorial Grammars, Tree Adjoining Grammars, Minimalist grammars, Generalized Quantifiers, and some variants of Lambek Grammars. We also discuss work on learning tree languages and its application to learning Dependency Grammars. Our main conclusions are: - formal learning theory is relevant to linguistics, - identification in the limit is feasible for non-trivial classes, - the `Shinohara approach' -i.e., placing a numerical bound on the complexity of a grammar- can lead to a learnable class, but this completely depends on the specific nature of the formalism and the notion of complexity. We give examples of natural classes of commonly used linguistic formalisms that resist this kind of approach, - learning is hard work. 
Our results indicate that learning even `simple' classes of languages requires a lot of computational effort, - dealing with structure (derivation-, dependency-) languages instead of string languages offers a useful and promising approach to learnability in a linguistic context.

Journal ArticleDOI
TL;DR: An abductive model based on Constraint Handling Rule Grammars (CHRGs) for detecting and correcting errors in problem domains that can be described in terms of strings of words accepted by a logic grammar is proposed.
Abstract: We propose an abductive model based on Constraint Handling Rule Grammars (CHRGs) for detecting and correcting errors in problem domains that can be described in terms of strings of words accepted by a logic grammar. We provide a proof of concept for the specific problem of detecting and repairing natural language errors, in particular, those concerning feature agreement. Our methodology relies on grammar and string transformation in accordance with a user-defined dictionary of possible repairs. This transformation also serves as top-down guidance for our essentially bottom-up parser. With respect to previous approaches to error detection and repair, including those that also use constraints and/or abduction, our methodology is surprisingly simple while far-reaching and efficient.

Journal ArticleDOI
TL;DR: A new method of describing pictures of digitized rectangular arrays, based on contextual grammars, is introduced, and its properties are studied.

Journal ArticleDOI
TL;DR: This work considers the application of context-free grammars to algorithm analysis, deriving multivariate generating function equations from which averages and higher moments are easily accessible.

Proceedings Article
01 Apr 2003
TL;DR: It is argued that in practice, the width of attested ID/LP grammars is small, yielding effectively polynomial time complexity for ID/LP grammar parsing, thereby getting finer-grained bounds on the parsing complexity of ID/LP grammars.
Abstract: We present a new formalism, partially ordered multiset context-free grammars (poms-CFG), along with an Earley-style parsing algorithm. The formalism, which can be thought of as a generalization of context-free grammars with partially ordered right-hand sides, is of interest in its own right, and also as infrastructure for obtaining tighter complexity bounds for more expressive context-free formalisms intended to express free or multiple word-order, such as ID/LP grammars. We reduce ID/LP grammars to poms-grammars, thereby getting finer-grained bounds on the parsing complexity of ID/LP grammars. We argue that in practice, the width of attested ID/LP grammars is small, yielding effectively polynomial time complexity for ID/LP grammar parsing.

Journal ArticleDOI
01 Apr 2003-Grammars
TL;DR: Two new variants of Marcus contextual grammars are introduced: total Marcus contextual grammar with total leftmost derivation, and total Marcus contextual grammar with total leftmost derivation constrained by maximal use of selectors.
Abstract: We propose and study a model of the graphical syllable using Marcus contextual grammars. For this purpose we introduce two new variants of Marcus contextual grammars: total Marcus contextual grammar with total leftmost derivation, and total Marcus contextual grammar with total leftmost derivation constrained by maximal use of selectors.

Journal ArticleDOI
TL;DR: It is demonstrated that some well-known relationships concerning the language families resulting from ordinary ET0L grammars do not hold for forbidding ET0L grammars, and that forbidding ET0L grammars and their versions with conditions of length one are equally powerful.

Journal ArticleDOI
TL;DR: This paper explores the behavior of range concatenation grammars in counting, a domain in which the bad reputation of other classical syntactic formalisms is well known, and arrives at some surprising results.

22 Sep 2003
TL;DR: The improved version of Synapse employs incremental learning based on the rule generation mechanism called inductive CYK algorithm, which generates the minimum production rules required for parsing positive samples, and the form of production rules is extended to include not only A → βγ but also A → β, called extended Chomsky normal form.
Abstract: This paper describes recent improvements in Synapse system [5, 6] for inductive inference of context free grammars from sample strings. For effective inference of grammars, Synapse employs incremental learning based on the rule generation mechanism called inductive CYK algorithm, which generates the minimum production rules required for parsing positive samples. In the improved version, the form of production rules is extended to include not only A → βγ but also A → β, called extended Chomsky normal form, where each of β and γ is either terminal or nonterminal symbol. By this extension and other improvements, Synapse can synthesize both ambiguous grammars and unambiguous grammars with less computation time compared to the previous system.
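The extended Chomsky normal form described above only requires the standard CYK recognizer to take a unit-rule closure of each chart cell, since rules A → β can chain. A minimal recognizer sketch, with a grammar encoding invented here for illustration (Synapse itself searches for such rules rather than parsing with a fixed grammar):

```python
def cyk_extended(w, rules, start):
    """Recognizer for a grammar in extended Chomsky normal form:
    binary rules A -> (b, c) and unit rules A -> (b,), where b and c
    may each be a terminal or a nonterminal symbol."""
    def close(cell):
        # Saturate the cell under unit rules A -> b.
        changed = True
        while changed:
            changed = False
            for A, rhs in rules:
                if len(rhs) == 1 and rhs[0] in cell and A not in cell:
                    cell.add(A)
                    changed = True
        return cell

    n = len(w)
    # Length-1 cells hold the terminal itself plus its unit-rule closure.
    T = {(i, i + 1): close({w[i]}) for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = {A for A, rhs in rules if len(rhs) == 2
                    and any(rhs[0] in T[(i, k)] and rhs[1] in T[(k, j)]
                            for k in range(i + 1, j))}
            T[(i, j)] = close(cell)
    return start in T[(0, n)]

# {a^n b^n : n >= 1}, using both binary rules (with terminal symbols
# on the right-hand side) and a unit rule X -> S:
rules = [('S', ('a', 'T')), ('T', ('S', 'b')),
         ('S', ('a', 'b')), ('X', ('S',))]
```

Keeping terminals directly in the chart cells lets the binary rules mix terminal and nonterminal symbols, which is exactly what the extended form permits.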

22 Sep 2003
TL;DR: This paper explores the learnability of context-free grammars given positive examples and lexical semantics, i.e., assuming the learner has a representation of the meaning of each lexical item.
Abstract: Context-free grammars cannot be identified in the limit from positive examples (Gold, 1967), yet natural language grammars are more powerful than context-free grammars and humans learn them with remarkable ease from positive examples (Marcus, 1993). Identifiability results for formal languages ignore a potentially powerful source of information available to learners of natural languages, namely, meanings. This paper explores the learnability of context-free grammars given positive examples and lexical semantics. That is, the learner has a representation of the meaning of each lexical item.

01 Jan 2003
TL;DR: Conjunctive grammars were introduced in 2000 as a generalization of context-free grammars that allows the use of an explicit intersection operation in rules; several theoretical results on their properties have been obtained, and numerous open problems are proposed.
Abstract: Conjunctive grammars were introduced in 2000 as a generalization of context-free grammars that allows the use of an explicit intersection operation in rules. Several theoretical results on their properties have been obtained since then, and a number of efficient parsing algorithms that justify the practical value of the concept have been developed. This article reviews these results and proposes numerous open problems. 1 Introduction The generative power of context-free grammars is generally considered to be insufficient for denoting many languages that arise in practice: it has often been observed that all natural languages contain non-context-free constructs, while the non-context-freeness of programming languages was proved already in the early 1960s. A review of several widely different subject areas led the authors of [5] to the noteworthy conclusion that "the world seems to be non-context-free". This leaves the aforementioned world with the question of finding an adequate tool for denoting formal languages. As the descriptive means of context-free grammars are necessary but not sufficient for practical use, the attempts at developing new generative devices have usually been made by generalizing context-free grammars in this or that way. However, most of the time an extension that appears to be minor leads to a substantial increase in the generative power (context-sensitive and indexed grammars being good examples), which is usually accompanied by strong and very undesirable complexity hardness results. The ability to encode hard problems makes a formalism, in effect, a peculiar low-level programming language, where writing a grammar resembles coding in assembly

Proceedings ArticleDOI
07 Jul 2003
TL;DR: It is shown that the class of rigid and k-valued NL grammars is unlearnable from strings, for each k; this result is obtained by a specific construction of a limit point in the considered class that does not use the product operator.
Abstract: This paper is concerned with learning categorial grammars in Gold's model. In contrast to k-valued classical categorial grammars, k-valued Lambek grammars are not learnable from strings. This result was shown for several variants, but the question was left open for the weakest one, the non-associative variant NL. We show that the class of rigid and k-valued NL grammars is unlearnable from strings, for each k; this result is obtained by a specific construction of a limit point in the considered class that does not use the product operator. Another interest of our construction is that it provides limit points for the whole hierarchy of Lambek grammars, including the recent pregroup grammars. Such a result aims at clarifying the possible directions for future learning algorithms: it expresses the difficulty of learning categorial grammars from strings and the need for an adequate structure on examples.