
Showing papers on "Tree-adjoining grammar published in 2011"


Journal ArticleDOI
TL;DR: The strong equivalence of non-strict tree adjoining grammars and monadic linear context-free grammars is proved, and the resulting tree language class is characterised logically: a tree language belongs to it iff it is the two-dimensional yield of an MSO-definable three-dimensional tree language.
Abstract: The equivalence of leaf languages of tree adjoining grammars and monadic linear context-free grammars was shown about a decade ago. This paper presents a proof of the strong equivalence of these grammar formalisms. Non-strict tree adjoining grammars and monadic linear context-free grammars define the same class of tree languages. We also present a logical characterisation of this tree language class showing that a tree language is a member of this class iff it is the two-dimensional yield of an MSO-definable three-dimensional tree language.

35 citations


Journal ArticleDOI
TL;DR: This work describes a refined method for grammar convergence and uses it in a major study that recovers the relationships between all the grammars occurring in the different versions of the Java Language Specification.
Abstract: Grammar convergence is a method that helps in discovering relationships between different grammars of the same language or different language versions. The key element of the method is the operational, transformation-based representation of those relationships. Given input grammars for convergence, they are transformed until they are structurally equal. The transformations are composed from primitive operators; properties of these operators and the composed chains provide quantitative and qualitative insight into the relationships between the grammars at hand. We describe a refined method for grammar convergence, and we use it in a major study, where we recover the relationships between all the grammars that occur in the different versions of the Java Language Specification (JLS). The relationships are represented as grammar transformation chains that capture all accidental or intended differences between the JLS grammars. This method is mechanized and driven by nominal and structural differences between pairs of grammars that are subject to asymmetric, binary convergence steps. We present the underlying operator suite for grammar transformation in detail, and we illustrate the suite with many examples of transformations on the JLS grammars. We also describe the extraction effort, which was needed to make the JLS grammars amenable to automated processing. We include substantial metadata about the convergence process for the JLS so that the effort becomes reproducible and transparent.

35 citations


Journal ArticleDOI
TL;DR: A grammatical formalism called DepPattern is described for writing dependency grammars using patterns of Part-of-Speech tags augmented with lexical and morphological information; it inherits ideas from Sinclair's work and Pattern Grammar.
Abstract: In this paper, we describe a grammatical formalism, called DepPattern, to write dependency grammars using patterns of Part of Speech (PoS) tags augmented with lexical and morphological information. The formalism inherits ideas from Sinclair’s work and Pattern Grammar. To properly analyze semi-fixed idiomatic expressions, DepPattern distinguishes between open-choice and idiomatic rules. A grammar is defined as a set of lexical-syntactic rules at different levels of abstraction. In addition, a compiler was implemented so as to generate deterministic and robust parsers from DepPattern grammars. These parsers identify dependencies which can be used to improve corpus-based applications such as information extraction. At the end of this article, we describe an experiment which evaluates the efficiency of a dependency parser generated from a simple DepPattern grammar. In particular, we evaluated the precision of a semantic extraction method making use of a DepPattern-based parser.

34 citations


Journal ArticleDOI
TL;DR: This approach puts the creation and use of 3-D spatial grammars on a more general level and supports designers in defining and applying their own rules in a familiar computer-aided design environment without requiring programming.
Abstract: Spatial grammars are rule based, generative systems for the specification of formal languages. Set and shape grammar formulations of spatial grammars enable the definition of spatial design languages and the creation of alternative designs. Since the introduction of the underlying formalism, they have been successfully applied to different domains including visual arts, architecture, and engineering. Although many spatial grammars exist on paper, only a few, limited spatial grammar systems have been computationally implemented to date; this is especially true for three-dimensional (3-D) systems. Most spatial grammars are hard-coded, that is, once implemented, the vocabulary and rules cannot be changed without reprogramming. This article presents a new approach and prototype implementation for a 3-D spatial grammar interpreter that enables interactive, visual development and application of grammar rules. The method is based on a set grammar that uses a set of parameterized primitives and includes the definition of nonparametric and parametric rules, as well as their automatic application. A method for the automatic matching of the left hand side of a rule in a current working shape, including defining parametric relations, is outlined. A prototype implementation is presented and used to illustrate the approach through three examples: the "kindergarten grammar," vehicle wheel rims, and cylinder cooling fins. This approach puts the creation and use of 3-D spatial grammars on a more general level and supports designers with facilitated definition and application of their own rules in a familiar computer-aided design environment without requiring programming.

34 citations


Dissertation
15 Feb 2011
TL;DR: It is proved that the number of smallest grammars can be exponential in the size of the sequence, and the stability of the discovered structures between minimal grammars is then analysed for real-life examples.
Abstract: Motivated by the goal of discovering hierarchical structures inside DNA sequences, we address the Smallest Grammar Problem, the problem of finding a smallest context-free grammar that generates exactly one sequence. This NP-hard problem has been widely studied for applications like Data Compression, Structure Discovery and Algorithmic Information Theory. From the theoretical point of view, our contribution to this problem is a new formalisation of the Smallest Grammar Problem based on two complementary optimisation problems: the choice of constituents of the final grammar and the choice of how to parse the sequence with these constituents. We give a polynomial time solution for this last problem, which we named the "Minimal Grammar Parsing" problem. This decomposition allows us to define a new complete and correct search space for the Smallest Grammar Problem. Based on this search space, we propose new algorithms able to return grammars 10% smaller than the state of the art on complete genomes. Regarding efficiency, we study different equivalence classes of repeats and introduce an efficient in-place schema to update the suffix array data structure used to compute these words. We conclude this thesis by analysing the applications. For Structure Discovery, we consider the impact of the non-uniqueness of smallest grammars. We prove that the number of smallest grammars can be exponential in the size of the sequence and then analyse the stability of the discovered structures between minimal grammars for real-life examples. With respect to Data Compression, we extend our algorithms to use rigid patterns as words and achieve compression rates up to 25% better than the previous best DNA grammar-based coder.

33 citations


Book ChapterDOI
26 May 2011
TL;DR: A normal form for hyperedge replacement grammars is introduced as a generalisation of the Greibach Normal Form for string grammars, together with the adapted construction needed to support the required concretisations.
Abstract: Heap-based data structures play an important role in modern programming concepts. However, standard verification algorithms cannot cope with the infinite state spaces induced by these structures. A common approach to solve this problem is to apply abstraction techniques. Hyperedge replacement grammars provide a promising technique for heap abstraction, as their production rules can be used to partially abstract and concretise heap structures. To support the required concretisations, we introduce a normal form for hyperedge replacement grammars as a generalisation of the Greibach Normal Form for string grammars, together with the adapted construction.

31 citations
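The Greibach Normal Form that the chapter generalises requires, for string grammars, that every production consist of a single leading terminal followed only by nonterminals (and that there be no erasing rules). A minimal sketch of that string-level condition; the grammar encoding (uppercase symbols as nonterminals, lowercase as terminals, productions as a dict of symbol lists) is a hypothetical simplification:

```python
# Check whether a context-free grammar is in Greibach Normal Form:
# every production must have the shape A -> a B1 ... Bk, i.e. a single
# terminal followed by zero or more nonterminals. The representation is
# a hypothetical simplification: uppercase = nonterminal, lowercase =
# terminal, and each right-hand side is a list of one-character symbols.

def is_nonterminal(sym):
    return sym.isupper()

def in_gnf(productions):
    for lhs, rhss in productions.items():
        for rhs in rhss:
            if not rhs:                      # erasing rule: not allowed
                return False
            head, tail = rhs[0], rhs[1:]
            if is_nonterminal(head):         # must start with a terminal
                return False
            if any(not is_nonterminal(s) for s in tail):
                return False
    return True

# S -> aSB | b ;  B -> b   is in GNF
gnf = {"S": [["a", "S", "B"], ["b"]], "B": [["b"]]}
# S -> Sa | a   has a leading nonterminal, hence is not in GNF
not_gnf = {"S": [["S", "a"], ["a"]]}
```

The leading-terminal shape is what makes GNF useful for concretisation: expanding a nonterminal always exposes at least one concrete symbol, which is the property the chapter lifts from strings to hyperedge replacement.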


Book ChapterDOI
03 Jul 2011
TL;DR: An automated approach is developed that is practically useful in revealing evidence of nonequivalence of grammars and discovering correspondence mappings for grammar nonterminals; two studies are discussed that show how the approach is used in comparing grammars of open source Java parsers as well as grammars from the course work for a compiler construction class.
Abstract: There exist a number of software engineering scenarios that essentially involve equivalence or correspondence assertions for some of the context-free grammars in the scenarios. For instance, when applying grammar transformations during parser development--be it for the sake of disambiguation or grammar-class compliance--one would like to preserve the generated language. Even though equivalence is generally undecidable for context-free grammars, we have developed an automated approach that is practically useful in revealing evidence of nonequivalence of grammars and discovering correspondence mappings for grammar nonterminals. Our approach is based on systematic test data generation and parsing. We discuss two studies that show how the approach is used in comparing grammars of open source Java parsers as well as grammars from the course work for a compiler construction class.

30 citations


Journal ArticleDOI
TL;DR: Picture grammars that rewrite arrays of pixels can be unified and extended by formalizing the right part of a rule as a finite set of permitted tiles; the authors focus on a simple type of tiling, named regional, and define the corresponding regional tile grammars.
Abstract: Several old and recent classes of picture grammars, that variously extend context-free string grammars in two dimensions, are based on rules that rewrite arrays of pixels. Such grammars can be unified and extended using an approach, whereby the right part of a rule is formalized by means of a finite set of permitted tiles. We focus on a simple type of tiling, named regional, and define the corresponding regional tile grammars. They include both Siromoney's (or Matz's) Kolam grammars and their generalization by Průsa, as well as Drewes's grid grammars. Regionally defined pictures can be recognized with polynomial-time complexity by an algorithm extending the CKY one for strings. Regional tile grammars and languages are strictly included into our previous tile grammars and languages, and are incomparable with Giammarresi–Restivo tiling systems (or Wang systems).

29 citations
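The polynomial-time recognition result above extends the CKY algorithm from strings to pictures. As background, here is a minimal string-level CKY recogniser for a grammar in Chomsky normal form; the grammar encoding below is a hypothetical illustration, not the paper's two-dimensional algorithm:

```python
from itertools import product

# Classic CKY recognition for a context-free grammar in Chomsky normal
# form; the picture-recognition algorithm extends this tabulation idea
# to two dimensions. The encoding is hypothetical: `unary` maps a
# terminal to the nonterminals deriving it, `binary` maps a pair (B, C)
# to the nonterminals A with rule A -> B C.

def cky(word, unary, binary, start="S"):
    n = len(word)
    if n == 0:
        return False                      # CNF grammars derive no empty word here
    # table[i][j] holds the nonterminals deriving word[i:j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(word):
        table[i][i] = set(unary.get(a, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):         # split point between i..k and k+1..j
                for B, C in product(table[i][k], table[k + 1][j]):
                    table[i][j].update(binary.get((B, C), ()))
    return start in table[0][n - 1]

# CNF grammar for {a^n b^n : n >= 1}:
# S -> A T | A B,  T -> S B,  A -> a,  B -> b
unary = {"a": {"A"}, "b": {"B"}}
binary = {("A", "T"): {"S"}, ("A", "B"): {"S"}, ("S", "B"): {"T"}}
```

The cubic-time tabulation over substrings is exactly what the regional variant generalises to subpictures, which is why recognition stays polynomial.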


Book ChapterDOI
01 Jan 2011
TL;DR: A general model for various mechanisms of regulated rewriting based on the applicability of rules is introduced, covering in particular graph-controlled, programmed, matrix, random context, and ordered grammars as well as some basic variants of grammar systems.
Abstract: We introduce a general model for various mechanisms of regulated rewriting based on the applicability of rules, especially we consider graph-controlled, programmed, matrix, random context, and ordered grammars as well as some basic variants of grammar systems. Most of the general relations between graph-controlled grammars, matrix grammars, random-context grammars, and ordered grammars established in this paper are independent from the objects and the kind of rules and only based on the notion of applicability of rules within the different regulating mechanisms and their specific structure in allowing sequences of rules to be applied. For example, graph-controlled grammars are always at least as powerful as programmed and matrix grammars. For the simulation of random context and ordered grammars by matrix and graph-controlled grammars, some specific requirements have to be fulfilled by the types of rules.

23 citations


Book ChapterDOI
05 Oct 2011
TL;DR: This paper demonstrates how existing distributional learning techniques for context-free grammars can be adapted to simple context-free tree grammars in a straightforward manner once the necessary notions and properties for string languages have been redefined for trees.
Abstract: This paper demonstrates how existing distributional learning techniques for context-free grammars can be adapted to simple context-free tree grammars in a straightforward manner once the necessary notions and properties for string languages have been redefined for trees. Distributional learning is based on the decomposition of an object into a substructure and the remaining structure, and on their interrelations. A corresponding learning algorithm can emulate those relations in order to determine a correct grammar for the target language.

20 citations


Journal ArticleDOI
TL;DR: The compressed membership problem for one-nonterminal conjunctive grammars over {a} is proved to be EXPTIME-complete; the same problem for context-free grammars is decidable in NLOGSPACE, but becomes NP-complete if the grammar is compressed as well.
Abstract: Conjunctive grammars over an alphabet Σ={a} are studied, with the focus on the special case with a unique nonterminal symbol. Such a grammar is equivalent to an equation X=ϕ(X) over sets of natural numbers, using union, intersection and addition. It is shown that every grammar with multiple nonterminals can be encoded into a grammar with a single nonterminal, with a slight modification of the language. Based on this construction, the compressed membership problem for one-nonterminal conjunctive grammars over {a} is proved to be EXPTIME-complete; the same problem for the context-free grammars is decidable in NLOGSPACE, but becomes NP-complete if the grammar is compressed as well. The equivalence problem for these grammars is shown to be co-r.e.-complete, both finiteness and co-finiteness are r.e.-complete, while equivalence to a fixed unary language with a regular positional notation is decidable.
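The correspondence above, between a one-nonterminal conjunctive grammar over {a} and an equation X = ϕ(X) over sets of natural numbers with union, intersection and addition, can be made concrete by bounded fixpoint iteration. A sketch with a made-up example equation, not one from the paper:

```python
# A one-nonterminal conjunctive grammar over {a} corresponds to an
# equation X = phi(X) over sets of natural numbers, built from union,
# intersection and elementwise addition. This sketch computes the least
# solution restricted to numbers below a bound; the concrete phi is a
# hypothetical illustration.

def add(s, t, bound):
    # elementwise addition of two sets, truncated at the bound
    return {x + y for x in s for y in t if x + y < bound}

def least_solution(phi, bound, iterations=1000):
    # Kleene iteration from the empty set; phi is assumed monotone
    x = set()
    for _ in range(iterations):
        nxt = {n for n in phi(x) if n < bound}
        if nxt == x:
            return x
        x = nxt
    return x

BOUND = 64
# Example equation: X = {2} ∪ (X + X); its least solution is the set of
# even numbers >= 2 (every even n >= 4 splits as 2 + (n - 2)).
phi = lambda x: {2} | add(x, x, BOUND)
evens = least_solution(phi, BOUND)
```

Truncating at a bound is only a device for finite exploration; the paper's decidability and hardness results concern the full, unbounded sets.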

Journal ArticleDOI
TL;DR: Ellul, Krawetz, Shallit and Wang prove an exponential lower bound on the size of any context-free grammar generating the language of all permutations over some alphabet, and obtain exponential lower bounds for many other languages.

Proceedings ArticleDOI
01 Jan 2011
TL;DR: This work presents an algorithm that uses indexed linear tree grammars (ILTGs) both to describe the input set and to compute the set that approximates the collecting semantics, enabling a more precise binding analysis than afforded by regular grammars.
Abstract: The collecting semantics of a program defines the strongest static property of interest. We study the analysis of the collecting semantics of higher-order functional programs, cast as left-linear term rewriting systems. The analysis generalises functional flow analysis and the reachability problem for term rewriting systems, which are both undecidable. We present an algorithm that uses indexed linear tree grammars (ILTGs) both to describe the input set and compute the set that approximates the collecting semantics. ILTGs are equi-expressive with pushdown tree automata, and so, strictly more expressive than regular tree grammars. Our result can be seen as a refinement of Jones and Andersen's procedure, which uses regular tree grammars. The main technical innovation of our algorithm is the use of indices to capture (sets of) substitutions, thus enabling a more precise binding analysis than afforded by regular grammars. We give a simple proof of termination and soundness, and demonstrate that our method is more accurate than other approaches to functional flow and reachability analyses in the literature.

Book ChapterDOI
03 Jul 2011
TL;DR: This work integrates rich static types (including parametric polymorphism, typed distinctions between decorated and undecorated trees, limited type inference, and generalized algebraic data-types) and pattern-matching into attribute grammars, while maintaining familiar and convenient attribute grammar notations and especially their highly extensible nature.
Abstract: While attribute grammars have several features making them advantageous for specifying language processing tools, functional programming languages offer a myriad of features also well-suited for such tasks. Much other work shows the close relationship between these two approaches, often in the form of embedding attribute grammars into lazy functional languages. This paper continues in this tradition, but in the other direction, by integrating various functional language features into attribute grammars. Specifically we integrate rich static types (including parametric polymorphism, typed distinctions between decorated and undecorated trees, limited type inference, and generalized algebraic data-types) and pattern-matching, all in a manner that maintains familiar and convenient attribute grammar notations and especially their highly extensible nature.

Journal ArticleDOI
TL;DR: A new perspective on the smallest grammar problem is proposed by splitting it into two tasks: choosing which words will be the constituents of the grammar and searching for the smallest grammar given this set of constituents.
Abstract: The smallest grammar problem—namely, finding a smallest context-free grammar that generates exactly one sequence—is of practical and theoretical importance in fields such as Kolmogorov complexity, data compression and pattern discovery. We propose a new perspective on this problem by splitting it into two tasks: (1) choosing which words will be the constituents of the grammar and (2) searching for the smallest grammar given this set of constituents. We show how to solve the second task in polynomial time, parsing longer constituents with smaller ones. We propose new algorithms based on classical practical algorithms that use this optimization to find small grammars. Our algorithms consistently find smaller grammars on a classical benchmark, reducing the size by 10% in some cases. Moreover, our formulation allows us to define interesting bounds on the number of small grammars and to empirically compare different grammars of small size.
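The second task, finding a smallest parse of the sequence once the constituent set is fixed, can be illustrated with a simple dynamic program over positions (a shortest path in the parse graph). This is a simplified illustration of the idea, not the authors' algorithm, and the constituent set below is hypothetical:

```python
# Given a sequence and a fixed set of constituent words, find a parse
# of the sequence that uses the fewest constituent occurrences, by
# dynamic programming over positions. A simplified illustration of the
# "parse the sequence with a chosen set of constituents" subproblem.

def minimal_parse(sequence, constituents):
    n = len(sequence)
    INF = float("inf")
    best = [INF] * (n + 1)   # best[i]: fewest words covering sequence[:i]
    back = [None] * (n + 1)  # back[i]: last word of an optimal cover
    best[0] = 0
    for i in range(n):
        if best[i] == INF:
            continue
        for w in constituents:
            if sequence.startswith(w, i) and best[i] + 1 < best[i + len(w)]:
                best[i + len(w)] = best[i] + 1
                back[i + len(w)] = w
    if best[n] == INF:
        return None          # sequence cannot be covered at all
    parse, i = [], n
    while i > 0:             # recover the parse from the back-pointers
        parse.append(back[i])
        i -= len(back[i])
    return parse[::-1]

# Hypothetical constituents; the single letters guarantee a parse exists
# for any word over {a, b}.
words = {"a", "b", "ab", "aba", "ba"}
```

With all constituents of length at most L, the loop does O(n·|constituents|) prefix checks, which is the polynomial behaviour the abstract refers to.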

Proceedings ArticleDOI
22 Sep 2011
TL;DR: This work presents a novel method of embedding context-free grammars in Haskell and automatically generating parsers and pretty-printers from them; it supports adding anti-quotation to the generated quasi-quoters, which allows users of the defined language to mix concrete and abstract syntax almost seamlessly.
Abstract: We present a novel method of embedding context-free grammars in Haskell and automatically generating parsers and pretty-printers from them. We have implemented this method in a library called BNFC-meta (from the BNF Converter, which it is built on). The library builds compiler front ends using metaprogramming instead of conventional code generation. Parsers are built from labelled BNF grammars that are defined directly in Haskell modules. Our solution combines features of parser generators (static grammar checks, a highly specialised grammar DSL) and adds several features that are otherwise exclusive to combinator libraries, such as the ability to reuse, parameterise and generate grammars inside Haskell. To allow writing grammars in concrete syntax, BNFC-meta provides a quasi-quoter that can parse grammars (embedded in Haskell files) at compile time and use metaprogramming to replace them with their abstract syntax. We also generate quasi-quoters so that the languages we define with BNFC-meta can be embedded in the same way. With a minimal change to the grammar, we support adding anti-quotation to the generated quasi-quoters, which allows users of the defined language to mix concrete and abstract syntax almost seamlessly. Unlike previous methods of achieving anti-quotation, the method used by BNFC-meta is simple, efficient and avoids polluting the abstract syntax types.

Journal ArticleDOI
TL;DR: It is demonstrated that without erasing rules, one-sided random context grammars characterize the family of context-sensitive languages, and with erasing rules, these grammars characterize the family of recursively enumerable languages.
Abstract: The notion of a one-sided random context grammar is defined as a context-free-based regulated grammar, in which a set of permitting symbols and a set of forbidding symbols are attached to every rule, and its set of rules is divided into the set of left random context rules and the set of right random context rules. A left random context rule can rewrite a nonterminal if each of its permitting symbols occurs to the left of the rewritten symbol in the current sentential form while each of its forbidding symbols does not occur there. A right random context rule is applied analogically except that the symbols are examined to the right of the rewritten symbol. The paper demonstrates that without erasing rules, one-sided random context grammars characterize the family of context-sensitive languages, and with erasing rules, these grammars characterize the family of recursively enumerable languages. In fact, these characterization results hold even if the set of left random context rules coincides with the set of right random context rules. Several special cases of these grammars are considered, and their generative power is established. In its conclusion, some important open problems are suggested to study in the future.
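The applicability condition of a left random context rule described above can be sketched directly; the encoding of a sentential form as a string of one-character symbols is a hypothetical simplification:

```python
# Applicability of a left random context rule: the rule may rewrite the
# nonterminal `lhs` at position i only if every permitting symbol occurs
# to the left of position i in the sentential form, and no forbidding
# symbol occurs there. A right random context rule would examine the
# suffix sentential[i+1:] instead.

def left_rule_applicable(sentential, i, lhs, permitting, forbidding):
    if sentential[i] != lhs:
        return False
    left_context = set(sentential[:i])
    return permitting <= left_context and not (forbidding & left_context)

# Hypothetical rule: rewrite B, permitting {A}, forbidding {C}.
# In "AABC" at position 2 the rule applies (A occurs left, C does not);
# in "CAB" at position 2 it does not (C occurs to the left).
```

Note that only the presence of symbols in the context is examined, not their order or multiplicity, which matches the definition in the abstract.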

Book ChapterDOI
26 Mar 2011
TL;DR: This paper proposes a point-free language of dependent grammars, believed to correspond closely to existing context-free parsing algorithms, and gives a novel transformation from conventional dependent grammars to point-free ones.
Abstract: Dependent grammars extend context-free grammars by allowing semantic values to be bound to variables and used to constrain parsing. Dependent grammars can cleanly specify common features that cannot be handled by context-free grammars, such as length fields in data formats and significant indentation in programming languages. Few parser generators support dependent parsing, however. To address this shortcoming, we have developed a new method for implementing dependent parsers by extending existing parsing algorithms. Our method proposes a point-free language of dependent grammars, which we believe closely corresponds to existing context-free parsing algorithms, and gives a novel transformation from conventional dependent grammars to point-free ones. To validate our technique, we have specified the semantics of both source and target dependent grammar languages, and proven our transformation sound and complete with respect to those semantics. Furthermore, we have empirically validated the suitability of our point-free language by adapting four parsing engines to support it: an Earley parsing engine; a GLR parsing engine; memoizing, arrow-style parser combinators; and PEG parser combinators.
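The length-field example above is exactly what dependent parsing buys over context-free rules: a semantic value produced earlier in the parse constrains what may be parsed next. A tiny hand-rolled sketch of that binding (not the paper's point-free language or any of the adapted engines):

```python
# Dependent parsing in miniature: a parsed value (a length field n)
# constrains the rest of the parse, which no context-free rule can
# express. The combinators are a hypothetical sketch.

def digit(s, i):
    # parse one decimal digit at position i; return (value, next
    # position) on success, None on failure
    return (int(s[i]), i + 1) if i < len(s) and s[i].isdigit() else None

def length_prefixed(s, i=0):
    # parse a digit n, then exactly n further characters as the payload
    head = digit(s, i)
    if head is None:
        return None
    n, j = head              # n is bound and used to drive the parse
    if j + n > len(s):
        return None          # payload shorter than the length field
    return s[j:j + n], j + n

# "3abc" parses to the payload "abc"; "3ab" fails because only two
# payload characters follow the length field.
```

In the paper's terms, binding `n` and using it downstream is what the point-free transformation has to express without named variables.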

Proceedings ArticleDOI
03 Apr 2011
TL;DR: It is shown how alternative representations from graph theory, including graphs, overcomplete graphs and hyperedge graphs, can support some of the intuitions handled in shape grammars by direct visual computations with shapes.
Abstract: An implementation of a shape grammar interpreter is described. The underlying graph-theoretic framework is briefly discussed to show how alternative representations from graph theory including graphs, overcomplete graphs and hyperedge graphs can support some of the intuitions handled in shape grammars by direct visual computations with shapes. The resulting plugin implemented in Rhino, code-named GRAPE, is briefly described in the end.

Book ChapterDOI
11 Jul 2011
TL;DR: A correction framework is proposed that involves structural repairs of elements with respect to single type tree grammars, together with an efficient algorithm and a prototype implementation.
Abstract: XML documents and related technologies represent a widely accepted standard for managing semi-structured data. However, a surprisingly high number of XML documents is affected by well-formedness errors, structural invalidity or data inconsistencies. The aim of this paper is the proposal of a correction framework involving structural repairs of elements with respect to single type tree grammars. Via the inspection of the state space of a finite automaton recognising regular expressions, we are always able to find all minimal repairs against a defined cost function. These repairs are compactly represented by shortest paths in recursively nested multigraphs, which can be translated to particular sequences of edit operations altering XML trees. We have proposed an efficient algorithm and provided a prototype implementation.

Journal ArticleDOI
TL;DR: This paper studies the complexity of the classical problem of deciding whether a string belongs to the language generated by any attribute grammar from a given class C, and shows that even in the most general case the problem is in polynomial space.

Journal ArticleDOI
TL;DR: It is found that there are NP-hard grammars among non-local MCTAGs even if any or all of the following restrictions are imposed: lexicalization and dominance links.
Abstract: An NP-hardness proof for non-local Multicomponent Tree Adjoining Grammar (MCTAG) by Rambow and Satta (1st International Workshop on Tree Adjoining Grammers 1992), based on Dahlhaus and Warmuth (in J Comput Syst Sci 33:456---472, 1986), is extended to some linguistically relevant restrictions of that formalism. It is found that there are NP-hard grammars among non-local MCTAGs even if any or all of the following restrictions are imposed: (i) lexicalization: every tree in the grammar contains a terminal; (ii) dominance links: every tree set contains at most two trees, and in every such tree set, there is a link between the foot node of one tree and the root node of the other tree, indicating that the former node must dominate the latter in the derived tree. This is the version of MCTAG proposed in Becker et al. (Proceedings of the 5th conference of the European chapter of the Association for Computational Linguistics 1991) to account for German long-distance scrambling. This result restricts the field of possible candidates for an extension of Tree Adjoining Grammar that would be both mildly context-sensitive and linguistically adequate.


Proceedings Article
05 Oct 2011
TL;DR: This work introduces a formulation of synchronous tree-adjoining grammars which is effectively closed under input and output restrictions to regular tree languages, i.e., the restricted translations can again be represented by grammars.
Abstract: Restricting the input or the output of a grammar-induced translation to a given set of trees plays an important role in statistical machine translation. The problem for practical systems is to find a compact (and in particular, finite) representation of said restriction. For the class of synchronous tree-adjoining grammars, partial solutions to this problem have been described, some being restricted to the unweighted case, some to the monolingual case. We introduce a formulation of this class of grammars which is effectively closed under input and output restrictions to regular tree languages, i.e., the restricted translations can again be represented by grammars. Moreover, we present an algorithm that constructs these grammars for input and output restriction, which is inspired by Earley's algorithm.

Book ChapterDOI
23 May 2011
TL;DR: This work refines the relationships among the classes of languages generated by (regular) pure 2D context-free grammars, regional tile grammars, Průsa grammars and Local languages, and states some considerations about closure properties of (regular) pure 2D context-free languages.
Abstract: Many formal models have been proposed to recognize or to generate two-dimensional words. In this paper, we focus our analysis on (regular) pure 2D context-free grammars, regional tile grammars and Průsa grammars, showing that, although they have been proposed as generalizations of string context-free grammars, their expressiveness is different. This work refines the relationship among the classes of languages generated by the above grammars and Local languages and states some considerations about closure properties of (regular) pure 2D context-free languages.

Journal ArticleDOI
TL;DR: It is proved that bounding the number of nonterminals in tree controlled grammars without erasing rules leads to an infinite hierarchy of families of tree controlled languages, while every recursively enumerable language can be generated by a tree controlled grammar with erasing rules and at most nine nonterminals.

Proceedings Article
19 Jun 2011
TL;DR: A new approach to checking treebank consistency is introduced, based on a variant of Tree Adjoining Grammar; it overcomes the problems of earlier approaches that used strings of words rather than tree structure to identify the appropriate contexts for comparison.
Abstract: This work introduces a new approach to checking treebank consistency. Derivation trees based on a variant of Tree Adjoining Grammar are used to compare the annotation of word sequences based on their structural similarity. This overcomes the problems of earlier approaches based on using strings of words rather than tree structure to identify the appropriate contexts for comparison. We report on the result of applying this approach to the Penn Arabic Treebank and how this approach leads to high precision of error detection.

Proceedings Article
15 Nov 2011
TL;DR: A straightforward and structure-preserving coding pattern is described for encoding arbitrary non-circular attribute grammars as syntax-directed translation schemes for bottom-up parser generation tools, making possible the direct implementation of attribute grammar-based specifications using widely used translation scheme-driven tools for the development of bottom-up language translators.
Abstract: This article describes a straightforward and structure-preserving coding pattern to encode arbitrary non-circular attribute grammars as syntax-directed translation schemes for bottom-up parser generation tools. According to this pattern, a bottom-up oriented translation scheme is systematically derived from the original attribute grammar. Semantic actions attached to each syntax rule are written in terms of a small repertory of primitive attribution operations. By providing alternative implementations for these attribution operations, it is possible to plug in different semantic evaluation strategies in a seamless way (e.g., a demand-driven strategy, or a data-driven one). The pattern makes possible the direct implementation of attribute grammar-based specifications using widely used translation scheme-driven tools for the development of bottom-up language translators (e.g. YACC, BISON, CUP, etc.). As a consequence, this initial coding can be subsequently refined to yield final efficient implementations. Since these implementations still preserve the ability to be extended with new features described at the attribute grammar level, the advantages from the point of view of development and maintenance become apparent.
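The pluggable demand-driven strategy mentioned above can be sketched as attribute memoisation on tree nodes; the expression AST and the `value` attribute below are hypothetical illustrations, not the article's coding pattern or its attribution operations:

```python
# Demand-driven evaluation of a synthesized attribute: each attribute
# instance is computed only when first requested and cached afterwards.
# Swapping the body of `attr` (e.g. for an eager, data-driven pass over
# the tree) is the kind of strategy substitution the abstract describes.

class Node:
    def __init__(self, op=None, left=None, right=None, const=None):
        self.op, self.left, self.right, self.const = op, left, right, const
        self._cache = {}

    def attr(self, name):
        # demand-driven: compute on first request, then reuse the cache
        if name not in self._cache:
            self._cache[name] = ATTRIBUTION[name](self)
        return self._cache[name]

def value(node):
    # synthesized attribute: a leaf yields its constant, an inner node
    # combines the values demanded from its children
    if node.const is not None:
        return node.const
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    return ops[node.op](node.left.attr("value"), node.right.attr("value"))

ATTRIBUTION = {"value": value}

# the expression (2 + 3) * 4
tree = Node("*", Node("+", Node(const=2), Node(const=3)), Node(const=4))
```

Because the attribute grammar is non-circular, every demand bottoms out at a leaf, so the memoised recursion terminates.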

Journal ArticleDOI
TL;DR: This work presents an alternative first-order functional interpretation of attribute grammars where the input tree is replaced with an extended cyclic tree each node of which is aware of its context viewed as an additional child tree.

Book ChapterDOI
Pierre Bourreau, Sylvain Salvati
06 Sep 2011
TL;DR: An efficient algorithm based on Datalog programming was presented in [Kan07] for context-free grammars of almost linear λ-terms, which are linear λ-terms augmented with a restricted form of copy.
Abstract: The recent emergence of linguistic formalisms exclusively based on the simply-typed λ-calculus to represent both syntax and semantics led to the presentation of innovative techniques which apply to both the problems of parsing and generating natural languages. A common feature of these techniques consists in using strong relations between typing properties and syntactic structures of families of simply-typed λ-terms. Among significant results, an efficient algorithm based on Datalog programming is presented in [Kan07] for context-free grammars of almost linear λ-terms, which are linear λ-terms augmented with a restricted form of copy. We present an extension of this method to terms for which deletion is allowed.