
Showing papers on "Context-sensitive grammar published in 2010"


Book
25 Aug 2010
TL;DR: This book provides an extensive overview of the formal language landscape between CFG and PTIME, moving from Tree Adjoining Grammars to Multiple Context-Free Grammars and then to Range Concatenation Grammars while explaining available parsing techniques for these formalisms.
Abstract: Given that context-free grammars (CFG) cannot adequately describe natural languages, grammar formalisms beyond CFG that are still computationally tractable are of central interest for computational linguists. This book provides an extensive overview of the formal language landscape between CFG and PTIME, moving from Tree Adjoining Grammars to Multiple Context-Free Grammars and then to Range Concatenation Grammars while explaining available parsing techniques for these formalisms. Although familiarity with the basic notions of parsing and formal languages is helpful when reading this book, it is not a strict requirement. The presentation is supported with many illustrations and examples relating to the different formalisms and algorithms, and chapter summaries, problems and solutions. The book will be useful for students and researchers in computational linguistics and in formal language theory.
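As a quick illustration of why the landscape beyond CFG matters (a sketch of ours, not material from the book): the canonical non-context-free language {a^n b^n c^n} is trivially decidable in linear time, even though no context-free grammar generates it.

```python
def in_anbncn(s: str) -> bool:
    """Recognize {a^n b^n c^n : n >= 0}, a mildly context-sensitive
    language that no context-free grammar can generate."""
    n, r = divmod(len(s), 3)
    if r != 0:
        return False
    # The word must be exactly n a's, then n b's, then n c's.
    return s == "a" * n + "b" * n + "c" * n
```

Formalisms such as TAG and MCFG are designed precisely to capture patterns like this while staying polynomially parsable.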

134 citations


Proceedings ArticleDOI
17 Jan 2010
TL;DR: The design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications is presented and its use on examples ranging from difficult programming language grammars to web server logs to binary data specification is illustrated.
Abstract: We present the design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications. In particular, our new parsing engine handles (1) full scannerless context-free grammars with (2) regular expressions as right-hand sides for defining nonterminals. YAKKER also includes (3) facilities for binding variables to intermediate parse results and (4) using such bindings within arbitrary constraints to control parsing. These facilities allow the kind of data-dependent parsing commonly needed in systems applications, particularly those that operate over binary data. In addition, (5) nonterminals may be parameterized by arbitrary values, which gives the system good modularity and abstraction properties in the presence of data-dependent parsing. Finally, (6) legacy parsing libraries, such as sophisticated libraries for dates and times, may be directly incorporated into parser specifications. We illustrate the importance and utility of this rich collection of features by presenting its use on examples ranging from difficult programming language grammars to web server logs to binary data specification. We also show that our grammars have important compositionality properties and explain why such properties are important in modern applications such as automatic grammar induction. In terms of technical contributions, we provide a traditional high-level semantics for our new grammar formalization and show how to compile grammars into nondeterministic automata. These automata are stack-based, somewhat like conventional push-down automata, but are also equipped with environments to track data-dependent parsing state. We prove the correctness of our translation of data-dependent grammars into these new automata and then show how to implement the automata efficiently using a variation of Earley's parsing algorithm.
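The data-dependent parsing described above can be illustrated with a netstring-style sketch (ours, not code from the paper): a parsed length value is bound and then constrains how much payload is consumed.

```python
def parse_length_prefixed(data: bytes, pos: int = 0):
    """Toy data-dependent parse: a decimal length field, a ':' separator,
    then exactly that many payload bytes (netstring-style).
    Returns (payload, next_position); raises ValueError on malformed input."""
    end = data.index(b":", pos)      # find the separator (ValueError if absent)
    n = int(data[pos:end])           # bind the parsed length...
    start = end + 1
    if len(data) - start < n:
        raise ValueError("payload shorter than declared length")
    # ...and use the binding to control how much input the parser takes.
    return data[start:start + n], start + n
```

No pure context-free grammar can express "read N, then consume N bytes"; this is the kind of constraint YAKKER's variable bindings are meant to handle.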

61 citations


Book ChapterDOI
13 Sep 2010
TL;DR: This work presents a learning algorithm for context free grammars which uses positive data and membership queries, and proves its correctness under the identification in the limit paradigm.
Abstract: The Syntactic Concept Lattice is a residuated lattice based on the distributional structure of a language; the natural representation based on this is a context sensitive formalism. Here we examine the possibility of basing a context free grammar (CFG) on the structure of this lattice; in particular by choosing non-terminals to correspond to concepts in this lattice. We present a learning algorithm for context free grammars which uses positive data and membership queries, and prove its correctness under the identification in the limit paradigm. Since the lattice itself may be infinite, we consider only a polynomially bounded subset of the set of concepts, in order to get an efficient algorithm. We compare this on the one hand to learning algorithms for context free grammars, where the non-terminals correspond to congruence classes, and on the other hand to the use of context sensitive techniques such as Binary Feature Grammars and Distributional Lattice Grammars. The class of CFGs that can be learned in this way includes inherently ambiguous and thus non-deterministic languages; this approach therefore breaks through an important barrier in CFG inference.

34 citations


Journal ArticleDOI
TL;DR: A simple, direct proof of the fact that second-order ACGs are simulated by hyperedge replacement grammars is given, which implies that the string and tree generating power of the former is included in that of the latter.
Abstract: Second-order abstract categorial grammars (de Groote in Association for computational linguistics, 39th annual meeting and 10th conference of the European chapter, proceedings of the conference, pp. 148–155, 2001) and hyperedge replacement grammars (Bauderon and Courcelle in Math Syst Theory 20:83–127, 1987; Habel and Kreowski in STACS 87: 4th Annual symposium on theoretical aspects of computer science. Lecture notes in computer science, vol 247, Springer, Berlin, pp 207–219, 1987) are two natural ways of generalizing "context-free" grammar formalisms for string and tree languages. It is known that the string generating power of both formalisms is equivalent to (non-erasing) multiple context-free grammars (Seki et al. in Theor Comput Sci 88:191–229, 1991) or linear context-free rewriting systems (Weir in Characterizing mildly context-sensitive grammar formalisms, University of Pennsylvania, 1988). In this paper, we give a simple, direct proof of the fact that second-order ACGs are simulated by hyperedge replacement grammars, which implies that the string and tree generating power of the former is included in that of the latter. The normal form for tree-generating hyperedge replacement grammars given by Engelfriet and Maneth (Graph transformation. Lecture notes in computer science, vol 1764. Springer, Berlin, pp 15–29, 2000) can then be used to show that the tree generating power of second-order ACGs is exactly the same as that of hyperedge replacement grammars.

25 citations


Journal ArticleDOI
TL;DR: In adaptive star grammars, rules are actually schemata which, via the cloning of so-called multiple nodes, may adapt to potentially infinitely many contexts when they are applied, and they turn out to be restricted enough to share some of the basic characteristics of context-free devices.

24 citations


Journal ArticleDOI
TL;DR: If it is furthermore required that each rule of the general form A->w has a nonempty w, then a substantial subfamily of conjunctive languages can be generated, yet it remains unknown whether such grammars are as powerful as conjunctive grammars of the general form.

21 citations


Book ChapterDOI
06 Jul 2010
TL;DR: Three open questions in the theory of regulated rewriting are addressed, including whether every permitting random context grammar has a non-erasing equivalent and whether permitting random context grammars have the same generative capacity as matrix grammars without appearance checking.
Abstract: Three open questions in the theory of regulated rewriting are addressed. The first is whether every permitting random context grammar has a non-erasing equivalent. The second asks whether the same is true for matrix grammars without appearance checking. The third concerns whether permitting random context grammars have the same generative capacity as matrix grammars without appearance checking. The main result is a positive answer to the first question. For the other two, conjectures are presented. It is then deduced from the main result that at least one of the two holds.
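A permitting random context rule can be sketched as follows (an illustrative toy with single-character symbols, not code from the paper): a rule (A, w, P) may rewrite an occurrence of A only if every symbol of the permitting set P appears in the current sentential form.

```python
def applicable(form, rule):
    """A permitting random context rule (A, w, P) may rewrite an occurrence
    of A in the sentential form only if every symbol of the permitting
    set P occurs somewhere in the form."""
    lhs, _, permitting = rule
    return lhs in form and all(sym in form for sym in permitting)

def apply_rule(form, rule):
    """Rewrite the leftmost occurrence of the rule's left-hand side."""
    lhs, rhs, _ = rule
    i = form.index(lhs)
    return form[:i] + rhs + form[i + 1:]
```

The non-erasing question above asks whether rules with empty w (erasing productions) add generative power to this mechanism.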

20 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that the membership problem for second-order non-linear abstract categorial grammars is decidable, and that Montague-like semantics yield a decidable text generation problem.
Abstract: In this paper we show that the membership problem for second-order non-linear Abstract Categorial Grammars is decidable. A consequence of that result is that Montague-like semantics yield a decidable text generation problem. Furthermore, the proof we propose is based on a new tool, Higher Order Intersection Signatures, which statically captures dynamic properties of λ-terms and is of interest in its own right.

16 citations


Journal ArticleDOI
TL;DR: It is shown that a graph grammar can be translated into an Event-B specification preserving its semantics, such that one can use several theorem provers available for Event- B to analyze the reachable states of the original graph grammar.
Abstract: Graph grammars may be used as a specification technique for different kinds of systems, especially in situations in which states are complex structures that can be adequately modeled as graphs (possibly with an attribute data part) and in which the behavior involves a large amount of parallelism and can be described as reactions to stimuli that can be observed in the state of the system. The verification of properties of such systems is a difficult task due to many aspects: in many situations the systems have an infinite number of states; states themselves are complex and large; there are a number of different computation possibilities due to the fact that rule applications may occur in parallel. There are already some approaches to verification of graph grammars based on model checking, but in these cases only finite-state systems can be analyzed. Other approaches propose over- and/or under-approximations of the state space, but in this case it is not possible to check arbitrary properties. In this work, we propose to use the Event-B formal method and its theorem proving tools to analyze graph grammars. We show that a graph grammar can be translated into an Event-B specification preserving its semantics, such that one can use several theorem provers available for Event-B to analyze the reachable states of the original graph grammar. The translation is based on a relational definition of graph grammars, which was shown to be equivalent to the Single-Pushout approach to graph grammars.

16 citations


Journal ArticleDOI
TL;DR: It is explained how making this distinction obviates the need for directed types in type-theoretic grammars and a simple grammatical formalism is sketched in which representations at all levels are lambda terms.
Abstract: This paper argues for the idea that in describing language we should follow Haskell Curry in distinguishing between the structure of an expression and its appearance or manifestation. It is explained how making this distinction obviates the need for directed types in type-theoretic grammars and a simple grammatical formalism is sketched in which representations at all levels are lambda terms. The lambda term representing the abstract structure of an expression is homomorphically translated to a lambda term representing its manifestation, but also to a lambda term representing its semantics.

14 citations


Book ChapterDOI
21 Jun 2010
TL;DR: This article presents a framework for grammars and grammar transformations using Agda, and implements the left-corner transformation for left-recursion removal and proves a language-inclusion property as use cases.
Abstract: Parser combinators are a popular tool for designing parsers in functional programming languages. If such combinators generate an abstract representation of the grammar as an intermediate step, it becomes easier to perform analyses and transformations that can improve the behaviour of the resulting parser. Grammar transformations must satisfy a number of invariants. In particular, they have to preserve the semantics associated with the grammar. Using conventional type systems, these constraints cannot be expressed satisfactorily, but as we show in this article, dependent types are a natural fit. We present a framework for grammars and grammar transformations using Agda. We implement the left-corner transformation for left-recursion removal and prove a language-inclusion property as use cases.
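The combinator style this paper builds on can be sketched in a few lines of Python (an illustrative sketch only; the paper itself works in Agda with dependent types). A parser here maps an input position to the set of positions it can reach, so ambiguity falls out naturally; note that naive combinators like these loop on left-recursive grammars, which is exactly what the left-corner transformation mentioned above removes.

```python
# A parser is a function (input string, position) -> set of reachable positions.
def lit(c):
    """Match a single literal character."""
    return lambda s, i: {i + 1} if i < len(s) and s[i] == c else set()

def seq(p, q):
    """Sequence: run q from every position p can reach."""
    return lambda s, i: {k for j in p(s, i) for k in q(s, j)}

def alt(p, q):
    """Alternation: union of both parsers' results."""
    return lambda s, i: p(s, i) | q(s, i)

def a_plus(s, i):
    """A -> 'a' A | 'a'  (right recursion is fine; left recursion is not)."""
    return alt(seq(lit("a"), a_plus), lit("a"))(s, i)

def parses(p, s):
    """Accept iff some parse consumes the entire input."""
    return len(s) in p(s, 0)
```

An abstract grammar representation, as the paper advocates, makes transformations like left-recursion removal possible before the combinators are run.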

Journal ArticleDOI
TL;DR: It is proved that twelve nonterminals are enough for cooperating distributed grammar systems working in the terminal derivation mode with two left-forbidding components (including erasing productions) to characterize the family of recursively enumerable languages.

Book ChapterDOI
18 Jan 2010
TL;DR: A lazy-evaluation based top-down parsing algorithm has been implemented as a set of higher-order functions (combinators) which support directly-executable specifications of fully general attribute grammars.
Abstract: A lazy-evaluation based top-down parsing algorithm has been implemented as a set of higher-order functions (combinators) which support directly-executable specifications of fully general attribute grammars. This approach extends aspects of previous approaches, and allows natural language processors to be constructed as modular and declarative specifications while accommodating ambiguous context-free grammars (including direct and indirect left-recursive rules), augmented with semantic rules with arbitrary attribute dependencies (including dependencies from right). This one-pass syntactic and semantic analysis method has polynomial time and space (w.r.t. the input length) for processing ambiguous input, and helps language developers build and test their models with little concern for the underlying computational methods.

Journal ArticleDOI
TL;DR: It is proved that the language family generated by Boolean grammars is effectively closed under injective gsm mappings and inverse gsm mappings.
Abstract: It is proved that the language family generated by Boolean grammars is effectively closed under injective gsm mappings and inverse gsm mappings (where gsm stands for a generalized sequential machine). The same results hold for conjunctive grammars, unambiguous Boolean grammars and unambiguous conjunctive grammars.

Journal ArticleDOI
TL;DR: A polynomial algorithm for deciding whether a given word belongs to a language generated by a given unidirectional Lambek grammar is presented.
Abstract: Lambek grammars provide a useful tool for studying formal and natural languages. The generative power of unidirectional Lambek grammars equals that of context-free grammars. However, no feasible algorithm was known for deciding membership in the corresponding formal languages. In this paper we present a polynomial algorithm for deciding whether a given word belongs to a language generated by a given unidirectional Lambek grammar.

01 Dec 2010
TL;DR: The various structures and rules that are needed to derive a semantic representation from the categorial view of a transformational syntactic analysis are illustrated.
Abstract: We first recall some basic notions on minimalist grammars and on categorial grammars. Next we briefly introduce partially commutative linear logic, and our representation of minimalist grammars within this categorial system, the so-called categorial minimalist grammars. Thereafter we briefly present λμ-DRT (Discourse Representation Theory), an extension of λ-DRT (compositional DRT) in the framework of λμ calculus: it avoids type raising and derives different readings from a single semantic representation, in a setting which follows discourse structure. We run a complete example which illustrates the various structures and rules that are needed to derive a semantic representation from the categorial view of a transformational syntactic analysis.

Journal ArticleDOI
01 Jan 2010
TL;DR: The new version of the TBL algorithm has been experimentally shown to be less sensitive to block size and population size, and to find solutions faster than the standard one.
Abstract: This paper describes an improved version of the TBL algorithm [Y. Sakakibara, Learning context-free grammars using tabular representations, Pattern Recognition 38 (2005) 1372-1383; Y. Sakakibara, M. Kondo, GA-based learning of context-free grammars using tabular representations, in: Proceedings of the 16th International Conference on Machine Learning (ICML-99), Morgan-Kaufmann, Los Altos, CA, 1999] for inference of context-free grammars in Chomsky Normal Form. The TBL algorithm is a novel approach to overcoming the hardness of learning context-free grammars from examples without structural information available. The algorithm represents the grammars by parsing tables, and thanks to this tabular representation the problem of grammar learning is reduced to the problem of partitioning the set of nonterminals. A genetic algorithm is used to solve the NP-hard partitioning problem. In the improved version a modified fitness function and a new specialized delete operator are applied. Computer simulations have been performed to determine the efficiency of the improved tabular representation. The set of experiments has been divided into two groups: in the first, learning the unknown context-free grammar proceeds without any extra information about grammatical structure; in the second, learning is supported by partial knowledge of the structure. In each of the performed experiments the influence of the partition block size in an initial population and of the population size on grammar induction has been tested. The new version of the TBL algorithm has been experimentally shown to be less sensitive to block size and population size, and to find solutions faster than the standard one.
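The tabular representation at the heart of TBL is the CYK parsing table; a minimal CNF recognizer over such a table (our sketch, not the paper's implementation) looks like this:

```python
def cyk(word, unary, binary, start="S"):
    """CYK recognition for a grammar in Chomsky Normal Form.
    unary:  {(A, 'a'), ...}   rules A -> a
    binary: {(A, B, C), ...}  rules A -> B C
    table[i][span] holds every nonterminal deriving word[i:i+span]."""
    n = len(word)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {A for (A, a) in unary if a == ch}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for (A, B, C) in binary:
                    if B in table[i][split] and C in table[i + split][span - split]:
                        table[i][span].add(A)
    return n > 0 and start in table[0][n]
```

In TBL, learning reduces to choosing how to partition the nonterminal labels that populate such tables, which is the NP-hard problem the genetic algorithm attacks.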

Journal ArticleDOI
TL;DR: Some results on the power of tree controlled grammars are presented where the regular languages are restricted to some known subclasses of the family of regular languages.
Abstract: Tree controlled grammars are context-free grammars where the associated language only contains those terminal words which have a derivation where the word of any level of the corresponding derivation tree belongs to a given regular language. We present some results on the power of such grammars where we restrict the regular languages to some known subclasses of the family of regular languages.
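The level-wise control condition can be sketched directly (illustrative Python, not from the paper): read off the word formed by each level of the derivation tree and test it against the regular control language.

```python
import re

def levels(tree):
    """Yield the word (concatenated labels) of each level of a derivation
    tree. A tree is (label, [children]); leaves have an empty child list."""
    layer = [tree]
    while layer:
        yield "".join(label for (label, _) in layer)
        layer = [child for (_, kids) in layer for child in kids]

def tree_controlled_ok(tree, control):
    """Accept the derivation iff every level word lies in the control
    language, here given as a compiled regular expression."""
    return all(control.fullmatch(w) for w in levels(tree))
```

Restricting the control language to subclasses of the regular languages, as the paper does, restricts which derivations survive this filter and hence the generative power.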

Journal Article
TL;DR: The generative power and closure properties of the families of languages generated by such Petri net controlled grammars are investigated, and it is shown that these families form an infinite hierarchy with respect to the number of additional places.
Abstract: A context-free grammar and its derivations can be described by a Petri net, called a context-free Petri net, whose places and transitions correspond to the nonterminals and the production rules of the grammar, respectively, and whose tokens are separate instances of the nonterminals in a sentential form. Therefore, the control of the derivations in a context-free grammar can be implemented by adding some features to the associated Petri net. The addition of new places, and of new arcs from/to these new places to/from transitions of the net, leads to grammars controlled by k-Petri nets, i.e., Petri nets with k additional places. In the paper we investigate the generative power and give closure properties of the families of languages generated by such Petri net controlled grammars; in particular, we show that these families form an infinite hierarchy with respect to the number of additional places.

Book ChapterDOI
24 May 2010
TL;DR: Stochastic context-free grammars are extended such that the probability of applying a production can depend on the length of the subword that is generated from the application and show that existing algorithms for training and determining the most probable parse tree can easily be adapted to the extended model without losses in performance.
Abstract: We extend stochastic context-free grammars such that the probability of applying a production can depend on the length of the subword that is generated from the application and show that existing algorithms for training and determining the most probable parse tree can easily be adapted to the extended model without losses in performance. Furthermore we show that the extended model is suited to improve the quality of predictions of RNA secondary structures. The extended model may also be applied to other fields where SCFGs are used like natural language processing. Additionally some interesting questions in the field of formal languages arise from it.
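The extension can be sketched as an inside (weighted CYK) computation in which a rule's probability is a function of the length of the subword it derives (our sketch under that reading of the model; plain SCFGs are the special case where the length argument is ignored).

```python
def inside(word, unary, binary, prob, start="S"):
    """Inside probability for a CNF stochastic CFG whose rule probability
    may depend on the length of the derived subword:
    prob is a function (rule, length) -> float."""
    n = len(word)
    P = {}  # (nonterminal, start index, span length) -> inside probability
    for i, ch in enumerate(word):
        for (A, a) in unary:
            if a == ch:
                P[(A, i, 1)] = P.get((A, i, 1), 0.0) + prob((A, a), 1)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for (A, B, C) in binary:
                for k in range(1, span):
                    left = P.get((B, i, k), 0.0)
                    right = P.get((C, i + k, span - k), 0.0)
                    if left and right:
                        P[(A, i, span)] = (P.get((A, i, span), 0.0)
                                           + prob((A, B, C), span) * left * right)
    return P.get((start, 0, n), 0.0)
```

Because the length of each span is already an index of the dynamic-programming table, the length-dependent lookup adds no asymptotic cost, which is consistent with the paper's claim of no performance loss.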

Posted Content
TL;DR: A way of rewriting Minimalist Grammars as Linear Context-Free Rewriting Systems is proposed, making it possible to easily create a top-down parser, together with a method of refining the probabilistic field using algorithms from data compression.
Abstract: This paper describes a probabilistic top-down parser for minimalist grammars. Top-down parsers have the great advantage of having a certain predictive power during parsing, which takes place in a left-to-right reading of the sentence. Such parsers have already been well implemented and studied in the case of context-free grammars, which are already top-down, but they are difficult to adapt to minimalist grammars, which generate sentences bottom-up. I propose here a way of rewriting Minimalist Grammars as Linear Context-Free Rewriting Systems, making it possible to easily create a top-down parser. This rewriting also allows a probabilistic field to be placed on these grammars, which can be used to accelerate the parser. Finally, I propose a method of refining the probabilistic field using algorithms from data compression.

Proceedings ArticleDOI
28 Mar 2010
TL;DR: Tear-Insert-Fold grammars are introduced, adding tree-manipulation annotations to standard CFGs that allow typical abstract forms to be constructed directly from the grammar for the concrete syntax and provide a convenient and concise specification of the relationship between the set of derivation trees and the set of abstract trees for a parser.
Abstract: Context Free Grammars (CFGs) are simple and powerful formalisms for defining languages (sets of strings) whose semantics are specified hierarchically: the meaning of a string is determined by terminals and the meanings of substrings. This hierarchy is captured in the derivation tree corresponding to the string. Derivation trees usually contain more structure than is strictly required to determine the semantics of the string, so in practice a simplified or abstract syntax tree is used as an internal representation of a concrete text. Indeed, much of the work of a compiler or source-to-source translator may be described in terms of stepwise transformation of such trees, culminating in a final traversal during which the translated text is output. This paper introduces Tear-Insert-Fold grammars, which add tree-manipulation annotations to standard CFGs. These annotations allow typical abstract forms to be constructed directly from the grammar for the concrete syntax and provide a convenient and concise specification of the relationship between the set of derivation trees and the set of abstract trees for a parser. More significantly, for any TIF grammar Γ0 there is a TIF grammar Γ1 whose derivation trees are the abstract trees produced by Γ0.
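The flavor of derivation-tree-to-abstract-tree rewriting that such annotations specify can be illustrated with one common simplification (our sketch, not the TIF formalism itself): collapsing chains of unary nodes.

```python
def fold_chains(tree):
    """Collapse chains of unary nodes in a derivation tree, so that e.g.
    Expr -> Term -> Factor -> 'x' becomes just the leaf 'x'.
    A tree is (label, [children]); leaves have an empty child list.
    (Illustrative of the kind of rewriting TIF annotations specify
    declaratively, not the TIF formalism itself.)"""
    label, kids = tree
    kids = [fold_chains(k) for k in kids]
    if len(kids) == 1:
        return kids[0]   # tear out the singleton node
    return (label, kids)
```

Expression grammars are the classic case: operator-precedence levels introduce long unary chains in the derivation tree that carry no semantic content.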

Journal Article
TL;DR: It is proved that multigenerative grammar systems based on cooperating context-free grammatical components that simultaneously generate their strings in a rule-controlled or nonterminal-controlled rewriting way are equivalent to matrix grammars.
Abstract: Multigenerative grammar systems are based on cooperating context-free grammatical components that simultaneously generate their strings in a rule-controlled or nonterminal-controlled rewriting way; after this simultaneous generation is completed, all the generated terminal strings are combined together by some common string operations, such as concatenation, and placed into the generated languages of these systems. The present paper proves that these systems are equivalent to matrix grammars. In addition, we demonstrate that these systems with any number of grammatical components can be transformed to equivalent two-component versions of these systems. The paper points out that if these systems work in the leftmost rewriting way, they are more powerful than the systems working in a general way.


Proceedings Article
01 May 2010
TL;DR: It is found that the parsing performance of a STIG model is tied to the size of the underlying Tree Insertion Grammar, with a more compact grammar, a spinal STIG, outperforming a genuine STIG.
Abstract: We evaluate statistical parsing of French using two probabilistic models derived from the Tree Adjoining Grammar framework: a Stochastic Tree Insertion Grammars model (STIG) and a specific instance of this formalism, called the Spinal Tree Insertion Grammar model, which exhibits interesting properties with regard to data sparseness issues common to small treebanks such as the Paris 7 French Treebank. Using David Chiang's STIG parser (Chiang, 2003), we present results of various experiments we conducted to explore those models for French parsing. The grammar induction makes use of a head percolation table tailored for the French Treebank, which is provided in this paper. Using two evaluation metrics, we found that the parsing performance of a STIG model is tied to the size of the underlying Tree Insertion Grammar, with a more compact grammar, a spinal STIG, outperforming a genuine STIG. We finally note that a "spinal" framework seems to emerge in the literature. Indeed, the use of vertical grammars such as Spinal STIG instead of horizontal grammars such as PCFGs, afflicted with well-known data sparseness issues, seems to be a promising path toward better parsing performance.

Book ChapterDOI
17 Aug 2010
TL;DR: It is shown that every Muller context-free grammar can be transformed into a normal form grammar in polynomial space without increasing the size of the grammar, and that many decision problems can be solved in polynomial time for Muller context-free grammars in normal form.
Abstract: We define context-free grammars with Muller acceptance condition that generate languages of countable words. We establish several elementary properties of the class of Muller context-free languages including closure properties and others. We show that every Muller context-free grammar can be transformed into a normal form grammar in polynomial space without increasing the size of the grammar, and then we show that many decision problems can be solved in polynomial time for Muller context-free grammars in normal form. These problems include deciding whether the language generated by a normal form grammar contains only well-ordered, scattered, or dense words. In a further result we establish a limitedness property of Muller context-free grammars: If the language generated by a grammar contains only scattered words, then either there is an integer n such that each word of the language has Hausdorff rank at most n, or the language contains scattered words of arbitrarily large Hausdorff rank. We also show that it is decidable which of the two cases applies.


Book ChapterDOI
13 Sep 2010
TL;DR: This paper develops methods to find upper bounds for the unlabeled F1 performance that any UWNTS grammar can achieve over a given treebank and defines a new metric that is NP-Hard but solvable with specialized software.
Abstract: Unambiguous Non-Terminally Separated (UNTS) grammars have good learnability properties but are too restrictive to be used for natural language parsing. We present a generalization of UNTS grammars called Unambiguous Weakly NTS (UWNTS) grammars that preserve the learnability properties. Then, we study the problem of using them to parse natural language and evaluating against a gold treebank. If the target language is not UWNTS, there will be an upper bound on the parsing performance. In this paper we develop methods to find upper bounds for the unlabeled F1 performance that any UWNTS grammar can achieve over a given treebank. We define a new metric, show that its optimization is NP-hard but solvable with specialized software, and show a translation of the result to a bound for the F1. We do experiments with the WSJ10 corpus, finding an F1 bound of 76.1% for the UWNTS grammars over the POS tags alphabet.

Journal ArticleDOI
TL;DR: It is shown that Non-Associative Lambek grammars as well as their derivations can be defined using ACGs of order two, which solves a natural but still open question: can abstract categorial grammars (ACGs) represent usual categorial grammars?
Abstract: This paper solves a natural but still open question: can abstract categorial grammars (ACGs) represent usual categorial grammars? Despite their name and their claim to be a unifying framework, up to now there was no faithful representation of usual categorial grammars in ACGs. This paper shows that Non-Associative Lambek grammars as well as their derivations can be defined using ACGs of order two. To conclude, the outcomes of such a representation are discussed.

Proceedings ArticleDOI
01 Sep 2010