Context-sensitive grammar

About: Context-sensitive grammar is a research topic. Over its lifetime, 1938 publications have appeared within this topic, receiving 45911 citations. The topic is also known as CSG.


Papers
Proceedings ArticleDOI
31 May 2009
TL;DR: This work presents a theoretically principled model which learns compact and simple grammars, uncovering latent linguistic structures (e.g., verb subcategorisation), and in doing so far out-performs a standard PCFG.
Abstract: Tree substitution grammars (TSGs) are a compelling alternative to context-free grammars for modelling syntax. However, many popular techniques for estimating weighted TSGs (under the moniker of Data Oriented Parsing) suffer from the problems of inconsistency and over-fitting. We present a theoretically principled model which solves these problems using a Bayesian non-parametric formulation. Our model learns compact and simple grammars, uncovering latent linguistic structures (e.g., verb subcategorisation), and in doing so far out-performs a standard PCFG.

63 citations

BookDOI
01 Jan 1987
TL;DR: An Elementary Proof of the Peters-Ritchie Theorem and Computationally Relevant Properties of Natural Languages and Their Grammars are presented.
Abstract: Prologue.- What is Mathematical Linguistics?.- I. Early Nontransformational Grammar.- to Part I.- Formal Linguistics and Formal Logic.- An Elementary Proof of the Peters-Ritchie Theorem.- On Constraining the Class of Transformational Languages.- Generative Grammars without Transformation Rules-A Defense of Phrase Structure.- A Program for Syntax.- II Modern Context-Free-Like Models.- to Part II.- Natural Languages and Context-Free Languages.- Unbounded Dependency and Coordinate Structure.- On Some Formal Properties of MetaRules.- Some Generalizations of Categorial Grammars.- III More than Context-Free and Less than Transformational Grammar.- to Part III.- Cross-serial Dependencies in Dutch.- Evidence Against the Context-Freeness of Natural Language.- English is not a Context-Free Language.- The Complexity of the Vocabulary of Bambara.- Context-Sensitive Grammar and Natural Language Syntax.- How Non-Context Free is Variable Binding?.- Prologue.- Computationally Relevant Properties of Natural Languages and Their Grammars.- Index of Languages.- Name Index.

62 citations

Proceedings ArticleDOI
17 Jan 2010
TL;DR: The design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications is presented and its use on examples ranging from difficult programming language grammars to web server logs to binary data specification is illustrated.
Abstract: We present the design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications. In particular, our new parsing engine handles (1) full scannerless context-free grammars with (2) regular expressions as right-hand sides for defining nonterminals. YAKKER also includes (3) facilities for binding variables to intermediate parse results and (4) using such bindings within arbitrary constraints to control parsing. These facilities allow the kind of data-dependent parsing commonly needed in systems applications, particularly those that operate over binary data. In addition, (5) nonterminals may be parameterized by arbitrary values, which gives the system good modularity and abstraction properties in the presence of data-dependent parsing. Finally, (6) legacy parsing libraries, such as sophisticated libraries for dates and times, may be directly incorporated into parser specifications. We illustrate the importance and utility of this rich collection of features by presenting its use on examples ranging from difficult programming language grammars to web server logs to binary data specification. We also show that our grammars have important compositionality properties and explain why such properties are important in modern applications such as automatic grammar induction. In terms of technical contributions, we provide a traditional high-level semantics for our new grammar formalization and show how to compile grammars into non-deterministic automata. These automata are stack-based, somewhat like conventional push-down automata, but are also equipped with environments to track data-dependent parsing state. We prove the correctness of our translation of data-dependent grammars into these new automata and then show how to implement the automata efficiently using a variation of Earley's parsing algorithm.

61 citations

Book ChapterDOI
03 Jul 2007
TL;DR: A negative answer is given, contrary to the conjectured positive one, by constructing a conjunctive grammar for the language \(\{ a^{4^{n}} : n \in \mathbb{N} \}\).
Abstract: Conjunctive grammars were introduced by A. Okhotin in [1] as a natural extension of context-free grammars with an additional operation of intersection in the body of any production of the grammar. Several theorems and algorithms for context-free grammars generalize to the conjunctive case. Still, some questions remained open. A. Okhotin posed nine problems concerning those grammars. One of them was the question of whether a conjunctive grammar over a unary alphabet can generate only regular languages. We give a negative answer, contrary to the conjectured positive one, by constructing a conjunctive grammar for the language \(\{ a^{4^{n}} : n \in \mathbb{N} \}\). We then generalise this result: for every set of numbers L whose representation in some k-ary system is a regular set, we show that \(\{ a^{k^{n}} : n \in L \}\) is generated by some conjunctive grammar over a unary alphabet.

61 citations

Book ChapterDOI
23 Sep 1996
TL;DR: It is shown that, in a formal sense, Old Georgian can be taken to provide an example of a non-semilinear language and that none of the aforementioned grammar formalisms is strong enough to generate this language.
Abstract: Mildly context sensitive grammar formalisms such as multi-component TAGs and linear context free rewrite systems have been introduced to capture the full complexity of natural languages. We show that, in a formal sense, Old Georgian can be taken to provide an example of a non-semilinear language. This implies that none of the aforementioned grammar formalisms is strong enough to generate this language.

61 citations


Network Information
Related Topics (5)
Graph (abstract data type): 69.9K papers, 1.2M citations, 80% related
Time complexity: 36K papers, 879.5K citations, 79% related
Concurrency: 13K papers, 347.1K citations, 78% related
Model checking: 16.9K papers, 451.6K citations, 77% related
Directed graph: 12.2K papers, 302.4K citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  11
2022  12
2021  1
2020  4
2019  1
2018  1