Topic

Parser combinator

About: Parser combinator is a research topic. Over its lifetime, 2,215 publications have appeared within this topic, receiving 66,739 citations.


Papers
Journal Article
TL;DR: Three statistical models for natural language parsing are described, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree.
Abstract: This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then lead to parameters that encode the X-bar schema, subcategorization, ordering of complements, placement of adjuncts, bigram lexical dependencies, wh-movement, and preferences for close attachment. All of these preferences are expressed by probabilities conditioned on lexical heads. The models are evaluated on the Penn Wall Street Journal Treebank, showing that their accuracy is competitive with other models in the literature. To gain a better understanding of the models, we also give results on different constituent types, as well as a breakdown of precision/recall results in recovering various types of dependencies. We analyze various characteristics of the models through experiments on parsing accuracy, by collecting frequencies of various structures in the treebank, and through linguistically motivated examples. Finally, we compare the models to others that have been applied to parsing the treebank, aiming to give some explanation of the difference in performance of the various models.

1,956 citations

Journal Article
TL;DR: This article describes a parsing algorithm that seems to be the most efficient general context-free algorithm known; it is similar to both Knuth's LR(k) algorithm and the familiar top-down algorithm.
Abstract: A parsing algorithm which seems to be the most efficient general context-free algorithm known is described. It is similar to both Knuth's LR(k) algorithm and the familiar top-down algorithm. It has a time bound proportional to $n^3$ (where $n$ is the length of the string being parsed) in general; it has an $n^2$ bound for unambiguous grammars; and it runs in linear time on a large class of grammars, which seems to include most practical context-free programming language grammars. In an empirical comparison it appears to be superior to the top-down and bottom-up algorithms studied by Griffiths and Petrick.

1,516 citations
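
Note: the chart-based algorithm summarized in the entry above is commonly known as Earley's algorithm. The following is a minimal recognizer sketch in Python, not the article's own code. It assumes a grammar without empty (epsilon) productions, and the names `Grammar`, `Item`, and `earley_recognize` are hypothetical, chosen only for illustration. It shows the three usual operations (predict, scan, complete) applied to one chart set per input position.

```python
from typing import Dict, List, Set, Tuple

# A grammar maps each nonterminal to its right-hand sides (tuples of symbols).
# Any symbol that is not a key of the grammar is treated as a terminal.
Grammar = Dict[str, List[Tuple[str, ...]]]
# A chart item: (lhs, rhs, dot position, index where the item started).
Item = Tuple[str, Tuple[str, ...], int, int]


def earley_recognize(grammar: Grammar, start: str, tokens: List[str]) -> bool:
    """Return True if `tokens` derives from `start`.

    Assumption: the grammar has no empty productions, so every completed
    item spans at least one token and its origin set is already final.
    """
    n = len(tokens)
    chart: List[Set[Item]] = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(n + 1):
        worklist = list(chart[i])
        while worklist:
            lhs, rhs, dot, origin = worklist.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predict: expand the nonterminal that follows the dot.
                    for production in grammar[symbol]:
                        new = (symbol, production, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            worklist.append(new)
                elif i < n and tokens[i] == symbol:
                    # Scan: the next token matches the terminal after the dot.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for `lhs`.
                for plhs, prhs, pdot, porigin in list(chart[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        new = (plhs, prhs, pdot + 1, porigin)
                        if new not in chart[i]:
                            chart[i].add(new)
                            worklist.append(new)

    # Accept if a start rule spanning the whole input has been completed.
    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[n]
    )
```

With a small left-recursive grammar such as `{"E": [("E", "+", "T"), ("T",)], "T": [("n",)]}`, calling `earley_recognize(grammar, "E", ["n", "+", "n"])` accepts the input, which illustrates that this style of parser handles left recursion without rewriting the grammar.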

Journal Article
TL;DR: It is proposed that the human sentence parsing device assigns phrase structure to word strings in two steps. The assumption that the units shunted from the first stage to the second are defined by their length, rather than by their syntactic type, explains the effects of constituent length on perceptual complexity in center-embedded sentences.

1,155 citations

01 Jan 1968
TL;DR: A parsing algorithm which seems to be the most efficient general context-free algorithm known is described and appears to be superior to the top-down and bottom-up algorithms studied by Griffiths and Petrick.
Abstract: A parsing algorithm which seems to be the most efficient general context-free algorithm known is described. It is similar to both Knuth's LR(k) algorithm and the familiar top-down algorithm. It has a time bound proportional to $n^3$ (where $n$ is the length of the string being parsed) in general; it has an $n^2$ bound for unambiguous grammars; and it runs in linear time on a large class of grammars, which seems to include most practical context-free programming language grammars. In an empirical comparison it appears to be superior to the top-down and bottom-up algorithms studied by Griffiths and Petrick.

1,154 citations

Proceedings Article
08 Jun 2006
TL;DR: This paper describes how treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured, and it draws general conclusions about multi-lingual parsing.
Abstract: Each year the Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their systems on exactly the same data sets, in order to better compare systems. The tenth CoNLL (CoNLL-X) saw a shared task on Multilingual Dependency Parsing. In this paper, we describe how treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured. We also give an overview of the parsing approaches that participants took and the results that they achieved. Finally, we try to draw general conclusions about multi-lingual parsing: What makes a particular language, treebank or annotation scheme easier or harder to parse and which phenomena are challenging for any dependency parser?

1,011 citations


Network Information
Related Topics (5)
Topic                        Papers   Citations   Related
Natural language             31.1K    806.8K      84%
Graph (abstract data type)   69.9K    1.2M        83%
Semantics                    24.9K    653K        81%
Supervised learning          20.8K    710.5K      80%
Semi-supervised learning     12.1K    611.2K      80%
Performance
Metrics
No. of papers in the topic in previous years
Year   Papers
2023   11
2022   18
2021   2
2020   4
2019   9
2018   13