A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text

doi:10.3115/974235.974260

Open Access Proceedings Article

- pp 136-143

TLDR

The authors used a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech).

Abstract:

A program that tags each word in an input sentence with the most likely part of speech has been written. The program uses a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech). Program performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.
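
The optimization described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: it assumes n = 1 (a single following tag) for brevity, the function name `tag_sentence`, the `<end>` marker, and the `floor` value for unseen tag pairs are hypothetical, and the probability tables are invented toy values rather than corpus estimates. Because the context is the following tag, the sketch scans the sentence right to left and keeps only the best-scoring continuation for each tag, so the work grows linearly with sentence length for a fixed tag set.

```python
import math

END = "<end>"  # hypothetical end-of-sentence marker


def tag_sentence(words, lexical, contextual, floor=1e-6):
    """Return the tag sequence that maximizes the product of
    lexical[word][tag] and contextual[(tag, next_tag)].

    lexical[word][tag]          ~ P(tag | word)
    contextual[(tag, next_tag)] ~ P(tag | following tag)
    Unseen tag pairs get the small probability `floor` (an assumption).
    """
    # best[tag] = (log score, tags) for the best tagging of the words
    # processed so far (a suffix of the sentence) that starts with `tag`.
    best = {END: (0.0, [])}
    for word in reversed(words):
        current = {}
        for tag, p_lex in lexical[word].items():
            candidates = []
            for next_tag, (score, path) in best.items():
                p_ctx = contextual.get((tag, next_tag), floor)
                total = score + math.log(p_lex) + math.log(p_ctx)
                candidates.append((total, [tag] + path))
            # Keep only the best continuation for this tag.
            current[tag] = max(candidates, key=lambda c: c[0])
        best = current
    return max(best.values(), key=lambda c: c[0])[1]


# Invented toy probabilities, for illustration only.
lexical = {
    "I": {"PPSS": 0.99, "NP": 0.01},
    "see": {"VB": 0.99, "UH": 0.01},
    "a": {"AT": 0.99, "IN": 0.01},
    "bird": {"NN": 1.0},
}
contextual = {
    ("PPSS", "VB"): 0.3,
    ("NP", "VB"): 0.05,
    ("VB", "AT"): 0.3,
    ("UH", "AT"): 0.001,
    ("AT", "NN"): 0.5,
    ("IN", "NN"): 0.1,
    ("NN", END): 0.2,
}

tags = tag_sentence("I see a bird".split(), lexical, contextual)
print(tags)
```

Log probabilities are summed instead of multiplying raw probabilities, which gives the same ranking while avoiding numerical underflow on long sentences.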

Citations

Journal Article

Word association norms, mutual information, and lexicography

Kenneth Church, +1 more

- 01 Mar 1990 -

Computational Linguistics

TL;DR: The proposed measure, the association ratio, estimates word association norms directly from computer readable corpora, making it possible to estimate norms for tens of thousands of words.

Proceedings Article

Feature-rich part-of-speech tagging with a cyclic dependency network

Kristina Toutanova, +3 more

TL;DR: A new part-of-speech tagger is presented that demonstrates the following ideas: explicit use of both preceding and following tag contexts via a dependency network representation, broad use of lexical features, and effective use of priors in conditional loglinear models.

Journal Article

An empirical study of smoothing techniques for language modeling

Stanley F. Chen, +1 more

- 01 Oct 1999 -

Computer Speech & Language

TL;DR: This work surveys the most widely-used algorithms for smoothing models for language n-gram modeling, and presents an extensive empirical comparison of several of these smoothing techniques, including those described by Jelinek and Mercer (1980), and introduces methodologies for analyzing smoothing algorithm efficacy in detail.

Proceedings Article

Predicting the Semantic Orientation of Adjectives

Vasileios Hatzivassiloglou, +1 more

TL;DR: A log-linear regression model uses constraints from conjunctions to predict whether conjoined adjectives are of same or different orientations, achieving 82% accuracy in this task when each conjunction is considered independently.

Proceedings Article

A Simple Rule-Based Part of Speech Tagger

Eric D. Brill

TL;DR: This work presents a simple rule-based part of speech tagger which automatically acquires its rules and tags with accuracy comparable to stochastic taggers, demonstrating that the stochastic method is not the only viable method for part of speech tagging.

References

Journal Article

Three models for the description of language

Noam Chomsky

- 01 Sep 1956 -

IEEE Transactions on Information Theory

TL;DR: It is found that no finite-state Markov process that produces symbols with transition from state to state can serve as an English grammar, and the particular subclass of such processes that produce n-order statistical approximations to English do not come closer, with increasing n, to matching the output of an English grammar.

Journal Article

Frequency Analysis of English Usage: Lexicon and Grammar. By W. Nelson Francis and Henry Kučera with the assistance of Andrew W. Mackie. Boston: Houghton Mifflin. 1982. x + 561

Robert Burchfield

- 01 Apr 1985 -

Journal of English Linguistics

Journal Article

Theory of Syntactic Recognition for Natural Languages

Mitchell P. Marcus

- 01 Apr 1980 -

Language

TL;DR: A theory of syntactic recognition for natural language can be found in this article, where the authors make use of the deterministic hypothesis that the syntax can be parsed by a mechanism which operates "strictly deterministically" in that it does not simulate a non-deterministic machine.

Book

Collins COBUILD English Language Dictionary

John Sinclair

TL;DR: This is a dictionary of English as it is actually used and is also written and presented in plain English, enabling easier and earlier use of a monolingual dictionary.

The Automatic Grammatical Tagging of the LOB Corpus

Geoffrey Leech, +2 more

TL;DR: An account of the automatic grammatical tagging of the LOB (Lancaster-Oslo/Bergen) Corpus of British English, with special reference to the methods of tagging the authors have adopted.
