Open Access Proceedings Article

A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text

Kenneth Church
- pp 136-143
TLDR
The authors use a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech).
Abstract
A program that tags each word in an input sentence with the most likely part of speech has been written. The program uses a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech). Program performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.
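
The optimization described in the abstract can be illustrated with a small sketch. This is not the author's program: it assumes a context of a single following tag (n = 1) for brevity, and the function name `tag_sentence`, the table names `lexical` and `contextual`, the `<END>` marker, and every probability value are hypothetical and chosen only for illustration. The sentence is scanned from right to left so that each tag can be scored against the tag that follows it, and back-pointers keep the work linear in sentence length.

```python
import math

# Hypothetical toy tables; the numbers are invented, not taken from the paper.
# lexical[word][tag] stands for P(tag | word).
lexical = {
    "I": {"PPSS": 0.99, "NP": 0.01},
    "see": {"VB": 0.99, "UH": 0.01},
    "a": {"AT": 0.99, "IN": 0.01},
    "bird": {"NN": 1.0},
}
# contextual[(tag, next_tag)] stands for P(tag | following tag).
contextual = {
    ("NN", "<END>"): 0.3,
    ("AT", "NN"): 0.5, ("IN", "NN"): 0.1,
    ("VB", "AT"): 0.3, ("VB", "IN"): 0.1,
    ("UH", "AT"): 0.01, ("UH", "IN"): 0.01,
    ("PPSS", "VB"): 0.4, ("PPSS", "UH"): 0.01,
    ("NP", "VB"): 0.05, ("NP", "UH"): 0.01,
}


def tag_sentence(words, lexical, contextual, end_tag="<END>"):
    """Return the tag list maximizing the product of lexical and contextual terms."""
    # scores[tag]: best log score of any tagging of the suffix that starts with tag.
    scores = {end_tag: 0.0}
    backptrs = []  # one dict per word (last word first): tag -> best following tag
    for word in reversed(words):
        new_scores, ptr = {}, {}
        for tag, p_lex in lexical[word].items():
            for next_tag, score in scores.items():
                p_ctx = contextual.get((tag, next_tag), 0.0)
                if p_lex <= 0.0 or p_ctx <= 0.0:
                    continue  # skip impossible combinations
                total = score + math.log(p_lex) + math.log(p_ctx)
                if tag not in new_scores or total > new_scores[tag]:
                    new_scores[tag], ptr[tag] = total, next_tag
        if not new_scores:
            raise ValueError(f"no tagging with nonzero probability at {word!r}")
        scores = new_scores
        backptrs.append(ptr)
    # Start from the best tag for the first word and follow the back-pointers.
    tag = max(scores, key=scores.get)
    tags = []
    for ptr in reversed(backptrs):
        tags.append(tag)
        tag = ptr[tag]
    return tags


print(tag_sentence(["I", "see", "a", "bird"], lexical, contextual))
```

Working in log space avoids underflow when many small probabilities are multiplied. Each word is compared only against the surviving scores for the word after it, so the cost per word depends on the number of candidate tags and not on the sentence length.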


Citations
Journal Article

Word association norms, mutual information, and lexicography

TL;DR: The proposed measure, the association ratio, estimates word association norms directly from computer readable corpora, making it possible to estimate norms for tens of thousands of words.
Proceedings Article

Feature-rich part-of-speech tagging with a cyclic dependency network

TL;DR: A new part-of-speech tagger is presented that demonstrates the following ideas: explicit use of both preceding and following tag contexts via a dependency network representation, broad use of lexical features, and effective use of priors in conditional loglinear models.
Journal Article

An empirical study of smoothing techniques for language modeling

TL;DR: This work surveys the most widely used algorithms for smoothing models for language n-gram modeling, presents an extensive empirical comparison of several of these smoothing techniques, including those described by Jelinek and Mercer (1980), and introduces methodologies for analyzing smoothing algorithm efficacy in detail.
Proceedings Article

Predicting the Semantic Orientation of Adjectives

TL;DR: A log-linear regression model uses constraints from conjunctions to predict whether conjoined adjectives are of same or different orientations, achieving 82% accuracy in this task when each conjunction is considered independently.
Proceedings Article

A Simple Rule-Based Part of Speech Tagger

TL;DR: This work presents a simple rule-based part-of-speech tagger which automatically acquires its rules and tags with accuracy comparable to stochastic taggers, demonstrating that the stochastic method is not the only viable method for part-of-speech tagging.
References
Journal Article

Three models for the description of language

TL;DR: It is found that no finite-state Markov process that produces symbols with transition from state to state can serve as an English grammar, and the particular subclass of such processes that produce n-order statistical approximations to English does not come closer, with increasing n, to matching the output of an English grammar.
Journal Article

Theory of Syntactic Recognition for Natural Languages

Mitchell P. Marcus
- 01 Apr 1980 - 
TL;DR: This work presents a theory of syntactic recognition for natural language based on the determinism hypothesis: that syntax can be parsed by a mechanism which operates "strictly deterministically" in that it does not simulate a non-deterministic machine.
Book

Collins COBUILD English Language Dictionary

John Sinclair
TL;DR: This is a dictionary of English as it is actually used and is also written and presented in plain English, enabling easier and earlier use of a monolingual dictionary.

The Automatic Grammatical Tagging of the LOB Corpus

TL;DR: An account of the automatic grammatical tagging of the LOB (Lancaster-Oslo/Bergen) Corpus of British English, with special reference to the methods of tagging the authors have adopted.