Text Chunking Using Transformation-Based Learning

doi:10.1007/978-94-017-2390-9_10

Book Chapter (Open Access), pp 157-176

TLDR

This work has shown that the transformation-based learning approach can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks.

Abstract:

Transformation-based learning, a technique introduced by Eric Brill (1993b), has been shown to do part-of-speech tagging with fairly high accuracy. This same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks. For this purpose, it is convenient to view chunking as a tagging problem by encoding the chunk structure in new tags attached to each word. In automatic tests using Treebank-derived data, this technique achieved recall and precision rates of roughly 93% for baseNP chunks (trained on 950K words) and 88% for somewhat more complex chunks that partition the sentence (trained on 200K words). Working in this new application and with larger template and training sets has also required some interesting adaptations to the transformation-based learning approach.
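The idea of treating chunking as tagging, and of scoring it by chunk-level precision and recall, can be made concrete with a minimal sketch. It assumes a three-tag scheme (I for a word inside a chunk, O for a word outside any chunk, B for the first word of a chunk that directly follows another chunk). The function names, the span representation, and the example sentence are hypothetical, chosen only for illustration; this is not the chapter's implementation.

```python
def chunks_to_tags(n_words, chunks):
    """Encode non-overlapping chunk spans (start, end), end exclusive, as per-word tags.

    Assumed scheme: I = inside a chunk, O = outside, B = first word of a
    chunk that immediately follows another chunk.
    """
    tags = ["O"] * n_words
    prev_end = None
    for start, end in sorted(chunks):
        for i in range(start, end):
            tags[i] = "I"
        if prev_end == start:
            # Adjacent chunks: B marks where the new chunk begins.
            tags[start] = "B"
        prev_end = end
    return tags


def tags_to_chunks(tags):
    """Decode per-word tags back into a list of (start, end) chunk spans."""
    chunks = []
    start = None
    for i, tag in enumerate(tags):
        if tag in ("O", "B") and start is not None:
            chunks.append((start, i))  # close the chunk that was open
            start = None
        if tag in ("I", "B") and start is None:
            start = i  # open a new chunk
    if start is not None:
        chunks.append((start, len(tags)))
    return chunks


def precision_recall(gold_tags, predicted_tags):
    """Chunk-level precision and recall: a chunk counts only if both ends match."""
    gold = set(tags_to_chunks(gold_tags))
    pred = set(tags_to_chunks(predicted_tags))
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: "[She] gave [the dog] [a bone]"
words = ["She", "gave", "the", "dog", "a", "bone"]
gold_tags = chunks_to_tags(len(words), [(0, 1), (2, 4), (4, 6)])
# A tagger that misses the B tag merges the two adjacent noun phrases.
predicted_tags = ["I", "O", "I", "I", "I", "I"]
precision, recall = precision_recall(gold_tags, predicted_tags)
```

In this example the single missed B tag merges "the dog" and "a bone" into one predicted chunk, so one word-level tagging error costs two gold chunks. This is why chunk-level scores are stricter than per-word tag accuracy.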

Citations

Proceedings Article

Introduction to the CoNLL-2003 shared task: language-independent named entity recognition

Erik Tjong Kim Sang, +1 more

TL;DR: This paper describes the CoNLL-2003 shared task on language-independent named entity recognition (NER), including its data sets and evaluation method, and gives a general overview of the systems that participated in the task and their performance.

Proceedings Article

Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

Michael Collins

TL;DR: Experimental results on part-of-speech tagging and base noun phrase chunking are given, in both cases showing improvements over results for a maximum-entropy tagger.

Proceedings Article

Shallow parsing with conditional random fields

Fei Sha, +1 more

TL;DR: This work shows how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model.

Proceedings Article

An Analysis of Active Learning Strategies for Sequence Labeling Tasks

Burr Settles, +1 more

TL;DR: This paper surveys previously used query selection strategies for sequence models, proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.

Proceedings Article

Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data

Charles Sutton, +2 more

TL;DR: In this paper, a generalization of linear-chain CRFs, called dynamic conditional random fields (DCRFs), is proposed, in which each time slice contains a set of state variables and edges and parameters are tied across slices.

References

Proceedings Article

An algorithm for finding noun phrase correspondences in bilingual corpora

Julian M. Kupiec

TL;DR: The paper describes an algorithm that employs English and French text taggers to associate noun phrases in an aligned bilingual corpus and provides an alternative to other approaches for finding word correspondences, with the advantage that linguistic structure is incorporated.

Proceedings Article

Surface grammatical analysis for the extraction of terminological noun phrases

Didier Bourigault

TL;DR: The type of analysis used (surface grammatical analysis) is highlighted, as is the methodological approach adopted to adapt the rules (an experimental approach).

Proceedings Article

A rule-based approach to prepositional phrase attachment disambiguation

Eric D. Brill, +1 more

TL;DR: The authors describe a new corpus-based approach to prepositional phrase attachment disambiguation, and present results comparing the performance of this algorithm with other corpus-based approaches to the problem.

Proceedings Article

Automatic grammar induction and parsing free text: a transformation-based approach

Eric D. Brill

TL;DR: A transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees with nonterminals unlabelled, and a set of simple structural transformations can be applied to reduce error.

Proceedings Article

NPtool, a Detector of English Noun Phrases

Atro Voutilainen

TL;DR: NPtool is a fast and accurate system for extracting noun phrases from English texts, for purposes such as information retrieval, translation unit discovery, and corpus studies.

Related Papers (5)

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +2 more

Building a large annotated corpus of English: the Penn Treebank

Mitchell Marcus, +2 more

01 Jun 1993, Computational Linguistics

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Introduction to the CoNLL-2003 shared task: language-independent named entity recognition

Long short-term memory

Neural Computation