Open Access Book Chapter

Text Chunking Using Transformation-Based Learning

Lance Ramshaw, +1 more
- pp 157-176
TLDR
This work has shown that the transformation-based learning approach can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks.
Abstract
Transformation-based learning, a technique introduced by Eric Brill (1993b), has been shown to do part-of-speech tagging with fairly high accuracy. This same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks. For this purpose, it is convenient to view chunking as a tagging problem by encoding the chunk structure in new tags attached to each word. In automatic tests using Treebank-derived data, this technique achieved recall and precision rates of roughly 93% for baseNP chunks (trained on 950K words) and 88% for somewhat more complex chunks that partition the sentence (trained on 200K words). Working in this new application and with larger template and training sets has also required some interesting adaptations to the transformation-based learning approach.
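The idea of treating chunking as a tagging problem can be illustrated with a short sketch. The abstract does not spell out the tag set, so the code below assumes a three-tag encoding ("I" for a word inside a chunk, "O" for a word outside any chunk, "B" for the first word of a chunk that directly follows another chunk). The function names, the span representation, and the example sentence are all hypothetical and chosen only for illustration; the last helper shows one plausible way to compute chunk-level precision and recall, not necessarily the scoring used in the chapter.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) word indices, end exclusive


def chunks_to_tags(n_words: int, chunks: List[Span]) -> List[str]:
    """Encode non-overlapping chunk spans as one tag per word.

    Assumed tag set (not taken from the abstract): "I" = inside a chunk,
    "O" = outside any chunk, "B" = first word of a chunk that directly
    follows another chunk.
    """
    tags = ["O"] * n_words
    prev_end = None
    for start, end in sorted(chunks):
        for i in range(start, end):
            tags[i] = "I"
        if prev_end == start:
            tags[start] = "B"  # marks the boundary between adjacent chunks
        prev_end = end
    return tags


def tags_to_chunks(tags: List[str]) -> List[Span]:
    """Decode a tag sequence back into chunk spans."""
    chunks: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        # "O" and "B" both close any chunk that is currently open.
        if start is not None and tag in ("O", "B"):
            chunks.append((start, i))
            start = None
        # "I" and "B" open a chunk when none is open.
        if tag in ("I", "B") and start is None:
            start = i
    if start is not None:
        chunks.append((start, len(tags)))
    return chunks


def precision_recall(gold: List[Span], predicted: List[Span]) -> Tuple[float, float]:
    """Chunk-level scores: a predicted chunk counts only if both ends match."""
    correct = len(set(gold) & set(predicted))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: [The dog] gave [the boy] [a bone]
words = ["The", "dog", "gave", "the", "boy", "a", "bone"]
gold_chunks = [(0, 2), (3, 5), (5, 7)]
tags = chunks_to_tags(len(words), gold_chunks)
print(list(zip(words, tags)))
print(tags_to_chunks(tags) == gold_chunks)
```

With this encoding, any per-word tagger (including a transformation-based one) can be trained and applied to chunking, and the chunk spans are recovered afterwards by decoding the predicted tags.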


Citations
Proceedings Article

Introduction to the CoNLL-2003 shared task: language-independent named entity recognition

TL;DR: This paper describes the CoNLL-2003 shared task on language-independent named entity recognition (NER), covering the data sets and evaluation method and giving a general overview of the systems that participated in the task and their performance.
Proceedings Article

Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

TL;DR: Experimental results on part-of-speech tagging and base noun phrase chunking are given, in both cases showing improvements over results for a maximum-entropy tagger.
Proceedings Article

Shallow parsing with conditional random fields

TL;DR: This work shows how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model.
Proceedings Article

An Analysis of Active Learning Strategies for Sequence Labeling Tasks

TL;DR: This paper surveys previously used query selection strategies for sequence models, proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.
Proceedings Article

Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data

TL;DR: This paper proposes dynamic conditional random fields (DCRFs), a generalization of linear-chain CRFs in which each time slice contains a set of state variables and edges, and in which parameters are tied across slices.
References
Proceedings Article

An algorithm for finding noun phrase correspondences in bilingual corpora

TL;DR: The paper describes an algorithm that employs English and French text taggers to associate noun phrases in an aligned bilingual corpus and provides an alternative to other approaches for finding word correspondences, with the advantage that linguistic structure is incorporated.
Proceedings Article

Surface grammatical analysis for the extraction of terminological noun phrases

TL;DR: The paper highlights the type of analysis used (surface grammatical analysis) and the methodological approach adopted to adapt the rules (an experimental approach).
Proceedings Article

A rule-based approach to prepositional phrase attachment disambiguation

TL;DR: The authors describe a new corpus-based approach to prepositional phrase attachment disambiguation and present results comparing the performance of this algorithm with other corpus-based approaches to the problem.
Proceedings Article

Automatic grammar induction and parsing free text: a transformation-based approach

TL;DR: A transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees with nonterminals unlabelled, and a set of simple structural transformations can be applied to reduce error.
Proceedings Article

NPtool, a Detector of English Noun Phrases

TL;DR: NPtool is a fast and accurate system for extracting noun phrases from English texts for purposes such as information retrieval, translation unit discovery, and corpus studies.