Text Chunking Using Transformation-Based Learning
Lance Ramshaw, Mitchell Marcus
- pp 157-176
TLDR
This work has shown that the transformation-based learning approach can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks.
Abstract:
Transformation-based learning, a technique introduced by Eric Brill (1993b), has been shown to do part-of-speech tagging with fairly high accuracy. This same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks. For this purpose, it is convenient to view chunking as a tagging problem by encoding the chunk structure in new tags attached to each word. In automatic tests using Treebank-derived data, this technique achieved recall and precision rates of roughly 93% for baseNP chunks (trained on 950K words) and 88% for somewhat more complex chunks that partition the sentence (trained on 200K words). Working in this new application and with larger template and training sets has also required some interesting adaptations to the transformation-based learning approach.
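The idea of treating chunking as a tagging problem can be illustrated with a minimal sketch. It assumes an I/O/B-style tag set in which I marks a word inside a chunk, O a word outside any chunk, and B the first word of a chunk that immediately follows another chunk. The abstract does not spell out the tag set, so this encoding, the function names, and the example sentence are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Tuple

# A chunk is a (start, end) pair of token indices, end exclusive (an assumption).
Span = Tuple[int, int]


def chunks_to_tags(n_tokens: int, chunks: List[Span]) -> List[str]:
    """Encode non-overlapping chunk spans as one tag per token."""
    tags = ["O"] * n_tokens
    prev_end = None
    for start, end in sorted(chunks):
        for i in range(start, end):
            tags[i] = "I"
        # B is needed only when this chunk starts exactly where the last one ended.
        if prev_end == start:
            tags[start] = "B"
        prev_end = end
    return tags


def tags_to_chunks(tags: List[str]) -> List[Span]:
    """Decode per-token tags back into chunk spans."""
    chunks: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        # O and B both close any chunk that is currently open.
        if tag in ("O", "B") and start is not None:
            chunks.append((start, i))
            start = None
        # B always opens a chunk; I opens one only if none is open.
        if tag == "B" or (tag == "I" and start is None):
            start = i
    if start is not None:
        chunks.append((start, len(tags)))
    return chunks


# Hypothetical example sentence with three noun-phrase chunks.
tokens = "The dog gave the boy a ball".split()
chunks = [(0, 2), (3, 5), (5, 7)]
tags = chunks_to_tags(len(tokens), chunks)
print(list(zip(tokens, tags)))
print(tags_to_chunks(tags) == chunks)
```

With an encoding like this, any per-word tagger can in principle be trained to predict chunk tags, and the predicted tags can be decoded back into chunks for scoring.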
Citations
Proceedings Article
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
TL;DR: This paper describes the CoNLL-2003 shared task on language-independent named entity recognition (NER), giving background on the data sets and evaluation method and a general overview of the systems that participated in the task and their performance.
Proceedings Article
Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms
TL;DR: Experimental results on part-of-speech tagging and base noun phrase chunking are given, in both cases showing improvements over results for a maximum-entropy tagger.
Proceedings Article
Shallow parsing with conditional random fields
Fei Sha, Fernando Pereira
TL;DR: This work shows how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model.
Proceedings Article
An Analysis of Active Learning Strategies for Sequence Labeling Tasks
Burr Settles, Mark Craven
TL;DR: This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.
Proceedings Article
Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data
TL;DR: This paper proposes dynamic conditional random fields (DCRFs), a generalization of linear-chain CRFs in which each time slice contains a set of state variables and edges, and parameters are tied across slices.
References
Proceedings Article
An algorithm for finding noun phrase correspondences in bilingual corpora
TL;DR: The paper describes an algorithm that employs English and French text taggers to associate noun phrases in an aligned bilingual corpus and provides an alternative to other approaches for finding word correspondences, with the advantage that linguistic structure is incorporated.
Proceedings Article
Surface grammatical analysis for the extraction of terminological noun phrases
TL;DR: The type of analysis used (surface grammatical analysis) is highlighted, as is the methodological approach adopted to adapt the rules (experimental approach).
Proceedings Article
A rule-based approach to prepositional phrase attachment disambiguation
Eric D. Brill, Philip Resnik
TL;DR: The authors describe a new corpus-based approach to prepositional phrase attachment disambiguation, and present results comparing performance of this algorithm with other corpus-based approaches to this problem.
Proceedings Article
Automatic grammar induction and parsing free text: a transformation-based approach
TL;DR: A transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees with nonterminals unlabelled, and a set of simple structural transformations can be applied to reduce error.
Proceedings Article
NPtool, a Detector of English Noun Phrases
TL;DR: NPtool is a fast and accurate system for extracting noun phrases from English texts for the purposes of e.g. information retrieval, translation unit discovery, and corpus studies.