
Showing papers on "Conditional random field" published in 2002


Proceedings ArticleDOI
Michael Collins
06 Jul 2002
TL;DR: Experimental results on part-of-speech tagging and base noun phrase chunking are given, in both cases showing improvements over results for a maximum-entropy tagger.
Abstract: We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a modification of the proof of convergence of the perceptron algorithm for classification problems. We give experimental results on part-of-speech tagging and base noun phrase chunking, in both cases showing improvements over results for a maximum-entropy tagger.

2,221 citations
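The training procedure described above, Viterbi decoding of each training sentence followed by simple additive updates on mistakes, can be sketched in a few lines. Below is a minimal illustrative Python sketch, not the paper's implementation: the feature templates are toy ones, the averaged/voted variants analysed in the paper are omitted, and all helper names are our own.

```python
from collections import defaultdict

def features(sentence, tags):
    """Toy feature map: emission (tag, word) and transition (prev_tag, tag) counts."""
    counts = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(sentence, tags):
        counts[("emit", tag, word)] += 1
        counts[("trans", prev, tag)] += 1
        prev = tag
    return counts

def viterbi_decode(weights, sentence, tagset):
    """Highest-scoring tag sequence under `weights` for the feature map above."""
    prev_scores = {"<s>": 0.0}
    backptrs = []
    for word in sentence:
        scores, back = {}, {}
        for tag in tagset:
            best_prev, best = None, float("-inf")
            for prev, pscore in prev_scores.items():
                s = (pscore
                     + weights.get(("trans", prev, tag), 0.0)
                     + weights.get(("emit", tag, word), 0.0))
                if s > best:
                    best, best_prev = s, prev
            scores[tag], back[tag] = best, best_prev
        prev_scores = scores
        backptrs.append(back)
    # Follow back-pointers from the best final tag.
    tag = max(prev_scores, key=prev_scores.get)
    out = [tag]
    for back in reversed(backptrs[1:]):
        tag = back[tag]
        out.append(tag)
    return list(reversed(out))

def perceptron_train(data, tagset, epochs=10):
    """Perceptron-style training: decode each example, additive update on mistakes."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sentence, gold in data:
            pred = viterbi_decode(weights, sentence, tagset)
            if pred != gold:
                # Reward features of the gold sequence, penalise those of the prediction.
                for f, c in features(sentence, gold).items():
                    weights[f] += c
                for f, c in features(sentence, pred).items():
                    weights[f] -= c
    return weights
```

Note that the paper's reported results rely on averaging the parameters over updates, which this sketch does not do.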


Book ChapterDOI
TL;DR: This paper formalizes the principal learning tasks and describes the methods that have been developed within the machine learning research community for addressing these problems, including sliding window methods, recurrent sliding windows, hidden Markov models, conditional random fields, and graph transformer networks.
Abstract: Statistical learning problems in many fields involve sequential data. This paper formalizes the principal learning tasks and describes the methods that have been developed within the machine learning research community for addressing these problems. These methods include sliding window methods, recurrent sliding windows, hidden Markov models, conditional random fields, and graph transformer networks. The paper also discusses some open research issues.

698 citations
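As a concrete example of the simplest family surveyed above, the sliding window approach recasts sequence labelling as ordinary per-token classification over a fixed context window, after which any standard classifier can be applied. The snippet below is a generic illustration under our own assumed data format, not code from the chapter.

```python
def sliding_window_examples(sentences, window=2, pad="<PAD>"):
    """Turn sequence labelling into per-token classification:
    each example is the tokens in a +/- `window` context around position i."""
    X, y = [], []
    for tokens, tags in sentences:
        padded = [pad] * window + list(tokens) + [pad] * window
        for i, tag in enumerate(tags):
            X.append(tuple(padded[i:i + 2 * window + 1]))
            y.append(tag)
    return X, y

# Example: one toy sentence with POS-style tags.
data = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])]
X, y = sliding_window_examples(data, window=1)
# X[1] == ("the", "dog", "barks"), y[1] == "NOUN"
```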


Proceedings Article
07 Aug 2002
TL;DR: In this article, an efficient feature induction method for CRFs is presented, based on the principle of iteratively constructing feature conjunctions that would significantly increase conditional log-likelihood if added to the model.
Abstract: Conditional Random Fields (CRFs) are undirected graphical models, a special case of which correspond to conditionally-trained finite state machines. A key advantage of CRFs is their great flexibility to include a wide variety of arbitrary, non-independent features of the input. Faced with this freedom, however, an important question remains: what features should be used? This paper presents an efficient feature induction method for CRFs. The method is founded on the principle of iteratively constructing feature conjunctions that would significantly increase conditional log-likelihood if added to the model. Automated feature induction enables not only improved accuracy and dramatic reduction in parameter count, but also the use of larger cliques, and more freedom to liberally hypothesize atomic input variables that may be relevant to a task. The method applies to linear-chain CRFs, as well as to more arbitrary CRF structures, such as Relational Markov Networks, where it corresponds to learning clique templates, and can also be understood as supervised structure learning. Experimental results on named entity extraction and noun phrase segmentation tasks are presented.

459 citations
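The core of the induction method is scoring candidate feature conjunctions by the gain in conditional log-likelihood they would contribute, tuning only the candidate's weight while the current model is held fixed. The sketch below illustrates that gain computation for a per-position conditional model, a deliberate simplification of the paper's linear-chain and clique-template setting; the array shapes, grid search, and function names are our own assumptions.

```python
import numpy as np

def induction_gain(g, X, Y, labels, p_current, lambdas=np.linspace(-2.0, 2.0, 41)):
    """Estimated conditional log-likelihood gain from adding feature g(x, y),
    tuning only g's weight on a grid while holding the current model fixed.
    p_current[i, k] = current model's p(labels[k] | X[i])."""
    g_vals = np.array([[g(x, lab) for lab in labels] for x in X], dtype=float)
    g_obs = np.array([g(x, y) for x, y in zip(X, Y)], dtype=float)
    best = 0.0
    for lam in lambdas:
        # Change in log-likelihood as a function of the single new weight lam.
        log_z = np.log((p_current * np.exp(lam * g_vals)).sum(axis=1))
        delta = (lam * g_obs - log_z).sum()
        best = max(best, delta)
    return best

def induce_features(candidates, X, Y, labels, p_current, k=5):
    """Greedy induction step: rank candidate conjunctions by estimated gain, keep the top k."""
    scored = sorted(candidates,
                    key=lambda g: induction_gain(g, X, Y, labels, p_current),
                    reverse=True)
    return scored[:k]
```

In the full method this step alternates with retraining the model, so each round's candidates are scored against the latest parameters.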


01 Jan 2002
TL;DR: This thesis explores a number of parameter estimation techniques for conditional random fields, a recently introduced probabilistic model for labelling and segmenting sequential data, and hypothesises that general numerical optimisation techniques result in improved performance over iterative scaling algorithms for training CRFs.
Abstract: This thesis explores a number of parameter estimation techniques for conditional random fields, a recently introduced probabilistic model [31] for labelling and segmenting sequential data. Theoretical and practical disadvantages of the training techniques reported in current literature on CRFs are discussed. We hypothesise that general numerical optimisation techniques result in improved performance over iterative scaling algorithms for training CRFs. Experiments run on a subset of a well-known text chunking data set [28] confirm that this is indeed the case. This is a highly promising result, indicating that such parameter estimation techniques make CRFs a practical and efficient choice for labelling sequential data, as well as a theoretically sound and principled probabilistic framework.

164 citations
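The thesis's central comparison, general numerical optimisation against iterative scaling, comes down to handing the penalised conditional log-likelihood and its gradient (empirical minus model-expected feature counts) to an off-the-shelf optimiser such as L-BFGS. The sketch below shows that pattern for a per-position conditional model; a real linear-chain CRF would compute the expected counts with the forward-backward algorithm. The data layout and names are our own assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def objective(w, F, Y, sigma2=10.0):
    """Penalised negative conditional log-likelihood and its gradient.
    F[i, k] is the feature vector f(x_i, label_k); Y[i] is the gold label index."""
    scores = F @ w                                   # (n, n_labels)
    scores -= scores.max(axis=1, keepdims=True)      # stabilise the softmax
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    n = len(Y)
    log_lik = np.sum(np.log(probs[np.arange(n), Y]))
    # Gradient of the log-likelihood: empirical counts minus model-expected counts.
    empirical = F[np.arange(n), Y].sum(axis=0)
    expected = np.einsum("nk,nkd->d", probs, F)
    # Gaussian prior (weight decay), as commonly used when training CRFs.
    nll = -log_lik + w @ w / (2 * sigma2)
    grad = -(empirical - expected) + w / sigma2
    return nll, grad

def train(F, Y, n_features):
    """General-purpose numerical optimisation (L-BFGS) in place of iterative scaling."""
    w0 = np.zeros(n_features)
    result = minimize(objective, w0, args=(F, Y), jac=True, method="L-BFGS-B")
    return result.x
```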


Proceedings Article
01 Jan 2002
TL;DR: The proposed sequence boosting algorithm offers an interesting alternative to methods based on HMMs and the more recently proposed Conditional Random Fields and is attractive both conceptually and computationally.
Abstract: This paper investigates a boosting approach to discriminative learning of label sequences based on a sequence rank loss function. The proposed method combines many of the advantages of boosting schemes with the efficiency of dynamic programming methods and is attractive both conceptually and computationally. In addition, we also discuss alternative approaches based on the Hamming loss for label sequences. The sequence boosting algorithm offers an interesting alternative to methods based on HMMs and the more recently proposed Conditional Random Fields. Application areas for the presented technique range from natural language processing and information extraction to computational biology. We include experiments on named entity recognition and part-of-speech tagging which demonstrate the validity and competitiveness of our approach.

59 citations
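A sequence rank loss of the kind referred to above can be written as an exponential loss over all incorrect label sequences relative to the gold one. The toy sketch below computes it by brute-force enumeration, which is feasible only for very short sequences; the efficiency point in the abstract is that the corresponding sums and updates can be carried out with dynamic programming instead. The data format and scoring interface here are assumptions, not the paper's.

```python
from itertools import product
import math

def sequence_rank_loss(score, data, labels):
    """Exponential sequence rank loss: every incorrect label sequence contributes
    exp(score(x, y) - score(x, gold)), so sequences that outrank the gold one
    dominate the loss.

    score: callable (x, label_sequence) -> float
    data:  iterable of (x, gold_label_sequence) pairs
    """
    total = 0.0
    for x, gold in data:
        gold_score = score(x, gold)
        for y in product(labels, repeat=len(x)):
            if list(y) != list(gold):
                total += math.exp(score(x, list(y)) - gold_score)
    return total
```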