Open Access Proceedings Article
Interactive information extraction with constrained conditional random fields
Trausti Kristjansson, Aron Culotta, Paul A. Viola, Andrew McCallum
pp. 412-418
TLDR
This work applies two extensions to linear-chain CRFs: constrained Viterbi decoding, which finds the optimal field assignments consistent with the fields explicitly specified or corrected by the user, and a mechanism for estimating the confidence of each extracted field, so that low-confidence extractions can be highlighted.
Abstract:
Information extraction methods can be used to automatically "fill in" database forms from unstructured data such as Web documents or email. State-of-the-art methods have achieved low error rates but still invariably make some errors. The goal of an interactive information extraction system is to assist the user in filling in database fields while giving the user confidence in the integrity of the data. The user is presented with an interactive interface that allows both the rapid verification of automatic field assignments and the correction of errors. When there are multiple errors, our system takes user corrections into account and immediately propagates these constraints, so that other fields are often corrected automatically.
Linear-chain conditional random fields (CRFs) have been shown to perform well for information extraction and other language modelling tasks due to their ability to capture arbitrary, overlapping features of the input in a Markov model. We apply this framework with two extensions: a constrained Viterbi decoding, which finds the optimal field assignments consistent with the fields explicitly specified or corrected by the user; and a mechanism for estimating the confidence of each extracted field, so that low-confidence extractions can be highlighted. Both of these mechanisms are incorporated in a novel user interface for form filling that is intuitive and speeds the entry of data, providing a 23% reduction in error due to automated corrections.
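The two extensions can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear-chain model whose log-potentials are given directly as `emit[t][y]` and `trans[y_prev][y]` (hand-picked numbers, not learned CRF weights), and all function names are hypothetical. User corrections are modelled as a dictionary mapping a token position to its required label. For confidence, the sketch assumes one common approach: the ratio of a constrained partition function to the unconstrained one, both computed with the forward algorithm.

```python
import math

NEG_INF = float("-inf")

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-scores."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def allowed(t, y, constraints):
    """A label is allowed unless a constraint pins position t to another label."""
    return t not in constraints or constraints[t] == y

def constrained_viterbi(emit, trans, constraints=None):
    """Best label path that agrees with every constrained position."""
    constraints = constraints or {}
    n, k = len(emit), len(emit[0])
    score = [emit[0][y] if allowed(0, y, constraints) else NEG_INF
             for y in range(k)]
    back = []
    for t in range(1, n):
        new_score, ptr = [], []
        for y in range(k):
            if not allowed(t, y, constraints):
                new_score.append(NEG_INF)  # path through this label is forbidden
                ptr.append(0)
                continue
            best = max(range(k), key=lambda p: score[p] + trans[p][y])
            new_score.append(score[best] + trans[best][y] + emit[t][y])
            ptr.append(best)
        score = new_score
        back.append(ptr)
    y = max(range(k), key=lambda s: score[s])
    path = [y]
    for ptr in reversed(back):  # follow back-pointers to recover the path
        y = ptr[y]
        path.append(y)
    return path[::-1]

def log_partition(emit, trans, constraints=None):
    """Forward algorithm: log-sum of scores over all paths obeying constraints."""
    constraints = constraints or {}
    n, k = len(emit), len(emit[0])
    alpha = [emit[0][y] if allowed(0, y, constraints) else NEG_INF
             for y in range(k)]
    for t in range(1, n):
        alpha = [
            logsumexp([alpha[p] + trans[p][y] for p in range(k)]) + emit[t][y]
            if allowed(t, y, constraints) else NEG_INF
            for y in range(k)
        ]
    return logsumexp(alpha)

def field_confidence(emit, trans, field):
    """Probability mass of all labelings that agree with `field`."""
    return math.exp(log_partition(emit, trans, field) - log_partition(emit, trans))

# Toy data (assumed): 3 tokens, labels 0 = OTHER and 1 = NAME.
emit = [[0.6, 0.0], [0.2, 0.0], [0.0, 0.1]]
trans = [[0.5, -0.5], [-0.5, 0.5]]  # neighbouring tokens prefer the same label

print(constrained_viterbi(emit, trans))          # [0, 0, 0]
print(constrained_viterbi(emit, trans, {2: 1}))  # [1, 1, 1]
print(field_confidence(emit, trans, {2: 0}))     # a value in (0, 1)
```

In this toy example, pinning only the last token to label 1 also flips the first two tokens, which mirrors how a single user correction can propagate to other fields. An interface could use a low `field_confidence` value as the signal to highlight a field for review.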
Citations
Proceedings Article
ArnetMiner: extraction and mining of academic social networks
TL;DR: The architecture and main features of the ArnetMiner system, which aims at extracting and mining academic social networks, are described and a unified modeling approach to simultaneously model topical aspects of papers, authors, and publication venues is proposed.
Proceedings Article
Hidden Conditional Random Fields for Gesture Recognition
TL;DR: This paper derives a discriminative sequence model with a hidden state structure, and demonstrates its utility both in a detection and in a multi-way classification formulation.
Journal Article
Extracting Places and Activities from GPS Traces Using Hierarchical Conditional Random Fields
Lin Liao, Dieter Fox, Henry Kautz
TL;DR: The proposed system is able to robustly estimate a person’s activities using a model that is trained from data collected by other persons, and shows significant improvements over existing techniques.
Report
Reducing labeling effort for structured prediction tasks
Aron Culotta, Andrew McCallum
TL;DR: A new active learning paradigm is proposed which reduces not only how many instances the annotator must label, but also how difficult each instance is to annotate, which can vary widely in structured prediction tasks.
Journal Article
Active learning for logistic regression: an evaluation
Andrew I. Schein, Lyle H. Ungar
TL;DR: The variance reduction method known in experimental design circles as ‘A-optimality’ is re-derived, and comparisons against different variations of the most widely used heuristic schemes are run to discover which methods work best for different classes of problems and why.
References
Journal Article
A tutorial on hidden Markov models and selected applications in speech recognition
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Proceedings Article
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
Proceedings Article
Shallow parsing with conditional random fields
Fei Sha, Fernando Pereira
TL;DR: This work shows how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model.
Proceedings Article
Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons
Andrew McCallum, Wei Li
TL;DR: Conditionally-trained models, such as conditional maximum entropy models, handle inter-dependent features well and have been used for greedy sequence modeling in NLP.