Nymble: a High-Performance Learning Name-finder

doi:10.3115/974557.974586

Open Access Proceedings Article

Daniel M. Bikel, +3 more

- pp 194-201

Abstract:

This paper presents a statistical, learned approach to finding names and other nonrecursive entities in text (as per the MUC-6 definition of the NE task), using a variant of the standard hidden Markov model. We present our justification for the problem and our approach, a detailed discussion of the model itself and finally the successful results of this new approach.

Citations

Journal Article

A theory of learning from different domains

Shai Ben-David, +5 more

- 01 May 2010 -

Machine Learning

TL;DR: Introduces a classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains, and shows how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class.

Journal Article

A survey of named entity recognition and classification

David Nadeau, +1 more

- 01 Jan 2007 -

Lingvisticae Investigationes

TL;DR: Observations about languages, named entity types, domains and textual genres studied in the literature, along with other critical aspects of NERC such as features and evaluation methods, are reported.

Journal Article

Head-Driven Statistical Models for Natural Language Parsing

Michael Collins

- 01 Dec 2003 -

Computational Linguistics

TL;DR: Three statistical models for natural language parsing are described, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree.

Book

The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data

Ronen Feldman, +1 more

TL;DR: Providing an in-depth examination of core text mining and link detection algorithms and operations, this text examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches.

Journal Article

Automating the Construction of Internet Portals with Machine Learning

Andrew McCallum, +3 more

- 21 Jul 2000 -

Information Retrieval

TL;DR: New research in reinforcement learning, information extraction and text classification that enables efficient spidering, the identification of informative text segments, and the population of topic hierarchies is described.

References

Book

Elements of information theory

Thomas M. Cover, +1 more

TL;DR: The authors examine the role of entropy, inequalities, and randomness in the design and construction of codes.

Journal Article

Coping with ambiguity and unknown words through probabilistic models

Ralph Weischedel, +4 more

- 01 Jun 1993 -

Computational Linguistics

TL;DR: A new natural language system (PLUM) is constructed for extracting data from text, e.g., newswire text, based on results of experiments in predicting parts of speech of highly ambiguous words and in predicting the intended interpretation of an utterance when more than one interpretation satisfies all known syntactic and semantic constraints.

Proceedings Article

SRI International FASTUS system: MUC-6 test results and analysis

Douglas E. Appelt, +7 more

TL;DR: SRI International participated in the MUC-6 evaluation using the latest version of its FASTUS system, a cascade of finite-state transducers, each providing an additional level of analysis of the input, followed by merging of the final results.

Proceedings Article

MITRE: description of the Alembic system used for MUC-6

John S. Aberdeen, +5 more

TL;DR: As with several other veteran MUC participants, MITRE's Alembic system has undergone a major transformation in the past two years.

Proceedings Article

BBN: description of the PLUM system as used for MUC-4

Damaris Ayuso, +5 more

TL;DR: BBN's PLUM system (Probabilistic Language Understanding Model) was developed as part of a DARPA-funded research effort on integrating probabilistic language models with more traditional linguistic techniques.

Related Papers (5)

Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

A survey of named entity recognition and classification

Message Understanding Conference-6: a brief history

Unsupervised Models for Named Entity Classification