Automatic acquisition of hyponyms from large text corpora
Marti A. Hearst
pp. 539-545
TL;DR: The paper identifies a set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest.
Abstract:
We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidance of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identify a set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest. We describe a method for discovering these patterns and suggest that other lexical relations will also be acquirable in this way. A subset of the acquisition algorithm is implemented and the results are used to augment and critique the structure of a large hand-built thesaurus. Extensions and applications to areas such as information retrieval are suggested.
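As a rough illustration of what matching a lexico-syntactic pattern can look like, the sketch below uses the construction "X such as A, B and C", in which X is read as the hypernym and A, B, C as its hyponyms. The abstract does not list the patterns themselves, so this pattern is chosen here only as an example. The names `WORD`, `SUCH_AS`, and `extract_hyponyms` are hypothetical, and approximating a noun phrase by a single word is a simplifying assumption; a real system would more plausibly rely on a part-of-speech tagger or chunker.

```python
import re

# Assumption: a noun phrase is approximated by one word (letters and hyphens).
WORD = r"[A-Za-z-]+"

# "<hypernym> such as <w1>, <w2> (and|or) <w3>"; the lookahead keeps the
# conjunction itself from being consumed as a list item.
SUCH_AS = re.compile(
    rf"(?P<hyper>{WORD}),? such as "
    rf"(?P<hypos>{WORD}(?:, (?!and |or ){WORD})*(?:,? (?:and|or) {WORD})?)"
)


def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hyper")
        # Split the matched list on ", ", " and ", " or ", ", and ", ", or ".
        hyponyms = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((h, hypernym) for h in hyponyms if h)
    return pairs


if __name__ == "__main__":
    text = "He plays instruments such as guitars, violins and drums."
    for hyponym, hypernym in extract_hyponyms(text):
        print(f"{hyponym} -> {hypernym}")
```

Because the pattern is purely surface-level, it needs no pre-encoded knowledge, which matches the first goal stated in the abstract; the cost is that single-word matching will truncate multi-word noun phrases.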
Citations
Proceedings Article
Distant supervision for relation extraction without labeled data
TL;DR: This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Journal Article
A survey of named entity recognition and classification
David Nadeau, Satoshi Sekine
TL;DR: Observations about languages, named entity types, domains and textual genres studied in the literature, along with other critical aspects of NERC such as features and evaluation methods, are reported.
Proceedings Article
Audio Set: An ontology and human-labeled dataset for audio events
Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter
TL;DR: The creation of Audio Set is described, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
Journal Article
Word sense disambiguation: A survey
TL;DR: This work introduces the reader to the motivations for solving the ambiguity of words and provides a description of the task, and overviews supervised, unsupervised, and knowledge-based approaches.
Book
Ontology Learning for the Semantic Web
Alexander Maedche, Steffen Staab
TL;DR: The authors present an ontology learning framework that extends typical ontology engineering environments by using semiautomatic ontology construction tools and encompasses ontology import, extraction, pruning, refinement and evaluation.
References
Journal Article
Introduction to WordNet: An On-line Lexical Database
TL;DR: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list.
Journal Article
Lexical cohesion computed by thesaural relations as an indicator of the structure of text
Jane Morris, Graeme Hirst
TL;DR: Since the lexical chains are computable, and exist in non-domain-specific text, they provide a valuable indicator of text structure, and provide a semantic context for interpreting words, concepts, and sentences.
Proceedings Article
A Practical Part-of-Speech Tagger
TL;DR: An implementation of a part-of-speech tagger based on a hidden Markov model is presented; it enables robust and accurate tagging with few resource requirements, and its accuracy exceeds 96%.
Proceedings Article
Noun classification from predicate-argument structures
TL;DR: The resulting quasi-semantic classification of nouns demonstrates the plausibility of the distributional hypothesis, and has potential application to a variety of tasks, including automatic indexing, resolving nominal compounds, and determining the scope of modification.
Book Chapter
Providing machine tractable dictionary tools
TL;DR: This paper discusses three different but related large-scale computational methods for transforming MRDs into MTDs, using the Longman Dictionary of Contemporary English (LDOCE); the methods require some handcoding of initial information but are largely automatic.