Open Access Proceedings Article

Automatic acquisition of hyponyms from large text corpora

Abstract
We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidance of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identify a set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest. We describe a method for discovering these patterns and suggest that other lexical relations will also be acquirable in this way. A subset of the acquisition algorithm is implemented and the results are used to augment and critique the structure of a large hand-built thesaurus. Extensions and applications to areas such as information retrieval are suggested.
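
To make the idea of a lexico-syntactic pattern concrete, here is a minimal sketch in Python. The abstract does not list the patterns themselves, so the "X such as Y, Z, and W" construction used here is an assumption chosen for illustration. The names `SUCH_AS` and `extract_hyponyms` are hypothetical, and noun phrases are approximated as single words, where a real system would rely on a part-of-speech tagger or chunker.

```python
import re

# Illustrative assumption: one pattern of the form
#   <hypernym> such as <hyponym>, <hyponym>, (and|or) <hyponym>
# Noun phrases are approximated as single words (\w+).
SUCH_AS = re.compile(
    r"\b(?P<hyper>\w+),? such as "
    r"(?P<hypos>\w+"                   # first hyponym
    r"(?:, (?!(?:and|or)\b)\w+)*"      # further comma-separated hyponyms
    r"(?:,? (?:and|or) \w+)?)"         # optional final conjunct
)


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs matched by the single pattern above."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group("hyper")
        # Split the matched list on ", ", " and ", " or ", ", and ", ", or ".
        for hyponym in re.split(r",? (?:and|or) |, ", match.group("hypos")):
            pairs.append((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "She studies instruments such as violins, cellos, and flutes."
    for hyponym, hypernym in extract_hyponyms(sentence):
        print(f"hyponym({hyponym}, {hypernym})")
```

On the sample sentence this yields three pairs, each relating one of `violins`, `cellos`, `flutes` to `instruments`. A single surface pattern like this needs no pre-encoded knowledge, which matches the first goal stated in the abstract, but it will also produce false matches that a fuller implementation would have to filter.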

