Proceedings Article

Hybrid framework for information extraction for geographical terms in Hindi language texts

Abstract
A hybrid information extraction (IE) framework based on a geographical term detection approach has been developed to extract geographical information from unrestricted Hindi text. The relationship between the extracted geographical entities and the adjacent text is shown graphically, so that information about these entities can be related. The system, a combination of statistically and linguistically motivated techniques, identifies both single and multiple geographical names. The method is applied to Hindi language text, but the approach can be adapted for other languages. The paper presents experiments illustrating the accuracy of the method. The system is at a prototype stage and will be extended to include relation mark-up.
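The abstract does not spell out how the statistical and linguistic techniques are combined, so the following is only a minimal sketch of what such a hybrid detector could look like, not the authors' method. It assumes two hypothetical linguistic cues (a short list of place-name suffixes and a list of postpositions that often follow a location) and one statistical cue (a frequency threshold). The function name `find_geo_candidates`, the cue lists, and the `min_count` parameter are all illustrative assumptions. Multi-word names are not handled here.

```python
from collections import Counter

# Hypothetical cue lists, chosen for illustration only (not taken from the paper).
PLACE_SUFFIXES = ("पुर", "नगर", "बाद")      # common endings of Indian place names
LOCATIVE_POSTPOSITIONS = {"में", "से"}       # "in", "from"


def find_geo_candidates(tokens, min_count=2):
    """Return candidate geographical terms in order of first appearance.

    A token is kept if it ends with a place-name suffix (linguistic cue),
    or if it directly precedes a locative postposition AND occurs at least
    `min_count` times in the text (linguistic cue plus statistical cue).
    """
    counts = Counter(tokens)
    candidates = []
    for i, tok in enumerate(tokens):
        has_suffix = tok.endswith(PLACE_SUFFIXES)
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
        before_locative = next_tok in LOCATIVE_POSTPOSITIONS
        frequent = counts[tok] >= min_count
        if has_suffix or (before_locative and frequent):
            if tok not in candidates:
                candidates.append(tok)
    return candidates


# Whitespace tokenization is a simplifying assumption for this toy input.
text = "राम कानपुर में रहता है । सीता दिल्ली में रहती है । दिल्ली में भीड़ है ।"
print(find_geo_candidates(text.split()))
```

In this toy input, one name is picked up by its suffix alone and the other only because it is both frequent and followed by a locative postposition, which is the kind of complementarity a hybrid approach relies on. A real system would need a proper Hindi tokenizer and far richer cues.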

