Proceedings Article
Hybrid framework for information extraction for geographical terms in Hindi language texts
Kamlesh Dutta, Nupur Prakash, Saroj Kaushik +2 more
- pp 577-581
TLDR
A hybrid information extraction (IE) framework based on a geographical term detection approach has been developed to extract geographical information from unrestricted Hindi text. The relationship between the extracted geographical entities and the adjacent text is shown graphically so that information about these entities can be related.
Abstract:
A hybrid information extraction (IE) framework based on a geographical term detection approach has been developed to extract geographical information from unrestricted Hindi text. The relationship between the extracted geographical entities and the adjacent text is shown graphically so that information about these entities can be related. The system, a combination of statistically and linguistically motivated techniques, identifies single geographical names as well as multiple geographical names. The method is applied to Hindi text, but the approach can be adapted to other languages. The paper presents some experiments illustrating the accuracy of the method. The system is at a prototype stage and will be extended to include relation mark-up as well.
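The abstract does not spell out how the statistical and linguistic components are combined, so the following is only a minimal sketch of one way such a hybrid could look, not the authors' method. It assumes a small hand-made gazetteer for the linguistic step (longest-match lookup, so multi-word names are found as a unit) and a simple count of how often an unknown word precedes a locative postposition for the statistical step. The names `GAZETTEER`, `LOCATIVE_CUES`, `gazetteer_matches`, and `cue_candidates`, as well as the thresholds and the sample text, are all hypothetical.

```python
from collections import Counter

# Hypothetical resources, for illustration only (not from the paper).
GAZETTEER = {("दिल्ली",), ("उत्तर", "प्रदेश")}   # Delhi, Uttar Pradesh
LOCATIVE_CUES = {"में", "से"}                    # postpositions "in", "from"


def tokenize(text):
    """Split on whitespace after replacing the danda (Hindi full stop)."""
    return text.replace("।", " ").split()


def gazetteer_matches(tokens, gazetteer, max_len=3):
    """Linguistic step: longest-match lookup, so multi-word names win."""
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in gazetteer:
                found.append((i, " ".join(tokens[i:i + n])))
                i += n
                break
        else:
            # No n-gram starting at position i is in the gazetteer.
            i += 1
    return found


def cue_candidates(tokens, known, cues, min_ratio=0.5, min_count=2):
    """Statistical step: unknown words that mostly precede a locative cue."""
    total = Counter(tokens)
    before_cue = Counter(t for t, nxt in zip(tokens, tokens[1:]) if nxt in cues)
    return sorted(
        t for t, c in before_cue.items()
        if t not in known and t not in cues
        and c >= min_count and c / total[t] >= min_ratio
    )


text = "राम दिल्ली में रहता है। वह उत्तर प्रदेश से आया है। शिमला में ठंड है। शिमला में बर्फ है।"
tokens = tokenize(text)
names = gazetteer_matches(tokens, GAZETTEER)
known = {word for _, name in names for word in name.split()}
candidates = cue_candidates(tokens, known, LOCATIVE_CUES)

print(names)       # gazetteer hits with their token positions
print(candidates)  # unknown words flagged by the postposition statistic
```

In this toy run the gazetteer step finds the single-word and the two-word name, while the cue statistic proposes one further word that is absent from the gazetteer. A real system would need far larger resources and a proper evaluation, which the paper reports only in summary.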
Citations
Journal Article
Anaphora Resolution in Hindi: Issues and Challenges
TL;DR: Pronoun resolution in the context of EHMT (English-Hindi Machine Translation) systems is demonstrated to substantiate the need for anaphora resolution in NLP applications.
Book Chapter
A Practical Approach to Extracting Names of Geographical Entities and Their Relations from the Web
Cungen Cao, Shi Wang, Lin Jiang +2 more
TL;DR: Experimental results show that the OMKast-Googling system performs satisfactorily in both entity name extraction and relation extraction.
References
Journal Article
Information extraction
Jim Cowie, Wendy G. Lehnert +1 more
TL;DR: Information extraction (IE), a relatively new development, is the subject of this article; it can transform the raw material, refining and reducing it to a germ of the original text.
Journal Article
Automatic recognition of multi-word terms: the C-value/NC-value method
TL;DR: This paper presents a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora. The C-value/NC-value method enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term.
Book
Information Extraction
TL;DR: This paper discusses attempts to derive templates, knowledge structures, and lexicons directly from corpora, including the recent LE project ECRAN, which attempted to tune existing lexicons to new corpora.
Proceedings Article
Termight: Identifying and Translating Technical Terminology
Ido Dagan, Kenneth Church +1 more
TL;DR: A semi-automatic tool that helps professional translators and terminologists identify technical terms and their translations using part-of-speech tagging and word-alignment programs in an interface designed to minimize keystrokes is proposed.
Journal Article
The interaction of knowledge sources in word sense disambiguation
Mark Stevenson, Yorick Wilks +1 more
TL;DR: This work presents a sense tagger which uses several knowledge sources and attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words.