Open Access Journal Article

Translingual information retrieval: learning from bilingual corpora

TL;DR
The results show that using bilingual corpora for automated extraction of term equivalences in context outperforms dictionary-based methods and performs comparably to other statistical corpus-based methods.
About
This article was published in Artificial Intelligence on 1998-08-01 and is currently open access. It has received 107 citations to date. The article focuses on the topics Relevance (information retrieval) and Generalized vector space model.


Citations
Journal Article

Ontology learning and its application to automated terminology translation

TL;DR: The OntoLearn system is an infrastructure for automated ontology learning from domain text that uses natural language processing and machine learning techniques, and is part of a more general ontology engineering architecture.
Proceedings Article

Cross-language information retrieval based on parallel texts and automatic mining of parallel texts from the Web

TL;DR: It is shown that a probabilistic model can obtain performance close to that of an MT system, and the possibility of automatically gathering parallel texts from the Web to construct a reasonable training corpus is investigated.
Journal Article

Cross-language plagiarism detection

TL;DR: The results of the evaluation indicate that CL-CNG, despite its simple approach, is the best choice to rank and compare texts across languages if they are syntactically related.

Automatic Cross-Language Retrieval Using Latent Semantic Indexing

TL;DR: A method for fully automated cross-language document retrieval is described in which no query translation is required; this automatic method performs comparably to a retrieval method based on machine translation (MT-LSI).
Proceedings Article

An empirical study of required dimensionality for large-scale latent semantic indexing applications

TL;DR: The results suggest that there is something of an 'island of stability' in the k = 300 to 500 range, and indicate that there is relatively little room to employ k values outside of this range without incurring significant distortions in at least some term-term correlations.
References
Journal Article

Indexing by Latent Semantic Analysis

TL;DR: A new method for automatic indexing and retrieval is described that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Journal Article

Improving Retrieval Performance by Relevance Feedback

TL;DR: Relevance feedback is an automatic process, introduced over 20 years ago, designed to produce improved query formulations following an initial retrieval operation; evaluation data are included to demonstrate the effectiveness of the various methods.
Proceedings Article

OHSUMED: an interactive retrieval evaluation and new large test collection for research

TL;DR: A series of information retrieval experiments was carried out with a computer installed in a medical practice setting for relatively inexperienced physician end-users using a commercial MEDLINE product based on the vector space model, finding that these physicians searched just as effectively as more experienced searchers using Boolean searching.
Proceedings Article

Automatic Query Expansion Using SMART: TREC 3

TL;DR: This work continues the work in TREC 3, performing runs in the routing, ad-hoc, and foreign language environments, with a major focus on massive query expansion, adding from 300 to 530 terms to each query.