Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.

Papers

Journal Article•DOI•

Interactive text retrieval based on document similarities

Aljoscha Klose¹, Andreas Nürnberger¹, Rudolf Kruse¹, G. K. Hartmann², M. L. Richards²

Otto-von-Guericke University Magdeburg¹, Max Planck Society²

01 Jan 2000-Physics and Chemistry of The Earth Part A-solid Earth and Geodesy

TL;DR: A prototypical implementation of a software tool for document retrieval which groups/arranges (pre-processed) documents based on a similarity measure to help a user to navigate through similar documents.

Abstract: In this article we present a prototypical implementation of a software tool for document retrieval which groups/arranges (pre-processed) documents based on a similarity measure. The prototype was developed based on self-organising maps to realise interactive associative search and visual exploration of document databases. This helps a user to navigate through similar documents. The navigation, especially the search for the first appropriate document, is supported by conventional keyword search methods. The usability of the presented approach is shown by a sample search.
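The grouping step described in this abstract rests on a document similarity measure. The following is a minimal sketch of one such measure, assuming bag-of-words term counts and cosine similarity; it does not implement the self-organising map the authors use, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Lower-case bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(doc_id, docs, top_k=2):
    """Rank the other documents by similarity to docs[doc_id]."""
    vectors = {name: term_vector(text) for name, text in docs.items()}
    target = vectors[doc_id]
    scored = [
        (cosine_similarity(target, vec), name)
        for name, vec in vectors.items()
        if name != doc_id
    ]
    # Highest similarity first; ties broken by document name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


# Hypothetical sample collection, for illustration only.
docs = {
    "d1": "ozone measurements in the upper atmosphere",
    "d2": "satellite measurements of atmosphere ozone",
    "d3": "keyword search in document databases",
}
print(most_similar("d1", docs))
```

A keyword search could supply the first document, after which a ranking like this one offers neighbouring documents to browse.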

41 citations

Journal Article•DOI•

Fact Retrieval and Deductive Question-Answering Information Retrieval Systems

William S. Cooper¹

IBM¹

01 Apr 1964-Journal of the ACM

TL;DR: The problem of developing systems of logical inference for natural languages is discussed, and an example of such an analysis of a sublanguage of English is presented.

Abstract: Information Retrieval systems may be classified either as Document Retrieval systems or Fact Retrieval systems. It is contended that at least some of the latter will require the capability for performing logical deductions among natural language sentences. The problem of developing systems of logical inference for natural languages is discussed, and an example of such an analysis of a sublanguage of English is presented. An experimental Fact Retrieval system which incorporates this analysis has been programmed for the IBM 7090 computer, and its main algorithms are stated.

41 citations

Journal Article•DOI•

Within-Document Retrieval: A User-Centred Evaluation of Relevance Profiling

David J. Harper¹, Ivan Koychev¹, Yixing Sun¹, Iain Pirie¹

Robert Gordon University¹

01 Sep 2004

TL;DR: The study indicates that ProfileSkim was at least as effective as FindSkim in identifying relevant pages, as measured by traditional information retrieval measures, and there is some evidence that ProfileSkim is a precision-enhancing tool.

Abstract: We present a user-centred, task-oriented, comparative evaluation of two within-document retrieval tools. ProfileSkim computes a relevance profile for a document with respect to a query, and presents the profile as an interactive bar graph. FindSkim provides similar functionality to the web browser “Find” command. A novel simulated work task was devised, where participants are asked to identify (index) relevant pages of an electronic book, given topics from the existing book index. The original book index provides the ground truth, against which the indexing results of the participants can be compared. We confirmed a major hypothesis, namely ProfileSkim proved significantly more efficient than FindSkim, as measured by time for task. The study indicates that ProfileSkim was at least as effective as FindSkim in identifying relevant pages, as measured by traditional information retrieval measures, and there is some evidence that ProfileSkim is a precision-enhancing tool. Based on qualitative data from questionnaires, we also provide strong evidence to support our conjecture that the participants would be more satisfied when using ProfileSkim than FindSkim. The experimental study confirmed the potential of relevance profiling for improving within-document retrieval. Relevance profiling should prove highly beneficial for users trying to identify relevant information within long documents.
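The idea of a relevance profile, one score per part of a document with respect to a query, can be sketched as follows. This is a minimal illustration under stated assumptions, not ProfileSkim's actual scoring: the score here is simply the fraction of a page's tokens that are query terms, and the function names and sample pages are hypothetical.

```python
def relevance_profile(pages, query):
    """Return one score per page: the fraction of page tokens that are query terms."""
    query_terms = set(query.lower().split())
    profile = []
    for page in pages:
        tokens = page.lower().split()
        hits = sum(1 for token in tokens if token in query_terms)
        profile.append(hits / len(tokens) if tokens else 0.0)
    return profile


def render_bars(profile, width=20):
    """Render the profile as a plain-text bar graph, one line per page."""
    peak = max(profile, default=0.0)
    lines = []
    for number, score in enumerate(profile, start=1):
        # Scale each bar relative to the highest-scoring page.
        bar = "#" * (round(width * score / peak) if peak > 0 else 0)
        lines.append(f"page {number:>3} | {bar}")
    return "\n".join(lines)


# Hypothetical three-page document, for illustration only.
pages = [
    "introduction to the book and its structure",
    "relevance feedback improves retrieval of relevant documents",
    "an index lists topics and page numbers",
]
profile = relevance_profile(pages, "relevance retrieval")
print(render_bars(profile))
```

A reader skimming the bars can jump straight to the pages with the longest bars, which is the kind of navigation the evaluation compares against a plain “Find” command.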

41 citations

Proceedings Article•DOI•

Ranking document clusters using markov random fields

Fiana Raiber¹, Oren Kurland¹

Technion – Israel Institute of Technology¹

28 Jul 2013

TL;DR: This work presents a novel cluster ranking approach that utilizes Markov Random Fields (MRFs), and shows that it significantly outperforms state-of-the-art cluster ranking methods and can be used to improve the performance of results-diversification methods.

Abstract: An important challenge in cluster-based document retrieval is ranking document clusters by their relevance to the query. We present a novel cluster ranking approach that utilizes Markov Random Fields (MRFs). MRFs enable the integration of various types of cluster-relevance evidence; e.g., the query-similarity values of the cluster's documents and query-independent measures of the cluster. We use our method to re-rank an initially retrieved document list by ranking clusters that are created from the documents most highly ranked in the list. The resultant retrieval effectiveness is substantially better than that of the initial list for several lists that are produced by effective retrieval methods. Furthermore, our cluster ranking approach significantly outperforms state-of-the-art cluster ranking methods. We also show that our method can be used to improve the performance of (state-of-the-art) results-diversification methods.
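The re-ranking pipeline in this abstract (cluster the top-ranked documents, rank the clusters, then reorder the list) can be sketched as follows. This is a simplified illustration, not the MRF model: it scores each cluster by a single piece of evidence, the geometric mean of its documents' initial query-similarity scores, and assumes those scores are positive. All names and sample values are hypothetical.

```python
import math


def cluster_score(cluster, doc_scores):
    """Geometric mean of the members' initial query-similarity scores (must be > 0)."""
    logs = [math.log(doc_scores[doc]) for doc in cluster]
    return math.exp(sum(logs) / len(logs))


def rerank_by_clusters(initial_ranking, clusters):
    """Reorder documents so that members of higher-scoring clusters come first."""
    doc_scores = dict(initial_ranking)
    ranked_clusters = sorted(
        clusters, key=lambda c: cluster_score(c, doc_scores), reverse=True
    )
    reranked, seen = [], set()
    for cluster in ranked_clusters:
        # Within a cluster, keep the order given by the initial scores.
        for doc in sorted(cluster, key=lambda d: doc_scores[d], reverse=True):
            if doc not in seen:  # clusters may overlap
                seen.add(doc)
                reranked.append(doc)
    # Documents that belong to no cluster keep their initial relative order.
    for doc, _ in initial_ranking:
        if doc not in seen:
            reranked.append(doc)
    return reranked


# Hypothetical initial list of (document, query-similarity score) pairs.
initial_ranking = [("d1", 0.9), ("d2", 0.5), ("d3", 0.45), ("d4", 0.4)]
clusters = [["d1", "d4"], ["d2", "d3"]]
print(rerank_by_clusters(initial_ranking, clusters))
```

In this example the cluster containing d1 scores higher, so d4 is promoted above d2 and d3 even though its own initial score is the lowest.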

41 citations

English to Korean Statistical Transliteration for Information Retrieval

Jae Sung Lee¹

KAIST¹

01 Jan 2008

TL;DR: A language-independent Statistical Transliteration Model (STM) that learns rules automatically from word-aligned pairs in order to generate transliteration variations is presented, and a hybrid method that is more effective at generating various transliterations, and consequently at retrieving more relevant documents, is proposed.

Abstract: In Korean technical documents, many English words are transliterated into Korean in various ways. Most of these words are technical terms and proper nouns that are frequently used as query terms in information retrieval systems. As communication with foreigners increases, an automatic transliteration system is needed to find the various transliterations for cross-lingual information systems, especially for the proper nouns and technical terms which are not registered in the dictionary. In this paper, we present a language-independent Statistical Transliteration Model (STM) that learns rules automatically from word-aligned pairs in order to generate transliteration variations. For the transliteration from English to Korean, we compared two methods based on STM: the pivot method and the direct method. In the pivot method, the transliteration is done in two steps: converting English words into pronunciation symbols by using the STM and then converting these symbols into Korean words by using the Korean standard conversion rule. In the direct method, English words are directly converted to Korean words by using the STM without intermediate steps. After comparing the performance of the two methods, we propose a hybrid method that is more effective at generating various transliterations and, consequently, at retrieving more relevant documents.

41 citations

Metrics

Papers: 6,866
Citations: 224,605

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	39
2021	107
2020	130
2019	144
2018	111