Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.


Papers
Proceedings Article
01 Jan 1999
TL;DR: The NIST Text REtrieval Conference (TREC) SDR Track has provided an infrastructure for the development and evaluation of SDR technology and a common forum for the exchange of knowledge between the speech recognition and information retrieval research communities.
Abstract: This paper describes work within the NIST Text REtrieval Conference (TREC) over the last three years in designing and implementing evaluations of Spoken Document Retrieval (SDR) technology within a broadcast news domain. SDR involves the search and retrieval of excerpts from spoken audio recordings using a combination of automatic speech recognition and information retrieval technologies. The TREC SDR Track has provided an infrastructure for the development and evaluation of SDR technology and a common forum for the exchange of knowledge between the speech recognition and information retrieval research communities. The SDR Track can be declared a success in that it has provided objective, demonstrable proof that this technology can be successfully applied to realistic audio collections using a combination of existing technologies and that it can be objectively evaluated. The design and implementation of each of the SDR evaluations are presented and the results are summarized. Plans for the 2000 TREC SDR Track are presented and thoughts about how the track might evolve are discussed.

97 citations

01 Jan 1991
TL;DR: An overview of a retrieval model based on probabilistic inference networks is given, and simplifications that allow networks to be built and evaluated efficiently, even with very large collections, are described.
Abstract: Probabilistic inference techniques have been shown to significantly improve retrieval performance when compared to conventional retrieval models, but their use can be prohibitively expensive for large collections. We give an overview of a retrieval model that is based on probabilistic inference networks and describe simplifications that allow networks to be built and evaluated efficiently, even with very large collections.

97 citations

Journal Article
TL;DR: This study recommends query expansion using retrieval feedback for adding MeSH search terms to a user's initial query.
Abstract: This paper evaluates the retrieval effectiveness of query expansion strategies on a MEDLINE test collection using Cornell University's SMART retrieval system. Three expansion strategies are tested on their ability to identify appropriate MeSH terms for user queries: expansion using an inter-field statistical thesaurus, expansion via retrieval feedback and expansion using a combined approach. These expansion strategies do not require prior relevance decisions. The study compares retrieval effectiveness using the original unexpanded and the alternative expanded user queries on a collection of 75 queries and 2334 MEDLINE citations. Retrieval effectiveness is assessed using eleven point average precision scores (11-AvgP). The combination of expansion using the thesaurus followed by retrieval feedback gives the best improvement of 17% over a baseline performance of 0.5169 11-AvgP. However, this improvement is almost identical to that achieved by expansion via retrieval feedback (16.4%). Query expansion using the inter-field thesaurus gives a significant but lower performance improvement (9.9%) over the same baseline. This study recommends query expansion using retrieval feedback for adding MeSH search terms to a user's initial query.
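The eleven-point average precision measure named in the abstract can be illustrated with a minimal sketch. This is the commonly used interpolated definition, not necessarily the exact computation performed by the SMART system in the study; the function names and the toy document IDs are assumptions made for illustration only.

```python
from typing import List, Set

# The 11 standard recall levels: 0.0, 0.1, ..., 1.0
RECALL_LEVELS = [i / 10 for i in range(11)]


def eleven_point_avg_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average interpolated precision over the 11 standard recall levels.

    `ranking` is the ordered list of retrieved document IDs for one query;
    `relevant` is the set of IDs judged relevant (hypothetical inputs).
    """
    if not relevant:
        raise ValueError("query has no relevant documents")

    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))

    total = 0.0
    for level in RECALL_LEVELS:
        # Interpolated precision: the best precision observed at any
        # recall greater than or equal to this level (0 if none reached).
        reachable = [p for r, p in points if r >= level - 1e-12]
        total += max(reachable, default=0.0)
    return total / len(RECALL_LEVELS)


def relative_improvement(baseline: float, score: float) -> float:
    """Fractional gain of `score` over `baseline` (0.17 means 17%)."""
    return (score - baseline) / baseline
```

Under this definition, a per-collection score such as the 0.5169 baseline would be the mean of the per-query values, and the reported percentages correspond to `relative_improvement` applied to the baseline and the expanded-query score.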

97 citations

Journal Article
01 Nov 2007
TL;DR: The system framework and some key techniques of content-based 3D model retrieval are identified and explained, including canonical coordinate normalization and preprocessing, feature extraction, similarity match, query representation and user interface, and performance evaluation.
Abstract: As the number of available 3D models grows, there is an increasing need to index and retrieve them according to their contents. This paper provides a survey of the up-to-date methods for content-based 3D model retrieval. First, the new challenges encountered in 3D model retrieval are discussed. Then, the system framework and some key techniques of content-based 3D model retrieval are identified and explained, including canonical coordinate normalization and preprocessing, feature extraction, similarity match, query representation and user interface, and performance evaluation. In particular, similarity measures using semantic clues and machine learning methods, as well as retrieval approaches using nonshape features, are given adequate recognition as improvements and complements for traditional shape-matching techniques. Typical 3D model retrieval systems and search engines are also listed and compared. Finally, future research directions are indicated, and an extensive bibliography is provided.

97 citations

Journal Article
Roy Davies
TL;DR: This paper reviews previous work on producing knowledge by information retrieval or classification and describes techniques by which hidden knowledge may be retrieved, e.g. serendipity in browsing, use of appropriate search strategies and, possibly in the future, methods based on Farradane's relational indexing or artificial intelligence.
Abstract: Knowledge can be created by drawing inferences from what is already known. Often some of the requisite information is lacking and has to be gathered by whatever research techniques are appropriate, e.g. experiments, surveys etc. Even if the information has all been published already, unless it is retrieved no inferences will be drawn from it and consequently there will exist some knowledge that is implicit in the literature and yet is not known by anyone. This ‘undiscovered public knowledge’, as it is termed by Swanson, may exist in the following forms: (i) a hidden refutation or qualification of a hypothesis; (ii) an undrawn conclusion from two or more premises; (iii) the cumulative evidence of weak, independent tests; (iv) solutions to analogous problems; (v) hidden correlations between factors. Methods of classification may also play a direct role in the creation of original knowledge. Novel solutions to problems may be discovered by generating different combinations of the basic features of the solutions, as is done in morphological analysis. Alternatively a natural classification may identify gaps in existing knowledge. This paper reviews previous work on producing knowledge by information retrieval or classification and describes techniques by which hidden knowledge may be retrieved, e.g. serendipity in browsing, use of appropriate search strategies and, possibly in the future, methods based on Farradane's relational indexing or artificial intelligence.

97 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 81% related
Metadata: 43.9K papers, 642.7K citations, 79% related
Recommender system: 27.2K papers, 598K citations, 79% related
Ontology (information science): 57K papers, 869.1K citations, 78% related
Natural language: 31.1K papers, 806.8K citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    9
2022    39
2021    107
2020    130
2019    144
2018    111