Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.


Papers
Proceedings Article
01 Jan 1997
TL;DR: Evidence is presented that speech-driven IR can be effective, although with a reduced precision, and it is found that longer spoken queries produce higher precision retrieval than shorter queries.
Abstract: We report the results of three experiments using the errorful output of a large vocabulary continuous speech recognition (LVCSR) system as the input to a statistical information retrieval (IR) system. Our goal is to allow a user to speak, rather than type, query terms into an IR engine and still obtain relevant documents. The purpose of these experiments is to test whether IR systems are robust to errors in the query terms introduced by the speech recognizer. If the correctly recognized words in the search query outweigh the misinformation from the incorrectly recognized words, the relevant documents will still be retrieved. This paper presents evidence that speech-driven IR can be effective, although with a reduced precision. We also find that longer spoken queries produce higher precision retrieval than shorter queries. For queries containing many (50-60) search terms and a recognizer word error rate (WER) of 27.9%, the precision at 30 documents retrieved is degraded by only 11.1%. For roughly the same WER, however, we find that queries shorter than 10-15 words suffer more than a 30% loss of precision.

64 citations
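
The two measures quoted in the abstract above, recognizer word error rate (WER) and precision at a fixed retrieval cutoff, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, WER is computed as word-level edit distance divided by reference length, and the loss of precision is assumed to be measured relative to the typed-query baseline, which the abstract does not state explicitly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Dynamic-programming row for the Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def precision_at_k(ranked_ids: list, relevant_ids: set, k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def relative_degradation(baseline: float, degraded: float) -> float:
    """Relative loss of precision versus the baseline (assumed definition)."""
    return (baseline - degraded) / baseline


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    wer = word_error_rate("find documents about speech retrieval",
                          "find document about speech retrieval")
    typed = precision_at_k(["d1", "d2", "d3", "d4"], {"d1", "d3"}, k=4)
    spoken = precision_at_k(["d1", "d5", "d6", "d7"], {"d1", "d3"}, k=4)
    print(wer, typed, spoken, relative_degradation(typed, spoken))
```

With the paper's cutoff, `k` would be 30; the toy call uses `k=4` only to keep the example small.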

Journal Article
TL;DR: This article summarizes the evaluation studies that have been done with SAPHIRE, highlighting the lessons learned and laying out the challenges ahead to all medical information retrieval efforts.
Abstract: Information retrieval systems are being used increasingly in biomedical settings, but many problems still exist in indexing, retrieval, and evaluation. The SAPHIRE Project was undertaken to seek solutions for these problems. This article summarizes the evaluation studies that have been done with SAPHIRE, highlighting the lessons learned and laying out the challenges ahead to all medical information retrieval efforts. © 1995 John Wiley & Sons, Inc.

64 citations

Journal Article
TL;DR: A model for automated information retrieval in which questions posed by clinical users are analyzed to establish common syntactic and semantic patterns that are used to develop a set of general-purpose questions called generic queries is described.
Abstract: This paper describes a model for automated information retrieval in which questions posed by clinical users are analyzed to establish common syntactic and semantic patterns. The patterns are used to develop a set of general-purpose questions called generic queries. These generic queries are used in responding to specific clinical information needs. Users select generic queries in one of two ways. The user may type in questions, which are then analyzed, using natural language processing techniques, to identify the most relevant generic query; or the user may indicate patient data of interest and then pick one of several potentially relevant questions. Once the query and medical concepts have been determined, an information source is selected automatically, a retrieval strategy is composed and executed, and the results are sorted and filtered for presentation to the user. This work makes extensive use of the National Library of Medicine's Unified Medical Language System (UMLS): medical concepts are derived from the Metathesaurus, medical queries are based on semantic relations drawn from the UMLS Semantic Network, and automated source selection makes use of the Information Sources Map. The paper describes research currently under way to implement this model and reports on experience and results to date.

64 citations

Journal Article
TL;DR: This experimental system features flexible document retrieval, a distributed architecture, and the capacity to store many very large documents.
Abstract: New technology is changing the way we store documents. This experimental system features flexible document retrieval, a distributed architecture, and the capacity to store many very large documents.

64 citations

Journal Article
TL;DR: An iterative model of retrieval evaluation is proposed, starting with the use of topical relevance to ensure documents on the subject can be retrieved, followed by the use of situational relevance to show the user can interact positively with the system.
Abstract: The traditional notion of topical relevance has allowed much useful work to be done in the evaluation of retrieval systems, but has limitations for complete assessment of retrieval systems. While topical relevance can be effective in evaluating various indexing and retrieval approaches, it is ineffective for measuring the impact that systems have on users. An alternative is to use a more situational definition of relevance, which takes account of the impact of the system on the user. Both types of relevance are examined from the standpoint of the medical domain, concluding that each has its appropriate use. But in medicine there is increasing emphasis on outcomes-oriented research which, when applied to information science, requires that the impact of an information system on the activities which prompt its use be assessed. An iterative model of retrieval evaluation is proposed, starting first with the use of topical relevance to ensure documents on the subject can be retrieved. This is followed by the use of situational relevance to show the user can interact positively with the system. The final step is to study how the system impacts the user in the purpose for which the system was consulted, which can be done by methods such as protocol analysis and simulation. These diverse types of studies are necessary to increase our understanding of the nature of retrieval systems. © 1994 John Wiley & Sons, Inc.

64 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 81% related
Metadata: 43.9K papers, 642.7K citations, 79% related
Recommender system: 27.2K papers, 598K citations, 79% related
Ontology (information science): 57K papers, 869.1K citations, 78% related
Natural language: 31.1K papers, 806.8K citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    9
2022    39
2021    107
2020    130
2019    144
2018    111