Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.


Papers
Proceedings Article
01 Jan 1996
TL;DR: Experiments with a concept-based information retrieval system, which relies on a program called MetaMap to account for textual variation when mapping biomedical text such as MEDLINE bibliographic citations to the UMLS Metathesaurus, confirm that the effort expended in handling textual variation is well spent for at least one type of concept-based information retrieval.
Abstract: Accounting for textual variation in the documents and queries processed by information retrieval systems is considered essential for achieving good retrieval. Recent research has called into question several of the techniques used to support this endeavor. This paper reports on experiments with a concept based information retrieval system which relies on a program called MetaMap to account for textual variation in the process of mapping biomedical text such as MEDLINE bibliographic citations to the UMLS Metathesaurus. The experiments confirm that the effort expended in handling textual variation is well-spent for at least one type of concept based information retrieval.

43 citations

Book Chapter
08 Sep 2008
TL;DR: The AMIDA Automatic Content Linking Device is a just-in-time document retrieval system for meeting environments that listens to a meeting and displays information about the documents from the group's history that are most relevant to what is being said.
Abstract: The AMIDA Automatic Content Linking Device (ACLD) is a just-in-time document retrieval system for meeting environments. The ACLD listens to a meeting and displays information about the documents from the group's history that are most relevant to what is being said. Participants can view an outline or the entire content of the documents, if they feel that these documents are potentially useful at that moment of the meeting. The ACLD proof-of-concept prototype places meeting-related documents and segments of previously recorded meetings in a repository and indexes them. During a meeting, the ACLD continually retrieves the documents that are most relevant to keywords found automatically using the current meeting speech. The current prototype simulates the real-time speech recognition that will be available in the near future. The software components required to achieve these functions communicate using the Hub, a client/server architecture for annotation exchange and storage in real-time. Results and feedback for the first ACLD prototype are outlined, together with plans for its future development within the AMIDA EU integrated project. Potential users of the ACLD supported the overall concept, and provided feedback to improve the user interface and to access documents beyond the group's own history.

42 citations

Proceedings Article
18 Dec 2006
TL;DR: WordBars visually represents the frequencies of the terms found in the first 100 document surrogates returned from the initial query, and allows users to interactively re-sort the search results based on the frequencies of the selected terms within the document surrogates, generating a new set of search results.
Abstract: It is common for web searchers to have difficulties crafting queries to fulfill their information needs. Even when they provide a good query, users often find it challenging to evaluate the results of their web searches. Sources of these problems include the lack of support for query refinement, and the static nature of the list-based representations of web search results. To address these issues, we have developed WordBars, an interactive tool for web information retrieval. WordBars visually represents the frequencies of the terms found in the first 100 document surrogates returned from the initial query. This system allows the users to interactively re-sort the search results based on the frequencies of the selected terms within the document surrogates, as well as to add and remove terms from the query, generating a new set of search results. Examples illustrate how WordBars can provide valuable support for query refinement and search results exploration, both when specific and vague initial queries are provided.

42 citations
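
The re-sorting idea in the WordBars abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tokenize`, `term_frequencies`, `resort`), the simple regex tokenizer, and the scoring rule (count of selected-term occurrences per surrogate) are all assumptions made for illustration, and stop-word handling and query modification are left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a surrogate and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequencies(surrogates, top_n=100):
    """Count term occurrences across the first top_n document surrogates."""
    counts = Counter()
    for surrogate in surrogates[:top_n]:
        counts.update(tokenize(surrogate))
    return counts


def resort(surrogates, selected_terms, top_n=100):
    """Re-order the first top_n surrogates by how often the selected terms occur in each.

    sorted() is stable, so surrogates with equal scores keep their original rank order.
    """
    selected = {term.lower() for term in selected_terms}

    def score(surrogate):
        return sum(1 for token in tokenize(surrogate) if token in selected)

    return sorted(surrogates[:top_n], key=score, reverse=True)


# Hypothetical result surrogates (title + snippet text) for the query "jaguar".
results = [
    "Jaguar cars for sale",
    "Jaguar habitat and diet in the wild",
    "Big cats: jaguar, jaguar, leopard habitat",
]
freqs = term_frequencies(results)
by_habitat = resort(results, ["habitat"])
```

Here `freqs` would drive the visual frequency display, and `by_habitat` moves the two surrogates mentioning "habitat" ahead of the car listing while preserving their original relative order.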

Journal Article
01 Apr 2002
TL;DR: This study indicates that while some document categorization algorithms could be adopted for database categorization, algorithms that take into consideration the special characteristics of databases may be more effective.
Abstract: Document categorization as a technique to improve the retrieval of useful documents has been extensively investigated. One important issue in a large-scale metasearch engine is to select text databases that are likely to contain useful documents for a given query. We believe that database categorization can be a potentially effective technique for good database selection, especially in the Internet environment where short queries are usually submitted. In this paper, we propose and evaluate several database categorization algorithms. This study indicates that while some document categorization algorithms could be adopted for database categorization, algorithms that take into consideration the special characteristics of databases may be more effective. Preliminary experimental results are provided to compare the proposed database categorization algorithms. A prototype database categorization system based on one of the proposed algorithms has been developed.

42 citations

Journal Article
TL;DR: In this scheme, a domain ontology is first constructed using GRAONTO, the group's graph-based approach to automating domain ontology construction, and query semantic extension and retrieval are then adopted for semantic-based knowledge retrieval.

42 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
81% related
Metadata
43.9K papers, 642.7K citations
79% related
Recommender system
27.2K papers, 598K citations
79% related
Ontology (information science)
57K papers, 869.1K citations
78% related
Natural language
31.1K papers, 806.8K citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    9
2022    39
2021    107
2020    130
2019    144
2018    111