Topic
Document retrieval
About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.
Papers
•
11 Jan 1995
TL;DR: In this article, a user interface for a full-text document retrieval computerized system comprises a display with a words window in which each query word is displayed by means of a distinctive representation uniquely associated with each displayed word.
Abstract: A user interface for a full-text document retrieval computerized system comprises a display with a words window in which each query word is displayed by means of a distinctive representation uniquely associated with each displayed word. In a subsequent results window, each document header or title or representation is accompanied by an indicator which employs the same distinctive representation to directly indicate to the user the relative contributions of the individual query words to each listed document. In a preferred embodiment, the distinctive representation is integrated with an associated weight first indicator in a words window, and in the results window the distinctive representations are also integrated with an associated weight second indicator. The distinctive representation can take several forms, such as by a different color or by means of hatching or shading or by displayed icons.
158 citations
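The "relative contributions of the individual query words" described in the abstract above can be illustrated with a minimal sketch. It assumes a plain term-frequency score, which the abstract does not specify; the function name `term_contributions` and the sample text are hypothetical and are not taken from the source.

```python
from collections import Counter


def term_contributions(query_words, document_text):
    """Return each query word's share of the document's total match count.

    Assumption: contribution is measured by raw term frequency; the
    original system may weight words differently.
    """
    counts = Counter(document_text.lower().split())
    raw = {word: counts[word.lower()] for word in query_words}
    total = sum(raw.values())
    if total == 0:
        # No query word occurs in the document.
        return {word: 0.0 for word in query_words}
    return {word: raw[word] / total for word in query_words}


# Hypothetical example document and query.
shares = term_contributions(
    ["retrieval", "color"],
    "document retrieval with color and retrieval icons",
)
```

A results window could then draw each share in the color or icon assigned to that word in the words window.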
•
01 Jan 2003
TL;DR: The first year of TREC Genomics Track featured two tasks: ad hoc retrieval and information extraction, which centered around the Gene Reference into Function (GeneRIF) resource of the National Library of Medicine.
Abstract: The first year of TREC Genomics Track featured two tasks: ad hoc retrieval and information extraction. Both tasks centered around the Gene Reference into Function (GeneRIF) resource of the National Library of Medicine, which was used as both pseudorelevance judgments for ad hoc document retrieval as well as target text for information extraction. The track attracted 29 groups who participated in one or both tasks.
157 citations
••
TL;DR: Development of the Envision database, system software, and protocol for client-server communication builds upon work to identify and represent “objects” that will facilitate reuse and high-level communication of information from author to reader (user).
Abstract: Project Envision aims to build a “user-centered database from the computer science literature,” initially using the publications of the Association for Computing Machinery (ACM). Accordingly, we have interviewed potential users, as well as experts in library, information, and computer science, to understand their needs, to become aware of their perception of existing information systems, and to collect their recommendations. Design and formative usability evaluation of our interface have been based on those interviews, leading to innovative query formulation and search results screens that work well according to our usability testing. Our development of the Envision database, system software, and protocol for client-server communication builds upon work to identify and represent “objects” that will facilitate reuse and high-level communication of information from author to reader (user). All these efforts are leading not only to a usable prototype digital library but also to a set of nine principles for digital libraries, which we have tried to follow, covering issues of representation, architecture, and interfacing. © 1993 John Wiley & Sons, Inc.
157 citations
•
01 Jan 2003
TL;DR: NLP needs to be optimized for IR in order to be effective and document retrieval is not an ideal application for NLP, at least given the current state of the art in NLP.
Abstract: Many Natural Language Processing (NLP) techniques have been used in Information Retrieval. The results are not encouraging. Simple methods (stopwording, Porter-style stemming, etc.) usually yield significant improvements, while higher-level processing (chunking, parsing, word sense disambiguation, etc.) yields only very small improvements or even a decrease in accuracy. At the same time, higher-level methods increase the processing and storage cost dramatically. This makes them hard to use on large collections. We review NLP techniques and come to the conclusion that (a) NLP needs to be optimized for IR in order to be effective and (b) document retrieval is not an ideal application for NLP, at least given the current state of the art in NLP. Other IR-related tasks, e.g., question answering and information extraction, seem to be better suited.
156 citations
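The "simple methods" named in the abstract above (stopwording and stemming) can be sketched as follows. This is a rough illustration only: the stopword list is a tiny hypothetical sample, and `crude_stem` is a naive suffix stripper, not the Porter algorithm the abstract refers to.

```python
# Hypothetical, deliberately small stopword list.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is"}

# Suffixes tried in order; a real Porter stemmer applies many more rules.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, split on whitespace, drop stopwords, stem the rest."""
    tokens = text.lower().split()
    return [crude_stem(token) for token in tokens if token not in STOPWORDS]


terms = normalize("the indexing of documents")
```

Applying the same normalization to both queries and documents lets "indexing" match "index" at very low processing cost, which is the trade-off the abstract contrasts with parsing and word sense disambiguation.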