Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published on this topic, receiving 214,383 citations.

Papers

Journal Article•DOI•

INDEX: The statistical basis for an automatic conceptual phrase-indexing system

Leslie P. Jones¹, Edward W. Gassie¹, Sridhar Radhakrishnan¹•Institutions (1)

Louisiana State University¹

01 Mar 1990-Journal of the Association for Information Science and Technology

TL;DR: Two programs are described, INDEX and INDEXD, which locate repeated phrases in a document, gather statistical information about them, and rank them according to their value as index phrases, showing promise as the basis for a sophisticated conceptual indexing system.

Abstract: In recent years researchers have become increasingly convinced that the performance of information retrieval systems can be greatly enhanced by the use of key phrases for automatic conceptual document indexing and retrieval. In this article we describe two programs, INDEX and INDEXD, which locate repeated phrases in a document, gather statistical information about them, and rank them according to their value as index phrases. The programs show promise as the basis for a sophisticated conceptual indexing system. The simpler program, INDEX, ranks phrases in such a way that frequently occurring phrases which contain several frequently occurring words are given a high ranking. INDEXD is an extension of INDEX which incorporates a dictionary for stemming, weighting of words and validation of syntax of output phrases. Sample output of both programs is included, and we discuss plans to combine INDEXD with linguistic and artificial intelligence techniques to provide a general conceptual phrase-indexing system that can incorporate expert knowledge about a given application area. © 1990 John Wiley & Sons, Inc.
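The ranking idea described for INDEX (repeated phrases that contain frequently occurring words rank higher) can be illustrated with a minimal sketch. The scoring formula below, phrase count multiplied by the summed frequencies of the phrase's words, is an assumption chosen for illustration and is not the formula from the article; the function name `rank_repeated_phrases` and its parameters are likewise hypothetical.

```python
import re
from collections import Counter


def rank_repeated_phrases(text, max_len=3, min_count=2):
    """Rank phrases that repeat in `text` (illustrative scoring only)."""
    # Lowercase and keep alphabetic tokens; punctuation is ignored, so a
    # phrase may span a sentence boundary in this simplified sketch.
    words = re.findall(r"[a-z]+", text.lower())
    word_freq = Counter(words)

    # Count every contiguous phrase of 2..max_len words.
    phrase_freq = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(words) - n + 1):
            phrase_freq[tuple(words[i:i + n])] += 1

    scored = []
    for phrase, count in phrase_freq.items():
        if count < min_count:
            continue  # keep only phrases that actually repeat
        # Assumed score: frequent phrases made of frequent words rank high.
        score = count * sum(word_freq[w] for w in phrase)
        scored.append((" ".join(phrase), count, score))

    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[2], item[0]))
    return scored


sample = (
    "Document retrieval systems index documents. "
    "Document retrieval improves when phrase indexing is used. "
    "Phrase indexing supports document retrieval."
)
ranking = rank_repeated_phrases(sample)
for phrase, count, score in ranking:
    print(phrase, count, score)
```

The sketch omits what the abstract attributes to INDEXD: dictionary-based stemming, word weighting, and syntactic validation of the output phrases.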

39 citations

Multimedia Search and Retrieval

Shih-Fu Chang¹, Qian Huang², Thomas Huang², Atul Puri², Behzad Shahraray³•Institutions (3)

Columbia University¹, AT&T², University of Illinois at Urbana–Champaign³

01 Jan 1999

TL;DR: This paper presents a meta-modelling architecture suitable for media asset management systems in corporations, audio-visual broadcast servers, and personal media servers for consumers that was designed for media search on the Web.

Abstract: Multimedia search and retrieval has become an active research field thanks to the increasing demand that accompanies many new practical applications. The applications include large-scale multimedia search engines on the Web, media asset management systems in corporations, audio-visual broadcast servers, and personal media servers for consumers. Diverse requirements derived from these applications impose great challenges and incentives for research in this field.

39 citations

Proceedings Article•DOI•

Macaw: An Extensible Conversational Information Seeking Platform

Hamed Zamani¹, Nick Craswell¹•Institutions (1)

Microsoft¹

25 Jul 2020

TL;DR: Macaw is an open-source framework with a modular architecture for CIS research that supports multi-turn, multi-modal, and mixed-initiative interactions, and enables research for tasks such as document retrieval, question answering, recommendation, and structured data exploration.

Abstract: Conversational information seeking (CIS) has been recognized as a major emerging research area in information retrieval. Such research will require data and tools, to allow the implementation and study of conversational systems. This paper introduces Macaw, an open-source framework with a modular architecture for CIS research. Macaw supports multi-turn, multi-modal, and mixed-initiative interactions, and enables research for tasks such as document retrieval, question answering, recommendation, and structured data exploration. It has a modular design to encourage the study of new CIS algorithms, which can be evaluated in batch mode. It can also integrate with a user interface, which allows user studies and data collection in an interactive mode, where the back end can be fully algorithmic or a wizard of oz setup. Macaw is distributed under the MIT License.

39 citations

Patent•

Retrieval support device, retrieval support method and program thereof

Takashi Nakagawa, 尚中川

23 Oct 2001

TL;DR: In this article, a retrieval support device capable of automatically obtaining output results of a retrieval engine specialized in a specific field, without imposing burden of selecting the retrieval engine on a user, is presented.

Abstract: PROBLEM TO BE SOLVED: To provide a retrieval support device capable of automatically obtaining output results of a retrieval engine specialized in a specific field, without imposing the burden of selecting the retrieval engine on a user. SOLUTION: Retrieval input sentences acquired from a client terminal device are parsed and matched with example sentences registered in advance. These example sentences are associated with the retrieval engine specialized in the specific field. Based on this matching, the retrieval engine for the retrieval input sentence can be selected. Next, a keyword is extracted from the retrieval input sentence and, if necessary, subjected to predetermined processing, such as converting an abbreviated designation into its formal nomenclature, to create a retrieval request sentence. The retrieval request sentence is then used to request retrieval from the preselected retrieval engine, and the retrieval results are presented at the client terminal device. COPYRIGHT: (C)2003,JPO
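The flow in the abstract (match the input sentence against registered example sentences, pick the associated engine, extract keywords, and normalize abbreviations) might be sketched as follows. Everything concrete here is a hypothetical stand-in: the example sentences, engine names, stopword list, abbreviation table, and the use of simple token overlap in place of the parsing and matching the patent describes.

```python
import re

# Hypothetical registered example sentences, each tied to a specialized engine.
EXAMPLES = {
    "find a recipe for curry": "recipe_engine",
    "find the train timetable to tokyo": "transit_engine",
}

# Hypothetical table mapping abbreviated designations to formal names.
FORMAL_NAMES = {"nyc": "new york city"}

# Hypothetical stopwords dropped during keyword extraction.
STOPWORDS = {"find", "the", "a", "for", "to"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def select_engine(sentence):
    """Pick the engine whose example sentence shares the most tokens.

    Token overlap is an assumed stand-in for the patent's parsing step.
    """
    tokens = set(tokenize(sentence))
    best = max(EXAMPLES, key=lambda ex: len(tokens & set(tokenize(ex))))
    return EXAMPLES[best]


def build_request(sentence):
    """Extract keywords and expand abbreviations into formal names."""
    keywords = [
        FORMAL_NAMES.get(token, token)
        for token in tokenize(sentence)
        if token not in STOPWORDS
    ]
    return " ".join(keywords)


query = "Find the train timetable to NYC"
engine = select_engine(query)
request = build_request(query)
print(engine, "<-", request)
```

In this sketch the user never names an engine; the choice falls out of which registered example the input most resembles, which is the burden the abstract says the device removes.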

39 citations

Journal Article•DOI•

Statistical lattice-based spoken document retrieval

Tee Kiah Chia¹, Khe Chai Sim², Haizhou Li², Hwee Tou Ng¹•Institutions (2)

National University of Singapore¹, Institute for Infocomm Research Singapore²

29 Jan 2010-ACM Transactions on Information Systems

TL;DR: Experimental results show that the method consistently achieves better retrieval performance than using only the 1-best transcripts in statistical retrieval, outperforms a recently proposed lattice-based vector space retrieval method, and also compares favorably with a lattice-based retrieval method based on the Okapi BM25 model.

Abstract: Recent research efforts on spoken document retrieval have tried to overcome the low quality of 1-best automatic speech recognition transcripts, especially in the case of conversational speech, by using statistics derived from speech lattices containing multiple transcription hypotheses as output by a speech recognizer. We present a method for lattice-based spoken document retrieval based on a statistical n-gram modeling approach to information retrieval. In this statistical lattice-based retrieval (SLBR) method, a smoothed statistical model is estimated for each document from the expected counts of words given the information in a lattice, and the relevance of each document to a query is measured as a probability under such a model. We investigate the efficacy of our method under various parameter settings of the speech recognition and lattice processing engines, using the Fisher English Corpus of conversational telephone speech. Experimental results show that our method consistently achieves better retrieval performance than using only the 1-best transcripts in statistical retrieval, outperforms a recently proposed lattice-based vector space retrieval method, and also compares favorably with a lattice-based retrieval method based on the Okapi BM25 model.
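The core scoring step described above, estimating a smoothed model per document from expected word counts and scoring a query by its probability under that model, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the SLBR method itself: it uses a unigram model with linear interpolation (Jelinek-Mercer) smoothing, and the expected counts are made-up numbers standing in for values that would be derived from lattice posteriors. The names `score_query`, `lam`, and the document IDs are hypothetical.

```python
import math


def score_query(query_terms, doc_expected_counts, collection_counts, lam=0.5):
    """Log-probability of the query under a smoothed unigram document model.

    doc_expected_counts maps word -> expected count (may be fractional).
    Linear interpolation with the collection model is an assumed choice.
    """
    doc_total = sum(doc_expected_counts.values())
    coll_total = sum(collection_counts.values())
    log_prob = 0.0
    for term in query_terms:
        p_doc = doc_expected_counts.get(term, 0.0) / doc_total if doc_total else 0.0
        p_coll = collection_counts.get(term, 0.0) / coll_total if coll_total else 0.0
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            return float("-inf")  # term unseen anywhere in the collection
        log_prob += math.log(p)
    return log_prob


# Made-up expected counts for two spoken documents (illustrative only).
docs = {
    "call_a": {"flight": 1.6, "ticket": 0.9, "weather": 0.1},
    "call_b": {"weather": 1.8, "rain": 1.2, "flight": 0.2},
}

# Collection model: expected counts summed over all documents.
collection = {}
for counts in docs.values():
    for word, value in counts.items():
        collection[word] = collection.get(word, 0.0) + value

query = ["flight", "ticket"]
scores = {doc_id: score_query(query, counts, collection) for doc_id, counts in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because the counts are expectations rather than integers, a word that appears only in lower-ranked recognition hypotheses still contributes a fractional count, which is the advantage over 1-best transcripts that the abstract points to.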

39 citations

Performance Metrics

Papers: 6,866

Citations: 224,605

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	39
2021	107
2020	130
2019	144
2018	111
