Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.


Papers published on a yearly basis

[Chart: number of papers per year, 1969 to 2023]

Papers

Proceedings Article•DOI•

A few examples go a long way: constructing query models from elaborate query formulations


Krisztian Balog¹, Wouter Weerkamp¹, Maarten de Rijke¹•Institutions (1)

University of Amsterdam¹

20 Jul 2008

TL;DR: In this paper, the authors address a specific enterprise document search scenario, where the information need is expressed using a short query (of a few keywords) together with examples of key reference pages, and investigate how the examples can be utilized to improve the end-to-end performance on the document retrieval task.


Abstract: We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a short query (of a few keywords) together with examples of key reference pages. Given this setup, we investigate how the examples can be utilized to improve the end-to-end performance on the document retrieval task. Our approach is based on a language modeling framework, where the query model is modified to resemble the example pages. We compare several methods for sampling expansion terms from the example pages to support query-dependent and query-independent query expansion; the latter is motivated by the wish to increase "aspect recall", and attempts to uncover aspects of the information need not captured by the query. For evaluation purposes we use the CSIRO data set created for the TREC 2007 Enterprise track. The best performance is achieved by query models based on query-independent sampling of expansion terms from the example documents.


45 citations
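
The abstract above describes modifying a query model so that it resembles a set of example pages. A minimal sketch of that general idea follows, assuming maximum-likelihood term estimates, uniform weighting of the example documents, and linear interpolation with a mixing weight `lam`. The function names, the parameters `k` and `lam`, and the toy data are illustrative assumptions; this is not the authors' exact sampling method.

```python
from collections import Counter


def mle(tokens):
    """Maximum-likelihood term distribution for a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def expansion_model(example_docs, k):
    """Query-independent expansion: average the term distributions of the
    example documents (uniform document weights, an assumption), keep the
    k most probable terms, and renormalize."""
    aggregated = Counter()
    for doc in example_docs:
        for term, prob in mle(doc).items():
            aggregated[term] += prob / len(example_docs)
    top_terms = aggregated.most_common(k)
    norm = sum(prob for _, prob in top_terms)
    return {term: prob / norm for term, prob in top_terms}


def build_query_model(query_tokens, example_docs, k=5, lam=0.5):
    """Interpolate the original query model with the expansion model.
    lam is the weight on the original query (hypothetical parameter)."""
    query_model = mle(query_tokens)
    if not example_docs:
        return query_model
    expanded = expansion_model(example_docs, k)
    terms = set(query_model) | set(expanded)
    return {
        term: lam * query_model.get(term, 0.0)
        + (1 - lam) * expanded.get(term, 0.0)
        for term in terms
    }


# Toy usage: two tokenized example pages and a two-keyword query.
examples = [
    ["cancer", "screening", "guidelines"],
    ["screening", "risk", "factors"],
]
model = build_query_model(["cancer", "risk"], examples, k=3, lam=0.5)
```

Because the expansion terms are chosen without reference to the query, the resulting model can give weight to terms such as `screening` that the short query never mentions, which is the intuition behind the "aspect recall" motivation in the abstract.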

Journal Article•DOI•

Operations Research Applied to Document Indexing and Retrieval Decisions


Abraham Bookstein¹, Donald H. Kraft²•Institutions (2)

University of Chicago¹, Louisiana State University²

01 Jul 1977-Journal of the ACM

TL;DR: The earlier model is extended to include interactions among terms, which allows one to decide whether to retrieve a document by taking into consideration occurrences of all the words in the text.


Abstract: This paper begins with a review of earlier work in which a model of word occurrence formed the basis of a decision-making procedure for indexing or, more generally, retrieving documents in response to a request. In the earlier work, words were considered individually. This paper extends the earlier model to include interactions among terms. The elaborated model allows one to decide whether to retrieve a document by taking into consideration occurrences of all the words in the text. Retrieval in response to Boolean expressions is also considered, as are procedures for ranking documents in accordance with their assessed relevance to a request. The discussion is within the framework of Bayesian decision theory.


45 citations
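
The abstract above frames the retrieve-or-skip choice in terms of Bayesian decision theory. The sketch below shows only the generic form of such a decision rule: compute a posterior probability of relevance, then retrieve when the expected loss of retrieving is lower than the expected loss of skipping. It does not reproduce the paper's word-occurrence model or its treatment of term interactions; the function names, the cost parameters, and the example likelihoods are assumptions made for illustration.

```python
def posterior_relevance(prior, likelihood_rel, likelihood_nonrel):
    """Bayes' rule: P(relevant | evidence) from a prior and the likelihood
    of the observed evidence under relevance and non-relevance."""
    numerator = prior * likelihood_rel
    denominator = numerator + (1 - prior) * likelihood_nonrel
    return numerator / denominator if denominator > 0 else 0.0


def should_retrieve(p_relevant, cost_false_retrieval=1.0, cost_missed_relevant=1.0):
    """Retrieve when the expected loss of retrieving is smaller than the
    expected loss of skipping. Both costs are hypothetical parameters."""
    expected_loss_retrieve = (1 - p_relevant) * cost_false_retrieval
    expected_loss_skip = p_relevant * cost_missed_relevant
    return expected_loss_retrieve < expected_loss_skip


# Toy usage: evidence four times as likely under relevance as under non-relevance.
p = posterior_relevance(prior=0.5, likelihood_rel=0.8, likelihood_nonrel=0.2)
decision = should_retrieve(p)
```

Raising `cost_missed_relevant` relative to `cost_false_retrieval` lowers the posterior needed to retrieve, which is how a cost-based rule of this kind trades recall against precision.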

Journal Article•DOI•

A Review on Text Similarity Technique used in IR and its Application


Nitesh Pradhan, Manasi Gyanchandani, Rajesh Wadhvani

18 Jun 2015-International Journal of Computer Applications

TL;DR: Different types of similarity, such as lexical similarity and semantic similarity, are described; these play an important role in the categorization of texts and documents.


Abstract: With the large number of documents on the web, there is an increasing need to retrieve the most relevant document. There are different techniques for retrieving the most relevant document from a large corpus. Similarity between words, sentences, paragraphs and documents is an important component in various tasks such as information retrieval, document clustering, word-sense disambiguation, automatic essay scoring, short answer grading, machine translation and text summarization. Text similarity means that the user’s query text is matched against the document text, and on the basis of this matching the user retrieves the most relevant documents. Text similarity also plays an important role in the categorization of texts and documents. We can measure the similarity between sentences, words, paragraphs and documents to categorize them in an efficient way. On the basis of this categorization, we can retrieve the most relevant document for the user’s query. This paper describes different types of similarity, such as lexical similarity and semantic similarity.


45 citations
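
The abstract above describes matching a query text against document texts and retrieving by similarity. As a minimal sketch of the lexical side of that idea, the code below computes Jaccard similarity over word sets and cosine similarity over term-frequency vectors, then ranks documents by cosine score. The whitespace tokenizer, the function names, and the toy documents are assumptions for illustration; semantic similarity measures are not covered here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard similarity: shared distinct words over all distinct words."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between raw term-frequency vectors."""
    freq_a, freq_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return documents sorted from most to least similar to the query."""
    return sorted(documents, key=lambda doc: cosine(query, doc), reverse=True)


# Toy usage with three short documents.
docs = [
    "machine translation of text",
    "document retrieval from a large corpus",
    "automatic essay scoring",
]
ranked = rank("retrieval of a relevant document", docs)
```

Jaccard ignores how often a word occurs, while cosine over term frequencies rewards repeated matches; which one suits a task depends on document length and vocabulary.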

Book Chapter•DOI•

CBR for Document Retrieval: The FALLQ Project


Mario Lenz¹, Hans-Dieter Burkhard¹•Institutions (1)

Humboldt University of Berlin¹

25 Jul 1997

TL;DR: The objective is to provide a tool that helps finding documents related to a given query, such as answers in Frequently Asked Questions databases, by developing a running prototypical system which is currently under practical evaluation.


Abstract: This paper reports about a project on document retrieval in an industrial setting. The objective is to provide a tool that helps finding documents related to a given query, such as answers in Frequently Asked Questions databases. A CBR approach has been used to develop a running prototypical system which is currently under practical evaluation.


45 citations

Journal Article•

An analysis of Internet search engines: assessment of over 200 search queries


Nicholas G. Tomaiuolo, Joan G. Packer

01 Jun 1996-Computers in Libraries archive

45 citations


Network Information

Performance Metrics

Papers: 6,866
Citations: 224,605

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	39
2021	107
2020	130
2019	144
2018	111
