Topic

Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.

Papers published on a yearly basis (chart covering 1969 to 2023)

Papers

Proceedings Article

Integrating a Structured-Text Retrieval System with an Object-Oriented Database System

Tak W. Yan, Jurgen Annevelink

12 Sep 1994

TL;DR: The integration of a structured-text retrieval system (TextMachine) into an object-oriented database system (Op) is described, using the external function capability of the database system to encapsulate the text retrieval system as an external information source.

Abstract: We describe the integration of a structured-text retrieval system (TextMachine) into an object-oriented database system (Op[...]). Our approach is a light-weight one, using the external function capability of the database system to encapsulate the text retrieval system as an external information source. Yet, we are able to provide a tight integration in the query language and processing; the user can access the text retrieval system using a standard database query language. The efficient and effective retrieval of structured text performed by the text retrieval system is seamlessly combined with the rich modeling and general-purpose querying capabilities of the database system, resulting in an integrated system with querying power beyond that of the underlying systems. The integrated system also provides uniform access to textual data in the text retrieval system and structured data in the database system, thereby achieving information fusion. We discuss the design and implementation of our prototype system, and address issues such as the proper framework for external integration, the modeling of complex categorization and structure hierarchies of documents (under automatic document schema imp[...]), and techniques to reduce the performance overhead of accessing an external source.

55 citations

Proceedings Article

Using character shape coding for information retrieval

Alan F. Smeaton¹, A.L. Spitz

Dublin City University¹

18 Aug 1997

TL;DR: A technique for performing information retrieval on document images in such a manner that the accuracy has great utility is developed, and a surprisingly good result is obtained.

Abstract: In conventional information retrieval the task of finding users' search terms in a document is simple. When the document is not available in machine readable format, optical character recognition (OCR) can usually be performed. We have developed a technique for performing information retrieval on document images in such a manner that the accuracy has great utility. The method makes generalisations about the images of characters, then performs classification of these and agglomerates the resulting character shape codes into word tokens based on character shape coding. These are sufficiently specific in their representation of the underlying words to allow reasonable performance of retrieval. Using a collection of over 250 Mbytes of document texts and queries with known relevance assessments, we present a series of experiments to determine how various parameters in the retrieval strategy affect retrieval performance and we obtain a surprisingly good result.
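The pipeline described in this abstract (classify character images into shape classes, join the codes into word tokens, retrieve on those tokens) can be sketched roughly as follows. This is only an illustration under stated assumptions: the shape classes below are a simplified, hypothetical code set chosen for the example, not the one used in the paper, and plain text stands in for the classified character images. The names `shape_code`, `word_token`, `build_index`, and `search` are invented here.

```python
from collections import defaultdict

# Hypothetical, simplified shape classes (an assumption for illustration;
# the abstract does not list the paper's actual code set).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")


def shape_code(ch):
    """Map one character to a coarse shape class."""
    if ch.isdigit() or ch.isupper() or ch in ASCENDERS:
        return "A"  # tall glyphs: capitals, digits, ascenders
    if ch in DESCENDERS:
        return "g"  # glyphs that drop below the baseline
    if ch in "ij":
        return ch   # dotted glyphs keep their own class
    if ch.islower():
        return "x"  # remaining x-height glyphs
    return ch       # punctuation and anything else passes through


def word_token(word):
    """Agglomerate per-character shape codes into one word token."""
    return "".join(shape_code(c) for c in word)


def build_index(docs):
    """Inverted index from word shape token to the set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word_token(word)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query word's shape token."""
    results = None
    for word in query.split():
        hits = index.get(word_token(word), set())
        results = hits if results is None else results & hits
    return results or set()


docs = {1: "document image retrieval", 2: "the dog barked"}
index = build_index(docs)
print(word_token("retrieval"), search(index, "retrieval"))
print(word_token("bag"), search(index, "bag"))
```

In this toy code set "bag" and "dog" share a token, so a query for "bag" also matches document 2. Collisions of this kind are the price of coarse shape codes, and they are why retrieval on such tokens is approximate rather than exact.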

55 citations

Proceedings Article

Iconic paper

M. Peairs¹

Ricoh¹

14 Aug 1995

TL;DR: A novel presentation of the document supports both indexing and recognition, thereby allowing the "desktop metaphor" to migrate back to the real desktop.

Abstract: Iconic paper can be used to retrieve documents from an electronic database. It provides on a physical sheet of paper a representation that can be used by humans for recognition, and by machines for indexing. A document in the database can be accessed by a gesture indicating a particular icon on the page. A novel presentation of the document supports both indexing and recognition, thereby allowing the "desktop metaphor" to migrate back to the real desktop.

55 citations

Book Chapter

Processing Complex Similarity Queries with Distance-Based Access Methods

Paolo Ciaccia¹, Marco Patella¹, Pavel Zezula

University of Bologna¹

23 Mar 1998

TL;DR: This paper considers the relevant case where complex similarity queries are defined through a generic language L and whose predicates refer to a single feature F, and suggests that the index should process complex queries as a whole, thus evaluating multiple similarity predicates at a time.

Abstract: Efficient evaluation of similarity queries is one of the basic requirements for advanced multimedia applications. In this paper, we consider the relevant case where complex similarity queries are defined through a generic language L and whose predicates refer to a single feature F. Contrary to the language level which deals only with similarity scores, the proposed evaluation process is based on distances between feature values — known spatial or metric indexes use distances to evaluate predicates. The proposed solution suggests that the index should process complex queries as a whole, thus evaluating multiple similarity predicates at a time. The flexibility of our approach is demonstrated by considering three different similarity languages, and showing how the M-tree access method has been extended to this purpose. Experimental results clearly show that performance of the extended M-tree is consistently better than that of state-of-the-art search algorithms.
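The idea of evaluating a complex query as a whole, with all of its similarity predicates computed from distances in a single pass over each object, can be sketched as below. Several things here are assumptions made for illustration and are not taken from the paper: the fuzzy-style language (AND as minimum, OR as maximum), the linear mapping from distance to similarity score, the tuple-based query encoding, and the use of a plain linear scan in place of the extended M-tree. The function names are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(distance, max_distance):
    """Assumed linear mapping from a distance to a score in [0, 1]."""
    return max(0.0, 1.0 - distance / max_distance)


def score(obj, query, max_distance):
    """Evaluate a whole query tree on one object.

    A query is ("sim", reference_vector) or ("and" | "or", sub1, sub2, ...).
    """
    op, args = query[0], query[1:]
    if op == "sim":
        return similarity(euclidean(obj, args[0]), max_distance)
    scores = [score(obj, sub, max_distance) for sub in args]
    if op == "and":
        return min(scores)  # fuzzy conjunction (assumed semantics)
    if op == "or":
        return max(scores)  # fuzzy disjunction (assumed semantics)
    raise ValueError(f"unknown operator: {op}")


def top_k(objects, query, k, max_distance):
    """Linear scan: every predicate is evaluated once per object."""
    scored = ((score(o, query, max_distance), o) for o in objects)
    return heapq.nlargest(k, scored)


objects = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
query = ("and", ("sim", (0.0, 0.0)), ("sim", (1.0, 0.0)))
best = top_k(objects, query, k=1, max_distance=math.sqrt(2))
print(best)
```

With this conjunctive query the best match is the point lying between the two reference vectors, since the minimum rewards objects that are reasonably close to both. An index-based method would aim to return the same ranking while pruning objects that cannot reach the current top-k threshold.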

55 citations

Journal Article

Information retrieval from the World Wide Web: a user-focused approach based on individual experience with search engines

Shu-Sheng Liaw¹, Hsiu-Mei Huang

China Medical University (Taiwan)¹

01 May 2006 - Computers in Human Behavior

TL;DR: The results show that experience with search engines significantly affects users' attitudes toward search engines for information retrieval, the query-based service is more popular than the directory-based service, and users are not completely satisfied with the precision of retrieved information and the response time of search engines.

55 citations

Performance

Metrics

Papers: 6,866
Citations: 224,605

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	39
2021	107
2020	130
2019	144
2018	111
