Document retrieval

About: Document retrieval is a research topic. Over its lifetime, 6,821 publications have been published within this topic, receiving 214,383 citations.

Papers published on a yearly basis (chart covering 1969 to 2023)

Papers

Journal Article

Indexing and abstracting by association

Lauren B. Doyle¹

System Development Corporation¹

01 Oct 1962-American Documentation

TL;DR: It is shown that the most strongly co-occurring word pairs, which are therefore “associated” in a statistical sense, can be represented in the form of an “association map.”

Abstract: This article discusses the possibility of exploiting the statistics of word co-occurrence in text for purposes of document retrieval. Co-occurrence is defined and related to the mental processes of authors and readers; several means of quantitative measurement of word co-occurrence are then scrutinized. It is shown that the most strongly co-occurring word pairs, which are therefore “associated” in a statistical sense, can be represented in the form of an “association map.” The last half of the article presents two modes of use of association maps in literature searching.
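As a rough illustration of the idea in this abstract, the sketch below counts how often word pairs appear in the same document and ranks them by strength of association. The Jaccard-style score, the whitespace tokenization, the function name `cooccurrence_pairs` and the sample documents are all assumptions made for this example; the abstract does not say which co-occurrence measures the article examines.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(docs):
    """Rank word pairs by a Jaccard-style association score.

    The score is an assumed stand-in, not a measure taken from the article:
    (documents containing both words) / (documents containing either word).
    """
    doc_freq = Counter()   # number of documents containing each word
    pair_freq = Counter()  # number of documents containing both words of a pair
    for text in docs:
        words = sorted(set(text.lower().split()))
        doc_freq.update(words)
        pair_freq.update(combinations(words, 2))

    scored = []
    for (a, b), n_ab in pair_freq.items():
        score = n_ab / (doc_freq[a] + doc_freq[b] - n_ab)
        scored.append(((a, b), score))
    # Strongest associations first; ties broken alphabetically.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical toy corpus.
docs = [
    "document retrieval index",
    "document retrieval query",
    "index abstract",
]
ranked = cooccurrence_pairs(docs)
for pair, score in ranked[:3]:
    print(pair, round(score, 2))
```

The highest-scoring pairs could serve as the edges of an association map, with words as nodes.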

89 citations

Journal Article

Briefly noted - Mathematical foundations of information retrieval

Sandor Dominich¹

University of Pannonia¹

01 Mar 2002-Computational Linguistics

TL;DR: This paper presents a meta-modelling framework for estimating the relevance of information retrieval in a number of discrete-time models and shows clear patterns in how these models are modified over time.

Abstract: Acknowledgments. Preface. 1. Introduction. 2. Mathematics Handbook. 3. Information Retrieval Models. 4. Mathematical Theory of Information Retrieval. 5. Relevance Effectiveness in Information Retrieval. 6. Further Topics in Information Retrieval. Appendices. References. Index.

89 citations

An Evaluation of Factors Affecting Document Ranking by Information Retrieval Systems.

Michael McGill

01 Oct 1979

88 citations

Journal Article

A Latent Semantic Indexing-based approach to multilingual document clustering

Chih-Ping Wei¹, Christopher C. Yang², Chia-Min Lin

National Tsing Hua University¹, The Chinese University of Hong Kong²

01 Jun 2008

TL;DR: This study designs a Latent Semantic Indexing (LSI)-based MLDC technique capable of generating knowledge maps from multilingual documents, capable of maintaining a good balance between monolingual and cross-lingual clustering effectiveness when clustering a multilingual document corpus.

Abstract: The creation and deployment of knowledge repositories for managing, sharing, and reusing tacit knowledge within an organization has emerged as a prevalent approach in current knowledge management practices. A knowledge repository typically contains vast amounts of formal knowledge elements, which generally are available as documents. To facilitate users' navigation of documents within a knowledge repository, knowledge maps, often created by document clustering techniques, represent an appealing and promising approach. Various document clustering techniques have been proposed in the literature, but most deal with monolingual documents (i.e., written in the same language). However, as a result of increased globalization and advances in Internet technology, an organization often maintains documents in different languages in its knowledge repositories, which necessitates multilingual document clustering (MLDC) to create organizational knowledge maps. Motivated by the significance of this demand, this study designs a Latent Semantic Indexing (LSI)-based MLDC technique capable of generating knowledge maps (i.e., document clusters) from multilingual documents. The empirical evaluation results show that the proposed LSI-based MLDC technique achieves satisfactory clustering effectiveness, measured by both cluster recall and cluster precision, and is capable of maintaining a good balance between monolingual and cross-lingual clustering effectiveness when clustering a multilingual document corpus.

88 citations

Patent

Document data processing method and apparatus for document retrieval

Atsushi Hatakeyama¹, Hiromichi Fujisawa¹, Kanji Kato¹, Hisamitsu Kawaguchi¹, Naoki Minegishi¹, Katsumi Tada¹, Asakawa Satoshi¹

Hitachi¹

28 Feb 1992

TL;DR: In this patent, a component character table is created in which characters occurring in each of the condensed texts are registered without duplication, and a text body search is executed to extract a document that satisfies the query condition imposed on the search term by consulting the texts of the documents extracted through the component character table search and the condensed text search.

Abstract: High-speed full document retrieval method and system capable of providing result of retrieval within practically acceptable short search time. Upon registration of documents in a document database, condensed texts are created by decomposing each of textual character strings of the documents to be registered into fragmental character strings in dependence on character species and by checking mutual inclusion relations existing among the fragmental character strings. A component character table is created in which characters occurring in each of the condensed texts are registered without duplication. The condensed texts and the component character table are registered in the data base together with the texts of the documents to be registered. Upon retrieval of a document containing a search term designated by a user, a component character table search is first executed to extract those documents which contain all species of characters constituting the search term by consulting the component character table, and subsequently a condensed text search is executed by consulting the condensed texts of the documents. Finally, a text body search is executed for extracting a document which satisfies query condition imposed on the search term by consulting the texts of the documents extracted through the component character table search and the condensed text search.
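The three-stage filtering described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented method: whitespace-separated tokens stand in for the fragmental character strings (the patent splits by character species), search terms are assumed to be single tokens, and the names `condense`, `build_index` and `search` are hypothetical.

```python
def condense(text):
    """Keep each distinct token once, dropping tokens contained in a longer kept token.

    Assumption: whitespace tokens approximate the patent's fragmental strings.
    """
    tokens = sorted(set(text.lower().split()), key=len, reverse=True)
    kept = []
    for tok in tokens:
        if not any(tok in longer for longer in kept):
            kept.append(tok)
    return kept


def build_index(docs):
    """Register, per document, the character set, the condensed text and the full text."""
    index = {}
    for doc_id, text in docs.items():
        condensed = condense(text)
        index[doc_id] = {
            "chars": set("".join(condensed)),  # component character table entry
            "condensed": condensed,
            "text": text,
        }
    return index


def search(index, term):
    """Return ids of documents containing `term` (assumed to be a single token)."""
    term = term.lower()
    # Stage 1: keep documents whose character table holds every character of the term.
    stage1 = [d for d, entry in index.items() if set(term) <= entry["chars"]]
    # Stage 2: keep documents whose condensed text contains the term.
    stage2 = [d for d in stage1
              if any(term in frag for frag in index[d]["condensed"])]
    # Stage 3: confirm against the full text of the surviving documents.
    return sorted(d for d in stage2 if term in index[d]["text"].lower())


# Hypothetical sample documents.
docs = {
    1: "full text retrieval of documents",
    2: "document database registration",
    3: "text search",
}
index = build_index(docs)
print(search(index, "document"))
print(search(index, "text"))
```

In this simplified form the third stage only re-checks a substring match; in the abstract it is the step where query conditions on the search term are evaluated against the document texts, which the cheaper first two stages cannot do.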

88 citations

Performance

Metrics

Papers: 6,866
Citations: 224,605

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	39
2021	107
2020	130
2019	144
2018	111
