Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.

Papers published on a yearly basis

[Chart: number of papers per year, 1969 to 2023]

Papers

Proceedings Article

Region-Based Discriminative Feature Pooling for Scene Text Recognition

Chen-Yu Lee¹, Anurag Bhardwaj¹, Wei Di¹, Vignesh Jagadeesh¹, Robinson Piramuthu¹

University of California, San Diego¹

23 Jun 2014

TL;DR: This work proposes a discriminative feature pooling method that automatically learns the most informative sub-regions of each scene character within a multi-class classification framework, whereas each sub-region seamlessly integrates a set of low-level image features through integral images.

Abstract: We present a new feature representation method for the scene text recognition problem, particularly focusing on improving scene character recognition. Many existing methods rely on Histogram of Oriented Gradient (HOG) or part-based models, which do not span the feature space well for characters in natural scene images, especially given large variation in fonts with cluttered backgrounds. In this work, we propose a discriminative feature pooling method that automatically learns the most informative sub-regions of each scene character within a multi-class classification framework, whereas each sub-region seamlessly integrates a set of low-level image features through integral images. The proposed feature representation is compact, computationally efficient, and able to effectively model distinctive spatial structures of each individual character class. Extensive experiments conducted on challenging datasets (Chars74K, ICDAR'03, ICDAR'11, SVT) show that our method significantly outperforms existing methods on scene character classification and scene text recognition tasks.
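
The abstract mentions integral images but does not spell out how sub-region features are pooled. As a minimal sketch of that one building block only, assuming a single feature channel stored as a list of rows (the function names `integral_image`, `region_sum`, and `region_mean` are hypothetical and not taken from the paper), the sum over any rectangular sub-region can be read off with four table lookups:

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros."""
    h = len(img)
    w = len(img[0]) if h else 0
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above (previous table row) plus the current row prefix.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def region_sum(sat, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 of the original image."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def region_mean(sat, top, left, bottom, right):
    """Average-pooled value of a non-empty rectangular sub-region."""
    area = (bottom - top) * (right - left)
    return region_sum(sat, top, left, bottom, right) / area
```

Once the table is built, each sub-region query costs constant time regardless of its size, which is one plausible reason such a representation can be computationally efficient. Learning which sub-regions are most informative is a separate step that this sketch does not cover.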

98 citations

Proceedings Article

Identification of text on colored book and journal covers

Karin Sobottka¹, Horst Bunke, H. Kronenberg

University of Bern¹

20 Sep 1999

TL;DR: An approach to automatic text location and identification of colored book and journal covers is proposed and a clustering algorithm is applied in a preprocessing step to reduce the amount of small variations in color.

Abstract: An approach to automatic text location and identification of colored book and journal covers is proposed. To reduce the amount of small variations in color, a clustering algorithm is applied in a preprocessing step. Two methods have been developed for extracting text hypotheses. One is based on a top-down analysis using successive splitting of image regions. The other is a bottom-up region growing algorithm. The results of both methods are combined to robustly distinguish between text and non-text elements. Text elements are binarized using automatically extracted information about text color. The binarized text regions can be used as input for a conventional OCR module. Results are shown for parts of book and journal covers of different complexity. The proposed method is not restricted to cover pages, but can be applied to the extraction of text from other types of color images as well.

98 citations

Proceedings Article

Content-based retrieval in multimedia imaging

James Dowe

14 Apr 1993 - Storage and Retrieval for Image and Video Databases

TL;DR: Content-based retrieval is founded on neural networks; this technology allows automatic filing of images and a wide range of possible queries of the resulting database, in contrast to methods such as entering SQL keys manually for each image as it is filed and later correctly re-entering those keys to retrieve the same image.

Abstract: Content-based retrieval is founded on neural networks; this technology allows automatic filing of images and a wide range of possible queries of the resulting database. This is in contrast to methods such as entering SQL keys manually for each image as it is filed and later correctly re-entering those keys to retrieve the same image. An SQL-based approach does not take into account information that is hard to describe with text, such as sounds and images. Neural networks can be trained to translate 'noisy' or chaotic image data into simpler, more reliable feature sets. By converting the images into the level of abstraction necessary for symbolic processing, standard database indexing methods can then be applied, or used in layers of associative database neural networks directly.

98 citations

Book Chapter

Bangla/English script identification based on analysis of connected component profiles

Lijun Zhou¹, Yue Lu¹, Chew Lim Tan²

East China Normal University¹, National University of Singapore²

13 Feb 2006

TL;DR: Experimental results demonstrate that the proposed technique is capable of identifying Bangla/English scripts on the real Bangladesh postal images.

Abstract: Script identification is required for a multilingual OCR system. In this paper, we present a novel and efficient technique for Bangla/English script identification with applications to the destination address block of Bangladesh envelope images. The proposed approach is based upon the analysis of connected component profiles extracted from the destination address block images; however, it does not place any emphasis on the information provided by individual characters themselves and does not require any character/line segmentation. Experimental results demonstrate that the proposed technique is capable of identifying Bangla/English scripts on the real Bangladesh postal images.

98 citations

Journal Article

OCRSpell: an interactive spelling correction system for OCR errors in text

Kazem Taghva, Eric Stofsky

01 Mar 2001 - International Journal on Document Analysis and Recognition

TL;DR: A spelling correction system designed specifically for OCR-generated text that selects candidate words through the use of information gathered from multiple knowledge sources is described, based on static and dynamic device mappings, approximate string matching, and n-gram analysis.

Abstract: In this paper, we describe a spelling correction system designed specifically for OCR-generated text that selects candidate words through the use of information gathered from multiple knowledge sources. This system for text correction is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. Our statistically based, Bayesian system incorporates a learning feature that collects confusion information at the collection and document levels. An evaluation of the new system is presented as well.
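
The abstract names approximate string matching as one of several knowledge sources but gives no algorithmic detail. As an illustration of that single ingredient, and not of the OCRSpell system itself, the sketch below ranks lexicon words by Levenshtein edit distance to an OCR token. The names `edit_distance` and `candidates`, the `max_distance` threshold, and the sample lexicon are all assumptions made for this example; device mappings, n-gram analysis, and the Bayesian learning component are not modeled.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def candidates(token, lexicon, max_distance=2):
    """Return lexicon words within max_distance of token, closest first."""
    scored = sorted((edit_distance(token, word), word) for word in lexicon)
    return [word for dist, word in scored if dist <= max_distance]


if __name__ == "__main__":
    # "rn" misread for "m" is a commonly cited kind of OCR confusion.
    print(candidates("rnodern", ["modern", "modem", "northern"]))
```

A plain edit distance treats every character substitution as equally likely; a system that uses device-specific confusion information, as the abstract describes, would presumably weight errors differently.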

97 citations

Network Information

Performance

Metrics

Papers: 7,941
Citations: 180,323

No. of papers in the topic in previous years
Year	Papers
2023	186
2022	425
2021	333
2020	448
2019	430
2018	357
