Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published on this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.


Papers published on a yearly basis

[Chart: number of papers per year, 1969 to 2023]

Papers

Journal Article

A script-independent methodology for optical character recognition

John Makhoul¹, Richard Schwartz¹, Christopher LaPre¹, Issam Bazzi¹

BBN Technologies¹

01 Sep 1998-Pattern Recognition

TL;DR: A methodology for OCR that exhibits the following properties: script-independent feature extraction, training, and recognition components; no separate segmentation at the character and word levels; and the training is performed automatically on data that is also not presegmented.


47 citations

Proceedings Article

Machine vision for keyword spotting using pseudo 2D hidden Markov models

S.-s. Kuo¹, O.E. Agazzi¹

Bell Labs¹

27 Apr 1993

TL;DR: An algorithm for robust machine recognition of keywords embedded in a poorly printed document is presented, where two statistical models, called pseudo-2D hidden Markov models (P2-DHMMs), are created for representing the actual keyword and all the other extraneous words, respectively.


Abstract: An algorithm for robust machine recognition of keywords embedded in a poorly printed document is presented. For each keyword, two statistical models, called pseudo-2D hidden Markov models (P2-DHMMs), are created for representing the actual keyword and all the other extraneous words, respectively. Dynamic programming is then used for matching an unknown input word with the two models and making a maximum likelihood decision. Although the models are pseudo 2-D in the sense that they are not fully connected 2-D networks, they are shown to be general enough to characterize printed words efficiently. These models facilitate a nice 'elastic matching' property in both horizontal and vertical directions, which makes the recognizer not only independent of size and slant but also tolerant of highly deformed and noisy words. The system is evaluated on a synthetically created database which contains about 26,000 words. A recognition accuracy of 99% is achieved when words in testing and training sets are in the same font size. An accuracy of 96% is achieved when they are in different sizes. In the latter case, the conventional 1-D HMM approach achieves only a 70% accuracy rate.
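The two-model maximum likelihood decision described in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's P2-DHMM: it uses ordinary 1-D discrete HMMs, scores an observation sequence against a keyword model and a filler model with Viterbi dynamic programming, and accepts the keyword when its score is higher. The toy models, the symbol alphabet {0, 1}, and all function names are assumptions made for illustration.

```python
import math


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_likelihood(obs, start, trans, emit):
    """Log-probability of the single best state path through a discrete HMM."""
    n_states = len(start)
    # Score of being in each state after the first observation.
    scores = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n_states)]
    for symbol in obs[1:]:
        # Dynamic programming step: best predecessor for every state.
        scores = [
            max(scores[p] + _log(trans[p][s]) for p in range(n_states))
            + _log(emit[s][symbol])
            for s in range(n_states)
        ]
    return max(scores)


def is_keyword(obs, keyword_model, filler_model):
    """Maximum likelihood decision between the keyword and filler models."""
    return (viterbi_log_likelihood(obs, *keyword_model)
            > viterbi_log_likelihood(obs, *filler_model))


# Hypothetical toy models: (start probabilities, transitions, emissions).
# The keyword model is left-to-right: mostly 0s followed by mostly 1s.
KEYWORD_MODEL = ([1.0, 0.0],
                 [[0.5, 0.5], [0.0, 1.0]],
                 [[0.9, 0.1], [0.1, 0.9]])
# The filler model has one state that emits both symbols equally often.
FILLER_MODEL = ([1.0], [[1.0]], [[0.5, 0.5]])

print(is_keyword([0, 0, 1, 1], KEYWORD_MODEL, FILLER_MODEL))
print(is_keyword([1, 1, 0, 0], KEYWORD_MODEL, FILLER_MODEL))
```

Working in the log domain avoids numerical underflow on long sequences, which matters when the same comparison is applied to real feature sequences rather than four-symbol toys.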


47 citations

Journal Article

Prototype reduction using an artificial immune model

Utpal Garain¹

Indian Statistical Institute¹

15 Aug 2008-Pattern Analysis and Applications

TL;DR: Empirical study shows that the proposed artificial immune system (AIS)-based pattern classification approach exhibits very good generalization ability in generating a smaller prototype library from a larger one and at the same time giving a substantial improvement in the classification accuracy of the underlying NN classifier.


Abstract: The artificial immune system (AIS)-based pattern classification approach is relatively new in the field of pattern recognition. The study explores the potential of this paradigm in the context of the prototype selection task, which is primarily effective in improving the classification performance of the nearest-neighbor (NN) classifier and also partially in reducing its storage and computing time requirements. The clonal selection model of immunology has been incorporated to condense the original prototype set, and performance is verified by employing the proposed technique in a practical optical character recognition (OCR) system as well as for training and testing on a set of benchmark databases available in the public domain. The effect of control parameters is analyzed and the efficiency of the method is compared with other existing techniques often used for prototype selection. In the case of the OCR system, empirical study shows that the proposed approach exhibits very good generalization ability in generating a smaller prototype library from a larger one and at the same time giving a substantial improvement in the classification accuracy of the underlying NN classifier. The improvement in performance has been statistically verified. Consideration of both OCR data and public domain datasets demonstrates that the proposed method gives results better than or at least comparable to those of some existing techniques.
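To make the prototype selection task concrete, here is a minimal sketch of condensing a prototype set for a 1-NN classifier. It does not implement the paper's clonal selection model; it swaps in Hart's condensed nearest neighbour rule, a classic baseline for the same task, purely to show what "a smaller prototype library from a larger one" means. The function names and the toy data are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(features, prototypes):
    """Label of the prototype closest to the given feature tuple (1-NN)."""
    return min(prototypes, key=lambda proto: sq_dist(features, proto[0]))[1]


def condense(samples):
    """Hart's condensed nearest neighbour rule (not the AIS method).

    Keeps a sample only if the prototypes kept so far misclassify it,
    and repeats passes over the data until nothing more is added.
    """
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for sample in samples:
            if sample in store:
                continue  # already a prototype; guards against endless re-adding
            features, label = sample
            if nearest_label(features, store) != label:
                store.append(sample)
                changed = True
    return store


# Hypothetical toy data: two well-separated 1-D classes.
SAMPLES = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
           ((5.0,), "b"), ((5.1,), "b"), ((5.2,), "b")]

PROTOTYPES = condense(SAMPLES)
print(len(SAMPLES), "samples reduced to", len(PROTOTYPES), "prototypes")
```

Hart's rule only guarantees that the reduced set still classifies the training samples correctly (absent conflicting duplicates); it does not aim to improve accuracy, which is the additional property the abstract reports for the AIS-based approach.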


47 citations

Proceedings Article

On-line Japanese character recognition experiments by an off-line method based on normalization-cooperated feature extraction

M. Hamanaka¹, Keiji Yamada, J. Tsukumo

NEC¹

20 Oct 1993

TL;DR: It is shown that an offline character recognition method is effective for use in online Japanese character recognition, and has been improved with developments in nonlinear shape normalization, nonlinear pattern matching, and the normalization-cooperated feature extraction method.


Abstract: It is shown that an offline character recognition method is effective for use in online Japanese character recognition. Major conventional online recognition methods have restricted the number and the order of strokes. The offline method removes these restrictions, based on pattern matching of orientation feature patterns. It has been improved with developments in nonlinear shape normalization, nonlinear pattern matching, and the normalization-cooperated feature extraction method. It was used to examine 52,944 online Kanji characters in 1,064 categories. The recognition rate achieved was 95.1%, and the cumulative recognition rate within the best five candidates was 99.3%.
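The two figures quoted at the end of this abstract are a top-1 rate and a top-5 cumulative rate. As a minimal sketch of how such rates are usually computed (the function name and the toy candidate lists are assumptions, not data from the paper):

```python
def cumulative_recognition_rate(ranked_candidates, truths, k):
    """Fraction of samples whose true label is among the k best candidates."""
    hits = sum(
        truth in candidates[:k]
        for candidates, truth in zip(ranked_candidates, truths)
    )
    return hits / len(truths)


# Hypothetical recognizer output: candidates ordered best first.
RANKED = [
    ["日", "目", "白"],
    ["木", "本", "末"],
    ["大", "犬", "太"],
    ["土", "士", "工"],
]
TRUTHS = ["日", "本", "太", "干"]

top1 = cumulative_recognition_rate(RANKED, TRUTHS, k=1)
top3 = cumulative_recognition_rate(RANKED, TRUTHS, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

The cumulative rate can never decrease as k grows, which is why a top-5 figure is always at least as high as the plain recognition rate.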


47 citations

Proceedings Article

Two-stage Approach for Word-wise Script Identification

Sukalpa Chanda¹, Srikanta Pal, Katrin Franke¹, Umapada Pal

Gjøvik University College¹

26 Jul 2009

TL;DR: A two-stage approach for word-wise identification of English, Devnagari and Bengali (Bangla) scripts is proposed, whose first stage identifies scripts at high speed but with lower accuracy when dealing with noisy data.


Abstract: A two-stage approach for word-wise identification of English (Roman), Devnagari and Bengali (Bangla) scripts is proposed. This approach balances the tradeoff between recognition accuracy and processing speed. The 1st stage allows identifying scripts with high speed, yet less accuracy when dealing with noisy data. The advanced 2nd stage processes only those samples that yield low recognition confidence in the first stage. For both stages a rough character segmentation is performed and features are computed on segmented character components. Features used in the 1st stage are a 64-dimensional chain-code-histogram feature, while 400-dimensional gradient features are used in the 2nd stage. Final classification of a word to a particular script is done via majority voting of each recognized character component of the word. Extensive experiments with various confidence scores were conducted and are reported here. The overall recognition accuracy and speed are remarkable. Correct classification of 98.51% on 11,123 test words is achieved, even when the recognition-confidence is as high as 95% at both stages.
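The control flow described in this abstract (a fast first stage, a slower second stage used only on low-confidence cases, then majority voting over character components) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the stage classifiers are stubbed as hypothetical callables returning a (label, confidence) pair, the fallback is applied per character component, and the 0.95 threshold simply mirrors the confidence level mentioned in the abstract.

```python
from collections import Counter


def classify_word(components, fast_stage, slow_stage, threshold=0.95):
    """Assign a script to a word by majority vote over its components.

    fast_stage and slow_stage are hypothetical classifiers; each takes a
    character component and returns (script_label, confidence).
    """
    votes = []
    for component in components:
        label, confidence = fast_stage(component)
        if confidence < threshold:
            # Only low-confidence components pay for the slower stage.
            label, _ = slow_stage(component)
        votes.append(label)
    return Counter(votes).most_common(1)[0][0]


# Stub classifiers that read precomputed answers from the toy components.
def fast_stub(component):
    return component["fast"]


def slow_stub(component):
    return component["slow"]


WORD = [
    {"fast": ("Bangla", 0.99), "slow": ("Bangla", 0.99)},
    {"fast": ("Roman", 0.60), "slow": ("Bangla", 0.90)},
    {"fast": ("Roman", 0.97), "slow": ("Roman", 0.99)},
]

print(classify_word(WORD, fast_stub, slow_stub))
print(classify_word(WORD, fast_stub, slow_stub, threshold=0.5))
```

Raising the threshold sends more components to the second stage, trading speed for accuracy; lowering it does the opposite, which is the tradeoff the abstract describes.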


47 citations



Performance Metrics

Papers: 7,941
Citations: 180,323

No. of papers in the topic in previous years
Year	Papers
2023	186
2022	425
2021	333
2020	448
2019	430
2018	357
