Topic

Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.

Papers published on a yearly basis

[Chart: number of papers published per year, 1969 to 2023]

Papers

Proceedings Article

Word spotting: a new approach to indexing handwriting

R. Manmatha, Chengfeng Han, Edward M. Riseman

18 Jun 1996

TL;DR: Experiments demonstrate the feasibility of the approach for indexing handwriting. The method should also be applicable to retrieving previously stored material from personal digital assistants (PDAs).

Abstract: There are many historical manuscripts written in a single hand which it would be useful to index. Examples include the W.B. DuBois collection at the University of Massachusetts and the early Presidential libraries at the Library of Congress. Since Optical Character Recognition (OCR) does not work well on handwriting, an alternative scheme based on matching the images of the words is proposed for indexing such texts. The current paper deals with the matching aspects of this process. Two different techniques for matching words are discussed. The first method matches words assuming that the transformation between the words may be modelled by a translation (shift). The second method matches words assuming that the transformation between the words may be modelled by an affine transform. Experiments are shown demonstrating the feasibility of the approach for indexing handwriting. The method should also be applicable to retrieving previously stored material from personal digital assistants (PDAs).
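
The first matching technique in the abstract above, matching under a pure translation, can be illustrated with a minimal sketch. It assumes two binary word images of equal size and scores each candidate shift by the number of disagreeing pixels (XOR count); the abstract does not specify the cost function or search range, so `shift_image`, `best_translation`, and `max_shift` are hypothetical names and choices, not the authors' implementation.

```python
import numpy as np

def shift_image(img, dy, dx):
    """Shift a 2-D boolean image by (dy, dx); vacated pixels become False."""
    out = np.zeros_like(img)
    h, w = img.shape
    # Part of the source that stays inside the frame after shifting.
    src = img[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    y0, x0 = max(0, dy), max(0, dx)
    out[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] = src
    return out

def best_translation(a, b, max_shift=3):
    """Exhaustively search shifts of b; return (cost, dy, dx) with the
    fewest mismatching pixels against a (assumed cost: XOR count)."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = int(np.count_nonzero(a ^ shift_image(b, dy, dx)))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

# Usage: a synthetic "word" (a filled rectangle) and a displaced copy of it.
word = np.zeros((20, 30), dtype=bool)
word[5:10, 8:20] = True
displaced = shift_image(word, 2, -3)
cost, dy, dx = best_translation(word, displaced)
```

A low residual cost at the best shift suggests the two images show the same word; an affine model, the abstract's second technique, would replace the shift search with a search over affine parameters.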

261 citations

Journal Article

High-order distance-based multiview stochastic learning in image classification.

Jun Yu¹, Yong Rui², Yuan Yan Tang³, Dacheng Tao⁴

Hangzhou Dianzi University¹, Microsoft², University of Macau³, University of Technology, Sydney⁴

17 Mar 2014-IEEE Transactions on Systems, Man, and Cybernetics

TL;DR: The proposed HD-MSL combines varied features into a unified representation and integrates the labeling information within a probabilistic framework. It can also automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data.

Abstract: How do we find all images in a larger set of images which have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) can be obtained, and how to efficiently utilize them is a challenge. It is inappropriate for the traditional concatenation scheme to link features of different views into a long vector, because each view has its specific statistical property and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternative optimization is designed to solve the objective functions of HD-MSL and obtain different views on coefficients and classification scores simultaneously. Experiments on two real-world datasets demonstrate the effectiveness of HD-MSL in image classification.

260 citations

Journal Article

Survey and bibliography of Arabic optical text recognition

Badr Al-Badr¹, Sabri A. Mahmoud²

University of Washington¹, King Saud University²

01 Jan 1995-Signal Processing

TL;DR: This paper introduces the general topic of optical character recognition (OCR), introduces a five-stage model for Arabic optical text recognition (AOTR) systems, classifies research work according to this model, and presents a historical review of Arabic text recognition systems.

260 citations

Proceedings Article

Adaptive document binarization

Jaakko Sauvola¹, Tapio Seppänen¹, S. Haapakoski¹, Matti Pietikäinen¹

University of Oulu¹

18 Aug 1997

TL;DR: A new method is presented for adaptive document image binarization in which the page is treated as a collection of subcomponents such as text, background, and picture, and document characteristics are used to determine (surface) attributes of the kind often used in document segmentation.

Abstract: A new method is presented for adaptive document image binarization, where the page is considered as a collection of subcomponents such as text, background and picture. The problems caused by noise, illumination and many source type related degradations are addressed. The algorithm uses document characteristics to determine (surface) attributes, often used in document segmentation. Using characteristic analysis, two new algorithms are applied to determine a local threshold for each pixel. An algorithm based on soft decision control is used for thresholding the background and picture regions. An approach utilizing local mean and variance of gray values is applied to textual regions. Tests were performed with images including different types of document components and degradations. The results show that the method adapts and performs well in each case.
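
The text-region step in the abstract above, a per-pixel threshold from the local mean and variance of gray values, can be sketched as follows. The abstract does not state the formula; this sketch assumes the form commonly attributed to Sauvola's method, T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The function names, the window size, and the values of k and R are illustrative assumptions, and the soft-decision step for background and picture regions is not covered.

```python
import numpy as np

def local_mean_std(gray, window):
    """Local mean and standard deviation over an odd-sized square window,
    computed with integral images; borders use edge padding."""
    pad = window // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral images with a leading row/column of zeros.
    ii = np.pad(padded.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    ii2 = np.pad((padded ** 2).cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, w = gray.shape

    def window_sum(t):
        return (t[window:window + h, window:window + w]
                - t[:h, window:window + w]
                - t[window:window + h, :w]
                + t[:h, :w])

    n = window * window
    mean = window_sum(ii) / n
    var = np.maximum(window_sum(ii2) / n - mean ** 2, 0.0)
    return mean, np.sqrt(var)

def sauvola_threshold(gray, window=15, k=0.2, R=128.0):
    """Per-pixel threshold T = m * (1 + k * (s / R - 1)) (assumed form)."""
    mean, std = local_mean_std(gray, window)
    return mean * (1.0 + k * (std / R - 1.0))

def binarize_text(gray, window=15, k=0.2, R=128.0):
    """Boolean mask that is True where a pixel is classified as (dark) text."""
    return gray <= sauvola_threshold(gray, window, k, R)

# Usage: a light page (gray level 200) with one dark stroke (gray level 50).
page = np.full((40, 40), 200.0)
page[18:22, 5:35] = 50.0
text_mask = binarize_text(page)
```

In flat regions the standard deviation is near zero, so the threshold drops below the local mean and uniform background is not marked as text; near strokes the higher deviation raises the threshold toward the mean.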

257 citations

Neural Networks in Machine Learning

Parul Prashar

01 Jan 2014

TL;DR: This paper discusses neural network approaches used in machine learning, a field applied in search engines, optical character recognition, computer vision, and other areas.

Abstract: Machine Learning is associated with the study and construction of systems that can learn on their own rather than following instructions. It is used in search engines, optical character recognition, computer vision etc. Neural networks are one of the several techniques used in machine learning. Here we are trying to discuss neural network approaches used in machine learning.

257 citations

Network Information

Performance

Metrics

Papers: 7,965
Citations: 180,597

No. of papers in the topic in previous years
Year	Papers
2023	191
2022	428
2021	333
2020	448
2019	431
2018	357
