Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.

Papers published on a yearly basis

[Chart: number of papers per year, 1969 to 2023]

Papers

Proceedings Article

Optical Character Recognition for Sanskrit Using Convolution Neural Networks

Meduri Avadesh¹, Navneet Goyal¹

Birla Institute of Technology and Science¹

24 Apr 2018

TL;DR: This paper proposes a Convolutional Neural Network based Optical Character Recognition (OCR) system which accurately digitizes ancient Sanskrit manuscripts (Devanagari script) that are not necessarily in good condition.

Abstract: Ancient Sanskrit manuscripts are a rich source of knowledge about Science, Mathematics, Hindu mythology, Indian civilization, and culture. It therefore becomes critical that access to these manuscripts is made easy, to share this knowledge with the world and to facilitate further research on this Ancient literature. In this paper, we propose a Convolutional Neural Network (CNN) based Optical Character Recognition system (OCR) which accurately digitizes Ancient Sanskrit manuscripts (Devanagari Script) that are not necessarily in good condition. We use an image segmentation algorithm for calculating pixel intensities to identify letters in the image. The OCR considers typical compound characters (half letter combinations) as separate classes in order to improve the segmentation accuracy. The novelty of the OCR is its robustness to image quality, image contrast, font style and font size, which makes it an ideal choice for digitizing soiled and poorly maintained Sanskrit manuscripts.

40 citations
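
The Sanskrit OCR abstract above mentions segmenting the image by pixel intensities before recognition but gives no algorithmic detail. The following is a minimal sketch of one common approach, a vertical projection profile, and is not the authors' method. The function names, the intensity threshold of 128 and the toy image are all assumptions made for illustration. It also assumes glyphs are already separated by blank columns, which connected scripts such as Devanagari usually do not satisfy without further preprocessing.

```python
def column_profile(image, threshold=128):
    """Count dark ("ink") pixels in each column of a grayscale image.

    `image` is a list of rows; each row is a list of 0-255 intensities.
    Pixels darker than `threshold` count as ink (an assumed cutoff).
    """
    if not image:
        return []
    width = len(image[0])
    return [sum(1 for row in image if row[x] < threshold) for x in range(width)]


def split_columns(profile):
    """Return (start, end) column spans, end exclusive, that contain ink."""
    spans = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a glyph candidate begins
        elif count == 0 and start is not None:
            spans.append((start, x))       # a blank column closes it
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


# Toy 3x7 image (hypothetical): two dark blobs separated by a white column.
W, B = 255, 0
toy = [
    [W, B, B, W, B, W, W],
    [W, B, W, W, B, B, W],
    [W, B, B, W, B, W, W],
]
profile = column_profile(toy)
spans = split_columns(profile)
print(profile, spans)
```

Each span could then be cropped and passed to a classifier. Treating frequent compound characters as their own classes, as the abstract describes, would reduce how often a single span has to be split further.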

Journal Article

Robust Handwritten Character Recognition with Features Inspired by Visual Ventral Stream

Ali Borji, Mandana Hamidi¹, Fariborz Mahmoudi¹

Islamic Azad University¹

01 Oct 2008-Neural Processing Letters

TL;DR: This paper focuses on the applicability of the features inspired by the visual ventral stream for handwritten character recognition, and an analysis is conducted to evaluate the robustness of this approach to orientation, scale and translation distortions.

Abstract: This paper focuses on the applicability of the features inspired by the visual ventral stream for handwritten character recognition. A set of scale and translation invariant C2 features is first extracted from all images in the dataset. Three standard classifiers (kNN, ANN and SVM) are then trained on a training set and compared on a separate test set. To achieve a higher recognition rate, a two-stage classifier was designed with different preprocessing in the second stage. Experiments performed to validate the method on the well-known MNIST database and on standard Farsi digits and characters exhibit high recognition rates and compete with some of the best existing approaches. Moreover, an analysis is conducted to evaluate the robustness of this approach to orientation, scale and translation distortions.

40 citations
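
The abstract above describes training standard classifiers on a training set and comparing them on a separate test set. As a rough illustration of that evaluation protocol, here is a minimal k-nearest-neighbour sketch. The two-dimensional feature vectors are hypothetical stand-ins for the C2 features, which this sketch does not compute, and the function names and `k=3` are assumptions.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is
    squared Euclidean, which preserves the Euclidean ordering.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of held-out test samples that are classified correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Hypothetical toy features for two digit classes.
train_set = [
    ((0.0, 0.1), "0"), ((0.2, 0.0), "0"), ((0.1, 0.2), "0"),
    ((1.0, 0.9), "1"), ((0.9, 1.1), "1"), ((1.1, 1.0), "1"),
]
test_set = [((0.1, 0.1), "0"), ((1.0, 1.0), "1")]

print(accuracy(train_set, test_set))
```

The same `accuracy` loop could wrap any classifier with a comparable predict function, which is what makes a side-by-side comparison on one test set straightforward.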

Proceedings Article

Automatic text location in natural scene images

Chuang Li¹, Xiaoqing Ding¹, Youshou Wu¹

Tsinghua University¹

10 Sep 2001

TL;DR: A new connected component based segmentation algorithm which automatically extracts text regions from natural scene images is proposed in this paper, utilizing a multichannel decomposition method to locate text blocks in complex backgrounds.

Abstract: A new connected component based segmentation algorithm which automatically extracts text regions from natural scene images is proposed in this paper. This approach utilizes a multichannel decomposition method to locate text blocks in complex backgrounds. Block alignment analysis and recognition confidence values are used in the combination and identification of the connected components. The algorithm is applied to a test image database and shows promising results.

40 citations
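
The text-location abstract above builds on connected components. The sketch below shows only that basic building block: labelling 4-connected foreground regions in a binary mask and returning their bounding boxes. It does not reproduce the paper's multichannel decomposition, block alignment analysis or recognition confidence steps. The function name, the choice of 4-connectivity and the toy mask are assumptions.

```python
from collections import deque


def connected_components(grid):
    """Find 4-connected regions of 1s in a binary grid.

    Returns one bounding box per region as (top, left, bottom, right),
    all inclusive, in the order the regions are met in a row-major scan.
    """
    height = len(grid)
    width = len(grid[0]) if grid else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if grid[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical binary mask with two separate blobs.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
boxes = connected_components(mask)
print(boxes)
```

In a text-location pipeline, boxes like these would typically be filtered and grouped by size and alignment before any are accepted as text candidates.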

Book

Digital document processing: major directions and recent advances

Bidyut B. Chaudhuri

01 Jan 2007

TL;DR: This book discusses OCR Technologies for Machine Printed and Hand Printed Japanese Text, Meta-Data Extraction from Bibliographic Documents for the Digital Library, and Biometric and Forensic Aspects of Digital Document Processing.

Abstract: Reading Systems: An Introduction to Digital Document Processing; Document Structure and Layout Analysis; OCR Technologies for Machine Printed and Hand Printed Japanese Text; Multi-Font Printed Tibetan OCR; On OCR of a Printed Indian Script; A Bayesian Network Approach for On-line Handwriting Recognition; New Advances and New Challenges in On-Line Handwriting Recognition and Electronic Ink Management; Off-Line Roman Cursive Handwriting Recognition; Robustness Design of Industrial Strength Recognition Systems; Arabic Cheque Processing System: Issues and Future Trends; OCR of Printed Mathematical Expressions; The State of the Art of Document Image Degradation Modelling; Advances in Graphics Recognition; An Introduction to Super-Resolution Text; Meta-Data Extraction from Bibliographic Documents for the Digital Library; Document Information Retrieval; Biometric and Forensic Aspects of Digital Document Processing; Web Document Analysis; Semantic Structure Analysis of Web Documents; Bank Cheque Data Mining: Integrated Cheque Recognition Technologies.

40 citations

Journal Article

OCR For Printed Urdu Script Using Feed Forward Neural Network

Inam Shamsher, Zaheer Ahmad, Jehanzeb Khan Orakzai, Awais Adnan

22 Oct 2007-World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering

TL;DR: This article presents an optical character recognition system for printed Urdu, a popular Pakistani/Indian script and the third largest understandable language in the world, for which fewer efforts have been made to make it understandable to computers.

Abstract: This paper deals with an Optical Character Recognition system for printed Urdu, a popular Pakistani/Indian script. Urdu is the third largest understandable language in the world, especially in the subcontinent, but fewer efforts have been made to make it understandable to computers. A lot of work has been done in the field of literature and Islamic studies in Urdu, which has to be computerized. In the proposed system, individual characters are recognized using our own proposed methods and algorithms. The feature detection methods are simple and robust. Supervised learning is used to train the feed forward neural network. A prototype of the system has been tested on printed Urdu characters and currently achieves 98.3% character level accuracy on average. Although the system is script and language independent, we have designed it for Urdu characters only.

40 citations
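
The Urdu OCR abstract above names a feed forward neural network and reports character level accuracy, without specifying the architecture or features. The sketch below shows, under assumptions, what a forward pass through a one-hidden-layer network and a character-level accuracy computation might look like. The layer sizes, the hand-set weights, the sigmoid and softmax choices and the function names are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input -> sigmoid hidden layer -> softmax output.

    Each weight matrix is a list of rows, one row per unit in that layer.
    Returns one probability per output class.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    peak = max(logits)                      # subtract the max for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def character_accuracy(predicted, expected):
    """Fraction of positions where the predicted character is correct."""
    if len(predicted) != len(expected):
        raise ValueError("sequences must have the same length")
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical weights: identity matrices and zero biases.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
probs = forward([1.0, 0.0], identity, zeros, identity, zeros)
print(probs)

# Three of four characters match in this made-up example.
print(character_accuracy(list("abcd"), list("abed")))
```

Supervised training would adjust the weights to raise the probability of the labelled class for each training sample; that step is omitted here.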

Performance

Metrics

Papers: 7,941

Citations: 180,323

No. of papers in the topic in previous years
Year	Papers
2023	186
2022	425
2021	333
2020	448
2019	430
2018	357
