Topic

Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR or optical character reader.


Papers
Book ChapterDOI
22 Sep 2011
TL;DR: Based on the rich annotations of the proposed NEOCR dataset, new and more precise evaluations are now possible, which give more detailed information on where improvements are most required in natural image text OCR.
Abstract: Recently, growing attention has been paid to recognizing text in natural images. Natural image text OCR is far more complex than OCR in scanned documents. Text in real-world environments appears in arbitrary colors, font sizes and font types, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently there are no publicly available datasets which cover all aspects of natural image OCR. We propose a comprehensive, well-annotated, configurable dataset for optical character recognition in natural images for the evaluation and comparison of approaches tackling natural image text OCR. Based on the rich annotations of the proposed NEOCR dataset, new and more precise evaluations are now possible, which give more detailed information on where improvements are most required in natural image text OCR.

54 citations
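For context, the "more precise evaluations" such annotations enable usually boil down to comparing recognized text against ground-truth transcriptions. The sketch below shows a generic character-level accuracy measure based on edit distance; it is an illustration in Python, not the NEOCR evaluation protocol itself.

```python
# Generic character-level OCR accuracy: 1 - (edit distance / ground-truth length).
# Illustrative only; this is not the NEOCR dataset's own evaluation code.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming (two-row version)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def char_accuracy(recognized: str, ground_truth: str) -> float:
    """Clamp to 0 so badly wrong outputs do not yield negative accuracy."""
    if not ground_truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1 - edit_distance(recognized, ground_truth) / len(ground_truth))
```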

Patent
06 Mar 2000
TL;DR: In this paper, an attribute recognition program such as an optical character recognition (OCR) program is used on the scanned product label which generates text strings from alphanumeric label information and graphics maps/images from graphics/logos.
Abstract: The present invention provides a system, method and apparatus for identifying a product through reading of the product label by a retail terminal. The product/product label is scanned by an imager of a retail terminal. An attribute recognition program such as an optical character recognition (OCR) program is used on the scanned product label which generates text strings from alphanumeric label information and graphics maps/images from graphics/logos. Text strings and/or graphics data are then compared to various text strings and graphics data in a database or look-up table to return information relative to the scanned text string(s)/graphic(s). In one form, kiosks, incorporating an imager and the necessary hardware and software to scan a product label and process the scanned information in accordance with the present principles, may provide printouts of product information, instructions, order forms or the like for the scanned product. Additionally, standard queries or user-generated queries may be answered relative to the scanned product label. Data, stored either locally or at a remote site accessible via a network or the like, is correlated to a plurality of text strings/graphics that correspond to alphanumeric text/graphics on a plurality of product labels.

54 citations
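To make the label-reading flow in the abstract above concrete, here is a minimal Python sketch. It assumes pytesseract and Pillow for the OCR step and uses an in-memory dictionary as a stand-in for the database or look-up table; these specific choices are illustrative and do not come from the patent, and the graphics/logo matching branch is omitted.

```python
from PIL import Image
import pytesseract  # assumes the Tesseract binary is installed

# Stand-in for the product database / look-up table described in the abstract:
# known label text fragments mapped to product information.
PRODUCT_TABLE = {
    "ACME COLA 330ML": {"sku": "000123", "price": 0.99},
    "ACME CHIPS SALTED": {"sku": "000456", "price": 1.49},
}

def identify_product(label_image_path: str):
    """OCR a scanned product label and match the text against the look-up table."""
    text = pytesseract.image_to_string(Image.open(label_image_path))
    normalized = " ".join(text.upper().split())
    for label_text, info in PRODUCT_TABLE.items():
        if label_text in normalized:
            return info
    return None  # unknown label; a fuller system would fall back to logo/graphics matching

if __name__ == "__main__":
    print(identify_product("scanned_label.png"))  # hypothetical input file
```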

Patent
17 Apr 2003
TL;DR: In this paper, a method and system for translating written text from a first (foreign) language to a second (native) language is provided, where an image containing the text is first captured at the request of the user, and text zones are identified in the image and the zones are converted to text characters using optical character recognition.
Abstract: A method and system for translating written text from a first (foreign) language to a second (native) language is provided. An image containing the text is first captured at the request of the user. Text zones are identified in the image and the zones are converted to text characters using optical character recognition. The text characters, which are in the first language, are translated to the second language. The translated text is then output to the user. The text may be converted to an image that can be displayed on a display or, alternatively, the text may be synthesized into speech that may be played over a speaker accessible to the user such as an earpiece. Data can be provided to the user as text, audio or text and audio combined.

54 citations
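The capture, OCR, translate, output pipeline described above can be sketched as follows. pytesseract again stands in for the OCR stage (the source-language model must be installed separately), translate_text() is a placeholder for whatever machine-translation backend a real device would call, and the speech-synthesis branch is left out.

```python
from PIL import Image
import pytesseract

def ocr_image(image_path: str, source_lang: str = "deu") -> str:
    """Extract text in the first (foreign) language from the captured image.
    The 'deu' language code is an example; the matching Tesseract language
    data must be installed."""
    return pytesseract.image_to_string(Image.open(image_path), lang=source_lang)

def translate_text(text: str, target_lang: str = "en") -> str:
    """Placeholder for the translation step: plug in any MT service or library."""
    raise NotImplementedError("wire up a translation backend")

def translate_sign(image_path: str) -> str:
    """Capture -> OCR -> translate; the result could be shown on a display
    or fed to a text-to-speech engine, as the abstract describes."""
    foreign_text = ocr_image(image_path)
    return translate_text(foreign_text)
```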

Journal ArticleDOI
TL;DR: Texture is investigated as a tool for determining the script of a handwritten document image, based on the observation that text has a distinct visual texture.
Abstract: Automatic handwritten script identification from document images facilitates many important applications, such as sorting, transcription of multilingual documents and indexing of large collections of such images, or serves as a precursor to optical character recognition (OCR). In this paper, we investigate texture as a tool for determining the script of a handwritten document image, based on the observation that text has a distinct visual texture. Further, a K-nearest-neighbour algorithm is used to classify 300 text blocks as well as 400 text lines into one of the three major scripts: English, Devnagari and Urdu, based on 13 spatial spread features extracted using morphological filters. The proposed algorithm attains an average classification accuracy as high as 99.2% for bi-script and 88.6% for tri-script separation at the text-line and text-block level, respectively, under a five-fold cross-validation test. General Terms: Pattern Recognition, Document Image Analysis

54 citations
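The classification stage described in the abstract maps directly onto a K-nearest-neighbour classifier over 13 texture features evaluated with five-fold cross-validation. The sketch below uses scikit-learn; the feature extractor is left as a stub because the paper's morphological-filter features are not reproduced here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def extract_spread_features(text_block: np.ndarray) -> np.ndarray:
    """Stub: compute the 13 spatial-spread features from morphological filter
    responses (e.g. erosions/dilations with directional structuring elements)."""
    raise NotImplementedError

def evaluate(blocks, labels, k: int = 5) -> float:
    """Five-fold cross-validated accuracy of a KNN script classifier."""
    X = np.vstack([extract_spread_features(b) for b in blocks])  # (n_samples, 13)
    y = np.asarray(labels)                                       # script label per block
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    return scores.mean()
```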

Proceedings ArticleDOI
21 Dec 2000
TL;DR: The Medical Article Record System (MARS), as discussed by the authors, employs document image analysis and understanding techniques and optical character recognition (OCR) to produce bibliographic records for the MEDLINE database.
Abstract: The National Library of Medicine (NLM) is developing an automated system to produce bibliographic records for its MEDLINE database. This system, named the Medical Article Record System (MARS), employs document image analysis and understanding techniques and optical character recognition (OCR). This paper describes a key module in MARS called the Automated Labeling (AL) module, which labels all zones of interest (title, author, affiliation, and abstract) automatically. The AL algorithm is based on 120 rules derived from an analysis of journal page layouts and features extracted from OCR output. Experiments carried out on more than 11,000 articles in over 1,000 biomedical journals show that the accuracy of this rule-based algorithm exceeds 96%.

54 citations
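As a toy illustration of the rule-based zone labeling performed by the AL module, the sketch below maps a few simple layout and OCR features of a zone to a bibliographic field. The features and rules shown are invented examples for illustration, not the 120 rules used by MARS.

```python
from dataclasses import dataclass

@dataclass
class Zone:
    text: str
    top: float             # vertical position on the page: 0 = top, 1 = bottom
    mean_font_size: float  # estimated from OCR output
    has_email: bool

def label_zone(zone: Zone) -> str:
    """Assign a bibliographic label to an OCR zone using hand-written rules."""
    if zone.top < 0.15 and zone.mean_font_size > 14:
        return "title"
    if zone.has_email or "@" in zone.text:
        return "affiliation"
    if zone.top < 0.30 and "," in zone.text and len(zone.text) < 200:
        return "author"
    if len(zone.text.split()) > 80:
        return "abstract"
    return "other"
```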


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year: Papers
2023: 186
2022: 425
2021: 333
2020: 448
2019: 430
2018: 357