Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.

Papers published on a yearly basis (chart; year axis spans 1969 to 2023)

Papers

Proceedings Article

Optical recognition of chemical graphics

Richard G. Casey¹, Stephen K. Boyer¹, P. Healey¹, Alex Miller¹, B. Oudot¹, K. Zilles¹

IBM¹

20 Oct 1993

TL;DR: A prototype system for encoding chemical structure diagrams from scanned printed documents is described, and the final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.

Abstract: A prototype system for encoding chemical structure diagrams from scanned printed documents is described. The system distinguishes a structure diagram from other printed material on a page image using size and spacing characteristics. It distinguishes line graphics from symbols in an intermediate vectorization stage. Line information is mapped into a connection diagram that represents atomic bonds. Atomic symbols are identified by means of chemical drawing conventions and optical character recognition. The final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.

40 citations

Proceedings Article

OCR error correction using a noisy channel model

Okan Kolak¹, Philip Resnik¹

University of Maryland, College Park¹

01 Jan 2002

TL;DR: A very general, theoretically optimal model is applied to the problem of OCR word correction, practical methods for parameter estimation are introduced, and performance on real data is evaluated.

Abstract: In this paper, we take a pattern recognition approach to correcting errors in text generated from printed documents using optical character recognition (OCR). We apply a very general, theoretically optimal model to the problem of OCR word correction, introduce practical methods for parameter estimation, and evaluate performance on real data.

40 citations
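
The noisy channel idea named in the entry above can be illustrated with a minimal sketch: pick the word w that maximizes P(w) · P(observed | w). This is not the authors' model or parameter estimation method. The vocabulary, probabilities, confusion table, and function names below are all hypothetical, and the channel is simplified to independent per-character substitutions between equal-length strings.

```python
import math

# Toy source (language) model P(w); words and probabilities are invented.
WORD_PRIOR = {"clear": 0.7, "dear": 0.2, "deer": 0.1}

# Toy channel model: P(observed char | true char) for a few confusions,
# keyed as (true, observed). Values are invented for illustration.
CHAR_CONFUSION = {("e", "c"): 0.1, ("a", "o"): 0.05}
P_CORRECT = 0.9   # assumed probability that a character is read correctly
P_OTHER = 0.001   # assumed floor for any confusion not listed above


def channel_log_prob(observed: str, truth: str) -> float:
    """log P(observed | truth), assuming equal length and independent characters."""
    if len(observed) != len(truth):
        return float("-inf")
    total = 0.0
    for t, o in zip(truth, observed):
        p = P_CORRECT if t == o else CHAR_CONFUSION.get((t, o), P_OTHER)
        total += math.log(p)
    return total


def correct(observed: str) -> str:
    """Return the vocabulary word maximizing log P(w) + log P(observed | w)."""
    return max(
        WORD_PRIOR,
        key=lambda w: math.log(WORD_PRIOR[w]) + channel_log_prob(observed, w),
    )
```

With these toy numbers, `correct("dcer")` selects `"deer"`, because the listed e-to-c confusion outweighs the higher prior of `"dear"`. A realistic channel model would also need to handle insertions, deletions, and segmentation errors, which this sketch deliberately omits.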

Patent

Gesture-based selective text recognition

Dar-Shyang Lee¹, Lee-Feng Chien¹, Aries Hsieh¹, Pin Ting¹, Kin Wong¹

Google¹

06 Oct 2010

TL;DR: An image is displayed on a touch screen, a user's underline gesture on the displayed image is detected, and a text region including the text is identified in the surrounding region and cropped from the image.

Abstract: An image is displayed on a touch screen. A user's underline gesture on the displayed image is detected. The area of the image touched by the underline gesture and a surrounding region approximate to the touched area are identified. Skew for text in the surrounding region is determined and compensated. A text region including the text is identified in the surrounding region and cropped from the image. The cropped image is transmitted to an optical character recognition (OCR) engine, which processes the cropped image and returns OCR'ed text. The OCR'ed text is outputted.

40 citations

Posted Content

Fooling OCR Systems with Adversarial Text Images

Congzheng Song, Vitaly Shmatikov

15 Feb 2018 - arXiv: Learning

TL;DR: It is demonstrated that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images.

Abstract: We demonstrate that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images. Minor modifications to images of printed text, which do not change the meaning of the text to a human reader, cause the OCR system to "recognize" a different text where certain words chosen by the adversary are replaced by their semantic opposites. This completely changes the meaning of the output produced by the OCR system and by the NLP applications that use OCR for preprocessing their inputs.

40 citations

Proceedings Article

A Fast Re-scoring Strategy to Capture Long-Distance Dependencies

Anoop Deoras¹, Tomas Mikolov², Kenneth Church¹

Johns Hopkins University¹, Brno University of Technology²

27 Jul 2011

TL;DR: A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in the natural language and a hill climbing method (iterative decoding) is proposed to search over islands of confusability in the word lattice.

Abstract: A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in natural language. Two-pass strategies have become popular in a number of recognition tasks such as ASR (automatic speech recognition), MT (machine translation) and OCR (optical character recognition). The first pass typically applies a weak language model (n-grams) to a lattice, and the second pass applies a stronger language model to N-best lists. The stronger language model is intended to capture more long-distance dependencies. The proposed method uses an RNN-LM (recurrent neural network language model), which is a long-span LM, to re-score word lattices in the second pass. A hill climbing method (iterative decoding) is proposed to search over islands of confusability in the word lattice. An evaluation based on Broadcast News shows speedups by a factor of 20 over basic N-best re-scoring, and a word error rate reduction of 8% (relative) on a highly competitive setup.

40 citations
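
As a rough illustration of the two-pass setup described in the entry above, the sketch below shows only the basic N-best re-scoring baseline: each first-pass hypothesis has a stronger language model score added to it, and the list is re-ranked. It does not implement the proposed lattice re-scoring or iterative decoding. The function names, the `weight` parameter, and the toy "strong" model (a stand-in for a long-span LM such as an RNN-LM) are all hypothetical.

```python
from typing import Callable, List, Tuple

# A hypothesis is (text, first-pass log score from the weak model).
Hypothesis = Tuple[str, float]


def rescore_n_best(
    n_best: List[Hypothesis],
    strong_lm: Callable[[str], float],
    weight: float = 1.0,
) -> List[Hypothesis]:
    """Add a weighted second-pass LM log score and sort best-first."""
    rescored = [(text, score + weight * strong_lm(text)) for text, score in n_best]
    return sorted(rescored, key=lambda h: h[1], reverse=True)


def toy_strong_lm(text: str) -> float:
    """Hypothetical long-span model: penalizes a plural subject without 'bark'."""
    words = text.split()
    agrees = "dogs" in words and "bark" in words
    return 0.0 if agrees else -5.0


# First-pass scores slightly prefer the hypothesis with the agreement error.
first_pass = [
    ("the dogs in the yard barks", -10.0),
    ("the dogs in the yard bark", -10.5),
]
best_text, best_score = rescore_n_best(first_pass, toy_strong_lm)[0]
```

Here the second pass flips the ranking, so `best_text` is `"the dogs in the yard bark"`. The dependency between "dogs" and "bark" spans several words, which is the kind of long-distance relation an n-gram first pass can miss.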

Network Information

Performance

Metrics

Papers: 7,941
Citations: 180,323

No. of papers in the topic in previous years
Year	Papers
2023	186
2022	425
2021	333
2020	448
2019	430
2018	357
