Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.

Papers published on a yearly basis

[Chart: number of papers per year, 1969 to 2023]

Papers

Journal Article•DOI•

Dynamic-programming-based handwritten word recognition using the Choquet fuzzy integral as the match function

Paul D. Gader¹, Magdi A. Mohamed¹, James M. Keller¹•Institutions (1)

University of Missouri¹

01 Jan 1996-Journal of Electronic Imaging

TL;DR: Experimental results demonstrate the utility of the Choquet fuzzy integral in handwritten word recognition and indicate a simple choice of fuzzy integral works better than a more complex choice.

Abstract: The Choquet fuzzy integral is applied to handwritten word recognition. A handwritten word recognition system is described. The word recognition system assigns a recognition confidence value to each string in a lexicon of candidate strings. The system uses a lexicon-driven approach that integrates segmentation and recognition via dynamic programming matching. The dynamic programming matcher finds a segmentation of the word image for each string in the lexicon. The traditional match score between a segmentation and a string is an average. In this paper, fuzzy integrals are used instead of an average. Experimental results demonstrate the utility of this approach. A surprising result is obtained that indicates a simple choice of fuzzy integral works better than a more complex choice.
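To make the idea of replacing an average with a fuzzy integral concrete, here is a minimal sketch of a discrete Choquet integral over per-character confidence values. It assumes a cardinality-based fuzzy measure (the measure of a subset depends only on its size), which is an illustrative simplification and not necessarily the measure used in the paper; the names `choquet_integral`, `additive`, and `pessimistic` are hypothetical.

```python
from typing import Callable, Sequence


def choquet_integral(scores: Sequence[float],
                     measure: Callable[[int, int], float]) -> float:
    """Discrete Choquet integral of `scores`.

    `measure(k, n)` is an assumed cardinality-based fuzzy measure: the
    measure of any subset holding k of the n sources. It should satisfy
    measure(0, n) == 0, measure(n, n) == 1 and be nondecreasing in k.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one score")
    ordered = sorted(scores, reverse=True)  # h(x_(1)) >= ... >= h(x_(n))
    total = 0.0
    for i, h in enumerate(ordered, start=1):
        nxt = ordered[i] if i < n else 0.0  # h(x_(n+1)) is taken as 0
        total += (h - nxt) * measure(i, n)
    return total


def additive(k: int, n: int) -> float:
    # Additive measure: the integral reduces to the plain average.
    return k / n


def pessimistic(k: int, n: int) -> float:
    # Hypothetical measure that gives small subsets little weight,
    # so one badly matched character pulls the word score down more.
    return (k / n) ** 2


char_confidences = [0.9, 0.8, 0.2]  # made-up per-character match scores
avg_score = choquet_integral(char_confidences, additive)
strict_score = choquet_integral(char_confidences, pessimistic)
```

With the additive measure the result equals the traditional average match score; other measures change how strongly the weakest character matches influence the word-level confidence.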

35 citations

Proceedings Article•DOI•

Confidence guided progressive search and fast match techniques for high performance Chinese/English OCR

Zhi-Dan Feng¹, Qiang Huo¹•Institutions (1)

University of Hong Kong¹

11 Aug 2002

TL;DR: Two innovative techniques that contribute to the high efficiency in recognition of the mixed Chinese/English text line are presented, including a progressive search strategy based on character verification and a tree-based fast match technique with a confidence-guided adaptive stopping mechanism.

Abstract: In the past several years, we have been developing a high performance OCR engine for machine printed Chinese/English documents. We present two innovative techniques that contribute to the high efficiency in recognition of the mixed Chinese/English text line. They are (1) a progressive search strategy based on character verification, and (2) a tree-based fast match technique with a confidence-guided adaptive stopping mechanism. The efficacy of the proposed techniques is confirmed by experiments in a benchmark test.
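The abstract does not give implementation details, so the following is only a rough sketch of what a coarse-to-fine fast match with a confidence-guided stopping rule could look like. Everything here is an assumption for illustration: the two-level tree (cluster centroids over character templates), the squared Euclidean distance, the ratio-based confidence test, and the names `fast_match` and `margin`.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Cluster = Tuple[Vector, Dict[str, Vector]]  # (centroid, {label: template})


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fast_match(feature: Vector, clusters: List[Cluster],
               margin: float = 2.0) -> Tuple[str, int]:
    """Return (best label, number of clusters visited).

    Clusters are visited nearest-centroid first. The search stops early
    once the runner-up distance is at least `margin` times the best
    distance, a hypothetical stand-in for a confidence measure.
    Assumes at least one cluster with at least one template.
    """
    order = sorted(clusters, key=lambda c: sq_dist(feature, c[0]))
    scored: List[Tuple[float, str]] = []
    for visited, (_centroid, members) in enumerate(order, start=1):
        scored.extend((sq_dist(feature, t), label)
                      for label, t in members.items())
        scored.sort()
        if len(scored) >= 2 and scored[1][0] >= margin * scored[0][0]:
            return scored[0][1], visited  # confident enough: stop early
    return scored[0][1], len(order)


# Made-up templates: two clusters of two character classes each.
demo_clusters: List[Cluster] = [
    ((0.0, 0.0), {"A": (0.0, 0.1), "B": (0.5, 0.5)}),
    ((10.0, 10.0), {"C": (10.0, 10.0), "D": (9.0, 9.0)}),
]
clear_case = fast_match((0.0, 0.0), demo_clusters)
ambiguous_case = fast_match((0.3, 0.3), demo_clusters)
```

In the clear case the search stops after the first cluster; in the ambiguous case the two nearest templates are too close to call, so every cluster is examined before a label is returned.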

35 citations

Book•

Computer programs for spelling correction: An experiment in program design

James L. Peterson¹•Institutions (1)

University of Texas at Austin¹

01 Jan 1980

TL;DR: Reading is both a need and a hobby, and this is what makes you feel that you must read.

Abstract: Some people may laugh when they see you reading in your spare time. Some may admire you. And some may want to be like you and take up reading as a hobby. How do you feel about it yourself? Reading is both a need and a hobby, and this is what makes you feel that you must read. If you are looking for the book Computer Programs for Spelling Correction: An Experiment in Program Design as your choice of reading, you can find it here.

35 citations

Proceedings Article•DOI•

Evaluation of an automatic markup system

Kazem Taghva¹, Allen Condit¹, Julie Borsack¹•Institutions (1)

University of Nevada, Las Vegas¹

30 Mar 1995

TL;DR: This paper presents a preliminary report on the design and evaluation of a system to automatically mark up technical documents, based on information provided by an OCR device that differs from traditional OCR devices in that it not only performs optical character recognition, but also provides detailed information about page layout, word geometry, and font usage.

Abstract: One predominant application of OCR is the recognition of full-text documents for information retrieval. Modern retrieval systems exploit both the textual content of the document and its structure. The relationship between textual content and character accuracy has been the focus of recent studies. It has been shown that, due to the redundancies in text, average precision and recall are not heavily affected by OCR character errors. What is not fully known is to what extent OCR devices can provide reliable information that can be used to capture the structure of the document. In this paper, we present a preliminary report on the design and evaluation of a system to automatically mark up technical documents, based on information provided by an OCR device. The device we use differs from traditional OCR devices in that it not only performs optical character recognition, but also provides detailed information about page layout, word geometry, and font usage. Our automatic markup program, which we call Autotag, uses this information, combined with dictionary lookup and content analysis, to identify structural components of the text. These include the document title, author information, abstract, sections, section titles, paragraphs, sentences, and de-hyphenated words. A visual examination of the hardcopy is compared to the output of our markup system to determine its correctness. © 1995 SPIE, The International Society for Optical Engineering.
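One of the structural components mentioned above, de-hyphenated words, lends itself to a small illustration of dictionary lookup. The sketch below is not Autotag's actual procedure, which the abstract does not describe; it simply assumes that a line-final hyphen is dropped when the joined word appears in a dictionary and kept otherwise. The function name `dehyphenate` and the tiny dictionary are hypothetical.

```python
from typing import Iterable, List, Set


def dehyphenate(lines: Iterable[str], dictionary: Set[str]) -> List[str]:
    """Join words split across line breaks using a dictionary check."""
    words: List[str] = []
    pending = ""  # fragment ending in '-' carried over from the previous line
    for line in lines:
        tokens = line.split()
        if pending and tokens:
            head = tokens.pop(0)
            joined = pending[:-1] + head
            core = joined.strip(".,;:!?").lower()
            # Drop the hyphen only when the joined form is a known word;
            # otherwise treat it as a genuine hyphenated compound.
            words.append(joined if core in dictionary else pending + head)
            pending = ""
        if tokens and tokens[-1].endswith("-") and len(tokens[-1]) > 1:
            pending = tokens.pop()
        words.extend(tokens)
    if pending:  # trailing fragment with nothing to join to
        words.append(pending)
    return words


ocr_lines = [
    "optical character recogni-",
    "tion uses a lexicon-",
    "driven approach.",
]
known_words = {"recognition", "approach"}  # stand-in for a real dictionary
result = dehyphenate(ocr_lines, known_words)
```

Here "recogni-" and "tion" are merged because "recognition" is in the dictionary, while "lexicon-driven" keeps its hyphen because "lexicondriven" is not.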

35 citations

Journal Article•DOI•

Segmentation-free optical character recognition for printed Urdu text

Israr Ud Din¹, Imran Siddiqi¹, Shehzad Khalid¹, Tahir Azam•Institutions (1)

Bahria University¹

06 Sep 2017-Eurasip Journal on Image and Video Processing

TL;DR: A segmentation-free optical character recognition system for printed Urdu Nastaliq font is presented, using ligatures as units of recognition and Hidden Markov Models for classification.

Abstract: This paper presents a segmentation-free optical character recognition system for printed Urdu Nastaliq font using ligatures as units of recognition. The proposed technique relies on statistical features and employs Hidden Markov Models for classification. A total of 1525 unique high-frequency Urdu ligatures from the standard Urdu Printed Text Images (UPTI) database are considered in our study. Ligatures extracted from text lines are first split into primary (main body) and secondary (dots and diacritics) ligatures, and multiple instances of the same ligature are grouped into clusters using a sequential clustering algorithm. Hidden Markov Models are trained separately for each ligature using the examples in the respective cluster, by sliding overlapping windows from right to left and extracting a set of statistical features. Given the query text, the primary and secondary ligatures are recognized separately and later associated using a set of heuristics to recognize the complete ligature. Evaluated on the standard UPTI Urdu database, the system achieved a ligature recognition rate of 92% on more than 6000 query ligatures.
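The right-to-left sliding-window step can be sketched as follows. The abstract does not list the statistical features, so the two used here (ink density and normalized vertical centre of mass) are assumptions chosen for illustration, as are the window `width`, the `step`, and the function name `sliding_window_features`. The resulting feature sequence is the kind of observation sequence an HMM would be trained on.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of binary pixels, 1 = ink


def sliding_window_features(img: Image, width: int = 4,
                            step: int = 2) -> List[Tuple[float, float]]:
    """Slide a window over a ligature image from right to left.

    Windows overlap whenever step < width. For each window, returns
    (ink density, vertical centre of mass scaled by the image height);
    an empty window gets a neutral centre of 0.5.
    """
    rows, cols = len(img), len(img[0])
    feats: List[Tuple[float, float]] = []
    right = cols
    while right > 0:
        left = max(0, right - width)
        ink = 0
        weighted_rows = 0
        for r in range(rows):
            for c in range(left, right):
                ink += img[r][c]
                weighted_rows += r * img[r][c]
        density = ink / (rows * (right - left))
        centre = (weighted_rows / ink) / rows if ink else 0.5
        feats.append((density, centre))
        if left == 0:  # reached the left edge of the image
            break
        right -= step
    return feats


tiny = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
features = sliding_window_features(tiny, width=2, step=1)
```

The first feature vector describes the rightmost columns, matching the reading direction of Urdu script.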

35 citations

Network Information

Performance

Metrics

Papers: 7,941

Citations: 180,323

No. of papers in the topic in previous years
Year	Papers
2023	186
2022	425
2021	333
2020	448
2019	430
2018	357
