
Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.


Papers
Proceedings ArticleDOI
17 Jan 2005
TL;DR: The implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text is presented, and it is shown that this type of Information Extraction task seems to be affected negatively by the presence of OCR text.
Abstract: This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.

118 citations

Journal ArticleDOI
TL;DR: The performance of 10 parallel thinning algorithms is reported from the perspective of OCR preprocessing, by gathering statistics from their performance on large sets of data and examining the effects of the different thinning algorithms on an OCR system.
Abstract: Skeletonization algorithms have played an important role in the preprocessing phase of OCR systems. In this paper we report on the performance of 10 parallel thinning algorithms from this perspective by gathering statistics from their performance on large sets of data and examining the effects of the different thinning algorithms on an OCR system.

118 citations

Journal ArticleDOI
Kongqiao Wang1, Jari Kangas1
TL;DR: A robust, connected-component-based character locating method using an aligning-and-merging-analysis (AMA) scheme to locate all the potential characters using the information about the bounding boxes of connected components in all color layers.

117 citations

Journal ArticleDOI
Berrin Yanikoglu1, Peter A. Sandon1
TL;DR: This work introduces a new segmentation algorithm, guided in part by the global characteristics of the handwriting, which finds the successive segmentation points by evaluating a cost function at each point along the baseline.

116 citations

Patent
10 Apr 1996
TL;DR: A language-independent and segmentation-free OCR system and method comprises a unique feature extraction approach which represents two-dimensional data relating to OCR as a function of one independent variable (specifically the position within a line of text in the direction of the line), so that the same continuous speech recognition (CSR) technology based on hidden Markov models (HMMs) can be adapted in a straightforward manner to recognize optical characters.
Abstract: A language-independent and segmentation-free OCR system and method comprises a unique feature extraction approach which represents two-dimensional data relating to OCR as a function of one independent variable (specifically the position within a line of text in the direction of the line), so that the same continuous speech recognition (CSR) technology based on hidden Markov models (HMMs) can be adapted in a straightforward manner to recognize optical characters. After a line-finding stage, followed by a simple feature-extraction stage, the system can utilize a commercially available CSR system, with little or no modification, to perform both the recognition of text and the training of the system. The whole system, including the feature extraction, training, and recognition components, is designed to be independent of the script or language of the text being recognized. The language-dependent parts of the system are confined to the lexicon and training data. Furthermore, the method of recognition does not require pre-segmentation of the data at the character or word level, either for training or for recognition. In addition, a language model can be used to enhance system performance as an integral part of the recognition process and not as a post-process, as is commonly done with spell checking, for example.

115 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
87% related
Feature (computer vision)
128.2K papers, 1.7M citations
85% related
Image segmentation
79.6K papers, 1.8M citations
85% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  186
2022  425
2021  333
2020  448
2019  430
2018  357