Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have appeared within this topic, receiving 158,193 citations. The topic is also known as OCR or optical character reader.


Papers
Journal Article
TL;DR: A real-time industrial machine vision system incorporating optical character recognition (OCR) is employed to inspect markings on integrated circuit (IC) chips to identify print errors such as illegible characters, missing characters and upside-down printing.
Abstract: In this paper, a real-time industrial machine vision system incorporating optical character recognition (OCR) is employed to inspect markings on integrated circuit (IC) chips. This inspection is carried out while the ICs are coming off the manufacturing line. A TSSOP-DGG type of IC package from Texas Instruments is used in the investigation. The IC chip markings are laser-printed. The inspection system tests whether the laser-printed marking on each IC chip is correct. The inspection has to identify print errors such as illegible characters, missing characters and upside-down printing. The vision inspection of the printed markings on the IC chip is carried out in three phases, namely, image preprocessing, feature extraction and classification. The MATLAB platform and its toolboxes are used for designing the inspection processing technique. The speed of the marking inspection depends mostly on the effectiveness of the feature extraction technique. The performance of four feature extraction techniques is compared in terms of their respective speeds. The extracted feature data are used in a neural network for classifying the marking errors. A suggestion to optimize the number of input neurons of the neural network for fast classification is also presented.

35 citations

Proceedings Article
03 Aug 2003
TL;DR: This paper presents a system for the offline recognition of cursive handwritten lines of text based on continuous density HMMs and Statistical Language Models, which shows a recognition rate of ~85% with a lexicon containing 50'000 words.
Abstract: This paper presents a system for the offline recognition of cursive handwritten lines of text. The system is based on continuous density HMMs and Statistical Language Models. The system recognizes data produced by a single writer. No a-priori knowledge is used about the content of the text to be recognized. Changes in the experimental setup with respect to the recognition of single words are highlighted. The results show a recognition rate of ~85% with a lexicon containing 50'000 words. The experiments were performed over a publicly available database.

34 citations

Proceedings Article
11 Aug 2002
TL;DR: By varying the number of Gaussians, multiple hypotheses are provided to an OCR system and the final result is selected from the set of outputs, leading to an improvement in the system's performance.
Abstract: In this paper we propose a method to segment and recognize text embedded in video and images. We model the gray-level distribution in the text images as a mixture of Gaussians, and then assign each pixel to one of the Gaussian layers. The assignment is based on a prior over the contextual information, which is modeled by a Markov random field (MRF) with coefficients estimated online. Each layer is then processed by a connected component analysis module and forwarded to the OCR system as one segmentation hypothesis. By varying the number of Gaussians, multiple hypotheses are provided to an OCR system and the final result is selected from the set of outputs, leading to an improvement in the system's performance.
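The layering idea in this abstract can be illustrated with a minimal sketch. It fits a 1-D Gaussian mixture to gray levels with plain EM and labels each pixel with its most likely component, once per candidate number of Gaussians. This is only an assumption-laden illustration: it omits the MRF contextual prior and the connected component analysis, and the function names (`fit_gmm_1d`, `assign_layers`, `segmentation_hypotheses`) and the candidate counts `(2, 3, 4)` are hypothetical, not taken from the paper.

```python
import math

def fit_gmm_1d(values, k, iters=50):
    """Fit a k-component 1-D Gaussian mixture to gray levels with plain EM."""
    n = len(values)
    lo, hi = min(values), max(values)
    # Deterministic init: means spread evenly over the gray-level range.
    means = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    mean_all = sum(values) / n
    var_all = sum((v - mean_all) ** 2 for v in values) / n
    variances = [max(var_all, 1.0)] * k  # variance floor avoids collapse
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each pixel value.
        resp = []
        for v in values:
            p = [w * math.exp(-(v - m) ** 2 / (2 * s)) / math.sqrt(2 * math.pi * s)
                 for w, m, s in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([x / total for x in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # empty component: keep its previous parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var, 1.0)
    return weights, means, variances

def assign_layers(image, k):
    """Label every pixel with its most likely Gaussian (no MRF smoothing)."""
    values = [px for row in image for px in row]
    weights, means, variances = fit_gmm_1d(values, k)

    def best(v):
        scores = [math.log(w + 1e-300) - 0.5 * math.log(s) - (v - m) ** 2 / (2 * s)
                  for w, m, s in zip(weights, means, variances)]
        return max(range(k), key=scores.__getitem__)

    return [[best(px) for px in row] for row in image]

def segmentation_hypotheses(image, ks=(2, 3, 4)):
    """One label map per candidate number of Gaussians."""
    return {k: assign_layers(image, k) for k in ks}
```

In a full system along the lines described, each label map would be split into layers, cleaned by connected component analysis, and passed to the OCR engine, with the final text chosen among the resulting outputs.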

34 citations

Journal Article
TL;DR: A system locates words in document image archives by bypassing character recognition and using word images as queries; it applies document image processing techniques to extract powerful features that describe the word images.

34 citations

Journal Article
TL;DR: Experimental results show that the recognition method compares favorably with state-of-the-art techniques, achieving very high recognition rates exceeding 99%.
Abstract: Optical Character Recognition (OCR) is the process of recognizing printed or handwritten text on paper documents. This paper proposes an OCR system for Arabic characters. In addition to the preprocessing phase, the proposed recognition system consists mainly of three phases. In the first phase, we employ word segmentation to extract characters. In the second phase, Histograms of Oriented Gradients (HOG) are used for feature extraction. The final phase employs a Support Vector Machine (SVM) for classifying characters. We have applied the proposed method to the recognition of Jordanian city, town, and village names as a case study, in addition to many other words that offer character shapes not covered by the Jordanian place names. The set has been carefully selected to include every Arabic character in all four of its forms. To this end, we have built our own dataset consisting of more than 43,000 handwritten Arabic words (30,000 used in the training stage and 13,000 used in the testing stage). Experimental results show that our recognition method compares favorably with state-of-the-art techniques, achieving very high recognition rates exceeding 99%.
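As a rough illustration of the feature extraction phase mentioned in this abstract, the sketch below computes a simplified HOG descriptor in plain Python. It is an assumption, not the authors' implementation: it uses hard binning into 9 unsigned orientation bins, per-cell L2 normalization, and no overlapping blocks, and the names `hog_cell_histogram` and `hog_descriptor` are hypothetical. The SVM classification phase is not shown; in practice the descriptors would be passed to an existing SVM library.

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Unsigned-orientation (0 to 180 degrees) gradient histogram for one cell."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central differences approximate the image gradient.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Hard binning: the whole magnitude votes for a single bin.
            hist[int(angle // bin_width) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

def hog_descriptor(image, cell_size=8, bins=9):
    """Concatenate the histograms of non-overlapping square cells."""
    rows, cols = len(image), len(image[0])
    features = []
    for top in range(0, rows - cell_size + 1, cell_size):
        for left in range(0, cols - cell_size + 1, cell_size):
            cell = [row[left:left + cell_size] for row in image[top:top + cell_size]]
            features.extend(hog_cell_histogram(cell, bins))
    return features
```

For a character image containing a vertical stroke edge, the gradient points horizontally, so the vote lands in the first orientation bin; a horizontal edge votes for the bin around 90 degrees.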

34 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    186
2022    425
2021    333
2020    448
2019    430
2018    357