
Optical character recognition

About: Optical character recognition is a research topic. Over the lifetime, 7342 publications have been published within this topic receiving 158193 citations. The topic is also known as: OCR & optical character reader.


Papers
Proceedings ArticleDOI
12 Oct 2020
TL;DR: A novel Cascade Reasoning Network (CRN) is proposed that consists of a progressive attention module (PAM) and a multimodal reasoning graph (MRG) module that aims to explicitly model the connections and interactions between texts and visual concepts.
Abstract: We study the problem of text-based visual question answering (T-VQA) in this paper. Unlike general visual question answering (VQA) which only builds connections between questions and visual contents, T-VQA requires reading and reasoning over both texts and visual concepts that appear in images. Challenges in T-VQA mainly lie in three aspects: 1) It is difficult to understand the complex logic in questions and extract specific useful information from rich image contents to answer them; 2) The text-related questions are also related to visual concepts, but it is difficult to capture cross-modal relationships between the texts and the visual concepts; 3) If the OCR (optical character recognition) system fails to detect the target text, the training will be very difficult. To address these issues, we propose a novel Cascade Reasoning Network (CRN) that consists of a progressive attention module (PAM) and a multimodal reasoning graph (MRG) module. Specifically, the PAM regards the multimodal information fusion operation as a stepwise encoding process and uses the previous attention results to guide the next fusion process. The MRG aims to explicitly model the connections and interactions between texts and visual concepts. To alleviate the dependence on the OCR system, we introduce an auxiliary task to train the model with accurate supervision signals, thereby enhancing the reasoning ability of the model in question answering. Extensive experiments on three popular T-VQA datasets demonstrate the effectiveness of our method compared with SOTA methods. The source code is available at https://github.com/guanghuixu/CRN_tvqa.
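The core idea of the progressive attention module — using the previous attention result to guide the next fusion step — can be illustrated with a minimal sketch. Everything here (the additive fusion rule, the dot-product scoring, the function names) is a toy assumption for illustration, not the actual CRN architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def progressive_fusion(question, features_list):
    """Toy stepwise fusion: each modality (e.g. visual features, then OCR
    token features) is attended with a query that mixes the question vector
    and the previous attention result, so earlier steps guide later ones."""
    context = question                       # start from the question encoding
    for feats in features_list:              # one modality per step
        scores = feats @ context             # (n, d) @ (d,) -> (n,) relevance
        attn = softmax(scores)               # attention weights over items
        attended = attn @ feats              # weighted sum -> (d,) summary
        context = context + attended         # previous result guides next step
    return context
```

The sketch only conveys the cascade structure; the real module learns the fusion and scoring functions end to end.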

35 citations

Journal ArticleDOI
TL;DR: The authors propose to use a deep learning model as both a feature extractor and a classifier for the recognition of 33 classes of basic characters of Devanagari ancient manuscripts; the accuracy achieved is better than other state-of-the-art techniques.
Abstract: Devanagari script is the most widely used script in India and other Asian countries. There is a rich collection of ancient Devanagari manuscripts, which is a wealth of knowledge. To make these manuscripts available to people, efforts are being made to digitize these documents. Optical Character Recognition (OCR) plays an important role in recognizing these documents. Convolutional Neural Network (CNN) is a powerful model that is giving very promising results in the fields of character recognition, pattern recognition, etc. CNN has never been used for the recognition of Devanagari ancient manuscripts. Our aim in the proposed work is to use the power of CNN for extracting the wealth of knowledge from Devanagari handwritten ancient manuscripts. In addition, we aim to experiment with various design options, such as the number of layers, stride size, number of filters, kernel size, and the functions used in various layers, and to select the best of these. In this paper, the authors propose to use a deep learning model as both a feature extractor and a classifier for the recognition of 33 classes of basic characters of Devanagari ancient manuscripts. A dataset containing 5484 characters has been used for the experimental work. Various experiments show that the accuracy achieved using CNN as a feature extractor is better than other state-of-the-art techniques. A recognition accuracy of 93.73% has been achieved by using the model proposed in this paper for Devanagari ancient character recognition.
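The design options the abstract mentions (number of filters, stride, kernel size) map directly onto a CNN forward pass. The following is a minimal numpy sketch of one conv + ReLU + max-pool stage feeding a 33-way classifier head; the layer sizes, random weights, and function names are illustrative assumptions, not the paper's trained architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, kernel, stride=1):
    """Valid convolution of a 2-D image with a single kernel."""
    kh, kw = kernel.shape
    h = (img.shape[0] - kh) // stride + 1
    w = (img.shape[1] - kw) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = img[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

def forward(img, kernels, weights):
    maps = [np.maximum(conv2d(img, k), 0) for k in kernels]   # conv + ReLU
    pooled = [max_pool(m) for m in maps]                      # downsample
    feats = np.concatenate([p.ravel() for p in pooled])       # feature vector
    logits = feats @ weights                                  # 33-class head
    return feats, logits

# 32x32 character image, 4 random 3x3 kernels, 33 basic-character classes
img = rng.random((32, 32))
kernels = rng.standard_normal((4, 3, 3))
feats, logits = forward(img, kernels, rng.standard_normal((4 * 15 * 15, 33)))
```

The `feats` vector is what "CNN as a feature extractor" refers to: it can be fed to a separate classifier instead of the linear head used here.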

35 citations

Proceedings ArticleDOI
16 Jul 2008
TL;DR: A projection distance metric and zoning based scheme is proposed for numeral recognition, with a nearest neighbor classifier used for classification; it gives around 93% and 90% recognition accuracy for Kannada and Tamil numerals respectively.
Abstract: Handwritten character recognition has received extensive attention in academic and production fields. The recognition system can be either online or off-line. There is a large demand for optical character recognition of handwritten documents. India is a multilingual, multi-script country, where eighteen official scripts are accepted and over a hundred regional languages are spoken. In this paper we propose a projection distance metric and zoning based scheme for numeral recognition. We tested our proposed method on Kannada and Tamil numerals. A nearest neighbor classifier is then used for classification. The proposed method gives around 93% and 90% recognition accuracy for Kannada and Tamil numerals respectively.
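A zoning scheme with a nearest-neighbor classifier can be sketched in a few lines. This is a generic illustration of zoning (per-zone pixel density) plus Euclidean nearest-neighbor matching; the paper's projection distance metric and exact zone features are not reproduced here:

```python
import numpy as np

def zoning_features(img, zones=4):
    """Split a binary numeral image into a zones x zones grid and use the
    pixel density of each zone as a feature (a common zoning scheme)."""
    h, w = img.shape
    zh, zw = h // zones, w // zones
    feats = [img[r*zh:(r+1)*zh, c*zw:(c+1)*zw].mean()
             for r in range(zones) for c in range(zones)]
    return np.array(feats)

def nearest_neighbor(query, prototypes, labels):
    """Classify by the closest prototype under Euclidean distance."""
    d = np.linalg.norm(prototypes - query, axis=1)
    return labels[int(np.argmin(d))]
```

In practice each class would contribute many training prototypes (one feature vector per labeled numeral image), and the query numeral takes the label of its nearest neighbor.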

35 citations

Patent
30 Dec 2004
TL;DR: In this paper, an exception processing system for MICR documents is described, in which an exception does not prevent the routing of the document if it is not related to the routing/transit field and an optical character recognition (OCR) process (300, 414) is performed on the stored, electronic image of a document to correct digit errors in the stored data read from the documents.
Abstract: System and method for exception processing of MICR documents. MICR documents are read and sorted to a destination pocket for processing subject to a determination that an exception does not prevent the routing of the document. In example embodiments, for example, an error does not prevent the routing of the document if it is not related to the routing/transit field. In the case of digit errors, an optical character recognition (OCR) process (300, 414) is performed on the stored, electronic image of the document to correct digit errors in the stored data read from the documents. If a determination is made that correction or other exception processing cannot be handled through the OCR process, the image and corresponding MICR data is displayed on a user terminal (528), for manual verification or correction by reference to an image of the document, rather than the document itself.

35 citations

Patent
18 Oct 1994
TL;DR: Pixel density matrices and matrices of horizontal and vertical box connectivity are derived from known patterns, and the recognized pattern is selected from a set of Pareto non-inferior candidates.
Abstract: Pattern recognition, for instance optical character recognition, is achieved by defining a minimal bounding rectangle around a pattern, dividing the pattern into a grid of boxes, comparing a vector derived from this partitioned pattern to vectors similarly derived from known patterns, choosing a set of Pareto non-inferior candidate patterns, and selecting a recognized pattern from the set of candidates. The vectors include pixel density matrices, matrices of horizontal connectivity of boxes, and matrices of vertical connectivity of boxes.
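Two of the steps above — deriving a density vector from a gridded bounding box, and keeping only Pareto non-inferior candidates — can be sketched as follows. This is a simplified illustration: the connectivity matrices are omitted, and the function names and grid size are assumptions, not the patent's specification:

```python
import numpy as np

def grid_density_vector(img, grid=3):
    """Crop the minimal bounding rectangle around the pattern, divide it
    into a grid of boxes, and return per-box pixel density as the vector."""
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    box = img[rows][:, cols]                      # minimal bounding rectangle
    h, w = box.shape
    feats = np.zeros(grid * grid)
    for r in range(grid):
        for c in range(grid):
            cell = box[r*h//grid:(r+1)*h//grid, c*w//grid:(c+1)*w//grid]
            feats[r*grid + c] = cell.mean() if cell.size else 0.0
    return feats

def pareto_non_inferior(errors):
    """Keep candidate indices not dominated on every error component
    (smaller is better on each axis): a candidate is dropped only if some
    other candidate is no worse everywhere and strictly better somewhere."""
    keep = []
    for i, e in enumerate(errors):
        dominated = any(np.all(f <= e) and np.any(f < e)
                        for j, f in enumerate(errors) if j != i)
        if not dominated:
            keep.append(i)
    return keep
```

With one error component per vector type (density, horizontal connectivity, vertical connectivity), the Pareto filter keeps every known pattern that is best on at least one trade-off, and the final recognition step chooses among those survivors.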

35 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations (87% related)
Feature (computer vision): 128.2K papers, 1.7M citations (85% related)
Image segmentation: 79.6K papers, 1.8M citations (85% related)
Convolutional neural network: 74.7K papers, 2M citations (84% related)
Deep learning: 79.8K papers, 2.1M citations (83% related)
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    186
2022    425
2021    333
2020    448
2019    430
2018    357