Topic

Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as: OCR & optical character reader.


Papers
Journal Article
Ching Y. Suen, C. Nadal, R. Legault, T.A. Mai, Louisa Lam
01 Jul 1992
TL;DR: It is shown that it is possible to reduce the substitution rate to a desired level while maintaining a fairly high recognition rate in the classification of totally unconstrained handwritten ZIP code numerals.
Abstract: Four independently developed expert algorithms for recognizing unconstrained handwritten numerals are presented. All have high recognition rates. Different experimental approaches for incorporating these recognition methods into a more powerful system are also presented. The resulting multiple-expert system proves that the consensus of these methods tends to compensate for individual weaknesses, while preserving individual strengths. It is shown that it is possible to reduce the substitution rate to a desired level while maintaining a fairly high recognition rate in the classification of totally unconstrained handwritten ZIP code numerals. If reliability is of the utmost importance, substitutions can be avoided completely (reliability=100%) while retaining a recognition rate above 90%. Results are compared with those for some of the most effective numeral recognition systems found in the literature.
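The trade-off described in this abstract (fewer substitutions at the cost of more rejections) can be illustrated with a small consensus rule. The following is a minimal sketch, not the authors' combination scheme: the function names `combine_experts` and `rates`, the tie-handling rule, and the toy votes are all hypothetical, and reliability is assumed here to mean correct decisions divided by all non-rejected decisions.

```python
from collections import Counter


def combine_experts(votes, min_agreement):
    """Return the label backed by at least `min_agreement` experts, else None.

    `votes` holds one label per expert; an expert that rejects the sample
    itself contributes None. Returning None means the combined system
    rejects the sample. (Hypothetical rule, for illustration only.)
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    top = counts.most_common(2)
    # Assumption: a tie between the two leading labels is treated as a reject.
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    label, n = top[0]
    return label if n >= min_agreement else None


def rates(predictions, truths):
    """Compute recognition, substitution, rejection rates and reliability."""
    total = len(truths)
    rejected = sum(p is None for p in predictions)
    correct = sum(p == t for p, t in zip(predictions, truths) if p is not None)
    substituted = total - rejected - correct
    decided = correct + substituted
    return {
        "recognition": correct / total,
        "substitution": substituted / total,
        "rejection": rejected / total,
        # Assumed definition: share of non-rejected samples that are correct.
        "reliability": correct / decided if decided else 1.0,
    }


# Toy data: four experts voting on five digit images (made-up values).
votes_per_sample = [
    [3, 3, 3, 3],
    [7, 7, 1, 7],
    [4, 9, 4, 9],
    [5, 5, 8, None],
    [0, 0, 0, 6],
]
truths = [3, 7, 4, 5, 6]

for k in (2, 3, 4):
    preds = [combine_experts(v, k) for v in votes_per_sample]
    print(k, rates(preds, truths))
```

Raising `min_agreement` makes the combined system stricter: on this toy data, requiring all four experts to agree removes every substitution but also rejects most samples, which mirrors the qualitative trade-off the abstract reports.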

422 citations

Journal Article
TL;DR: A complete Optical Character Recognition (OCR) system for printed Bangla, the fourth most popular script in the world, is presented and extension of the work to Devnagari, the third most popular script in the world, is discussed.

381 citations

Book
24 Nov 2005
TL;DR: This 2005 book provides a needed review of signal processing theory, the pattern recognition metrics, and the practical application know-how from basic premises and shows both digital and optical implementations.
Abstract: Correlation is a robust and general technique for pattern recognition and is used in many applications, such as automatic target recognition, biometric recognition and optical character recognition. The design, analysis and use of correlation pattern recognition algorithms requires background information, including linear systems theory, random variables and processes, matrix/vector methods, detection and estimation theory, digital signal processing and optical processing. This 2005 book provides a needed review of this diverse background material and develops the signal processing theory, the pattern recognition metrics, and the practical application know-how from basic premises. It shows both digital and optical implementations. It also contains technology presented by the team that developed it and includes case studies of significant interest, such as face and fingerprint recognition. Suitable for graduate students taking courses in pattern recognition theory, whilst reaching technical levels of interest to the professional practitioner.

366 citations

Journal Article
TL;DR: Two methods for automatically locating text in complex color images are presented, one of which computes the local spatial variation in the gray-scale image and locates text in regions with high variance.
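The variance-based idea in this summary can be sketched in a few lines. This is an illustrative toy, not the paper's algorithm: the square window, the `radius` and `threshold` values, and the function names `local_variance` and `high_variance_mask` are assumptions chosen for the example.

```python
def local_variance(gray, radius=1):
    """Variance of gray levels in a (2*radius+1)^2 window around each pixel.

    `gray` is a list of rows of intensities. Windows are clipped at the
    image border. The square window shape is an assumption for this sketch.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            out[y][x] = sum((v - mean) ** 2 for v in window) / len(window)
    return out


def high_variance_mask(gray, radius=1, threshold=1000.0):
    """Mark pixels whose local variance exceeds `threshold` (hypothetical value)."""
    return [[v > threshold for v in row] for row in local_variance(gray, radius)]


# Toy image: flat background on the left, stroke-like stripes on the right.
row = [100, 100, 100, 100, 0, 255, 0, 255]
image = [row[:] for _ in range(5)]
mask = high_variance_mask(image)
print(mask[2])
```

Pixels in the flat region fall below the threshold, while pixels inside the striped region, which stands in for dense character strokes, are flagged as candidate text.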

362 citations

Posted Content
TL;DR: The COCO-Text dataset is described, which contains over 173k text annotations in over 63k images and presents an analysis of three leading state-of-the-art photo Optical Character Recognition (OCR) approaches on the dataset.
Abstract: This paper describes the COCO-Text dataset. In recent years large-scale datasets like SUN and Imagenet drove the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance state-of-the-art in text detection and recognition in natural images. The dataset is based on the MS COCO dataset, which contains images of complex everyday scenes. The images were not collected with text in mind and thus contain a broad variety of text instances. To reflect the diversity of text in natural scenes, we annotate text with (a) location in terms of a bounding box, (b) fine-grained classification into machine printed text and handwritten text, (c) classification into legible and illegible text, (d) script of the text and (e) transcriptions of legible text. The dataset contains over 173k text annotations in over 63k images. We provide a statistical analysis of the accuracy of our annotations. In addition, we present an analysis of three leading state-of-the-art photo Optical Character Recognition (OCR) approaches on our dataset. While scene text detection and recognition enjoys strong advances in recent years, we identify significant shortcomings motivating future work.

361 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    186
2022    425
2021    333
2020    448
2019    430
2018    357