scispace - formally typeset
Topic

Optical character recognition

About: Optical character recognition is a research topic. To date, 7,342 publications have appeared within this topic, receiving 158,193 citations. The topic is also known as OCR or optical character reader.


Papers
Journal ArticleDOI
TL;DR: An OCR system for recognizing basic characters in printed Kannada text that handles different font sizes and font types and can be extended to other South Indian languages, especially Telugu.
Abstract: Optical Character Recognition (OCR) systems have been effectively developed for the recognition of printed characters of non-Indian languages. Efforts are under way to develop efficient OCR systems for Indian languages, especially for Kannada, a popular South Indian language. We present in this paper an OCR system developed for the recognition of basic characters (vowels and consonants) in printed Kannada text, which can handle different font sizes and font types. Hu’s invariant moments and Zernike moments, which have been widely used in pattern recognition, are used in our system to extract the features of printed Kannada characters. Neural classifiers have been effectively used for the classification of characters based on moment features. An encouraging recognition rate of 96.8% has been obtained. The system methodology can be extended to the recognition of other South Indian languages, especially Telugu.

71 citations
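
The abstract above names Hu's invariant moments as one of its feature extractors. As a minimal sketch of what such a feature looks like, the block below computes the first two Hu invariants of a glyph image with NumPy. It is not the paper's implementation: the function name `hu_first_two` and the toy rectangle used as a "glyph" are assumptions made for illustration, and the image is assumed to contain at least one nonzero pixel.

```python
import numpy as np

def hu_first_two(img):
    """Return Hu's first two invariant moments of a 2-D intensity array.

    Hypothetical helper for illustration; assumes img has nonzero total mass.
    """
    img = np.asarray(img, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    # Centroid of the glyph; measuring from it gives translation invariance.
    cx = (xs * img).sum() / m00
    cy = (ys * img).sum() / m00

    def eta(p, q):
        # Central moment mu_pq, divided by m00^(1 + (p+q)/2) for scale normalisation.
        mu = (((xs - cx) ** p) * ((ys - cy) ** q) * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02                          # first Hu invariant
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2    # second Hu invariant
    return phi1, phi2

# Toy "glyph": a 3x5 block of ink on a blank 20x20 page.
glyph = np.zeros((20, 20))
glyph[2:5, 3:8] = 1
print(hu_first_two(glyph))
```

Moving the same block elsewhere on the page, or rotating the page by 90 degrees, should leave both values unchanged up to floating-point error, which is the property that makes such moments usable as font-position-independent features for a classifier.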

Proceedings ArticleDOI
10 Sep 2001
TL;DR: An automatic technique for the identification of printed Roman, Chinese, Arabic, Devnagari and Bangla text lines from a single document is proposed and has an accuracy of about 97.33%.
Abstract: In a general situation, a document page may contain several scriptforms. For optical character recognition (OCR) of such a document page, it is necessary to separate the scripts before feeding them to their individual OCR systems. An automatic technique for the identification of printed Roman, Chinese, Arabic, Devnagari and Bangla text lines from a single document is proposed. Shape based features, statistical features and some features obtained from the concept of a water reservoir are used for script identification. The proposed scheme has an accuracy of about 97.33%.

71 citations

01 Jan 2010
TL;DR: A technique is proposed for OCR of printed Devnagari script in five different fonts and sizes using an artificial neural network; the recognition rate on Devnagari document images is reported to be quite high.
Abstract: About 300 million people in India speak Hindi and write in Devnagari script. Research in Optical Character Recognition (OCR) is popular for its application potential in banks, post offices, defense organizations and library automation. However, most OCR systems are available only for European texts. In this paper, we propose an OCR technique for five different fonts and sizes of printed Devnagari script using an Artificial Neural Network. The recognition rate of the proposed OCR system on document images in Devnagari script has been found to be quite high.

71 citations

Journal ArticleDOI
TL;DR: This paper presents a scheme for classifying machine-printed and handwritten text in Bangla and Devnagari, the two most popular Indian scripts; the scheme has an accuracy of 98.6%.

70 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  186
2022  425
2021  333
2020  448
2019  430
2018  357