Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.


Papers
Posted Content
TL;DR: This paper proposes a practical ultra-lightweight OCR system, PP-OCR, with an overall model size of only 3.5M, and introduces a bag of strategies to either enhance model ability or reduce model size.
Abstract: Optical Character Recognition (OCR) systems have been widely used in a variety of application scenarios, such as office automation (OA) systems, factory automation, online education, and map production. However, OCR remains a challenging task due to the variety of text appearances and the demand for computational efficiency. In this paper, we propose a practical ultra-lightweight OCR system, PP-OCR. The overall model size of PP-OCR is only 3.5M for recognizing 6,622 Chinese characters and 2.8M for recognizing 63 alphanumeric symbols. We introduce a bag of strategies to either enhance the model's ability or reduce its size, and we provide the corresponding ablation experiments on real data. We also release several pre-trained models for Chinese and English recognition, including a text detector (trained on 97K images), a direction classifier (600K images), and a text recognizer (17.9M images). In addition, the proposed PP-OCR is verified on several other language recognition tasks, including French, Korean, Japanese, and German. All of the above-mentioned models are open-sourced and the code is available in the GitHub repository, i.e., this https URL.

52 citations
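For readers who want to try the pipeline the abstract describes (text detection, then direction classification, then recognition), the paper's repository ships a Python package, PaddleOCR. The snippet below is a minimal sketch, assuming a 2.x release of the pip package and a local image file named invoice.png (both assumptions, not details from the abstract); the exact return structure varies across releases.

    # pip install paddlepaddle paddleocr   (assumed install path for the repo's package)
    from paddleocr import PaddleOCR

    # use_angle_cls enables the direction-classifier stage from the abstract;
    # lang selects the recognizer's character set (English here).
    ocr = PaddleOCR(use_angle_cls=True, lang="en")

    # Runs detection -> direction classification -> recognition on one image.
    result = ocr.ocr("invoice.png", cls=True)
    for box, (text, confidence) in result[0]:  # result[0]: first (only) input image
        print(text, confidence)                # recognized string and its score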

Proceedings ArticleDOI
20 Sep 1999
TL;DR: A multifont classification scheme that aids the recognition of multifont and multisize characters by using typographical attributes such as ascenders, descenders, and serifs, obtained from a word image, as inputs to a neural network classifier.
Abstract: This paper introduces a multifont classification scheme to aid the recognition of multifont and multisize characters. It uses typographical attributes such as ascenders, descenders, and serifs obtained from a word image. These attributes serve as inputs to a neural network classifier that produces the multifont classification result. The scheme can classify 7 commonly used fonts at all point sizes from 7 to 18, and it handles a wide range of image quality, even with severely touching characters. Detecting the font can improve both character segmentation and character recognition, because identifying the font provides information about the structure and typographical design of the characters. This multifont classification algorithm can therefore be used to maintain good recognition rates in a machine-printed OCR system regardless of font and size. Experiments show that font classification accuracy reaches about 95 percent even with severely touching characters, and the technique developed for the 7 selected fonts can be applied to other fonts.

52 citations
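As a rough illustration of the attribute-based idea (not the paper's actual feature extractor), the sketch below computes a few crude typographical proxies from a binarized word image and feeds them to a small neural network classifier. The features, band boundaries, and network size are all invented for the example.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def typographic_features(word_img: np.ndarray) -> np.ndarray:
        """Crude proxies for the paper's attributes on a binarized word
        image (0 = ink, 255 = background): ink mass in the ascender band,
        ink mass in the descender band, and variability of vertical
        stroke density (a rough stand-in for serif presence)."""
        ink = word_img < 128
        total = max(ink.sum(), 1)
        h = word_img.shape[0]
        ascenders = ink[: h // 4].sum() / total        # ink above the x-height band
        descenders = ink[3 * h // 4 :].sum() / total   # ink below the baseline
        col_density = ink.sum(axis=0)
        serif_proxy = col_density.std() / (col_density.mean() + 1e-6)
        return np.array([ascenders, descenders, serif_proxy])

    def train_font_classifier(word_images, font_labels):
        # font_labels: one integer in 0..6 per word image (7 font classes).
        X = np.stack([typographic_features(img) for img in word_images])
        clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)
        clf.fit(X, font_labels)
        return clf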

Patent
01 Aug 1983
TL;DR: This patent presents a method for recognizing a character and providing a corresponding output, in which the character is received by an imager, digitized, and transmitted to a memory.
Abstract: A method for recognizing a character and providing a corresponding output, in which the character is received by an imager, digitized, and transmitted to a memory. Data in the memory are read in a sequence that circumnavigates the test character; only data representative of the character's periphery are read. During the circumnavigation, character parameters such as height, width, perimeter, area, and waveform are determined. These parameters are compared with reference character parameters, and the ASCII code of the matching reference character is provided as the output.

52 citations
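A compact modern re-creation of this procedure might use OpenCV's contour tracing in place of the patent's memory circumnavigation: trace the periphery, measure the parameters, and return the ASCII code of the nearest reference. The reference table below is invented for illustration; in practice it would be measured from known character images.

    import cv2
    import numpy as np

    # Hypothetical reference table: ASCII code -> (height, width, perimeter, area).
    REFERENCES = {
        ord("A"): (40, 35, 160.0, 520.0),
        ord("B"): (40, 30, 150.0, 600.0),
    }

    def recognize(binary_img: np.ndarray) -> int:
        # findContours walks the character's periphery, much like the
        # patent's circumnavigation; only boundary pixels are visited.
        contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        c = max(contours, key=cv2.contourArea)   # largest blob = the character
        x, y, w, h = cv2.boundingRect(c)
        params = np.array([h, w, cv2.arcLength(c, True), cv2.contourArea(c)])
        # Nearest reference by parameter distance wins; return its ASCII code.
        return min(REFERENCES,
                   key=lambda k: np.linalg.norm(params - REFERENCES[k]))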

Proceedings ArticleDOI
22 Aug 2015
TL;DR: The proposed approach significantly outperforms standard binarization approaches in both F-measure and OCR accuracy when enough training samples are available.
Abstract: We propose to address the problem of Document Image Binarization (DIB) using Long Short-Term Memory (LSTM), which is specialized in processing very long sequences. The image is treated as a 2D sequence of pixels, and accordingly a 2D LSTM is employed to classify each pixel as text or background. The proposed approach processes information using local context and then propagates it globally to achieve better visual coherence, and it is robust against most document artifacts. We show that, with a very simple network, no feature extraction, and a limited amount of data, the approach works reasonably well on the DIBCO 2013 dataset. Furthermore, a synthetic dataset with both binarization and OCR ground truth is used to measure performance. The proposed approach significantly outperforms standard binarization approaches in both F-measure and OCR accuracy when enough training samples are available.

52 citations
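True multi-dimensional LSTMs are not part of stock deep-learning toolkits, so the sketch below approximates the 2D scan with two stacked 1D bidirectional LSTM passes in PyTorch (rows, then columns) before a per-pixel text/background head. This is a simplification of the paper's architecture, not a reproduction of it; all layer sizes are assumptions.

    import torch
    import torch.nn as nn

    class SeparableLSTMBinarizer(nn.Module):
        """Simplified stand-in for a 2D LSTM binarizer: scan every row,
        then every column, with 1D bidirectional LSTMs, then classify
        each pixel as text or background."""
        def __init__(self, hidden=16):
            super().__init__()
            self.row_lstm = nn.LSTM(1, hidden, batch_first=True,
                                    bidirectional=True)
            self.col_lstm = nn.LSTM(2 * hidden, hidden, batch_first=True,
                                    bidirectional=True)
            self.head = nn.Linear(2 * hidden, 2)   # text vs. background logits

        def forward(self, img):                    # img: (B, H, W), values in [0, 1]
            B, H, W = img.shape
            x = img.reshape(B * H, W, 1)           # every row is a sequence
            x, _ = self.row_lstm(x)                # (B*H, W, 2*hidden)
            x = (x.reshape(B, H, W, -1)
                  .permute(0, 2, 1, 3)
                  .reshape(B * W, H, -1))          # every column is a sequence
            x, _ = self.col_lstm(x)                # (B*W, H, 2*hidden)
            x = x.reshape(B, W, H, -1).permute(0, 2, 1, 3)  # back to (B, H, W, 2*hidden)
            return self.head(x)                    # per-pixel logits (B, H, W, 2)

    # Usage: per-pixel logits for a batch of grayscale crops.
    model = SeparableLSTMBinarizer()
    logits = model(torch.rand(2, 64, 128))         # -> shape (2, 64, 128, 2)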

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work presents ScrabbleGAN, a semi-supervised approach to synthesizing handwritten text images that are versatile in both style and lexicon, relying on a novel generative model that can generate images of words of arbitrary length.
Abstract: The performance of optical character recognition (OCR) systems has improved significantly in the deep learning era. This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design. That said, deep-learning-based HTR is limited, as in every other task, by the number of training examples. Gathering data is a challenging and costly task, and even more so is the labeling task that follows, on which we focus here. One possible approach to reducing the burden of data annotation is semi-supervised learning. Semi-supervised methods use, in addition to labeled data, some unlabeled samples to improve performance compared to fully supervised ones; consequently, such methods may adapt to unseen images at test time. We present ScrabbleGAN, a semi-supervised approach to synthesizing handwritten text images that are versatile in both style and lexicon. ScrabbleGAN relies on a novel generative model that can generate images of words of arbitrary length. We show how to operate our approach in a semi-supervised manner, enjoying the aforementioned benefits, such as a performance boost over state-of-the-art supervised HTR. Furthermore, our generator can manipulate the resulting text style, allowing us to change, for instance, whether the text is cursive or how thin the pen stroke is.

52 citations
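The key architectural idea, generating one image slice per character and concatenating the slices so that output width grows with word length, can be shown in a toy PyTorch generator. Everything below (the dimensions, the single shared style vector, and the absence of the paper's overlapping receptive fields and recognizer/discriminator losses) is a simplification for illustration.

    import torch
    import torch.nn as nn

    class WordGenerator(nn.Module):
        """Toy sketch of the variable-length idea: one learned slice per
        character, concatenated along the width axis, so the image width
        scales with the number of characters in the word."""
        def __init__(self, n_chars=26, z_dim=32, slice_w=16, height=32):
            super().__init__()
            self.embed = nn.Embedding(n_chars, z_dim)    # per-character code
            self.decode = nn.Sequential(
                nn.Linear(2 * z_dim, height * slice_w),
                nn.Tanh(),
            )
            self.height, self.slice_w = height, slice_w

        def forward(self, char_ids, z):                  # char_ids: (L,), z: (z_dim,)
            slices = []
            for c in char_ids:                           # one slice per letter
                h = torch.cat([self.embed(c), z])        # style z shared across the word
                slices.append(self.decode(h).view(self.height, self.slice_w))
            return torch.cat(slices, dim=1)              # (height, L * slice_w)

    # Usage: the same style vector z, different word -> different image width.
    gen = WordGenerator()
    img = gen(torch.tensor([7, 4, 11, 11, 14]), torch.randn(32))  # "hello", (32, 80)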


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    186
2022    425
2021    333
2020    448
2019    430
2018    357