Topic

Optical character recognition

About: Optical character recognition is a research topic. Over the lifetime, 7342 publications have been published within this topic receiving 158193 citations. The topic is also known as: OCR & optical character reader.


Papers
Patent
03 Oct 2007
TL;DR: A business document is scanned to create a business document image; data extracted from the image via optical character recognition (OCR) is then compared with data in a business information management or enterprise resource planning (ERP) system.
Abstract: Systems and methods of reconciling data from an imaged document. In one embodiment, a business document is scanned to create a business document image. A set of data is extracted from the business document image via optical character recognition (OCR). The set of OCR extracted data is then compared with data in a business information management or enterprise resource planning (ERP) system. A set of ERP data that relates to the set of OCR extracted data is retrieved from the ERP system. The retrieved ERP data is then assigned to the set of OCR extracted data to create a set of assigned data. The business document image is then displayed in a business document image pane, the set of OCR extracted data is displayed in the OCR data pane, and the retrieved ERP data is displayed in the ERP data pane. The set of assigned data is validated, and the ERP system is updated with the set of validated, assigned data. In other embodiments, data is extracted from text files without using OCR.

42 citations
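
The compare-and-assign step in the patent abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the field names (`po_number`, `total`, `vendor`), the use of a purchase-order number as the lookup key, and the in-memory dictionary standing in for an ERP system. The patent does not specify any of these.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Assignment:
    """OCR-extracted fields paired with the ERP record they were matched to."""
    ocr: dict
    erp: Optional[dict]
    mismatches: list = field(default_factory=list)

    @property
    def validated(self) -> bool:
        # Hypothetical rule: valid only if a record was found and all shared fields agree.
        return self.erp is not None and not self.mismatches


def reconcile(ocr_data: dict, erp_records: dict, key: str = "po_number") -> Assignment:
    """Look up the ERP record for the OCR data and list fields that disagree."""
    erp = erp_records.get(ocr_data.get(key))
    if erp is None:
        return Assignment(ocr_data, None)
    mismatches = [
        name for name, value in ocr_data.items()
        if name in erp and str(erp[name]).strip() != str(value).strip()
    ]
    return Assignment(ocr_data, erp, mismatches)


# Stand-in for the ERP system (assumed structure).
ERP = {"PO-1001": {"po_number": "PO-1001", "total": "250.00", "vendor": "Acme"}}

# OCR misread "Acme" as "Acrne", a typical m / rn confusion.
result = reconcile({"po_number": "PO-1001", "total": "250.00", "vendor": "Acrne"}, ERP)
```

In this sketch the mismatch list is what a reviewer would inspect in the side-by-side panes before the validated data is written back.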

Proceedings ArticleDOI
17 Jan 2005
TL;DR: The UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms.
Abstract: We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that are available and gives the URL where they can be located.

42 citations
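
Character accuracy against ground-truth text, as mentioned in the abstract above, is commonly computed as (n - errors) / n, where n is the number of ground-truth characters and errors is the edit distance between the ground truth and the OCR output. The sketch below assumes that definition; it is not the ISRI tools' own code, and the function names are hypothetical. Because Python strings are sequences of Unicode code points, the same code applies to any script.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(ocr) + 1))
    for i, t in enumerate(truth, start=1):
        curr = [i]
        for j, o in enumerate(ocr, start=1):
            cost = 0 if t == o else 1
            curr.append(min(prev[j] + 1,          # delete t from truth
                            curr[j - 1] + 1,      # insert o into truth
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """(n - errors) / n, with n the number of ground-truth characters."""
    if not truth:
        raise ValueError("ground-truth text must not be empty")
    return (len(truth) - edit_distance(truth, ocr)) / len(truth)
```

Note that this ratio can go negative when the OCR output contains far more spurious characters than the ground truth has characters.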

Patent
18 Jan 2012
TL;DR: A complete video frame associated with a presented video image of a media content event is received, a region of text is found in the frame, an optical character recognition (OCR) algorithm is used to translate the text, and the translated text is presented.
Abstract: Systems and methods are operable to present text identified in a presented video image of a media content event. An exemplary embodiment receives a complete video frame that is associated with a presented video image of a video content event, wherein the presented video image includes a region of text; finds the text in the complete video frame; uses an optical character recognition (OCR) algorithm to translate the found text; and presents the translated text. The translated text may be presented on a display concurrently with the video image that is presented on the display. Alternatively, or additionally, the translated text may be presented as audible speech emitted from at least one speaker.

42 citations

Proceedings ArticleDOI
12 Oct 1997
TL;DR: This paper examines a simple pattern-recognition system using an artificial neural network to simulate character recognition, and uses the backpropagation method for learning in the neural network.
Abstract: Many artificial neural network models (ANNs) have been proposed to mimic the human brain in solving problems involving human-like intelligence. An application of an artificial neural network approach for optical character recognition (OCR) is discussed in this paper. We examine a simple pattern-recognition system using an artificial neural network to simulate character recognition. A simple feedforward neural network model has been trained with different sets of noisy data. The backpropagation method is used for learning in the neural network.

42 citations
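
A feedforward network trained by backpropagation on noisy character data, as described in the abstract above, can be sketched in a few dozen lines. The details below are assumptions chosen for illustration, not the paper's setup: two 3x3 binary glyphs, one hidden layer of four sigmoid units, squared-error loss, pixel-flip noise with probability 0.1, and a fixed random seed.

```python
import math
import random

# Two hypothetical 3x3 binary glyphs in row-major order (illustrative only).
GLYPHS = {
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}
LABELS = list(GLYPHS)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def add_noise(pixels, flip_prob, rng):
    """Flip each pixel independently with probability flip_prob."""
    return [1 - p if rng.random() < flip_prob else p for p in pixels]


class TinyNet:
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b)
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        # Output deltas: derivative of squared error through the sigmoid.
        d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
        # Hidden deltas: output deltas propagated back through w2 (before it is updated).
        d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj

    def predict(self, x):
        _, y = self.forward(x)
        return LABELS[y.index(max(y))]


def train(epochs=2000, flip_prob=0.1, lr=0.5, seed=0):
    rng = random.Random(seed)
    net = TinyNet(9, 4, len(LABELS), rng)
    for _ in range(epochs):
        for idx, label in enumerate(LABELS):
            noisy = add_noise(GLYPHS[label], flip_prob, rng)
            target = [1.0 if k == idx else 0.0 for k in range(len(LABELS))]
            net.train_step(noisy, target, lr)
    return net
```

Calling `train()` and then `net.predict(GLYPHS["T"])` classifies a clean glyph; raising `flip_prob` is one way to explore how training noise affects the result.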

Book ChapterDOI
20 Oct 2003
TL;DR: This study attempts to develop a method for embedding a watermark in text that is as successful as the frequency-domain methods have been for image and audio.
Abstract: Numerous schemes have been designed for watermarking multimedia contents. Many of these schemes are vulnerable to watermark erasing attacks. Naturally, such methods are ineffective on text unless the text is represented as a bitmap image, but in that case, the watermark can be erased easily by using Optical Character Recognition (OCR) to change the representation of the text from a bitmap to ASCII or EBCDIC. This study attempts to develop a method for embedding a watermark in text that is as successful as the frequency-domain methods have been for image and audio. The novel method embeds the watermark in the original text, creating ciphertext, which preserves the meaning of the original text via various semantic replacements.

42 citations
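
The idea of carrying a watermark in word choice rather than in pixels can be illustrated with a toy synonym-substitution scheme. This is not the method of the chapter above, whose details the abstract does not give; the synonym table, the one-bit-per-word encoding, and the function names are all assumptions. The point is only that such a mark lives in the characters themselves, so converting a bitmap to text with OCR would not erase it.

```python
# Hypothetical synonym pairs: the first word encodes bit 0, the second bit 1.
SYNONYMS = [("big", "large"), ("quick", "fast"), ("begin", "start")]
_LOOKUP = {word: (pair, bit) for pair in SYNONYMS for bit, word in enumerate(pair)}


def embed(text: str, bits: list) -> str:
    """Rewrite carrier words so that each one encodes the next watermark bit."""
    remaining = list(bits)
    out = []
    for word in text.split():
        entry = _LOOKUP.get(word)
        if entry and remaining:
            pair, _ = entry
            out.append(pair[remaining.pop(0)])
        else:
            out.append(word)
    if remaining:
        raise ValueError("text has too few carrier words for the watermark")
    return " ".join(out)


def extract(text: str, n_bits: int) -> list:
    """Read back the first n_bits bits from the carrier words in the text."""
    bits = []
    for word in text.split():
        entry = _LOOKUP.get(word)
        if entry and len(bits) < n_bits:
            bits.append(entry[1])
    return bits


marked = embed("the big dog made a quick start", [1, 0, 1])
```

A real scheme would also need to handle punctuation, capitalization, and context-dependent synonyms, which this sketch ignores.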


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
87% related
Feature (computer vision)
128.2K papers, 1.7M citations
85% related
Image segmentation
79.6K papers, 1.8M citations
85% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Deep learning
79.8K papers, 2.1M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    186
2022    425
2021    333
2020    448
2019    430
2018    357