Topic
Optical character recognition
About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published within this topic, receiving 158,193 citations. The topic is also known as OCR or optical character reader.
Papers
03 Oct 2007
TL;DR: A business document is scanned to create a business document image; a set of data is extracted from the image via optical character recognition (OCR) and compared with data in a business information management or enterprise resource planning (ERP) system.
Abstract: Systems and methods of reconciling data from an imaged document. In one embodiment, a business document is scanned to create a business document image. A set of extracted data is extracted from the business document image via optical character recognition (OCR). The set of OCR extracted data is then compared with data in a business information management or enterprise resource planning (ERP) system. A set of ERP data is retrieved from the ERP system that relates to the set of OCR extracted data. The retrieved ERP data is then assigned to the set of OCR extracted data to create a set of assigned data. The business document image is then displayed in a business document image pane, the set of OCR extracted data is displayed in the OCR data pane, and the retrieved ERP data is displayed in the ERP data pane. The set of assigned data is validated, and the ERP system is updated with the set of validated, assigned data. In other embodiments, data is extracted from text files without using OCR.
42 citations
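The reconciliation flow in the abstract above (compare OCR-extracted data with ERP data, assign the ERP values to the extracted fields, then validate) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the field names (`po_number`, `total`, `vendor`), the `Assignment` record, and the `reconcile` function are all hypothetical, and the ERP system is stood in for by a plain list of dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Assignment:
    """One OCR field paired with the ERP value assigned to it (hypothetical)."""
    field: str
    ocr_value: str
    erp_value: Optional[str]
    matches: bool


def reconcile(ocr_data: Dict[str, str],
              erp_records: List[Dict[str, str]],
              key: str = "po_number") -> List[Assignment]:
    """Find the ERP record sharing `key` with the OCR data and pair up fields.

    Returns an empty list when no ERP record relates to the OCR data.
    """
    record = next(
        (r for r in erp_records if r.get(key) == ocr_data.get(key)), None
    )
    if record is None:
        return []
    return [
        Assignment(name, value, record.get(name), record.get(name) == value)
        for name, value in ocr_data.items()
    ]


def needs_review(assignments: List[Assignment]) -> List[str]:
    """Fields a person would have to validate before the ERP update."""
    return [a.field for a in assignments if not a.matches]
```

In this sketch, the fields returned by `needs_review` are the ones a reviewer would check against the displayed document image before the ERP system is updated.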
17 Jan 2005
TL;DR: The UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms.
Abstract: We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.
42 citations
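The character accuracy mentioned in the abstract above can be illustrated with a small sketch. It assumes one common definition, (n - errors) / n, where n is the number of ground-truth characters and errors is the Levenshtein edit distance between the ground truth and the OCR output. The function names are hypothetical, and this is not the code of the UNLV/ISRI tools themselves.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(ocr) + 1))
    for i, t in enumerate(truth, start=1):
        curr = [i]
        for j, o in enumerate(ocr, start=1):
            cost = 0 if t == o else 1
            curr.append(min(prev[j] + 1,          # delete t from truth
                            curr[j - 1] + 1,      # insert o into truth
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Assumed definition: (n - errors) / n, n = len(ground truth)."""
    if not truth:
        raise ValueError("ground-truth text must not be empty")
    return (len(truth) - edit_distance(truth, ocr)) / len(truth)
```

Because the comparison works on Python strings of Unicode code points, the same sketch applies to any script, although it does not normalize combining characters.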
18 Jan 2012
TL;DR: A complete video frame associated with a presented video image of a video content event is received; a region of text is identified in the frame, and an optical character recognition (OCR) algorithm is used to translate the text.
Abstract: Systems and methods are operable to present text identified in a presented video image of a media content event. An exemplary embodiment receives a complete video frame that is associated with a presented video image of a video content event, wherein the presented video image includes a region of text; finds the text in the complete video frame; uses an optical character recognition (OCR) algorithm to translate the found text; and presents the translated text. The translated text may be presented on a display concurrently with the video image that is presented on the display. Alternatively, or additionally, the translated text may be presented as audible speech emitted from at least one speaker.
42 citations
12 Oct 1997
TL;DR: This paper examines a simple pattern-recognition system using an artificial neural network to simulate character recognition, and uses the backpropagation method for learning in the neural network.
Abstract: Many artificial neural network models (ANNs) have been proposed to mimic the human brain in solving problems involving human-like intelligence. An application of an artificial neural network approach for optical character recognition (OCR) is discussed in this paper. We examine a simple pattern-recognition system using an artificial neural network to simulate character recognition. A simple feedforward neural network model has been trained with different sets of noisy data. The backpropagation method is used for learning in the neural network.
42 citations
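The setup described in the abstract above (a simple feedforward network trained by backpropagation on noisy character data) can be sketched as follows. Everything specific here is an assumption for illustration, not taken from the paper: the 3x5 glyph bitmaps, the single hidden layer of six sigmoid units, the squared-error loss, the learning rate, and noise modelled as random pixel flips.

```python
import math
import random

# Hypothetical 3x5 glyph bitmaps (row-major, '1' = ink); not from the paper.
GLYPHS = {
    "I": "010010010010010",
    "L": "100100100100111",
    "O": "111101101101111",
}
LABELS = sorted(GLYPHS)


def to_pixels(bits):
    return [float(b) for b in bits]


def add_noise(pixels, flip_prob, rng):
    """Flip each pixel independently with probability flip_prob."""
    return [1.0 - p if rng.random() < flip_prob else p for p in pixels]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """One hidden layer, sigmoid units, a bias weight at the end of each row."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0])))
             for row in self.w2]
        return h, y

    def train_step(self, x, target, lr):
        """One backpropagation update; returns the squared-error loss."""
        h, y = self.forward(x)
        # Error terms at the output layer, then propagated to the hidden layer
        # (computed before any weights are changed).
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        d_hid = [hj * (1.0 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v
        return sum((yk - tk) ** 2 for yk, tk in zip(y, target)) / 2.0


def train(steps=3000, lr=0.5, flip_prob=0.05, seed=0):
    rng = random.Random(seed)
    net = TinyNet(15, 6, len(LABELS), rng)
    losses = []
    for _ in range(steps):
        label = rng.choice(LABELS)
        x = add_noise(to_pixels(GLYPHS[label]), flip_prob, rng)
        target = [1.0 if name == label else 0.0 for name in LABELS]
        losses.append(net.train_step(x, target, lr))
    return net, losses


def predict(net, pixels):
    _, y = net.forward(pixels)
    return LABELS[y.index(max(y))]
```

Calling `train()` returns the trained network and the per-step losses; `predict(net, to_pixels(GLYPHS["O"]))` then classifies a clean glyph. Varying `flip_prob` is one way to imitate training with different sets of noisy data.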
20 Oct 2003
TL;DR: This study attempts to develop a method for embedding a watermark in text that is as successful as the frequency-domain methods have been for image and audio.
Abstract: Numerous schemes have been designed for watermarking multimedia content. Many of these schemes are vulnerable to watermark erasing attacks. Naturally, such methods are ineffective on text unless the text is represented as a bitmap image, but in that case, the watermark can be erased easily by using Optical Character Recognition (OCR) to change the representation of the text from a bitmap to ASCII or EBCDIC. This study attempts to develop a method for embedding a watermark in text that is as successful as the frequency-domain methods have been for image and audio. The novel method embeds the watermark in the original text, creating ciphertext, which preserves the meaning of the original text via various semantic replacements.
42 citations
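The idea of carrying a watermark through meaning-preserving replacements, as described in the abstract above, can be illustrated with a generic synonym-substitution sketch. This is not the paper's method, whose details the abstract does not give: the synonym table, the one-bit-per-carrier-word encoding, and the function names are all assumptions. Because the bits live in word choice rather than pixels, they survive an OCR round trip from bitmap back to plain text.

```python
# Hypothetical synonym pairs; position 0 encodes bit 0, position 1 encodes bit 1.
SYNONYMS = [("big", "large"), ("quick", "fast"), ("begin", "start")]

# Map each word to (its pair, the bit that word encodes).
LOOKUP = {word: (pair, bit)
          for pair in SYNONYMS
          for bit, word in enumerate(pair)}


def embed(text, bits):
    """Rewrite carrier words so that, read in order, they spell out `bits`."""
    remaining = list(bits)
    out = []
    for word in text.split():
        if word in LOOKUP and remaining:
            pair, _ = LOOKUP[word]
            word = pair[remaining.pop(0)]
        out.append(word)
    if remaining:
        raise ValueError("text has too few carrier words for the watermark")
    return " ".join(out)


def extract(text, n_bits):
    """Read the first n_bits watermark bits back from the carrier words."""
    bits = [LOOKUP[word][1] for word in text.split() if word in LOOKUP]
    return bits[:n_bits]
```

For example, `embed("the quick dog will begin a big race", [1, 0, 1])` yields `"the fast dog will begin a large race"`, and `extract` on that result recovers `[1, 0, 1]`. The sketch splits on whitespace only, so words with attached punctuation are not treated as carriers.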