Topic

Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have appeared within this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.


Papers
Proceedings ArticleDOI
23 Jan 2004
TL;DR: Describes a system for retrieving information from digitized books and journals in digital libraries. The system can combine two principal retrieval strategies in several ways, and experiments show the effectiveness of the integrated retrieval.
Abstract: Large collections of scanned documents (books and journals) are now available in digital libraries. The most common method for retrieving relevant information from these collections is image browsing, but this approach is not feasible for books with more than a few dozen pages. Printed text in the images can be recognized by OCR systems, in which case retrieval by textual content can be performed. However, the results depend heavily on the quality of the original documents. More sophisticated navigation is possible when an electronic table of contents of the book is available with links to the corresponding pages. An opposite approach relies on reducing the amount of symbolic information to be extracted at storage time. This is the approach taken by document image retrieval systems. We describe a system that we developed in order to retrieve information from digitized books and journals belonging to digital libraries. The main feature of the system is the ability to combine two principal retrieval strategies in several ways. The first strategy allows a user to find pages with a layout similar to a query page. The second strategy is used to retrieve words in the collection matching a user-defined query, without performing OCR. The combination of these basic strategies allows users to retrieve meaningful pages with little effort during the indexing phase. We describe the basic tools used in the system (layout analysis, layout retrieval, word retrieval) and the integration of these tools for answering complex queries. The experiments were run on 1287 pages and show the effectiveness of the integrated retrieval.

34 citations

Patent
15 Mar 2013
TL;DR: An electronic device and method capture multiple images of a real-world scene at several zoom levels, the scene containing text of one or more sizes. They then extract one or more text regions from each image and analyze an attribute that is relevant to OCR.
Abstract: An electronic device and method capture multiple images of a real-world scene at several zoom levels, the scene containing text of one or more sizes. The electronic device and method then extract one or more text regions from each of the multiple images, and analyze an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of the multiple images. When the attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, that version of the first text region is provided as input to OCR.

34 citations
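
The selection step in the patent abstract above can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say which attribute is analyzed or what the limit is, so the text height in pixels, the threshold of 20 pixels, the preference for the lowest qualifying zoom level, and all names (`RegionVersion`, `select_for_ocr`, `MIN_TEXT_HEIGHT_PX`) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RegionVersion:
    """One version of the same text region, cropped from one captured image."""
    zoom_level: float
    text_height_px: int  # assumed OCR-relevant attribute (not named in the abstract)


# Hypothetical limit: smallest text height the OCR engine is assumed to handle.
MIN_TEXT_HEIGHT_PX = 20


def select_for_ocr(
    versions: List[RegionVersion],
    min_height: int = MIN_TEXT_HEIGHT_PX,
) -> Optional[RegionVersion]:
    """Return the lowest-zoom version whose attribute meets the limit, else None."""
    for version in sorted(versions, key=lambda v: v.zoom_level):
        if version.text_height_px >= min_height:
            return version
    return None


if __name__ == "__main__":
    captured = [
        RegionVersion(zoom_level=1.0, text_height_px=8),
        RegionVersion(zoom_level=2.0, text_height_px=16),
        RegionVersion(zoom_level=4.0, text_height_px=32),
    ]
    chosen = select_for_ocr(captured)
    print(chosen)  # only the 4.0x version reaches the assumed 20 px limit
```

Choosing the lowest qualifying zoom level is one possible policy; a device could equally pick the version with the best attribute value.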

Proceedings ArticleDOI
01 Sep 2016
TL;DR: In this approach, a vertical edge detection algorithm is applied, unwanted edges are removed by an image normalization technique, and the license plate (LP) region is extracted by incorporating statistical and morphological image processing techniques.
Abstract: Automatic License Plate Recognition (ALPR) systems are employed for detection and recognition of the license plate (number plate) of vehicles. The performance of existing systems is well below the desired level, so there is a definite need for a system that overcomes the limitations of those currently available. This paper introduces a new approach for fast and efficient implementation of an ALPR system. In this approach, a vertical edge detection algorithm is applied and unwanted edges are removed by an image normalization technique. The LP region is extracted by incorporating statistical and morphological image processing techniques. For optical character recognition (OCR), template matching is employed. The algorithm is tested on 500 real-time images, acquired under different illumination conditions and from different scenes. The overall efficiency of the proposed method is 84.8% and the execution time is less than 0.5 s.

34 citations
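
The template matching step mentioned in the ALPR abstract above can be sketched as follows. This is a generic illustration, not the paper's implementation: the abstract gives no details of the templates or the similarity measure, so the tiny 3x5 bitmaps, the pixel-agreement score, and the names `TEMPLATES`, `similarity`, and `classify` are all assumptions.

```python
from typing import Dict, List

Glyph = List[str]  # rows of '#' (ink) and '.' (background), all rows equal length

# Hypothetical 3x5 templates; a real system would use bitmaps of the plate font.
TEMPLATES: Dict[str, Glyph] = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of pixel positions at which two equally sized glyphs agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("glyphs must have the same dimensions")
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def classify(glyph: Glyph, templates: Dict[str, Glyph] = TEMPLATES) -> str:
    """Return the label of the template that agrees with the glyph most."""
    return max(templates, key=lambda label: similarity(glyph, templates[label]))


if __name__ == "__main__":
    # A "7" with one ink pixel missing, as might come from a segmented plate.
    noisy_seven = ["###", "..#", ".#.", "...", ".#."]
    print(classify(noisy_seven))
```

The sketch assumes each segmented character has already been binarized and scaled to the template size; in practice that normalization matters as much as the matching itself.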

Journal ArticleDOI
TL;DR: A system intended to provide input of printed text to computers is applied to published patents, annotated law reports, and technical journals.
Abstract: A system intended to provide input of printed text to computers is applied to published patents, annotated law reports, and technical journals.

34 citations

Journal ArticleDOI
TL;DR: A tentative guide intended to help others compile the necessary information and make the right decisions when digitising numeric or text data (optical character recognition, speech recognition, and key entry).
Abstract: Hand-written or printed manuscript data are an important source for paleo-climatological studies, but bringing them into a suitable format can be a time-consuming adventure with uncertain success. Before digitising such data (e.g., in the context of a specific research project), it is worthwhile spending a few thoughts on the characteristics of the data, the scientific requirements with respect to quality and coverage, the metadata, and technical aspects such as reproduction techniques, digitising techniques, and quality control strategies. Here we briefly discuss the most important considerations according to our own experience and describe different methods for digitising numeric or text data (optical character recognition, speech recognition, and key entry). We present a tentative guide that is intended to help others compile the necessary information and make the right decisions.

34 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Deep learning: 79.8K papers, 2.1M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    186
2022    425
2021    333
2020    448
2019    430
2018    357