Topic

Optical character recognition

About: Optical character recognition is a research topic. Over its lifetime, 7,342 publications have been published on this topic, receiving 158,193 citations. The topic is also known as OCR and optical character reader.

Papers

Book

Document Analysis Systems II

Suzanne L. Taylor, Jonathan J. Hull

01 Apr 1998

TL;DR: Evaluating the performance of techniques for the extraction of primitives from line drawings composed of horizontal and vertical lines and evaluating the development of a general framework for intelligent document image retrieval.

Abstract: Part contents: Evaluating the performance of techniques for the extraction of primitives from line drawings composed of horizontal and vertical lines, J.F. Arias et al.; the development of a general framework for intelligent document image retrieval, D. Doermann et al.; prediction of OCR accuracy using a neural network, J. Gonzalez et al.; evaluating Japanese document recognition in the Internet/intranet environment, T. Hong et al.; DocBrowse - a system for textual and graphical querying on degraded document image data, M.Y. Jaisimha et al.; language identification in complex, unoriented and degraded document images, D. Lee et al.; document analysis and the World Wide Web, D. Lopresti and J. Zhou; language-independent and segmentation-free optical character recognition, J. Makhoul et al.; documents on the move - DA&IR-driven mail piece processing today and tomorrow, U. Miletzki; priming the recognizer, G. Nagy and Y. Xu; semiautomatic production of highly accurate word bounding box ground truth, R.P. Rogers et al.; SPAM - a scientific paper access method, A.L. Spitz; automated CAD conversion with the machine drawing understanding system, L. Wenyin and D. Dori.

120 citations

Journal Article

Automatic text location in images and video frames

Anil K. Jain¹, Bin Yu¹•Institutions (1)

Michigan State University¹

01 Dec 1998-Pattern Recognition

TL;DR: This work proposes a new text location algorithm that is suitable for a number of applications, including conversion of newspaper advertisements from paper documents to their electronic versions, World Wide Web search, color image indexing and video indexing, and emphasizes extracting important text with large size and high contrast.

120 citations

Journal Article

At the frontiers of OCR

George Nagy¹•Institutions (1)

Rensselaer Polytechnic Institute¹

01 Jul 1992

TL;DR: It is argued that it is time for a major change of approach to optical character recognition (OCR) research, and new OCR systems should take advantage of the typographic uniformity of paragraphs or other layout components.

Abstract: It is argued that it is time for a major change of approach to optical character recognition (OCR) research. The traditional approach, focusing on the correct classification of isolated characters, has been exhausted. The demonstration of the superiority of a new classification method under operational conditions requires large experimental facilities and databases beyond the resources of most researchers. In any case, even perfect classification of individual characters is insufficient for the conversion of complex archival documents to a useful computer-readable form. Many practical OCR tasks require integrated treatment of entire documents and well-organized typographic and domain-specific knowledge. New OCR systems should take advantage of the typographic uniformity of paragraphs or other layout components. They should also exploit the unavoidable interaction with human operators to improve themselves without explicit 'training'.

119 citations

Journal Article

Trainable COSFIRE Filters for Keypoint Detection and Pattern Recognition

George Azzopardi¹, Nicolai Petkov¹•Institutions (1)

University of Groningen¹

01 Feb 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The proposed COSFIRE filters are conceptually simple and easy to implement and are versatile keypoint detectors and are highly effective in practical computer vision applications.

Abstract: Background: Keypoint detection is important for many computer vision applications. Existing methods suffer from insufficient selectivity regarding the shape properties of features and are vulnerable to contrast variations and to the presence of noise or texture. Methods: We propose a trainable filter which we call Combination Of Shifted FIlter REsponses (COSFIRE) and use for keypoint detection and pattern recognition. It is automatically configured to be selective for a local contour pattern specified by an example. The configuration comprises selecting given channels of a bank of Gabor filters and determining certain blur and shift parameters. A COSFIRE filter response is computed as the weighted geometric mean of the blurred and shifted responses of the selected Gabor filters. It shares similar properties with some shape-selective neurons in visual cortex, which provided inspiration for this work. Results: We demonstrate the effectiveness of the proposed filters in three applications: the detection of retinal vascular bifurcations (DRIVE dataset: 98.50 percent recall, 96.09 percent precision), the recognition of handwritten digits (MNIST dataset: 99.48 percent correct classification), and the detection and recognition of traffic signs in complex scenes (100 percent recall and precision). Conclusions: The proposed COSFIRE filters are conceptually simple and easy to implement. They are versatile keypoint detectors and are highly effective in practical computer vision applications.
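Illustration: the abstract above states that a COSFIRE response is the weighted geometric mean of blurred and shifted Gabor responses. The following is a minimal sketch of that combination step only, not the authors' implementation. The function name `weighted_geometric_mean`, the plain-list inputs, and the assumption that responses are already non-negative scalars sampled at one image location are all hypothetical choices made for illustration; blurring, shifting, and Gabor filtering are omitted.

```python
import math


def weighted_geometric_mean(responses, weights):
    """Return prod(r_i ** w_i) ** (1 / sum(w_i)) for non-negative responses.

    Hypothetical helper: `responses` stands in for the blurred and shifted
    filter responses at one location, `weights` for their importance.
    """
    if not responses or len(responses) != len(weights):
        raise ValueError("responses and weights must be non-empty and equal in length")
    if any(r < 0 for r in responses) or any(w < 0 for w in weights):
        raise ValueError("responses and weights must be non-negative")

    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")

    # Terms with zero weight contribute a factor of 1 and can be skipped.
    active = [(r, w) for r, w in zip(responses, weights) if w > 0]

    # A single zero response among the active terms makes the product zero.
    if any(r == 0 for r, _ in active):
        return 0.0

    # Sum logs instead of multiplying powers to avoid overflow or underflow.
    log_sum = sum(w * math.log(r) for r, w in active)
    return math.exp(log_sum / total_weight)


if __name__ == "__main__":
    print(weighted_geometric_mean([4.0, 1.0], [1.0, 1.0]))
    print(weighted_geometric_mean([0.0, 5.0], [1.0, 1.0]))
```

Because the combination is multiplicative, a zero response from any weighted part drives the output to zero, which an arithmetic mean would not do.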

119 citations

Journal Article

CMATERdb1: a database of unconstrained handwritten Bangla and Bangla–English mixed script document image

Ram Sarkar¹, Nibaran Das¹, Subhadip Basu¹, Mahantapas Kundu¹, Mita Nasipuri¹, Dipak Kumar Basu¹•Institutions (1)

Jadavpur University¹

01 Mar 2012-International Journal on Document Analysis and Recognition

TL;DR: This paper describes the preparation of a benchmark database for research on off-line Optical Character Recognition (OCR) of document images of handwritten Bangla text and Bangla text mixed with English words, which is the first handwritten database in this area available as an open source document.

Abstract: In this paper, we have described the preparation of a benchmark database for research on off-line Optical Character Recognition (OCR) of document images of handwritten Bangla text and Bangla text mixed with English words. This is the first handwritten database in this area, as mentioned above, available as an open source document. Because India is a multilingual country with a colonial past, multi-script document pages are very common. The database contains 150 handwritten document pages, of which 100 pages are written purely in Bangla script and the remaining 50 pages are written in Bangla text mixed with English words. This database of off-line handwritten scripts was collected from different data sources. After collecting the document pages, all the documents have been preprocessed and distributed into two groups, i.e., CMATERdb1.1.1, containing document pages written in Bangla script only, and CMATERdb1.2.1, containing document pages written in Bangla text mixed with English words. Finally, we have also provided ground truth images for line segmentation. To generate the ground truth images, we first labeled each line in a document page automatically by applying one of our previously developed line extraction techniques [Khandelwal et al., PReMI 2009, pp. 369–374] and then corrected any errors using our tool GT Gen 1.1. Line extraction accuracies of 90.6% and 92.38% are achieved on the two databases, respectively, using our algorithm. Both databases, along with the ground truth annotations and the ground truth generating tool, are freely available at http://code.google.com/p/cmaterdb.

119 citations

Performance

Metrics

Papers: 7,941

Citations: 180,323

No. of papers in the topic in previous years
Year	Papers
2023	186
2022	425
2021	333
2020	448
2019	430
2018	357
