International Journal on Document Analysis and Recognition
Springer Science+Business Media
About: International Journal on Document Analysis and Recognition is an academic journal published by Springer Science+Business Media. The journal publishes majorly in the area(s): Pattern recognition (psychology) & Optical character recognition. It has an ISSN identifier of 1433-2825. Over the lifetime, 561 publications have been published receiving 20720 citations. The journal is also known as: IJDAR. International journal on document analysis and recognition (Print) & IJDAR. International journal on document analysis and recognition (Internet).
Topics: Pattern recognition (psychology), Optical character recognition, Handwriting recognition, Computer science, Artificial intelligence
Papers published on a yearly basis
TL;DR: A database that consists of handwritten English sentences based on the Lancaster-Oslo/Bergen corpus, which is expected that the database would be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used.
Abstract: In this paper we describe a database that consists of handwritten English sentences. It is based on the Lancaster-Oslo/Bergen (LOB) corpus. This corpus is a collection of texts that comprise about one million word instances. The database includes 1,066 forms produced by approximately 400 different writers. A total of 82,227 word instances out of a vocabulary of 10,841 words occur in the collection. The database consists of full English sentences. It can serve as a basis for a variety of handwriting recognition tasks. However, it is expected that the database would be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used, because this knowledge can be automatically derived from the underlying corpus. The database also includes a few image-processing procedures for extracting the handwritten text from the forms and the segmentation of the text into lines and words.
TL;DR: A survey of application domains, technical challenges, and solutions for the analysis of documents captured by digital cameras, and some sample applications under development and feasible ideas for future development is presented.
Abstract: The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity for supplementing traditional scanning for document image acquisition. Digital cameras attached to cellular phones, PDAs, or wearable computers, and standalone image or video devices are highly mobile and easy to use; they can capture images of thick books, historical manuscripts too fragile to touch, and text in scenes, making them much more versatile than desktop scanners. Should robust solutions to the analysis of documents captured with such devices become available, there will clearly be a demand in many domains. Traditional scanner-based document analysis techniques provide us with a good reference and starting point, but they cannot be used directly on camera-captured images. Camera-captured images can suffer from low resolution, blur, and perspective distortion, as well as complex layout and interaction of the content and background. In this paper we present a survey of application domains, technical challenges, and solutions for the analysis of documents captured by digital cameras. We begin by describing typical imaging devices and the imaging process. We discuss document analysis from a single camera-captured image as well as multiple frames and highlight some sample applications under development and feasible ideas for future development.
TL;DR: The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
Abstract: There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
TL;DR: It is shown in a subset of the George Washington collection that such a word spotting technique can outperform a Hidden Markov Model word-based recognition technique in terms of word error rates.
Abstract: Searching and indexing historical handwritten collections are a very challenging problem. We describe an approach called word spotting which involves grouping word images into clusters of similar words by using image matching to find similarity. By annotating “interesting” clusters, an index that links words to the locations where they occur can be built automatically. Image similarities computed using a number of different techniques including dynamic time warping are compared. The word similarities are then used for clustering using both K-means and agglomerative clustering techniques. It is shown in a subset of the George Washington collection that such a word spotting technique can outperform a Hidden Markov Model word-based recognition technique in terms of word error rates.
TL;DR: The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality, and a representative single performance value is computed from the graphs.
Abstract: Evaluation of object detection algorithms is a non-trivial task: a detection result is usually evaluated by comparing the bounding box of the detected object with the bounding box of the ground truth object. The commonly used precision and recall measures are computed from the overlap area of these two rectangles. However, these measures have several drawbacks: they don't give intuitive information about the proportion of the correctly detected objects and the number of false alarms, and they cannot be accumulated across multiple images without creating ambiguity in their interpretation. Furthermore, quantitative and qualitative evaluation is often mixed resulting in ambiguous measures. In this paper we propose a new approach which tackles these problems. The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality. In order to compare different detection algorithms, a representative single performance value is computed from the graphs. The influence of the test database on the detection performance is illustrated by performance/generality graphs. The evaluation method can be applied to different types of object detection algorithms. It has been tested on different text detection algorithms, among which are the participants of the ICDAR 2003 text detection competition.