Journal•ISSN: 1433-2825

International Journal on Document Analysis and Recognition

Springer Science+Business Media

About: International Journal on Document Analysis and Recognition (IJDAR) is an academic journal published by Springer Science+Business Media. It publishes mainly in the areas of Pattern recognition (psychology) and Optical character recognition. Its ISSN is 1433-2825. Over its lifetime, the journal has published 561 papers, which have received 20,720 citations. The journal is also known as: IJDAR. International journal on document analysis and recognition (Print) and IJDAR. International journal on document analysis and recognition (Internet).

Topics: Pattern recognition (psychology), Optical character recognition, Computer science, Handwriting recognition, Handwriting

Journal Article•DOI•

The IAM-database: an English sentence database for offline handwriting recognition

Urs-Viktor Marti, Horst Bunke

01 Nov 2002-International Journal on Document Analysis and Recognition

TL;DR: A database of handwritten English sentences based on the Lancaster-Oslo/Bergen corpus, expected to be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used.

Abstract: In this paper we describe a database that consists of handwritten English sentences. It is based on the Lancaster-Oslo/Bergen (LOB) corpus. This corpus is a collection of texts that comprise about one million word instances. The database includes 1,066 forms produced by approximately 400 different writers. A total of 82,227 word instances out of a vocabulary of 10,841 words occur in the collection. The database consists of full English sentences. It can serve as a basis for a variety of handwriting recognition tasks. However, it is expected that the database would be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used, because this knowledge can be automatically derived from the underlying corpus. The database also includes a few image-processing procedures for extracting the handwritten text from the forms and the segmentation of the text into lines and words.

1,254 citations

Journal Article•DOI•

Camera-based analysis of text and documents: a survey

Jian Liang¹, David Doermann¹, Huiping Li•Institutions (1)

University of Maryland, College Park¹

01 Jul 2005-International Journal on Document Analysis and Recognition

TL;DR: A survey of application domains, technical challenges, and solutions for the analysis of documents captured by digital cameras is presented, along with sample applications under development and feasible ideas for future development.

Abstract: The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity for supplementing traditional scanning for document image acquisition. Digital cameras attached to cellular phones, PDAs, or wearable computers, and standalone image or video devices are highly mobile and easy to use; they can capture images of thick books, historical manuscripts too fragile to touch, and text in scenes, making them much more versatile than desktop scanners. Should robust solutions to the analysis of documents captured with such devices become available, there will clearly be a demand in many domains. Traditional scanner-based document analysis techniques provide us with a good reference and starting point, but they cannot be used directly on camera-captured images. Camera-captured images can suffer from low resolution, blur, and perspective distortion, as well as complex layout and interaction of the content and background. In this paper we present a survey of application domains, technical challenges, and solutions for the analysis of documents captured by digital cameras. We begin by describing typical imaging devices and the imaging process. We discuss document analysis from a single camera-captured image as well as multiple frames and highlight some sample applications under development and feasible ideas for future development.

493 citations

Journal Article•DOI•

Text line segmentation of historical documents: a survey

Laurence Likforman-Sulem¹, Abderrazak Zahour², Bruno Taconet²•Institutions (2)

École Normale Supérieure¹, University of Le Havre²

04 Apr 2007-International Journal on Document Analysis and Recognition

TL;DR: The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.

Abstract: There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.

416 citations

Journal Article•DOI•

Word spotting for historical documents

Toni M. Rath¹, R. Manmatha¹•Institutions (1)

University of Massachusetts Amherst¹

04 Apr 2007-International Journal on Document Analysis and Recognition

TL;DR: It is shown in a subset of the George Washington collection that such a word spotting technique can outperform a Hidden Markov Model word-based recognition technique in terms of word error rates.

Abstract: Searching and indexing historical handwritten collections are a very challenging problem. We describe an approach called word spotting which involves grouping word images into clusters of similar words by using image matching to find similarity. By annotating “interesting” clusters, an index that links words to the locations where they occur can be built automatically. Image similarities computed using a number of different techniques including dynamic time warping are compared. The word similarities are then used for clustering using both K-means and agglomerative clustering techniques. It is shown in a subset of the George Washington collection that such a word spotting technique can outperform a Hidden Markov Model word-based recognition technique in terms of word error rates.
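
The dynamic time warping comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes each word image has already been reduced to a one-dimensional feature sequence (for example, one value per pixel column); the abstract does not specify the features, so the function name `dtw_distance`, the absolute-difference local cost, and the sample profiles are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D feature sequences.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical column profiles of three word images.
word_a = [1, 2, 3]
word_b = [1, 2, 2, 3]   # same shape as word_a, written slightly wider
word_c = [3, 0, 3]      # a different shape

print(dtw_distance(word_a, word_b))
print(dtw_distance(word_a, word_c))
```

Because the warping path may repeat samples, the stretched profile `word_b` aligns with `word_a` at no cost, while `word_c` does not. Pairwise distances of this kind could then be passed to a clustering step such as the K-means or agglomerative clustering the abstract names.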

368 citations

Journal Article•DOI•

Object count/area graphs for the evaluation of object detection and segmentation algorithms

Christian Wolf¹, Jean-Michel Jolion¹•Institutions (1)

Institut national des sciences Appliquées de Lyon¹

01 Sep 2006-International Journal on Document Analysis and Recognition

TL;DR: The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality, and a representative single performance value is computed from the graphs.

Abstract: Evaluation of object detection algorithms is a non-trivial task: a detection result is usually evaluated by comparing the bounding box of the detected object with the bounding box of the ground truth object. The commonly used precision and recall measures are computed from the overlap area of these two rectangles. However, these measures have several drawbacks: they don't give intuitive information about the proportion of the correctly detected objects and the number of false alarms, and they cannot be accumulated across multiple images without creating ambiguity in their interpretation. Furthermore, quantitative and qualitative evaluation is often mixed resulting in ambiguous measures. In this paper we propose a new approach which tackles these problems. The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality. In order to compare different detection algorithms, a representative single performance value is computed from the graphs. The influence of the test database on the detection performance is illustrated by performance/generality graphs. The evaluation method can be applied to different types of object detection algorithms. It has been tested on different text detection algorithms, among which are the participants of the ICDAR 2003 text detection competition.
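
The idea of object-level precision and recall that depend on constraints on detection quality can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes axis-aligned boxes given as `(x0, y0, x1, y1)`, considers only one-to-one matches chosen greedily, and uses two hypothetical thresholds, `t_r` (the fraction of the ground-truth box that must be covered) and `t_p` (the fraction of the detected box that must lie on the ground truth). The function names and default threshold values are assumptions.

```python
def area(box):
    """Area of an axis-aligned box (x0, y0, x1, y1); empty boxes give 0."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def intersection(a, b):
    """Area of the overlap between two boxes."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def object_level_scores(ground_truth, detections, t_r=0.8, t_p=0.4):
    """Return (recall, precision) counted over objects, not pixels.

    Assumption: a greedy one-to-one matching; a pair matches when the
    overlap covers at least t_r of the ground-truth box and at least
    t_p of the detected box.
    """
    matched_gt, matched_det = set(), set()
    for i, g in enumerate(ground_truth):
        for j, d in enumerate(detections):
            if j in matched_det or area(g) == 0 or area(d) == 0:
                continue
            overlap = intersection(g, d)
            if overlap / area(g) >= t_r and overlap / area(d) >= t_p:
                matched_gt.add(i)
                matched_det.add(j)
                break
    recall = len(matched_gt) / len(ground_truth) if ground_truth else 0.0
    precision = len(matched_det) / len(detections) if detections else 0.0
    return recall, precision


# Two ground-truth objects; one is found exactly, two detections are false alarms.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
det = [(0, 0, 10, 10), (50, 50, 60, 60), (100, 100, 110, 110)]
recall, precision = object_level_scores(gt, det)
print(recall, precision)
```

Counting matched objects rather than accumulating overlap areas gives the proportion of correctly detected objects and the number of false alarms directly. Re-running `object_level_scores` over a range of `t_r` or `t_p` values would yield the kind of curve the abstract calls a performance graph.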

353 citations

Performance

Metrics

Papers: 562

Citations: 20,721

No. of papers from the Journal in previous years
Year	Papers
2023	24
2022	32
2021	28
2020	18
2019	29
2018	20