Book Chapter
Cell Extraction and Horizontal-Scale Correction in Structured Documents
Divya Srivastava, Gaurav Harit +1 more
TLDR
The effectiveness of horizontal-scale correction is demonstrated by applying it as a preprocessing step in the recognition system proposed in (Almazan et al. in Pattern Anal Mach Intell 36(12):2552–2566, 2014 [2]).
Abstract:
Preprocessing is an important step in document image analysis. Structured documents such as forms and cheques provide a predefined space, called a frame field or cell, in which the user writes each entry. Handwriting within such a field often has nonuniform inter-character spacing: the first characters of a word are frequently written with wide spacing, and later characters are progressively compressed so that the word fits within the frame field. To deal with this variation in intra-word spacing, horizontal-scale correction is applied to the extracted form fields. The effectiveness of the approach is demonstrated by applying it as a preprocessing step in the recognition system proposed in (Almazan et al. in Pattern Anal Mach Intell 36(12):2552–2566, 2014 [2]). With this normalization, the recognition framework achieves reduced error rates.
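To make the spacing problem concrete, here is a minimal sketch of gap equalization on a binarized field image using a vertical projection profile. It is a simplified stand-in and not the chapter's horizontal-scale correction algorithm, which the abstract does not specify. The function names (`column_has_ink`, `equalize_gaps`), the list-of-rows image format (1 = ink, 0 = background), and the choice of the median gap as the default target width are all assumptions made for illustration.

```python
from statistics import median


def column_has_ink(image, x):
    """Return True if any pixel in column x is ink (value 1)."""
    return any(row[x] for row in image)


def equalize_gaps(image, target_gap=None):
    """Rewrite every interior run of blank columns to a common width.

    image: list of equal-length rows, 1 = ink, 0 = background.
    target_gap: desired gap width in columns; defaults to the median
    interior gap width, truncated to an integer.
    Leading and trailing blank margins are copied unchanged.
    """
    if not image or not image[0]:
        return [list(row) for row in image]
    width = len(image[0])
    ink = [column_has_ink(image, x) for x in range(width)]

    # Split the columns into maximal runs of (is_ink, start, end).
    runs = []
    start = 0
    for x in range(1, width + 1):
        if x == width or ink[x] != ink[start]:
            runs.append((ink[start], start, x))
            start = x

    # Interior gaps are blank runs with an ink run on both sides.
    interior = {
        i for i, (is_ink, _, _) in enumerate(runs)
        if not is_ink and 0 < i < len(runs) - 1
    }
    if target_gap is None:
        if not interior:
            return [list(row) for row in image]
        target_gap = int(median(runs[i][2] - runs[i][1] for i in interior))

    # Rebuild each row: ink runs and margins are copied, gaps are resized.
    out = [[] for _ in image]
    for i, (_, s, e) in enumerate(runs):
        for y, row in enumerate(image):
            if i in interior:
                out[y].extend([0] * target_gap)
            else:
                out[y].extend(row[s:e])
    return out
```

For example, `equalize_gaps([[1, 0, 0, 0, 0, 1, 0, 1]], target_gap=1)` collapses the wide first gap so that both gaps are one column wide. A fuller correction would presumably also rescale the character strokes themselves, since compressed handwriting changes character width as well as spacing; this sketch only touches the blank columns.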
References
Book
Digital Image Processing Using MATLAB
TL;DR: 1. Fundamentals of Image Processing, 2. Intensity Transformations and Spatial Filtering, and 3. Frequency Domain Processing.
Journal Article
A Novel Connectionist System for Unconstrained Handwriting Recognition
TL;DR: This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies, significantly outperforming a state-of-the-art HMM-based system.
Journal Article
The IAM-database: an English sentence database for offline handwriting recognition
Urs-Viktor Marti, Horst Bunke +1 more
TL;DR: A database of handwritten English sentences based on the Lancaster-Oslo/Bergen corpus, which is expected to be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used.
Journal Article
Word Spotting and Recognition with Embedded Attributes
TL;DR: An approach in which both word images and text strings are embedded in a common vectorial subspace, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem; the resulting representation is very fast to compute and, especially, to compare.
Journal Article
Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system
U.-V. Marti, Horst Bunke +1 more
TL;DR: A novel feature of the system is that the HMM is applied in such a way that the difficult problem of segmenting a line of text into individual words is avoided and linguistic knowledge beyond the lexicon level is incorporated in the recognition process.