scispace - formally typeset
Book ChapterDOI

Cell Extraction and Horizontal-Scale Correction in Structured Documents

TLDR
The effectiveness of horizontal-scale correction is proved by applying it as a preprocessing step in a recognition system proposed in (Almazan et al. in Pattern Anal Mach Intell 36(12):21552–2566, 2014 [2]).
Abstract
Preprocessing techniques form an important task in document image analysis. In structured documents like forms, cheques, etc., there is a predefined space called frame field/cell for the user to fill the entry. When the user is writing, the nonuniformity of inter-character spacing becomes an issue. Many times, the starting characters of the word are written with sparse spacing between the characters and then gradually with a more compact spacing so as to accommodate the word within the frame field. To deal with this variation in intra-word spacing, horizontal-scale correction is applied to the extracted form fields. The effectiveness of the system is proved by applying it as a preprocessing step in a recognition system proposed in (Almazan et al. in Pattern Anal Mach Intell 36(12):21552–2566, 2014 [2]). The recognition framework results in reduced error rates with this normalization.

read more

References
More filters
Book

Digital Image Processing Using MATLAB

TL;DR: 1. Fundamentals of Image Processing, 2. Intensity Transformations and Spatial Filtering, and 3. Frequency Domain Processing.
Journal ArticleDOI

A Novel Connectionist System for Unconstrained Handwriting Recognition

TL;DR: This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies, significantly outperforming a state-of-the-art HMM-based system.
Journal ArticleDOI

The IAM-database: an English sentence database for offline handwriting recognition

TL;DR: A database that consists of handwritten English sentences based on the Lancaster-Oslo/Bergen corpus, which is expected that the database would be particularly useful for recognition tasks where linguistic knowledge beyond the lexicon level is used.
Journal ArticleDOI

Word Spotting and Recognition with Embedded Attributes

TL;DR: An approach in which both word images and text strings are embedded in a common vectorial subspace, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem and is very fast to compute and, especially, to compare.
Journal ArticleDOI

Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition systems

TL;DR: A novel feature of the system is that the HMM is applied in such a way that the difficult problem of segmenting a line of text into individual words is avoided and linguistic knowledge beyond the lexicon level is incorporated in the recognition process.