Proceedings ArticleDOI

Content directed enhancement of degraded document images

16 Dec 2012, pp. 55-61

TL;DR: This paper presents a novel framework that learns optimal parameters for binarization and text/graphics segmentation, depending on the nature of the document image content, using the EM algorithm.

Abstract: Most document pre-processing techniques are parameter dependent. In this paper, we present a novel framework that learns optimal parameters, depending on the nature of the document image content, for binarization and text/graphics segmentation. The learning problem has been formulated as an optimization problem using the EM algorithm to adaptively learn optimal parameters. Experimental results have established the effectiveness of our approach.
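The EM alternation the abstract describes can be pictured with a minimal sketch. Everything below is an assumed reading rather than the authors' implementation: the parameter being learned is a local-threshold window size, and `ink_prior` is an invented quality proxy.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def binarize(gray, window):
    # local mean thresholding with the current window size
    return (gray < uniform_filter(gray.astype(float), window)).astype(np.uint8)

def ink_prior(binary):
    # illustrative proxy: favour outputs with plausible ink coverage (~10%)
    return -(binary.mean() - 0.1) ** 2

def learn_window(gray, candidates=(15, 35, 75, 151, 351), n_iters=5):
    ref = binarize(gray, candidates[len(candidates) // 2])  # initial labelling
    best = candidates[0]
    for _ in range(n_iters):
        # E-step: score each candidate against the current labelling
        scores = [(binarize(gray, w) == ref).mean() + ink_prior(binarize(gray, w))
                  for w in candidates]
        # M-step: adopt the best candidate and refresh the labelling
        best = candidates[int(np.argmax(scores))]
        ref = binarize(gray, best)
    return best
```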



Citations
Proceedings ArticleDOI
07 Apr 2014
TL;DR: A novel learning-based framework extracts articles from newspaper images using a Fixed-Point Model: contextual information and per-block features are used to learn the layout of newspaper images, and a contraction mapping assigns a unique label to every block.
Abstract: This paper presents a novel learning based framework to extract articles from newspaper images using a Fixed-Point Model. The input to the system comprises blocks of text and graphics, obtained using standard image processing techniques. The fixed point model uses contextual information and features of each block to learn the layout of newspaper images and attains a contraction mapping to assign a unique label to every block. We use a hierarchical model which works in two stages. In the first stage, a semantic label (heading, sub-heading, text-blocks, image and caption) is assigned to each segmented block. The labels are then used as input to the next stage to group the related blocks into news articles. Experimental results show the applicability of our algorithm in newspaper labeling and article extraction.
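As a rough illustration of what "attains a contraction mapping" means operationally, a contextual classifier can be iterated until the label assignment stops changing. A minimal sketch, where `classify`, the block identifiers, and the initial label are hypothetical stand-ins for the trained Fixed-Point Model:

```python
def fixed_point_labels(blocks, neighbours, classify, max_iters=20):
    # arbitrary initial guess; blocks are hashable identifiers (hypothetical)
    labels = {b: "text-block" for b in blocks}
    for _ in range(max_iters):
        # relabel each block from its own features and its neighbours'
        # current labels (the contextual information)
        new = {b: classify(b, [labels[n] for n in neighbours[b]]) for b in blocks}
        if new == labels:        # fixed point reached
            break
        labels = new
    return labels
```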

12 citations


Cites methods from "Content directed enhancement of degraded document images"

  • ...The gray scale image is first binarized using the method described in our earlier work [2]....


Proceedings ArticleDOI
12 Jun 2015
TL;DR: A novel method to rebuild broken characters: the characters are thinned, the endpoints of the resulting lines are obtained, and the line segments are rebuilt so as to preserve the degraded character.
Abstract: Degraded character recognition is one of the most challenging topics in the field of Kannada character recognition. Degraded characters that are broken and deformed have missing features and are difficult for any recognition method; rebuilding the degraded character is therefore very important for better recognition. This paper proposes a novel method to rebuild broken characters. The characters are thinned and the endpoints of the lines are obtained; the line segments are then rebuilt so as to preserve the degraded character. Experimental results are presented to establish the method's efficiency.
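The thinning and endpoint-detection steps named above can be sketched with standard operations; this is an assumed reconstruction, and the paper's actual segment-rebuilding rules are not reproduced.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def endpoints(char_binary):
    # thin the character to a one-pixel-wide skeleton
    skel = skeletonize(char_binary > 0)
    # count each pixel's 3x3 neighbourhood (self included)
    neigh = convolve(skel.astype(int), np.ones((3, 3), int), mode="constant")
    # an endpoint is a skeleton pixel with exactly one skeleton neighbour
    return np.argwhere(skel & (neigh == 2))   # self + 1 neighbour = 2
```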

5 citations

Proceedings ArticleDOI
11 Apr 2016
TL;DR: A novel framework for automatic selection of optimal pre-processing parameters by estimating the quality of the document image; parameters are computed in the M-step to maximize the expected recognition accuracy found in the E-step.
Abstract: The performance of most recognition engines for document images is affected by the quality of the image being processed and by the choice of parameter values for the pre-processing algorithm. Usually, such parameters are chosen empirically. In this paper, we propose a novel framework for automatic selection of optimal parameters for the pre-processing algorithm by estimating the quality of the document image. Recognition accuracy can be used as a metric for document quality assessment. We learn filters that capture the script properties and degradation to predict recognition accuracy. An EM-based framework has been formulated to iteratively learn optimal parameters for document image pre-processing. In the E-step, we estimate the expected accuracy using the current set of parameters and filters. In the M-step, we compute parameters that maximize the expected recognition accuracy found in the E-step. The experiments validate the efficacy of the proposed methodology for document image pre-processing applications.
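A hedged sketch of the selection step: given filters and weights that are assumed to be already learned (the paper learns them; here they are plain inputs), each candidate parameter set is scored by its predicted accuracy and the maximizer is kept.

```python
import numpy as np
from scipy.ndimage import convolve

def predict_accuracy(image, filters, weights):
    # responses of the (learned) filters act as quality features
    feats = [np.abs(convolve(image, f)).mean() for f in filters]
    return float(np.dot(weights, feats))

def select_params(image, preprocess, param_grid, filters, weights):
    # score every candidate by predicted accuracy and keep the best; the
    # paper alternates this with re-estimating the predictor (EM style)
    exp_acc = [predict_accuracy(preprocess(image, p), filters, weights)
               for p in param_grid]
    return param_grid[int(np.argmax(exp_acc))]
```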

5 citations


Cites methods from "Content directed enhancement of degraded document images"

  • ...An EM based formulations for parameter optimization is presented in [8]....


Proceedings ArticleDOI
24 Aug 2013
TL;DR: A novel framework for learning optimal parameters for text graphic separation in the presence of the complex layouts of Indian newspapers is proposed.
Abstract: Digitization of newspaper articles is important for registering historical events. Layout analysis of Indian newspapers is a challenging task due to the presence of different font sizes, font styles, and random placement of text and non-text regions. In this paper, we propose a novel framework for learning optimal parameters for text graphic separation in the presence of complex layouts. The learning problem has been formulated as an optimization problem using the EM algorithm to learn optimal parameters depending on the nature of the document content.

3 citations


Cites background or methods from "Content directed enhancement of degraded document images"

  • ...For a given gray-scale document image, it is binarized using the method described in [1]....


  • ...The proposed framework is a modification of our earlier work [1]....


  • ...This paper presents a modification of the earlier work on text graphic separation [1] that exploits the nature of the document image content for learning optimal parameters for binarization and effective text graphic separation....


  • ...In contrast to our earlier work [1], where a fixed neighbourhood of size 350× 350 was used, we learn optimal neighbourhood size to improve the segmentation in newspaper images....



References
Journal ArticleDOI

31,977 citations


Additional excerpts

  • ...Global binarization techniques [11] are preferred in cases where there is a good separation between foreground and background....


Journal ArticleDOI
TL;DR: A new method is presented for adaptive document image binarization, where the page is considered as a collection of subcomponents such as text, background and picture, which adapts and performs well in each case qualitatively and quantitatively.
Abstract: A new method is presented for adaptive document image binarization, where the page is considered as a collection of subcomponents such as text, background and picture. The problems caused by noise, illumination and many source type-related degradations are addressed. Two new algorithms are applied to determine a local threshold for each pixel. The performance evaluation of the algorithm utilizes test images with ground-truth, evaluation metrics for binarization of textual and synthetic images, and a weight-based ranking procedure for the final result presentation. The proposed algorithms were tested with images including different types of document components and degradations. The results were compared with a number of known techniques in the literature. The benchmarking results show that the method adapts and performs well in each case qualitatively and quantitatively. © 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
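The local per-pixel thresholding this entry describes is available off the shelf; a minimal usage sketch with scikit-image's Sauvola implementation (the file name, window size and k are illustrative):

```python
from skimage import io
from skimage.filters import threshold_sauvola

gray = io.imread("page.png", as_gray=True)   # hypothetical input image
thresh = threshold_sauvola(gray, window_size=25, k=0.2)
binary = gray > thresh                       # True = background, False = ink
```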

1,902 citations

Journal ArticleDOI
TL;DR: The development and implementation of an algorithm for automated text string separation that is relatively independent of changes in text font style and size and of string orientation are described and showed superior performance compared to other techniques.
Abstract: The development and implementation of an algorithm for automated text string separation that is relatively independent of changes in text font style and size and of string orientation are described. It is intended for use in an automated system for document analysis. The principal parts of the algorithm are the generation of connected components and the application of the Hough transform in order to group components into logical character strings that can then be separated from the graphics. The algorithm outputs two images, one containing text strings and the other graphics. These images can then be processed by suitable character recognition and graphics recognition systems. The performance of the algorithm, both in terms of its effectiveness and computational efficiency, was evaluated using several test images and showed superior performance compared to other techniques.
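The two principal stages named in the abstract (connected components, then a Hough transform to group collinear components into strings) can be sketched as below. The real algorithm's size filtering and string-tracing rules are omitted, and the vote threshold is arbitrary.

```python
import numpy as np
import cv2

def text_string_candidates(binary):   # binary: uint8 image, ink = 255
    # stage 1: connected components and their centroids
    _, _, _, centroids = cv2.connectedComponentsWithStats(binary)
    # stage 2: render the centroids as points and let collinear ones
    # vote for the same (rho, theta) line via the Hough transform
    pts = np.zeros(binary.shape, np.uint8)
    for cx, cy in centroids[1:]:      # label 0 is the background
        pts[int(cy), int(cx)] = 255
    return cv2.HoughLines(pts, 1, np.pi / 180, 5)   # rho, theta, votes
```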

658 citations


"Content directed enhancement of deg..." refers methods in this paper

  • ...The most commonly used approach for text/graphic separation in document images [4, 5] is based on connected component analysis....


Journal ArticleDOI
TL;DR: The proposed method does not require any parameter tuning by the user and can deal with degradations which occur due to shadows, non-uniform illumination, low contrast, large signal-dependent noise, smear and strain.
Abstract: This paper presents a new adaptive approach for the binarization and enhancement of degraded documents. The proposed method does not require any parameter tuning by the user and can deal with degradations which occur due to shadows, non-uniform illumination, low contrast, large signal-dependent noise, smear and strain. We follow several distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image while incorporating image up-sampling and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. After extensive experiments, our method demonstrated superior performance against four (4) well-known techniques on numerous degraded document images.
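A compressed sketch of the pipeline the abstract enumerates (Wiener pre-filter, rough foreground estimate, background surface, distance-to-background threshold). The parameter values and the dilation-based background interpolation are crude stand-ins for the paper's procedure.

```python
import numpy as np
from scipy.signal import wiener
from scipy.ndimage import grey_dilation
from skimage.filters import threshold_sauvola

def binarize_degraded(gray):   # gray: float image, bright background
    smooth = wiener(gray.astype(float), (5, 5))         # pre-processing
    rough_fg = smooth < threshold_sauvola(smooth, 25)   # rough foreground
    # crude background surface: fill foreground pixels from nearby background
    bg = smooth.copy()
    filled = grey_dilation(np.where(rough_fg, 0.0, smooth), size=15)
    bg[rough_fg] = filled[rough_fg]
    # final threshold: ink where a pixel is much darker than its background
    return (bg - smooth) > 0.15 * bg
```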

548 citations


Additional excerpts

  • ...However, in case of degradations like shadow, non-uniform illuminations, scratch, ink bleeds and other complex degradations, local binarization techniques [6, 10, 13, 18] have provided better results....


Journal ArticleDOI
TL;DR: It is shown that a constrained run length algorithm is well suited to partition most documents into areas of text lines, solid black lines, and rectangular ☐es enclosing graphics and halftone images.
Abstract: The segmentation and classification of digitized printed documents into regions of text and images is a necessary first processing step in document analysis systems. It is shown that a constrained run length algorithm is well suited to partition most documents into areas of text lines, solid black lines, and rectangular boxes enclosing graphics and halftone images. During the processing these areas are labeled and meaningful features are calculated. By making use of the regular appearance of text lines as textured stripes, a linear adaptive classification scheme is constructed to discriminate text regions from others.
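The constrained run-length smoothing this reference introduces (commonly known as RLSA) admits a compact sketch: background runs shorter than a threshold are filled along rows and then columns, and the two results are combined with a logical AND. The thresholds below are illustrative.

```python
import numpy as np

def rlsa_1d(line, c):
    out = line.copy()
    ink = np.flatnonzero(line)
    for a, b in zip(ink[:-1], ink[1:]):
        gap = b - a - 1               # background run between two ink pixels
        if 0 < gap <= c:
            out[a + 1:b] = 1          # constrained fill: short runs only
    return out

def rlsa(binary, c_h=30, c_v=20):     # binary: 0/1 ints; thresholds illustrative
    h = np.apply_along_axis(rlsa_1d, 1, binary, c_h)
    v = np.apply_along_axis(rlsa_1d, 0, binary, c_v)
    return h & v                      # AND of horizontal and vertical smearing
```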

425 citations