Proceedings ArticleDOI

Enhancement of Old Manuscript Images

23 Sep 2007-Vol. 2, pp 744-748
TL;DR: A segmentation-based histogram matching scheme is proposed for enhancing small portions of text in old manuscripts that have degraded with time and are no longer readable.
Abstract: In this paper we address the issue of enhancing the quality of scanned images of old manuscripts. Small portions of the text in these manuscripts have degraded with time and are not readable. We propose a segmentation-based histogram matching scheme for enhancing these degraded text regions. To automatically identify the degraded text, we use a matched-wavelet-based text extraction algorithm followed by MRF (Markov Random Field) post-processing. Additionally, we perform background clearing to improve the quality of the results. This method does not require any a priori information about the font, font size, background texture, or geometric transformation. We have tested our method on a variety of manuscript images. The results show the proposed method to be a robust, versatile, and effective tool for the enhancement of manuscript images.
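The core enhancement step, histogram matching, can be sketched as follows. This is a generic CDF-based grey-level remapping in Python with NumPy, not the authors' exact segmentation-driven variant: a degraded region's grey levels are remapped so that its histogram matches that of a clean reference region.

```python
import numpy as np

def match_histogram(degraded, reference):
    """Remap grey levels of `degraded` so its histogram matches `reference`.

    Both inputs are uint8 arrays; `reference` would be a well-preserved
    text region of the same manuscript. Generic sketch, function name
    illustrative.
    """
    src_vals, src_counts = np.unique(degraded.ravel(), return_counts=True)
    ref_vals, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_counts) / degraded.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    # For each source grey level, pick the reference level whose CDF is closest.
    mapped = np.interp(src_cdf, ref_cdf, ref_vals)
    lut = np.zeros(256, dtype=np.uint8)
    lut[src_vals] = mapped.astype(np.uint8)
    return lut[degraded]
```

In the paper's pipeline this mapping would be applied only inside the automatically segmented degraded regions, leaving healthy text untouched.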
Citations
Proceedings ArticleDOI
22 Sep 2009
TL;DR: A comprehensive review of methods for enhancing old document images with damaged backgrounds, covering binarization/thresholding techniques, their combinations with other methods, and other methods alone.
Abstract: Old documents quite often suffer from background damage: varying contrast, smudges, dirt, ink seeping through from the other side of the page, and uneven background caused by age, storage conditions, and the quality of the written parchment. To make such documents readable, image processing offers a selection of approaches. The aim of this paper is to provide a comprehensive review of methods for enhancing old document images with damaged backgrounds. Three kinds of enhancement methods are considered: (a) methods using binarization or thresholding alone, (b) methods combining binarization or thresholding with other techniques, and (c) methods using other techniques only. The review concludes that the second category has become the most popular and holds great potential for future improvement.
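The binarization baseline that recurs throughout such reviews is global thresholding. A minimal sketch of Otsu's method in Python with NumPy follows (assuming a uint8 greyscale image; the function name is illustrative):

```python
import numpy as np

def otsu_threshold(img):
    """Return Otsu's global threshold for a uint8 greyscale image.

    Picks the grey level that maximises between-class variance, a standard
    baseline for the binarization methods surveyed; not tied to any one
    paper in the review.
    """
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum = np.cumsum(hist)                          # class-0 pixel counts
    cum_mean = np.cumsum(hist * np.arange(256))    # class-0 intensity sums
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = cum[t - 1], total - cum[t - 1]
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_mean[t - 1] / w0
        m1 = (cum_mean[255] - cum_mean[t - 1]) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels below the returned threshold would be labelled text (ink) and the rest background.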

39 citations

Journal ArticleDOI
TL;DR: This work addresses the problem of separating images through the use of a physical model of the mixture process, which is nonlinear but invertible, and uses the inverse model to perform the separation.

22 citations

Proceedings ArticleDOI
Chunmei Liu
23 Aug 2010
TL;DR: A novel approach to recognizing characters degraded by three kinds of independent degradation sources, composed of two stages: character image quality evaluation and character recognition.
Abstract: Character image quality plays an important role in degraded character recognition, as it indicates recognition difficulty. This paper proposes a novel approach to recognizing characters degraded by three kinds of independent degradation sources. It is composed of two stages: character image quality evaluation and character recognition. First, a dual evaluation assesses the image quality of the input character. Second, according to the evaluation result, the appropriate character recognition sub-system is adaptively invoked. These sub-systems are trained on character sets whose image quality is similar to the input's, each with its own features and classifiers. Experimental results demonstrate that the proposed approach significantly improves the performance of degraded character recognition systems.

4 citations


Cites methods from "Enhancement of Old Manuscript Image..."

  • ...[7] presented an algorithm based on matched wavelets and MRF model to automatically identify and extract the low contrast text regions from scanned manuscript images and enhance them using a histogram matching technique....


Book ChapterDOI
14 Mar 2012
TL;DR: This chapter deals with digital restoration, preservation, and database storage of historical manuscript images, focusing on restoration techniques and binarization methods combined with image processing applied to document images for text-background enhancement and discrimination.
Abstract: This chapter deals with digital restoration, preservation, and database storage of historical manuscript images. It focuses on restoration techniques and binarization methods combined with image processing applied to document images for text-background enhancement and discrimination. Sequential image processing procedures are applied for refinement and enhancement of images categorized by quality class. Research results on historical documents (e.g. Byzantine manuscripts, old newspapers) are presented.

3 citations

Book ChapterDOI
11 Nov 2016
TL;DR: An analysis of the different methods used for the enhancement of degraded ancient images with respect to low resolution, minimal intensity difference between text and background, show-through effects, and uneven background.
Abstract: The article describes the most recent developments in the field of enhancement and digitization of ancient manuscripts and inscriptions. Digitization of ancient sources of information is essential for gaining insight into the rich culture of previous civilizations, which in turn requires a high rate of accuracy in word and character recognition. To enhance the accuracy of an Optical Character Recognition (OCR) system, degraded images must first be made compatible with it: the image is pre-processed by filtering techniques and segmented by thresholding methods, followed by post-processing operations. Digitization of ancient artefacts also preserves the information that lies in ancient manuscripts and can promote tourism by attracting more visitors. This article gives an analysis of the different methods used for the enhancement of degraded ancient images with respect to low resolution, minimal intensity difference between text and background, show-through effects, and uneven background. The techniques reviewed include ICA, NGFICA, cumulants-based ICA, and a novel thresholding technique for text extraction.
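The filtering-then-thresholding pre-processing described above can be illustrated with a 3x3 median filter, a common impulse-noise removal step applied before binarization (a minimal NumPy sketch, not tied to any specific method in the review):

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter: a typical denoising step before thresholding.

    Minimal sketch; border pixels reuse the nearest edge value via padding.
    """
    h, w = img.shape
    padded = np.pad(img, 1, mode='edge')
    # Stack the nine shifted views of the image, one per neighbourhood offset.
    neigh = np.stack([padded[r:r + h, c:c + w]
                      for r in range(3) for c in range(3)])
    return np.median(neigh, axis=0).astype(img.dtype)
```

A salt-and-pepper speck smaller than the window is replaced by the local median, while step edges between text and background are largely preserved.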

1 citation

References
Journal ArticleDOI
TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves that allow important cases of discontinuity preserving energies.
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.
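The energy being minimized can be made concrete. Below, `potts_energy` evaluates a labeling under a data term plus a Potts (discontinuity-preserving) smoothness term; a full graph-cut expansion move is beyond a short sketch, so a greedy ICM pass stands in for the paper's expansion/swap algorithms (all names illustrative):

```python
import numpy as np

def potts_energy(labels, data_cost, lam=1.0):
    """E = sum_p D_p(l_p) + lam * sum_{4-neighbours} [l_p != l_q].

    `labels` is (H, W) int, `data_cost` is (H, W, num_labels).
    """
    H, W, _ = data_cost.shape
    e = data_cost[np.arange(H)[:, None], np.arange(W)[None, :], labels].sum()
    e += lam * (labels[1:, :] != labels[:-1, :]).sum()   # vertical edges
    e += lam * (labels[:, 1:] != labels[:, :-1]).sum()   # horizontal edges
    return float(e)

def icm(labels, data_cost, lam=1.0, iters=5):
    """Greedy single-pixel moves (ICM); a simple stand-in for the large
    expansion/swap moves of the graph-cut algorithms."""
    H, W, L = data_cost.shape
    labels = labels.copy()
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                best, best_e = labels[i, j], np.inf
                for l in range(L):
                    e = data_cost[i, j, l]
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W:
                            e += lam * (l != labels[ni, nj])
                    if e < best_e:
                        best_e, best = e, l
                labels[i, j] = best
    return labels
```

Unlike ICM, the expansion algorithm of the paper changes arbitrarily large sets of pixels per move and comes with a known approximation factor; the energy function itself is the same.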

7,413 citations


"Enhancement of Old Manuscript Image..." refers background or methods in this paper

  • ...The problem of correcting the misclassification can be formulated in terms of energy minimization [9] [2]....


  • ...The first and second terms in the above equation are referred to as Esmooth (interaction energy) and Edata (energy corresponding to the data term) in the literature [9]....


  • ...The discontinuity preserving energy function used here is Potts interaction penalty [9]....


Journal ArticleDOI
TL;DR: This paper addresses the problem of identifying text in noisy document images by treating noise as a separate class and modeling it based on selected features.
Abstract: In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and identifying between handwriting and machine printed text because: 1) Handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques requested for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise and we further exploit context to refine the classification. A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. Experimental results show that our approach is robust and can significantly improve page segmentation in noisy document collections.
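The two-class Fisher classifier used for printed-text/handwriting/noise discrimination can be sketched as follows. The closed-form direction w = Sw⁻¹(m1 − m0) is the textbook formulation, not the authors' exact training procedure, and the features here are placeholders for the paper's selected features:

```python
import numpy as np

def fisher_direction(X0, X1):
    """Two-class Fisher linear discriminant.

    X0, X1 are (n_samples, n_features) arrays for the two classes.
    Returns the projection direction w and a midpoint threshold; a sample x
    is assigned class 1 when x @ w > thresh. Sketch only.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of per-class scatter matrices.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, m1 - m0)
    thresh = w @ (m0 + m1) / 2.0
    return w, thresh
```

In the paper's setting, multiple such classifiers (one per class pair) feed the MRF stage, which then rectifies misclassifications using spatial context.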

195 citations


"Enhancement of Old Manuscript Image..." refers methods in this paper

  • ...Similar approach has been used recently in [8] to refine the results of segmenting the handwritten text, printed text and noise in the document image....


  • ...Thus in our case classification confidence found using Fisher classification is the intuitive choice of Dp [8]....


Journal ArticleDOI
TL;DR: A clustering-based technique has been devised for estimating globally matched wavelet filters using a collection of groundtruth images and a text extraction scheme for the segmentation of document images into text, background, and picture components is extended.
Abstract: In this paper, we have proposed a novel scheme for the extraction of textual areas of an image using globally matched wavelet filters. A clustering-based technique has been devised for estimating globally matched wavelet filters using a collection of groundtruth images. We have extended our text extraction scheme for the segmentation of document images into text, background, and picture components (which include graphics and continuous tone images). Multiple, two-class Fisher classifiers have been used for this purpose. We also exploit contextual information by using a Markov random field formulation-based pixel labeling scheme for refinement of the segmentation results. Experimental results have established effectiveness of our approach.
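The wavelet side of this text extraction scheme can be illustrated with subband energies from one level of a fixed 2-D Haar transform. The paper estimates globally *matched* wavelet filters from groundtruth images instead of using a fixed basis, so this shows only the generic idea: high-frequency subband energies as texture features for text-vs-background labeling (subband naming conventions vary):

```python
import numpy as np

def haar_subband_energies(img):
    """One level of a 2-D Haar decomposition; returns mean energy per subband.

    Expects an image with even height and width. High-frequency subbands
    (LH/HL/HH here) respond strongly to text strokes, weakly to smooth
    background. Illustrative sketch only.
    """
    a = img.astype(float)
    # Pairwise averages/differences along columns (row direction)...
    lo_r = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi_r = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # ...then along rows (column direction), giving the four subbands.
    ll = (lo_r[0::2] + lo_r[1::2]) / 2.0
    lh = (lo_r[0::2] - lo_r[1::2]) / 2.0
    hl = (hi_r[0::2] + hi_r[1::2]) / 2.0
    hh = (hi_r[0::2] - hi_r[1::2]) / 2.0
    return {k: float((v ** 2).mean())
            for k, v in (('LL', ll), ('LH', lh), ('HL', hl), ('HH', hh))}
```

Feature vectors built from such energies, computed per block or per pixel neighbourhood, are what the Fisher classifiers in the scheme would consume.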

159 citations


"Enhancement of Old Manuscript Image..." refers background or methods in this paper

  • ...The problem of correcting the misclassification can be formulated in terms of energy minimization [9] [2]....


  • ...In this paper, we have used the text extraction algorithm designed in [1], [2], a brief overview of which is given below...


Journal ArticleDOI
TL;DR: Methods are presented to design a finite impulse response/infinite impulse response (FIR/IIR) biorthogonal perfect reconstruction filterbank, leading to the estimation of a compactly supported/infinitely supported statistically matched wavelet.
Abstract: This paper presents a new approach for the estimation of wavelets that are matched to a given signal in the statistical sense. Based on this approach, a number of new methods to estimate statistically matched wavelets are proposed. The paper first proposes a new method for the estimation of a statistically matched two-band compactly supported biorthogonal wavelet system. Second, a new method is proposed to estimate a statistically matched semi-orthogonal two-band wavelet system that results in a compactly supported or infinitely supported wavelet. Next, the proposed method of estimating a two-band wavelet system is generalized to the M-band wavelet system. Here, the key idea lies in the estimation of analysis wavelet filters from a given signal. This is similar to a sharpening filter used in image enhancement. The output of the analysis highpass filter branch is viewed as the error in estimating the middle sample from its neighborhood. To minimize this error, a minimum mean square error (MMSE) criterion is employed. Since wavelet expansion acts like a Karhunen-Loève-type expansion for generalized 1/f^β processes, it is assumed that the given signal is a sample function of an mth-order fractional Brownian motion. Therefore, the autocorrelation structure of a generalized 1/f^β process is used in the estimation of analysis filters under the MMSE criterion. Methods are then presented to design a finite impulse response/infinite impulse response (FIR/IIR) biorthogonal perfect reconstruction filterbank, leading to the estimation of a compactly supported/infinitely supported statistically matched wavelet. The proposed methods are very simple. Simulation results validating the proposed theory are presented for different synthetic self-similar signals as well as music and speech clips.
Estimated wavelets for different signals are compared with the standard biorthogonal 9/7 and 5/3 wavelets for the application of compression and are shown to give better results.
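The MMSE prediction-error view at the heart of this estimation can be sketched: the analysis highpass output is the error in linearly predicting a sample from its neighbours, and minimizing the mean square error yields normal (Yule-Walker-type) equations. The sketch below solves them from empirical autocorrelations rather than the fractional-Brownian-motion model the paper assumes:

```python
import numpy as np

def mmse_predictor(x, order=2):
    """Solve the normal equations R a = r for a linear predictor of a sample
    from its `order` predecessors, using empirical autocorrelations.

    The prediction-error filter [1, -a_1, ..., -a_order] plays the role of
    the analysis highpass filter in the MMSE view. Sketch only; the paper
    uses the autocorrelation of a generalized 1/f^beta process instead.
    """
    x = np.asarray(x, float)
    N = len(x)
    # Empirical autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:N - k], x[k:]) / (N - k) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return a
```

For a signal with strong sample-to-sample correlation, the recovered coefficients approximate the generating recursion, and the residual (highpass output) carries only the unpredictable detail.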

80 citations

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A background light intensity normalization algorithm suitable for historical document images that adaptively captures the background with a "best fit" linear function and normalizes the image with respect to the approximation.
Abstract: This work presents a background light intensity normalization algorithm suitable for historical document images. The algorithm uses an adaptive linear function to approximate the uneven background caused by the uneven surface of the document paper, aged color, and the light source of the cameras used for image lifting. Our algorithm adaptively captures the background with a "best fit" linear function and normalizes the image with respect to the approximation. The technique works for both grayscale and color images, with significant improvement in readability.
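The "best fit" linear background idea can be sketched as a least-squares plane fit. The paper's algorithm is adaptive and also handles color; this minimal version fits a single global plane b(x, y) = c0 + c1·x + c2·y to a greyscale image and divides it out (function name illustrative):

```python
import numpy as np

def normalize_background(img):
    """Flatten uneven illumination by dividing out a fitted background plane.

    Fits b(x, y) = c0 + c1*x + c2*y to the whole image by least squares and
    returns img / b, so values near 1.0 mark pixels matching the background
    and darker ink falls below it. Global sketch of an adaptive method.
    """
    H, W = img.shape
    yy, xx = np.mgrid[0:H, 0:W]
    A = np.stack([np.ones(img.size), xx.ravel(), yy.ravel()], axis=1)
    coef, *_ = np.linalg.lstsq(A, img.ravel().astype(float), rcond=None)
    bg = (A @ coef).reshape(H, W)
    return img / np.maximum(bg, 1e-6)
```

A robust variant would fit the plane only to pixels pre-classified as background, so dark text strokes do not bias the estimate.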

77 citations


"Enhancement of Old Manuscript Image..." refers methods in this paper

  • ...We implement an optional background clearing step, described in [3], to make manuscript image more readable....


  • ...Linear function approximations of document background [3], and a foreground-background separation method by using local adaptive analysis [5], are among other methods used in the digital restoration of historical document images....
