Proceedings ArticleDOI

Enhancement of Old Manuscript Images

23 Sep 2007-Vol. 2, pp 744-748
TL;DR: A segmentation-based histogram matching scheme is proposed for enhancing small portions of text in old manuscripts that have degraded with time and are no longer readable.
Abstract: In this paper we address the issue of enhancing the quality of scanned images of old manuscripts. Small portions of the text in these manuscripts have degraded with time and are not readable. We propose a segmentation-based histogram matching scheme for enhancing these degraded text regions. To automatically identify the degraded text, we use a matched-wavelet-based text extraction algorithm followed by MRF (Markov Random Field) post-processing. Additionally, we perform background clearing to improve the quality of the results. This method does not require any a priori information about the font, font size, background texture, or geometric transformation. We have tested our method on a variety of manuscript images. The results show the proposed method to be a robust, versatile, and effective tool for the enhancement of manuscript images.
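The core enhancement step, histogram matching, can be sketched as follows. This is a generic CDF-based grey-level remapping in Python with NumPy, not the authors' exact segmentation-driven variant: a degraded region's grey levels are remapped so that its histogram matches that of a clean reference region.

```python
import numpy as np

def match_histogram(degraded, reference):
    """Remap grey levels of `degraded` so its histogram matches `reference`.

    Both inputs are uint8 arrays; `reference` would be a well-preserved
    text region of the same manuscript. Generic sketch, function name
    illustrative.
    """
    src_vals, src_counts = np.unique(degraded.ravel(), return_counts=True)
    ref_vals, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_counts) / degraded.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    # For each source grey level, pick the reference level whose CDF is closest.
    mapped = np.interp(src_cdf, ref_cdf, ref_vals)
    lut = np.zeros(256, dtype=np.uint8)
    lut[src_vals] = mapped.astype(np.uint8)
    return lut[degraded]
```

In the paper's pipeline this mapping would be applied only inside the automatically segmented degraded regions, leaving healthy text untouched.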
Citations
Proceedings ArticleDOI
22 Sep 2009
TL;DR: A comprehensive review of methods for enhancing old document images with damaged backgrounds, covering binarization/thresholding techniques, their combinations with other methods, and other methods alone.
Abstract: Old documents quite often suffer from background damage: varying contrast, smudges, dirt, ink seeping through from the other side of the page, and uneven background caused by age, storage conditions, and the quality of the written parchment. To make such documents readable, image processing offers a selection of approaches. The aim of this paper is to provide a comprehensive review of methods for enhancing old document images with damaged backgrounds. Three kinds of enhancement methods are considered: (a) methods using binarization or thresholding alone, (b) methods combining binarization or thresholding with other techniques, and (c) methods using other techniques only. The review concludes that the second category has become the most popular and holds great potential for future improvement.
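The binarization baseline that recurs throughout such reviews is global thresholding. A minimal sketch of Otsu's method in Python with NumPy follows (assuming a uint8 greyscale image; the function name is illustrative):

```python
import numpy as np

def otsu_threshold(img):
    """Return Otsu's global threshold for a uint8 greyscale image.

    Picks the grey level that maximises between-class variance, a standard
    baseline for the binarization methods surveyed; not tied to any one
    paper in the review.
    """
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum = np.cumsum(hist)                          # class-0 pixel counts
    cum_mean = np.cumsum(hist * np.arange(256))    # class-0 intensity sums
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = cum[t - 1], total - cum[t - 1]
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_mean[t - 1] / w0
        m1 = (cum_mean[255] - cum_mean[t - 1]) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels below the returned threshold would be labelled text (ink) and the rest background.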

39 citations

Journal ArticleDOI
TL;DR: This work addresses the problem of separating images through the use of a physical model of the mixture process, which is nonlinear but invertible, and uses the inverse model to perform the separation.

22 citations

Proceedings ArticleDOI
Chunmei Liu
23 Aug 2010
TL;DR: A novel approach to recognizing characters degraded by three kinds of independent degradation sources, composed of two stages: character image quality evaluation and character recognition.
Abstract: Character image quality plays an important role in degraded character recognition, as it indicates recognition difficulty. This paper proposes a novel approach to recognizing characters degraded by three kinds of independent degradation sources. It is composed of two stages: character image quality evaluation and character recognition. First, a dual evaluation assesses the image quality of the input character. Second, according to the evaluation result, the appropriate character recognition sub-system is adaptively invoked. These sub-systems are trained on character sets whose image quality is similar to the input's, each with its own features and classifiers. Experimental results demonstrate that the proposed approach significantly improves the performance of degraded character recognition systems.

4 citations


Cites methods from "Enhancement of Old Manuscript Image..."

  • ...[7] presented an algorithm based on matched wavelets and MRF model to automatically identify and extract the low contrast text regions from scanned manuscript images and enhance them using a histogram matching technique....


Book ChapterDOI
14 Mar 2012
TL;DR: This chapter deals with digital restoration, preservation, and database storage of historical manuscript images, focusing on restoration techniques and binarization methods combined with image processing applied to document images for text-background enhancement and discrimination.
Abstract: This chapter deals with digital restoration, preservation, and database storage of historical manuscript images. It focuses on restoration techniques and binarization methods combined with image processing applied to document images for text-background enhancement and discrimination. Sequential image processing procedures are applied for refinement and enhancement of images categorized by quality class. Research results on historical documents (e.g. Byzantine manuscripts, old newspapers) are presented.

3 citations

Book ChapterDOI
11 Nov 2016
TL;DR: An analysis of the different methods used for the enhancement of degraded ancient images with respect to low resolution, minimal intensity difference between text and background, show-through effects, and uneven background.
Abstract: The article describes the most recent developments in the field of enhancement and digitization of ancient manuscripts and inscriptions. Digitization of ancient sources of information is essential for gaining insight into the rich culture of previous civilizations, which in turn requires a high rate of accuracy in word and character recognition. To enhance the accuracy of an Optical Character Recognition (OCR) system, degraded images must first be made compatible with it: the image is pre-processed by filtering techniques and segmented by thresholding methods, followed by post-processing operations. Digitization of ancient artefacts also preserves the information that lies in ancient manuscripts and can promote tourism by attracting more visitors. This article gives an analysis of the different methods used for the enhancement of degraded ancient images with respect to low resolution, minimal intensity difference between text and background, show-through effects, and uneven background. The techniques reviewed include ICA, NGFICA, cumulants-based ICA, and a novel thresholding technique for text extraction.
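The filtering-then-thresholding pre-processing described above can be illustrated with a 3x3 median filter, a common impulse-noise removal step applied before binarization (a minimal NumPy sketch, not tied to any specific method in the review):

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter: a typical denoising step before thresholding.

    Minimal sketch; border pixels reuse the nearest edge value via padding.
    """
    h, w = img.shape
    padded = np.pad(img, 1, mode='edge')
    # Stack the nine shifted views of the image, one per neighbourhood offset.
    neigh = np.stack([padded[r:r + h, c:c + w]
                      for r in range(3) for c in range(3)])
    return np.median(neigh, axis=0).astype(img.dtype)
```

A salt-and-pepper speck smaller than the window is replaced by the local median, while step edges between text and background are largely preserved.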

1 citation

References
Journal ArticleDOI
TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves that allow important cases of discontinuity preserving energies.
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.
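The energy being minimized can be made concrete. Below, `potts_energy` evaluates a labeling under a data term plus a Potts (discontinuity-preserving) smoothness term; a full graph-cut expansion move is beyond a short sketch, so a greedy ICM pass stands in for the paper's expansion/swap algorithms (all names illustrative):

```python
import numpy as np

def potts_energy(labels, data_cost, lam=1.0):
    """E = sum_p D_p(l_p) + lam * sum_{4-neighbours} [l_p != l_q].

    `labels` is (H, W) int, `data_cost` is (H, W, num_labels).
    """
    H, W, _ = data_cost.shape
    e = data_cost[np.arange(H)[:, None], np.arange(W)[None, :], labels].sum()
    e += lam * (labels[1:, :] != labels[:-1, :]).sum()   # vertical edges
    e += lam * (labels[:, 1:] != labels[:, :-1]).sum()   # horizontal edges
    return float(e)

def icm(labels, data_cost, lam=1.0, iters=5):
    """Greedy single-pixel moves (ICM); a simple stand-in for the large
    expansion/swap moves of the graph-cut algorithms."""
    H, W, L = data_cost.shape
    labels = labels.copy()
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                best, best_e = labels[i, j], np.inf
                for l in range(L):
                    e = data_cost[i, j, l]
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W:
                            e += lam * (l != labels[ni, nj])
                    if e < best_e:
                        best_e, best = e, l
                labels[i, j] = best
    return labels
```

Unlike ICM, the expansion algorithm of the paper changes arbitrarily large sets of pixels per move and comes with a known approximation factor; the energy function itself is the same.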

7,413 citations


"Enhancement of Old Manuscript Image..." refers background or methods in this paper

  • ...The problem of correcting the misclassification can be formulated in terms of energy minimization [9] [2]....


  • ...The first and second terms in the above equation are referred to as Esmooth (interaction energy) and Edata (energy corresponding to the data term) in the literature [9]....


  • ...The discontinuity preserving energy function used here is Potts interaction penalty [9]....


Journal ArticleDOI
TL;DR: This paper addresses the problem of identifying text in noisy document images by treating noise as a separate class and modeling it based on selected features.
Abstract: In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and identifying between handwriting and machine printed text because: 1) Handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques requested for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise and we further exploit context to refine the classification. A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. Experimental results show that our approach is robust and can significantly improve page segmentation in noisy document collections.
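The two-class Fisher classifier used for printed-text/handwriting/noise discrimination can be sketched as follows. The closed-form direction w = Sw⁻¹(m1 − m0) is the textbook formulation, not the authors' exact training procedure, and the features here are placeholders for the paper's selected features:

```python
import numpy as np

def fisher_direction(X0, X1):
    """Two-class Fisher linear discriminant.

    X0, X1 are (n_samples, n_features) arrays for the two classes.
    Returns the projection direction w and a midpoint threshold; a sample x
    is assigned class 1 when x @ w > thresh. Sketch only.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of per-class scatter matrices.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, m1 - m0)
    thresh = w @ (m0 + m1) / 2.0
    return w, thresh
```

In the paper's setting, multiple such classifiers (one per class pair) feed the MRF stage, which then rectifies misclassifications using spatial context.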

195 citations


"Enhancement of Old Manuscript Image..." refers methods in this paper

  • ...Similar approach has been used recently in [8] to refine the results of segmenting the handwritten text, printed text and noise in the document image....


  • ...Thus in our case classification confidence found using Fisher classification is the intuitive choice of Dp [8]....


Journal ArticleDOI
TL;DR: A clustering-based technique has been devised for estimating globally matched wavelet filters using a collection of groundtruth images and a text extraction scheme for the segmentation of document images into text, background, and picture components is extended.
Abstract: In this paper, we have proposed a novel scheme for the extraction of textual areas of an image using globally matched wavelet filters. A clustering-based technique has been devised for estimating globally matched wavelet filters using a collection of groundtruth images. We have extended our text extraction scheme for the segmentation of document images into text, background, and picture components (which include graphics and continuous tone images). Multiple, two-class Fisher classifiers have been used for this purpose. We also exploit contextual information by using a Markov random field formulation-based pixel labeling scheme for refinement of the segmentation results. Experimental results have established effectiveness of our approach.
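The wavelet side of this text extraction scheme can be illustrated with subband energies from one level of a fixed 2-D Haar transform. The paper estimates globally *matched* wavelet filters from groundtruth images instead of using a fixed basis, so this shows only the generic idea: high-frequency subband energies as texture features for text-vs-background labeling (subband naming conventions vary):

```python
import numpy as np

def haar_subband_energies(img):
    """One level of a 2-D Haar decomposition; returns mean energy per subband.

    Expects an image with even height and width. High-frequency subbands
    (LH/HL/HH here) respond strongly to text strokes, weakly to smooth
    background. Illustrative sketch only.
    """
    a = img.astype(float)
    # Pairwise averages/differences along columns (row direction)...
    lo_r = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi_r = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # ...then along rows (column direction), giving the four subbands.
    ll = (lo_r[0::2] + lo_r[1::2]) / 2.0
    lh = (lo_r[0::2] - lo_r[1::2]) / 2.0
    hl = (hi_r[0::2] + hi_r[1::2]) / 2.0
    hh = (hi_r[0::2] - hi_r[1::2]) / 2.0
    return {k: float((v ** 2).mean())
            for k, v in (('LL', ll), ('LH', lh), ('HL', hl), ('HH', hh))}
```

Feature vectors built from such energies, computed per block or per pixel neighbourhood, are what the Fisher classifiers in the scheme would consume.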

159 citations


"Enhancement of Old Manuscript Image..." refers background or methods in this paper

  • ...The problem of correcting the misclassification can be formulated in terms of energy minimization [9] [2]....


  • ...In this paper, we have used the text extraction algorithm designed in [1], [2], a brief overview of which is given below...


Journal ArticleDOI
TL;DR: Methods are presented to design a finite impulse response/infinite impulse response (FIR/IIR) biorthogonal perfect reconstruction filterbank, leading to the estimation of a compactly supported/infinitely supported statistically matched wavelet.
Abstract: This paper presents a new approach for the estimation of wavelets that are matched to a given signal in the statistical sense. Based on this approach, a number of new methods to estimate statistically matched wavelets are proposed. The paper first proposes a new method for the estimation of a statistically matched two-band compactly supported biorthogonal wavelet system. Second, a new method is proposed to estimate a statistically matched semi-orthogonal two-band wavelet system that results in a compactly supported or infinitely supported wavelet. Next, the proposed method of estimating a two-band wavelet system is generalized to the M-band wavelet system. Here, the key idea lies in the estimation of analysis wavelet filters from a given signal. This is similar to a sharpening filter used in image enhancement. The output of the analysis highpass filter branch is viewed as the error in estimating the middle sample from its neighborhood. To minimize this error, a minimum mean square error (MMSE) criterion is employed. Since wavelet expansion acts like a Karhunen-Loève-type expansion for generalized 1/f^β processes, it is assumed that the given signal is a sample function of an mth-order fractional Brownian motion. Therefore, the autocorrelation structure of a generalized 1/f^β process is used in the estimation of analysis filters under the MMSE criterion. Methods are then presented to design a finite impulse response/infinite impulse response (FIR/IIR) biorthogonal perfect reconstruction filterbank, leading to the estimation of a compactly supported/infinitely supported statistically matched wavelet. The proposed methods are very simple. Simulation results validating the proposed theory are presented for different synthetic self-similar signals as well as music and speech clips.
Estimated wavelets for different signals are compared with the standard biorthogonal 9/7 and 5/3 wavelets for the application of compression and are shown to give better results.
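The MMSE prediction-error view at the heart of this estimation can be sketched: the analysis highpass output is the error in linearly predicting a sample from its neighbours, and minimizing the mean square error yields normal (Yule-Walker-type) equations. The sketch below solves them from empirical autocorrelations rather than the fractional-Brownian-motion model the paper assumes:

```python
import numpy as np

def mmse_predictor(x, order=2):
    """Solve the normal equations R a = r for a linear predictor of a sample
    from its `order` predecessors, using empirical autocorrelations.

    The prediction-error filter [1, -a_1, ..., -a_order] plays the role of
    the analysis highpass filter in the MMSE view. Sketch only; the paper
    uses the autocorrelation of a generalized 1/f^beta process instead.
    """
    x = np.asarray(x, float)
    N = len(x)
    # Empirical autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:N - k], x[k:]) / (N - k) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return a
```

For a signal with strong sample-to-sample correlation, the recovered coefficients approximate the generating recursion, and the residual (highpass output) carries only the unpredictable detail.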

80 citations

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A background light intensity normalization algorithm suitable for historical document images that adaptively captures the background with a "best fit" linear function and normalizes the image with respect to the approximation.
Abstract: This work presents a background light intensity normalization algorithm suitable for historical document images. The algorithm uses an adaptive linear function to approximate the uneven background caused by the uneven surface of the document paper, aged color, and the light source of the cameras used for image lifting. Our algorithm adaptively captures the background with a "best fit" linear function and normalizes the image with respect to the approximation. The technique works for both grayscale and color images, with significant improvement in readability.
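The "best fit" linear background idea can be sketched as a least-squares plane fit. The paper's algorithm is adaptive and also handles color; this minimal version fits a single global plane b(x, y) = c0 + c1·x + c2·y to a greyscale image and divides it out (function name illustrative):

```python
import numpy as np

def normalize_background(img):
    """Flatten uneven illumination by dividing out a fitted background plane.

    Fits b(x, y) = c0 + c1*x + c2*y to the whole image by least squares and
    returns img / b, so values near 1.0 mark pixels matching the background
    and darker ink falls below it. Global sketch of an adaptive method.
    """
    H, W = img.shape
    yy, xx = np.mgrid[0:H, 0:W]
    A = np.stack([np.ones(img.size), xx.ravel(), yy.ravel()], axis=1)
    coef, *_ = np.linalg.lstsq(A, img.ravel().astype(float), rcond=None)
    bg = (A @ coef).reshape(H, W)
    return img / np.maximum(bg, 1e-6)
```

A robust variant would fit the plane only to pixels pre-classified as background, so dark text strokes do not bias the estimate.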

77 citations


"Enhancement of Old Manuscript Image..." refers methods in this paper

  • ...We implement an optional background clearing step, described in [3], to make manuscript image more readable....


  • ...Linear function approximations of document background [3], and a foreground-background separation method by using local adaptive analysis [5], are among other methods used in the digital restoration of historical document images....
