Showing papers by "Gaurav Harit published in 2011"


Journal ArticleDOI
TL;DR: A contour-based thinning method for skeletonizing noisy, isolated printed character images, which uses the shape characteristics of text to obtain a skeleton close to the true character shape.

27 citations


Proceedings ArticleDOI
19 Feb 2011
TL;DR: This paper presents a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane, and preliminary results demonstrate the efficacy of this approach.
Abstract: The main challenge in recognizing handwritten characters is to handle large-scale shape variations in the handwriting of different individuals. In this paper, we present a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane. The structural shape of a character is described by the different skeletal convexities of its strokes. Such skeletal convexity acts as an invariant feature for character recognition. Longest common subsequence matching is used for recognition. We have tested our method on a benchmark dataset of handwritten Bengali character images. Preliminary results demonstrate the efficacy of our approach.
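The abstract above names longest common subsequence (LCS) matching as the recognition step but does not spell out the feature encoding. The following is a minimal sketch of how such matching could work, assuming each character has already been reduced to a string of symbols; the symbol alphabet, the template strings, the labels, and the helper names `lcs_length` and `classify` are hypothetical and not taken from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)          # DP row for the prefix of a seen so far
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def classify(query, templates):
    """Return the label whose template has the highest normalized LCS score.

    `templates` maps a label to a feature string (hypothetical encoding).
    """
    def score(template):
        return lcs_length(query, template) / max(len(query), len(template), 1)

    return max(templates, key=lambda label: score(templates[label]))


# Hypothetical convexity codes: U, D, L, R stand for the side a convexity opens to.
templates = {"ka": "ULDR", "kha": "UURD"}
print(classify("ULR", templates))
```

Normalizing by the longer sequence is one possible design choice; it keeps long templates from winning simply because they contain more symbols.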

27 citations


Proceedings ArticleDOI
15 Dec 2011
TL;DR: A novel approach towards character segmentation in a handwritten document is proposed based on the vertex characterization of outer isothetic polygonal covers so that each cover corresponds to a particular word or part of a word.
Abstract: Segmentation of cursive handwriting is one of the most challenging problems in the area of handwritten character recognition. In this paper, we propose a novel approach towards character segmentation in a handwritten document. It is based on the vertex characterization of outer isothetic polygonal covers so that each cover corresponds to a particular word or part of a word. The proposed method has the potential to segment skewed text without deskewing it. Experiments were conducted on Bangla handwriting samples from several individuals. The average success rate is 96.04%. This method can be considered a significant preprocessing step towards the development of a handwritten Bangla OCR system.

15 citations


Journal ArticleDOI
TL;DR: Novel features based on the topography of a character as visible from different viewing directions on a 2D plane, represented as a shape-based graph that acts as an invariant feature set for efficiently discriminating between very similar characters.
Abstract: Feature selection and extraction play an important role in classification problems such as face recognition, signature verification, and optical character recognition (OCR). The performance of OCR depends heavily on the proper selection and extraction of the feature set. In this paper, we present novel features based on the topography of a character as visible from different viewing directions on a 2D plane. By topography of a character we mean the structural features of the strokes and their spatial relations. In this work we develop topographic features of strokes visible with respect to views from different directions (e.g. North, South, East, and West). We consider three types of topographic features: closed regions, convexity of strokes, and straight-line strokes. These features are represented as a shape-based graph, which acts as an invariant feature set for efficiently discriminating between very similar characters. We have tested the proposed method on printed and handwritten Bengali and Hindi character images. Initial results demonstrate the efficacy of our approach.

10 citations


Proceedings ArticleDOI
22 Dec 2011
TL;DR: A novel technique for binarization of degraded documents that works in a multi-scale framework with an adaptive-cum-interpolative thresholding as a modification of Otsu's method, which is found to be robust and appreciably better, as tested by conventional evaluation schemes.
Abstract: A novel technique for binarization of degraded documents is proposed. It works in a multi-scale framework with an adaptive-cum-interpolative thresholding as a modification of Otsu's method. Instead of computing a global threshold value for an input document image, it computes local threshold values for a small set of grid points by observing the intensity pattern of the pixels lying in the corresponding grid cells. The thresholds estimated for these grid points are used, in turn, to compute the threshold values of all the remaining pixels using a fast yet efficient interpolation procedure. To handle noise in degraded images, this grid-based adaptive thresholding is applied at successively reduced scales to obtain a near-optimal binarization as a set of connected components. After post-processing these connected components, we get the final output. Exhaustive experimentation has been carried out with benchmark datasets, including the George Washington corpus of handwritten documents, and also with our own datasets. When compared to other methods, the proposed method is found to be robust and appreciably better, as tested by conventional evaluation schemes.
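The grid-based idea in the abstract above (local Otsu thresholds at a few grid positions, then interpolation for every other pixel) can be illustrated with a small single-scale sketch. It assumes an 8-bit grayscale image given as a list of rows, takes one threshold per square cell at the cell centre, and uses bilinear interpolation; the cell size, the choice of cell centres as grid points, the bilinear scheme, and all function names are assumptions, and the paper's multi-scale pass and connected-component post-processing are not reproduced.

```python
def otsu_threshold(pixels):
    """Otsu's threshold for 8-bit values: maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def cell_thresholds(img, cell):
    """One Otsu threshold per cell x cell block of the image."""
    h, w = len(img), len(img[0])
    grid = []
    for r0 in range(0, h, cell):
        row = []
        for c0 in range(0, w, cell):
            block = [img[r][c]
                     for r in range(r0, min(r0 + cell, h))
                     for c in range(c0, min(c0 + cell, w))]
            row.append(otsu_threshold(block))
        grid.append(row)
    return grid


def interpolate(grid, cell, r, c):
    """Bilinearly interpolate cell-centre thresholds at pixel (r, c)."""
    gy = min(max((r + 0.5) / cell - 0.5, 0.0), len(grid) - 1)
    gx = min(max((c + 0.5) / cell - 0.5, 0.0), len(grid[0]) - 1)
    y0, x0 = int(gy), int(gx)
    y1 = min(y0 + 1, len(grid) - 1)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    fy, fx = gy - y0, gx - x0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def binarize(img, cell=8):
    """Return a 0/1 image: 1 where a pixel is brighter than its local threshold."""
    grid = cell_thresholds(img, cell)
    return [[1 if img[r][c] > interpolate(grid, cell, r, c) else 0
             for c in range(len(img[0]))]
            for r in range(len(img))]
```

A cell containing only one gray level gets a threshold of 0 in this sketch, so flat regions are classified crudely; handling such cells is one reason a real system would need more than this single pass.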

8 citations


Proceedings ArticleDOI
17 Sep 2011
TL;DR: Novel topological features based on the structural shape of a character are presented, which detect the convex-shaped segments formed by the various strokes and represent the character as a spatial layout of convex segments.
Abstract: In this paper, we present novel topological features based on the structural shape of a character. We detect the convex-shaped segments formed by the various strokes. The convex segments are then represented with shape primitives from a repertoire. The character is represented as a spatial layout of convex segments. We formulate feature templates for Bangla characters. A given character is assigned the label of the best matching feature template. We have tested the method on a benchmark dataset of printed and handwritten Bangla basic and compound character images. Our results demonstrate the efficacy of our approach.

8 citations


Journal ArticleDOI
TL;DR: A thinning methodology applicable to character images that is novel in its ability to adapt to local character shape while constructing the thinned skeleton, and that produces fewer spurious branches than other thinning methods.
Abstract: In this paper we propose a thinning methodology applicable to character images. It is novel in terms of its ability to adapt to local character shape while constructing the thinned skeleton. Our method does not produce many of the distortions in the character shapes which normally result from the use of existing thinning algorithms. The proposed thinning methodology is based on the medial axis of the character. The skeleton has a width of one pixel. As a by-product of our thinning approach, the skeleton also gets segmented into strokes in vector form. Hence further stroke segmentation is not required. We have conducted experiments with printed and handwritten characters in several scripts such as English, Bengali, Hindi, Kannada and Tamil. We obtain fewer spurious branches compared to other thinning methods. Our method does not use any kind of post-processing.

8 citations


Journal ArticleDOI
TL;DR: In this paper, the authors developed topographic features of strokes visible with respect to views from different directions (e.g. North, South, East, and West) for optical character recognition (OCR).
Abstract: Feature selection and extraction play an important role in classification problems such as face recognition, signature verification, and optical character recognition (OCR). The performance of OCR depends heavily on the proper selection and extraction of the feature set. In this paper, we present novel features based on the topography of a character as visible from different viewing directions on a 2D plane. By topography of a character we mean the structural features of the strokes and their spatial relations. In this work we develop topographic features of strokes visible with respect to views from different directions (e.g. North, South, East, and West). We consider three types of topographic features: closed regions, convexity of strokes, and straight-line strokes. These features are represented as a shape-based graph, which acts as an invariant feature set for efficiently discriminating between very similar characters. We have tested the proposed method on printed and handwritten Bengali and Hindi character images. Initial results demonstrate the efficacy of our approach.

7 citations


Posted Content
TL;DR: This paper proposes a medial-axis-based thinning strategy for skeletonizing printed and handwritten character images, using the shape characteristics of text to obtain a skeleton close to the true character shape.
Abstract: Thinning of character images is a big challenge. Avoiding the removal of strokes and the introduction of deformities during thinning is a difficult problem. In this paper, we propose a medial-axis-based thinning strategy for skeletonizing printed and handwritten character images. In this method, we use the shape characteristics of text to obtain a skeleton close to the true character shape. This approach helps to preserve the local features and true shape of the character images. The proposed algorithm produces a skeleton that is one pixel wide. As a by-product of our thinning approach, the skeleton also gets segmented into strokes in vector form. Hence further stroke segmentation is not required. Experiments were conducted on printed English and Bengali characters, and we obtain fewer spurious branches than other thinning methods without any post-processing.

4 citations


Book ChapterDOI
02 Jan 2011
TL;DR: Novel features based on the topography of a character as visible from different viewing directions on a 2D plane are presented and tested on printed and handwritten Bengali and Hindi isolated character images.
Abstract: In this paper, we present novel features based on the topography of a character as visible from different viewing directions on a 2D plane. By topography of a character we mean the structural features of the strokes and their spatial relations. In this work we develop topographic features of strokes visible with respect to views from different directions (e.g. North, South, East, and West). We consider three types of topographic features: closed region, convexity of strokes, and straight line strokes. We have tested the proposed method on printed and handwritten Bengali and Hindi isolated character images. Initial results demonstrate the efficacy of our approach.

1 citation