Author

Gaurav Harit

Bio: Gaurav Harit is an academic researcher from Indian Institute of Technology, Jodhpur. The author has contributed to research in the topics Character (mathematics) and Image segmentation. The author has an h-index of 13 and has co-authored 73 publications receiving 523 citations. Previous affiliations of Gaurav Harit include Indian Institutes of Technology and Indian Institute of Technology Delhi.


Papers
Journal ArticleDOI

[...]

TL;DR: A review of OCR work on Indian scripts, mainly Bangla and Devanagari, the two most popular scripts in India, presenting the various methodologies and their reported results.
Abstract: The past few decades have witnessed intensive research on optical character recognition (OCR) for Roman, Chinese, and Japanese scripts. A lot of work has also been reported on OCR for various Indian scripts, such as Devanagari, Bangla, Oriya, Tamil, Telugu, Malayalam, Kannada, Gurmukhi, and Gujarati. In this paper, we present a review of OCR work on Indian scripts, mainly on Bangla and Devanagari, the two most popular scripts in India. We have summarized most of the published papers on this topic and have also analysed the various methodologies and their reported results. Future directions of research in OCR for Indian scripts are also given.

63 citations

Journal ArticleDOI

[...]

TL;DR: The novelty of the approach lies in the formulation of appropriate rules of character decomposition for segmenting the character skeleton into stroke segments and then grouping them for extraction of meaningful shape components.
Abstract: In this paper we propose a novel character recognition method for Bangla compound characters. Accurate recognition of compound characters is a difficult problem due to their complex shapes. Our strategy is to decompose a compound character into skeletal segments. The compound character is then recognized by extracting the convex shape primitives and using a template matching scheme. The novelty of our approach lies in the formulation of appropriate rules of character decomposition for segmenting the character skeleton into stroke segments and then grouping them for extraction of meaningful shape components. Our technique is applicable to both printed and handwritten characters. The proposed method performs well for complex-shaped compound characters, which were confusing to the existing methods.
Highlights: The proper recognition of compound characters is a difficult problem due to their complex shapes. In this paper, we propose a novel character recognition method for Bangla compound characters. Our strategy is to decompose the compound character into simpler shape components. Our technique is applicable to printed and handwritten characters. Experiment is done on printed and handwritten Bangla compound characters.

30 citations

Journal ArticleDOI

[...]

TL;DR: A functional, unobtrusive Indian Sign Language recognition system was implemented with the Microsoft Kinect camera and tested on real-world data, giving a novel, low-cost and easy-to-use application.
Abstract: People with speech disabilities communicate in sign language and therefore have trouble in mingling with the able-bodied. There is a need for an interpretation system which could act as a bridge between them and those who do not know their sign language. A functional unobtrusive Indian sign language recognition system was implemented and tested on real world data. A vocabulary of 140 symbols was collected using 18 subjects, totalling 5041 images. The vocabulary consisted mostly of two-handed signs which were drawn from a wide repertoire of words of technical and daily-use origins. The system was implemented using Microsoft Kinect which enables surrounding light conditions and object colour to have negligible effect on the efficiency of the system. The system proposes a method for a novel, low-cost and easy-to-use application, for Indian Sign Language recognition, using the Microsoft Kinect camera. In the fingerspelling category of our dataset, we achieved above 90% recognition rates for 13 signs and 100% recognition for 3 signs with overall 16 distinct alphabets (A, B, D, E, F, G, H, K, P, R, T, U, W, X, Y, Z) recognised with an average accuracy rate of 90.68%.

28 citations

Proceedings ArticleDOI

[...]

19 Feb 2011
TL;DR: This paper presents a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane, and preliminary results demonstrate the efficacy of this approach.
Abstract: The main challenge in recognizing handwritten characters is to handle large-scale shape variations in the handwriting of different individuals. In this paper, we present a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane. The structural shape of a character is described by different skeletal convexities of character strokes. Such skeletal convexity acts as an invariant feature for character recognition. Longest common subsequence matching is used for recognition. We have tested our method on a benchmark dataset of handwritten Bengali character images. Preliminary results demonstrate the efficacy of our approach.

27 citations
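
The abstract above names longest common subsequence (LCS) matching as the recognition step but does not say how the skeletal convexities are encoded. As a rough illustration only, the sketch below assumes each character has already been reduced to a sequence of convexity symbols; the symbol names, the `classify` helper and the length-normalized score are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching symbols extend the best subsequence ending before both.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of dropping x or dropping y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def classify(query: Sequence[Hashable],
             templates: Dict[str, Sequence[Hashable]]) -> str:
    """Return the label whose template shares the longest normalized LCS with query.

    The normalization by the longer sequence is an assumption made for this sketch.
    """
    def score(label: str) -> float:
        t = templates[label]
        return lcs_length(query, t) / max(len(query), len(t), 1)

    return max(templates, key=score)


# Hypothetical convexity sequences for two character classes.
templates = {
    "class_a": ["up", "left", "down", "left"],
    "class_b": ["down", "right", "up"],
}
label = classify(["up", "left", "left"], templates)
```

Because LCS tolerates inserted or missing symbols, a query can still match its template when a writer adds or drops a stroke segment, which is one plausible reason for choosing it over exact sequence comparison.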

Journal ArticleDOI

[...]

TL;DR: A contour-based thinning method for skeletonizing noisy, printed, isolated character images, which uses the shape characteristics of text to obtain a skeleton that is nearly the same as the true character shape.
Abstract: The digital skeleton of a character image, generated by a thinning method, has a wide range of applications in shape analysis and classification. But thinning of character images is a big challenge. Removal of spurious strokes or deformities in thinning is a difficult problem. In this paper, we propose a contour-based thinning method for skeletonization of noisy, printed, isolated character images. In this method, we use the shape characteristics of text to obtain a skeleton that is nearly the same as the true character shape. This approach helps to preserve the local features and true shapes of the character images. As a by-product of our thinning approach, the skeleton also gets segmented into strokes in vector form. Hence further stroke segmentation is not required. Experiments on printed English, Bengali, Hindi, and Tamil characters give much better results than other thinning methods, without any post-processing.

25 citations


Cited by
Journal Article

[...]

TL;DR: This survey addresses current topics in document image understanding from a technical point of view.
Abstract: The aim of document image understanding is to extract and classify individual data meaningfully from paper-based documents. To date, many methods and approaches have been proposed for the recognition of various kinds of documents, for various technical problems in extending OCR, and for the requirements of practical use. Although the technical research issues of the early stage were regarded as complementary approaches to traditional OCR, which depends on character recognition techniques, the application range and related issues are now widely investigated and should be established progressively. This paper addresses current topics about document image understanding from a technical point of view as a survey.
Key words: document model, top-down, bottom-up, layout structure, logical structure, document types, layout recognition

221 citations

Journal ArticleDOI

[...]

01 Apr 2007
TL;DR: Call for papers for Special Issue of ACM Transactions on Multimedia Computing, Communications and Applications on Interactive Digital Television.
Abstract: Call for papers for Special Issue of ACM Transactions on Multimedia Computing, Communications and Applications on Interactive Digital Television

201 citations

Journal ArticleDOI

[...]

TL;DR: A method for automatically obtaining object representations suitable for retrieval from generic video shots that includes associating regions within a single shot to represent a deforming object and an affine factorization method that copes with motion degeneracy.
Abstract: We describe a method for automatically obtaining object representations suitable for retrieval from generic video shots. The object representation consists of an association of frame regions. These regions provide exemplars of the object's possible visual appearances. Two ideas are developed: (i) associating regions within a single shot to represent a deforming object; (ii) associating regions from the multiple visual aspects of a 3D object, thereby implicitly representing 3D structure. For the association we exploit temporal continuity (tracking) and wide baseline matching of affine covariant regions. In the implementation there are three areas of novelty: First, we describe a method to repair short gaps in tracks. Second, we show how to join tracks across occlusions (where many tracks terminate simultaneously). Third, we develop an affine factorization method that copes with motion degeneracy. We obtain tracks that last throughout the shot, without requiring a 3D reconstruction. The factorization method is used to associate tracks into object-level groups, with common motion. The outcome is that separate parts of an object that are not simultaneously visible (such as the front and back of a car, or the front and side of a face) are associated together. In turn this enables object-level matching and recognition throughout a video. We illustrate the method on the feature film "Groundhog Day." Examples are given for the retrieval of deforming objects (heads, walking people) and rigid objects (vehicles, locations).

158 citations

Journal ArticleDOI

[...]

01 Nov 2011
TL;DR: This paper discusses the state of the art, from the 1970s onward, in machine-printed and handwritten Devanagari optical character recognition (OCR).
Abstract: In India, more than 300 million people use Devanagari script for documentation. There has been a significant improvement in the research related to the recognition of printed as well as handwritten Devanagari text in the past few years. The state of the art, from the 1970s onward, in machine-printed and handwritten Devanagari optical character recognition (OCR) is discussed in this paper. All feature-extraction techniques as well as training, classification and matching techniques useful for the recognition are discussed in various sections of the paper. An attempt is made to address the most important results reported so far and to highlight the beneficial directions of the research to date. Moreover, the paper also contains a comprehensive bibliography of selected papers that appeared in reputed journals and conference proceedings, as an aid for researchers working in the field of Devanagari OCR.

138 citations

Proceedings ArticleDOI

[...]

01 Nov 2017
TL;DR: The proposed method works with high precision on document images with varying layouts that include documents, research papers, and magazines and beats Tesseract's state of the art table detection system by a significant margin.
Abstract: Table detection is a crucial step in many document analysis applications as tables are used for presenting essential information to the reader in a structured manner. It is a hard problem due to varying layouts and encodings of the tables. Researchers have proposed numerous techniques for table detection based on layout analysis of documents. Most of these techniques fail to generalize because they rely on hand engineered features which are not robust to layout variations. In this paper, we have presented a deep learning based method for table detection. In the proposed method, document images are first pre-processed. These images are then fed to a Region Proposal Network followed by a fully connected neural network for table detection. The proposed method works with high precision on document images with varying layouts that include documents, research papers, and magazines. We have done our evaluations on publicly available UNLV dataset where it beats Tesseract's state of the art table detection system by a significant margin.

112 citations