Book Chapter

Detection of structural concavities in character images--a writer-independent approach

TL;DR: A novel technique for detecting concave regions as structural information in character images by analyzing the sequence of discrete turns taken to describe the character stroke, with the added advantage of detecting the same concave regions of a particular character written by different individuals.
Abstract: In this paper, we present a novel technique for detecting concave regions as structural information in character images. The difficulty lies in reporting all concavities irrespective of the viewing direction on the 2D plane. In our approach, we detect concave regions by analyzing the sequence of discrete turns taken to describe the character stroke; hence, the method is view-invariant. The proposed method has the added advantage of detecting the same concave regions of a particular character written by different individuals. We have tested our method on printed and handwritten Bangla and Hindi isolated character images. Initial results demonstrate the efficacy of our approach.
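The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of reading concavities from a sequence of discrete turns. It assumes the stroke is already available as an ordered list of integer points, uses the sign of the 2D cross product as the "turn" at each interior point, and treats a run of same-direction turns as a candidate concave region. The function names `turn_sequence` and `concave_runs` and the `min_turns` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def turn_sequence(points: List[Point]) -> List[int]:
    """Return +1 (left turn), -1 (right turn) or 0 (straight) at each interior point."""
    turns = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        # Sign of the cross product of the incoming and outgoing step vectors.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        turns.append((cross > 0) - (cross < 0))
    return turns


def concave_runs(points: List[Point], min_turns: int = 2) -> List[Tuple[int, int, int]]:
    """Group consecutive same-direction turns into candidate concave regions.

    Assumption for illustration: a region is reported as
    (first_point_index, last_point_index, direction) once at least
    `min_turns` turns share the same direction. Straight steps are skipped.
    """
    runs = []
    direction, start, end, count = 0, 0, 0, 0
    for i, t in enumerate(turn_sequence(points)):
        if t == 0:
            continue  # a straight step neither extends nor breaks a run
        if t == direction:
            count += 1
            end = i
        else:
            if direction != 0 and count >= min_turns:
                runs.append((start + 1, end + 1, direction))
            direction, start, end, count = t, i, i, 1
    if direction != 0 and count >= min_turns:
        runs.append((start + 1, end + 1, direction))
    return runs


# A "U"-shaped stroke: down, across, up (y axis pointing up).
u_stroke = [(0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
# The same stroke rotated by 90 degrees: (x, y) -> (-y, x).
u_rotated = [(-y, x) for x, y in u_stroke]

print(concave_runs(u_stroke))
print(turn_sequence(u_stroke) == turn_sequence(u_rotated))
```

Because the cross product sign is unchanged by rotation, the turn sequence of the rotated stroke matches the original, which is one simple way to see why a turn-based description can be independent of viewing direction. Note that the reported direction flips if the stroke is traversed in the opposite order, so a real system would need a traversal convention.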
Citations
Journal Article
TL;DR: A review of OCR work on Indian scripts, mainly on Bangla and Devanagari—the two most popular scripts in India, and the various methodologies and their reported results are presented.
Abstract: The past few decades have witnessed intensive research on optical character recognition (OCR) for Roman, Chinese, and Japanese scripts. A lot of work has also been reported on OCR for various Indian scripts, such as Devanagari, Bangla, Oriya, Tamil, Telugu, Malayalam, Kannada, Gurmukhi, and Gujarati. In this paper, we present a review of OCR work on Indian scripts, mainly Bangla and Devanagari, the two most popular scripts in India. We summarize most of the published papers on this topic and analyse the various methodologies and their reported results. Future directions of research in OCR for Indian scripts are also given.

70 citations


Cites background from "Detection of structural concavities..."

  • ...Bag et al (2011b) have proposed topological features (Bag et al 2012) to improve the recognition performance for printed and handwritten Bangla basic characters....

    [...]

  • ...Such skeletal convexity acts as an invariant feature for character recognition (Bag et al 2012)....

    [...]

Journal Article
TL;DR: This work presents a probabilistic model using the embedded hidden Markov models (HMMs) for the classification and modeling of perceptual sequences and investigated the progressive iterative approximation for synthesizing the handwritten trace.
Abstract: This paper addresses the problem of online handwriting synthesis. It presents a probabilistic model that uses embedded hidden Markov models (HMMs) for the classification and modeling of perceptual sequences. First, we take a vector of perceptual points as input and seek a class of basic shape probability as output. These perceptual points are needed to draw and recover each basic shape, and each shape is modeled by an HMM built and trained with its components. Each path through these possible control points represents an observation that serves as input for the following step. Second, the detected sequences of observations that represent a segment form an initial HMM, and the concatenation of multiple initial HMMs leads to a global HMM. To classify a global HMM, we encode it by searching for the best path of initial HMMs. The best path is obtained by computing the maximum likelihood over the different basic shapes. To synthesize the handwritten trace and recover the best control point sequences, we investigated progressive iterative approximation. The performance of the proposed model was assessed using script samples extracted from the IRONOFF and MAYASTROON databases. These samples were used to generate the set of control points for training the HMMs. In experiments, the generated trajectories showed good quantitative agreement and approximation, and a more compact representation of the script models was obtained.

7 citations

Proceedings Article
01 May 2014
TL;DR: This work presents a novel methodology to reduce distortion at junction regions adjoining the matra for Bangla script by using geometric properties of the junctions.
Abstract: Thinning, an important preprocessing step for character recognition, is often subject to several kinds of distortion. Junction point distortion is a major imperfection in thinned images, especially for handwritten Indian scripts, because they contain a large number of complicated junctions. Such distortion does not allow optical character recognition (OCR) systems to exploit the properties of these junctions for character recognition. We present a novel methodology to reduce distortion at junction regions adjoining the matra for Bangla script. Our method uses geometric properties of the junctions to solve the problem. We have tested our approach on our own data set, consisting of a variety of isolated handwritten character images by different writers, and have obtained promising results.

1 citation


Cites methods from "Detection of structural concavities..."

  • ...So we find out the convexity of the point pi+2 by the method given below [7]....

    [...]

Book Chapter
05 Dec 2017
TL;DR: A novel rough-set-theoretic model is introduced here to effectuate an unsupervised classification of optical characters with a suboptimal attribute set, called the semi-reduct, which eventually leads to quick and easy discernibility of almost all the characters irrespective of their font style.
Abstract: Most well-known OCR engines, such as Google Tesseract, resort to supervised classification, which causes the system to slow down as the diversity of font styles increases. Hence, to avoid the tedium and pitfalls of training an OCR system without compromising its efficiency, we introduce a novel rough-set-theoretic model. It is designed to perform an unsupervised classification of optical characters with a suboptimal attribute set, called the semi-reduct. The semi-reduct attributes are mostly geometric and topological in nature, each having a small range of discrete values estimated from different combinatorial characteristics of rough-set approximations. This leads to quick and easy discernibility of almost all the characters irrespective of their font style. For a few indiscernible characters, Tesseract features are used, but very sparingly, in the final stages of the OCR pipeline so as to keep the run time of the overall process attractive. Preliminary experimental results demonstrate its further scope and promise.

1 citation

Proceedings Article
01 Dec 2017
TL;DR: This paper presents a computationally efficient technique for character spotting using certain concepts from rough set theory, and shows that the symbols or characters can be quite accurately spotted in the inscription.
Abstract: The ability to spot a few known characters or symbols allows linguists and historians to estimate the era in which an inscription was made. Manual spotting of these characters is laborious and error-prone. Hence, automated character spotting has evolved in recent times, with its own challenges due to the natural wear and tear of inscriptions through aging. In this paper, we present a computationally efficient technique for character spotting using certain concepts from rough set theory. After image binarization, we compute various attributes for the isolated symbols within the ambit of rough sets. To spot a symbol in the inscription, the attribute set of the query symbol is matched with that of the inscribed symbols. We provide the details of the method in this paper and show that the symbols or characters can be spotted quite accurately in the inscription.

1 citation

References
Journal Article
TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.

592 citations


"Detection of structural concavities..." refers background in this paper

  • ...For last few decades, different structural-property extraction methods are reported for Indian OCR systems [12]....

    [...]

Journal Article
TL;DR: A complete Optical Character Recognition (OCR) system for printed Bangla, the fourth most popular script in the world, is presented, and extension of the work to Devnagari, the third most popular script in the world, is discussed.

381 citations


"Detection of structural concavities..." refers methods in this paper

  • ...Chaudhury and Pal [5] proposed a method to extract different character strokes with different orientations in a character image....

    [...]

Journal Article
01 Jul 2000
TL;DR: The reading process has been widely studied and there is a general agreement among researchers that knowledge in different forms and at different levels plays a vital role, which is the underlying philosophy of the Devanagari document recognition system described in this work.
Abstract: The reading process has been widely studied and there is a general agreement among researchers that knowledge in different forms and at different levels plays a vital role. This is the underlying philosophy of the Devanagari document recognition system described in this work. The knowledge sources we use are mostly statistical in nature or in the form of a word dictionary tailored specifically for optical character recognition (OCR). We do not perform any reasoning on these. However, we explore their relative importance and role in the hierarchy. Some of the knowledge sources are acquired a priori by an automated training process while others are extracted from the text as it is processed. A complete Devanagari OCR system has been designed and tested with real-life printed documents of varying size and font. Most of the documents used were photocopies of the original. A performance of approximately 90% correct recognition is achieved.

132 citations


"Detection of structural concavities..." refers methods in this paper

  • ...Methods for the detection of similar type of structural properties are also reported for Devanagari script in the literature [3,8,9]....

    [...]

Journal Article
TL;DR: Curvature properties are extracted after thinning the smoothed character images and filtering the thinned images with a Gaussian kernel, and the unknown samples are classified using a two-stage feed-forward neural-net-based recognition scheme.

98 citations


"Detection of structural concavities..." refers background in this paper

  • ...Dutta and Chaudhuri [7] detected different structural properties, such as junction points, holes, stroke segments, curvature maxima, curvature minima, and inflexion points of character images....

    [...]