Proceedings ArticleDOI

Shape Descriptor Based Document Image Indexing and Symbol Recognition

TL;DR: A novel shape descriptor based on shape context is combined with hierarchical distance based hashing for word and graphical pattern based document image indexing and retrieval; its applicability to the classification of characters and symbols is also demonstrated.
Abstract: In this paper we present a novel shape descriptor based on shape context, which in combination with hierarchical distance based hashing is used for word and graphical pattern based document image indexing and retrieval. The shape descriptor represents the relative arrangement of points sampled on the boundary of the shape of the object. We also demonstrate the applicability of the novel shape descriptor for classification of characters and symbols. For indexing, we provide a new formulation for distance based hierarchical locality sensitive hashing. Experiments have yielded promising results.
Citations
Proceedings ArticleDOI
04 Feb 2013
TL;DR: A generic Optical Character Recognition system for Arabic script languages, called Nabocr, is presented. It is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts, but it can be trained by users for other Arabic script languages.
Abstract: In this paper, we present a generic Optical Character Recognition system for Arabic script languages called Nabocr. Nabocr uses OCR approaches specific for Arabic script recognition. Performing recognition on Arabic script text is relatively more difficult than Latin text due to the nature of Arabic script, which is cursive and context sensitive. Moreover, Arabic script has different writing styles that vary in complexity. Nabocr is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts. However, it can be trained by users to be used for other Arabic script languages. We have evaluated our system's performance for both Urdu and Arabic. In order to evaluate Urdu recognition, we have generated a dataset of Urdu text called UPTI (Urdu Printed Text Image Database), which measures different aspects of a recognition system. The performance of our system for Urdu clean text is 91%. For Arabic clean text, the performance is 86%. Moreover, we have compared the performance of our system against Tesseract's newly released Arabic recognition, and the performance of both systems on clean images is almost the same.

101 citations


Cites methods from "Shape Descriptor Based Document Ima..."

  • ...The contour of a ligature is extracted using an approach proposed by Hassan et al.8 The approach proposes extracting the contour by first applying a logical grid to the shape image as shown in Figure 8....

    [...]

  • ...However, in order to describe the whole ligature, we use an approach proposed by Hassan et al.10 which is as follows: • Dividing the ligature into regions as shown in Figure 10....

    [...]

Journal ArticleDOI
TL;DR: A survey of methods developed by researchers to access document images, covering retrieval based on images such as signatures, logos, machine print, and different fonts.
Abstract: The economic feasibility of creating large databases of document images has left a tremendous need for robust ways to access the information. Printed documents are scanned for archiving or in an attempt to move towards a paperless office and are stored as images. In this paper, we provide a survey of methods developed by researchers to access document images. The survey includes papers covering the current state of the art in research on document image retrieval based on images such as signatures, logos, machine print, and different fonts.

28 citations


Cites background from "Shape Descriptor Based Document Ima..."

  • ...[42] Ehtesham Hassan, Santanu Chaudhury, and M Gopal....

    [...]

Journal ArticleDOI
TL;DR: The scheme presents the extension of distance based hashing to kernel space for generating the indexing structure based on similarity in kernel space using the concept of multiple kernel learning to incorporate multiple features for defining the image indexing space.
Abstract: The paper presents a novel feature based indexing scheme for image collections. The scheme extends distance based hashing to kernel space, generating the indexing structure from similarity in kernel space. The objective of the scheme is to incorporate multiple features for defining the image indexing space using the concept of multiple kernel learning. However, the indexing problems are defined with a unique learning objective; therefore, a novel application of a genetic algorithm is presented for the optimization task. The proposed concept is evaluated extensively by developing a word based document indexing application for Devanagari, Bengali, and English scripts. In addition, the efficacy of the proposed concept is shown by experimental evaluations on handwritten digits and a natural image collection.

22 citations


Cites background from "Shape Descriptor Based Document Ima..."

  • ...Additionally, [41] and [24] have explored the application of approximate nearest neighbour search based retrieval for Indian script document indexing....

    [...]

  • ...In the context of Indian script document image retrieval, various feature extraction schemes have been proposed for word images which exploit global, graph based features [24]–[27]....

    [...]

  • ...The descriptor proposed in [24] represents structural organization of an object in 2-D histogram....

    [...]

Journal ArticleDOI
TL;DR: This paper presents a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks and focuses on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest.
Abstract: The problems associated with automatic analysis of news telecasts are more severe in a country like India, where there are many national and regional language channels, besides English. In this paper, we present a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks. Further, we focus on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest. English keywords are derived from RSS feed and converted to Indian language equivalents for detection in speech and on ticker texts. Restricting the keyword list to a manageable number results in drastic improvement in indexing performance. We present illustrative examples and detailed experimental results to substantiate our claim.

14 citations


Cites result from "Shape Descriptor Based Document Ima..."

  • ...We present illustrative examples and detailed experimental results to substantiate our claim....

    [...]

Journal ArticleDOI
TL;DR: A scheme for multiple feature based identity establishment using multi-kernel learning with a genetic algorithm is presented, and the efficacy of the framework using individual features and combinations of features is demonstrated for Devanagari script input.

11 citations

References
Journal ArticleDOI
TL;DR: This paper presents work on computing shape models that are computationally fast and invariant to basic transformations such as translation, scaling and rotation, and proposes shape detection using a feature called shape context, which is descriptive of the shape of the object.
Abstract: We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by: (1) solving for correspondences between points on the two shapes; (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits, and the COIL data set.

6,693 citations


"Shape Descriptor Based Document Ima..." refers background in this paper

  • ...The histogram Hi(k) is defined as the shape context [4] of point Pi....

    [...]
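The shape context named in the excerpt above can be illustrated with a minimal sketch. It assumes a log-polar histogram of the offsets from a reference point to all other sampled points. The function name `shape_context`, the bin counts, and the choice of inner radius are illustrative assumptions, not values taken from either paper.

```python
import math

def shape_context(points, i, n_r=5, n_theta=12):
    """Log-polar histogram of all other points relative to points[i].

    Returns an n_r x n_theta grid of counts (list of lists).
    Bin counts and radius scaling are assumptions made for illustration.
    """
    px, py = points[i]
    # Offsets from the reference point to every other point.
    rel = [(x - px, y - py) for j, (x, y) in enumerate(points) if j != i]
    dists = [math.hypot(dx, dy) for dx, dy in rel]
    hist = [[0] * n_theta for _ in range(n_r)]
    r_max = max(dists, default=0.0)
    if r_max == 0:
        return hist  # no other points, or all coincide with the reference
    r_min = r_max / 2 ** (n_r - 1)  # assumed inner radius; bin edges double outward
    for (dx, dy), d in zip(rel, dists):
        if d == 0:
            continue  # skip duplicates of the reference point
        # Radial bin on a log2 scale, clamped to [0, n_r - 1].
        r_bin = min(n_r - 1, max(0, math.ceil(math.log2(d / r_min))))
        # Angular bin from the direction of the offset, measured in [0, 2*pi).
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[r_bin][t_bin] += 1
    return hist
```

Calling `shape_context(points, i)` for each sampled boundary point yields one histogram per point; how these are combined into a single descriptor is specific to the paper and is not reproduced here.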

Proceedings Article
07 Sep 1999
TL;DR: A novel hashing-based scheme for approximate similarity search is examined; experiments indicate that it scales well even for a relatively large number of dimensions and improves running time over methods for searching in high-dimensional spaces based on hierarchical tree decomposition.
Abstract: The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. We provide experimental evidence that our method gives significant improvement in running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition. Experimental results also indicate that our scheme scales well even for a relatively large number of dimensions (more than 50).
Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999.

3,705 citations


"Shape Descriptor Based Document Ima..." refers methods in this paper

  • ...The Locality Sensitive Hashing (LSH) algorithm has been successfully applied for solving different nearest neighbor problems in high dimensional space [1]....

    [...]
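The collision property described in the abstract above (close objects collide with much higher probability than distant ones) can be sketched for binary vectors under the Hamming distance using random bit sampling. This is a minimal illustration under that assumption; the helper names `make_hash`, `build_tables`, and `query` and the parameter values are hypothetical.

```python
import random
from collections import defaultdict

def make_hash(dim, k, rng):
    """Pick k random coordinates; the hash of v is the tuple of bits at them."""
    idx = [rng.randrange(dim) for _ in range(k)]
    return lambda v: tuple(v[i] for i in idx)

def build_tables(vectors, dim, k=4, n_tables=8, seed=0):
    """Build n_tables hash tables, each keyed by an independent k-bit hash."""
    rng = random.Random(seed)
    tables = []
    for _ in range(n_tables):
        h = make_hash(dim, k, rng)
        buckets = defaultdict(list)
        for key, v in enumerate(vectors):
            buckets[h(v)].append(key)
        tables.append((h, buckets))
    return tables

def query(tables, q):
    """Return the union of the buckets that q falls into across all tables."""
    candidates = set()
    for h, buckets in tables:
        candidates.update(buckets.get(h(q), []))
    return candidates
```

Two vectors that differ in few bits agree on most sampled coordinates and so tend to share a bucket in at least one table, while a vector and its bitwise complement never do. The candidates returned by `query` would still need to be ranked by the true distance.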

Proceedings ArticleDOI
22 May 1995
TL;DR: A fast algorithm is described that maps objects into points in a k-dimensional space (k is user-defined) such that the dissimilarities are preserved; Multi-Dimensional Scaling (MDS), an older method from pattern recognition, is used as a yardstick.
Abstract: A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in k-d space, using k feature-extraction functions, provided by a domain expert [25]. Thus, we can subsequently use highly fine-tuned spatial access methods (SAMs), to answer several types of queries, including the 'Query By Example' type (which translates to a range query); the 'all pairs' query (which translates to a spatial join [8]); the nearest-neighbor or best-match query, etc. However, designing feature extraction functions can be hard. It is relatively easier for a domain expert to assess the similarity/distance of two objects. Given only the distance information though, it is not obvious how to map objects into points. This is exactly the topic of this paper. We describe a fast algorithm to map objects into points in some k-dimensional space (k is user-defined), such that the dis-similarities are preserved. There are two benefits from this mapping: (a) efficient retrieval, in conjunction with a SAM, as discussed before and (b) visualization and data-mining: the objects can now be plotted as points in 2-d or 3-d space, revealing potential clusters, correlations among attributes and other regularities that data-mining is looking for. We introduce an older method from pattern recognition, namely, Multi-Dimensional Scaling (MDS) [51]; although unsuitable for indexing, we use it as a yardstick for our method. Then, we propose a much faster algorithm to solve the problem at hand, while in addition it allows for indexing. Experiments on real and synthetic data indeed show that the proposed algorithm is significantly faster than MDS (being linear, as opposed to quadratic, on the database size N), while it manages to preserve distances and the overall structure of the data-set.

1,124 citations


"Shape Descriptor Based Document Ima..." refers methods in this paper

  • ...Several methods are available for defining functions that map an arbitrary space (X,D) into the real line R [2]....

    [...]

  • ...In recent works, primarily two approaches have been used for word image matching: pixel level matching and feature based matching [2, 3]....

    [...]

Proceedings ArticleDOI
07 Apr 2008
TL;DR: A novel formulation is presented, that uses statistical observations from sample data to analyze retrieval accuracy and efficiency for the proposed indexing method, and significantly outperforms VP-trees, which are a well-known method for distance-based indexing.
Abstract: A method is proposed for indexing spaces with arbitrary distance measures, so as to achieve efficient approximate nearest neighbor retrieval. Hashing methods, such as locality sensitive hashing (LSH), have been successfully applied for similarity indexing in vector spaces and string spaces under the Hamming distance. The key novelty of the hashing technique proposed here is that it can be applied to spaces with arbitrary distance measures, including non-metric distance measures. First, we describe a domain-independent method for constructing a family of binary hash functions. Then, we use these functions to construct multiple multibit hash tables. We show that the LSH formalism is not applicable for analyzing the behavior of these tables as index structures. We present a novel formulation, that uses statistical observations from sample data to analyze retrieval accuracy and efficiency for the proposed indexing method. Experiments on several real-world data sets demonstrate that our method produces good trade-offs between accuracy and efficiency, and significantly outperforms VP-trees, which are a well-known method for distance-based indexing.

105 citations


"Shape Descriptor Based Document Ima..." refers background in this paper

  • ...The Distance based hashing proposed in [6], is a novel formulation which can be applied for arbitrary distance measures....

    [...]
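A hash function that works with an arbitrary distance measure can be sketched as follows. The sketch assumes a pivot-based line projection, which needs only pairwise distances, followed by thresholding into a single bit. The pivots, thresholds, and function names are hypothetical and chosen only for illustration.

```python
def line_projection(x, x1, x2, dist):
    """Project x onto the 'line' through pivots x1 and x2 using distances only."""
    d12 = dist(x1, x2)
    return (dist(x, x1) ** 2 + d12 ** 2 - dist(x, x2) ** 2) / (2 * d12)

def binary_hash(x, x1, x2, t1, t2, dist):
    """Return 1 if the projection of x falls inside [t1, t2], else 0."""
    return 1 if t1 <= line_projection(x, x1, x2, dist) <= t2 else 0

def hash_key(x, functions, dist):
    """Concatenate several binary hashes into one multi-bit bucket key."""
    return tuple(binary_hash(x, x1, x2, t1, t2, dist)
                 for x1, x2, t1, t2 in functions)

def abs_diff(a, b):
    """Toy distance on the real line, used only for this example."""
    return abs(a - b)

# Hypothetical (pivot1, pivot2, t1, t2) tuples; real thresholds would be
# chosen from sample data rather than fixed by hand.
functions = [(0, 4, 0.0, 2.0), (1, 5, 2.0, 6.0)]
key_a = hash_key(1, functions, abs_diff)
key_b = hash_key(3, functions, abs_diff)
```

Because `dist` is passed in as a plain function, the same code applies to any distance between objects, including a non-metric one, provided the two pivots are at non-zero distance from each other.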

Proceedings ArticleDOI
Su Yang, Yuanyuan Wang
22 Aug 2007
TL;DR: A new pixel-level shape descriptor is proposed that solves the rotation-invariance problem of shape contexts, based on the shift theorem of the Fourier transform, without increasing the computational complexity.
Abstract: We propose a new pixel-level shape descriptor. First, shape contexts are computed. Then, a 2D FFT is performed on each 2D histogram from the shape contexts. Such a scheme solves the rotation-invariance problem of shape contexts, based on the shift theorem of the Fourier transform, without increasing the computational complexity. Theoretical proof and experimental validation are provided.

28 citations


"Shape Descriptor Based Document Ima..." refers background in this paper

  • ...The rotation invariance can be incorporated in the dominant histogram by transforming it to Fourier domain [5]....

    [...]
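The shift theorem behind this rotation invariance can be checked on a toy histogram. The sketch assumes that rotating the shape by a whole number of angular bins corresponds to a circular shift along the angular axis of the 2D histogram, in which case the magnitudes of its 2D discrete Fourier transform are unchanged. The naive transform and the function names are illustrative only; a real implementation would use an FFT routine.

```python
import cmath

def dft2_magnitude(hist):
    """Magnitudes of the 2-D discrete Fourier transform of a small grid."""
    rows, cols = len(hist), len(hist[0])
    out = []
    for u in range(rows):
        row = []
        for v in range(cols):
            s = sum(
                hist[r][c]
                * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                for r in range(rows)
                for c in range(cols)
            )
            row.append(abs(s))
        out.append(row)
    return out

def rotate_bins(hist, shift):
    """Circularly shift the angular axis (columns) by `shift` bins."""
    rotated = []
    for row in hist:
        s = shift % len(row)
        rotated.append(row[-s:] + row[:-s] if s else list(row))
    return rotated

# Toy 2 x 4 histogram: 2 radial bins, 4 angular bins (values are made up).
hist = [[1, 0, 2, 0], [0, 3, 0, 1]]
original = dft2_magnitude(hist)
shifted = dft2_magnitude(rotate_bins(hist, 1))
max_diff = max(abs(a - b)
               for ra, rb in zip(original, shifted)
               for a, b in zip(ra, rb))
```

Here `max_diff` is zero up to floating-point error, since a circular shift changes only the phase of each Fourier coefficient. Rotations that are not a multiple of the angular bin width are only approximately covered by this argument.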