
Showing papers by "Santanu Chaudhury" published in 2005


Proceedings ArticleDOI
31 Aug 2005
TL;DR: A novel scheme for locating text regions in an image based on multiresolution wavelet analysis that does not require any a priori information about the font, font size, scripts, geometric transformation, distortion or background texture is proposed.
Abstract: In this paper we propose a novel scheme for locating text regions in an image. The method is based on multiresolution wavelet analysis. We use matched wavelets to capture the textural characteristics of image regions. A clustering-based approach is proposed for estimating globally matched wavelets (GMWs) for a given collection of images. Using these GMWs, we generate feature vectors for segmentation and identification of text regions in an image. Unlike most other methods, our method does not require any a priori information about the font, font size, script, geometric transformation, distortion or background texture. We have tested our method on various categories of images, including license plates, posters, handwritten documents and document images. The results show the proposed method to be a robust, versatile and effective tool for text extraction from images.

18 citations
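
The feature-extraction step in the abstract above (wavelet responses turned into per-region feature vectors) can be illustrated with a minimal sketch. It rests on assumptions that are not from the paper: a fixed one-level Haar transform stands in for the globally matched wavelets, and the function names `haar_subbands` and `block_energy_features` and the `block` size are hypothetical. The clustering step that would estimate the GMWs and separate text from background is not shown.

```python
import numpy as np

def haar_subbands(img):
    """One level of a 2D Haar transform.

    Returns (approx, horiz_diff, vert_diff, diag_diff). Haar is only a
    stand-in here; the paper estimates matched wavelets instead.
    """
    img = np.asarray(img, dtype=float)
    # Crop to even dimensions so pixels pair up in 2x2 cells.
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    horiz_diff = (a - b + c - d) / 2.0  # differences between neighbouring columns
    vert_diff = (a + b - c - d) / 2.0   # differences between neighbouring rows
    diag_diff = (a - b - c + d) / 2.0
    return approx, horiz_diff, vert_diff, diag_diff

def block_energy_features(img, block=8):
    """Mean squared detail coefficient per block x block tile of each subband."""
    _, hd, vd, dd = haar_subbands(img)
    rows, cols = hd.shape[0] // block, hd.shape[1] // block
    feats = np.zeros((rows, cols, 3))
    for k, band in enumerate((hd, vd, dd)):
        tiles = band[:rows * block, :cols * block].reshape(rows, block, cols, block)
        feats[:, :, k] = (tiles ** 2).mean(axis=(1, 3))
    return feats

# Synthetic image: flat on the left, fine vertical stripes on the right.
img = np.zeros((32, 32))
img[:, 16::2] = 1.0
feats = block_energy_features(img, block=8)
```

In this toy image the striped half produces high energy in the column-difference subband while the flat half produces none, which is the kind of textural contrast a text/non-text classifier could work from.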


Journal ArticleDOI
01 Apr 2005
TL;DR: This paper presents a new online scheme for the recognition and pose estimation of a large isolated 3D object, which may not entirely fit in a camera's field of view, and uses a probabilistic reasoning framework for recognition and next-view planning.
Abstract: Most model-based three-dimensional (3D) object recognition systems use information from a single view of an object. However, a single view may not contain sufficient features to recognize the object unambiguously. Further, two objects may have all views in common with respect to a given feature set, and may be distinguished only through a sequence of views. A further complication arises when an image does not contain a complete view of the object. This paper presents a new online scheme for the recognition and pose estimation of a large isolated 3D object, which may not entirely fit in a camera's field of view. We consider an uncalibrated projective camera whose internal parameters may be varied either unintentionally or on purpose. The scheme uses a probabilistic reasoning framework for recognition and next-view planning. We show results of successful recognition and pose estimation even in cases of a high degree of interpretation ambiguity associated with the initial view.

15 citations
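
The idea in the abstract above, that an ambiguous first view can be resolved by a sequence of views, can be sketched as a plain Bayesian belief update over object hypotheses. This is an illustration only, not the paper's framework: the function `update_belief`, the hypothesis names and all likelihood values are hypothetical, and next-view planning is not modelled.

```python
def update_belief(prior, likelihood):
    """Bayes update: posterior[h] is proportional to prior[h] * P(observation | h)."""
    unnorm = {h: prior[h] * likelihood.get(h, 0.0) for h in prior}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: p / total for h, p in unnorm.items()}

# Two candidate objects, initially equally likely (made-up numbers).
belief = {"object_A": 0.5, "object_B": 0.5}

# View 1 shows features both objects share, so it cannot separate them.
belief = update_belief(belief, {"object_A": 0.8, "object_B": 0.8})

# View 2 shows features far more likely under object_A.
belief = update_belief(belief, {"object_A": 0.9, "object_B": 0.1})
```

After the first update the belief is unchanged, and only the second, more discriminative view shifts it towards `object_A`.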


Journal ArticleDOI
TL;DR: A soft-segmentation visualization scheme to generate pixel partitions from the histogram of MR image data using a connectionist approach and then generate selective visual depictions of pixel partitions using pseudo color based on an appropriate fuzzy membership function is proposed.

13 citations


Proceedings ArticleDOI
31 Aug 2005
TL;DR: This paper makes use of an extension of OWL (Web Ontology Language) to allow encoding of ontologies for document images to support conceptual querying and automated hyperlinking of document images.
Abstract: In this paper, we propose a scheme for accessing document images using an ontology. We make use of an extension of OWL (Web Ontology Language) to allow encoding of ontologies for document images. We experimentally demonstrate that reasoning with the concepts defined in the ontology and their observation models provides a mechanism to support conceptual querying and automated hyperlinking of document images.

10 citations


Book ChapterDOI
20 Dec 2005
TL;DR: A novel framework for formal specification of spatio-temporal relations between media objects using fuzzy membership is presented, along with its use in multimedia ontologies and a reasoning framework for creating media-based descriptions of concepts.
Abstract: This paper presents a novel framework for formal specification of spatio-temporal relations between media objects using fuzzy membership. We illustrate its use in multimedia ontologies and describe a reasoning framework for creating media-based descriptions of concepts.

8 citations
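
As a rough illustration of what a fuzzy temporal relation between two media objects might look like, the sketch below grades how strongly one interval precedes another. The piecewise-linear membership function, the name `fuzzy_before` and the `tolerance` parameter are assumptions made for this example; the abstract does not give the paper's actual membership functions.

```python
def fuzzy_before(end_a, start_b, tolerance=1.0):
    """Degree in [0, 1] to which object A ends before object B starts.

    Hypothetical membership: 1 when the gap is at least `tolerance`,
    0 when B starts at least `tolerance` before A ends, linear in between.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    gap = start_b - end_a
    if gap >= tolerance:
        return 1.0
    if gap <= -tolerance:
        return 0.0
    return (gap + tolerance) / (2.0 * tolerance)

clearly_before = fuzzy_before(end_a=5.0, start_b=7.0)  # gap of 2 s
touching = fuzzy_before(end_a=5.0, start_b=5.0)        # no gap
overlapping = fuzzy_before(end_a=5.0, start_b=3.0)     # B starts 2 s early
```

A graded value like this lets a reasoner treat a caption that appears slightly before an audio clip ends as "mostly after" rather than forcing a hard yes/no decision.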


Proceedings ArticleDOI
31 Aug 2005
TL;DR: A new representation scheme for word images which exploits the structural features of the word image skeleton in the form of a graph called the geometric feature graph (GFG).
Abstract: In this paper, we discuss a new representation scheme for word images which exploits their structural features. The word image features are represented in the form of a graph called the geometric feature graph (GFG). The GFG is encoded in the form of a string which serves as a compressed representation of the word image skeleton. We demonstrate reconstruction and retrieval of word images for 3 different scripts using the GFG string.

5 citations



Journal ArticleDOI
TL;DR: A method for synthesis of views corresponding to translational motion of the camera, which can handle occlusions and changes in visibility in the synthesized views, and gives a characterisation of the viewpoints corresponding to which views can be synthesized.

3 citations


Book ChapterDOI
20 Dec 2005
TL;DR: This paper explores the use of the Token Passing Algorithm with HMMs for simultaneous segmentation and characterization of components when parsing news video sequences into their semantic components using integrated aural and visual features.
Abstract: In this paper we propose a scheme for parsing news video sequences into their semantic components using integrated aural and visual features. We explore the use of the Token Passing Algorithm with HMMs for simultaneous segmentation and characterization of the components. Experimentation with about 100 sequences has shown impressive results.

1 citation
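
The token-passing idea mentioned in the abstract above can be sketched for a single small HMM, where it reduces to Viterbi decoding: each state holds one token (a score plus the path that produced it), tokens are copied along transitions at every frame, and only the best token entering each state survives. The function name, the two-state model and every probability below are hypothetical; the paper's news-video models and audio-visual features are not reproduced here.

```python
import math

def token_passing(obs_loglik, log_trans, log_init):
    """Return the most likely state sequence.

    obs_loglik[t][s]: log-likelihood of frame t under state s.
    log_trans[i][j]:  log-probability of moving from state i to state j.
    log_init[s]:      log-probability of starting in state s.
    """
    n = len(log_init)
    # One token per state: (score so far, path that led here).
    tokens = [(log_init[s] + obs_loglik[0][s], [s]) for s in range(n)]
    for frame in obs_loglik[1:]:
        new_tokens = []
        for j in range(n):
            # Pass every token into state j and keep only the best one.
            score, path = max(
                ((tokens[i][0] + log_trans[i][j], tokens[i][1]) for i in range(n)),
                key=lambda tok: tok[0],
            )
            new_tokens.append((score + frame[j], path + [j]))
        tokens = new_tokens
    return max(tokens, key=lambda tok: tok[0])[1]

# Hypothetical states: 0 = "anchor shot", 1 = "field report".
log = math.log
log_init = [log(0.5), log(0.5)]
log_trans = [[log(0.9), log(0.1)],
             [log(0.1), log(0.9)]]
obs_loglik = [[log(0.9), log(0.1)],
              [log(0.8), log(0.2)],
              [log(0.2), log(0.8)],
              [log(0.1), log(0.9)]]
best_path = token_passing(obs_loglik, log_trans, log_init)
```

The decoded path gives the segment boundary (where the state changes) and the label of each segment in one pass, which is what "simultaneous segmentation and characterization" refers to.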