
Showing papers by "Santanu Chaudhury published in 2008"


Proceedings ArticleDOI
30 Oct 2008
TL;DR: This work uses MOWL, a multimedia extension of the Web Ontology Language (OWL), which is capable of describing domain concepts in terms of their media properties and of capturing the inherent uncertainties involved.
Abstract: In this work, we offer an approach to combine standard multimedia analysis techniques with knowledge drawn from conceptual metadata provided by domain experts of a specialized scholarly domain, to learn a domain-specific multimedia ontology from a set of annotated examples. A standard Bayesian network learning algorithm that learns the structure and parameters of a Bayesian network is extended to include media observables in the learning. An expert group provides domain knowledge to construct a basic ontology of the domain as well as to annotate a set of training videos. These annotations help derive the associations between high-level semantic concepts of the domain and low-level MPEG-7 based features representing the audio-visual content of the videos. We construct a more robust and refined version of this ontology by learning from this set of conceptually annotated videos. To encode this knowledge, we use MOWL, a multimedia extension of the Web Ontology Language (OWL), which is capable of describing domain concepts in terms of their media properties and of capturing the inherent uncertainties involved. We use the ontology-specified knowledge for recognizing concepts relevant to a video, to annotate fresh additions to the video database with relevant concepts in the ontology. These conceptual annotations are used to create hyperlinks in the video collection, to provide an effective video browsing interface to the user.

20 citations
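The probabilistic concept recognition that MOWL enables can be reduced, at its simplest, to Bayesian inference over media observables. The sketch below is a hypothetical naive-Bayes instance of that idea; the concepts, observables, and all probability values are invented for illustration and do not come from the paper.

```python
# Toy sketch: probabilistic concept recognition from media observables,
# in the spirit of MOWL's Bayesian-network reasoning. All concepts,
# observables and probabilities here are hypothetical.

# P(observable present | concept), as a (hypothetical) ontology might specify
likelihood = {
    "dance_performance": {"rhythmic_audio": 0.9, "stage_lighting": 0.8},
    "lecture":           {"rhythmic_audio": 0.1, "stage_lighting": 0.3},
}
prior = {"dance_performance": 0.5, "lecture": 0.5}

def posterior(observed):
    """Naive-Bayes posterior over concepts given detected media observables."""
    scores = {}
    for concept, p in prior.items():
        for obs, present in observed.items():
            l = likelihood[concept][obs]
            p *= l if present else (1.0 - l)
        scores[concept] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# both observables detected in a video segment
post = posterior({"rhythmic_audio": True, "stage_lighting": True})
```

A full Bayesian network would additionally model dependencies among observables and between concepts, which is what the structure-learning step in the paper provides.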


Proceedings ArticleDOI
16 Dec 2008
TL;DR: A video coding scheme based on parametric compression of texture is proposed that achieves up to 54.52% more compression compared to the standard H.264/AVC at similar visual quality.
Abstract: In this paper, a video coding scheme based on parametric compression of texture is proposed. Each macro block is characterized either as an edge block, or as a non-edge block containing texture. The non-edge blocks are coded by modeling them as an auto-regressive (AR) process. By applying the AR model in the spatio-temporal domain, we ensure both spatial and temporal consistency. Edge blocks are encoded using the standard H.264/AVC. The proposed algorithm achieves up to 54.52% more compression compared to the standard H.264/AVC at similar visual quality.

20 citations


Proceedings ArticleDOI
16 Dec 2008
TL;DR: This paper uses video epitomes for segmenting foreground objects from background and applies pLSA for finding correlations among these patches to learn usual activities in the scene and extends it to classify a novel video as usual or unusual.
Abstract: In this paper, we address the problem of unsupervised learning of usual patterns of activities in an area under surveillance and detecting deviant patterns. We use video epitomes for segmenting foreground objects from background and obtain approximate shape, trajectory and temporal information in the form of space-time patches. We apply pLSA for finding correlations among these patches to learn usual activities in the scene. We also extend pLSA to classify a novel video as usual or unusual.

18 citations
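pLSA explains a count matrix (here: videos × patch types) with a small number of latent topics via EM. The toy implementation below shows the standard asymmetric-formulation E and M steps; the count data, sizes, and the reading of topics as "activity patterns" are invented for illustration.

```python
import random

# Toy pLSA: "documents" are videos, "words" are space-time patch types,
# latent topics stand in for activity patterns. Counts are made up.
random.seed(0)
counts = [[8, 7, 0, 1],    # video 0: mostly patch types 0-1
          [7, 8, 1, 0],
          [0, 1, 8, 7],    # video 2: mostly patch types 2-3
          [1, 0, 7, 8]]
D, W, Z = len(counts), len(counts[0]), 2

p_wz = [[random.random() for _ in range(W)] for _ in range(Z)]  # P(w|z)
p_zd = [[1.0 / Z] * Z for _ in range(D)]                        # P(z|d)
for z in range(Z):
    s = sum(p_wz[z]); p_wz[z] = [v / s for v in p_wz[z]]

for _ in range(50):                      # EM iterations
    new_wz = [[0.0] * W for _ in range(Z)]
    new_zd = [[0.0] * Z for _ in range(D)]
    for d in range(D):
        for w in range(W):
            if counts[d][w] == 0:
                continue
            post = [p_wz[z][w] * p_zd[d][z] for z in range(Z)]  # E-step
            s = sum(post)
            for z in range(Z):
                r = counts[d][w] * post[z] / s                  # responsibility
                new_wz[z][w] += r
                new_zd[d][z] += r
    for z in range(Z):                   # M-step normalisation
        s = sum(new_wz[z]); p_wz[z] = [v / s for v in new_wz[z]]
    for d in range(D):
        s = sum(new_zd[d]); p_zd[d] = [v / s for v in new_zd[d]]
```

After EM, videos sharing the same dominant topic exhibit the same "usual" activity; a novel video whose patch distribution fits no learnt topic well would be flagged as unusual.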


Proceedings ArticleDOI
01 Dec 2008
TL;DR: A loss functional is formulated to quantify the discrepancy between the state-transition probabilities in the original video and those in the intended summary video; optimizing this functional produces high-quality summaries that capture user perception.
Abstract: We present a video summarization technique based on supervised learning. Within a class of videos of similar nature, the user provides the desired summaries for a subset of videos. Based on this supervised information, the summaries for the other videos in the same class are generated. We derive frame-transitional features and subsequently represent each frame transition as a state. We then formulate a loss functional to quantify the discrepancy between the state-transition probabilities in the original video and those in the intended summary video, and optimize this functional. We experimentally validate the performance of the technique using cross-validation scores on two different classes of videos, and demonstrate that the proposed technique produces high-quality summaries that capture user perception.

14 citations
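One simple way to quantify the discrepancy between two sets of state-transition probabilities is a row-wise KL divergence between the estimated transition matrices. The sketch below is an illustrative stand-in for the paper's loss functional, not its exact definition; the state sequences and smoothing constant are invented.

```python
import math

# Compare transition probabilities of an "original" state sequence with
# those of candidate summaries, using summed row-wise KL divergence.

def transition_probs(state_seq, n_states):
    """Row-normalised transition matrix estimated from a state sequence."""
    m = [[1e-6] * n_states for _ in range(n_states)]   # small smoothing
    for a, b in zip(state_seq, state_seq[1:]):
        m[a][b] += 1.0
    return [[v / sum(row) for v in row] for row in m]

def kl_loss(p, q):
    """Sum over rows i of KL(p_i || q_i)."""
    return sum(p[i][j] * math.log(p[i][j] / q[i][j])
               for i in range(len(p)) for j in range(len(p)))

original = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]
good_summary = [0, 1, 2, 0, 1, 2]        # preserves the 0 -> 1 -> 2 cycling
bad_summary = [2, 1, 0, 2, 1, 0]         # reverses the transition structure
p = transition_probs(original, 3)
loss_good = kl_loss(p, transition_probs(good_summary, 3))
loss_bad = kl_loss(p, transition_probs(bad_summary, 3))
```

A summarizer would then search over candidate frame selections to minimize this loss against the user-provided training summaries.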


Proceedings ArticleDOI
13 Oct 2008
TL;DR: A scheme for on-line semantic transcoding of the video captured by the smart camera is proposed and a local associative computation based change detection scheme for identifying frames of interest is proposed.
Abstract: Smart cameras are expected to be important components for creating ubiquitous multimedia environments. In this paper, we propose a scheme for on-line semantic transcoding of the video captured by a smart camera. The transcoding process selects frames of importance and regions of interest for use by other processing elements in a ubiquitous computing environment. We propose a local associative computation based change detection scheme for identifying frames of interest. The algorithm also segments out the region of change. The computation is structured for easy implementation in a DSP-based embedded environment. The transcoding scheme enables the camera to communicate only the regions of change in frames of interest to a server or a peer. Consequently, communication and processing overhead is reduced in a networked application environment. Experimental results have established the effectiveness of the transcoding scheme.

10 citations
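The transcoding idea, selecting frames of interest and transmitting only the region of change, can be illustrated with plain frame differencing. Note this is a simplification: the paper's detector is an associative-computation scheme, not the naive differencing below, and the thresholds and frame data are invented.

```python
# Sketch of change-based frame selection: a frame is "of interest" when
# enough pixels differ from the previous frame; the region of change is
# summarised by a bounding box, which is all the camera would transmit.

def changed_pixels(prev, curr, tol=10):
    return [(r, c) for r in range(len(curr)) for c in range(len(curr[0]))
            if abs(curr[r][c] - prev[r][c]) > tol]

def frame_of_interest(prev, curr, min_changed=2):
    """Return the bounding box of change, or None if the frame is static."""
    pts = changed_pixels(prev, curr)
    if len(pts) < min_changed:
        return None
    rows = [p[0] for p in pts]
    cols = [p[1] for p in pts]
    return (min(rows), min(cols), max(rows), max(cols))

frame_a = [[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0],
           [0, 200, 180, 0],
           [0, 190, 0, 0]]
box = frame_of_interest(frame_a, frame_b)      # region to transmit
static = frame_of_interest(frame_a, frame_a)   # no change: frame is skipped
```

Sending only `box`-sized crops for frames that pass the test is what yields the reduced communication overhead described in the abstract.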


Journal ArticleDOI
TL;DR: A handcrafted fuzzy rule-based system for segmentation and identification of different tissue types in magnetic resonance (MR) brain images using a combination of histogram and spatial neighborhood-based features to handle variations and variability in features corresponding to different types of tissues.

10 citations
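A fuzzy rule-based tissue classifier of this kind assigns each voxel the tissue whose membership function its features activate most strongly. The sketch below uses only triangular membership functions over a single normalised intensity feature; the tissue breakpoints are invented, and the paper's system additionally uses histogram and spatial-neighbourhood features.

```python
# Toy fuzzy-rule classifier: triangular memberships over (hypothetical,
# normalised) MR intensities; the winning tissue has highest membership.

def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, peaking at 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# invented intensity ranges for three brain tissue types
rules = {
    "csf":          lambda x: tri(x, 0.0, 0.15, 0.35),
    "grey_matter":  lambda x: tri(x, 0.25, 0.45, 0.65),
    "white_matter": lambda x: tri(x, 0.55, 0.75, 1.0),
}

def classify(intensity):
    memberships = {t: f(intensity) for t, f in rules.items()}
    return max(memberships, key=memberships.get), memberships

label, m = classify(0.72)    # a bright voxel
```

The overlap between adjacent membership functions is what lets such systems handle the intensity variability across tissue boundaries that the TL;DR mentions.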


Proceedings ArticleDOI
01 Dec 2008
TL;DR: A classifier unifying local-feature-based representation and subspace-based learning is presented; the system supports hierarchical classification by merging kernel eigenspaces (KES) in the feature space, demonstrated on a dataset of videos collected from the internet.
Abstract: We present a classifier unifying local-feature-based representation and subspace-based learning. We also propose a novel method to merge kernel eigenspaces (KES) in feature space. Subspace methods have traditionally been used with the full appearance of the image. Recently, the local-feature-based bag-of-features (BoF) representation has performed impressively on classification tasks. We use KES with BoF vectors to construct class-specific subspaces and use the distance of a query vector from the database KESs as the classification criterion. The use of local features makes our approach invariant to illumination, rotation, scale, small affine transformations and partial occlusions. The system allows hierarchy by merging the KES in the feature space. The classifier performs competitively on the challenging Caltech-101 dataset under normal and simulated occlusion conditions. We demonstrate the hierarchy on a dataset of videos collected from the internet.

8 citations
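The classification criterion, distance of a query vector from each class subspace, can be shown with a plain linear nearest-subspace classifier. The paper uses kernel eigenspaces rather than this linear version, and the 4-D "BoF histograms" and class names below are toy inventions.

```python
# Nearest-subspace classification: each class is an orthonormal basis of
# its training vectors; the query goes to the class whose subspace
# reconstructs it with the smallest residual.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vecs):
    """Orthonormal basis for the span of vecs."""
    basis = []
    for v in vecs:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        n = dot(w, w) ** 0.5
        if n > 1e-9:
            basis.append([wi / n for wi in w])
    return basis

def residual(q, basis):
    """Distance from q to the span of the orthonormal basis."""
    proj = [0.0] * len(q)
    for b in basis:
        c = dot(q, b)
        proj = [p + c * bi for p, bi in zip(proj, b)]
    diff = [a - b for a, b in zip(q, proj)]
    return dot(diff, diff) ** 0.5

classes = {
    "bikes": gram_schmidt([[1, 0, 0, 0], [0, 1, 0, 0]]),
    "cars":  gram_schmidt([[0, 0, 1, 0], [0, 0, 0, 1]]),
}
query = [0.9, 0.4, 0.1, 0.0]    # toy BoF histogram, closer to "bikes"
pred = min(classes, key=lambda c: residual(query, classes[c]))
```

Merging two classes' training vectors and re-orthonormalizing gives a coarser parent-class subspace, which is the linear analogue of the KES-merging hierarchy in the paper.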


Proceedings ArticleDOI
16 Dec 2008
TL;DR: A novel framework for automated analysis of surveillance videos is proposed that applies cluster algebra to mine a multi-level summary of the video from multiple perspectives and adapts association learning to automatically select the components that make an event unusual.
Abstract: In this paper, we propose a novel framework for automated analysis of surveillance videos. By analysis, we imply summarizing and mining the information in the video for learning usual patterns and discovering unusual ones. We approach this video analysis problem by acknowledging that a video contains information at multiple levels and in multiple attributes. Each such component, and the co-occurrences of these component values, plays an important role in characterizing an event as usual or unusual. Therefore, we cluster the video data at multiple levels of abstraction and in multiple attributes, and view these clusters as a summary of the information in the video. We apply cluster algebra to mine this summary from multiple perspectives and adapt association learning to automatically select the components that make an event unusual. We also propose a novel incremental clustering algorithm.

8 citations
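Incremental clustering, processing each observation once as it arrives, is commonly done in the leader-follower style: join the nearest cluster if it is within a threshold, otherwise start a new one. The sketch below shows that pattern with an online centroid update; the paper's own incremental algorithm may differ in its update rule, and the points and threshold are invented.

```python
# Leader-follower style incremental clustering with online centroid updates.

def incremental_cluster(points, threshold):
    clusters = []                        # list of (centroid, count)
    labels = []
    for p in points:
        best, best_d = None, None
        for i, (c, _) in enumerate(clusters):
            d = sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
            if best_d is None or d < best_d:
                best, best_d = i, d
        if best is not None and best_d <= threshold:
            c, n = clusters[best]        # online running-mean update
            clusters[best] = ([(ci * n + pi) / (n + 1)
                               for ci, pi in zip(c, p)], n + 1)
            labels.append(best)
        else:                            # too far from everything: new cluster
            clusters.append((list(p), 1))
            labels.append(len(clusters) - 1)
    return labels, clusters

# toy 2-D event attributes forming two groups
pts = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.2)]
labels, clusters = incremental_cluster(pts, threshold=1.0)
```

A single pass suffices, which is what makes the scheme suitable for continuously arriving surveillance data; events falling in small or singleton clusters are natural candidates for "unusual".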


Proceedings ArticleDOI
01 Dec 2008
TL;DR: The novelty of the approach presented in this paper is the unique object-based video coding framework for videos obtained from a static camera that does not require explicit 2D or 3D models of objects and is general enough to satisfy the need for varying types of objects in the scene.
Abstract: The novelty of the approach presented in this paper is a unique object-based video coding framework for videos obtained from a static camera. As opposed to most existing methods, the proposed method does not require explicit 2D or 3D models of objects and hence is general enough to handle varying types of objects in the scene. The proposed system detects and tracks each object in the scene by learning its appearance model online using a non-traditional uniform-norm-based subspace. At the same time, the object is coded using its projection coefficients onto the orthonormal basis of the learnt subspace. The tracker incorporates a predictive framework based on a Kalman filter for predicting the five motion parameters. The proposed method shows substantially better compression than MPEG-2 based coding with almost no additional complexity.

4 citations
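The predictive part of such a tracker can be illustrated with a minimal constant-velocity Kalman filter on one motion parameter (the paper predicts five). The measurements and noise settings below are invented, and this scalar sketch omits the appearance-subspace side of the system entirely.

```python
# Minimal 1-D constant-velocity Kalman filter: filter noisy position
# measurements and predict the next position.

def kalman_track(measurements, q=1e-3, r=0.25):
    x, v = measurements[0], 0.0              # state: position and velocity
    p00, p01, p11 = 1.0, 0.0, 1.0            # state covariance entries
    estimates = []
    for z in measurements[1:]:
        # predict one step ahead (dt = 1, constant velocity)
        x, v = x + v, v
        p00, p01, p11 = p00 + 2 * p01 + p11 + q, p01 + p11, p11 + q
        # correct with the new position measurement z
        s = p00 + r                          # innovation variance
        k0, k1 = p00 / s, p01 / s            # Kalman gain
        y = z - x                            # innovation
        x, v = x + k0 * y, v + k1 * y
        p00, p01, p11 = (1 - k0) * p00, (1 - k0) * p01, p11 - k1 * p01
        estimates.append(x)
    return estimates, x + v                  # filtered positions, next prediction

measurements = [0.0, 1.0, 2.1, 2.9, 4.0]     # noisy positions, invented
track, predicted = kalman_track(measurements)
```

The one-step prediction gives the coder a search window for the object in the next frame before the measurement arrives, which is the role the predictive framework plays in the tracker.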


Proceedings ArticleDOI
16 Dec 2008
TL;DR: A novel framework for object detection and localization in images containing appreciable clutter and occlusions is proposed and a method similar to the recently proposed spatial scan statistic is used to refine the object localization estimates obtained from the sampling process.
Abstract: We propose a novel framework for object detection and localization in images containing appreciable clutter and occlusions. The problem is cast in a statistical hypothesis testing framework. The image under test is converted into a set of local features using affine invariant local region detectors, described using the popular SIFT descriptor. Due to clutter and occlusions, this set is expected to contain features which do not belong to the object. We sample subsets of local features from this set and test for the alternate hypothesis of object present against the null hypothesis of object absent. Further, we use a method similar to the recently proposed spatial scan statistic to refine the object localization estimates obtained from the sampling process. We demonstrate the results of our method on the two datasets TUD Motorbikes and TUD Cars. TUD Cars database has background clutter. TUD Motorbikes dataset is recognized to have substantial variation in terms of scale, background, illumination, viewpoint and occlusions.
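The localization refinement can be pictured as a 1-D scan: slide a window over feature positions and score each placement by how densely it covers matched features. This is only a loose analogue of a spatial scan statistic (which uses a likelihood ratio over 2-D regions); the positions, match flags, window size, and scoring rule below are all invented.

```python
# Scan-style localisation sketch: features are (position, is_match) pairs;
# find the window origin with the best matched-feature concentration.

def best_window(features, width):
    """Return (window_start, score) for the best-scoring window."""
    best = (None, -1.0)
    for start, _ in sorted(features):            # candidate window origins
        inside = [m for x, m in features if start <= x < start + width]
        m = sum(inside)
        if not inside or m == 0:
            continue
        score = m * m / len(inside)              # match count weighted by purity
        if score > best[1]:
            best = (start, score)
    return best

# matches (is_match=1) cluster around x in [10, 15); clutter elsewhere
feats = [(2, 0), (4, 0), (10, 1), (11, 1), (13, 1), (20, 0), (22, 1)]
origin, score = best_window(feats, width=5)
```

Weighting the count by purity keeps a lone matched feature in clutter from outscoring a dense cluster of matches, which is the intuition behind scanning for regions of anomalously high match density.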

Proceedings ArticleDOI
16 Dec 2008
TL;DR: This paper shows how the 3D structure of multiple objects and their affine repetitions may be computed from a single image and used for synthesizing new views of the scene.
Abstract: Symmetry and affine repetitions are common in scenes with man-made structures. In this paper we propose a technique to exploit affine repetitions in a 3D scene for reconstruction and view synthesis from a single image. Assuming three vanishing points in the image, we show how the 3D structure of multiple objects and their affine repetitions may be computed and used for synthesizing new views. The reconstructed objects may also be inserted in other scenes to create augmented images.
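A first step in this kind of single-view analysis is locating vanishing points, which can be computed as intersections of the images of parallel scene lines using homogeneous coordinates. The line segments below are invented for illustration; the paper assumes three such vanishing points are available.

```python
# Vanishing point as the intersection of two image lines, each the image
# of a parallel scene edge, via homogeneous-coordinate cross products.

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(seg1, seg2):
    """Intersection of the two segments' supporting lines (assumes a finite VP)."""
    vp = cross(line_through(*seg1), line_through(*seg2))
    return (vp[0] / vp[2], vp[1] / vp[2])

# two images of parallel scene edges, converging toward (10, 0)
vp = vanishing_point(((0.0, 5.0), (5.0, 2.5)),
                     ((0.0, -5.0), (5.0, -2.5)))
```

With three such vanishing points fixing the scene's principal directions, relative 3D structure of the repeated objects can then be recovered up to scale.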

Journal ArticleDOI
TL;DR: A new scheme for media feature based concept modelling is proposed to address the limitation of traditional ontology based multimedia retrieval systems and supports probabilistic evidential reasoning for robust concept recognition in multimedia documents.
Abstract: This paper presents a new approach to building distributed digital libraries with multimedia contents. The authors propose a new scheme for media-feature-based concept modelling to address the limitations of traditional ontology-based multimedia retrieval systems. The perceptual models can be used for semantic query processing using standard MPEG-7 media content descriptions. The authors have defined a new ontology language, M-OWL (Multimedia Web Ontology Language), to support this perceptual modelling. M-OWL is an extension of OWL (Web Ontology Language) with new constructs for formal representation of the media properties of domain concepts. It supports probabilistic evidential reasoning for robust concept recognition in multimedia documents. The separation of perceptual modelling of concepts from the repository architecture enables seamless integration of diverse multimedia contents. SOA (Service-Oriented Architecture) is used to integrate a large number of distributed information sources, each of which is modelled as an intelligent information agent. The authors have demonstrated the capability of the architecture by building a few research prototypes, namely a virtual encyclopaedia of Indian culture, a document image repository and a multimedia portal.