Showing papers by "Rajeev Sharma published in 2004"


Proceedings ArticleDOI
27 Jun 2004
TL;DR: This paper presents a novel approach to recognizing the six universal facial expressions from visual data and using them to derive the level of interest based on psychological evidence, employing a two-step classification built on top of refined optical flow computed from a sequence of images.
Abstract: This paper presents a novel approach to recognizing the six universal facial expressions from visual data and using them to derive the level of interest, drawing on psychological evidence. The proposed approach relies on a two-step classification built on top of refined optical flow computed from a sequence of images. First, a bank of linear classifiers was applied at the frame level, and the output of this stage was coalesced to produce a temporal signature for each observation. Second, the temporal signatures computed from the training data set were used to train discrete hidden Markov models (HMMs) to learn the underlying model for each universal facial expression. The average recognition rate of the proposed facial expression classifier is 90.9% without classifier fusion and 91.2% with fusion, using a five-fold cross-validation scheme on a database of 488 video sequences covering 97 subjects. Recognized facial expressions were combined with the intensity of activity (motion) around the apex frame to measure the level of interest. To further illustrate the efficacy of the proposed approach, two sets of experiments were conducted: analysis of television (TV) broadcast data (108 facial expression sequences with severe lighting conditions and diverse subjects and expressions) and emotion elicitation with 21 subjects.
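As a rough illustration of the two-step scheme (frame-level linear classifiers whose outputs are coalesced into a temporal signature, then one HMM per expression), the following Python sketch stubs out the optical-flow feature extraction and substitutes hmmlearn's GaussianHMM for the discrete HMMs used in the paper; all function and variable names are hypothetical.

```python
# Minimal sketch of the two-step classification: per-frame class probabilities form
# a temporal signature, and one HMM per expression is trained on those signatures.
# Assumption: optical-flow feature extraction is done elsewhere; GaussianHMM is a
# stand-in for the paper's discrete HMMs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from hmmlearn.hmm import GaussianHMM

EXPRESSIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]

def fit_frame_classifier(frame_features, frame_labels):
    """Stage 1: a bank of linear classifiers, approximated by one multinomial logistic model."""
    return LogisticRegression(max_iter=1000).fit(frame_features, frame_labels)

def temporal_signature(sequence_features, frame_classifier):
    """Coalesce per-frame class probabilities into a temporal signature (n_frames x 6)."""
    return frame_classifier.predict_proba(sequence_features)

def train_hmms(signatures_by_expression, n_states=3):
    """Stage 2: one HMM per universal expression, trained on that expression's signatures."""
    hmms = {}
    for expression, sigs in signatures_by_expression.items():
        X = np.vstack(sigs)                  # concatenated signatures
        lengths = [len(s) for s in sigs]     # per-sequence lengths for hmmlearn
        hmms[expression] = GaussianHMM(n_components=n_states).fit(X, lengths)
    return hmms

def classify_sequence(sequence_features, frame_classifier, hmms):
    """Label a new sequence with the expression whose HMM gives the highest log-likelihood."""
    sig = temporal_signature(sequence_features, frame_classifier)
    return max(hmms, key=lambda expression: hmms[expression].score(sig))
```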

92 citations


Patent
22 Oct 2004
TL;DR: In this article, a system and method for automatically extracting demographic information from images is presented; it detects the face in an image, locates its components, extracts component features, and then classifies the components to identify the age, gender, or ethnicity of the person(s) in the image.
Abstract: The present invention includes a system and method for automatically extracting demographic information from images. The system detects the face in an image, locates its components, extracts component features, and then classifies the components to identify the age, gender, or ethnicity of the person(s) in the image. Using components for demographic classification gives better results than currently known techniques. Moreover, the described system and technique can extract demographic information more robustly than currently known methods, in environments with a high degree of variability in size, shape, color, texture, pose, and occlusion. The invention also performs classifier fusion, using data-level fusion and multi-level classification to fuse the results of the various component demographic classifiers. Besides use as an automated data collection system, in which the demographic category of a person is determined automatically from the given facial information, the system could also be used for targeted advertising, surveillance, human-computer interaction, security enhancements, immersive computer games, and improving user interfaces based on demographic information.
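A minimal sketch of how component-based classification with a fusion stage might look, assuming component detection and feature extraction are already available. The two-level scheme shown (per-component probability outputs fed to a second-level classifier) is one plausible reading of data-level fusion and multi-level classification, not the patent's actual implementation; all names are illustrative.

```python
# Sketch of component-wise demographic classification with a fusion stage.
# Assumptions: face/component detection and feature extraction are stubbed;
# the demographic labels (e.g., gender classes) are supplied by the caller.
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

COMPONENTS = ["eyes", "nose", "mouth", "whole_face"]

def train_component_classifiers(features_by_component, labels):
    """One probabilistic classifier per facial component."""
    return {c: SVC(probability=True).fit(features_by_component[c], labels)
            for c in COMPONENTS}

def train_fusion_classifier(features_by_component, labels, component_clfs):
    """Second level: fuse the per-component probability outputs."""
    stacked = np.hstack([component_clfs[c].predict_proba(features_by_component[c])
                         for c in COMPONENTS])
    return LogisticRegression(max_iter=1000).fit(stacked, labels)

def predict_demographics(features_by_component, component_clfs, fusion_clf):
    """Classify new faces from their component features via the fused model."""
    stacked = np.hstack([component_clfs[c].predict_proba(features_by_component[c])
                         for c in COMPONENTS])
    return fusion_clf.predict(stacked)
```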

63 citations


Patent
12 Apr 2004
TL;DR: In this paper, a system and method for modeling faces from images captured from a single or a plurality of image capturing systems at different times is presented, where the method first determines the demographics of the person being imaged and then selects an approximate three-dimensional face model from a set of models.
Abstract: The present invention is a system and method for modeling faces from images captured by a single image capturing system, or a plurality of them, at different times. The method first determines the demographics of the person being imaged. This demographic classification is then used to select an approximate three-dimensional face model from a set of models. Using this initial model and properties of camera projection, the model is adjusted, yielding a more accurate face model.
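A minimal numpy sketch of the idea, under strong simplifying assumptions: a mean 3D landmark model is selected by demographic class and then fit to observed 2D landmarks with a weak-perspective (scaled orthographic) camera model. The model bank, landmark correspondences, and function names are all hypothetical, not the patent's method.

```python
# Sketch: pick an approximate 3D face (landmark) model by demographic class, then
# adjust it to observed 2D landmarks assuming a weak-perspective camera.
import numpy as np

def select_model(demographic_label, model_bank):
    """model_bank: dict mapping demographic label -> (n_landmarks, 3) mean shape."""
    return model_bank[demographic_label]

def fit_weak_perspective(model_3d, landmarks_2d):
    """Estimate scale s and 2D translation t so that s * X[:, :2] + t ~ landmarks_2d."""
    X = model_3d[:, :2]                        # orthographic projection of the model
    Xc = X - X.mean(axis=0)
    Yc = landmarks_2d - landmarks_2d.mean(axis=0)
    s = (Xc * Yc).sum() / (Xc ** 2).sum()      # least-squares scale
    t = landmarks_2d.mean(axis=0) - s * X.mean(axis=0)
    return s, t

def adjusted_model(model_3d, landmarks_2d):
    """Rescale the selected model to be consistent with the observed image landmarks."""
    s, _ = fit_weak_perspective(model_3d, landmarks_2d)
    return s * model_3d
```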

57 citations


Proceedings ArticleDOI
19 Jul 2004
TL;DR: A model-acquisition framework is presented for acquiring articulated models directly from monocular video; in particular, it can process human as well as non-human targets and makes no assumptions about the structure or complexity of the kinematic tree.
Abstract: Past research on model-based tracking of articulated targets has neglected the problems of model acquisition and initialization. However, for model-based approaches to become practical and autonomous, these important issues need to be addressed. Towards this goal, this paper presents a model-acquisition framework for acquiring articulated models directly from monocular video. The structure, shape, and appearance of the articulated models are all estimated. In addition, the initialization problem is solved by estimating pose information for at least one frame of a sequence, allowing subsequent model-based tracking. The presented work is based on only basic assumptions and hence is not restricted to specific types of targets; in particular, it can process human as well as non-human targets and makes no assumptions about the structure or complexity of the kinematic tree. This work therefore presents a set of systematic solutions to the problems of model acquisition and initialization that bridge the gap between state-of-the-art model-based tracking approaches and practical applications.
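One heavily simplified ingredient of such model acquisition, sketched below under the assumption that 2D feature trajectories have already been tracked (e.g., with a KLT tracker): points that move together are grouped into candidate rigid parts by clustering their frame-to-frame displacements. This is an illustrative step only, not the paper's pipeline.

```python
# Group tracked 2D feature trajectories into candidate rigid parts by clustering
# their motion; part assignment is a precursor to estimating an articulated model.
import numpy as np
from sklearn.cluster import KMeans

def group_trajectories(trajectories, n_parts=3):
    """
    trajectories: array of shape (n_points, n_frames, 2) with tracked 2D positions.
    Returns a candidate part label per point, clustering on per-frame displacements
    so that points that move together end up in the same part.
    """
    displacements = np.diff(trajectories, axis=1)           # (n_points, n_frames-1, 2)
    motion_features = displacements.reshape(len(trajectories), -1)
    return KMeans(n_clusters=n_parts, n_init=10).fit_predict(motion_features)
```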

33 citations


Proceedings ArticleDOI
13 Oct 2004
TL;DR: A novel interface system for accessing geospatial data (GeoMIP) has been developed that realizes a user-centered multimodal speech/gesture interface for addressing some of the critical needs in crisis management.
Abstract: A novel interface system for accessing geospatial data (GeoMIP) has been developed that realizes a user-centered multimodal speech/gesture interface addressing some of the critical needs in crisis management. For this system we primarily developed vision-based sensing algorithms, speech integration, multimodal fusion, and rule-based mapping of multimodal user input to GIS database queries. A demonstration system based on this interface has been developed for the Port Authority NJ/NY and is described here.
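The rule-based mapping from fused speech/gesture input to a GIS query could look roughly like the sketch below; the command phrases, layer names, and query structure are hypothetical illustrations rather than GeoMIP's actual rules.

```python
# Sketch of rule-based multimodal fusion: a recognized speech command plus a
# pointing-gesture location are mapped to a GIS query dictionary.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class MultimodalInput:
    speech: str                                    # e.g. "show hospitals near here"
    gesture_point: Optional[Tuple[float, float]]   # map coordinates from pointing

RULES = {
    "show hospitals": {"layer": "hospitals", "action": "select"},
    "zoom here":      {"layer": None,        "action": "zoom"},
}

def to_gis_query(inp: MultimodalInput, radius_km: float = 2.0):
    """Match the speech command against the rule table and bind the gesture deictic."""
    for phrase, rule in RULES.items():
        if phrase in inp.speech.lower():
            query = dict(rule)
            if inp.gesture_point is not None:
                query["center"] = inp.gesture_point
                query["radius_km"] = radius_km
            return query
    return None

# Example: "show hospitals near here" plus a point gesture at (40.71, -74.01)
# resolves to {"layer": "hospitals", "action": "select", "center": (...), "radius_km": 2.0}.
```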

22 citations


Journal ArticleDOI
TL;DR: An integrated framework is presented for detecting and tracking multiple objects in a computationally efficient manner; in particular, a neural network-based face detector is employed to detect faces and to compute a person-specific statistical model of skin color from the face regions.
Abstract: Automatic initialization and tracking of multiple people and their body parts is one of the first steps in designing interactive multimedia applications. The key problems in this context are robust detection and tracking of people and their body parts in an unconstrained environment. This paper presents an integrated framework that addresses detection and tracking of multiple objects in a computationally efficient manner. In particular, a neural network-based face detector was employed to detect faces and to compute a person-specific statistical model of skin color from the face regions. A probabilistic model was proposed to fuse the color and motion information to localize the moving body parts (hands). A multiple hypothesis tracking (MHT) algorithm was adopted to track the face and hands. In real-world scenes, the extracted features (face and hands) usually contain spurious measurements that create unconvincing trajectories and needless computation. To deal with this problem, a path coherence function was incorporated into the MHT to reduce the number of hypotheses, which in turn reduces the computational cost and improves the structure of the trajectories. The performance of the framework was validated through experiments on synthetic and real image sequences.
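Two of the ingredients, sketched in Python under simplifying assumptions: a person-specific skin-color model fit over the detected face region, and a pixel-wise fusion of the color likelihood with frame-difference motion evidence. The face rectangle is assumed to come from an external detector (the paper uses a neural-network detector), and the fusion weighting is illustrative.

```python
# Person-specific skin-color model from a detected face region, plus a simple
# pixel-wise fusion of color and motion cues for localizing moving body parts.
import cv2
import numpy as np

def skin_model_from_face(frame_bgr, face_box):
    """Fit a Gaussian over hue/saturation inside the detected face rectangle."""
    x, y, w, h = face_box
    hsv = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    pixels = hsv[..., :2].reshape(-1, 2).astype(np.float64)
    return pixels.mean(axis=0), np.cov(pixels, rowvar=False) + 1e-6 * np.eye(2)

def color_likelihood(frame_bgr, mean, cov):
    """Per-pixel skin likelihood under the person-specific Gaussian color model."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)[..., :2].astype(np.float64)
    diff = hsv - mean
    inv = np.linalg.inv(cov)
    mahal = np.einsum("...i,ij,...j->...", diff, inv, diff)
    return np.exp(-0.5 * mahal)

def fused_map(frame_bgr, prev_gray, mean, cov, alpha=0.5):
    """Combine skin-color likelihood with frame-difference motion evidence."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY).astype(np.float64)
    motion = np.abs(gray - prev_gray)
    motion /= motion.max() + 1e-9
    return alpha * color_likelihood(frame_bgr, mean, cov) + (1 - alpha) * motion
```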

14 citations


Proceedings ArticleDOI
24 May 2004
TL;DR: This demonstration presents initial progress towards supporting geocollaborative activities, focusing on one type of collaboration involving crisis managers in the field coordinating with those in an emergency operation center (EOC).
Abstract: Managing large-scale and distributed crisis events is a national priority, and one that presents information technology challenges to the responsible government agencies. Geographical information systems (with their ability to map out evolving crisis events, affected human and infrastructure assets, and actions taken and resources applied) have been indispensable in all stages of crisis management. Their use, however, has been mostly confined to single users within single agencies. The potential for maps and related geospatial technologies to serve as the media for collaborative activities among distributed agencies and teams has been discussed [1-4], but feasible technological infrastructure and tools are not yet available. An interdisciplinary team from Penn State University (comprising GIScientists, information scientists, and computer scientists), currently funded by the NSF/DG program, has joined efforts with collaborators from federal, state, and local agencies to develop an approach to, and technology to support, "GeoCollaborative Crisis Management" (NSF-EIA-0306845). The dual goals of this project are: (1) to understand the roles of geographical information in distributed crisis management activities; and (2) to develop enabling geospatial information technologies and human-computer systems to facilitate geocollaborative crisis management. This demonstration presents initial progress towards supporting geocollaborative activities, focusing on one type of collaboration involving crisis managers in the field coordinating with those in an emergency operation center (EOC).

13 citations


Proceedings ArticleDOI
27 Jun 2004
TL;DR: Experimental results show that the proposed non-parametric approach to the ICA problem, which is robust to outlier effects, is able to separate sources in the presence of outliers, whereas existing algorithms such as Jade and Infomax break down under such conditions.
Abstract: Learning using independent component analysis (ICA) has found a wide range of applications in computer vision and pattern analysis, ranging from face recognition to speech separation. This paper presents a non-parametric approach to the ICA problem that is robust to outlier effects. The algorithm, for the first time in the field of ICA, adopts an intuitive and direct approach that focuses on the very definition of independence itself: the joint probability density function (pdf) of independent sources is factorial over the marginal distributions. In the proposed algorithm, kernel density estimation is employed to approximate the underlying distributions. Our algorithm has two major advantages. First, existing algorithms learn the independent components by attempting to fulfill necessary (but not sufficient) conditions for independence. For example, the Jade algorithm attempts to approximate independence by minimizing higher-order statistics, which are not robust to outliers. In comparison, our technique is inherently robust to outlier effects. Second, since the learning employs kernel density estimation, it is naturally free of assumptions about the source distributions (unlike the Infomax algorithm). Experimental results show that the algorithm is able to separate sources in the presence of outliers, whereas existing algorithms such as Jade and Infomax break down under such conditions. The results also show that the proposed non-parametric approach is largely independent of the source distribution. In addition, it is able to separate non-Gaussian zero-kurtotic signals, unlike traditional ICA algorithms such as Jade and Infomax.
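A minimal sketch of the underlying idea for two whitened signals: after whitening, ICA reduces to finding a rotation, and the rotation angle is chosen here to minimize the gap between the joint kernel density estimate and the product of the marginal estimates, i.e., the definition of independence applied directly. This is an illustrative reconstruction using a grid search, not the paper's exact estimator.

```python
# KDE-based independence measure and a brute-force demixing rotation for 2 sources.
import numpy as np
from scipy.stats import gaussian_kde

def dependence_score(y):
    """Mean |log joint - sum of log marginals| over the samples (near 0 if independent)."""
    joint = gaussian_kde(y)                        # y has shape (2, n_samples)
    marginals = [gaussian_kde(y[i]) for i in range(2)]
    log_joint = joint.logpdf(y)
    log_prod = marginals[0].logpdf(y[0]) + marginals[1].logpdf(y[1])
    return np.mean(np.abs(log_joint - log_prod))

def rotation(angle):
    return np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])

def demix(x_whitened, n_angles=90):
    """Grid-search the demixing rotation for two whitened, mixed signals."""
    best_angle = min(np.linspace(0.0, np.pi / 2, n_angles),
                     key=lambda a: dependence_score(rotation(a) @ x_whitened))
    return rotation(best_angle) @ x_whitened
```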

6 citations


Proceedings ArticleDOI
13 Oct 2004
TL;DR: A same-time different-place collaboration system for managing crisis situations using geospatial information that enables distributed spatial decision-making by providing a multimodal interface to team members is demonstrated.
Abstract: We demonstrate a same-time different-place collaboration system for managing crisis situations using geospatial information. Our system enables distributed spatial decision-making by providing a multimodal interface to team members. Decision makers in front of large screen displays and/or desktop computers, and emergency responders in the field with tablet PCs can engage in collaborative activities for situation assessment and emergency response.

4 citations


Book ChapterDOI
01 Jan 2004
TL;DR: This work utilized concepts from robot assembly planning to develop a systematic framework for presenting augmentation stimuli in the assembly domain, and employed computer vision methods for marker-less recognition of assembly objects to provide sensing.
Abstract: We consider the problem of scene augmentation in the context of a human assembling an object from its components. In order to exploit the potential of augmented reality (AR) in this context, two main problems need to be considered: designing an effective augmentation scheme for information presentation/control, and providing accurate and fast sensing to determine the state of the assembly. We utilized concepts from robot assembly planning to develop a systematic framework for presenting augmentation stimuli in the assembly domain. An interactive augmentation design and control engine called AUDIT is described. To provide sensing, we utilized computer vision methods for assembly object recognition without special markers. Even though fiducials currently constitute the only feasible vision-based solution, occlusion by the manipulator as well as by other assembly parts makes the use of more general computer vision techniques desirable. Here, we investigate computer vision techniques with the goal of eventually substituting for markers. Constraints from the assembly domain, together with transformation-space search-based algorithms, make the problem tractable.
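A generic sketch of a transformation-space search for marker-less part recognition: a binary part template is scored against an edge image over a discretized grid of rotations and translations, and the best-scoring pose is kept. This illustrates the idea only and is not the AUDIT system's implementation; the scoring function and discretization are illustrative.

```python
# Exhaustive transformation-space search: score a binary part template against an
# edge image over sampled rotations and translations; return the best pose found.
import numpy as np
from scipy.ndimage import rotate

def best_pose(edge_image, template, angles=range(0, 360, 15), stride=8):
    """Score = number of edge pixels covered by the rotated, translated template."""
    best_score, best = -np.inf, None
    H, W = edge_image.shape
    for angle in angles:
        t = rotate(template.astype(float), angle, reshape=True) > 0.5
        th, tw = t.shape
        for y in range(0, H - th + 1, stride):
            for x in range(0, W - tw + 1, stride):
                score = np.sum(edge_image[y:y + th, x:x + tw] * t)
                if score > best_score:
                    best_score, best = score, (angle, y, x)
    return best
```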

2 citations


Proceedings ArticleDOI
24 May 2004
TL;DR: The need to develop information science and technology to support crisis management has never been more apparent, and federal, state, and local government agencies must develop coordinated strategies and adopt advanced and usable technologies to prepare for and cope with crises in contexts ranging from natural disasters to homeland security.
Abstract: The need to develop information science and technology to support crisis management has never been more apparent. Federal, state, and local government agencies must develop coordinated strategies and adopt advanced and usable technologies to prepare for and cope with crises in contexts ranging from natural disasters to homeland security.