Author

Rajeev Sharma

Bio: Rajeev Sharma is an academic researcher from Pennsylvania State University. The author has contributed to research in topics including gesture and gesture recognition. The author has an h-index of 34 and has co-authored 107 publications receiving 5,446 citations. Previous affiliations of Rajeev Sharma include the University of Illinois at Urbana–Champaign.


Papers
Journal ArticleDOI
TL;DR: A survey of the literature on computer vision-based interpretation of hand gestures for human-computer interaction, organized by the method used for modeling, analyzing, and recognizing gestures, contrasting 3D hand models with appearance-based models and discussing implemented gestural systems and directions for future research.
Abstract: The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction (HCI). In particular, visual interpretation of hand gestures can help in achieving the ease and naturalness desired for HCI. This has motivated a very active research area concerned with computer vision-based analysis and interpretation of hand gestures. We survey the literature on visual interpretation of hand gestures in the context of its role in HCI. This discussion is organized on the basis of the method used for modeling, analyzing, and recognizing gestures. Important differences in the gesture interpretation approaches arise depending on whether a 3D model of the human hand or an image appearance model of the human hand is used. 3D hand models offer a way of more elaborate modeling of hand gestures but lead to computational hurdles that have not been overcome given the real-time requirements of HCI. Appearance-based models lead to computationally efficient "purposive" approaches that work well under constrained situations but seem to lack the generality desirable for HCI. We also discuss implemented gestural systems as well as other potential applications of vision-based gesture recognition. Although the current progress is encouraging, further theoretical as well as computational advances are needed before gestures can be widely used for HCI. We discuss directions of future research in gesture recognition, including its integration with other natural modes of human-computer interaction.
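
As an illustration of the appearance-based category that the survey contrasts with 3D hand models, the following is a minimal sketch of such a pipeline: crude skin-color segmentation, a few 2D shape features, and nearest-template matching. The thresholds, features, gesture templates, and synthetic frame are assumptions for illustration, not any system described in the paper.

```python
import numpy as np

def segment_hand(frame_rgb, r_min=110, g_max=160, b_max=160):
    """Crude skin-color thresholding: returns a boolean hand mask."""
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    return (r > r_min) & (g < g_max) & (b < b_max)

def appearance_features(mask):
    """Simple 2D appearance features: normalized area, centroid, elongation."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return np.zeros(4)
    area = xs.size / mask.size
    cx, cy = xs.mean() / mask.shape[1], ys.mean() / mask.shape[0]
    elongation = (xs.std() + 1e-6) / (ys.std() + 1e-6)
    return np.array([area, cx, cy, elongation])

def classify(features, templates):
    """Nearest-template gesture label (templates: name -> feature vector)."""
    return min(templates, key=lambda name: np.linalg.norm(features - templates[name]))

# Toy usage: a synthetic frame with a skin-colored blob and two made-up templates.
frame = np.zeros((120, 160, 3), dtype=np.uint8)
frame[40:80, 60:100] = (180, 120, 100)
feats = appearance_features(segment_hand(frame))
templates = {"open_hand": np.array([0.10, 0.5, 0.5, 1.0]),
             "point":     np.array([0.03, 0.5, 0.5, 3.0])}
print(classify(feats, templates))   # -> "open_hand" for this blob
```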

1,973 citations

Journal ArticleDOI
01 May 1998
TL;DR: It is clear that further research is needed for interpreting and fusing multiple sensing modalities in the context of HCI, particularly regarding the fundamental issues in integrating them at various levels, from early signal level to intermediate feature level to late decision level.
Abstract: Recent advances in various signal processing technologies, coupled with an explosion in the available computing power, have given rise to a number of novel human-computer interaction (HCI) modalities: speech, vision-based gesture recognition, eye tracking, electroencephalograph, etc. Successful embodiment of these modalities into an interface has the potential of easing the HCI bottleneck that has become noticeable with the advances in computing and communication. It has also become increasingly evident that the difficulties encountered in the analysis and interpretation of individual sensing modalities may be overcome by integrating them into a multimodal human-computer interface. We examine several promising directions toward achieving multimodal HCI. We consider some of the emerging novel input modalities for HCI and the fundamental issues in integrating them at various levels, from early signal level to intermediate feature level to late decision level. We discuss the different computational approaches that may be applied at the different levels of modality integration. We also briefly review several demonstrated multimodal HCI systems and applications. Despite all the recent developments, it is clear that further research is needed for interpreting and fusing multiple sensing modalities in the context of HCI. This research can benefit from many disparate fields of study that increase our understanding of the different human communication modalities and their potential role in HCI.
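
To make the levels of integration concrete, here is a minimal sketch of feature-level (early) versus decision-level (late) fusion of two modalities. The toy speech and gesture feature vectors and the stand-in linear classifiers are assumptions, not the computational approaches surveyed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
speech_feat  = rng.normal(size=8)    # e.g., acoustic/word features
gesture_feat = rng.normal(size=6)    # e.g., hand-trajectory features

def classify(x, n_classes=3, seed=1):
    """Toy linear classifier returning per-class scores (softmax)."""
    w = np.random.default_rng(seed).normal(size=(n_classes, x.size))
    z = w @ x
    e = np.exp(z - z.max())
    return e / e.sum()

# Feature-level (early) fusion: concatenate the modalities, classify once.
early = classify(np.concatenate([speech_feat, gesture_feat]))

# Decision-level (late) fusion: classify each modality separately, then
# combine the decisions (here a simple product-of-posteriors rule).
late = classify(speech_feat, seed=2) * classify(gesture_feat, seed=3)
late /= late.sum()

print("early fusion:", early.argmax(), "late fusion:", late.argmax())
```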

330 citations

Journal ArticleDOI
TL;DR: A spatio-temporal approach to recognizing six universal facial expressions from visual data and using them to compute levels of interest is presented; the computed levels of interest were found to be consistent with "ground truth" information in most cases.
Abstract: This paper presents a spatio-temporal approach to recognizing six universal facial expressions from visual data and using them to compute levels of interest. The classification approach relies on a two-step strategy on top of projected facial motion vectors obtained from video sequences of facial expressions. First, a linear classification bank was applied to projected optical flow vectors, and the decisions made by the linear classifiers were coalesced to produce a characteristic signature for each universal facial expression. The signatures thus computed from the training data set were used to train discrete hidden Markov models (HMMs) to learn the underlying model for each facial expression. The performance of the proposed facial expression recognition approach was evaluated using five-fold cross-validation on the Cohn-Kanade facial expression database, which consists of 488 video sequences from 97 subjects. The proposed approach achieved an average recognition rate of 90.9% on this database. Recognized facial expressions were mapped to levels of interest using the affect space and the intensity of motion around the apex frame. The computed level of interest was subjectively analyzed and was found to be consistent with "ground truth" information in most cases. To further illustrate the efficacy of the proposed approach, and also to better understand the effects of a number of factors that are detrimental to facial expression recognition, a number of experiments were conducted. The first empirical analysis was conducted on a database of 108 facial expressions collected from TV broadcasts and labeled by human coders for subsequent analysis. The second experiment (emotion elicitation) was conducted on facial expressions obtained from 21 subjects by showing them six different movie clips chosen to arouse spontaneous emotional reactions that would produce natural facial expressions.
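
A minimal sketch of the two-step idea: a bank of linear classifiers over projected optical-flow vectors produces a per-frame signature, and a sequence-level model selects the expression. The feature dimensions, random weights, and the nearest-signature scoring are toy assumptions; the paper trains discrete HMMs on the signatures rather than the simple matching used here.

```python
import numpy as np

N_EXPR = 6                                   # the six universal expressions
rng = np.random.default_rng(0)
W = rng.normal(size=(N_EXPR, 20))            # one toy linear classifier per expression

def frame_signature(flow_proj):
    """Characteristic signature: per-expression scores for one frame."""
    return W @ flow_proj                     # shape (6,)

def sequence_signature(flow_seq):
    """Average the per-frame signatures over the video sequence."""
    return np.mean([frame_signature(f) for f in flow_seq], axis=0)

def recognize(flow_seq, expression_models):
    """Pick the expression whose stored signature is closest (HMM stand-in)."""
    s = sequence_signature(flow_seq)
    return min(expression_models,
               key=lambda name: np.linalg.norm(s - expression_models[name]))

# Toy usage: a 30-frame sequence of 20-D projected optical-flow vectors.
video = rng.normal(size=(30, 20))
models = {f"expr_{i}": rng.normal(size=N_EXPR) for i in range(N_EXPR)}
print(recognize(video, models))
```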

246 citations

Proceedings ArticleDOI
08 Nov 2002
TL;DR: The research reported here attempts to overcome analyst-driven, menu-controlled, keyboard- and mouse-operated GIS by designing a multimodal, multi-user GIS interface that puts geospatial data directly in the hands of decision makers.
Abstract: Geospatial information is critical to effective, collaborative decision-making during emergency management situations; however, conventional GIS are not suited for multi-user access and high-level abstract queries. Currently, decision makers do not always have the real-time information they need; GIS analysts produce maps at the request of individual decision makers, often leading to overlapping requests with slow delivery times. In order to overcome these limitations, a paradigm shift in interface design for GIS is needed. The research reported here attempts to overcome analyst-driven, menu-controlled, keyboard- and mouse-operated GIS by designing a multimodal, multi-user GIS interface that puts geospatial data directly in the hands of decision makers. A large-screen display is used for data visualization, and collaborative, multi-user interactions in emergency management are supported through voice and gesture recognition. Speech and gesture recognition is coupled with a knowledge-based dialogue management system for storing and retrieving geospatial data. This paper describes the first prototype and the insights gained for human-centered multimodal GIS interface design.
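
A minimal sketch of the kind of speech-gesture fusion such an interface performs: a spoken request with a deictic reference ("here") is resolved against a pointing gesture on the map to form a structured geospatial query. The command grammar, layer names, and coordinates are illustrative assumptions, not the prototype's dialogue manager.

```python
from dataclasses import dataclass

@dataclass
class GestureEvent:
    """A deictic (pointing) gesture resolved to map coordinates."""
    lon: float
    lat: float

def fuse(speech: str, gesture: GestureEvent):
    """Resolve a deictic reference in the speech using the pointing gesture."""
    layer = "hospitals" if "hospital" in speech else "shelters"
    if "here" in speech or "this area" in speech:
        return {"layer": layer, "center": (gesture.lon, gesture.lat),
                "radius_km": 5}
    return {"layer": layer, "center": None, "radius_km": None}

# "Show me the hospitals near here" plus a pointing gesture on the large display.
query = fuse("show me the hospitals near here", GestureEvent(-77.86, 40.79))
print(query)   # {'layer': 'hospitals', 'center': (-77.86, 40.79), 'radius_km': 5}
```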

164 citations

Journal ArticleDOI
08 Sep 2003
TL;DR: The importance of multimodal interfaces in various aspects of crisis management is established and many issues in realizing successful speech-gesture driven, dialogue-enabled interfaces for crisis management are explored.
Abstract: Emergency response requires strategic assessment of risks, decisions, and communications that are time critical while requiring teams of individuals to have fast access to large volumes of complex information and technologies that enable tightly coordinated work. The access to this information by crisis management teams in emergency operations centers can be facilitated through various human-computer interfaces. Unfortunately, these interfaces are hard to use, require extensive training, and often impede rather than support teamwork. Dialogue-enabled devices, based on natural, multimodal interfaces, have the potential of making a variety of information technology tools accessible during crisis management. This paper establishes the importance of multimodal interfaces in various aspects of crisis management and explores many issues in realizing successful speech-gesture driven, dialogue-enabled interfaces for crisis management. This paper is organized in five parts. The first part discusses the needs of crisis management that can be potentially met by the development of appropriate interfaces. The second part discusses the issues related to the design and development of multimodal interfaces in the context of crisis management. The third part discusses the state of the art in both the theories and practices involving these human-computer interfaces. In particular, it describes the evolution and implementation details of two representative systems, Crisis Management (XISM) and Dialog Assisted Visual Environment for Geoinformation (DAVE_G). The fourth part speculates on the short-term and long-term research directions that will help address the outstanding challenges in interfaces that support dialogue and collaboration. Finally, the fifth part concludes the paper.

159 citations


Cited by
Journal ArticleDOI
Ronald Azuma
TL;DR: The characteristics of augmented reality systems are described, including a detailed discussion of the tradeoffs between optical and video blending approaches; registration and sensing errors are identified as two of the biggest problems in building effective systems, and current efforts to overcome them are summarized.
Abstract: This paper surveys the field of augmented reality (AR), in which 3D virtual objects are integrated into a 3D real environment in real time. It describes the medical, manufacturing, visualization, path planning, entertainment, and military applications that have been explored. This paper describes the characteristics of augmented reality systems, including a detailed discussion of the tradeoffs between optical and video blending approaches. Registration and sensing errors are two of the biggest problems in building effective augmented reality systems, so this paper summarizes current efforts to overcome these problems. Future directions and areas requiring further research are discussed. This survey provides a starting point for anyone interested in researching or using augmented reality.
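
To illustrate why registration is so demanding, here is a minimal sketch showing how a small error in the estimated camera pose shifts where a virtual point is drawn in the image. The camera intrinsics, pose error, and anchored point are illustrative assumptions.

```python
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])   # toy intrinsics

def project(point_w, R, t):
    """Pinhole projection of a world point given camera rotation R and translation t."""
    p_c = R @ point_w + t
    u = K @ (p_c / p_c[2])
    return u[:2]

virtual_point = np.array([0.0, 0.0, 2.0])          # anchored 2 m in front of the camera
R_true, t_true = np.eye(3), np.zeros(3)

# A 1-degree error in the estimated camera yaw:
th = np.deg2rad(1.0)
R_est = np.array([[np.cos(th), 0, np.sin(th)],
                  [0, 1, 0],
                  [-np.sin(th), 0, np.cos(th)]])

err = project(virtual_point, R_est, t_true) - project(virtual_point, R_true, t_true)
print("registration error (pixels):", np.linalg.norm(err))   # roughly 14 px
```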

8,053 citations

MonographDOI
01 Jan 2006
TL;DR: This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms, into planning under differential constraints that arise when automating the motions of virtually any mechanical system.
Abstract: Planning algorithms are impacting technical disciplines and industries around the world, including robotics, computer-aided design, manufacturing, computer graphics, aerospace applications, drug design, and protein folding. This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms. The treatment is centered on robot motion planning but integrates material on planning in discrete spaces. A major part of the book is devoted to planning under uncertainty, including decision theory, Markov decision processes, and information spaces, which are the “configuration spaces” of all sensor-based planning problems. The last part of the book delves into planning under differential constraints that arise when automating the motions of virtually any mechanical system. Developed from courses taught by the author, the book is intended for students, engineers, and researchers in robotics, artificial intelligence, and control theory as well as computer graphics, algorithms, and computational biology.
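
As a small, concrete instance of the planning-under-uncertainty material, here is a minimal value-iteration sketch for a tiny Markov decision process. The random transition model, rewards, and discount factor are toy assumptions, not an example from the book.

```python
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
# P[a, s, s'] = transition probability; R[s, a] = immediate reward.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.normal(size=(n_states, n_actions))

V = np.zeros(n_states)
for _ in range(200):                       # iterate the Bellman optimality backup
    Q = R + gamma * np.einsum("asn,n->sa", P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)                  # greedy policy w.r.t. the converged values
print("values:", np.round(V, 3), "policy:", policy)
```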

6,340 citations

Journal ArticleDOI
01 Oct 1996
TL;DR: This article provides a tutorial introduction to visual servo control of robotic manipulators by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process.
Abstract: This article provides a tutorial introduction to visual servo control of robotic manipulators. Since the topic spans many disciplines our goal is limited to providing a basic conceptual framework. We begin by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process. We then present a taxonomy of visual servo control systems. The two major classes of systems, position-based and image-based systems, are then discussed in detail. Since any visual servo system must be capable of tracking image features in a sequence of images, we also include an overview of feature-based and correlation-based methods for tracking. We conclude the tutorial with a number of observations on the current directions of the research field of visual servo control.
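
A minimal sketch of the image-based variant discussed in the tutorial: stack the interaction matrices (image Jacobians) of the tracked point features and command the camera velocity v = -λ L⁺ (s - s*). The feature coordinates and depths below are illustrative assumptions.

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix of one normalized image point at depth Z."""
    return np.array([[-1/Z, 0, x/Z, x*y, -(1 + x**2), y],
                     [0, -1/Z, y/Z, 1 + y**2, -x*y, -x]])

def ibvs_velocity(features, desired, depths, lam=0.5):
    """Stack the point Jacobians and compute the 6-DOF camera velocity command."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    error = (np.asarray(features) - np.asarray(desired)).ravel()
    return -lam * np.linalg.pinv(L) @ error

# Toy usage: three tracked points slightly off their desired image positions.
current = [(0.10, 0.05), (-0.12, 0.04), (0.02, -0.11)]
desired = [(0.08, 0.06), (-0.10, 0.02), (0.00, -0.10)]
depths  = [1.0, 1.2, 0.9]
print(ibvs_velocity(current, desired, depths))   # [vx, vy, vz, wx, wy, wz]
```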

3,619 citations

Book
01 Jan 2006
TL;DR: A textbook treatment of robot modeling and control, covering rigid motions and homogeneous transformations, forward, inverse, and velocity kinematics (the Jacobian), dynamics, independent-joint and multivariable control, force control, geometric nonlinear control, and vision-based control.
Abstract: Preface. 1. Introduction. 2. Rigid Motions and Homogeneous Transformations. 3. Forward and Inverse Kinematics. 4. Velocity Kinematics-The Jacobian. 5. Path and Trajectory Planning. 6. Independent Joint Control. 7. Dynamics. 8. Multivariable Control. 9. Force Control. 10. Geometric Nonlinear Control. 11. Computer Vision. 12. Vision-Based Control. Appendix A: Trigonometry. Appendix B: Linear Algebra. Appendix C: Dynamical Systems. Appendix D: Lyapunov Stability. Index.
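
A minimal sketch tying together the forward-kinematics and velocity-kinematics (Jacobian) chapters for a two-link planar arm; the link lengths and joint values are illustrative, not an example from the book.

```python
import numpy as np

def fk_2link(q, l1=1.0, l2=0.7):
    """End-effector position of a planar 2R arm (forward kinematics)."""
    x = l1*np.cos(q[0]) + l2*np.cos(q[0] + q[1])
    y = l1*np.sin(q[0]) + l2*np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian_2link(q, l1=1.0, l2=0.7):
    """Jacobian J(q) relating joint velocities to end-effector velocity."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1*s1 - l2*s12, -l2*s12],
                     [ l1*c1 + l2*c12,  l2*c12]])

q, dq = np.array([0.3, 0.5]), np.array([0.1, -0.2])
print("tip position:", fk_2link(q))
print("tip velocity:", jacobian_2link(q) @ dq)    # xdot = J(q) qdot
```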

3,100 citations

Journal ArticleDOI
TL;DR: The context for socially interactive robots is discussed, emphasizing the relationship to other research fields and the different forms of “social robots”, and a taxonomy of design methods and system components used to build socially interactive Robots is presented.

2,869 citations