Modeling sense disambiguation of human pose: recognizing action at a distance by key poses

doi:10.1007/978-3-642-19315-6_19

Book Chapter•DOI•

Modeling sense disambiguation of human pose: recognizing action at a distance by key poses

Snehasis Mukherjee¹, Sujoy Kumar Biswas¹, Dipti Prasad Mukherjee¹•Institutions (1)

Indian Statistical Institute¹

08 Nov 2010-pp 244-255

TL;DR: A methodology for recognizing actions at a distance that watches human poses, derives descriptors capturing the motion patterns of the poses, and selects key poses by graph centrality; results on four standard activity recognition datasets show its efficacy compared to the state of the art.

Abstract: We propose a methodology for recognizing actions at a distance by watching human poses and deriving descriptors that capture the motion patterns of the poses. Human poses often carry a strong visual sense (intended meaning) that describes the related action unambiguously. Identifying the intended meaning of a pose is nevertheless challenging, because poses vary widely and this variation leads to visual sense ambiguity. From a large vocabulary of poses (visual words), we prune out ambiguous poses and extract key poses (or key words) using a centrality measure of graph connectivity [1]. Under this framework, finding the key poses for a given sense (i.e., action type) amounts to constructing a graph with poses as vertices and then identifying the most "important" vertices in the graph (following centrality theory). Results on four standard activity recognition datasets show the efficacy of our approach when compared to the present state of the art.
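The graph-based selection described in the abstract can be illustrated with a minimal sketch. It assumes poses are vertices of a weighted graph, that an edge weight expresses dissimilarity between two poses, and that the most central poses are those with the smallest eccentricity (largest shortest-path distance to any other vertex). The function names, the weight matrix, and the fixed count `q` are hypothetical choices for illustration; the abstract does not specify the edge weight function or the selection rule.

```python
def shortest_paths(weights):
    """All-pairs shortest paths (Floyd-Warshall) on a dense weight matrix."""
    n = len(weights)
    dist = [row[:] for row in weights]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def eccentricities(weights):
    """Eccentricity of a vertex: its largest shortest-path distance to any vertex."""
    return [max(row) for row in shortest_paths(weights)]


def key_poses(weights, q):
    """Return the indices of the q vertices with the smallest eccentricity.

    Assumption (not from the source): lower eccentricity means a more
    central, hence more "important", pose.
    """
    ecc = eccentricities(weights)
    ranked = sorted(range(len(weights)), key=lambda i: ecc[i])
    return ranked[:q]
```

With toy one-dimensional "descriptors" at positions 0, 1, 2 and 10 and absolute differences as weights, the vertex at position 2 has the smallest eccentricity and would be ranked first.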

Citations

Pattern Recognition and Machine Learning

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: A textbook covering probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article•DOI•

Selecting Key Poses on Manifold for Pairwise Action Recognition

Xianbin Cao¹, Bo Ning¹, Pingkun Yan, Xuelong Li•Institutions (1)

University of Science and Technology of China¹

01 Feb 2012-IEEE Transactions on Industrial Informatics

TL;DR: A novel approach for key poses selection is proposed, which models the descriptor space utilizing a manifold learning technique to recover the geometric structure of the descriptors on a lower dimensional manifold and develops a PageRank-based centrality measure.

Abstract: In action recognition, bag of visual words based approaches have been shown to be successful, for which the quality of codebook is critical. In a large vocabulary of poses (visual words), some key poses play a more decisive role than others in the codebook. This paper proposes a novel approach for key poses selection, which models the descriptor space utilizing a manifold learning technique to recover the geometric structure of the descriptors on a lower dimensional manifold. A PageRank-based centrality measure is developed to select key poses according to the recovered geometric structure. In each step, a key pose is selected from the manifold and the remaining model is modified to maximize the discriminative power of selected codebook. With the obtained codebook, each action can be represented with a histogram of the key poses. To solve the ambiguity between some action classes, a pairwise subdivision is executed to select discriminative codebooks for further recognition. Experiments on benchmark datasets showed that our method is able to obtain better performance compared with other state-of-the-art methods.
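A PageRank-style centrality of the kind mentioned in this abstract can be sketched with plain power iteration. This is only an illustration under stated assumptions: `weights` is a hypothetical non-negative similarity matrix between poses, and the damping factor and iteration count are conventional defaults, not values from the cited paper. The manifold learning step and the iterative codebook update described in the abstract are not reproduced here.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dense, non-negative weight matrix.

    weights[i][j] is the (hypothetical) similarity carried from vertex i
    to vertex j. Returns one score per vertex; the scores sum to 1.
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_sum[i] == 0:
                # A vertex with no outgoing weight spreads its rank uniformly.
                for j in range(n):
                    new_rank[j] += damping * rank[i] / n
            else:
                for j in range(n):
                    new_rank[j] += damping * rank[i] * weights[i][j] / out_sum[i]
        rank = new_rank
    return rank
```

Ranking the vertices by the returned scores and keeping the top few is one simple way to pick candidate key poses; the cited work selects them step by step while modifying the remaining model, which this sketch does not attempt.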

52 citations

Cites background from "Modeling sense disambiguation of hu..."

...Mukherjee et al. [27] selected several key poses for each action through analysis for action cycles....
[...]

Journal Article•DOI•

Recognizing Human Action at a Distance in Video by Key Poses

Snehasis Mukherjee, Sujoy Kumar Biswas, Dipti Prasad Mukherjee

05 Apr 2011-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A graph-theoretic technique for recognizing human actions at a distance in video that models the visual senses associated with poses and introduces a “meaningful” threshold on the centrality measure to select key poses for each action type.

Abstract: In this paper, we propose a graph theoretic technique for recognizing human actions at a distance in a video by modeling the visual senses associated with poses. The proposed methodology follows a bag-of-word approach that starts with a large vocabulary of poses (visual words) and derives a refined and compact codebook of key poses using centrality measure of graph connectivity. We introduce a “meaningful” threshold on centrality measure that selects key poses for each action type. Our contribution includes a novel pose descriptor based on histogram of oriented optical flow evaluated in a hierarchical fashion on a video frame. This pose descriptor combines both pose information and motion pattern of the human performer into a multidimensional feature vector. We evaluate our methodology on four standard activity-recognition datasets demonstrating the superiority of our method over the state-of-the-art.
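The idea of a histogram of oriented optical flow can be sketched as follows. This is a single-level, simplified illustration: the input `flow_vectors` (a list of per-pixel `(dx, dy)` displacements), the number of bins, and the magnitude weighting are assumptions made for the example. The hierarchical evaluation over the video frame described in the abstract is omitted, and this should not be read as the authors' descriptor.

```python
import math


def flow_orientation_histogram(flow_vectors, bins=8):
    """Histogram of flow directions, weighted by flow magnitude.

    flow_vectors: iterable of (dx, dy) displacements (hypothetical input).
    Returns a list of `bins` values that sum to 1, or all zeros when
    every flow vector has zero length.
    """
    hist = [0.0] * bins
    for dx, dy in flow_vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # no motion, no direction to vote for
        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # map to [0, 2*pi)
        index = min(int(angle / (2.0 * math.pi) * bins), bins - 1)
        hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```

Concatenating such histograms computed over successively smaller regions of a frame would be one way to obtain a multidimensional, hierarchical feature vector, though the exact layout used in the cited paper is not given here.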

43 citations

Cites result from "Modeling sense disambiguation of hu..."

...We will show in the result section that the theory of meaningfulness gives better result compared to the procedure of selecting q-best poses for each action type [18]....
[...]
...These key poses can either be selected by choosing q-best (with q fixed) poses for each action type [18], or by introducing a suitable threshold on the graph centrality measure using the concept of meaningfulness [17]....
[...]

Proceedings Article•DOI•

Recognizing interaction between human performers using 'key pose doublet'

Snehasis Mukherjee¹, Sujoy Kumar Biswas¹, Dipti Prasad Mukherjee¹•Institutions (1)

Indian Statistical Institute¹

28 Nov 2011

TL;DR: A graph-theoretic approach for recognizing interactions between two human performers in a video clip, which applies a centrality measure to all possible combinations of the two performers' key poses to select the 'key pose doublets' that best represent the corresponding action.

Abstract: In this paper, we propose a graph theoretic approach for recognizing interactions between two human performers present in a video clip. We watch primarily the human poses of each performer and derive descriptors that capture the motion patterns of the poses. From an initial dictionary of poses (visual words), we extract key poses (or key words) by ranking the poses on the centrality measure of graph connectivity. We argue that the key poses are graph nodes which share a close semantic relationship (in terms of some suitable edge weight function) with all other pose nodes and hence are said to be the central part of the graph. We apply the same centrality measure on all possible combinations of the key poses of the two performers to select the set of 'key pose doublets' that best represent the corresponding action. The results on standard interaction recognition dataset show the robustness of our approach when compared to the present state of the art method.

22 citations

Cites background or methods from "Modeling sense disambiguation of hu..."

...the interaction descriptors (as action descriptors in [9]) from the dictionary of ‘key pose doublets’ Ψ for recognition....
[...]
...We make a two-fold contribution to enhance the approach of [9] for recognizing interaction between multiple human performers....
[...]
...In [9], a new pose descriptor is proposed using a gradient weighted optical flow feature combining both global and local features....
[...]
...The poses from Sj , j = 1, 2 are placed in a graph as nodes and the edge between each two poses stands for the dissimilarity in terms of a semantic relationship between them, measured using some form of weight function [9]....
[...]
...We use multidimensional pose descriptor corresponding to each performer of each frame of an action video as suggested in [9]....
[...]

Journal Article•DOI•

Region-based Mixture Models for human action recognition in low-resolution videos

Ying Zhao¹, Huijun Di², Jian Zhang¹, Yao Lu², Feng Lv², Yufang Li³•Institutions (3)

University of Technology, Sydney¹, Beijing Institute of Technology², Beijing Union University³

19 Jul 2017-Neurocomputing

TL;DR: The Layered Elastic Motion Tracking (LEMT) method is adopted, a hybrid feature representation is presented to integrate both shape and motion features, and a Region-based Mixture Model (RMM) is proposed for action classification.

15 citations

Cites methods from "Modeling sense disambiguation of hu..."

...There were 4 university teams [20, 29, 35, 36] who participated in the AVAC Challenge and the UT-Tower dataset was used to evaluate each participant’s method....
[...]

References

Journal Article•

The Anatomy of a Large-Scale Hypertextual Web Search Engine.

Sergey Brin, Lawrence Page

01 Jan 1998-Computer Networks

TL;DR: Google is presented as a prototype of a large-scale search engine that makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.

13,327 citations

Proceedings Article•

An iterative image registration technique with an application to stereo vision

Bruce D. Lucas¹, Takeo Kanade¹•Institutions (1)

Carnegie Mellon University¹

24 Aug 1981

TL;DR: In this paper, the spatial intensity gradient of the images is used to find a good match using a type of Newton-Raphson iteration, which can be generalized to handle rotation, scaling and shearing.

Abstract: Image registration finds a variety of applications in computer vision. Unfortunately, traditional image registration techniques tend to be costly. We present a new image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. Our technique is faster because it examines far fewer potential matches between the images than existing techniques. Furthermore, this registration technique can be generalized to handle rotation, scaling and shearing. We show how our technique can be adapted for use in a stereo vision system.
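The gradient-based Newton-Raphson matching described in this abstract can be illustrated in one dimension. The sketch below estimates a single translation `h` such that `f(x + h)` matches `g(x)`, repeatedly correcting `h` by the gradient-weighted residual. The function name, the use of callables for the two signals, and the central-difference derivative are assumptions made for the example; the cited technique operates on images and generalizes to rotation, scaling and shearing, which this sketch does not cover.

```python
import math


def estimate_shift(f, g, xs, h=0.0, iterations=20, eps=1e-4):
    """Estimate h so that f(x + h) approximately equals g(x) on samples xs.

    Each iteration linearizes f around the current h and solves the
    resulting least-squares problem for the correction to h.
    """
    for _ in range(iterations):
        numerator = 0.0
        denominator = 0.0
        for x in xs:
            # Central-difference estimate of the gradient of f at x + h.
            slope = (f(x + h + eps) - f(x + h - eps)) / (2.0 * eps)
            numerator += slope * (g(x) - f(x + h))
            denominator += slope * slope
        if denominator == 0.0:
            break  # flat signal: the gradient carries no information
        h += numerator / denominator
    return h


def bump(x):
    """Smooth test signal (a Gaussian with standard deviation 2)."""
    return math.exp(-x * x / 8.0)


samples = [-5.0 + 0.1 * i for i in range(101)]
shift = estimate_shift(bump, lambda x: bump(x + 0.7), samples)
```

Because each update uses only the local gradient, the method needs a starting guess that is reasonably close relative to the width of the signal's features; coarse-to-fine schemes are a common way to relax that requirement.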

12,944 citations

Journal Article•DOI•

Social Network Analysis: Methods and Applications.

Christopher Winship, Stanley Wasserman, Katherine Faust

01 Sep 1996-Journal of the American Statistical Association

TL;DR: This work characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links that connect them.

Abstract: Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them. Examples of social structures commonly visualized through social network ...

12,634 citations

"Modeling sense disambiguation of hu..." refers background or methods in this paper

..., action type) we rank the poses in order of “importance” using centrality measure of graph connectivity [12]....
[...]
...An equivalent problem exists in social network analysis [1, 12] (viz....
[...]
...To make e explicit we adopt eccentricity as a measure of graph connectivity [12]....
[...]

Pattern Recognition and Machine Learning

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

10,141 citations

Book•

Pattern Recognition and Machine Learning (Information Science and Statistics)

Christopher M. Bishop

01 Aug 2006

8,923 citations