Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion
Citations
Sequence of the most informative joints (SMIJ)
A Hierarchical Representation for Future Action Prediction
Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization
ModDrop: Adaptive Multi-Modal Gesture Recognition
References
Some methods for classification and analysis of multivariate observations
Normalized cuts and image segmentation
A global geometric framework for nonlinear dimensionality reduction.
An iterative image registration technique with an application to stereo vision
Frequently Asked Questions (15)
Q2. What is the quality of the detection, recognition, or synthesis in these applications?
The quality of the detection, recognition, or synthesis in these applications greatly depends on the spatial and temporal resolution of motion databases, as well as the complexity of the models.
Q3. How many actions are included in the KTH dataset?
The KTH dataset contains six types of human actions (walking, jogging, running, boxing, hand-waving, and hand-clapping) performed by 25 subjects in different scenarios.
Q4. How do the authors compute the confusion matrix?
Once the confusion matrix $C$ is computed, the authors apply the Hungarian algorithm [57] to find the optimum cluster correspondence, and compute the accuracy as $\mathrm{acc} = \max_{P} \frac{\operatorname{tr}(CP)}{\operatorname{tr}(C\,\mathbf{1}_{k \times k})}$ (17), subject to the constraint that $P \in \{0,1\}^{k \times k}$ is a permutation matrix.
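The accuracy measure in (17) can be illustrated with a small sketch. This is not the authors' code: the function names and the toy labels are hypothetical, and for brevity the search over permutation matrices is done by brute force over all k! permutations rather than with the Hungarian algorithm, which is only reasonable for small k.

```python
from itertools import permutations


def confusion_matrix(true_labels, pred_labels, k):
    """C[i][j] counts frames assigned to cluster i whose true class is j."""
    C = [[0] * k for _ in range(k)]
    for t, p in zip(true_labels, pred_labels):
        C[p][t] += 1
    return C


def clustering_accuracy(C):
    """acc = max_P tr(C P) / tr(C 1_{kxk}) over k-by-k permutation matrices P.

    Brute-force stand-in for the Hungarian algorithm (assumption: small k).
    """
    k = len(C)
    # tr(C 1_{kxk}) equals the sum of all entries of C, i.e. the frame count.
    total = sum(sum(row) for row in C)
    # For a permutation P, tr(C P) = sum_i C[i][sigma(i)].
    best = max(sum(C[i][perm[i]] for i in range(k))
               for perm in permutations(range(k)))
    return best / total


# Toy example (hypothetical labels): cluster ids 0 and 1 are swapped
# relative to the true classes, and one frame is misassigned.
true_labels = [0, 0, 0, 1, 1, 1, 2, 2]
pred_labels = [1, 1, 0, 0, 0, 0, 2, 2]
C = confusion_matrix(true_labels, pred_labels, k=3)
acc = clustering_accuracy(C)
```

For larger k, a polynomial-time assignment solver such as SciPy's `linear_sum_assignment` could replace the permutation loop.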
Q5. What methods can be used to find embeddings?
Common dimensionality reduction methods (e.g., PCA, LDA, Isomap, LLE) find embeddings from a data sample in the high-dimensional space to a point in the embedded space.
Q6. What is the main limitation of k-means clustering?
A major limitation of standard k-means clustering for analysis of time series data [49] is that the temporal ordering of the frames is not taken into account.
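One way to see this limitation is that the k-means objective depends only on which frames share a cluster, not on where they occur in time. The sketch below, with hypothetical toy frames and a helper name chosen for illustration, evaluates the within-cluster sum of squares for a sequence and for a shuffled copy of it.

```python
import math
import random


def kmeans_cost(frames, labels, k):
    """Within-cluster sum of squared distances for a fixed assignment."""
    dim = len(frames[0])
    cost = 0.0
    for c in range(k):
        members = [f for f, l in zip(frames, labels) if l == c]
        if not members:
            continue
        centroid = [sum(f[d] for f in members) / len(members)
                    for d in range(dim)]
        cost += sum((f[d] - centroid[d]) ** 2
                    for f in members for d in range(dim))
    return cost


# Toy 2-D "frames" (hypothetical data): three near the origin, three near (5, 5).
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels = [0, 0, 0, 1, 1, 1]

# Shuffle the temporal order, keeping each frame paired with its label.
order = list(range(len(frames)))
random.Random(0).shuffle(order)
shuffled_frames = [frames[i] for i in order]
shuffled_labels = [labels[i] for i in order]

cost_original = kmeans_cost(frames, labels, k=2)
cost_shuffled = kmeans_cost(shuffled_frames, shuffled_labels, k=2)
```

The two costs agree up to floating-point rounding, so k-means cannot distinguish the original sequence from a temporally scrambled one.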
Q7. How did they obtain a low-level representation of the movement?
To obtain a low-level representation, they segmented the movement by estimating the velocity and acceleration of the actuator attached to the joint.
Q8. How did they find motifs in multivariate time series data?
Minnen et al. [26] discovered motifs in real-valued, multivariate time series data by locating regions of high density in the space of all time series subsequences.
Q9. How did they decompose a multimodal stream of human behavior into several activities?
De la Torre and Agell [24] decomposed a multimodal stream of human behavior into several activities using semi-supervised temporal clustering.
Q10. What is the role of ACA in a synthetic temporal clustering example?
ACA finds the two binary matrices $G$ and $H$ that, after applying DTAK between all pairwise segments, make the matrix $\big(H^T G^T (G G^T)^{-1} G H\big) \circ W$ as correlated as possible with the frame kernel $K$. Fig. 4 illustrates the role of different matrices in a synthetic temporal clustering example.
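To make the roles of G and H concrete, the following sketch builds the factor H^T G^T (G G^T)^{-1} G H for a toy sequence. The shapes are an assumption not stated in this excerpt: G is taken to be k-by-m (class by segment) and H to be m-by-n (segment by frame), each column containing a single 1. The toy indicator values and helper names are hypothetical, and the elementwise reweighting by W is left out.

```python
def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Toy setup: n = 6 frames, m = 3 segments, k = 2 classes.
# H[j][i] = 1 if frame i belongs to segment j.
H = [[1, 1, 0, 0, 0, 0],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 0, 0, 1, 1]]
# G[c][j] = 1 if segment j belongs to class c (segments 0 and 2 share class 0).
G = [[1, 0, 1],
     [0, 1, 0]]

GH = matmul(G, H)                      # k x n: class membership per frame
GGt = matmul(G, transpose(G))          # k x k diagonal: segments per class
k = len(GGt)
GGt_inv = [[1.0 / GGt[c][c] if c == d else 0.0 for d in range(k)]
           for c in range(k)]          # inverse of a diagonal matrix

# n x n matrix: entry (p, q) is 1 / (segments in the class) when frames
# p and q fall in the same class, and 0 otherwise.
M = matmul(matmul(transpose(GH), GGt_inv), GH)
```

Here frames 0, 1, 4, and 5 share class 0, which has two segments, so their mutual entries are 0.5; frames 2 and 3 form the single segment of class 1, so their entries are 1.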
Q11. Why is the implementation of (14) prohibitively expensive?
A straightforward implementation of (14) is prohibitively expensive, i.e., $O(n^2 n_{\max}^2)$, due to the bottleneck of computing $\tau(X_{[i,v]}, \dot{Y}_j)$ for all $i, v, j$.
Q12. What is the problem addressed in this paper?
Fig. 1 illustrates the problem addressed in this paper: Given a sequence of a person walking and running, the first level of the hierarchy provided by their algorithm (HACA) is able to group the frames into two classes: running and walking.
Q13. What is the DTAK between the ith and jth segments?
Each element of the segment kernel matrix ($T$), $\tau_{ij} = \tau(Y_i, Y_j) = \operatorname{tr}(K_{ij}^T W_{ij})$, is the DTAK between the $i$th and $j$th segments ($Y_i$ and $Y_j$) computed using (6), where $K_{ij} \in \mathbb{R}^{n_i \times n_j}$ and $W_{ij} \in \mathbb{R}^{n_i \times n_j}$ are the frame kernel matrix and the normalized correspondence matrix between segments $Y_i$ and $Y_j$, respectively.
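Equation (6) is not reproduced in this excerpt, so the sketch below relies on the commonly cited form of the dynamic time alignment kernel: a dynamic-programming recursion over the frame kernel in which a diagonal step counts twice, normalized by the sum of the two segment lengths. Treat that recursion, the Gaussian frame kernel, the `sigma` parameter, and the function names as assumptions rather than the paper's exact definition.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Frame kernel: exp(-||x - y||^2 / (2 sigma^2)) (assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dtak(X, Y, sigma=1.0):
    """DTAK between two segments given as lists of frames (tuples).

    Assumed recursion:
        u[i][j] = max(u[i-1][j]   +     k_ij,
                      u[i-1][j-1] + 2 * k_ij,
                      u[i][j-1]   +     k_ij)
    and the result is u[nx][ny] / (nx + ny).
    """
    nx, ny = len(X), len(Y)
    neg_inf = float("-inf")
    # Row and column 0 are padding so that u[1][1] = 2 * k_11.
    u = [[neg_inf] * (ny + 1) for _ in range(nx + 1)]
    u[0][0] = 0.0
    for i in range(1, nx + 1):
        for j in range(1, ny + 1):
            k_ij = gaussian_kernel(X[i - 1], Y[j - 1], sigma)
            u[i][j] = max(u[i - 1][j] + k_ij,
                          u[i - 1][j - 1] + 2.0 * k_ij,
                          u[i][j - 1] + k_ij)
    return u[nx][ny] / (nx + ny)


# Toy 1-D segments (hypothetical data).
X = [(0.0,), (1.0,), (2.0,)]
Y = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]   # X played more slowly
Z = [(10.0,), (11.0,), (12.0,)]                # an unrelated motion

same = dtak(X, X)
stretched = dtak(X, Y)
unrelated = dtak(X, Z)
```

Under these assumptions the time-stretched copy scores as high as the segment against itself, while the unrelated segment scores near zero, which is the invariance to speed that makes DTAK suitable as a segment-level kernel.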
Q14. What is the popular method of detecting unusual activities in video?
In the computer vision literature, Zhong et al. [23] used a bipartite graph co-clustering algorithm to segment and detect unusual activities in video.
Q15. How is the ACA algorithm able to detect human actions?
Algorithms such as ACA and HACA are able to achieve competitive detection performance (77 percent) for human actions in a completely unsupervised fashion.