Proceedings Article

Annotating Dance Posture Images Using Multi Kernel Feature Combination

TL;DR: Proposes a dance-posture annotation model that combines features using Multiple Kernel Learning (MKL), together with a novel feature representation that captures the local texture properties of the image.
Abstract: We present a novel dance-posture-based annotation model that combines features using Multiple Kernel Learning (MKL). We also propose a novel feature representation that captures the local texture properties of the image. The annotation model is defined in a directed acyclic graph structure using the binary MKL algorithm. The bag-of-words model is applied for image representation. The experiments were performed on an image collection covering two Indian classical dances (Bharatnatyam and Odissi). The annotation model was tested using SIFT and the proposed feature individually, and by optimally combining the two features. The experiments showed promising results.
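To illustrate the idea of combining features through kernels, the following is a minimal sketch and not the paper's binary MKL algorithm. It assumes two precomputed bag-of-words histograms per image (stand-ins for SIFT and the proposed texture feature), builds one linear kernel per feature, and weights the kernels by kernel-target alignment, a simple heuristic swapped in here for learned MKL weights. All function names and the toy data are hypothetical.

```python
import math


def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = [[yi * yj for yj in y] for yi in y]
    return frobenius(K, Y) / math.sqrt(frobenius(K, K) * frobenius(Y, Y))


def combine_kernels(kernels, y):
    """Return non-negative weights summing to 1 and the weighted kernel sum.

    Assumption: weights come from kernel-target alignment, a heuristic
    stand-in for weights that an MKL solver would learn.
    """
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0.0:
        weights = [1.0 / len(kernels)] * len(kernels)  # fall back to uniform
    else:
        weights = [s / total for s in scores]
    n = len(y)
    combined = [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined


# Hypothetical toy histograms for four images and binary posture labels.
hist_a = [[1, 0], [1, 0], [0, 1], [0, 1]]  # separates the two classes
hist_b = [[1, 0], [0, 1], [1, 0], [0, 1]]  # unrelated to the labels
labels = [1, 1, -1, -1]

weights, combined = combine_kernels(
    [linear_kernel(hist_a), linear_kernel(hist_b)], labels
)
```

On this toy data the first feature separates the two classes and the second does not, so the heuristic places all of the weight on the first kernel. The combined matrix could then be passed to any classifier that accepts a precomputed kernel.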
Citations
Book Chapter
21 Sep 2016
TL;DR: Pose recognition is performed for some important postures in Bharatnatyam in order to find the origin of these postures from the Bhangas, and the result is further used to predict the expertise of a Bharatnatyam dancer.
Abstract: Bharatnatyam is an ancient Indian Classical Dance form consisting of complex postures and movements. One main challenge which has not been addressed till now in the intelligent systems community is to perform pose recognition for the basic postures of this dance form called the Bhangas and use this for expertise prediction. In this paper, pose recognition is performed for some important postures in Bharatnatyam in order to find the origin of these postures from the Bhangas and further use this result to predict the expertise of a Bharatnatyam dancer. The features extracted are 10 joint angles using 15 joint locations to predict the 22 postures derived from the basic postures (Bhangas). Support Vector Machine classifier with a radial basis function kernel performed the best for pose recognition. By performing stick figure analysis and grouping of labels we estimate the origin of each of these postures from the Bhangas. This is followed by verification of the grouping using Hamming distance calculation. Testing is done on our own Bharatnatyam dataset consisting of 102 dancers, achieving an accuracy of 87.14%. Expertise prediction of the dancers for the 22 poses was performed for four ratings - Excellent, Good, Satisfactory and Poor giving an accuracy of 68.46% without grouping of postures and 80.80% with grouping of postures.
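The abstract describes features built from joint angles computed over joint locations. As a hedged illustration only, the sketch below computes the angle at one joint from three 2D joint coordinates. The 2D assumption, the function name, and the sample coordinates are hypothetical and not taken from the paper.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, between segments b->a and b->c.

    Each argument is an (x, y) joint location; 2D coordinates are an
    assumption made for this sketch.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 on (|cross|, dot) is numerically stable and yields [0, 180].
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical shoulder, elbow, and wrist positions forming a right angle.
elbow_angle = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
```

A vector of such angles, one per chosen joint triple, is the kind of input that could be fed to a classifier such as the RBF-kernel SVM mentioned in the abstract.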

2 citations

Journal Article
TL;DR: The authors propose SYD-Net, which integrates a patch-based attention (PbA) mechanism on top of standard backbone CNNs; the mechanism learns contextual information from a set of uniform and multiscale patches and emphasizes discriminative features to capture the semantic correlation among patches.
Abstract: Human body-pose estimation is a complex problem in computer vision. Recent research interests have been widened specifically on the sports, yoga, and dance (SYD) postures for maintaining health conditions. The SYD pose categories are regarded as a fine-grained image classification (FGIC) task due to the complex movement of body parts. Deep convolutional neural networks (CNNs) have attained significantly improved performance in solving various human body-pose estimation problems. Though decent progress has been achieved in yoga postures recognition using deep-learning techniques, fine-grained sports and dance recognition necessitates ample research attention. However, no benchmark public image dataset with sufficient interclass and intraclass variations is available yet to address sports and dance postures classification. To solve this limitation, we have proposed two image datasets, one for 102 sport categories and another for 12 dance styles. Two public datasets, Yoga-82 that contains 82 classes and Yoga-107 that represents 107 classes, are collected for yoga postures. These four SYD datasets are experimented with the proposed deep model, SYD-Net, which integrates a patch-based attention (PbA) mechanism on top of standard backbone CNNs. The PbA module leverages the self-attention mechanism that learns contextual information from a set of uniform and multiscale patches and emphasizes discriminative features to understand the semantic correlation among patches. Moreover, random erasing data augmentation is applied to improve performance. The proposed SYD-Net has achieved state-of-the-art accuracy on Yoga-82 using five base CNNs. SYD-Net’s accuracy on other datasets is remarkable, implying its efficiency. Our Sports-102 and Dance-12 datasets are publicly available at https://sites.google.com/view/syd-net/home.
References
Proceedings Article
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations


"Annotating Dance Posture Images Usi..." refers background or methods in this paper

  • ...Various efficient low-level feature extraction algorithms have been developed which are able to capture subtle variations in colors, color layouts and textures of images [1], [2]....

  • ...Lowe [1] has used 16 × 16 image region around the key point....

Book Chapter
07 May 2006
TL;DR: A novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
Abstract: In this paper, we present a novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper presents experimental results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object recognition application. Both show SURF's strong performance.

13,011 citations


"Annotating Dance Posture Images Usi..." refers background in this paper

  • ...Various efficient low-level feature extraction algorithms have been developed which are able to capture subtle variations in colors, color layouts and textures of images [1], [2]....

Journal Article
TL;DR: A novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
Abstract: This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF's application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF's usefulness in a broad range of topics in computer vision.

12,449 citations

Journal Article
TL;DR: Comparisons with other multiresolution texture features using the Brodatz texture database indicate that the Gabor features provide the best pattern retrieval accuracy.
Abstract: Image content based retrieval is emerging as an important research area with application to digital libraries and multimedia databases. The focus of this paper is on the image processing aspects and in particular using texture information for browsing and retrieval of large image data. We propose the use of Gabor wavelet features for texture analysis and provide a comprehensive experimental evaluation. Comparisons with other multiresolution texture features using the Brodatz texture database indicate that the Gabor features provide the best pattern retrieval accuracy. An application to browsing large air photos is illustrated.

4,017 citations


"Annotating Dance Posture Images Usi..." refers background in this paper

  • ...Most of the works on using texture for image representation have considered global texture features [19], [20], [21]....

Journal Article
TL;DR: This paper shows how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques and leads directly to a convex method for learning the 2-norm soft margin parameter in support vector machines, solving an important open problem.
Abstract: Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive semidefinite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space---classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques. When applied to a kernel matrix associated with both training and test data this gives a powerful transductive algorithm---using the labeled part of the data one can learn an embedding also for the unlabeled part. The similarity between test points is inferred from training points and their labels. Importantly, these learning problems are convex, so we obtain a method for learning both the model class and the function without local minima. Furthermore, this approach leads directly to a convex method for learning the 2-norm soft margin parameter in support vector machines, solving an important open problem.
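The statement that the embedding is specified implicitly through inner products can be checked on a small case. The sketch below uses the degree-2 homogeneous polynomial kernel on 2D points, chosen here purely for illustration, and compares it with the inner product under an explicit feature map. It does not show the SDP-based kernel learning that the paper is about, and all names are hypothetical.

```python
import math


def poly2_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi(x):
    """Explicit feature map whose inner product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram(points, kernel):
    """Kernel matrix: entry (i, j) is the kernel value of points i and j."""
    return [[kernel(p, q) for q in points] for p in points]


# Hypothetical sample points.
points = [(1.0, 2.0), (0.5, -1.0), (3.0, 0.0)]
K_implicit = gram(points, poly2_kernel)
K_explicit = gram(points, lambda p, q: dot(phi(p), phi(q)))
```

The two matrices agree up to floating-point error, and both are symmetric, which is consistent with the kernel matrix encoding inner products in the embedding space without the embedding ever being computed by the kernel itself.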

2,419 citations


"Annotating Dance Posture Images Usi..." refers methods in this paper

  • ...In the MKL, optimal kernel for the task is learned from data instead of following conventional grid based search method [7], [8], [9]....
