Book Chapter

Sequence Recognition in Bharatnatyam Dance

TL;DR: In this paper, the authors propose a method to recognize the Key Postures (KPs) and motions involved in an Adavu using a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM), respectively.
Abstract: Bharatanatyam is the oldest Indian Classical Dance (ICD), learned and practiced across India and the world. Adavu is the core of this dance form; there are 15 Adavus with 58 variations. Each Adavu variation comprises a well-defined set of motions and postures (called dance steps) that occur in a particular order, so while learning Adavus, students learn not only the dance steps but also their sequence of occurrence. This paper proposes a method to recognize these sequences. First, we recognize the Key Postures (KPs) and motions involved in the Adavu using a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM), respectively; the CNN achieves 99% recognition accuracy and the SVM 84%. Next, we compare these KP and motion sequences with the ground truth to find the best match using the Edit Distance algorithm, with an accuracy of 98%. The work contributes to the state of the art in digital heritage, dance tutoring systems, and more. The paper presents three novelties: (a) it recognizes sequences based on both KPs and motions, rather than only KPs as reported in earlier works; (b) it measures the performance of the proposed approach by analyzing the prediction time per sequence, and compares the approach with previous works that address the same problem; (c) it tests the scalability of the approach by including all the Adavu variations, unlike earlier literature, which uses only one or two variations.
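The sequence-matching step the abstract describes can be sketched in a few lines: compute the edit distance between the recognized KP/motion label sequence and each ground-truth Adavu sequence, and return the closest one. This is a minimal illustration; the label names and ground-truth dictionary below are hypothetical, not taken from the paper's dataset.

```python
def edit_distance(a, b):
    # Classic dynamic-programming (Levenshtein) distance between two
    # label sequences: minimum number of insertions, deletions, and
    # substitutions needed to turn sequence a into sequence b.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def best_match(recognized, ground_truths):
    # Pick the ground-truth Adavu sequence closest to the recognized one.
    return min(ground_truths,
               key=lambda name: edit_distance(recognized, ground_truths[name]))

# hypothetical recognized sequence and ground-truth library
ground_truths = {"Adavu-A": ["KP1", "KP2", "M1", "KP3"],
                 "Adavu-B": ["KP1", "M2", "KP4"]}
print(best_match(["KP1", "KP2", "M1", "KP3"], ground_truths))  # → Adavu-A
```

Matching whole sequences this way tolerates occasional per-frame recognition errors, which is presumably why the sequence-level accuracy (98%) can exceed the motion-level accuracy (84%).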
References
Journal Article
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
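One of the multiclass issues the LIBSVM article discusses is its one-vs-one strategy: a binary classifier is trained for every pair of classes, and each casts a vote at prediction time. The voting step can be illustrated with a toy stand-in for the trained binary SVMs (the `decide` rule below is hypothetical, not LIBSVM's actual decision function):

```python
from itertools import combinations
from collections import Counter

def one_vs_one_predict(classes, pairwise_decision, x):
    # One classifier per class pair votes for one of its two classes;
    # the class with the most votes wins (LIBSVM's multiclass scheme).
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decision(a, b, x)] += 1
    return votes.most_common(1)[0][0]

def decide(a, b, x):
    # Hypothetical pairwise rule for 1-D inputs: each class has a
    # center, and the nearer center wins the pairwise vote.
    centers = {"neg": -1.0, "zero": 0.0, "pos": 1.0}
    return a if abs(x - centers[a]) <= abs(x - centers[b]) else b

print(one_vs_one_predict(["neg", "zero", "pos"], decide, 0.9))  # → pos
```

For k classes this trains k(k-1)/2 binary classifiers, which is the trade-off LIBSVM accepts in exchange for small, fast-to-train subproblems.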

40,826 citations

Journal Article
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Proceedings Article
Kaiming He1, Jian Sun1
07 Jun 2015
TL;DR: This paper investigates the accuracy of CNNs under constrained time cost, and presents an architecture that achieves very competitive accuracy in the ImageNet dataset, yet is 20% faster than “AlexNet” [14] (16.0% top-5 error, 10-view test).
Abstract: Though recent advanced convolutional neural networks (CNNs) have been improving image recognition accuracy, the models are getting more complex and time-consuming. For real-world applications in industrial and commercial scenarios, engineers and developers often face a constrained time budget. In this paper, we investigate the accuracy of CNNs under constrained time cost. Under this constraint, the design of a network architecture becomes a trade-off among factors like depth, number of filters, filter sizes, etc. With a series of controlled comparisons, we progressively modify a baseline model while preserving its time complexity. This is also helpful for understanding the importance of these factors in network design. We present an architecture that achieves very competitive accuracy on the ImageNet dataset (11.8% top-5 error, 10-view test), yet is 20% faster than “AlexNet” [14] (16.0% top-5 error, 10-view test).
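The top-5 error quoted above counts a prediction as wrong only if the true label is absent from the model's five highest-scoring classes. A minimal sketch of the metric, with made-up class scores rather than real model outputs:

```python
def top5_error(score_lists, true_labels):
    # Fraction of examples whose true label is NOT among the five
    # highest-scoring classes. Each element of score_lists is a
    # dict mapping class label -> score for one example.
    errors = 0
    for scores, truth in zip(score_lists, true_labels):
        top5 = sorted(scores, key=scores.get, reverse=True)[:5]
        if truth not in top5:
            errors += 1
    return errors / len(true_labels)

# hypothetical 10-class scores where class 0 scores highest, 9 lowest
scores = {c: -c for c in range(10)}
print(top5_error([scores, scores], [7, 3]))  # → 0.5
```

Top-5 error is the standard ILSVRC classification metric because many ImageNet images genuinely contain several plausible object labels.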

1,259 citations

Journal Article
01 Mar 2012
TL;DR: This paper provides an overview of MHI-based human motion recognition techniques and applications, and points out some areas for further research based on the MHI method and its variants.
Abstract: The motion history image (MHI) approach is a view-based temporal template method which is simple but robust in representing movements and is widely employed by various research groups for action recognition, motion analysis, and other related applications. In this paper, we provide an overview of MHI-based human motion recognition techniques and applications. Since the inception of the MHI template for motion representation, various approaches have been adopted to improve the basic MHI technique. We present all important variants of the MHI method. This paper also points out some areas for further research based on the MHI method and its variants.
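The basic MHI template is compact enough to state directly: pixels where motion is detected are set to a maximum duration τ, and every other pixel decays by one per frame, so recent motion appears brighter than older motion. A pure-Python sketch over tiny binary motion masks (the masks below are illustrative):

```python
def update_mhi(mhi, motion_mask, tau):
    # MHI update rule: H(x, y, t) = tau where motion occurs,
    # otherwise max(0, H(x, y, t-1) - 1).
    return [[tau if motion_mask[y][x] else max(0, mhi[y][x] - 1)
             for x in range(len(mhi[0]))]
            for y in range(len(mhi))]

# two frames of a 2x3 motion mask: a vertical blob moving right
mhi = [[0, 0, 0], [0, 0, 0]]
mhi = update_mhi(mhi, [[1, 0, 0], [1, 0, 0]], tau=3)
mhi = update_mhi(mhi, [[0, 1, 0], [0, 1, 0]], tau=3)
print(mhi)  # → [[2, 3, 0], [2, 3, 0]]
```

The resulting template encodes both where and how recently motion occurred in a single image, which is what makes it useful as a feature for action recognition.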

292 citations

Book Chapter
01 Jan 2019
TL;DR: This paper uses rectified linear unit (ReLU) and Leaky ReLU activations for the inner CNN layers and the softmax activation function for the output layer, and analyzes their effect on the MNIST dataset.
Abstract: Convolutional neural networks (CNNs) are a class of feed-forward artificial neural networks that have been applied successfully to visual imagery. A CNN consists of multiple layers of perceptrons and requires very little preprocessing. "Shift-invariant" and "space-invariant" neural networks are alternative names for CNNs, reflecting their shared-weight architecture and translation-invariance properties. In this paper, we use rectified linear unit (ReLU) and Leaky ReLU activations for the inner CNN layers and the softmax activation function for the output layer, and analyze their effect on the MNIST dataset.
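The three activations the chapter compares have simple scalar definitions. Plain-Python versions (the 0.01 leak slope is a common default, not necessarily the value the authors used):

```python
import math

def relu(x):
    # ReLU: zero for negative inputs, identity for positive ones
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    # Leaky ReLU lets a small fraction of negative inputs through,
    # which avoids the "dead unit" problem of plain ReLU
    return x if x > 0 else slope * x

def softmax(xs):
    # Softmax turns raw output-layer scores into a probability
    # distribution; subtracting the max first is a standard
    # numerical-stability trick and does not change the result.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]
```

ReLU-family activations go on the hidden layers because they are cheap and keep gradients from vanishing, while softmax goes on the output layer because MNIST classification needs a distribution over the ten digit classes.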

107 citations