Himadri B. G. S. Bhuyan
Bio: Himadri B. G. S. Bhuyan is an academic researcher from the Indian Institute of Technology Kharagpur. The author has contributed to research in the topics Dance and Support Vector Machine, has an h-index of 2, and has co-authored 5 publications receiving 13 citations.
16 Dec 2017
TL;DR: Using an Xbox Kinect to capture dance videos in multi-modal form, NrityaGuru is presented: a tutoring system that helps a learner dancer identify deviations in her dance postures and movements against the prerecorded benchmark performances of the tutor (Guru).
Abstract: Indian Classical Dance (ICD) is a living heritage of India. Traditionally, Gurus (teachers) are the custodians of this heritage. They practice and pass on the legacy through their Shishyas (disciples), often in undocumented forms. The preservation of the heritage thus remains limited in time and scope. The emergence of digital multimedia technology has created the opportunity to preserve heritage by ensuring that it remains accessible over a long period of time. However, there have been only limited attempts to apply effective technologies either to the pedagogy of dance learning or to the preservation of the heritage of ICD. In this context, the paper presents NrityaGuru, a tutoring system for Bharatanatyam, a form of ICD. Using an Xbox Kinect to capture dance videos in multi-modal form, we design a system that can help a learner dancer identify deviations in her dance postures and movements against the prerecorded benchmark performances of the tutor (Guru).
TL;DR: In this article, the authors propose an adaptive threshold, adopt a Machine Learning (ML) approach, and generate an effective feature for key frame (KF) detection by combining three-frame differencing with a bit-plane technique.
Abstract: Identifying key frames is the first and necessary step before solving a variety of other Bharatanatyam problems. The paper aims to separate the momentarily stationary frames (key frames) of a dance video from its motion frames. The proposed key frame (KF) localization is novel, simple, and effective compared to existing dance video analysis methods, and is distinct from the standard KF detection algorithms used for other human motion videos. In the dance's basic structure, the KFs occurring during a performance are often not completely stationary and vary with the dance form and the performer. Hence, it is not easy to fix a global threshold (on the quantum of motion) that works across dancers and performances. Earlier approaches try to compute the threshold iteratively. The novelty of this paper lies in: (a) formulating an adaptive threshold, (b) adopting a Machine Learning (ML) approach, and (c) generating an effective feature for KF detection by combining three-frame differencing with a bit-plane technique. For ML, we use Support Vector Machine (SVM) and Convolutional Neural Network (CNN) classifiers. The proposed approaches are compared with and analyzed against the earlier ones; the proposed ML techniques emerge as the winner with around 90% accuracy.
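The three-frame-differencing idea mentioned in this abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's exact feature (the paper further combines the difference images with a bit-plane technique, and learns the threshold rather than fixing it); the threshold and frame sizes here are arbitrary:

```python
import numpy as np

def three_frame_difference(prev, curr, nxt, threshold=25):
    """A pixel counts as 'moving' only if the middle frame differs
    from BOTH neighbours (suppresses single-frame noise)."""
    d1 = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > threshold
    d2 = np.abs(nxt.astype(np.int16) - curr.astype(np.int16)) > threshold
    return d1 & d2

def motion_energy(prev, curr, nxt):
    """Fraction of moving pixels; low values suggest a key-frame candidate."""
    return three_frame_difference(prev, curr, nxt).mean()

# Toy frames: a bright blob that appears and then jumps
f0 = np.zeros((64, 64), dtype=np.uint8)
f1 = f0.copy(); f1[10:20, 10:20] = 200
f2 = f0.copy(); f2[30:40, 30:40] = 200
print(motion_energy(f0, f1, f2))  # nonzero: motion frames
print(motion_energy(f0, f0, f0))  # 0.0: stationary, key-frame candidate
```

An adaptive, learned cutoff on such a motion measure (rather than the hand-picked one above) is the paper's contribution.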
16 Dec 2017
TL;DR: The system uses Kinect to capture human postures, identifies the positions and formations of the four major limbs, converts them to the vocabulary of Labanotation, and finally translates them into a parseable LabanXML representation, which makes platform-independent integration easy.
Abstract: We present a non-intrusive automated system to translate human postures into Labanotation, a graphical notation for human postures and movements. The system uses Kinect to capture the human postures; identifies the positions and formations of the four major limbs (two hands and two legs); converts them to the vocabulary of Labanotation; and finally translates them into a parseable LabanXML representation. We use the skeleton stream to classify the formations of the limbs using multi-class support vector machines. Encoding to XML is performed based on the Labanotation specification. A data set of postures is created and annotated to train the classifiers and to test their performance. We achieve 80% to 90% accuracy for the four limbs. The system can be used as an effective front-end for posture analysis applications in areas like dance and sports, where predefined postures form the basis of analysis and interpretation. The parseability of XML makes integration easy in a platform-independent manner.
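The multi-class SVM stage for limb formations can be sketched roughly as below, assuming joint-angle features. The feature layout and the formation labels are hypothetical stand-ins, not the paper's actual Labanotation vocabulary:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
classes = ["straight", "bent", "folded"]  # hypothetical limb formations

# Synthetic training data: three joint angles (degrees) per limb sample,
# clustered around one centre per formation class
centers = np.array([[170.0, 170.0, 170.0],   # straight
                    [ 90.0, 120.0, 100.0],   # bent
                    [ 30.0,  60.0,  40.0]])  # folded
X = np.vstack([c + rng.normal(0, 5, size=(30, 3)) for c in centers])
y = np.repeat([0, 1, 2], 30)

# One-vs-rest multi-class SVM, one such classifier per limb
clf = SVC(kernel="rbf", gamma="scale", decision_function_shape="ovr")
clf.fit(X, y)

# Classify an unseen, near-straight sample
print(classes[clf.predict([[165.0, 172.0, 168.0]])[0]])
```

In the paper the input comes from the Kinect skeleton stream and the output symbols feed the LabanXML encoder; here both ends are faked with synthetic angles.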
22 Dec 2019
TL;DR: In this paper, a method to classify the unique motions in Bharatanatyam dance videos is presented. A key challenge is that the number of frames in each motion may vary during a performance, which leads to variable feature lengths.
Abstract: The paper presents a method to classify the unique motions in Bharatanatyam dance videos. Unlike the motions in our daily activities, the motions involved in dance are rather complex in nature, and the state of the art leaves new scope for motion classification in the domain of dance. During a dance performance, the number of frames in each motion may vary, which leads to variable feature lengths. This variability makes comparing motions for classification difficult and adds to the challenges of the current work. We use the velocities of the skeleton joints as features; the joint coordinates are captured by Kinect 1.0. Dynamic Time Warping (DTW) and the kNN algorithm are used for classification: DTW measures the similarity between two motions using skeleton joint velocities, and the extracted similarity measure is supplied to the kNN algorithm to identify similar motions. The paper adopts two techniques for measuring the similarity of the joint velocities: (i) non-weighted joints and (ii) weighted joints, with the joint weights optimized by the Particle Swarm Optimization (PSO) algorithm. We compare the results of the two techniques and highlight the pros and cons of each. The proposed approach is simple and very effective, eventually achieving an accuracy of more than 85%. Motion classification in Indian Classical Dance (ICD) can help in digital heritage, the design of dance tutoring systems, dance synthesis applications, and the like.
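The DTW-plus-kNN pipeline described above can be sketched as follows; joint velocities are reduced to a single channel here, and the PSO-optimized joint weights are omitted for brevity:

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two motions of possibly different lengths;
    each row holds the joint-velocity vector of one frame."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def knn_classify(query, motions, labels, k=3):
    """k-NN over DTW distances: vote among the k most similar motions."""
    order = np.argsort([dtw_distance(query, m) for m in motions])[:k]
    votes = [labels[i] for i in order]
    return max(set(votes), key=votes.count)

# Toy motions: "waves" and "holds" of different lengths
wave1 = np.sin(np.linspace(0, 3, 20))[:, None]
wave2 = np.sin(np.linspace(0, 3, 27))[:, None]
hold1 = np.full((18, 1), 0.8)
hold2 = np.full((24, 1), 0.8)
motions = [wave1, wave2, hold1, hold2]
labels = ["wave", "wave", "hold", "hold"]

query = np.sin(np.linspace(0, 3, 23))[:, None]  # a wave of a new length
print(knn_classify(query, motions, labels, k=3))  # "wave"
```

The warping path is what lets motions of different frame counts be compared, which is exactly the variable-feature-length problem the abstract raises.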
TL;DR: A method is presented to understand the underlying semantics of Bharatanatyam dance motion and classify it; two ML techniques are explored: Support Vector Machine (SVM) and Convolutional Neural Network (CNN).
Abstract: This paper provides a method to understand and classify the underlying semantics of Bharatanatyam dance motion. Each dance performance is audio-driven and spans space and time. The dance is captured and analyzed, which is helpful for cultural heritage preservation and for tutoring systems that assist the naive learner. The paper attempts to solve a fundamental problem: recognizing the motions during a dance performance based on motion patterns. The dataset comprises video recordings of an Indian Classical Dance form known as Bharatanatyam; the different Adavus (the basic units of Bharatanatyam) are captured using Kinect. Of the various forms of captured data (RGB, Depth, and Skeleton), we choose RGB. A Motion History Image (MHI) and the Histogram of Gradients of the MHI (HoGMHI) are computed for each motion and used as input to the Machine Learning (ML) algorithms that recognize the motion. The paper explores two ML techniques: Support Vector Machine (SVM) and Convolutional Neural Network (CNN). The overall accuracy of both classifiers is more than 90%. The novelties of the work are: (a) analyzing all involved motions based on motion patterns rather than joint velocities or pose, (b) exploring the impact of training data and different features on the classifiers' accuracy, and (c) not restricting the number of frames in a motion during recognition, and formulating a method to deal with a variable number of frames.
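The MHI and HoGMHI features can be sketched roughly as below. The gradient histogram is a crude global stand-in for the paper's HoGMHI; the decay constant, threshold, and bin count are arbitrary choices for illustration:

```python
import numpy as np

def motion_history_image(frames, tau=10, threshold=30):
    """MHI: pixels that moved recently are bright; older motion fades.
    frames: list of equal-shaped grayscale uint8 arrays."""
    mhi = np.zeros(frames[0].shape, dtype=np.float64)
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > threshold
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi / tau  # normalized to [0, 1]

def hog_of_mhi(mhi, bins=9):
    """Global gradient-orientation histogram of the MHI, weighted by
    gradient magnitude (a crude stand-in for the paper's HoGMHI)."""
    gy, gx = np.gradient(mhi)
    ang = np.arctan2(gy, gx)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)

# Toy clip: a blob sweeping left to right over five frames
frames = []
for t in range(5):
    f = np.zeros((32, 32), dtype=np.uint8)
    f[10:20, 5 * t:5 * t + 5] = 255
    frames.append(f)

mhi = motion_history_image(frames)
print(mhi.max())              # 1.0 -- the most recent motion
print(hog_of_mhi(mhi).shape)  # (9,)
```

Because the MHI collapses an entire motion into one image of fixed size, the downstream SVM or CNN sees a fixed-length input regardless of how many frames the motion spanned, which is how the variable-frame-count issue is sidestepped.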
01 Jan 2006
TL;DR: In this paper, a robust 3D dance posture recognition system using two cameras is proposed, in which a pair of wide-baseline video cameras with approximately orthogonal viewing directions is used to reduce pose recognition ambiguities.
Abstract: In this paper, a robust 3D dance posture recognition system using two cameras is proposed. A pair of wide-baseline video cameras with approximately orthogonal looking directions is used to reduce pose recognition ambiguities. Silhouettes extracted from these two views are represented using Gaussian mixture models (GMM) and used as features for recognition. Relevance vector machine (RVM) is deployed for robust pose recognition. The proposed system is trained using synthesized silhouettes created using animation software and motion capture data. The experimental results on synthetic and real images illustrate that the proposed approach can recognize 3D postures effectively. In addition, the system is easy to set up without any need of precise camera calibration.
TL;DR: This paper attempts to solve three fundamental problems of dance analysis for understanding the underlying semantics of dance forms by capturing the multi-modal data of Bharatanatyam dance using Kinect and building an annotated data set for research in ICD.
Abstract: Understanding the underlying semantics of performing arts like dance is a challenging task. Dance is multimedia in nature and spans time as well as space. Capturing and analyzing the multimedia content of dance is useful for preserving cultural heritage, building video recommendation systems, and building tutoring systems that assist learners. To develop an application for dance, three aspects of dance analysis need to be addressed: 1) segmentation of the dance video to find the representative action elements, 2) matching or recognition of the detected action elements, and 3) recognition of the dance sequences formed by combining a number of action elements under certain rules. This paper attempts to solve these three fundamental problems of dance analysis for understanding the underlying semantics of dance forms. Our focus is on an Indian Classical Dance (ICD) form known as Bharatanatyam. As dance is driven by music, we use both music and motion information for key posture extraction. Next, we recognize the key postures using machine learning as well as deep learning techniques. Finally, the dance sequence is recognized using a Hidden Markov Model (HMM). We capture the multi-modal data of Bharatanatyam dance using Kinect and build an annotated data set for research in ICD.
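The HMM-based sequence recognition stage can be illustrated with a minimal Viterbi decoder. The two hidden states and the posture-cluster observations below are toy values, not the paper's actual model:

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for an observation sequence,
    i.e. HMM decoding as used for dance-sequence recognition.
    obs: indices of observed key postures over time."""
    n_states, T = len(start_p), len(obs)
    V = np.zeros((T, n_states))                 # best log-prob per (t, state)
    back = np.zeros((T, n_states), dtype=int)   # backpointers
    V[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, T):
        for s in range(n_states):
            scores = V[t - 1] + np.log(trans_p[:, s])
            back[t, s] = int(np.argmax(scores))
            V[t, s] = scores[back[t, s]] + np.log(emit_p[s, obs[t]])
    path = [int(np.argmax(V[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy 2-state model with sticky transitions and distinct emissions
start_p = np.array([0.5, 0.5])
trans_p = np.array([[0.8, 0.2], [0.2, 0.8]])
emit_p  = np.array([[0.9, 0.1], [0.1, 0.9]])
obs = [0, 0, 1, 1]  # e.g. posture-cluster indices over time
print(viterbi(obs, start_p, trans_p, emit_p))  # [0, 0, 1, 1]
```

In the paper the observations would be recognized key postures and the hidden states the steps of an Adavu; here both are reduced to two symbols to keep the decoder readable.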
01 Oct 2019
TL;DR: Although ANN showed a statistically shorter execution time, the recall-time pattern of the TUT group converged steeply after the initial trial; the results can inform designers of IVR about which types of visual instruction are best for different task purposes.
Abstract: In this paper, we present a comparative study of visual instructions in Immersive Virtual Reality (IVR): annotation (ANN), which employs 3D texts and objects for instructions, and virtual tutor (TUT), which demonstrates a task with a 3D character. The comparison is based on three tasks, maze escape (ME), stretching exercise (SE), and crane manipulation (CM), defined by the types of a unit instruction. We conducted an automated evaluation of users' memory recall performance (recall time, accuracy, and error) by mapping each user's sequence of behaviors and events to a string. Results revealed that the ANN group showed significantly more accurate performance (1.3 times) in ME and faster performance (1.64 times) in SE than the TUT group, while no statistically significant main difference was found in CM. Interestingly, although ANN showed a statistically shorter execution time, the recall-time pattern of the TUT group converged steeply after the initial trial. The results can inform designers of IVR about which types of visual instruction are best for different task purposes.
TL;DR: This work presents a method for contextual motion analysis that organizes dance data semantically, forming the first digital dance ethnography; it uses quartet-based analysis to organize dance data into a categorization tree, while information inferred from dance metadata descriptions is used to set parent-child relationships.
Abstract: Folk dances often reflect the socio-cultural influences prevailing in different periods and nations; each dance produces a meaning, a story, with the help of music, costumes, and dance moves. However, dances have no borders; they have been transmitted from generation to generation across different countries, mainly through the movements of people carrying and disseminating their civilization. Studying the contextual correlation of dances across neighboring countries unveils the evolution of this unique intangible heritage over time and helps in understanding potential cultural similarities. In this work we present a method for contextual motion analysis that organizes dance data semantically to form the first digital dance ethnography. First, we break dance motion sequences into narrow, temporally overlapping feature descriptors, named motion and style words, and then cluster them in a high-dimensional feature space to define motifs. The distribution of these motion and style motifs creates motion and style signatures, in the form of a bag-of-motifs representation, which provide a succinct but descriptive portrayal of motion sequences. Signatures are time-scale and temporal-order invariant, capable of exploiting the contextual correlation between dances and of distinguishing fine-grained differences between semantically similar motions. We then use quartet-based analysis to organize dance data into a categorization tree, while information inferred from dance metadata descriptions is used to set parent-child relationships. We illustrate a number of different organization trees and portray the evolution of dances over time. The efficiency of our method is also demonstrated by retrieving contextually similar dances from a database.
TL;DR: This paper proposes a novel feature that is invariant to anthropometric variation and body orientation and uses it to generate Labanotation symbols; from motion-captured data, the method generates the spatial symbols describing directions and levels in both the support column and the arm column.
Abstract: Labanotation is a widely used notation system for recording body movements, especially dances. It has wide applications in choreography preservation, dance archiving, and so on. However, the manual creation of Labanotation scores is rather difficult and time-consuming; therefore, research on the generation of Labanotation scores is of great interest. In this paper, we aim to generate Labanotation scores based on motion-captured data obtained from real-world dance performances. First, to deal with challenges such as varied dance movement patterns, different dancer shapes, and noise in the motion-captured data, we propose a novel feature that is invariant to anthropometric variation and body orientation. Then, we generate the notations of both lower-limb movements and upper-limb gestures. On the one hand, we utilize the hidden Markov model (HMM) to analyze the temporal dynamic characteristics of limb movements and map each lower-limb movement to a corresponding dance notation. On the other hand, for upper limbs, we train a multi-class classifier based on extremely randomized trees (Extra-Trees) to identify the notations for arm gestures. Finally, we generate the Labanotation symbols based on the above movement analysis and thus create Labanotation scores. The proposed methods can generate the spatial symbols describing directions and levels in both the support column and the arm column from motion-captured data. The generated scores are clear and reliable. Experimental results show an average recognition accuracy of over 92% for the generated notations, which is significantly better than previous work.
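The Extra-Trees stage for upper-limb gestures can be sketched as below, assuming simple direction/elevation angle features. The feature layout and the three arm classes are illustrative stand-ins, not the paper's actual notation vocabulary or feature:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

rng = np.random.default_rng(1)

# Hypothetical arm features: (direction angle, elevation angle) in degrees,
# clustered around one centre per illustrative Labanotation-style arm class
centers = np.array([[ 0.0,  0.0],   # class 0: forward / middle level
                    [ 0.0, 60.0],   # class 1: forward / high
                    [90.0,  0.0]])  # class 2: side / middle
X = np.vstack([c + rng.normal(0, 5, size=(40, 2)) for c in centers])
y = np.repeat([0, 1, 2], 40)

# Multi-class Extra-Trees classifier, as in the paper's upper-limb stage
clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# Classify an unseen near-"forward/high" sample
print(clf.predict([[0.0, 58.0]])[0])
```

In the full pipeline these predicted classes would be written out as arm-column symbols in the Labanotation score; here the mapping stops at the class index.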