Proceedings ArticleDOI

A Method for Hand Gesture Recognition

07 Apr 2014 - pp. 919-923
TL;DR: A combined modelling and learning approach for hand gesture recognition with the Microsoft Kinect sensor, using contour area and convexity defects as features for classification.
Abstract: In this paper, we present a method for hand gesture recognition using the Microsoft Kinect sensor. Kinect allows capturing dense, three-dimensional scans of an object in real time. We propose a combination of modelling and learning approaches for hand gesture recognition. We use the Kinect depth data for background segmentation of hand gesture images captured with Kinect. Image processing techniques are employed to find the contour of the segmented hand images. We then calculate the convex hull and convexity defects for this contour, and use contour area and convexity defects as features for classification. We classify the gestures using a naive Bayes classifier. We have considered five hand gesture classes, i.e., showing one, two, three, four, and five fingers. We implemented and tested this algorithm on 15 images of each class, obtaining a correct classification rate of 100%.
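
The feature-extraction steps named in the abstract map directly onto standard OpenCV calls. Below is a minimal sketch, not the authors' implementation: it assumes a pre-segmented binary hand mask, and the defect-depth threshold and the GaussianNB classifier choice are illustrative.

```python
import cv2
import numpy as np
from sklearn.naive_bayes import GaussianNB

def extract_features(mask):
    """Contour area and convexity-defect count from a binary hand mask."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)        # largest blob = the hand
    area = cv2.contourArea(contour)
    hull = cv2.convexHull(contour, returnPoints=False)  # indices, as convexityDefects needs
    defects = cv2.convexityDefects(contour, hull)
    # Keep only deep defects (finger valleys); depth is stored in 1/256-pixel units.
    n_defects = 0 if defects is None else int((defects[:, 0, 3] / 256.0 > 10.0).sum())
    return [area, n_defects]

# Classification as described in the abstract: a naive Bayes classifier over the
# features; X_train / y_train would hold features and labels for the five classes.
# clf = GaussianNB().fit(X_train, y_train)
```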
Citations
Journal ArticleDOI
TL;DR: A thorough review of state-of-the-art techniques used in recent hand gesture and sign language recognition research, suitably categorized into different stages: data acquisition, pre-processing, segmentation, feature extraction and classification.
Abstract: Hand gesture recognition serves as a key for overcoming many difficulties and providing convenience for human life. The ability of machines to understand human activities and their meaning can be utilized in a vast array of applications. One specific field of interest is sign language recognition. This paper provides a thorough review of state-of-the-art techniques used in recent hand gesture and sign language recognition research. The techniques reviewed are suitably categorized into different stages: data acquisition, pre-processing, segmentation, feature extraction and classification, where the various algorithms at each stage are elaborated and their merits compared. Further, we also discuss the challenges and limitations faced by gesture recognition research in general, as well as those exclusive to sign language recognition. Overall, it is hoped that the study may provide readers with a comprehensive introduction into the field of automated gesture and sign language recognition, and further facilitate future research efforts in this area.
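
The five stages into which the review organizes the literature form a straightforward processing chain; a schematic Python sketch follows (stage names from the abstract, all function bodies left to the specific method being surveyed):

```python
from typing import Any, Callable

def gesture_pipeline(frame: Any,
                     preprocess: Callable,
                     segment: Callable,
                     extract_features: Callable,
                     classify: Callable):
    """Data acquisition yields `frame`; the remaining review stages run in order:
    pre-processing -> segmentation -> feature extraction -> classification."""
    cleaned = preprocess(frame)        # e.g. denoising, normalisation
    hand = segment(cleaned)            # isolate the hand region
    features = extract_features(hand)  # shape, depth, or learned features
    return classify(features)          # gesture / sign label
```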

344 citations

Journal ArticleDOI
24 Dec 2018 - Sensors
TL;DR: This review describes current Machine Learning approaches to hand gesture recognition with depth data from time-of-flight sensors and confirms that Convolutional Neural Networks and Long Short-Term Memory yield the most reliable results.
Abstract: In this review, we describe current Machine Learning approaches to hand gesture recognition with depth data from time-of-flight sensors. In particular, we summarise the achievements of a line of research at the Computational Neuroscience laboratory at the Ruhr West University of Applied Sciences. Relating our results to the work of others in this field, we confirm that Convolutional Neural Networks and Long Short-Term Memory yield the most reliable results. We investigated several sensor data fusion techniques in a deep learning framework and performed user studies to evaluate our system in practice. During our course of research, we gathered and published our data in a novel benchmark dataset (REHAP), containing over a million unique three-dimensional hand posture samples.
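
A minimal PyTorch sketch of the CNN-plus-LSTM pattern the review singles out as most reliable: per-frame convolutional features from depth images feed a recurrent layer over the gesture sequence. All layer sizes are illustrative assumptions, and no REHAP data loading is shown.

```python
import torch
import torch.nn as nn

class CnnLstmGestureNet(nn.Module):
    """Per-frame CNN features fed to an LSTM over the gesture sequence."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(),                         # -> 32 * 4 * 4 = 512 features
        )
        self.lstm = nn.LSTM(512, 128, batch_first=True)
        self.head = nn.Linear(128, n_classes)

    def forward(self, depth_seq):                 # (batch, time, 1, H, W)
        b, t = depth_seq.shape[:2]
        feats = self.cnn(depth_seq.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.lstm(feats)
        return self.head(h[-1])                   # logits for the whole sequence

# Example: a batch of 2 sequences, 16 frames of 64x64 depth images each.
# logits = CnnLstmGestureNet()(torch.randn(2, 16, 1, 64, 64))
```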

58 citations


Cites methods from "A Method for Hand Gesture Recognition"

  • ...[9] managed to obtain a 100% recognition rate by applying first a depth threshold, then contour image algorithms...


Journal ArticleDOI
TL;DR: A novel and robust descriptor, the depth-projection-map-based bag of contour fragments, for extracting hand shape and structure information from depth maps; it significantly outperforms previous methods on all tested datasets for both static digit recognition and letter gesture recognition.
Abstract: This paper presents a novel and robust descriptor, the depth-projection-map-based bag of contour fragments, which is applied to extraction of hand shape and structure information from depth maps. Our method projects depth maps onto three orthogonal planes to generate the depth projection maps. Then, the bag of contour fragment descriptors are extracted from the three depth projection maps and concatenated as a final shape representation of the original depth data. A support vector machine with a linear kernel is used as a shape classifier. The proposed description method is evaluated on three public datasets, as well as a new and more challenging dataset for hand gesture recognition. Results demonstrate that the proposed method significantly outperforms the previous methods on all tested datasets for both static digit recognition and letter gesture recognition. For the challenging HUST-ASL dataset, in particular, the proposed method improves on the previous state-of-the-art methods from 40.1% to 64.6%.
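
The projection onto three orthogonal planes can be sketched directly from a single depth map. A minimal NumPy version follows; the depth-binning scheme is an assumption, and the bag-of-contour-fragments encoding and linear-SVM stages are omitted.

```python
import numpy as np

def depth_projection_maps(depth, n_bins=64):
    """Project a depth map onto the three orthogonal planes (front, side, top)."""
    valid = depth > 0                              # zero = no depth measurement
    dvals = depth[valid].astype(float)
    span = np.ptp(dvals) + 1e-6                    # avoid divide-by-zero on flat maps
    z = np.zeros(depth.shape, dtype=int)
    z[valid] = ((dvals - dvals.min()) / span * (n_bins - 1)).astype(int)
    h, w = depth.shape
    front = valid.astype(np.uint8)                 # x-y plane: the silhouette itself
    side = np.zeros((h, n_bins), np.uint8)         # y-z plane
    top = np.zeros((n_bins, w), np.uint8)          # z-x plane
    ys, xs = np.nonzero(valid)
    side[ys, z[ys, xs]] = 1
    top[z[ys, xs], xs] = 1
    return front, side, top
```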

32 citations


Cites methods from "A Method for Hand Gesture Recognition"

  • ...Shukla and Dwivedi [26] compute hand features based on contour area and convexity defects....


  • ...[26] J. Shukla and A. Dwivedi, “A method for hand gesture recognition,” in Proc....


Proceedings ArticleDOI
18 May 2015
TL;DR: The creation of the CADDIAN gesture language for diver-robot communication, providing its alphabet, syntax and semantics; recognition of the gestures is still in progress and will be covered in future work.
Abstract: The underwater environment is characterized by harsh conditions and is difficult to monitor. The CADDY project deals with the development of a companion robot devoted to supporting and monitoring human operations and activities during a dive. In this scenario, the communication and correct reception of messages between the diver and the robot are essential to the success of the dive goals. However, the underwater environment poses a set of technical constraints that severely limit the communication possibilities. For these reasons, the proposed solution is to develop a communication language based on the consolidated and standardized diver gestures commonly employed during professional and recreational dives, leading to the definition of a CADDY language, called CADDIAN, and a communication protocol. This article focuses on the creation of the language, providing its alphabet, syntax and semantics; future work will address the recognition of the gestures, which is still in progress.

26 citations


Cites background from "A Method for Hand Gesture Recognition"

  • ...Developments in the virtual reality and computer games’ research branch have also given pulse to other sectors that have derived from them the idea to employ IR and ToF systems, as demonstrated by many works in the literature, dealing with different applications, such as [13], [14], [15], [16] and [17]; a good review of ToF and IR sensors as well as of gesture recognition methods is given in [18]....


Journal ArticleDOI
TL;DR: An end-to-end framework based on a 3D CNN, called 3D PostureNet, is developed for robust posture recognition; it achieves significantly superior performance on both skeleton-based human posture and hand posture recognition tasks.
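
Only the TL;DR is shown here, so the sketch below is a bare-bones stand-in, not the actual 3D PostureNet architecture: a small 3D-convolutional classifier over a voxelized posture sample, with all layer sizes assumed.

```python
import torch.nn as nn

class Tiny3DPostureNet(nn.Module):
    """Voxelized posture sample (1-channel 3D grid) -> class logits."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(2),
        )
        self.head = nn.Linear(16 * 2 * 2 * 2, n_classes)

    def forward(self, vox):                        # vox: (batch, 1, D, H, W)
        return self.head(self.features(vox).flatten(1))
```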

18 citations

References
Journal ArticleDOI
TL;DR: This paper presents work on computing shape models that are computationally fast and invariant to basic transformations like translation, scaling and rotation, and proposes shape matching using a feature called the shape context, which is descriptive of the shape of the object.
Abstract: We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by: (1) solving for correspondences between points on the two shapes; (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits, and the COIL data set.
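
The descriptor itself is a log-polar histogram of the positions of all other points relative to a reference point. A compact NumPy sketch for one reference point follows; the 5 x 12 binning is the commonly used configuration, but the normalisation details here are simplified assumptions.

```python
import numpy as np

def shape_context(points, ref, n_r=5, n_theta=12):
    """Log-polar histogram of all points relative to one reference point."""
    d = points - ref
    r = np.hypot(d[:, 0], d[:, 1])
    keep = r > 0                                     # drop the reference point itself
    r, theta = r[keep], np.arctan2(d[keep, 1], d[keep, 0])
    r_bins = np.logspace(np.log10(r.min()), np.log10(r.max()), n_r + 1)
    ri = np.clip(np.searchsorted(r_bins, r) - 1, 0, n_r - 1)
    ti = ((theta + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
    hist = np.zeros((n_r, n_theta))
    np.add.at(hist, (ri, ti), 1)
    return hist / hist.sum()                         # normalised descriptor

# Example: descriptor for the first point of a sampled contour.
# pts = np.random.rand(100, 2); sc = shape_context(pts, pts[0])
```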

6,693 citations


"A Method for Hand Gesture Recogniti..." refers background in this paper

  • ...Contours serve as the basis for a variety of local image descriptors, such as the shape context, in recognition systems [14]....


Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work takes an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem, and generates confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.
Abstract: We propose a new method to quickly and accurately predict 3D positions of body joints from a single depth image, using no temporal information. We take an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem. Our large and highly varied training dataset allows the classifier to estimate body parts invariant to pose, body shape, clothing, etc. Finally we generate confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes. The system runs at 200 frames per second on consumer hardware. Our evaluation shows high accuracy on both synthetic and real test sets, and investigates the effect of several training parameters. We achieve state of the art accuracy in our comparison with related work and demonstrate improved generalization over exact whole-skeleton nearest neighbor matching.

3,579 citations

Journal ArticleDOI
TL;DR: This work takes an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem, and generates confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.
Abstract: We propose a new method to quickly and accurately predict human pose---the 3D positions of body joints---from a single depth image, without depending on information from preceding frames. Our approach is strongly rooted in current object recognition strategies. By designing an intermediate representation in terms of body parts, the difficult pose estimation problem is transformed into a simpler per-pixel classification problem, for which efficient machine learning techniques exist. By using computer graphics to synthesize a very large dataset of training image pairs, one can train a classifier that estimates body part labels from test images invariant to pose, body shape, clothing, and other irrelevances. Finally, we generate confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.The system runs in under 5ms on the Xbox 360. Our evaluation shows high accuracy on both synthetic and real test sets, and investigates the effect of several training parameters. We achieve state-of-the-art accuracy in our comparison with related work and demonstrate improved generalization over exact whole-skeleton nearest neighbor matching.
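
The per-pixel classification in this work rests on simple depth-difference features: two offsets around a pixel are scaled by the inverse depth at that pixel, making the feature depth-invariant, and the depths at the two probed locations are compared. A sketch of one such feature follows; the out-of-image handling and the large "background" constant follow the convention the paper describes, but the exact values and the assumption that (x, y) has valid depth are mine.

```python
import numpy as np

def depth_diff_feature(depth, x, y, u, v, background=1e6):
    """Shotton-style offset-pair feature: probe two offsets scaled by
    1/depth(x, y) and return the difference of the probed depths."""
    h, w = depth.shape
    d = float(depth[y, x])              # assumed valid (> 0), i.e. on the body

    def probe(offset):
        dy, dx = offset
        py, px = int(y + dy / d), int(x + dx / d)
        if 0 <= py < h and 0 <= px < w and depth[py, px] > 0:
            return float(depth[py, px])
        return background               # off-image or invalid probes read as "very far"

    return probe(u) - probe(v)
```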

3,034 citations

Journal ArticleDOI
TL;DR: Two border following algorithms are proposed for the topological analysis of digitized binary images: the first determines the surroundness relations among the borders of a binary image, and the second follows only the outermost borders.
Abstract: Two border following algorithms are proposed for the topological analysis of digitized binary images. The first one determines the surroundness relations among the borders of a binary image. Since the outer borders and the hole borders have a one-to-one correspondence to the connected components of 1-pixels and to the holes, respectively, the proposed algorithm yields a representation of a binary image, from which one can extract some sort of features without reconstructing the image. The second algorithm, which is a modified version of the first, follows only the outermost borders (i.e., the outer borders which are not surrounded by holes). These algorithms can be effectively used in component counting, shrinking, and topological structural analysis of binary images, when a sequential digital computer is used.
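
These are the algorithms behind OpenCV's cv2.findContours (Suzuki-Abe). A small demonstration of the outer-versus-hole distinction using the hierarchy output; the test image is a filled square with a square hole.

```python
import cv2
import numpy as np

# Binary image with one outer border and one hole border.
img = np.zeros((64, 64), np.uint8)
cv2.rectangle(img, (8, 8), (55, 55), 255, -1)   # filled square (1-pixels)
cv2.rectangle(img, (24, 24), (39, 39), 0, -1)   # hole inside it

# RETR_CCOMP organises the borders into outer borders and hole borders.
contours, hierarchy = cv2.findContours(img, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
for i, h in enumerate(hierarchy[0]):            # h = [next, prev, first_child, parent]
    kind = "outer" if h[3] == -1 else "hole"    # no parent -> outermost border
    print(i, kind, cv2.contourArea(contours[i]))
```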

2,303 citations


"A Method for Hand Gesture Recogniti..." refers background in this paper

  • ...It was proposed by Suzuki in 1985 [15]....


Journal ArticleDOI
TL;DR: A survey of the literature on computer vision-based analysis and interpretation of hand gestures for human-computer interaction, organized by the method used for modeling, analyzing, and recognizing gestures, covering both 3D hand models and appearance-based approaches.
Abstract: The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction (HCI). In particular, visual interpretation of hand gestures can help in achieving the ease and naturalness desired for HCI. This has motivated a very active research area concerned with computer vision-based analysis and interpretation of hand gestures. We survey the literature on visual interpretation of hand gestures in the context of its role in HCI. This discussion is organized on the basis of the method used for modeling, analyzing, and recognizing gestures. Important differences in the gesture interpretation approaches arise depending on whether a 3D model of the human hand or an image appearance model of the human hand is used. 3D hand models offer a way of more elaborate modeling of hand gestures but lead to computational hurdles that have not been overcome given the real-time requirements of HCI. Appearance-based models lead to computationally efficient "purposive" approaches that work well under constrained situations but seem to lack the generality desirable for HCI. We also discuss implemented gestural systems as well as other potential applications of vision-based gesture recognition. Although the current progress is encouraging, further theoretical as well as computational advances are needed before gestures can be widely used for HCI. We discuss directions of future research in gesture recognition, including its integration with other natural modes of human-computer interaction.

1,973 citations