Author

Sergio Escalera

Bio: Sergio Escalera is an academic researcher from the University of Barcelona. The author has contributed to research in topics: Gesture recognition & Deep learning. The author has an h-index of 48 and has co-authored 371 publications receiving 8,105 citations. Previous affiliations of Sergio Escalera include the Autonomous University of Barcelona and the University of Delaware.


Papers
Posted Content
TL;DR: Facial expressions are an important way through which humans interact socially; this paper defines a taxonomy of automatic facial expression analysis methods and argues that much research is still needed on how such expressions relate to human affect.
Abstract: Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging, and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state-of-the-art methods accordingly. We also present the important datasets and the benchmarking of the most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.
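For orientation, here is a minimal sketch of the generic pipeline the survey's taxonomy spans, from face detection to expression classification on each detected face. The Haar-cascade detector and the placeholder `classify` function are illustrative assumptions, not methods taken from the paper.

```python
import cv2

# Sketch of the face detection -> feature extraction -> expression
# classification pipeline, assuming a single RGB (BGR in OpenCV) image.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def recognize_expressions(image_bgr, classify):
    """Detect faces and run an expression classifier on each face crop."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    results = []
    for (x, y, w, h) in faces:
        crop = gray[y:y + h, x:x + w]  # face region of interest
        results.append(((x, y, w, h), classify(crop)))
    return results

# `classify` stands in for any recognition model covered by the taxonomy,
# e.g. lambda crop: "neutral" as a trivial placeholder.
```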

340 citations

Journal ArticleDOI
TL;DR: A taxonomy is presented that embeds all binary and ternary ECOC decoding strategies into four groups and shows that the zero symbol introduces two kinds of biases that require redefinition of the decoding design.
Abstract: A common way to model multiclass classification problems is to design a set of binary classifiers and to combine them. Error-correcting output codes (ECOC) represent a successful framework to deal with this type of problem. Recent works in the ECOC framework showed significant performance improvements by means of new problem-dependent designs based on the ternary ECOC framework. The ternary framework contains a larger set of binary problems because of the use of a "do not care" symbol that allows a given classifier to ignore some classes. However, there are no proper studies that analyze the effect of the new symbol at the decoding step. In this paper, we present a taxonomy that embeds all binary and ternary ECOC decoding strategies into four groups. We show that the zero symbol introduces two kinds of biases that require redefinition of the decoding design. A new type of decoding measure is proposed, and two novel decoding strategies are defined. We evaluate the state-of-the-art coding and decoding strategies over a set of UCI machine learning repository data sets and on a real traffic sign categorization problem. The experimental results show that, following the new decoding strategies, the performance of the ECOC design is significantly improved.
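To make the coding/decoding idea concrete, here is a minimal sketch of ternary ECOC decoding with a Hamming-style distance that simply skips the zero ("do not care") positions. The coding matrix and class layout are hypothetical, and this naive distance is the kind of measure whose biases the paper analyzes, not one of its proposed strategies.

```python
import numpy as np

# Rows of the coding matrix are class codewords; columns are binary
# problems. +1/-1 assign a class to a side of a binary problem; 0 means
# the class is ignored by that classifier ("do not care").
M = np.array([
    [+1, +1,  0],   # class 0
    [-1,  0, +1],   # class 1
    [ 0, -1, -1],   # class 2
])

def decode(predictions, M):
    """Return the class whose codeword is closest to the predictions.

    `predictions` is the vector of +1/-1 outputs of the binary
    classifiers. Zero positions in a codeword are skipped, which is the
    naive choice; the paper shows such measures are biased because
    codewords differ in their number of active bits.
    """
    distances = []
    for codeword in M:
        active = codeword != 0  # ignore "do not care" positions
        distances.append(np.sum(predictions[active] != codeword[active]))
    return int(np.argmin(distances))

# Example: the three binary classifiers output -1, +1, +1 for a sample.
print(decode(np.array([-1, +1, +1]), M))  # -> 1 (matches class 1's active bits)
```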

273 citations

Journal ArticleDOI
TL;DR: A detailed overview of recent advances in RGB-D-based motion recognition is presented in this paper, where the reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based and RGB+D-based.

270 citations

Journal ArticleDOI
TL;DR: In this paper, the authors present a comprehensive survey of body gesture recognition methods, discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition, and define a complete framework for automatic emotional body gesture recognition.
Abstract: Automatic emotion recognition has become a trending research topic in the past decade. While works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. We present a new comprehensive survey hoping to boost research in the field. We first introduce emotional body gestures as a component of what is commonly known as "body language" and comment on general aspects such as gender differences and culture dependence. We then define a complete framework for automatic emotional body gesture recognition. We introduce person detection and comment on static and dynamic body pose estimation methods both in RGB and 3D. We then review the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. We also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. While pre-processing methodologies (e.g., human detection and pose estimation) are nowadays mature technologies fully developed for robust large-scale analysis, we show that for emotion recognition the quantity of labelled data is scarce. There is no agreement on clearly defined output spaces, and the representations are shallow and largely based on naive geometrical representations.
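As an illustration of the kind of "naive geometrical representation" the survey refers to, here is a minimal sketch of hand-crafted features from 2-D skeleton keypoints. The 5-joint layout is hypothetical; in practice a pose estimator would supply the keypoints.

```python
import numpy as np

# Hypothetical 5-joint layout for one frame of a pose sequence.
HEAD, L_SHOULDER, R_SHOULDER, L_HAND, R_HAND = range(5)

def geometric_features(joints):
    """joints: (5, 2) array of (x, y) keypoints for one frame.

    Returns simple inter-joint distances normalized by shoulder width,
    i.e. the shallow geometric features the paper argues the field
    still largely relies on.
    """
    d_hands = np.linalg.norm(joints[L_HAND] - joints[R_HAND])
    d_head_lhand = np.linalg.norm(joints[HEAD] - joints[L_HAND])
    d_head_rhand = np.linalg.norm(joints[HEAD] - joints[R_HAND])
    shoulder_width = np.linalg.norm(joints[L_SHOULDER] - joints[R_SHOULDER])
    # Normalizing by shoulder width makes the features scale-invariant.
    return np.array([d_hands, d_head_lhand, d_head_rhand]) / shoulder_width
```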

256 citations

Book ChapterDOI
06 Sep 2014
TL;DR: In this edition of the ChaLearn challenge, two large novel data sets were made publicly available, and the Microsoft CodaLab platform was used to manage the competition.
Abstract: This paper summarizes the ChaLearn Looking at People 2014 challenge data and the results obtained by the participants. The competition was split into three independent tracks: human pose recovery from RGB data, action and interaction recognition from RGB data sequences, and multi-modal gesture recognition from RGB-Depth sequences. For all the tracks, the goal was to perform user-independent recognition in sequences of continuous images using the overlapping Jaccard index as the evaluation measure. In this edition of the ChaLearn challenge, two large novel data sets were made publicly available, and the Microsoft CodaLab platform was used to manage the competition. Outstanding results were achieved in the three challenge tracks, with accuracy results of 0.20, 0.50, and 0.85 for pose recovery, action/interaction recognition, and multi-modal gesture recognition, respectively.
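As a reference point, the overlapping Jaccard index used for evaluation can be computed for a pair of temporal intervals as follows; the frame ranges in the example are hypothetical, and the challenge aggregates such per-instance scores over all gesture instances and sequences.

```python
# Overlapping Jaccard index for 1-D temporal intervals (frame ranges).
def jaccard(gt, pred):
    """Overlap between ground-truth and predicted intervals [start, end].

    Returns |intersection| / |union|; 0 when the intervals are disjoint.
    Endpoints are inclusive frame indices.
    """
    inter = max(0, min(gt[1], pred[1]) - max(gt[0], pred[0]) + 1)
    union = (gt[1] - gt[0] + 1) + (pred[1] - pred[0] + 1) - inter
    return inter / union

# Ground truth: a gesture spans frames 10-29; the prediction covers 15-34.
print(jaccard((10, 29), (15, 34)))  # intersection 15, union 25 -> 0.6
```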

221 citations

