Author

Shangfei Wang

Bio: Shangfei Wang is an academic researcher from the University of Science and Technology of China. The author has contributed to research in topics including Facial recognition system and Facial expression. The author has an h-index of 24 and has co-authored 138 publications receiving 2431 citations. Previous affiliations of Shangfei Wang include Kyushu University and China University of Science and Technology.


Papers
Journal Article
TL;DR: A natural visible and infrared facial expression database, which contains both spontaneous and posed expressions of more than 100 subjects, recorded simultaneously by a visible and an infrared thermal camera, with illumination provided from three different directions is proposed.
Abstract: To date, most facial expression analysis has been based on visible and posed expression databases. Visible images, however, are easily affected by illumination variations, while posed expressions differ in appearance and timing from natural ones. In this paper, we propose and establish a natural visible and infrared facial expression database, which contains both spontaneous and posed expressions of more than 100 subjects, recorded simultaneously by a visible and an infrared thermal camera, with illumination provided from three different directions. The posed database includes apex expression images with and without glasses. As an elementary assessment of the usability of our spontaneous database for expression recognition and emotion inference, we conduct visible facial expression recognition using four typical methods: the eigenface approach [principal component analysis (PCA)], the fisherface approach [PCA + linear discriminant analysis (LDA)], the Active Appearance Model (AAM), and AAM combined with LDA. We also use PCA and PCA + LDA to recognize expressions from infrared thermal images. In addition, we analyze the relationship between facial temperature and emotion through statistical analysis. Our database is available for research purposes.

340 citations
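As a rough illustration of the two linear baselines evaluated on this database, the sketch below builds eigenface (PCA) and fisherface (PCA + LDA) pipelines with scikit-learn. The synthetic faces, image size, and component counts are placeholder assumptions; the database itself must be obtained separately.

```python
# Minimal sketch of the eigenface (PCA) and fisherface (PCA + LDA)
# baselines named in the abstract. Data shapes and component counts
# are illustrative assumptions, not the paper's protocol.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, h, w, n_classes = 300, 32, 32, 6      # placeholder face crops
X = rng.normal(size=(n_samples, h * w))          # flattened grayscale faces
y = rng.integers(0, n_classes, size=n_samples)   # expression labels

# Eigenface: project onto principal components, classify by nearest neighbor.
eigenface = make_pipeline(PCA(n_components=50), KNeighborsClassifier(n_neighbors=3))

# Fisherface: PCA for dimensionality reduction, then LDA for class separation.
fisherface = make_pipeline(PCA(n_components=50), LinearDiscriminantAnalysis())

for name, model in [("eigenface", eigenface), ("fisherface", fisherface)]:
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

The fisherface variant typically separates expression classes better than raw PCA, since LDA maximizes between-class scatter within the PCA subspace.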

Journal Article
TL;DR: A unified probabilistic framework based on the dynamic Bayesian network is introduced to simultaneously and coherently represent facial activity at different levels, the interactions among those levels, and their observations.
Abstract: The tracking and recognition of facial activities from images or videos have attracted great attention in the computer vision field. Facial activities are characterized at three levels. First, at the bottom level, facial feature points around each facial component, e.g., the eyebrows and mouth, capture detailed face shape information. Second, at the middle level, facial action units, defined in the facial action coding system, represent the contraction of specific sets of facial muscles, e.g., the lid tightener and eyebrow raiser. Finally, at the top level, six prototypical facial expressions represent global facial muscle movement and are commonly used to describe human emotional states. In contrast to mainstream approaches, which usually focus on only one or two levels of facial activities and track (or recognize) them separately, this paper introduces a unified probabilistic framework based on the dynamic Bayesian network to simultaneously and coherently represent facial activity at different levels, the interactions among those levels, and their observations. Advanced machine learning methods are introduced to learn the model from both training data and subjective prior knowledge. Given the model and measurements of facial motion, all three levels of facial activities are simultaneously recognized through probabilistic inference. Extensive experiments illustrate the feasibility and effectiveness of the proposed model on all three levels of facial activities.

188 citations
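To make the idea of coherent cross-level inference concrete, here is a minimal sketch, not the authors' dynamic Bayesian network, of a two-level model in which a top-level expression variable generates mid-level AU activations and the expression posterior is recovered from noisy AU measurements by Bayes' rule. All probability tables are invented for illustration.

```python
# Toy two-level probabilistic inference: expression -> AU activations.
# All tables below are made-up numbers for illustration only.
import numpy as np

expressions = ["happy", "sad", "surprise"]
prior = np.array([1/3, 1/3, 1/3])                 # P(expression)

# P(AU_j = active | expression); one row per expression, one column per AU
# (e.g., AU6 cheek raiser, AU12 lip corner puller, AU1 inner brow raiser).
p_au_given_expr = np.array([
    [0.9, 0.9, 0.1],   # happy
    [0.2, 0.1, 0.7],   # sad
    [0.1, 0.2, 0.9],   # surprise
])

observed_aus = np.array([1, 1, 0])                # measured AU activations

# Likelihood of the observation under each expression, treating AUs as
# conditionally independent given the expression (a naive two-level net).
lik = np.prod(np.where(observed_aus == 1, p_au_given_expr, 1 - p_au_given_expr), axis=1)

posterior = prior * lik
posterior /= posterior.sum()
for name, p in zip(expressions, posterior):
    print(f"P({name} | AUs) = {p:.3f}")
```

The paper's full model additionally links AUs to bottom-level feature points and carries these dependencies across time, so inference flows both bottom-up and top-down.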

Proceedings Article
23 Jun 2013
TL;DR: The Interval Temporal Bayesian Network is proposed to capture complex temporal relations among primitive facial events for facial expression modeling and recognition; experiments demonstrate the feasibility of recognizing facial expressions based purely on spatio-temporal relations among facial muscles.
Abstract: Spatio-temporal relations among facial muscles carry crucial information about facial expressions, yet they have not been thoroughly exploited. One contributing factor is the limited ability of current dynamic models to capture complex spatial and temporal relations: existing dynamic models can only capture simple local temporal relations among sequential events, or lack the ability to incorporate uncertainty. To overcome these limitations and take full advantage of the spatio-temporal information, we propose to model a facial expression as a complex activity that consists of temporally overlapping or sequential primitive facial events. We further propose the Interval Temporal Bayesian Network to capture these complex temporal relations among primitive facial events for facial expression modeling and recognition. Experimental results on benchmark databases demonstrate the feasibility of the proposed approach in recognizing facial expressions based purely on spatio-temporal relations among facial muscles, as well as its advantage over existing methods.

166 citations
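The key ingredient the ITBN adds over sequential models is reasoning about relations between event intervals rather than time points. The sketch below classifies the Allen-style temporal relation between two primitive facial events; the event timings in the example are hypothetical.

```python
# Classify the Allen temporal relation between two event intervals,
# the kind of relation an interval temporal model reasons over.
from typing import Tuple

def interval_relation(a: Tuple[float, float], b: Tuple[float, float]) -> str:
    """Return the Allen relation of interval a with respect to interval b."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"

# e.g., a brow raise that overlaps a subsequent mouth opening:
print(interval_relation((0.2, 1.0), (0.6, 1.5)))   # -> overlaps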

Journal Article
TL;DR: A general framework for video affective content analysis is proposed, which includes video content, emotional descriptors, and users' spontaneous nonverbal responses, as well as the relationships among the three.
Abstract: Video affective content analysis has been an active research area in recent decades, since emotion is an important component in the classification and retrieval of videos. Video affective content analysis can be divided into two approaches: direct and implicit. Direct approaches infer the affective content of videos directly from related audiovisual features. Implicit approaches, on the other hand, detect affective content from videos based on an automatic analysis of a user's spontaneous response while consuming the videos. This paper first proposes a general framework for video affective content analysis, which includes video content, emotional descriptors, and users' spontaneous nonverbal responses, as well as the relationships among the three. Then, we survey current research in both direct and implicit video affective content analysis, with a focus on direct video affective content analysis. Lastly, we identify several challenges in this field and put forward recommendations for future research.

158 citations
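A direct approach in this taxonomy reduces to learning a mapping from per-clip audiovisual features to emotion labels. The following sketch assumes synthetic placeholder features (e.g., motion magnitude, audio energy) and an SVM classifier; it illustrates the pipeline shape, not any specific surveyed system.

```python
# Minimal sketch of a "direct" video affective content pipeline:
# per-clip audiovisual features -> emotion label. Features and data
# are synthetic placeholders for illustration.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Per-clip features: e.g., mean motion magnitude, shot-cut rate,
# audio energy, pitch variance, color warmth (all placeholders).
X = rng.normal(size=(400, 5))
y = rng.integers(0, 3, size=400)   # e.g., 0=neutral, 1=positive, 2=negative

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")
```

An implicit approach would keep the same classifier stage but swap the inputs for features of the viewer's spontaneous response (e.g., facial or physiological signals) recorded while the clip plays.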

Proceedings Article
01 Dec 2013
TL;DR: A hierarchical model is built that combines bottom-level image features and top-level AU relationships to jointly recognize AUs in a principled manner, capturing global AU dependencies as well as related factors such as facial expression to more accurately characterize the AU relationships.
Abstract: In this paper we tackle the problem of facial action unit (AU) recognition by exploiting the complex semantic relationships among AUs, which carry crucial top-down information yet have not been thoroughly exploited. Towards this goal, we build a hierarchical model that combines bottom-level image features and top-level AU relationships to jointly recognize AUs in a principled manner. The proposed model has two major advantages over existing methods. 1) Unlike methods that can only capture local pairwise AU dependencies, our model is developed upon the restricted Boltzmann machine and can therefore exploit global relationships among AUs. 2) Although AU relationships are influenced by many related factors, such as facial expressions, these factors are generally ignored by current methods. Our model, however, can successfully capture them to more accurately characterize the AU relationships. Efficient learning and inference algorithms for the proposed model are also developed. Experimental results on benchmark databases demonstrate the effectiveness of the proposed approach in modeling complex AU relationships as well as its superior AU recognition performance over existing approaches.

130 citations
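The global-relationship claim rests on the restricted Boltzmann machine, whose hidden units couple all visible AU labels at once rather than pairwise. Below is a minimal numpy sketch of the RBM building block with one block-Gibbs sweep; the weights are random placeholders and the paper's full model and training procedure are not reproduced.

```python
# Minimal RBM building block: binary visible units (AU labels) coupled
# to hidden units that capture global AU co-occurrence patterns.
# Weights are random placeholders, for illustration only.
import numpy as np

rng = np.random.default_rng(2)
n_visible, n_hidden = 12, 6            # e.g., 12 AUs, 6 latent factors
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b = np.zeros(n_visible)                # visible biases
c = np.zeros(n_hidden)                 # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v):
    """One block-Gibbs sweep: sample hidden given visible, then reconstruct."""
    p_h = sigmoid(c + v @ W)           # P(h=1 | v): every AU influences every h
    h = (rng.random(n_hidden) < p_h).astype(float)
    p_v = sigmoid(b + h @ W.T)         # P(v=1 | h): hiddens jointly shape all AUs
    v_new = (rng.random(n_visible) < p_v).astype(float)
    return v_new, p_v

v0 = (rng.random(n_visible) < 0.3).astype(float)   # a hypothetical AU vector
v1, p_v = gibbs_step(v0)
print("reconstruction probabilities:", np.round(p_v, 2))
```

Because each hidden unit connects to every visible unit, a single latent factor can encode a whole-face pattern (e.g., an expression-like constellation of AUs), which is what distinguishes this from models limited to local pairwise dependencies.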