Proceedings ArticleDOI

The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression

TL;DR: The Extended Cohn-Kanade (CK+) database is presented, with baseline results from Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier, evaluated with leave-one-out subject cross-validation for both AU and emotion detection on the posed data.
Abstract: In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) While AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed, 2) The lack of a common performance metric against which to evaluate new algorithms, and 3) Standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22% and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition to this, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leave-one-out subject cross-validation for both AU and emotion detection for the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks will be made available July 2010.
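
As an illustration of the evaluation protocol in the abstract, here is a minimal sketch of leave-one-subject-out cross-validation with a linear SVM. It uses scikit-learn, and synthetic placeholder features stand in for the paper's AAM shape and appearance vectors (all names and sizes are illustrative assumptions):

```python
# Sketch only: random features replace AAM features; labels and subject
# ids are synthetic. The protocol mirrors the paper's leave-one-subject-out
# cross-validation with a linear SVM.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))          # placeholder feature vectors
y = rng.integers(0, 7, size=100)        # 7 emotion categories
groups = rng.integers(0, 20, size=100)  # hypothetical subject ids

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = LinearSVC(C=1.0).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
print(f"mean leave-one-subject-out accuracy: {np.mean(scores):.3f}")
```

Grouping folds by subject keeps every sequence of a given person out of the training set when that person is tested, so the classifier cannot exploit identity cues.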

Citations
Proceedings ArticleDOI
07 Mar 2016
TL;DR: OpenFace is the first open source tool capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation and allows for easy integration with other applications and devices through a lightweight messaging system.
Abstract: Over the past few years, there has been an increased interest in automatic facial behavior analysis and understanding. We present OpenFace — an open source tool intended for computer vision and machine learning researchers, affective computing community and people interested in building interactive applications based on facial behavior analysis. OpenFace is the first open source tool capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation. The computer vision algorithms which represent the core of OpenFace demonstrate state-of-the-art results in all of the above mentioned tasks. Furthermore, our tool is capable of real-time performance and is able to run from a simple webcam without any specialist hardware. Finally, OpenFace allows for easy integration with other applications and devices through a lightweight messaging system.

1,151 citations


Cites methods from "The Extended Cohn-Kanade Dataset (C..."

  • ...In order to reduce the feature dimensionality we use a PCA model trained on a number of facial expression datasets: CK+ [38], DISFA [41], AVEC 2011 [52], FERA 2011 [60], and FERA 2015 [59]....

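A minimal sketch of the dimensionality-reduction step quoted above: a single PCA basis is fit on features pooled from several datasets and then used to project new samples. The arrays and sizes below are placeholders, not OpenFace's actual features:

```python
# Sketch only: random arrays stand in for appearance features pooled from
# several expression datasets; one shared PCA basis reduces all of them.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
pooled = np.vstack([rng.normal(size=(n, 512)) for n in (300, 250, 400)])

pca = PCA(n_components=50).fit(pooled)  # one low-dimensional basis
reduced = pca.transform(rng.normal(size=(1, 512)))
print(reduced.shape)                    # (1, 50)
```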

Proceedings ArticleDOI
01 Jul 2017
TL;DR: In this article, the authors proposed a novel deep learning framework that can exploit labeled source data and unlabeled target data to learn informative hash codes, to accurately classify unseen target data.
Abstract: In recent years, deep neural networks have emerged as a dominant machine learning tool for a wide variety of application domains. However, training a deep neural network requires a large amount of labeled data, which is an expensive process in terms of time, labor and human expertise. Domain adaptation or transfer learning algorithms address this challenge by leveraging labeled data in a different, but related source domain, to develop a model for the target domain. Further, the explosive growth of digital data has posed a fundamental challenge concerning its storage and retrieval. Due to its storage and retrieval efficiency, recent years have witnessed a wide application of hashing in a variety of computer vision applications. In this paper, we first introduce a new dataset, Office-Home, to evaluate domain adaptation algorithms. The dataset contains images of a variety of everyday objects from multiple domains. We then propose a novel deep learning framework that can exploit labeled source data and unlabeled target data to learn informative hash codes, to accurately classify unseen target data. To the best of our knowledge, this is the first research effort to exploit the feature learning capabilities of deep neural networks to learn representative hash codes to address the domain adaptation problem. Our extensive empirical studies on multiple transfer tasks corroborate the usefulness of the framework in learning efficient hash codes which outperform existing competitive baselines for unsupervised domain adaptation.

984 citations

Journal ArticleDOI
TL;DR: AffectNet is by far the largest database of facial expression, valence, and arousal in the wild, enabling research in automated facial expression recognition under two different emotion models; various evaluation metrics show that the deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.
Abstract: Automated affective computing in the wild setting is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (aka the categorical model). There are very limited annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal). To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet). AffectNet contains more than 1,000,000 facial images from the Internet by querying three major search engines using 1250 emotion related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild enabling research in automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.
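
The abstract describes baselines for two emotion models: discrete category classification and continuous valence/arousal prediction. Below is a plausible two-headed network sketch for those tasks; the backbone, layer sizes, and head design are assumptions, not AffectNet's published baselines.

```python
# Sketch only: a toy CNN with one head for 7-way expression classification
# and one regressing (valence, arousal); tanh bounds outputs to [-1, 1].
import torch
import torch.nn as nn

class AffectBaseline(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.categorical = nn.Linear(64, 7)  # 7 discrete expressions
        self.dimensional = nn.Linear(64, 2)  # (valence, arousal)

    def forward(self, x):
        h = self.backbone(x)
        return self.categorical(h), torch.tanh(self.dimensional(h))

logits, va = AffectBaseline()(torch.randn(4, 3, 96, 96))
print(logits.shape, va.shape)  # torch.Size([4, 7]) torch.Size([4, 2])
```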

937 citations


Cites background or methods from "The Extended Cohn-Kanade Dataset (C..."

  • ...Early databases of facial expressions such as JAFFE [7], Cohn-Kanade [8], [9], MMI [10], and MultiPie [11] were captured in a lab-controlled environment where the subjects portrayed different facial expressions....

  • ...The proposed AU detection approach was trained on the CK+ [9], DISFA [14], and CFEE [34] databases, and the accuracy of the automatically annotated AUs was reported to be about 80% on the manually annotated set....

  • ...CK+ [9]: frontal and 30-degree images; 123 subjects; controlled recording; posed expressions; 30 AUs; 7 emotion categories...

  • ...Researchers have created databases of human actors/subjects portraying basic emotions [7], [8], [9], [10], [11]....

Proceedings ArticleDOI
07 Mar 2016
TL;DR: A deep neural network architecture is proposed to address the FER problem across multiple well-known standard face datasets, with results comparable to or better than state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.
Abstract: Automated Facial Expression Recognition (FER) has remained a challenging and interesting problem in computer vision. Despite efforts made in developing various methods for FER, existing approaches lack generalizability when applied to unseen images or those that are captured in wild setting (i.e. the results are not significant). Most of the existing approaches are based on engineered features (e.g. HOG, LBPH, and Gabor) where the classifier's hyper-parameters are tuned to give best recognition accuracies across a single database, or a small collection of similar databases. This paper proposes a deep neural network architecture to address the FER problem across multiple well-known standard face datasets. Specifically, our network consists of two convolutional layers each followed by max pooling and then four Inception layers. The network is a single component architecture that takes registered facial images as the input and classifies them into either of the six basic or the neutral expressions. We conducted comprehensive experiments on seven publicly available facial expression databases, viz. MultiPIE, MMI, CK+, DISFA, FERA, SFEW, and FER2013. The results of our proposed architecture are comparable to or better than the state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.
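
The abstract pins down the topology: two convolutional layers, each followed by max pooling, then four Inception layers, ending in a classifier over the six basic expressions plus neutral. A sketch under those constraints, where channel widths and the Inception branch layout are assumptions:

```python
# Sketch only: a toy Inception-style FER network matching the abstract's
# outline (2 x conv+pool, 4 Inception blocks, 7-way classifier).
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        b = out_ch // 4  # equal width per branch
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, b, 1),
            nn.Sequential(nn.Conv2d(in_ch, b, 1), nn.Conv2d(b, b, 3, padding=1)),
            nn.Sequential(nn.Conv2d(in_ch, b, 1), nn.Conv2d(b, b, 5, padding=2)),
            nn.Sequential(nn.MaxPool2d(3, 1, 1), nn.Conv2d(in_ch, b, 1)),
        ])

    def forward(self, x):
        return torch.cat([branch(x) for branch in self.branches], dim=1)

class FERNet(nn.Module):
    def __init__(self, n_classes=7):  # six basic expressions + neutral
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 96, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.inception = nn.Sequential(
            InceptionBlock(96, 128), InceptionBlock(128, 128),
            InceptionBlock(128, 256), InceptionBlock(256, 256),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(256, n_classes))

    def forward(self, x):
        return self.head(self.inception(self.stem(x)))

print(FERNet()(torch.randn(2, 1, 64, 64)).shape)  # torch.Size([2, 7])
```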

816 citations


Cites methods from "The Extended Cohn-Kanade Dataset (C..."

  • ...In [49] multiple features are fused via a Multiple Kernel Learning algorithm and the cross-database experiment is trained on CK+, evaluated on MMI and vice versa....

  • ...We evaluate the proposed method on well-known publicly available facial expression databases: CMU MultiPIE [15], MMI [36], Denver Intensity of Spontaneous Facial Actions (DISFA) [29], extended CK+ [27], GEMEPFERA database [5], SFEW [10], and FER2013 [1]....

  • ...Results in [24] show that the features generated by this “AU-Aware” network are competitive with or superior to handcrafted features such as LBP, SIFT, HoG, and Gabor on the CK+ and MMI databases using a similar SVM....

  • ...MultiPIE, MMI, CK+, DISFA, FERA, SFEW, and FER2013) and obtain results which are significantly better than, or comparable to, traditional convolutional neural networks or other state-of-the-art methods in both accuracy and learning time....

  • ...The reported result in Table 5 is the best result using different SVM kernels, trained on the CK+ database and evaluated on the MMI database....

Journal ArticleDOI
TL;DR: There is an urgent need for research that examines how people actually move their faces to express emotions and other social information in the variety of contexts that make up everyday life, as well as careful study of the mechanisms by which people perceive instances of emotion in one another.
Abstract: It is commonly assumed that a person’s emotional state can be readily inferred from his or her facial movements, typically called emotional expressions or facial expressions. This assumption influences legal judgments, policy decisions, national security protocols, and educational practices; guides the diagnosis and treatment of psychiatric illness, as well as the development of commercial applications; and pervades everyday social interactions as well as research in other scientific fields such as artificial intelligence, neuroscience, and computer vision. In this article, we survey examples of this widespread assumption, which we refer to as the common view, and we then examine the scientific evidence that tests this view, focusing on the six most popular emotion categories used by consumers of emotion research: anger, disgust, fear, happiness, sadness, and surprise. The available scientific evidence suggests that people do sometimes smile when happy, frown when sad, scowl when angry, and so on, as proposed by the common view, more than what would be expected by chance. Yet how people communicate anger, disgust, fear, happiness, sadness, and surprise varies substantially across cultures, situations, and even across people within a single situation. Furthermore, similar configurations of facial movements variably express instances of more than one emotion category. In fact, a given configuration of facial movements, such as a scowl, often communicates something other than an emotional state. Scientists agree that facial movements convey a range of information and are important for social communication, emotional or otherwise. But our review suggests an urgent need for research that examines how people actually move their faces to express emotions and other social information in the variety of contexts that make up everyday life, as well as careful study of the mechanisms by which people perceive instances of emotion in one another. We make specific research recommendations that will yield a more valid picture of how people move their faces to express emotions and how they infer emotional meaning from facial movements in situations of everyday life. This research is crucial to provide consumers of emotion research with the translational information they require.

772 citations

References
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
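
A small usage sketch, assuming LIBSVM's official Python bindings (the libsvm-official package); the data is synthetic, and the options shown map to topics from the article ('-t 0' selects a linear kernel, '-b 1' enables probability estimates):

```python
# Sketch only: a toy two-class problem in LIBSVM's sparse dict format,
# i.e. each sample maps feature indices (1-based) to values.
from libsvm.svmutil import svm_train, svm_predict

x = [{1: 0.0, 2: 0.1}, {1: 0.2, 2: 0.0}, {1: 0.1, 2: 0.3}, {1: 0.2, 2: 0.2},
     {1: 1.0, 2: 1.1}, {1: 0.9, 2: 1.0}, {1: 1.1, 2: 0.9}, {1: 1.0, 2: 1.0}]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

# '-t 0': linear kernel; '-c 1': soft-margin cost C;
# '-b 1': fit a probability model; '-q': quiet training
model = svm_train(y, x, '-t 0 -c 1 -b 1 -q')
labels, accuracy, probabilities = svm_predict(y, x, model, '-b 1')
```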

40,826 citations


"The Extended Cohn-Kanade Dataset (C..." refers methods in this paper

  • ...However, these scores have no real meaning when comparing them from different SVMs....

  • ...SVMs attempt to find the hyperplane that maximizes the margin between positive and negative observations for a specified class....

  • ...We then use support vector machines (SVMs) to classify the facial expressions and emotions....

  • ...LIBSVM was used for the training and testing of SVMs [6]....

  • ...However, when output from both the SPTS and CAPP SVMs are combined (through summing the output probabilities) it can be seen that the detection of this emotion jumps from just over 20% to over 80% as can be seen in Table 7....

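The last excerpt describes a simple late-fusion rule: sum the per-class probabilities from the shape-based (SPTS) and appearance-based (CAPP) SVMs and take the argmax. A sketch with synthetic stand-in features, using scikit-learn's probability-calibrated SVC in place of the paper's exact setup:

```python
# Sketch only: random features stand in for the paper's SPTS (shape) and
# CAPP (appearance) representations; fusion sums the two probability grids.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
y = rng.integers(0, 7, size=120)                         # 7 emotion labels
X_shape = rng.normal(size=(120, 20)) + 0.2 * y[:, None]  # "SPTS"-like
X_app = rng.normal(size=(120, 30)) + 0.2 * y[:, None]    # "CAPP"-like

svm_shape = SVC(kernel='linear', probability=True).fit(X_shape, y)
svm_app = SVC(kernel='linear', probability=True).fit(X_app, y)

fused = svm_shape.predict_proba(X_shape) + svm_app.predict_proba(X_app)
print("fused training accuracy:", (fused.argmax(axis=1) == y).mean())
```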

Book
01 Jan 1981
TL;DR: A standard reference on statistical methods for rates and proportions, covering inference for single proportions, sample-size determination, comparative study designs, regression models for categorical and matched data, and the measurement of interrater agreement.
Abstract: Preface. Preface to the Second Edition. Preface to the First Edition. 1. An Introduction to Applied Probability. 2. Statistical Inference for a Single Proportion. 3. Assessing Significance in a Fourfold Table. 4. Determining Sample Sizes Needed to Detect a Difference Between Two Proportions. 5. How to Randomize. 6. Comparative Studies: Cross-Sectional, Naturalistic, or Multinomial Sampling. 7. Comparative Studies: Prospective and Retrospective Sampling. 8. Randomized Controlled Trials. 9. The Comparison of Proportions from Several Independent Samples. 10. Combining Evidence from Fourfold Tables. 11. Logistic Regression. 12. Poisson Regression. 13. Analysis of Data from Matched Samples. 14. Regression Models for Matched Samples. 15. Analysis of Correlated Binary Data. 16. Missing Data. 17. Misclassification Errors: Effects, Control, and Adjustment. 18. The Measurement of Interrater Agreement. 19. The Standardization of Rates. Appendix A. Numerical Tables. Appendix B. The Basic Theory of Maximum Likelihood Estimation. Appendix C. Answers to Selected Problems. Author Index. Subject Index.

16,435 citations

Journal ArticleDOI

9,528 citations


"The Extended Cohn-Kanade Dataset (C..." refers methods in this paper

  • ...Inter-observer agreement was quantified with coefficient kappa, which is the proportion of agreement above what would be expected to occur by chance [10]....

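The excerpt above uses coefficient kappa, i.e. agreement corrected for what two coders would reach by chance. A minimal sketch with scikit-learn (the toy AU codes are illustrative):

```python
# kappa = (p_observed - p_chance) / (1 - p_chance)
from sklearn.metrics import cohen_kappa_score

coder_a = ['AU12', 'AU6', 'AU12', 'AU4', 'AU6', 'AU12']
coder_b = ['AU12', 'AU6', 'AU4',  'AU4', 'AU6', 'AU12']
print(cohen_kappa_score(coder_a, coder_b))
```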

01 Jan 2008
TL;DR: A simple procedure is proposed, which usually gives reasonable results and is suitable for beginners who are not familiar with SVM.
Abstract: Support vector machine (SVM) is a popular technique for classification. However, beginners who are not familiar with SVM often get unsatisfactory results, since they miss some easy but significant steps. In this guide, we propose a simple procedure which usually gives reasonable results.
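
A sketch of the kind of simple procedure the guide recommends: scale the features, use an RBF kernel, and pick C and gamma by cross-validated grid search. The data and the grid values below are illustrative:

```python
# Sketch only: synthetic data; exponentially spaced C/gamma grids echo the
# guide's coarse grid-search advice.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

pipe = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
grid = {'svc__C': [2.0**k for k in range(-5, 16, 4)],
        'svc__gamma': [2.0**k for k in range(-15, 4, 4)]}
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```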

7,069 citations


"The Extended Cohn-Kanade Dataset (C..." refers methods in this paper

  • ...A linear kernel was used in our experiments due to its ability to generalize well to unseen data in many pattern recognition tasks [13]....

Journal ArticleDOI
Abstract: We describe a new method of matching statistical models of appearance to images. A set of model parameters control modes of shape and gray-level variation learned from a training set. We construct an efficient iterative matching algorithm by learning the relationship between perturbations in the model parameters and the induced image errors.
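
A toy sketch of the matching idea in this abstract: learn a linear map from induced image errors to parameter perturbations offline, then apply it iteratively at match time. The generative model below is a random linear stand-in, not a real appearance model:

```python
# Sketch only: 'render' is a toy linear image model; a regression matrix R
# learned from sampled perturbations drives the iterative matching updates.
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(50, 3))             # stand-in appearance basis
render = lambda p: A @ p                 # "image" from model parameters

p_true = np.array([1.0, -0.5, 0.3])
image = render(p_true)

# Offline: sample perturbations dP, record induced image errors dI,
# and solve dI @ R ~= dP for the update matrix R.
dP = rng.normal(scale=0.1, size=(200, 3))
dI = np.stack([render(p_true + d) - image for d in dP])
R, *_ = np.linalg.lstsq(dI, dP, rcond=None)

# Online: iteratively correct a poor initial estimate.
p = np.zeros(3)
for _ in range(10):
    p = p - (render(p) - image) @ R
print(np.round(p, 3))                    # approaches p_true
```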

6,200 citations