Author

Ali Mollahosseini

Other affiliations: University of the Algarve
Bio: Ali Mollahosseini is an academic researcher from the University of Denver. The author has contributed to research in the topics of Facial expression and Social robot. The author has an h-index of 12 and has co-authored 23 publications receiving 1,651 citations. Previous affiliations of Ali Mollahosseini include the University of the Algarve.

Papers
Journal Article
TL;DR: AffectNet is by far the largest in-the-wild database of facial expression, valence, and arousal, enabling research on automated facial expression recognition under two different emotion models; various evaluation metrics show that the deep neural network baselines perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.
Abstract: Automated affective computing in the wild setting is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (aka the categorical model). There are very limited annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal). To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet). AffectNet contains more than 1,000,000 facial images collected from the Internet by querying three major search engines using 1,250 emotion-related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild, enabling research in automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.

937 citations
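The abstract above describes two baseline networks: a discrete-expression classifier for the categorical model and a valence/arousal regressor for the dimensional model. A minimal sketch of that two-task setup follows; the backbone depth, widths, and head names are illustrative assumptions, not the authors' released architecture.

```python
# Sketch of the two baseline tasks: discrete-expression classification
# (categorical model) and valence/arousal regression (dimensional model).
# Backbone depth, widths, and head names are illustrative assumptions.
import torch
import torch.nn as nn

class AffectNetBaseline(nn.Module):
    def __init__(self, num_expressions: int = 7):
        super().__init__()
        # Small convolutional backbone (stand-in for the paper's deeper CNNs).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Categorical model: one logit per discrete expression.
        self.expression_head = nn.Linear(64, num_expressions)
        # Dimensional model: valence and arousal, bounded to [-1, 1].
        self.va_head = nn.Sequential(nn.Linear(64, 2), nn.Tanh())

    def forward(self, x):
        features = self.backbone(x)
        return self.expression_head(features), self.va_head(features)

model = AffectNetBaseline()
logits, valence_arousal = model(torch.randn(1, 3, 224, 224))
```

A shared backbone with a Tanh-bounded regression head is one common way to keep valence and arousal predictions in [-1, 1]; the paper's baselines may differ.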

Proceedings Article
07 Mar 2016
TL;DR: A deep neural network architecture is proposed to address the FER problem across multiple well-known standard face datasets; its results are comparable to or better than the state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.
Abstract: Automated Facial Expression Recognition (FER) has remained a challenging and interesting problem in computer vision. Despite the effort made in developing various methods for FER, existing approaches lack generalizability when applied to unseen images or those that are captured in a wild setting (i.e., the results are not significant). Most of the existing approaches are based on engineered features (e.g., HOG, LBPH, and Gabor) where the classifier's hyper-parameters are tuned to give the best recognition accuracy across a single database or a small collection of similar databases. This paper proposes a deep neural network architecture to address the FER problem across multiple well-known standard face datasets. Specifically, our network consists of two convolutional layers, each followed by max pooling, and then four Inception layers. The network is a single-component architecture that takes registered facial images as the input and classifies them into one of the six basic expressions or the neutral expression. We conducted comprehensive experiments on seven publicly available facial expression databases, viz. MultiPIE, MMI, CK+, DISFA, FERA, SFEW, and FER2013. The results of our proposed architecture are comparable to or better than the state-of-the-art methods and better than traditional convolutional neural networks in both accuracy and training time.

816 citations
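The abstract states the topology concretely: two convolutional layers, each followed by max pooling, then four Inception layers, ending in a 7-way output (six basic expressions plus neutral). A hedged PyTorch sketch of that shape follows; channel widths, kernel sizes, and the grayscale 96x96 input are assumptions.

```python
# Sketch of the stated topology: two conv+pool stages, four Inception-style
# blocks, and a 7-way classifier. Widths, kernel sizes, and the grayscale
# input resolution are assumptions.
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 convolutions, concatenated along channels."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, kernel_size=5, padding=2)

    def forward(self, x):
        return torch.relu(torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1))

net = nn.Sequential(
    nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 96, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    InceptionBlock(96, 64),    # four Inception layers, as in the abstract
    InceptionBlock(192, 64),
    InceptionBlock(192, 64),
    InceptionBlock(192, 64),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(192, 7),         # six basic expressions + neutral
)
logits = net(torch.randn(1, 1, 96, 96))
```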

Journal Article
TL;DR: In this paper, the authors collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet), which contains more than 1,000,000 facial images from the Internet, collected by querying three major search engines using 1,250 emotion-related keywords in six different languages.
Abstract: Automated affective computing in the wild setting is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (aka the categorical model). There are very limited annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal). To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild (called AffectNet). AffectNet contains more than 1,000,000 facial images collected from the Internet by querying three major search engines using 1,250 emotion-related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild, enabling research in automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines can perform better than conventional machine learning methods and off-the-shelf facial expression recognition systems.

432 citations

Proceedings Article
11 May 2016
TL;DR: The results of a new study on collecting, annotating, and analyzing wild facial expressions from the web show that deep neural networks can recognize wild facial expressions with an accuracy of 82.12%.
Abstract: Recognizing facial expressions in a wild setting has remained a challenging task in computer vision. The World Wide Web is a good source of facial images, most of which are captured in uncontrolled conditions. In fact, the Internet is a World Wild Web of facial images with expressions. This paper presents the results of a new study on collecting, annotating, and analyzing wild facial expressions from the web. Three search engines were queried using 1,250 emotion-related keywords in six different languages, and the retrieved images were mapped by two annotators to six basic expressions and neutral. Deep neural networks and noise modeling were used in three different training scenarios to find out how accurately facial expressions can be recognized when trained on noisy images collected from the web using query terms (e.g., happy face, laughing man, etc.). The results of our experiments show that deep neural networks can recognize wild facial expressions with an accuracy of 82.12%.

81 citations
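The "noise modeling" the abstract mentions can be illustrated with a standard loss-level noise-transition approach: pass the classifier's softmax output through an estimated label-noise matrix before computing the loss. This is a common technique and only a sketch under assumed flip rates; the paper's exact noise model is not reproduced here.

```python
# Loss-level noise modeling (a common approach, not necessarily the paper's):
# the clean-label softmax is pushed through a row-stochastic transition
# matrix T, where T[i, j] = P(observed web label j | true label i).
import torch
import torch.nn.functional as F

def noisy_label_loss(logits, noisy_targets, T):
    clean_probs = F.softmax(logits, dim=1)  # model's belief over true labels
    noisy_probs = clean_probs @ T           # implied distribution over observed labels
    return F.nll_loss(torch.log(noisy_probs + 1e-8), noisy_targets)

C = 7                                       # six basic expressions + neutral
flip = 0.02                                 # assumed per-class flip rate
T = torch.full((C, C), flip)
T.fill_diagonal_(1.0 - flip * (C - 1))      # mostly-correct web labels
loss = noisy_label_loss(torch.randn(4, C), torch.randint(0, C, (4,)), T)
```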

Journal Article
TL;DR: The experiments show that using the magnitude and phase of shearlet coefficients as extra inputs to the network improves detection accuracy and generalizes better than state-of-the-art methods that rely on hand-crafted features.
Abstract: Cancer is the second leading cause of death in the US after cardiovascular disease. Image-based computer-aided diagnosis can assist physicians in efficiently diagnosing cancers at early stages. Existing computer-aided algorithms use hand-crafted features such as wavelet coefficients, co-occurrence matrix features, and, recently, the histogram of shearlet coefficients for classification of cancerous tissues and cells in images. These hand-crafted features often lack generalizability since every cancerous tissue and cell has a specific texture, structure, and shape. An alternative approach is to use convolutional neural networks (CNNs) to learn the most appropriate feature abstractions directly from the data, sidestepping the limitations of hand-crafted features. We present a framework for breast cancer detection and prostate Gleason grading using a CNN trained on images along with the magnitude and phase of shearlet coefficients. In particular, we apply the shearlet transform to images and extract the magnitude and phase of the shearlet coefficients. We then feed these shearlet features, along with the original images, to our CNN, which consists of multiple layers of convolution, max pooling, and fully connected layers. Our experiments show that using the magnitude and phase of shearlet coefficients as extra information to the network improves detection accuracy and generalizes better than state-of-the-art methods that rely on hand-crafted features. This study expands the application of deep neural networks into the field of medical image analysis, a difficult domain given the limited medical data available for such analysis.

78 citations
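The input scheme this abstract describes, feeding the image together with the magnitude and phase of its shearlet coefficients, amounts to channel concatenation before the CNN. A minimal sketch follows, assuming the shearlet coefficients are precomputed by an external library and that the subband count K and network widths are illustrative.

```python
# Channel-concatenation sketch: image + shearlet magnitude + shearlet phase
# feed one CNN. The shearlet transform itself is assumed to come from an
# external library; K (subband count) and the widths are illustrative.
import torch
import torch.nn as nn

def with_shearlet_channels(image, coeffs):
    """image: (B, 1, H, W); coeffs: (B, K, H, W) complex shearlet coefficients."""
    magnitude = coeffs.abs()
    phase = torch.angle(coeffs)
    return torch.cat([image, magnitude, phase], dim=1)  # (B, 1 + 2K, H, W)

K = 8
cnn = nn.Sequential(
    nn.Conv2d(1 + 2 * K, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),                                   # e.g., benign vs. malignant
)
x = with_shearlet_channels(torch.randn(2, 1, 64, 64),
                           torch.randn(2, K, 64, 64, dtype=torch.cfloat))
logits = cnn(x)
```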


Cited by
Journal Article
TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.

8,730 citations

Proceedings Article
01 Jul 2017
TL;DR: A new DLP-CNN (Deep Locality-Preserving CNN) method, which aims to enhance the discriminative power of deep features by preserving the locality closeness while maximizing the inter-class scatters, is proposed.
Abstract: Past research on facial expressions has used relatively limited datasets, which makes it unclear whether current methods can be employed in the real world. In this paper, we present a novel database, RAF-DB, which contains about 30,000 facial images from thousands of individuals. Each image has been independently labeled about 40 times, and an EM algorithm was then used to filter out unreliable labels. Crowdsourcing reveals that real-world faces often express compound emotions, or even mixtures of them. To the best of our knowledge, RAF-DB is the first database that contains compound expressions in the wild. Our cross-database study shows that the action units of basic emotions in RAF-DB are much more diverse than, or even deviate from, those of lab-controlled databases. To address this problem, we propose a new DLP-CNN (Deep Locality-Preserving CNN) method, which aims to enhance the discriminative power of deep features by preserving locality closeness while maximizing inter-class scatter. Benchmark experiments on the 7-class basic expressions and 11-class compound expressions, as well as additional experiments on the SFEW and CK+ databases, show that the proposed DLP-CNN outperforms state-of-the-art handcrafted features and deep learning-based methods for expression recognition in the wild.

746 citations
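The locality-preserving idea in the abstract, pulling each deep feature toward nearby same-class features while the softmax term separates classes, can be sketched as a batch-level auxiliary loss. The in-batch k-nearest-neighbor search, the value of k, and the 0.5 weight below are assumptions, not the authors' exact formulation.

```python
# Batch-level locality-preserving term: pull each feature toward the mean
# of its k nearest same-class neighbors; cross-entropy handles separation.
# In-batch neighbor search, k, and the 0.5 weight are assumptions.
import torch
import torch.nn.functional as F

def locality_preserving_loss(features, labels, k=2):
    loss, groups = features.new_zeros(()), 0
    for c in labels.unique():
        group = features[labels == c]                  # same-class features
        if group.size(0) <= k:                         # not enough neighbors
            continue
        dists = torch.cdist(group, group)              # pairwise L2 distances
        idx = dists.topk(k + 1, largest=False).indices[:, 1:]  # drop self
        neighbor_mean = group[idx].mean(dim=1)         # mean of k nearest
        loss = loss + ((group - neighbor_mean) ** 2).sum(dim=1).mean()
        groups += 1
    return loss / max(groups, 1)

feats = torch.randn(32, 64)                            # deep features
labels = torch.randint(0, 7, (32,))
logits = torch.randn(32, 7)
total = F.cross_entropy(logits, labels) + 0.5 * locality_preserving_loss(feats, labels)
```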