Open Access Journal Article (DOI)

A review of affective computing

TLDR
This first-of-its-kind, comprehensive literature review of the diverse field of affective computing focuses mainly on the use of audio, visual and text information for multimodal affect analysis, and outlines existing methods for fusing information from different modalities.
About
This article was published in Information Fusion on 2017-09-01 and is currently open access. It has received 969 citations to date. The article focuses on the topics: Affective computing & Modality (human–computer interaction).


Citations
Book Chapter (DOI)

Bibliography

Book Chapter (DOI)

Authors’ Biographies & Index

Book Chapter (DOI)

A Smart System for Assessment of Mental Health Using Explainable AI Approach

Ahmad Tanveer
TL;DR: In this article, an attempt has been made to facilitate the diagnosis process by assessing patients' mental states (angry, fear, happy, sad, and neutral) using widely used ML algorithms.
Proceedings Article (DOI)

A Review of Personalized Health Navigation for Drivers

TL;DR: In this article, a cybernetic-based personalized health navigation framework for drivers (PHN-D) is proposed, which provides a new paradigm in the field of driver health.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: State-of-the-art image classification performance was achieved by a deep convolutional neural network (DCNN), as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax.
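A minimal PyTorch sketch of an architecture in this style (layer sizes follow the original AlexNet description; this is an illustrative reconstruction, not the authors' released implementation):

```python
import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    """Five convolutional layers (some followed by max-pooling) and three
    fully connected layers ending in a 1000-way classification output."""
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),  # logits; the softmax is applied inside the loss
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

logits = AlexNetSketch()(torch.randn(1, 3, 224, 224))  # -> shape (1, 1000)
```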
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured on a word-similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
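A toy sketch of training such word vectors with the gensim library (the corpus and hyperparameters below are placeholders, and gensim is used here only for illustration, not the paper's original tooling):

```python
from gensim.models import Word2Vec

# Tiny placeholder corpus: each "document" is a list of tokens.
sentences = [
    ["the", "movie", "was", "wonderful", "and", "uplifting"],
    ["the", "film", "was", "boring", "and", "disappointing"],
    ["a", "wonderful", "uplifting", "film"],
]

# sg=1 selects the skip-gram architecture; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=200, seed=1)

vector = model.wv["wonderful"]             # 50-dimensional word embedding
print(model.wv.most_similar("wonderful"))  # nearest neighbours in the learned space
```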
Journal Article (DOI)

A fast learning algorithm for deep belief nets

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
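The greedy, layer-by-layer idea can be sketched with scikit-learn's BernoulliRBM, training each restricted Boltzmann machine on the hidden activities of the layer below; this simplified sketch omits the paper's top-level associative memory and fine-tuning:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
X = (rng.random((500, 64)) > 0.5).astype(float)   # placeholder binary data

# Greedy layer-wise training: fit the first RBM on the data, then fit the
# second RBM on the first layer's hidden-unit probabilities.
rbm1 = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0).fit(X)
h1 = rbm1.transform(X)
rbm2 = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20, random_state=0).fit(h1)
h2 = rbm2.transform(h1)                            # representation from the stacked model
print(h2.shape)                                    # (500, 16)
```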
Proceedings Article (DOI)

Convolutional Neural Networks for Sentence Classification

TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification; a simple modification to the architecture allows the use of both task-specific and static vectors.
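A condensed PyTorch sketch of a CNN sentence classifier in this spirit, with one frozen ("static") embedding channel and one trainable ("task-specific") channel; vocabulary size, filter widths, and dimensions are illustrative rather than taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoChannelTextCNN(nn.Module):
    """CNN sentence classifier with a frozen (static) and a trainable
    (task-specific) embedding channel and max-over-time pooled convolutions."""
    def __init__(self, vocab_size=10000, emb_dim=100, num_classes=2,
                 filter_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.static_emb = nn.Embedding(vocab_size, emb_dim)   # e.g. pre-trained vectors, kept fixed
        self.static_emb.weight.requires_grad = False
        self.tuned_emb = nn.Embedding(vocab_size, emb_dim)    # fine-tuned during training
        self.convs = nn.ModuleList(
            [nn.Conv1d(2 * emb_dim, num_filters, k) for k in filter_sizes]
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, tokens):                                 # tokens: (batch, seq_len)
        x = torch.cat([self.static_emb(tokens), self.tuned_emb(tokens)], dim=-1)
        x = x.transpose(1, 2)                                  # (batch, 2*emb_dim, seq_len)
        feats = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))                # class logits

logits = TwoChannelTextCNN()(torch.randint(0, 10000, (4, 20)))  # -> shape (4, 2)
```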
Book

The Expression of the Emotions in Man and Animals

TL;DR: The Expression of the Emotions in Man and Animals, with an introduction to the first edition, discussion, and index, by Phillip Prodger and Paul Ekman.
Frequently Asked Questions (9)
Q1. What contributions have the authors mentioned in the paper "A review of affective computing: from unimodal analysis to multimodal fusion" ?

This is the primary motivation behind their first-of-its-kind, comprehensive literature review of the diverse field of affective computing. Furthermore, existing literature surveys lack a detailed discussion of the state of the art in multimodal affect analysis frameworks, which this review aims to address. In this paper, the authors focus mainly on the use of audio, visual and text information for multimodal affect analysis, since around 90% of the relevant literature appears to cover these three modalities. As part of this review, the authors carry out an extensive study of different categories of state-of-the-art fusion techniques, followed by a critical analysis of the potential performance improvements of multimodal analysis over unimodal analysis. A comprehensive overview of these two complementary fields aims to provide readers with the building blocks needed to better understand this challenging and exciting research field.

One important area of future research is to investigate novel approaches for advancing their understanding of the temporal dependency between utterances, i.e., the effect of the utterance at time t on the utterance at time t+1. Progress in text classification research can play a major role in the future of multimodal affect analysis research. Future research should focus on answering this question. The use of deep learning for multimodal fusion is also an important direction for future work.
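As a rough illustration of modelling the dependency of the utterance at time t+1 on the utterance at time t, one could run a recurrent layer over per-utterance feature vectors; the sketch below is a hypothetical example of that idea, not a method reported in the review:

```python
import torch
import torch.nn as nn

class UtteranceContextModel(nn.Module):
    """Classifies each utterance in a conversation using the utterances before it."""
    def __init__(self, feat_dim=128, hidden_dim=64, num_classes=5):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, utterance_feats):             # (batch, num_utterances, feat_dim)
        context, _ = self.rnn(utterance_feats)      # state at step t depends on utterances <= t
        return self.head(context)                   # per-utterance class logits

logits = UtteranceContextModel()(torch.randn(2, 10, 128))   # -> shape (2, 10, 5)
```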

The primary advantage of analyzing videos over textual analysis, for detecting emotions and sentiments from opinions, is the surplus of behavioral cues. 

For the acoustic modality, low-level acoustic features were extracted at the frame level for each utterance and used to generate a feature representation of the entire dataset, using the OpenSMILE toolkit.
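A sketch of this pipeline using the openSMILE Python wrapper; the eGeMAPS feature set, the example filename, and the mean/std functionals below are assumptions, since the excerpt does not specify the exact configuration:

```python
import opensmile

# Frame-level low-level descriptors (LLDs) for a single utterance.
lld_extractor = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,           # assumed feature set
    feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
)
frames = lld_extractor.process_file("utterance_001.wav")   # hypothetical file; one row per frame

# Utterance-level representation: statistical functionals over the frame-level features.
utterance_vector = frames.agg(["mean", "std"]).to_numpy().ravel()
print(utterance_vector.shape)
```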

Whilst machine learning methods for supervised training of the sentiment analysis system are predominant in the literature, a number of unsupervised methods, such as linguistic patterns, can also be found.
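As a toy illustration of an unsupervised, lexicon-and-pattern approach, the snippet below scores tokens against hypothetical polarity word lists and flips polarity after a negator; real systems use curated lexicons and far richer linguistic patterns:

```python
# Hypothetical mini-lexicon; real systems rely on resources such as SentiWordNet.
POSITIVE = {"good", "great", "happy", "wonderful"}
NEGATIVE = {"bad", "sad", "terrible", "boring"}
NEGATORS = {"not", "never", "no"}

def lexicon_sentiment(tokens):
    """Sum word polarities, flipping polarity when the previous token is a negator."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(lexicon_sentiment("the movie was not boring".split()))   # -> positive
```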

Across the ages of the people involved and the nature of their conversations, facial expressions are the primary channel for forming an impression of a subject's present state of mind.

The results on uncontrolled recordings (i.e., speech downloaded from a video-sharing website) revealed that the feature adaptation scheme significantly improved the unweighted and weighted accuracies of the emotion recognition system. 

In their literature survey, the authors found that more than 90% of studies reported the visual modality as superior to audio and other modalities.

To accommodate research in audio-visual fusion, the audio and video signals were synchronized with an accuracy of 25 microseconds.