Author

Ashkan Yazdani

Bio: Ashkan Yazdani is an academic researcher from École Polytechnique Fédérale de Lausanne. The author has contributed to research on the topics of feature vectors and biometrics. The author has an h-index of 15 and has co-authored 30 publications receiving 2,754 citations. Previous affiliations of Ashkan Yazdani include the University of Tehran and École Normale Supérieure.

Papers
Journal ArticleDOI
TL;DR: A multimodal data set for the analysis of human affective states is presented, and a novel method for stimuli selection is proposed using retrieval by affective tags from the last.fm website, video highlight detection, and an online assessment tool.
Abstract: We present a multimodal data set for the analysis of human affective states. The electroencephalogram (EEG) and peripheral physiological signals of 32 participants were recorded as each watched 40 one-minute long excerpts of music videos. Participants rated each video in terms of the levels of arousal, valence, like/dislike, dominance, and familiarity. For 22 of the 32 participants, frontal face video was also recorded. A novel method for stimuli selection is proposed using retrieval by affective tags from the last.fm website, video highlight detection, and an online assessment tool. An extensive analysis of the participants' ratings during the experiment is presented. Correlates between the EEG signal frequencies and the participants' ratings are investigated. Methods and results are presented for single-trial classification of arousal, valence, and like/dislike ratings using the modalities of EEG, peripheral physiological signals, and multimedia content analysis. Finally, decision fusion of the classification results from different modalities is performed. The data set is made publicly available and we encourage other researchers to use it for testing their own affective state estimation methods.

3,013 citations
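The abstract above mentions decision fusion of classification results from the different modalities (EEG, peripheral physiological signals, and multimedia content analysis). The snippet below is a minimal sketch of one common decision-level fusion scheme, a weighted sum of per-class probabilities; the equal weights and the predict_proba-style inputs are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def fuse_decisions(prob_eeg, prob_periph, prob_content, weights=(1/3, 1/3, 1/3)):
    """Weighted-sum fusion of class probabilities from three modality-specific classifiers.

    Each prob_* argument has shape (n_trials, n_classes), e.g. the predict_proba output
    of a classifier for binary high/low arousal. Returns the fused class index per trial.
    """
    fused = (weights[0] * prob_eeg
             + weights[1] * prob_periph
             + weights[2] * prob_content)
    return fused.argmax(axis=1)

# Toy usage with random stand-in probabilities for 5 trials and 2 classes (low/high arousal).
rng = np.random.default_rng(0)
p_eeg, p_periph, p_content = (rng.dirichlet([1.0, 1.0], size=5) for _ in range(3))
print(fuse_decisions(p_eeg, p_periph, p_content))
```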

Book ChapterDOI
TL;DR: This work presents promising results on the classification of emotions induced by watching music videos, showing robust correlations between users' self-assessments of arousal and valence and the frequency powers of their EEG activity.
Abstract: Recently, the field of automatic recognition of users' affective states has gained a great deal of attention. Automatic, implicit recognition of affective states has many applications, ranging from personalized content recommendation to automatic tutoring systems. In this work, we present some promising results of our research in classification of emotions induced by watching music videos. We show robust correlations between users' self-assessments of arousal and valence and the frequency powers of their EEG activity. We present methods for single trial classification using both EEG and peripheral physiological signals. For EEG, an average (maximum) classification rate of 55.7% (67.0%) for arousal and 58.8% (76.0%) for valence was obtained. For peripheral physiological signals, the results were 58.9% (85.5%) for arousal and 54.2% (78.5%) for valence.

176 citations
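Since the abstract reports correlations between EEG frequency powers and self-assessed arousal and valence, the sketch below shows one generic way to compute per-trial band-power features that could feed such a single-trial classifier. The band boundaries, 128 Hz sampling rate, and Welch PSD estimator are assumptions for illustration, not the exact pipeline used in the paper.

```python
import numpy as np
from scipy.signal import welch

# Illustrative band definitions (Hz); the paper's exact bands may differ.
BANDS = {"theta": (4, 8), "alpha": (8, 12), "beta": (12, 30), "gamma": (30, 45)}

def band_power_features(trial, fs=128):
    """trial: EEG array of shape (n_channels, n_samples). Returns a flat feature vector."""
    freqs, psd = welch(trial, fs=fs, nperseg=2 * fs, axis=-1)  # PSD per channel
    feats = []
    for lo, hi in BANDS.values():
        band = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, band].mean(axis=-1))               # mean power per channel in band
    return np.concatenate(feats)

# Synthetic 32-channel, 60-second trial at 128 Hz.
trial = np.random.randn(32, 128 * 60)
print(band_power_features(trial).shape)  # (128,) = 32 channels x 4 bands
```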

Proceedings ArticleDOI
23 Jun 2009
TL;DR: It is shown that the Dempster-Shafer KNN classifier achieves a higher correct classification rate than the classical voting KNN classifier and the distance-weighted KNN classifier.
Abstract: A brain computer interface (BCI) is a communication system, which translates brain activity into commands for a computer or other devices. Nearly all BCIs contain as a core component a classification algorithm, which is employed to discriminate different brain activities using previously recorded examples of brain activity. In this paper, we study the classification accuracy achievable with a k-nearest neighbor (KNN) method based on Dempster-Shafer theory. To extract features from the electroencephalogram (EEG), autoregressive (AR) models and wavelet decomposition are used. To test the classification method an EEG dataset containing signals recorded during the performance of five different mental tasks is used. We show that the Dempster-Shafer KNN classifier achieves a higher correct classification rate than the classical voting KNN classifier and the distance-weighted KNN classifier.

81 citations
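As a rough illustration of the evidence-theoretic k-NN idea discussed above, the sketch below combines one simple mass function per neighbor (some mass on the neighbor's class, the rest on the whole frame of discernment) using Dempster's rule. The alpha and gamma hyperparameters and the Euclidean distance are illustrative choices, not the paper's settings.

```python
import numpy as np

def ds_knn_predict(X_train, y_train, x, k=5, alpha=0.95, gamma=1.0):
    """Classify x by combining one simple mass function per nearest neighbor."""
    classes = np.unique(y_train)
    d = np.linalg.norm(X_train - x, axis=1)
    neighbors = np.argsort(d)[:k]
    m_classes = np.zeros(len(classes))  # mass assigned to each singleton class
    m_theta = 1.0                       # mass on the whole frame (total ignorance)
    for i in neighbors:
        mi = np.zeros(len(classes))
        mi[np.searchsorted(classes, y_train[i])] = alpha * np.exp(-gamma * d[i] ** 2)
        mi_theta = 1.0 - mi.sum()
        # Dempster's rule for two mass functions whose focal sets are singletons and Theta.
        new_classes = m_classes * mi + m_classes * mi_theta + m_theta * mi
        new_theta = m_theta * mi_theta
        conflict = 1.0 - new_classes.sum() - new_theta
        m_classes = new_classes / (1.0 - conflict)
        m_theta = new_theta / (1.0 - conflict)
    return classes[np.argmax(m_classes)]

# Toy usage: two Gaussian clusters standing in for feature vectors of two mental tasks.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(3, 1, (20, 3))])
y = np.array([0] * 20 + [1] * 20)
print(ds_knn_predict(X, y, np.full(3, 3.0)))  # expected to print 1
```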

Book ChapterDOI
09 Oct 2011
TL;DR: Results of analyzing the electroencephalogram (EEG) to assess emotions elicited while watching various pre-selected emotional music video clips are reported, including in-depth results of both subject-dependent and subject-independent correlation analysis between time-domain and frequency-domain features of the EEG signal and subjects' self-assessed emotions.
Abstract: Studying emotions has become increasingly popular in various research fields. Researchers across the globe have studied various tools to implicitly assess emotions and affective states of people. Human-computer interface systems in particular can benefit from such an implicit emotion evaluation module, which can help them determine their users' affective states and act accordingly. Brain electrical activity can be considered an appropriate candidate for extracting emotion-related cues, but research in this direction is still in its infancy. In this paper, the results of analyzing the electroencephalogram (EEG) for assessing emotions elicited while watching various pre-selected emotional music video clips are reported. More precisely, in-depth results of both subject-dependent and subject-independent correlation analysis between time-domain and frequency-domain features of the EEG signal and subjects' self-assessed emotions are presented and discussed.

79 citations
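The kind of subject-dependent and subject-independent correlation analysis described above can be illustrated with a simple per-feature Spearman correlation between EEG features and self-assessed ratings, as in the sketch below; the feature layout, rating scale, and significance threshold are assumptions, not the paper's protocol.

```python
import numpy as np
from scipy.stats import spearmanr

def feature_rating_correlations(features, ratings):
    """features: (n_trials, n_features); ratings: (n_trials,). Returns (rho, p) per feature."""
    rhos, pvals = [], []
    for j in range(features.shape[1]):
        rho, p = spearmanr(features[:, j], ratings)
        rhos.append(rho)
        pvals.append(p)
    return np.array(rhos), np.array(pvals)

# Synthetic example: 40 trials, 10 EEG features, self-assessed arousal on a 1-9 scale.
rng = np.random.default_rng(0)
feats = rng.normal(size=(40, 10))
arousal = rng.integers(1, 10, size=40)
rho, p = feature_rating_correlations(feats, arousal)
print(np.where(p < 0.05)[0])  # indices of nominally significant features (likely none here)
```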

Proceedings ArticleDOI
23 Oct 2009
TL;DR: It is shown that the proposed brain-computer interface (BCI) system based on the P300 evoked potential can successfully perform implicit emotional tagging, and that naïve subjects who did not participate in training the system can also use it efficiently.
Abstract: In multimedia content sharing social networks, tags assigned to content play an important role in search and retrieval. In other words, by annotating multimedia content, users can associate a word or a phrase (tag) with that resource such that it can be searched for efficiently. Implicit tagging refers to assigning tags by observing subjects' behavior during consumption of multimedia content. This is an alternative to traditional explicit tagging, which requires an explicit action by subjects. In this paper, we propose a brain-computer interface (BCI) system based on the P300 evoked potential for implicit emotional tagging of multimedia content. We show that our system can successfully perform implicit emotional tagging and that naive subjects who have not participated in training of the system can also use it efficiently. Moreover, we introduce a subjective metric called "emotional taggability" to analyze the recognition performance of the system, given the degree of ambiguity that exists in terms of emotional values associated with a multimedia item.

66 citations
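The snippet below is a hedged sketch of a generic P300 pipeline (epoching EEG around stimulus onsets and training an LDA classifier on target versus non-target epochs); it illustrates the general idea behind P300-based BCIs rather than the specific implicit-tagging system proposed in the paper, and the sampling rate, window length, and channel count are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def epochs_from_onsets(eeg, onsets, fs=256, tmin=0.0, tmax=0.8):
    """eeg: (n_channels, n_samples); returns one flattened epoch per stimulus onset."""
    n0, n1 = int(tmin * fs), int(tmax * fs)
    return np.stack([eeg[:, o + n0:o + n1].ravel() for o in onsets])

# Synthetic training data: 8-channel EEG, 100 stimulus onsets, binary target/non-target labels.
rng = np.random.default_rng(0)
fs = 256
eeg = rng.normal(size=(8, fs * 120))
onsets = np.sort(rng.choice(np.arange(0, fs * 118, fs), size=100, replace=False))
labels = rng.integers(0, 2, size=100)

X = epochs_from_onsets(eeg, onsets, fs=fs)         # (100, 8 * 204) epoch features
clf = LinearDiscriminantAnalysis().fit(X, labels)  # target vs. non-target detector
print(clf.predict(X[:5]))
```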


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: (Table of contents) Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: A multimodal data set for the analysis of human affective states is presented, and a novel method for stimuli selection is proposed using retrieval by affective tags from the last.fm website, video highlight detection, and an online assessment tool.
Abstract: We present a multimodal data set for the analysis of human affective states. The electroencephalogram (EEG) and peripheral physiological signals of 32 participants were recorded as each watched 40 one-minute long excerpts of music videos. Participants rated each video in terms of the levels of arousal, valence, like/dislike, dominance, and familiarity. For 22 of the 32 participants, frontal face video was also recorded. A novel method for stimuli selection is proposed using retrieval by affective tags from the last.fm website, video highlight detection, and an online assessment tool. An extensive analysis of the participants' ratings during the experiment is presented. Correlates between the EEG signal frequencies and the participants' ratings are investigated. Methods and results are presented for single-trial classification of arousal, valence, and like/dislike ratings using the modalities of EEG, peripheral physiological signals, and multimedia content analysis. Finally, decision fusion of the classification results from different modalities is performed. The data set is made publicly available and we encourage other researchers to use it for testing their own affective state estimation methods.

3,013 citations

01 Jan 1979
TL;DR: This special issue aims to gather recent advances in learning-with-shared-information methods and their applications in computer vision and multimedia analysis, with particular encouragement for papers addressing interesting real-world computer vision and multimedia applications.
Abstract: In the real world, a realistic setting for computer vision or multimedia recognition problems is that some classes contain a lot of training data while many classes contain only a small amount. How to use frequent classes to help learn rare classes, for which it is harder to collect training data, is therefore an open question. Learning with shared information is an emerging topic in machine learning, computer vision and multimedia analysis. Different levels of components can be shared during the concept modeling and machine learning stages, such as generic object parts, attributes, transformations, regularization parameters and training examples. Regarding specific methods, multi-task learning, transfer learning and deep learning can be seen as different strategies for sharing information. These learning-with-shared-information methods are very effective in solving real-world large-scale problems. This special issue aims at gathering the recent advances in learning-with-shared-information methods and their applications in computer vision and multimedia analysis. Both state-of-the-art works and literature reviews are welcome for submission. Papers addressing interesting real-world computer vision and multimedia applications are especially encouraged. Topics of interest include, but are not limited to:
• Multi-task learning or transfer learning for large-scale computer vision and multimedia analysis
• Deep learning for large-scale computer vision and multimedia analysis
• Multi-modal approaches for large-scale computer vision and multimedia analysis
• Different sharing strategies, e.g., sharing generic object parts, sharing attributes, sharing transformations, sharing regularization parameters and sharing training examples
• Real-world computer vision and multimedia applications based on learning with shared information, e.g., event detection, object recognition, object detection, action recognition, human head pose estimation, object tracking, location-based services, semantic indexing
• New datasets and metrics to evaluate the benefit of the proposed sharing ability for the specific computer vision or multimedia problem
• Survey papers regarding the topic of learning with shared information
Authors who are unsure whether their planned submission is in scope may contact the guest editors prior to the submission deadline with an abstract, in order to receive feedback.

1,758 citations

Journal ArticleDOI
TL;DR: Single-modality and modality-fusion results are reported for both emotion recognition and implicit tagging experiments, showing the potential uses of the recorded modalities and the significance of the emotion elicitation protocol.
Abstract: MAHNOB-HCI is a multimodal database recorded in response to affective stimuli with the goal of emotion recognition and implicit tagging research. A multimodal setup was arranged for synchronized recording of face videos, audio signals, eye gaze data, and peripheral/central nervous system physiological signals. Twenty-seven participants from both genders and different cultural backgrounds participated in two experiments. In the first experiment, they watched 20 emotional videos and self-reported their felt emotions using arousal, valence, dominance, and predictability as well as emotional keywords. In the second experiment, short videos and images were shown once without any tag and then with correct or incorrect tags. Agreement or disagreement with the displayed tags was assessed by the participants. The recorded videos and bodily responses were segmented and stored in a database. The database is made available to the academic community via a web-based system. The collected data were analyzed and single modality and modality fusion results for both emotion recognition and implicit tagging experiments are reported. These results show the potential uses of the recorded modalities and the significance of the emotion elicitation protocol.

1,162 citations

Journal ArticleDOI
TL;DR: The experimental results show that neural signatures associated with different emotions do exist and share commonality across sessions and individuals; the performance of deep models is also compared with that of shallow models.
Abstract: To investigate critical frequency bands and channels, this paper introduces deep belief networks (DBNs) for constructing EEG-based emotion recognition models for three emotions: positive, neutral and negative. We develop an EEG dataset acquired from 15 subjects. Each subject performs the experiments twice at an interval of a few days. DBNs are trained with differential entropy features extracted from multichannel EEG data. We examine the weights of the trained DBNs and investigate the critical frequency bands and channels. Four different profiles of 4, 6, 9, and 12 channels are selected. The recognition accuracies of these four profiles are relatively stable, with the best accuracy of 86.65%, which is even better than that of the original 62 channels. The critical frequency bands and channels determined by using the weights of trained DBNs are consistent with existing observations. In addition, our experimental results show that neural signatures associated with different emotions do exist and that they share commonality across sessions and individuals. We compare the performance of deep models with shallow models. The average accuracies of DBN, SVM, LR, and KNN are 86.08%, 83.99%, 82.70%, and 72.60%, respectively.

1,131 citations
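The differential entropy (DE) features mentioned above are commonly computed, for a band-pass-filtered EEG signal assumed approximately Gaussian, as 0.5 * log(2 * pi * e * sigma^2) per channel and band. The sketch below follows that convention; the filter design, band limits, and 200 Hz sampling rate are illustrative assumptions, not necessarily the paper's exact setup.

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Illustrative band limits (Hz); the paper's exact bands may differ.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14), "beta": (14, 31), "gamma": (31, 50)}

def de_features(eeg, fs=200):
    """eeg: (n_channels, n_samples). Returns (n_channels, n_bands) differential entropy features."""
    feats = np.zeros((eeg.shape[0], len(BANDS)))
    for b, (lo, hi) in enumerate(BANDS.values()):
        bb, aa = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        filtered = filtfilt(bb, aa, eeg, axis=-1)
        # DE of a Gaussian with variance sigma^2 is 0.5 * log(2 * pi * e * sigma^2).
        feats[:, b] = 0.5 * np.log(2 * np.pi * np.e * filtered.var(axis=-1))
    return feats

# Synthetic 62-channel, 4-second window at 200 Hz.
window = np.random.randn(62, 200 * 4)
print(de_features(window).shape)  # (62, 5)
```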