Journal Article

Emotion recognition from speech using global and local prosodic features

TLDR
The results indicate that recognition performance using local prosodic features is better than that using global prosodic features.
Abstract
In this paper, global and local prosodic features extracted from sentences, words, and syllables are proposed for speech emotion or affect recognition. Duration, pitch, and energy values are used to represent the prosodic information for recognizing emotions from speech. Global prosodic features represent gross statistics such as the mean, minimum, maximum, standard deviation, and slope of the prosodic contours. Local prosodic features represent the temporal dynamics of the prosody. Global and local prosodic features are analyzed separately and in combination, at different levels, for the recognition of emotions. Words and syllables at different positions (initial, middle, and final) are also explored separately to analyze their contribution towards the recognition of emotions. All the studies are carried out using the simulated Telugu emotion speech corpus IITKGP-SESC, and the results are compared with those obtained on the internationally known Berlin emotion speech corpus (Emo-DB). Support vector machines are used to develop the emotion recognition models. The results indicate that recognition performance using local prosodic features is better than that using global prosodic features. Words in the final position of sentences and syllables in the final position of words exhibit more emotion-discriminative information than words and syllables in other positions.
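To make the feature definitions concrete, here is a minimal sketch in Python of the two feature types and the SVM classifier described above. The global features are the statistics the abstract lists (mean, minimum, maximum, standard deviation, and slope of the contour); the local features are approximated here as segment means of a resampled contour, which is only one possible reading of "temporal dynamics". The synthetic contours, toy labels, and all function names are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def global_prosodic_features(contour):
    """Gross statistics of a prosodic contour (e.g. pitch or energy):
    mean, minimum, maximum, standard deviation, and slope."""
    contour = contour[~np.isnan(contour)]        # drop unvoiced frames, if any
    t = np.arange(len(contour))
    slope = np.polyfit(t, contour, 1)[0]         # slope of a linear fit
    return np.array([contour.mean(), contour.min(),
                     contour.max(), contour.std(), slope])

def local_prosodic_features(contour, n_segments=10):
    """A crude stand-in for local (temporal-dynamics) features:
    the contour resampled to a fixed-length sequence of segment means."""
    contour = contour[~np.isnan(contour)]
    return np.array([seg.mean() for seg in np.array_split(contour, n_segments)])

# Hypothetical data: one pitch contour per utterance, four toy emotion labels.
rng = np.random.default_rng(0)
contours = [rng.normal(180, 30, size=rng.integers(80, 200)) for _ in range(40)]
labels = rng.integers(0, 4, size=40)

X = np.array([np.concatenate([global_prosodic_features(c),
                              local_prosodic_features(c)]) for c in contours])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, labels)
print(clf.predict(X[:5]))                        # predicted emotion labels
```

The same statistics can be computed at the sentence, word, or syllable level by slicing the contour at the corresponding boundaries before calling the two functions.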


Citations
Journal Article

Learning Salient Features for Speech Emotion Recognition Using Convolutional Neural Networks

TL;DR: This paper proposes to learn affect-salient features for SER using convolutional neural networks (CNNs), and shows that this approach leads to stable and robust recognition performance in complex scenes, outperforming several well-established SER features.
Journal Article

Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers

TL;DR: This work defines speech emotion recognition systems as a collection of methodologies that process and classify speech signals to detect the embedded emotions, and identifies and discusses distinct areas of SER.
Journal Article

Databases, features and classifiers for speech emotion recognition: a review

TL;DR: This study reviews the available literature on the databases, features, and classifiers used for speech emotion recognition across assorted languages.
Journal Article

Human emotion recognition and analysis in response to audio music using brain signals

TL;DR: Results show that MLP gives the best accuracy for recognizing human emotion in response to audio music tracks using hybrid features of brain signals.
Journal Article

Deep features-based speech emotion recognition for smart affective services

TL;DR: This paper proposes rectangular kernels of varying shapes and sizes, along with max pooling in rectangular neighborhoods, to extract discriminative features from speech spectrograms using a deep convolutional neural network (CNN).
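As a rough illustration of the rectangular-kernel idea in the citation above, the sketch below builds a small CNN over spectrogram inputs whose convolution kernels and pooling windows are non-square (tall in frequency, narrow in time). The specific kernel shapes, layer sizes, and class count are assumptions for demonstration, not the cited architecture.

```python
import torch
import torch.nn as nn

class RectKernelCNN(nn.Module):
    """Toy CNN with rectangular kernels and rectangular max pooling."""
    def __init__(self, n_emotions=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(12, 3), padding=(6, 1)),  # 12x3 kernel
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(4, 2)),                       # 4x2 pooling
            nn.Conv2d(16, 32, kernel_size=(8, 3)),                  # 8x3 kernel
            nn.ReLU(),
            nn.AdaptiveMaxPool2d((4, 4)),      # fixed-size output for any length
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_emotions)

    def forward(self, spec):                   # spec: (batch, 1, freq, time)
        return self.classifier(self.features(spec).flatten(1))

model = RectKernelCNN()
dummy = torch.randn(2, 1, 128, 200)            # two fake log-mel spectrograms
print(model(dummy).shape)                      # torch.Size([2, 7])
```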
References
Proceedings Article

A database of German emotional speech

TL;DR: A database of emotional speech is presented that was evaluated in a perception test regarding the recognisability of the emotions and their naturalness, and that can be accessed by the public via the internet.
Journal Article

Vocal communication of emotion: a review of research paradigms

TL;DR: It is suggested to use the Brunswikian lens model as a basis for research on the vocal communication of emotion, since it allows one to model the complete process, including the encoding, transmission, and decoding of vocal emotion.
Journal Article

Toward detecting emotions in spoken dialogs

TL;DR: This paper explores the detection of domain-specific emotions using language and discourse information in conjunction with acoustic correlates of emotion in speech signals, through a case study of detecting negative and non-negative emotions in spoken language data obtained from a call center application.
Journal Article

Speech emotion recognition using hidden Markov models

TL;DR: This paper proposes a text-independent method of emotion classification of speech that uses short-time log frequency power coefficients (LFPC) to represent the speech signals and a discrete hidden Markov model (HMM) as the classifier (a minimal sketch of this likelihood-scoring scheme appears after the reference list).
Journal Article

Describing the emotional states that are expressed in speech

TL;DR: The authors describe the relationship between speech and emotion: a rich descriptive system is possible but intractable, because it involves so many categories and the relationships among them are undefined.
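The HMM-based reference above ("Speech emotion recognition using hidden Markov models") classifies an utterance by scoring its quantised feature sequence against one trained model per emotion and picking the most likely. Below is a minimal, self-contained sketch of that scoring scheme, using a pure-NumPy scaled forward algorithm; the randomly initialised toy models stand in for HMMs trained on LFPC features, and every parameter is illustrative.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM).
    pi: (N,) initial state probs, A: (N, N) row-stochastic transitions,
    B: (N, M) emission probs over M discrete symbols."""
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]          # transition, then emission weight
        s = alpha.sum()
        log_lik += np.log(s)                   # accumulate the scaling factors
        alpha /= s
    return log_lik

# Toy emotion models: 3 hidden states, 8 vector-quantised feature symbols.
N, M = 3, 8
rng = np.random.default_rng(1)

def random_hmm():
    pi = rng.dirichlet(np.ones(N))             # initial state distribution
    A = rng.dirichlet(np.ones(N), size=N)      # each row sums to one
    B = rng.dirichlet(np.ones(M), size=N)      # per-state emission distribution
    return pi, A, B

models = {"anger": random_hmm(), "neutral": random_hmm()}
obs = rng.integers(0, M, size=60)              # a quantised observation sequence
scores = {emo: forward_log_likelihood(obs, *hmm) for emo, hmm in models.items()}
print(max(scores, key=scores.get))             # emotion with highest likelihood
```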