A review on speech processing using machine learning paradigm

doi:10.1007/S10772-021-09808-0

Journal Article

Kishor B. Bhangale, +1 more

- 01 Jun 2021 -

International Journal of Speech Technolo...

- Vol. 24, Iss: 2, pp 367-388

TLDR

The review validates the performance of several machine learning techniques for speech emotion recognition on the Berlin EmoDB database and outlines the broad application areas and challenges of machine learning for speech processing.

Abstract:

Speech processing plays a crucial role in many signal processing applications, and the last decade has brought enormous progress driven by the machine learning paradigm. Speech processing is closely related to computational linguistics, human–machine interaction, natural language processing, and psycholinguistics. This review article mainly discusses the feature extraction techniques and machine learning classifiers employed in speech processing and recognition. The performance of several machine learning techniques is validated for a speech emotion recognition application on the Berlin EmoDB database. Further, the article outlines the broad application areas and challenges of machine learning for speech processing.

Citations

Journal Article

Survey of Deep Learning Paradigms for Speech Processing

Kishor B. Bhangale, +1 more

- 04 Mar 2022 -

Wireless Personal Communications

Journal Article

Speech Emotion Recognition Based on Multiple Acoustic Features and Deep Convolutional Neural Network

Kishor B. Bhangale, +1 more

- 07 Feb 2023 -

Electronics

TL;DR: This paper uses an acoustic feature set based on Mel frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), wavelet packet transform (WPT), zero crossing rate (ZCR), spectral centroid, spectral roll-off, spectral kurtosis, root mean square (RMS), pitch, jitter, and shimmer to improve feature distinctiveness.

Proceedings Article

Neural Style Transfer: Reliving art through Artificial Intelligence

Kishor B. Bhangale, +3 more

TL;DR: Deep-learning techniques are introduced, which are vital in accomplishing human characteristics and open up a new world of prospects in generating higher quality creative products.

Journal Article

Energy efficient clustering routing protocol using novel admission allotment scheme (AAS) based intra-cluster communication for Wireless Sensor Network

Mohammed Shahid Thekiya, +1 more

- 01 Sep 2022 -

International journal of information tec...

TL;DR: A novel Admission Allotment Scheme (AAS) for intra-cluster communication is proposed to minimize overhead on the sensor nodes and packet drops, and an energy-efficient Ant Colony Optimization (ACO) approach is proposed to route data from the cluster head (CH) to the base station (BS).

Journal Article

Entropy-Argumentative Concept of Computational Phonetic Analysis of Speech Taking into Account Dialect and Individuality of Phonation

Oksana Kovtun

- 20 Jul 2022 -

Entropy

TL;DR: In this paper, a mathematical model and methods for computational phonetic analysis of speech, with an analytical description of the phenomenon of phonetic fusion, are proposed. They allow reliable determination of the individual phonetic alphabet inherent in a person, taking into account their dialect and individual features of phonation, as well as detection and correction of errors in the recognition of language units.

References

Journal Article

Robust text-independent speaker identification using Gaussian mixture speaker models

Douglas A. Reynolds, +1 more

- 01 Jan 1995 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity, and the Gaussian mixture speaker model is shown to outperform the other speaker modeling techniques on an identical 16-speaker telephone speech task.

Proceedings Article

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition

William Chan, +3 more

TL;DR: Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes speech utterances directly to characters without pronunciation models, HMMs or other components of traditional speech recognizers is presented.

Proceedings Article

A database of German emotional speech.

Felix Burkhardt, +4 more

TL;DR: A database of emotional speech that was evaluated in a perception test regarding the recognisability of emotions and their naturalness and can be accessed by the public via the internet.

Journal Article

Speech recognition by machine: A review

D.R. Reddy

TL;DR: This paper reviews recent developments in speech recognition research, introduces the concept of sources of knowledge, and discusses the use of knowledge to generate and verify hypotheses.

Reference Book

Springer Handbook of Acoustics

M. Schroeder, +5 more

TL;DR: The Springer Handbook of Acoustics is a modern handbook that reflects the richly interdisciplinary nature of acoustics and is edited by one of the acknowledged masters in the field, Thomas Rossing.
