Open Access · Journal Article (DOI)

DNN-Based Cepstral Excitation Manipulation for Speech Enhancement

TLDR
The new approach exceeds the performance of a formerly introduced classical signal processing-based cepstral excitation manipulation (CEM) method in terms of noise attenuation by about 1.5 dB and shows that this gain also holds true when comparing serial combinations of envelope and excitation enhancement.
Abstract
This contribution aims at speech model-based speech enhancement by exploiting the source-filter model of human speech production. The proposed method enhances the excitation signal in the cepstral domain by making use of a deep neural network (DNN). We investigate two types of target representations along with the significant effects of their normalization. The new approach exceeds the performance of a formerly introduced classical signal processing-based cepstral excitation manipulation (CEM) method in terms of noise attenuation by about 1.5 dB. We show that this gain also holds true when comparing serial combinations of envelope and excitation enhancement. In the important low-SNR conditions, no significant trade-off for speech component quality or speech intelligibility is induced, while allowing for substantially higher noise attenuation. In total, a traditional purely statistical state-of-the-art speech enhancement system is outperformed by more than 3 dB noise attenuation.
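The abstract builds on the source-filter model: in the cepstral domain, low-quefrency coefficients capture the vocal-tract envelope (filter) while high-quefrency coefficients carry the excitation (source), which is the part a CEM system manipulates. A minimal NumPy sketch of that split is shown below; it is illustrative only, not the paper's pipeline, and the frame length and lifter cutoff are assumptions.

```python
import numpy as np

def cepstral_split(frame, n_lifter=30):
    """Split a speech frame into envelope and excitation cepstra
    via the real cepstrum (illustrative sketch; n_lifter is an
    assumed low-quefrency cutoff, not a value from the paper)."""
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    cepstrum = np.fft.irfft(log_mag, n=len(frame))
    # Low-quefrency bins model the spectral envelope (vocal tract);
    # the remainder is the excitation part that CEM would enhance.
    envelope_cep = np.zeros_like(cepstrum)
    envelope_cep[:n_lifter] = cepstrum[:n_lifter]
    excitation_cep = cepstrum - envelope_cep
    return envelope_cep, excitation_cep
```

A DNN-based CEM approach, as summarized above, would then map noisy excitation cepstra toward clean ones before resynthesis, leaving the envelope to a separate enhancement stage.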


Citations
Journal ArticleDOI

Enhancement in speaker recognition for optimized speech features using GMM, SVM and 1-D CNN

TL;DR: A comparative analysis of the accuracies obtained in ASR using a classical Gaussian mixture model (GMM), a support vector machine (SVM), and a state-of-the-art 1-D CNN as classifiers; results indicate that the SVM and the 1-D CNN outperform the GMM.
Journal ArticleDOI

Multi-scale decomposition based supervised single channel deep speech enhancement

TL;DR: A nonlinear multi-scale decomposition-based deep speech enhancement method that improves the quality and intelligibility of contaminated speech by applying Hurst exponent-based Empirical Mode Decomposition (HEMD) to the noisy signal, yielding a set of intrinsic mode functions (IMFs) and a residual.
Journal ArticleDOI

Improved CEM for Speech Harmonic Enhancement in Single Channel Noise Suppression

TL;DR: In this article, the authors propose two modifications to improve the robustness and performance of CEM in low signal-to-noise ratio (SNR) conditions, resulting in better preservation of speech harmonics, a more refined fine structure, and higher inter-harmonic noise suppression.
Posted Content

Robust Acoustic Scene Classification in the Presence of Active Foreground Speech

TL;DR: In this article, an iVector-based acoustic scene classification (ASC) system is proposed for real-life settings where active foreground speech can be present; each recording is represented by a fixed-length iVector that models the recording's important properties.
Proceedings ArticleDOI

Improvement of Speech Residuals for Speech Enhancement

TL;DR: A deep neural network is used to enhance residual signals in the cepstral domain, thereby exceeding a former cepstral excitation manipulation approach in several respects and providing higher speech component quality in low-SNR conditions.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
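The Adam update described above combines adaptive per-parameter learning rates with bias-corrected estimates of the first and second moments of the gradient. A minimal single-step sketch (standard Kingma & Ba formulation; hyperparameter values are the usual defaults, not values from the DNN-CEM paper):

```python
import numpy as np

def adam_step(theta, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; state holds (m, v, t) for the moment estimates."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad        # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2   # biased second-moment estimate
    m_hat = m / (1 - b1 ** t)           # bias corrections
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)
```

Because the step is normalized by the second-moment estimate, its magnitude is roughly bounded by `lr`, which is what makes Adam robust to gradient scaling.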
Book

Neural networks for pattern recognition

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Book ChapterDOI

Neural Networks for Pattern Recognition

TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.
Proceedings Article

Understanding the difficulty of training deep feedforward neural networks

TL;DR: The objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future.
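The reference above motivated the now-standard "Glorot/Xavier" initialization, which scales random weights so activation and gradient variances stay roughly constant across layers. A minimal sketch of the uniform variant (the function name and seed handling are illustrative):

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed=0):
    """Glorot/Xavier uniform initialization: draw weights from
    U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)),
    giving variance 2 / (fan_in + fan_out)."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))
```

The variance of U(-L, L) is L²/3, so this choice yields Var(W) = 2/(fan_in + fan_out), the condition derived in the paper.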