DNN-Based Cepstral Excitation Manipulation for Speech Enhancement
Samy Elshamy, Tim Fingscheidt, et al.
TL;DR
The new approach exceeds the performance of a formerly introduced classical signal-processing-based cepstral excitation manipulation (CEM) method in terms of noise attenuation by about 1.5 dB; this gain also holds when comparing serial combinations of envelope and excitation enhancement.
Abstract
This contribution aims at speech-model-based speech enhancement by exploiting the source-filter model of human speech production. The proposed method enhances the excitation signal in the cepstral domain by making use of a deep neural network (DNN). We investigate two types of target representations along with the significant effects of their normalization. The new approach exceeds the performance of a formerly introduced classical signal-processing-based cepstral excitation manipulation (CEM) method in terms of noise attenuation by about 1.5 dB. We show that this gain also holds when comparing serial combinations of envelope and excitation enhancement. In the important low-SNR conditions, no significant trade-off in speech component quality or speech intelligibility is induced, while substantially higher noise attenuation is achieved. In total, a traditional, purely statistical state-of-the-art speech enhancement system is outperformed by more than 3 dB of noise attenuation.
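The abstract builds on the cepstral source-filter split: low quefrencies of the real cepstrum capture the vocal-tract envelope (the "filter"), while the remainder carries the excitation fine structure (the "source") that the paper's DNN enhances. A minimal sketch of that decomposition, assuming an illustrative cutoff quefrency and hypothetical function names (not the authors' implementation):

```python
import numpy as np

def cepstral_split(frame, cutoff=30):
    """Split a speech frame into spectral-envelope and excitation
    cepstra via liftering (illustrative sketch; `cutoff` is a
    hypothetical quefrency threshold, not the paper's setting)."""
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # avoid log(0)
    cepstrum = np.fft.irfft(log_mag)             # real cepstrum

    # Low quefrencies: vocal-tract envelope (the "filter");
    # the remainder: excitation fine structure (the "source").
    envelope_cep = np.zeros_like(cepstrum)
    envelope_cep[:cutoff] = cepstrum[:cutoff]
    envelope_cep[-(cutoff - 1):] = cepstrum[-(cutoff - 1):]  # mirrored half
    excitation_cep = cepstrum - envelope_cep
    return envelope_cep, excitation_cep

rng = np.random.default_rng(0)
frame = np.hanning(512) * rng.standard_normal(512)  # synthetic test frame
env, exc = cepstral_split(frame)
```

Because the split is additive in the cepstral domain, envelope and excitation can be enhanced independently and recombined by simple addition before returning to the spectral domain.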
Citations
Journal Article
Enhancement in speaker recognition for optimized speech features using GMM, SVM and 1-D CNN
Sumita Nainan, Vaishali Kulkarni, et al.
TL;DR: A comparative analysis of ASR accuracies obtained with optimized speech features, using a classical Gaussian mixture model (GMM), a support vector machine (SVM), and a state-of-the-art 1-D CNN as classifiers; results indicate that the SVM and the 1-D CNN outperform the GMM.
Journal Article
Multi-scale decomposition based supervised single channel deep speech enhancement
TL;DR: A nonlinear multi-scale decomposition-based deep speech enhancement method that improves the quality and intelligibility of contaminated speech by applying Hurst exponent-based empirical mode decomposition (HEMD) to the noisy speech, obtaining a set of intrinsic mode functions (IMFs) and a residual.
Journal Article
Improved CEM for Speech Harmonic Enhancement in Single Channel Noise Suppression
Yanjue Song, Nilesh Madhu, et al.
TL;DR: The authors propose two modifications to improve the robustness and performance of CEM in low signal-to-noise ratio (SNR) conditions, resulting in better preservation of speech harmonics, a more refined fine structure, and higher inter-harmonic noise suppression.
Posted Content
Robust Acoustic Scene Classification in the Presence of Active Foreground Speech
TL;DR: An iVector-based acoustic scene classification (ASC) system for real-life settings in which active foreground speech may be present; each recording is represented by a fixed-length iVector that models the recording's important properties.
Proceedings Article
Improvement of Speech Residuals for Speech Enhancement
Samy Elshamy, Tim Fingscheidt, et al.
TL;DR: A deep neural network is used to enhance residual signals in the cepstral domain, thereby exceeding a former cepstral excitation manipulation approach in different ways and providing higher speech component quality in low-SNR conditions.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments, and provides a regret bound on its convergence rate that is comparable to the best known results in the online convex optimization framework.
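Adam's update rule, as described in the cited paper, maintains exponentially decaying estimates of the gradient's first and second moments with bias correction. A minimal sketch (the quadratic toy objective and function name are illustrative assumptions):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate (mean)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate (uncentered variance)
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example: minimize f(x) = x^2 (gradient 2x) starting from x = 5
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.05)
```

The per-coordinate scaling by the second moment makes the effective step size roughly invariant to the gradient's magnitude, which is what the "adaptive" in the TL;DR refers to.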
Book
Neural Networks for Pattern Recognition
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Book Chapter
Neural Networks for Pattern Recognition
Suresh Kothari, Heekuck Oh, et al.
TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.
Posted Content
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, et al.
TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Proceedings Article
Understanding the difficulty of training deep feedforward neural networks
Xavier Glorot, Yoshua Bengio
TL;DR: The objective is to better understand why standard gradient descent from random initialization performs poorly with deep neural networks, in order to explain recent relative successes and help design better algorithms in the future.
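One practical outcome of the cited paper is the Glorot (Xavier) initialization scheme, which scales weight-initialization limits by both fan-in and fan-out so that activation and gradient variances stay roughly constant across layers. A minimal sketch (the function name and seed handling are illustrative assumptions):

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed=0):
    """Glorot (Xavier) uniform initialization: draw from
    U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)),
    giving weight variance 2 / (fan_in + fan_out)."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    rng = np.random.default_rng(seed)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Weight matrix for a 256 -> 128 fully connected layer
W = glorot_uniform(256, 128)
```

Since the variance of U(-a, a) is a²/3, the chosen limit yields exactly the target variance 2/(fan_in + fan_out) that balances the forward and backward signal scales.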