Open Access
Posted Content

Noise Adaptive Speech Enhancement using Domain Adversarial Training.

TLDR
In this article, the authors proposed a noise adaptive speech enhancement (SE) system, which employs a domain adversarial training (DAT) approach to tackle the issue of a noise type mismatch between the training and testing conditions.
Abstract
In this study, we propose a novel noise adaptive speech enhancement (SE) system, which employs a domain adversarial training (DAT) approach to tackle the issue of a noise type mismatch between the training and testing conditions. Such a mismatch is a critical problem in deep-learning-based SE systems, and a large mismatch may cause serious degradation of the SE performance. Because a well-trained SE system is generally used to handle various unseen noise types, a noise type mismatch commonly occurs in real-world scenarios. The proposed noise adaptive SE system contains an encoder-decoder-based enhancement model and a domain discriminator model. During adaptation, the DAT approach encourages the encoder to produce noise-invariant features based on the information from the discriminator model and consequently increases the robustness of the enhancement model to unseen noise types. Herein, we regard stationary noises as the source domain (with the ground truth of clean speech) and non-stationary noises as the target domain (without the ground truth). We evaluated the proposed system on TIMIT sentences. The experimental results show that the proposed noise adaptive SE system provides significant improvements in PESQ (19.0%), SSNR (39.3%), and STOI (27.0%) over the SE system without adaptation.
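The core of the DAT approach described above is typically realized with a gradient reversal layer (GRL): an identity map in the forward pass whose backward pass flips (and scales) the gradient coming from the domain discriminator, so the encoder is pushed to make features *harder* to classify by domain. The following is a minimal numpy sketch of this mechanism, not the paper's implementation; the class name, the scaling factor `lam`, and the toy gradient values are illustrative assumptions.

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""
    def __init__(self, lam=0.5):
        self.lam = lam

    def forward(self, x):
        # Features pass through unchanged to the domain discriminator.
        return x

    def backward(self, grad_out):
        # Flip the sign so the encoder is updated to CONFUSE the discriminator,
        # i.e. to produce domain- (noise-type-) invariant features.
        return -self.lam * grad_out

# Toy illustration: encoder features z feed both the enhancement decoder
# and, through the GRL, the domain discriminator.
grl = GradientReversal(lam=0.5)
z = np.array([1.0, 2.0])
grad_from_enhancer = np.array([0.1, -0.2])       # improves enhancement quality
grad_from_discriminator = np.array([0.3, 0.4])   # would make features domain-separable

# The encoder's update combines both signals; the reversed discriminator
# gradient drives the features toward noise invariance.
total_grad = grad_from_enhancer + grl.backward(grad_from_discriminator)
```

In practice the GRL is inserted between the encoder and the discriminator inside an autodiff framework, so the reversal happens automatically during backpropagation; the scale `lam` trades off enhancement accuracy against domain invariance.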


Citations
Journal ArticleDOI

Domain Adversarial for Acoustic Emotion Recognition

TL;DR: It is shown that exploiting unlabeled data consistently leads to better emotion recognition performance across all emotional dimensions, and the effect of adversarial training on the feature representations across the proposed deep learning architecture is visualized.
Posted Content

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement

TL;DR: A novel MetricGAN approach is proposed, which aims to optimize the generator with respect to one or multiple evaluation metrics that cannot be fully optimized by Lp or conventional adversarial losses.
Proceedings ArticleDOI

A Cross-Task Transfer Learning Approach to Adapting Deep Speech Enhancement Models to Unseen Background Noise Using Paired Senone Classifiers

TL;DR: An environment adaptation approach that improves deep speech enhancement models by minimizing the Kullback-Leibler divergence between posterior probabilities produced by a multi-condition senone classifier fed with noisy speech features, thereby transferring an existing deep neural network (DNN) speech enhancer to specific noisy environments without the noisy/clean paired target waveforms needed in conventional DNN-based spectral regression.
Journal ArticleDOI

RemixIT: Continual Self-Training of Speech Enhancement Models via Bootstrapped Remixing

TL;DR: In this article, a self-supervised method for training speech enhancement models without the need for a single isolated in-domain speech or noise waveform is presented. However, this method is not suited to unsupervised domain adaptation.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Posted Content

Adam: A Method for Stochastic Optimization

TL;DR: In this article, the authors introduce a method for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
Proceedings ArticleDOI

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Journal ArticleDOI

Suppression of acoustic noise in speech using spectral subtraction

TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
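The classic spectral subtraction scheme summarized above works by estimating the noise magnitude spectrum from noise-only frames, subtracting it from each noisy frame's magnitude spectrum, flooring the result, and resynthesizing with the noisy phase. The following is a minimal numpy sketch of this idea, not the referenced algorithm itself; the frame shapes, the spectral floor value, and the toy tone-plus-noise signal are illustrative assumptions.

```python
import numpy as np

def spectral_subtraction(noisy_frames, noise_frames, floor=0.01):
    """Basic magnitude spectral subtraction with a spectral floor.

    noisy_frames, noise_frames: 2-D arrays of shape (n_frames, frame_len).
    Returns time-domain enhanced frames of the same shape as noisy_frames.
    """
    # Estimate the noise magnitude spectrum by averaging over noise-only frames.
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)
    spec = np.fft.rfft(noisy_frames, axis=1)
    mag = np.abs(spec) - noise_mag                 # subtract the noise estimate
    mag = np.maximum(mag, floor * noise_mag)       # floor to limit musical noise
    enhanced = mag * np.exp(1j * np.angle(spec))   # reuse the noisy phase
    return np.fft.irfft(enhanced, n=noisy_frames.shape[1], axis=1)

# Toy check: a pure tone corrupted by white noise in every frame.
rng = np.random.default_rng(0)
t = np.arange(256) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
noise = 0.3 * rng.standard_normal((8, 256))
noisy = clean + noise                              # 8 noisy frames of the same tone
out = spectral_subtraction(noisy, noise)
```

Because only the magnitude is modified while the noisy phase is kept, the method is cheap enough to run as a pre-processor, which is exactly the use case the TL;DR above describes.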