Open Access · Posted Content

Multi-Microphone Complex Spectral Mapping for Speech Dereverberation

TLDR
A multi-microphone complex spectral mapping approach to speech dereverberation is proposed, its integration with beamforming and post-filtering is investigated, and experimental results on multi-channel speech dereverberation demonstrate its effectiveness.
Abstract
This study proposes a multi-microphone complex spectral mapping approach for speech dereverberation on a fixed array geometry. In the proposed approach, a deep neural network (DNN) is trained to predict the real and imaginary (RI) components of direct sound from the stacked reverberant (and noisy) RI components of multiple microphones. We also investigate the integration of multi-microphone complex spectral mapping with beamforming and post-filtering. Experimental results on multi-channel speech dereverberation demonstrate the effectiveness of the proposed approach.
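As a rough illustration of the input representation the abstract describes, the RI components of each microphone's STFT can be stacked channel-wise before being fed to the DNN. The following numpy sketch shows only this stacking step; the function name and array shapes are assumptions for illustration, not the paper's code:

```python
import numpy as np

def stack_ri(stfts):
    """Stack real and imaginary parts of P microphone STFTs into a
    (2P, T, F) feature tensor, the assumed network input."""
    ri = [np.stack([s.real, s.imag]) for s in stfts]  # each (2, T, F)
    return np.concatenate(ri, axis=0)                 # (2P, T, F)

# toy example: 4 microphones, 10 frames, 257 frequency bins
rng = np.random.default_rng(0)
stfts = [rng.standard_normal((10, 257)) + 1j * rng.standard_normal((10, 257))
         for _ in range(4)]
features = stack_ri(stfts)
print(features.shape)  # (8, 10, 257)
```

The network would then be trained to map this stacked tensor to the RI components of the direct sound at a reference microphone.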


Citations
Posted Content

A consolidated view of loss functions for supervised deep learning-based speech enhancement

TL;DR: This work investigates a wide variety of spectral loss functions for a recurrent neural network architecture suitable for online frame-by-frame processing and reveals that combining magnitude-only with phase-aware objectives always leads to improvements, even when the phase is not enhanced.
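One hypothetical instance of such a combined objective pairs a phase-aware term (a distance on the complex STFT, which penalizes phase errors) with a magnitude-only term; the exact weighting and norms here are illustrative assumptions, not the paper's specific losses:

```python
import numpy as np

def combined_loss(est, ref, alpha=0.5):
    """Blend a phase-aware complex-domain L1 term with a magnitude-only
    L1 term; alpha trades the two off."""
    phase_aware = np.mean(np.abs(est - ref))                # penalizes phase too
    magnitude = np.mean(np.abs(np.abs(est) - np.abs(ref)))  # magnitude only
    return alpha * phase_aware + (1 - alpha) * magnitude

rng = np.random.default_rng(1)
ref = rng.standard_normal((10, 257)) + 1j * rng.standard_normal((10, 257))
print(combined_loss(ref, ref))       # 0.0: perfect estimate
print(combined_loss(1j * ref, ref))  # > 0: same magnitude, wrong phase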
Journal ArticleDOI

Neural Spectrospatial Filtering

TL;DR: The proposed neural spectrospatial filter achieves separation performance comparable to or better than beamforming across different array geometries and speech separation tasks, reduces to monaural complex spectral mapping in single-channel conditions, and is concluded to be a strong alternative to traditional and mask-based beamforming.
Proceedings ArticleDOI

A consolidated view of loss functions for supervised deep learning-based speech enhancement

TL;DR: In this article, the authors investigated a wide variety of spectral loss functions for a recurrent neural network architecture suitable for online frame-by-frame processing and found that combining magnitude-only with phase-aware objectives always leads to improvements, even when the phase is not enhanced.
Journal ArticleDOI

TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation

TL;DR: TF-GridNet as mentioned in this paper is a multi-path deep neural network (DNN) integrating full- and sub-band modeling in the T-F domain, which achieves state-of-the-art performance on the multi-channel tasks of SMS-WSJ and WHAMR!.
Journal ArticleDOI

STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency

TL;DR: Compared with Conv-TasNet, the STFT-domain system can achieve better enhancement performance for a comparable amount of computation, or comparable performance with less computation, maintaining strong performance at an algorithmic latency as low as 2 ms.
References
Book ChapterDOI

U-Net: Convolutional Networks for Biomedical Image Segmentation

TL;DR: Ronneberger et al. proposed a network and training strategy that relies on strong use of data augmentation to use the available annotated samples more efficiently; it can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopy stacks.
Proceedings ArticleDOI

Densely Connected Convolutional Networks

TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
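The dense connectivity pattern the TL;DR describes can be sketched with toy layers in numpy; the layer function below is a stand-in for a real convolution and exists only to show the channel-wise concatenation, which is what gives DenseNet its feature reuse:

```python
import numpy as np

def make_layer(growth_rate=2):
    """Toy stand-in for a conv layer: maps any number of input
    channels to `growth_rate` output channels."""
    def layer(x):  # x: (C, H, W)
        pooled = np.tanh(x.mean(axis=0, keepdims=True))
        return np.repeat(pooled, growth_rate, axis=0)
    return layer

def dense_block(x, layers):
    """DenseNet connectivity: layer l receives the channel-wise
    concatenation of the input and all preceding layers' outputs."""
    feats = [x]
    for layer in layers:
        feats.append(layer(np.concatenate(feats, axis=0)))
    return np.concatenate(feats, axis=0)

x = np.zeros((4, 8, 8))  # 4 input channels
out = dense_block(x, [make_layer() for _ in range(3)])
print(out.shape)  # (10, 8, 8): 4 input + 3 layers * growth_rate 2
```

Because every layer's output is carried forward, gradients reach early layers directly, which is the mechanism behind the alleviated vanishing-gradient problem mentioned above.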
Proceedings ArticleDOI

Deep clustering: Discriminative embeddings for segmentation and separation

TL;DR: In this paper, a deep network is trained to assign contrastive embedding vectors to each time-frequency region of the spectrogram in order to implicitly predict the segmentation labels of the target spectrogram from the input mixtures.
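The deep clustering objective pulls embeddings of same-source T-F bins together by matching the embedding affinity matrix to the label affinity matrix. A minimal sketch of that loss, using the standard low-rank expansion so the large N x N affinity matrices are never formed (variable names are illustrative):

```python
import numpy as np

def deep_clustering_loss(V, Y):
    """||V V^T - Y Y^T||_F^2 for embeddings V (N, D) and one-hot
    source labels Y (N, S), expanded so only small D x D, D x S,
    and S x S products are computed."""
    return (np.sum((V.T @ V) ** 2)
            - 2 * np.sum((V.T @ Y) ** 2)
            + np.sum((Y.T @ Y) ** 2))

# one-hot speaker labels for 6 T-F bins, 2 speakers
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 0], [0, 1]], dtype=float)
print(deep_clustering_loss(Y, Y))  # 0.0: embeddings equal to ideal labels
```

At test time the learned embeddings are clustered (e.g. with k-means) to recover binary T-F masks for each source.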
Journal ArticleDOI

Supervised Speech Separation Based on Deep Learning: An Overview

TL;DR: A comprehensive overview of deep learning-based supervised speech separation can be found in this paper, where three main components of supervised separation are discussed: learning machines, training targets, and acoustic features.
Journal ArticleDOI

Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks

TL;DR: In this article, the utterance-level permutation invariant training (uPIT) technique was proposed for speaker independent multitalker speech separation, where RNNs, trained with uPIT, can separate multitalker mixed speech without any prior knowledge of signal duration, number of speakers, speaker identity, or gender.
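The core of uPIT is that the loss is minimized over speaker permutations at the utterance level, so one output stream tracks one speaker for the whole utterance. A minimal numpy sketch, assuming a plain MSE as the per-pair loss (the original works with T-F domain losses):

```python
import numpy as np
from itertools import permutations

def upit_loss(est, ref):
    """Utterance-level PIT: score every assignment of estimated to
    reference speakers with an utterance-wide MSE, keep the best."""
    S = len(est)
    return min(
        sum(np.mean((est[p[s]] - ref[s]) ** 2) for s in range(S)) / S
        for p in permutations(range(S))
    )

rng = np.random.default_rng(2)
ref = [rng.standard_normal(100) for _ in range(2)]
print(upit_loss([ref[1], ref[0]], ref))  # 0.0: correct up to speaker order
```

Because the permutation is chosen once per utterance rather than per frame, the trained network needs no prior knowledge of speaker identity or gender, as the TL;DR notes.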