Open Access Posted Content
Multi-Microphone Complex Spectral Mapping for Speech Dereverberation
Zhong-Qiu Wang, DeLiang Wang +1 more
TL;DR: Experimental results on multi-channel speech dereverberation demonstrate the effectiveness of the proposed approach, and the integration of multi-microphone complex spectral mapping with beamforming and post-filtering is investigated.

Abstract: This study proposes a multi-microphone complex spectral mapping approach for speech dereverberation on a fixed array geometry. In the proposed approach, a deep neural network (DNN) is trained to predict the real and imaginary (RI) components of the direct sound from the stacked reverberant (and noisy) RI components of multiple microphones. We also investigate the integration of multi-microphone complex spectral mapping with beamforming and post-filtering. Experimental results on multi-channel speech dereverberation demonstrate the effectiveness of the proposed approach.
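The input representation in the abstract can be illustrated concretely: the DNN consumes the real and imaginary STFT components of all microphones, stacked along the feature axis. A minimal NumPy sketch of that stacking, under the usual framed-FFT definition of the STFT (function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def stack_ri_features(mix, n_fft=512, hop=256):
    """Stack real/imaginary STFT components of all microphones.

    mix: (num_mics, num_samples) multi-channel time-domain signal.
    Returns a (2 * num_mics, F, T) array: the RI components of every
    channel concatenated on the feature axis, as described in the
    abstract. Frame/FFT sizes here are assumptions for illustration.
    """
    num_mics, n = mix.shape
    frames = 1 + (n - n_fft) // hop
    # index matrix selecting each windowed frame
    idx = np.arange(n_fft)[None, :] + hop * np.arange(frames)[:, None]
    win = np.hanning(n_fft)
    Z = np.fft.rfft(mix[:, idx] * win, axis=-1)   # (M, T, F)
    Z = Z.transpose(0, 2, 1)                      # (M, F, T)
    return np.concatenate([Z.real, Z.imag], axis=0)

# toy 4-microphone mixture, 1 s at 16 kHz
x = np.random.randn(4, 16000)
feats = stack_ri_features(x)
print(feats.shape)  # (8, 257, 61)
```

The DNN then maps this stacked tensor to the RI components of the direct sound at a reference microphone.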
Citations
Posted Content
A consolidated view of loss functions for supervised deep learning-based speech enhancement
Sebastian Braun, Ivan Tashev +1 more
TL;DR: This work investigates a wide variety of spectral loss functions for a recurrent neural network architecture suitable for online frame-by-frame processing, and reveals that combining magnitude-only with phase-aware objectives always leads to improvements, even when the phase is not enhanced.
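The combination this TL;DR describes can be sketched as a weighted sum of a magnitude-only term and a phase-aware complex term. This is a generic illustration, not the paper's exact formulation; `alpha` and the compression exponent `c` are assumed hyper-parameters:

```python
import numpy as np

def combined_loss(est, ref, alpha=0.5, c=0.3):
    """Blend of a magnitude-only and a phase-aware (complex) spectral
    loss, in the spirit of the combination described in the TL;DR.
    est, ref: complex STFT arrays of the same shape.
    """
    mag_e, mag_r = np.abs(est) ** c, np.abs(ref) ** c
    # phase-aware term: compressed magnitudes re-attached to the phases
    cpx_e = mag_e * np.exp(1j * np.angle(est))
    cpx_r = mag_r * np.exp(1j * np.angle(ref))
    mag_term = np.mean((mag_e - mag_r) ** 2)     # magnitude-only
    cpx_term = np.mean(np.abs(cpx_e - cpx_r) ** 2)  # phase-aware
    return alpha * mag_term + (1 - alpha) * cpx_term
```

With `alpha=1` this reduces to a pure magnitude loss; lowering `alpha` increases the weight of the phase-sensitive term.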
Journal ArticleDOI
Neural Spectrospatial Filtering
TL;DR: This neural spectrospatial filter achieves separation performance comparable to or better than beamforming for different array geometries and speech separation tasks, and reduces to monaural complex spectral mapping in single-channel conditions; it is concluded to provide a strong alternative to traditional and mask-based beamforming.
Journal ArticleDOI
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
TL;DR: TF-GridNet, as described in this paper, is a multi-path deep neural network (DNN) integrating full- and sub-band modeling in the T-F domain, which achieves state-of-the-art performance on the multi-channel tasks of SMS-WSJ and WHAMR!.
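The full-/sub-band alternation summarized here amounts to running a sequence model along different axes of a (D, T, F) feature tensor: the sub-band path scans over time within each frequency, the full-band path over frequency within each frame. A shape-level sketch, where a cumulative-mean filter stands in for TF-GridNet's actual recurrent blocks (all names and sizes are illustrative):

```python
import numpy as np

def scan(x, axis):
    """Placeholder sequence model: cumulative mean along `axis`
    (stands in for a bidirectional recurrent block)."""
    c = np.cumsum(x, axis=axis)
    n = np.arange(1, x.shape[axis] + 1).reshape(
        [-1 if a == axis else 1 for a in range(x.ndim)])
    return c / n

def grid_block(x):
    """One full-/sub-band block on a (D, T, F) feature tensor:
    the sub-band path models each frequency over time (axis 1),
    the full-band path models each frame over frequency (axis 2)."""
    x = x + scan(x, axis=1)   # sub-band (temporal) path
    x = x + scan(x, axis=2)   # full-band (spectral) path
    return x

x = np.random.randn(16, 100, 257)   # (features, frames, frequencies)
print(grid_block(x).shape)          # shape preserved: (16, 100, 257)
```

Stacking such blocks lets the network combine per-frequency temporal dynamics with per-frame spectral structure.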
Journal ArticleDOI
STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency
TL;DR: Compared with Conv-TasNet, the STFT-domain system can achieve better enhancement performance for a comparable amount of computation, or comparable performance with less computation, while maintaining strong performance at an algorithmic latency as low as 2 ms.
References
Book ChapterDOI
U-Net: Convolutional Networks for Biomedical Image Segmentation
TL;DR: Ronneberger et al., as discussed in this paper, proposed a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; it can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Proceedings ArticleDOI
Densely Connected Convolutional Networks
TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
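The dense connectivity pattern in this summary is compact enough to sketch: each layer receives the concatenation of every earlier layer's output, so feature maps are reused rather than recomputed. In this toy NumPy version, a random linear map with a ReLU stands in for an actual convolutional DenseNet layer (all names and sizes are assumptions):

```python
import numpy as np

def dense_block(x, num_layers=3, growth=4):
    """Dense connectivity: each layer sees the concatenation of all
    earlier feature maps and adds `growth` new features.
    x: (N, C) input; returns (N, C + num_layers * growth)."""
    rng = np.random.default_rng(0)
    feats = [x]
    for _ in range(num_layers):
        inp = np.concatenate(feats, axis=1)       # all previous outputs
        w = rng.standard_normal((inp.shape[1], growth))
        feats.append(np.maximum(inp @ w, 0.0))    # "layer" output
    return np.concatenate(feats, axis=1)

x = np.random.randn(5, 8)
print(dense_block(x).shape)  # (5, 8 + 3*4) = (5, 20)
```

Because every layer's input grows only by `growth` features per step, the block stays parameter-efficient while keeping short gradient paths to all earlier layers.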
Proceedings ArticleDOI
Deep clustering: Discriminative embeddings for segmentation and separation
TL;DR: In this paper, a deep network is trained to assign contrastive embedding vectors to each time-frequency region of the spectrogram in order to implicitly predict the segmentation labels of the target spectrogram from the input mixtures.
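The embedding objective described here has a compact closed form: with an embedding matrix V (one vector per T-F bin) and a label indicator Y, deep clustering minimizes the affinity loss ||VV^T - YY^T||_F^2, which expands so that only small Gram matrices are ever formed. A NumPy sketch of that expanded loss (shapes are illustrative):

```python
import numpy as np

def deep_clustering_loss(V, Y):
    """Deep clustering affinity loss ||VV^T - YY^T||_F^2, expanded so
    only D x D, D x C, and C x C Gram matrices are formed (the number
    of T-F bins N can be very large).
    V: (N, D) embeddings, one per time-frequency bin.
    Y: (N, C) one-hot source membership per bin.
    """
    return (np.sum((V.T @ V) ** 2)
            - 2.0 * np.sum((V.T @ Y) ** 2)
            + np.sum((Y.T @ Y) ** 2))
```

At inference, clustering the learned embeddings (e.g. with k-means) yields the T-F segmentation that separates the sources.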
Journal ArticleDOI
Supervised Speech Separation Based on Deep Learning: An Overview
DeLiang Wang, Jitong Chen +1 more
TL;DR: A comprehensive overview of deep learning-based supervised speech separation can be found in this paper, where three main components of supervised separation are discussed: learning machines, training targets, and acoustic features.
Journal ArticleDOI
Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks
TL;DR: This article proposed the utterance-level permutation invariant training (uPIT) technique for speaker-independent multitalker speech separation; RNNs trained with uPIT can separate multitalker mixed speech without any prior knowledge of signal duration, number of speakers, speaker identity, or gender.
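The uPIT idea in this summary is small enough to sketch: evaluate the training loss under every assignment of network outputs to reference speakers over the whole utterance, and keep only the best one. A toy NumPy version, with MSE standing in for the actual training objective (names are illustrative):

```python
import itertools
import numpy as np

def upit_loss(est, ref):
    """Utterance-level permutation invariant loss: the minimum, over
    all speaker permutations, of the loss computed across the entire
    utterance (so the assignment cannot flip frame by frame).
    est, ref: (num_speakers, num_samples) arrays."""
    num_spk = est.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(num_spk)):
        err = np.mean((est[list(perm)] - ref) ** 2)
        best = min(best, err)
    return best
```

For example, if the network emits the correct sources but in swapped order, the permutation search finds the matching assignment and the loss is zero, so the network is never penalized for an arbitrary output ordering.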