DeLiang Wang

Researcher at Ohio State University

Publications - 475
Citations - 28623

DeLiang Wang is an academic researcher at Ohio State University. His research focuses on speech processing and speech enhancement. He has an h-index of 82 and has co-authored 440 publications receiving 23687 citations. His previous affiliations include the Massachusetts Institute of Technology and Tsinghua University.

Papers
Posted Content

Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization.

TL;DR: This paper proposes a self-attending recurrent neural network (SARNN) for time-domain speech enhancement to improve cross-corpus generalization; the model consists of recurrent neural networks (RNNs) augmented with self-attention blocks and feedforward blocks.
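For context, a minimal PyTorch sketch of the general idea, an RNN layer augmented with a self-attention block and a feedforward block; the module structure and parameter choices here are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class SelfAttendingRNNBlock(nn.Module):
    """Sketch of one RNN block augmented with self-attention and a
    feedforward sub-block (names and sizes are illustrative)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim * 2), nn.ReLU(),
                                nn.Linear(dim * 2, dim))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):                  # x: (batch, time, dim)
        h, _ = self.rnn(x)                 # recurrent features
        a, _ = self.attn(h, h, h)          # self-attention over time
        h = self.norm1(h + a)              # residual + norm
        return self.norm2(h + self.ff(h))  # feedforward sub-block

x = torch.randn(2, 100, 256)               # dummy batch
print(SelfAttendingRNNBlock()(x).shape)    # torch.Size([2, 100, 256])
```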
Proceedings ArticleDOI

Deep neural networks for estimating speech model activations

TL;DR: This paper uses two stages of deep neural networks, where the first stage estimates the ideal ratio mask that separates speech from noise, and the second stage maps the ratio-masked speech to the clean speech activation matrices that are used for nonnegative matrix factorization (NMF).
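A hedged NumPy sketch of that two-stage pipeline; `dnn_mask` and `dnn_act` are hypothetical stand-ins for the paper's trained networks, and the basis matrix W is assumed to have been learned offline by NMF:

```python
import numpy as np

def dnn_mask(noisy_mag):
    # stage 1 stand-in: estimate a ratio mask in [0, 1]
    return np.clip(noisy_mag / (noisy_mag + 1.0), 0.0, 1.0)

def dnn_act(masked_mag, rank):
    # stage 2 stand-in: map ratio-masked speech to NMF activations H >= 0
    return np.abs(np.random.randn(rank, masked_mag.shape[1]))

W = np.abs(np.random.randn(257, 40))           # speech basis from offline NMF
noisy_mag = np.abs(np.random.randn(257, 100))  # |STFT| of the noisy mixture

mask = dnn_mask(noisy_mag)       # stage 1: ratio mask separating speech from noise
masked = mask * noisy_mag        # apply mask to the mixture spectrogram
H = dnn_act(masked, W.shape[1])  # stage 2: clean-speech activation matrix
enhanced_mag = W @ H             # NMF reconstruction W·H
print(enhanced_mag.shape)        # (257, 100)
```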
Posted Content

Dual-path Self-Attention RNN for Real-Time Speech Enhancement.

TL;DR: A real-time dual-path self-attention recurrent neural network (DP-SARNN) is proposed, using long short-term memory (LSTM) RNNs and causal attention in the inter-chunk SARNN; it significantly outperforms existing approaches to speech enhancement.
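A rough sketch of dual-path processing under stated assumptions: the sequence is split into chunks, modeled within each chunk (intra path), then across chunks (inter path). A unidirectional LSTM stands in here for the paper's causal inter-chunk attention, and all module names are illustrative:

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    """Sketch of dual-path processing: intra-chunk modeling followed by
    causal inter-chunk modeling (a simplifying stand-in for DP-SARNN)."""
    def __init__(self, dim=64, chunk=25):
        super().__init__()
        self.chunk = chunk
        self.intra = nn.LSTM(dim, dim // 2, batch_first=True,
                             bidirectional=True)       # within a chunk
        self.inter = nn.LSTM(dim, dim, batch_first=True)  # across chunks, causal

    def forward(self, x):                  # x: (batch, time, dim)
        b, t, d = x.shape
        n = t // self.chunk                # number of chunks
        x = x[:, :n * self.chunk].reshape(b, n, self.chunk, d)
        intra, _ = self.intra(x.reshape(b * n, self.chunk, d))
        x = intra.reshape(b, n, self.chunk, d)
        # inter path: run over the chunk index for each within-chunk position
        x = x.transpose(1, 2).reshape(b * self.chunk, n, d)
        inter, _ = self.inter(x)
        return inter.reshape(b, self.chunk, n, d).transpose(1, 2).reshape(b, -1, d)

x = torch.randn(2, 100, 64)
print(DualPathBlock()(x).shape)  # torch.Size([2, 100, 64])
```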
Journal ArticleDOI

Neural Cascade Architecture With Triple-Domain Loss for Speech Enhancement

TL;DR: This paper proposes a neural cascade architecture for monaural speech enhancement, consisting of three modules that successively optimize the enhanced speech with respect to the magnitude spectrogram, the time-domain signal, and the complex spectrogram.
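A minimal sketch of what a triple-domain loss over those three representations could look like, assuming equal term weights and illustrative STFT settings; the paper's exact terms and weights may differ:

```python
import torch

def triple_domain_loss(est_wave, ref_wave, n_fft=512, hop=128):
    """One loss term each on the time-domain signal, the magnitude
    spectrogram, and the complex spectrogram (illustrative weighting)."""
    win = torch.hann_window(n_fft)
    est = torch.stft(est_wave, n_fft, hop, window=win, return_complex=True)
    ref = torch.stft(ref_wave, n_fft, hop, window=win, return_complex=True)
    time_loss = (est_wave - ref_wave).abs().mean()     # time domain
    mag_loss = (est.abs() - ref.abs()).abs().mean()    # magnitude spectrogram
    complex_loss = (est - ref).abs().mean()            # complex spectrogram
    return time_loss + mag_loss + complex_loss

est = torch.randn(2, 16000)   # 1 s of estimated audio at 16 kHz
ref = torch.randn(2, 16000)   # clean reference
print(triple_domain_loss(est, ref).item())
```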
Proceedings ArticleDOI

Appearance-based recognition using perceptual components

TL;DR: A spectral histogram model is employed for generic appearance-based recognition; a nearest neighbor classifier assigns an unseen input image to an object class, where each class is represented by the perceptual components of its training images.
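A toy sketch of histogram-based nearest neighbor recognition; the gradient-filter histograms and chi-square distance below are simplifying assumptions standing in for the paper's spectral histogram of perceptual components:

```python
import numpy as np

def spectral_histogram(img, bins=16):
    # summarize an image by histograms of a few filter responses
    gx = np.diff(img, axis=1).ravel()   # horizontal gradient filter
    gy = np.diff(img, axis=0).ravel()   # vertical gradient filter
    feats = [img.ravel(), gx, gy]
    hists = [np.histogram(f, bins=bins, density=True)[0] for f in feats]
    return np.concatenate(hists)

def chi_square(h1, h2, eps=1e-8):
    # common histogram distance, used here as an assumption
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

rng = np.random.default_rng(0)
train = [(rng.random((32, 32)), c) for c in range(3) for _ in range(5)]
query = rng.random((32, 32))

q = spectral_histogram(query)
label = min(train, key=lambda tc: chi_square(q, spectral_histogram(tc[0])))[1]
print("predicted class:", label)
```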