
DeLiang Wang

Researcher at Ohio State University

Publications -  475
Citations -  28623

DeLiang Wang is an academic researcher at Ohio State University. He has contributed to research on topics including speech processing and speech enhancement. He has an h-index of 82 and has co-authored 440 publications receiving 23687 citations. His previous affiliations include the Massachusetts Institute of Technology and Tsinghua University.

Papers
Proceedings ArticleDOI

Robust speaker identification in noisy and reverberant conditions

TL;DR: This paper first removes background noise through binary masking with a deep neural network classifier, then performs robust speaker identification (SID) with speaker models trained in selected reverberant conditions, using bounded marginalization and direct masking.
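The shared building block in this line of work is binary time-frequency masking. The sketch below is an illustration of the general technique only, not the paper's DNN-based mask estimator: it computes an ideal binary mask from separately known speech and noise signals (random placeholders here) and applies it to their mixture.

```python
# Minimal sketch of binary time-frequency masking via the ideal binary mask (IBM).
# The 'speech' and 'noise' arrays are random placeholders, not real audio.
import numpy as np
from scipy.signal import stft, istft

fs = 16000
rng = np.random.default_rng(0)
speech = rng.standard_normal(fs)        # placeholder for a clean speech signal
noise = 0.5 * rng.standard_normal(fs)   # placeholder for background noise
mixture = speech + noise

# Decompose the signals into time-frequency units.
f, t, S = stft(speech, fs, nperseg=512)
_, _, N = stft(noise, fs, nperseg=512)
_, _, M = stft(mixture, fs, nperseg=512)

# Ideal binary mask: keep a T-F unit when its local SNR exceeds 0 dB.
local_snr_db = 20 * np.log10(np.abs(S) + 1e-12) - 20 * np.log10(np.abs(N) + 1e-12)
ibm = (local_snr_db > 0.0).astype(float)

# Apply the mask to the mixture and resynthesize an enhanced signal.
_, enhanced = istft(M * ibm, fs, nperseg=512)
```

In the paper's setting the mask is estimated by a trained classifier rather than computed from known speech and noise, but the masking and resynthesis steps are the same.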
Journal ArticleDOI

Robust speech recognition from binary masks

TL;DR: This letter proposes a fundamentally different approach to robust automatic speech recognition based on binary masks; it is compared with a traditional HMM-based approach and shown to perform well under low-SNR conditions.
Proceedings ArticleDOI

Musical Sound Separation Using Pitch-Based Labeling and Binary Time-Frequency Masking

TL;DR: A system is proposed that decomposes an input into time-frequency units using an auditory filterbank and uses pitch to label the instrument line to which each time-frequency unit is assigned.
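A hedged sketch of pitch-based labeling with binary masking follows: an STFT stands in for the auditory filterbank, the pitch of each instrument line is assumed known and constant (the paper estimates pitch from the input), and each frequency channel is assigned to the line whose nearest harmonic lies closest to it.

```python
# Rough sketch of pitch-based T-F labeling and binary masking for two instrument lines.
# The mixture is a random placeholder and the pitch tracks are assumed rather than estimated.
import numpy as np
from scipy.signal import stft, istft

fs = 16000
mixture = np.random.default_rng(1).standard_normal(fs)   # placeholder mixture signal
f, t, X = stft(mixture, fs, nperseg=1024)

pitches_hz = [220.0, 330.0]   # assumed pitch of each instrument line (held constant here)
masks = [np.zeros_like(X, dtype=float) for _ in pitches_hz]

for k, fc in enumerate(f):
    if fc == 0:
        continue
    # Distance from this channel's centre frequency to the nearest harmonic of each pitch.
    dists = [abs(fc - max(1.0, np.round(fc / p)) * p) for p in pitches_hz]
    # Label the whole frequency channel with the closer instrument line.
    masks[int(np.argmin(dists))][k, :] = 1.0

# Binary masking separates the two labeled lines from the mixture.
separated = [istft(X * m, fs, nperseg=1024)[1] for m in masks]
```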
Journal ArticleDOI

Image segmentation using local spectral histograms and linear regression

TL;DR: This work formulates the segmentation problem as multivariate linear regression, solved by least-squares estimation, and proposes an algorithm to automatically identify representative features corresponding to different homogeneous regions.
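A minimal sketch of that regression formulation, using synthetic data in place of real local spectral histograms: each pixel's histogram is modelled as a linear combination of representative feature histograms, the combination weights are obtained by least squares, and each pixel is labeled by its dominant weight. The representative-feature identification step is not shown.

```python
# Least-squares sketch of segmentation by local spectral histograms.
# Z holds the representative feature histograms (one column per region);
# Y holds a synthetic local histogram for each pixel.
import numpy as np

n_bins, n_regions, n_pixels = 16, 3, 1000
rng = np.random.default_rng(2)

Z = rng.random((n_bins, n_regions))                       # representative feature histograms
true_labels = rng.integers(0, n_regions, n_pixels)
Y = Z[:, true_labels] + 0.05 * rng.standard_normal((n_bins, n_pixels))  # per-pixel histograms

# Multivariate linear regression Y = Z @ B, solved by least-squares estimation.
B, *_ = np.linalg.lstsq(Z, Y, rcond=None)

# Segmentation: assign each pixel to the region with the largest combination weight.
labels = np.argmax(B, axis=0)
```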
Proceedings ArticleDOI

Neural Vocoder is All You Need for Speech Super-resolution

TL;DR: This paper proposes a neural-vocoder-based speech super-resolution method that can handle a variety of input resolutions and upsampling ratios, and demonstrates, by performing mel-bandwidth extension with a simple replication-padding method, that prior knowledge in the pre-trained vocoder is crucial for speech SR.
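A small sketch of the replication-padding idea for mel-bandwidth extension, with placeholder data and an assumed band split; the pre-trained neural vocoder that would synthesize a waveform from the extended mel spectrogram is not included.

```python
# Sketch of mel-bandwidth extension by replication padding: the mel bands covered
# by the low-resolution input are kept, and the missing high bands are filled by
# repeating the highest observed band. The result would be fed to a pre-trained
# neural vocoder (not shown). All values and the band split are assumptions.
import numpy as np

n_mels, n_frames = 80, 200
observed_bands = 40                                   # assumed: input covers the lower half of the mel bands
rng = np.random.default_rng(3)
mel_low = rng.random((observed_bands, n_frames))      # placeholder low-resolution mel spectrogram

# Replication padding along the frequency axis.
pad = np.repeat(mel_low[-1:, :], n_mels - observed_bands, axis=0)
mel_extended = np.concatenate([mel_low, pad], axis=0)  # shape (80, 200), vocoder-ready
```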