Zhen Huang

Researcher at Georgia Institute of Technology

Publications - 33
Citations - 834

Zhen Huang is an academic researcher from Georgia Institute of Technology. The author has contributed to research in topics: Artificial neural network & Word error rate. The author has an h-index of 14, co-authored 30 publications receiving 707 citations. Previous affiliations of Zhen Huang include Shanghai Jiao Tong University & Apple Inc.

Papers
Proceedings ArticleDOI

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement.

TL;DR: This paper proposes a multi-objective framework that learns both secondary targets, which are not directly related to the intended task of speech enhancement, and the primary target, the clean log-power spectra (LPS) features used directly to construct the enhanced speech signals.
Posted Content

Multi-Objective Learning and Mask-Based Post-Processing for Deep Neural Network Based Speech Enhancement

TL;DR: A series of experiments shows that joint LPS and MFCC learning improves the SE performance, and that IBM-based post-processing further enhances the listening quality of the reconstructed speech.
Proceedings ArticleDOI

Rapid adaptation for deep neural networks through multi-task learning.

TL;DR: The proposed MTL adaptation framework improves the learning ability of the original DNN structure, then enlarges the coverage of the acoustic space to deal with the unseen senone problem, and thus enhances the discrimination power of the adapted DNN models.
Journal ArticleDOI

Data-Driven Power Outage Detection by Social Sensors

TL;DR: This paper proposes a novel method to detect and locate power outages based on information collected from social media, using Twitter as a real-time social sensor and a supervised topic model with a heterogeneous information network.
Journal ArticleDOI

An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition

TL;DR: An integrated end-to-end automatic speech recognition (ASR) paradigm by joint learning of the front-end speech signal processing and back-end acoustic modeling is proposed, leading to a unified deep neural network (DNN) framework for distant speech processing that can achieve both high-quality enhanced speech and high-accuracy ASR simultaneously.