Journal Article
The Hearing-Aid Speech Perception Index (HASPI)
TLDR
HASPI is found to give accurate intelligibility predictions for a wide range of signal degradations, including speech degraded by noise and nonlinear distortion, speech processed using frequency compression, noisy speech processed through a noise-suppression algorithm, and speech where the high frequencies are replaced by the output of a noise vocoder.
About:
This article is published in Speech Communication. The article was published on 2014-11-01. It has received 149 citations to date. The article focuses on the topics: Intelligibility (communication) & Hearing aid.
Citations
Journal Article
An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers
Jesper Jensen, Cees H. Taal
TL;DR: It is shown that ESTOI can be interpreted in terms of an orthogonal decomposition of short-time spectrograms into intelligibility subspaces, i.e., a ranking of spectrogram features according to their importance to intelligibility.
Journal Article
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks
TL;DR: The proposed AVDCNN model is structured as an audio–visual encoder–decoder network, in which audio and visual data are first processed using individual CNNs, and then fused into a joint network to generate enhanced speech and reconstructed images at the output layer.
Speech Transformations Based on a Sinusoidal Representation
TL;DR: In this article, a speech analysis/synthesis technique is presented which provides the basis for a general class of speech transformations including time-scale modification, frequency scaling, and pitch modification.
Journal Article
Objective Quality and Intelligibility Prediction for Users of Assistive Listening Devices: Advantages and limitations of existing tools
Tiago H. Falk, Vijay Parsa, João Felipe Santos, Kathryn H. Arehart, Oldooz Hazrati, Rainer Huber, James M. Kates, Susan Scollie
TL;DR: An overview of 12 existing objective speech quality and intelligibility prediction tools is presented and recommendations are given for suggested uses of the different tools under specific environmental and processing conditions.
Posted Content
An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
TL;DR: This paper provides a systematic survey of this research topic, focusing on the main elements that characterise the systems in the literature: acoustic features; visual features; deep learning methods; fusion techniques; training targets; and objective functions.
References
Journal Article
Tests for comparing elements of a correlation matrix.
TL;DR: This article reviews the literature on such tests, points out some statistics that should be avoided, and presents a variety of techniques that can be used safely with medium to large samples; several illustrative numerical examples are provided.
Journal Article
Speech recognition with primarily temporal cues.
TL;DR: Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information; the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
Journal Article
Development of the Hearing In Noise Test for the measurement of speech reception thresholds in quiet and in noise
TL;DR: The mean-squared level of each digitally recorded sentence was adjusted to equate intelligibility when presented in spectrally matched noise to normal-hearing listeners; the test's statistical reliability and efficiency suit it to practical applications in which measures of speech intelligibility are required.
Journal Article
An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech
TL;DR: A short-time objective intelligibility measure (STOI) is presented, which shows high correlation with the intelligibility of noisy and time-frequency weighted noisy speech (e.g., resulting from noise reduction) in three different listening experiments and shows better correlation with speech intelligibility than five other reference objective intelligibility models.
Journal Article
Speech analysis/Synthesis based on a sinusoidal representation
R.J. McAulay, Thomas F. Quatieri
TL;DR: A sinusoidal model for the speech waveform is used to develop a new analysis/synthesis technique that is characterized by the amplitudes, frequencies, and phases of the component sine waves, which forms the basis for new approaches to the problems of speech transformations including time-scale and pitch-scale modification, and midrate speech coding.
Related Papers (5)
Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator
Yariv Ephraim, David Malah