
Speech enhancement using linear prediction residual

TLDR
The objective is to selectively enhance the high signal-to-noise ratio (SNR) regions in the noisy speech in the temporal and spectral domains, without causing significant distortion in the resulting enhanced speech.
About
This article is published in Speech Communication. The article was published on 1999-05-01 and is currently open access. It has received 87 citations to date. The article focuses on the topics: Speech enhancement & Speech processing.


Citations
Journal ArticleDOI

Significance of Vowel-Like Regions for Speaker Verification Under Degraded Conditions

TL;DR: Vowel-like regions (VLRs) in speech, which include vowel, semi-vowel, and diphthong sound units, are detected using knowledge of VLROPs during training and testing, and a significant improvement in performance is reported for speaker verification under degraded conditions.
Journal ArticleDOI

Speech Enhancement Using Perceptual Wavelet Packet Decomposition and Teager Energy Operator

TL;DR: Experimental results demonstrate that the speech enhancement method presented in this paper is capable of outperforming conventional noise cancellation schemes.
Journal ArticleDOI

Speaker Verification by Vowel and Nonvowel Like Segmentation

TL;DR: The VLRs and non-VLRs are used independently during training and testing of a speaker verification (SV) system to reduce gross level mismatch due to sound units and achieve better compensation of degradation effects by applying different normalization to these two different energy regions.
Journal ArticleDOI

Epoch-based analysis of speech signals

TL;DR: In this paper, the importance of epochs for speech analysis is discussed, methods to extract the epoch information are reviewed, and the use of epoch extraction in some speech applications is demonstrated.
Journal ArticleDOI

Enhancement of noisy speech by temporal and spectral processing

TL;DR: A noisy speech enhancement method is presented that combines linear prediction (LP) residual weighting in the time domain with spectral processing in the frequency domain to provide better noise suppression as well as better enhancement in the speech regions.
References
Book

Adaptive Filter Theory

Simon Haykin
TL;DR: In this paper, the authors propose a recursive least-squares (RLS) adaptive filter based on the Kalman filter, which is used as the unifying base for RLS filters.
Journal ArticleDOI

Suppression of acoustic noise in speech using spectral subtraction

TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Book

Discrete-Time Processing of Speech Signals

TL;DR: The preface to the IEEE Edition explains the background to speech production, coding, and quality assessment and introduces the Hidden Markov Model, the Artificial Neural Network, and Speech Enhancement.
Journal ArticleDOI

RASTA processing of speech

TL;DR: The theoretical and experimental foundations of the RASTA method are reviewed, the relationship with human auditory perception is discussed, the original method is extended to combinations of additive noise and convolutional noise, and an application is shown to speech enhancement.
Journal ArticleDOI

A signal subspace approach for speech enhancement

TL;DR: The popular spectral subtraction speech enhancement approach is shown to be a signal subspace approach which is optimal in an asymptotic (large sample) linear minimum mean square error sense, assuming the signal and noise are stationary.
Frequently Asked Questions (14)
Q1. What contributions have the authors mentioned in the paper "Speech enhancement using linear prediction residual"?

In this paper the authors propose a method for enhancement of speech in the presence of additive noise. 

A weight function at the fine level can be derived from the residual signal energy plot to deemphasize the segments corresponding to the valleys relative to the segments corresponding to the peaks.

The modified residual signal is used to excite the time-varying all-pole filter, updated every 2 ms, to generate the enhanced speech.
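The synthesis step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `synthesize_all_pole`, the argument layout, and the sign convention (predictor s[n] = sum of a_i * s[n-i] plus excitation) are assumptions made for the example. The filter state is carried across frame boundaries so that coefficient updates do not reset the output.

```python
import numpy as np

def synthesize_all_pole(residual, lpc_frames, frame_len):
    """Excite a time-varying all-pole filter with a residual signal.

    Assumed convention (not taken from the paper):
        s[n] = e[n] + sum_{i=1..p} a_i * s[n - i]
    lpc_frames[k] holds [a_1, ..., a_p] for the k-th frame of frame_len
    samples (e.g. frame_len = 2 ms worth of samples).
    """
    residual = np.asarray(residual, dtype=float)
    out = np.zeros_like(residual)
    last = len(lpc_frames) - 1
    for n in range(len(residual)):
        # Pick the coefficient set of the frame that sample n falls in;
        # reuse the last set if the signal outlasts the coefficient list.
        a = lpc_frames[min(n // frame_len, last)]
        acc = residual[n]
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                acc += ai * out[n - i]  # feedback from past outputs
        out[n] = acc
    return out
```

For example, at a sampling rate of 8 kHz a 2 ms update interval would correspond to `frame_len = 16`; the sampling rate here is only an illustrative choice.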

A weight function is derived from the smoothed inverse flatness characteristics in such a way that the residual signal samples in the regions corresponding to low values of the inverse flatness are reduced relative to the residual signal samples in the regions corresponding to high values of the inverse flatness.

The temporal sequence of these peaks also produces discontinuities in the contours of the spectral peaks when compared with the smooth contours encountered in natural speech.

The setting of the various thresholds in the processing is primarily dictated by the listener's tolerance of annoyance due to noise and preference for speech quality.

The LP residual signal of noisy speech was modified, retaining only the 2 ms portions of the residual signal around the instants of excitation.
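The residual modification described above might look like the sketch below. It assumes the instants of excitation are already available as sample indices (how they are obtained is not covered here) and that the retained 2 ms window is centered on each instant; both the centering and the name `keep_around_epochs` are assumptions for illustration.

```python
import numpy as np

def keep_around_epochs(residual, epochs, fs, width_ms=2.0):
    """Zero the residual everywhere except near given excitation instants.

    epochs: iterable of sample indices (assumed known in advance).
    A window of about width_ms, centered on each index, is retained.
    """
    residual = np.asarray(residual, dtype=float)
    half = int(round(fs * width_ms / 1000.0)) // 2  # half-window in samples
    mask = np.zeros(len(residual), dtype=bool)
    for e in epochs:
        lo = max(0, e - half)
        hi = min(len(residual), e + half)
        mask[lo:hi] = True
    # Samples outside every window are set to zero.
    return np.where(mask, residual, 0.0)
```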

The choice of the parameters depends on the listener's preference, as the effect of these parameters on the resulting quality of the enhanced speech is gradual and not abrupt.

A mapping function of the type shown in Fig. 2 can be used to map the smoothed inverse spectral flatness values to the weight values for each short (2 ms) frame of residual signal.
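Since Fig. 2 is not reproduced here, the exact mapping is unknown. The sketch below uses a logistic curve purely as a stand-in for a smooth, monotonic mapping from inverse flatness to weight; the parameters `threshold`, `slope`, and `w_min`, and both function names, are hypothetical and not taken from the paper.

```python
import numpy as np

def map_flatness_to_weight(inv_flatness, threshold, slope=1.0, w_min=0.1):
    """Map smoothed inverse-flatness values to weights in (w_min, 1).

    Logistic shape is an assumption: low inverse flatness gives a weight
    near w_min, high inverse flatness gives a weight near 1.
    """
    x = np.asarray(inv_flatness, dtype=float)
    return w_min + (1.0 - w_min) / (1.0 + np.exp(-slope * (x - threshold)))

def apply_frame_weights(residual, weights, frame_len):
    """Scale each frame of frame_len residual samples by its weight.

    Samples beyond the last weighted frame are left unchanged.
    """
    out = np.asarray(residual, dtype=float).copy()
    w = np.repeat(np.asarray(weights, dtype=float), frame_len)[: len(out)]
    out[: len(w)] *= w
    return out
```

Because the mapping is smooth rather than a hard threshold, small changes in `threshold` or `slope` change the weights only gradually.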

Note that the weighting of the residual signal at the fine level (i.e. relative emphasis of the residual signal samples within a glottal cycle) should be mild to avoid distortion in the processed speech.

The spectral flatness characteristics are derived by comparing the energy in the residual signal with the energy in the noisy speech signal in each short interval of about 2 ms.
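A per-frame energy comparison of this kind can be sketched as below. The direction of the ratio (speech energy over residual energy), the use of non-overlapping frames, and the name `frame_energy_ratio` are assumptions made for illustration; the small constant `eps` only guards against division by zero.

```python
import numpy as np

def frame_energy_ratio(speech, residual, fs, frame_ms=2.0, eps=1e-12):
    """Energy of the speech divided by energy of the LP residual, per frame.

    Uses consecutive non-overlapping frames of about frame_ms; any
    trailing samples that do not fill a whole frame are ignored.
    """
    n = int(round(fs * frame_ms / 1000.0))  # samples per frame
    n_frames = min(len(speech), len(residual)) // n
    s = np.asarray(speech, dtype=float)[: n_frames * n].reshape(n_frames, n)
    r = np.asarray(residual, dtype=float)[: n_frames * n].reshape(n_frames, n)
    return (s ** 2).sum(axis=1) / ((r ** 2).sum(axis=1) + eps)
```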

But these features are not useful for enhancement, since for generating the enhanced speech signal one needs both the spectral envelope and excitation for each (short-time) analysis frame. 

The primary reason for this is that, in a source signal such as the linear prediction (LP) residual signal, the samples are uncorrelated, and hence the residual samples are more like noise than like a signal.
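The decorrelating effect of LP analysis can be illustrated with a small sketch. It uses the standard autocorrelation method with a direct solve of the normal equations; the function name `lp_residual`, the default order, and the tiny diagonal regularization are choices made for this example, not details from the paper.

```python
import numpy as np

def lp_residual(frame, order=10):
    """Return (residual, coefficients) from autocorrelation-method LP analysis.

    Predictor convention (assumed): x_hat[n] = sum_{i=1..order} a_i * x[n - i].
    """
    x = np.asarray(frame, dtype=float)
    # Autocorrelation values r[0..order].
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal-equation matrix, lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Prediction from past samples; residual is the prediction error.
    pred = np.zeros_like(x)
    for i, ai in enumerate(a, start=1):
        pred[i:] += ai * x[:-i]
    return x - pred, a
```

On a strongly correlated input, the lag-1 correlation of the returned residual should be much smaller than that of the input, which is the noise-like behavior the text refers to.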

For each small window of the residual signal, the energy ratio of the noisy speech signal and the corresponding portion of the residual signal gives an indication of the amount of reduction in the correlation of the signal samples.