Proceedings ArticleDOI

Robust recognition of loud and Lombard speech in the fighter cockpit environment

Abstract
The major goal of this research is to reduce the discrepancy in recognition performance between normal and abnormal speech, given that reference templates were derived only from normal speech. A method is devised that uses the differences in spectral slope between linear predictive coding log magnitude spectra to weight the point-by-point energy differences between the spectra. The distances of all reference tokens of like phonemes are combined to form a smallest cumulative distance (SCD) method. When SCD is combined with the method of slope-dependent weighting (SDW), the most significant success is obtained. In terms of error rates for a fixed phoneme vector length of five, SDW+SCD is found to reduce the difference in error rate between normal and abnormal speech by approximately 50%.
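The abstract does not give the exact formulas, so the following is only a rough sketch of how slope-dependent weighting and a smallest-cumulative-distance decision could fit together. Everything specific here is an assumption made for illustration: the slope is approximated by a first difference along frequency, the weight has the hypothetical form `1 + alpha * |slope difference|`, and the names `slope`, `sdw_distance`, `scd_classify`, and `alpha` are not taken from the paper.

```python
from collections import defaultdict


def slope(spectrum):
    """First difference along frequency, used as a crude spectral slope (assumption)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def sdw_distance(test, ref, alpha=1.0):
    """Slope-weighted distance between two log magnitude spectra.

    Assumed form: each squared point-by-point difference is scaled by
    1 + alpha * |slope difference| at that point. The last spectral point
    is skipped because the first difference is one sample shorter.
    """
    if len(test) != len(ref) or len(test) < 2:
        raise ValueError("spectra must have equal length of at least 2")
    slope_test, slope_ref = slope(test), slope(ref)
    total = 0.0
    for i in range(len(slope_test)):
        weight = 1.0 + alpha * abs(slope_test[i] - slope_ref[i])
        diff = test[i] - ref[i]
        total += weight * diff * diff
    return total / len(slope_test)


def scd_classify(test, references, alpha=1.0):
    """Pick the phoneme whose reference tokens give the smallest summed distance.

    `references` is a list of (phoneme_label, spectrum) pairs. Summing assumes
    every phoneme has the same number of reference tokens.
    """
    cumulative = defaultdict(float)
    for label, ref in references:
        cumulative[label] += sdw_distance(test, ref, alpha)
    best = min(cumulative, key=cumulative.get)
    return best, dict(cumulative)


# Toy spectra (made-up numbers, not data from the paper).
references = [
    ("a", [0.0, 1.0, 2.0, 3.0]),
    ("a", [0.0, 1.0, 2.0, 4.0]),
    ("i", [3.0, 2.0, 1.0, 0.0]),
    ("i", [4.0, 2.0, 1.0, 0.0]),
]
label, totals = scd_classify([0.0, 1.0, 2.0, 3.5], references)
print(label, totals)
```

In this sketch a uniform level shift between two spectra leaves the slopes equal, so the weight stays at 1, while a mismatch in spectral shape increases the penalty. Whether the paper's weighting behaves this way is not stated in the abstract.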


Citations
Journal ArticleDOI

Nonlinear feature based classification of speech under stress

TL;DR: Three new features derived from the nonlinear Teager (1980) energy operator (TEO) are investigated for stress classification and it is shown that the TEO-CB-Auto-Env feature outperforms traditional pitch and mel-frequency cepstrum coefficients (MFCC) substantially.
Journal ArticleDOI

A comparative study of traditional and newly proposed features for recognition of speech under stress

TL;DR: The results show that, unlike the fast Fourier transform's (FFT) immunity to noise, the linear prediction power spectrum is more immune than the FFT to stress as well as to a combination of noisy and stressful environments.
Journal ArticleDOI

Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition

TL;DR: It is suggested that recent studies based on a Source Generator Framework can provide a viable foundation in which to establish robust speech recognition techniques, and three novel approaches for signal enhancement and stress equalization are considered to address the issue of recognition under noisy stressful conditions.
Journal ArticleDOI

On the relationship between, and measurement of, amplitude and frequency in birdsong

TL;DR: A growing number of studies ask whether and how bird songs vary between areas with low versus high levels of anthropogenic noise. Birds are seen to sing at higher frequencies in urban versus rural populations, presumably because of selection for higher-pitched songs in the face of low-frequency urban noise.
Journal ArticleDOI

Feature analysis and neural network-based classification of speech under stress

TL;DR: Several speech features are considered as potential stress-sensitive relayers using a previously established stressed speech database (SUSAS) and a neural network-based classifier is formulated based on an extended delta-bar-delta learning rule.
References
Journal ArticleDOI

Dynamic programming algorithm optimization for spoken word recognition

TL;DR: This paper reports on an optimum dynamic programming (DP) based time-normalization algorithm for spoken word recognition, in which the warping function slope is restricted so as to improve discrimination between words in different categories.
Journal ArticleDOI

Minimum prediction residual principle applied to speech recognition

TL;DR: A computer system is described in which isolated words, spoken by a designated talker, are recognized through calculation of a minimum prediction residual through optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm.
Journal ArticleDOI

Distance measures for speech processing

TL;DR: The likelihood ratio, cepstral measure, and cosh measure are easily evaluated recursively from linear prediction filter coefficients, and each has a meaningful and interrelated frequency domain interpretation.
Journal ArticleDOI

On the performance of the quefrency-weighted cepstral coefficients in vowel recognition

TL;DR: The quefrency-weighted cepstral coefficients (also known as the root-power sums) are studied as to their effectiveness in a vowel recognition experiment and found to perform better than the cepstral coefficients with a Euclidean distance measure.
Journal ArticleDOI

Spectral slope distance measures with linear prediction analysis for word recognition in noise

TL;DR: Initial testing of spectral slope distance measures derived from linear prediction analysis models of speech for speaker-dependent isolated word recognition indicates that they give considerable performance improvement over the standard cepstral distance measure in several noise conditions.