On the effects of varying filter bank parameters on isolated word recognition

doi:10.1109/TASSP.1983.1164172

Journal Article

B. A. Dautrich, +2 more

01 Aug 1983

IEEE Transactions on Acoustics, Speech, ...

Vol. 31, Iss. 4, pp. 793-807

TLDR

Results of performance evaluation of several types of filter bank analyzers in a speaker trained isolated word recognition test using dialed-up telephone line recordings indicate that the best performance is obtained by both a 15-channel uniform filter bank and a 13-channel nonuniform filter bank.

Abstract:

The vast majority of commercially available isolated word recognizers use a filter bank analysis as the front end processing for recognition. It is not well understood how the parameters of different filter banks (e.g., number of filters, types of filters, filter spacing) affect recognizer performance. In this paper we present results of a performance evaluation of several types of filter bank analyzers in a speaker-trained isolated word recognition test using dialed-up telephone line recordings. We have studied both DFT (discrete Fourier transform) and direct form implementations of the filter banks. We have also considered uniform and nonuniform filter spacings. The results indicate that the best performance (highest word accuracy) is obtained by both a 15-channel uniform filter bank and a 13-channel nonuniform filter bank (with channels spaced along a critical band scale). The performance of a 7-channel critical band filter bank is almost as good as that of the two best filter banks. Compared with a conventional eighth-order linear predictive coding (LPC) word recognizer, the best filter bank recognizers performed, on average, several percent worse. A discussion as to why some filter banks performed better than others, and why the LPC-based system did the best, is given in this paper.
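To make the distinction between uniform and critical band spacing concrete, the following is a minimal sketch that computes channel edge frequencies for a 15-channel uniform bank and a 13-channel bank spaced evenly on a Bark (critical band) scale. Several things here are assumptions for illustration and are not taken from the paper: the 200-3200 Hz analysis band (a rough stand-in for telephone bandwidth), the use of Traunmüller's closed-form Bark approximation, and all function names. The paper's actual filter designs may differ.

```python
# Illustrative sketch only: band limits, the Bark formula, and all names
# below are assumptions, not the filter designs used in the paper.

def hz_to_bark(f_hz):
    """Traunmueller's approximation of the critical band (Bark) scale."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)

def uniform_edges(n_channels, f_lo, f_hi):
    """Channel edges equally spaced in Hz."""
    step = (f_hi - f_lo) / n_channels
    return [f_lo + i * step for i in range(n_channels + 1)]

def critical_band_edges(n_channels, f_lo, f_hi):
    """Channel edges equally spaced in Bark, mapped back to Hz."""
    z_lo, z_hi = hz_to_bark(f_lo), hz_to_bark(f_hi)
    step = (z_hi - z_lo) / n_channels
    return [bark_to_hz(z_lo + i * step) for i in range(n_channels + 1)]

def widths(edges):
    """Bandwidth in Hz of each channel."""
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

F_LO, F_HI = 200.0, 3200.0  # assumed analysis band, not from the paper

uniform_15 = uniform_edges(15, F_LO, F_HI)
critical_13 = critical_band_edges(13, F_LO, F_HI)

if __name__ == "__main__":
    print("uniform widths (Hz): ", [round(w) for w in widths(uniform_15)])
    print("critical widths (Hz):", [round(w) for w in widths(critical_13)])
```

Under these assumptions the uniform bank has equal-width channels, while the critical band bank uses narrow channels at low frequencies and progressively wider ones toward the top of the band.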
