Speech recognition from GSM codec parameters.

Open Access Proceedings Article

TLDR

It is observed that by selectively combining the cepstral streams representing the LPC parameters and the residual signal it is possible to obtain recognition accuracy directly from the coded parameters that equals or exceeds the recognition accuracy obtained from the reconstructed waveforms.

Abstract:

Speech coding affects speech recognition performance, with recognition accuracy deteriorating as the coded bit rate decreases. Virtually all systems that recognize coded speech reconstruct the speech waveform from the coded parameters, and then perform recognition (after possible noise and/or channel compensation) using conventional techniques. In this paper we compare the recognition accuracy of coded speech obtained by reconstructing the speech waveform with the speech recognition accuracy obtained when using cepstral features derived from the coding parameters. We focus our efforts on speech that has been coded using the 13-kbps full-rate GSM codec, a Regular Pulse Excited Long Term Prediction (RPE-LTP) codec. The GSM codec develops separate representations for the linear prediction (LPC) filter and the residual signal components of the coded speech. We measure the effects of quantization and coding on the accuracy with which these parameters are represented, and present two different methods for recombining them for speech recognition purposes. We observe that by selectively combining the cepstral streams representing the LPC parameters and the residual signal it is possible to obtain recognition accuracy directly from the coded parameters that equals or exceeds the recognition accuracy obtained from the reconstructed waveforms.
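As an illustration of how cepstral features can be obtained from linear prediction parameters without reconstructing the waveform, the sketch below applies the standard LPC-to-cepstrum recursion. It is not taken from the paper: the function name `lpc_to_cepstrum`, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the omission of the gain term c_0 are assumptions made for this example, and the authors' actual front end (including how the residual stream is handled) may differ.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n_ceps of the all-pole filter 1/A(z).

    `a` holds [a_1, ..., a_p] for the assumed convention
    A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (hypothetical helper, not from the paper).
    The gain-dependent term c_0 is not computed.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Sum over earlier cepstral terms; a_(n-k) is zero when n - k > p.
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k - 1] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0
        c.append(-a_n - acc / n)
    return c


# Single-pole check: 1 / (1 - 0.5 z^-1) has cepstrum c_n = 0.5**n / n.
ceps = lpc_to_cepstrum([-0.5], 3)
print(ceps)
```

For a single pole at 0.5 the recursion reproduces the closed form 0.5^n / n, which is a quick sanity check on the sign convention before applying it to higher-order filters.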

Citations

Electronic mobile guides: a survey

A bitstream-based front-end for wireless speech recognition on IS-136 communications system

Graceful degradation of speech recognition performance over packet-erasure networks

Speech recognition in mobile environments

Automatic speech recognition over error-prone wireless networks
