
Segmentation and labeling of speech: a comparative performance evaluation.

TLDR
This thesis evaluates the relative merits of a number of alternative design choices at the parametric level of speech recognition, and develops a methodology for such comparative evaluation.
Abstract
This thesis studies speech recognition at the parametric level. It attempts to evaluate and understand the relative merits of a number of alternative design choices at that level. In particular, it investigates segmentation and labeling techniques and the use of parametric representations for the acoustic signal. Every speech recognition system employs some parametric representation and some initial signal-to-symbol transformation. The author shows the performance currently available for these initial processes, and asserts that such performance is comparable to human performance. After presenting the relative merits of some typical parametric representations, the thesis develops a methodology for such comparative evaluation. Simple, parameter-independent schemes for segmenting, labeling, and training are also developed. The role of pattern classification techniques is clarified as it relates to the initial signal-to-symbol transformation. Four parametric representations were chosen for study: a set of amplitude and zero-crossing measurements from 5 octave filters; a set of energy measurements from a 1/3-octave filter bank; a smoothed, short-time spectrum computed from the LPC filter; and the LPC coefficients themselves. Note that the first two involve the use of analog devices. Each method yields a set of measurements at uniform, short intervals, which together form a pattern. Distance functions, chosen from pattern classification theory, are then applied to the parameter patterns as measures of acoustic similarity.
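As a rough illustration of the pipeline the abstract describes (measurements at uniform short intervals, then a distance function used as a measure of acoustic similarity), the following minimal Python sketch may help. It is not the author's method: it assumes simple digital per-frame energy and zero-crossing counts in place of the thesis's filter-bank and LPC representations, and a plain Euclidean distance in place of whichever distance functions the thesis actually evaluates. The names `frame_features`, `euclidean_distance`, and `nearest_label`, and the template labels used with them, are hypothetical.

```python
import math


def frame_features(samples, frame_len):
    """Return one (energy, zero_crossings) pattern per non-overlapping frame.

    Assumption: these two time-domain measurements stand in for the
    thesis's parametric representations, purely for illustration.
    """
    patterns = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude over the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between adjacent samples (zero counts as positive).
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        patterns.append((energy, crossings))
    return patterns


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length parameter patterns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_label(pattern, templates):
    """Label a pattern with the name of its closest template (hypothetical labels)."""
    return min(templates, key=lambda name: euclidean_distance(pattern, templates[name]))
```

Under these assumptions, labeling a frame reduces to a nearest-template lookup over its pattern; a real comparison of representations would also need to normalize measurements that have different scales before computing distances.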

