
Showing papers in "IEEE Transactions on Acoustics, Speech, and Signal Processing in 1979"


Journal ArticleDOI
S. Boll1
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital waveform. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.

4,862 citations
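The subtraction step itself is compact enough to sketch. The pure-Python illustration below is a minimal rendering only: the naive DFT, the half-wave rectification, and the `floor` parameter are illustrative choices, and the paper's secondary residual-noise attenuation procedures are omitted.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]

def spectral_subtract(frame, noise_mag, floor=0.02):
    """Subtract a noise-magnitude estimate (measured during nonspeech
    activity) from the frame's magnitude spectrum, keep the noisy
    phase, and resynthesize a waveform."""
    X = dft(frame)
    Y = []
    for Xk, Nk in zip(X, noise_mag):
        # half-wave rectify the subtracted magnitude, with a small floor
        mag = max(abs(Xk) - Nk, floor * abs(Xk))
        Y.append(cmath.rect(mag, cmath.phase(Xk)))
    return idft(Y)
```

In practice `noise_mag` would be averaged over several nonspeech frames, and successive frames would be windowed and overlap-added.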


Journal ArticleDOI
TL;DR: A fast algorithm for two-dimensional median filtering based on storing and updating the gray level histogram of the picture elements in the window is presented, which is much faster than conventional sorting methods.
Abstract: We present a fast algorithm for two-dimensional median filtering. It is based on storing and updating the gray level histogram of the picture elements in the window. The algorithm is much faster than conventional sorting methods. For a window size of m × n, the computer time required is O(n).

1,298 citations
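The histogram update at the heart of the algorithm is easiest to see in one dimension. The sketch below is an illustrative 1-D analogue of my own construction: each window step costs one deletion, one insertion, and a short histogram scan instead of a full re-sort.

```python
def running_median(row, n, levels=256):
    """Sliding-window median of a line of 8-bit pixels via a
    gray-level histogram, updated incrementally as the window moves."""
    assert n % 2 == 1 and len(row) >= n
    hist = [0] * levels
    for v in row[:n]:
        hist[v] += 1

    def hist_median():
        # return the (n//2 + 1)-th smallest gray level in the window
        count = 0
        for g in range(levels):
            count += hist[g]
            if count > n // 2:
                return g

    out = [hist_median()]
    for i in range(n, len(row)):
        hist[row[i - n]] -= 1   # pixel leaving the window
        hist[row[i]] += 1       # pixel entering the window
        out.append(hist_median())
    return out
```

In the 2-D filter, moving the window one pixel deletes one column of pixels and inserts another, which is how the O(n) per-pixel cost quoted above arises.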


Journal ArticleDOI
TL;DR: The author shows how stable finite difference operators can be derived to extrapolate acoustic wavefields in space, results that are now widely applied in the petroleum industry in its effort to image subsurface seismic reflectors.
Abstract: This book describes digital signal processing methods and principles as they are used by geophysicists in the processing of seismic data. As one of the few books which discusses time-series analysis from this perspective, it will probably become a standard for geophysicists. The material for this book is derived from two sources. The first part of the book results from class notes used by the author in a one-semester course in time-series analysis and the physics of stratified media given to geophysics students at the first-year graduate level at Stanford University, Stanford, CA. The second part of the book was compiled from work done by the author and his graduate students, and comprises much of their research efforts over the past six years. By applying specific approximations to the scalar wave equation, the author shows how stable finite difference operators can be derived to extrapolate acoustic wavefields in space. These results are now widely applied in the petroleum industry in its effort to image subsurface seismic reflectors.

768 citations


Journal ArticleDOI
TL;DR: Improved speech quality is obtained by efficient removal of formant and pitch-related redundant structure of speech before quantizing, and by effective masking of the quantizer noise by the speech signal.
Abstract: Predictive coding methods attempt to minimize the rms error in the coded signal. However, the human ear does not perceive signal distortion on the basis of rms error alone; the subjective loudness of the distortion depends on its spectral shape relative to the signal spectrum. In designing a coder for speech signals, it is necessary to consider the spectrum of the quantization noise and its relation to the speech spectrum. The theory of auditory masking suggests that noise in the formant regions would be partially or totally masked by the speech signal. Thus, a large part of the perceived noise in a coder comes from frequency regions where the signal level is low. In this paper, methods for reducing the subjective distortion in predictive coders for speech signals are described and evaluated. Improved speech quality is obtained: 1) by efficient removal of formant and pitch-related redundant structure of speech before quantizing, and 2) by effective masking of the quantizer noise by the speech signal.

376 citations


Journal ArticleDOI
TL;DR: Based on a linear model of speech production, it is shown that the moments of both glottal closure and opening can be determined from the normalized total squared error with proper choices of analysis window length and filter order.
Abstract: Covariance analysis as a least squares approach for accurately performing glottal inverse filtering from the acoustic speech waveform is discussed. Best results are obtained by situating the analysis window within a stable closed glottis interval. Based on a linear model of speech production, it is shown that the moments of both glottal closure and opening can be determined from the normalized total squared error with proper choices of analysis window length and filter order. Results from actual speech are presented to illustrate the technique.

347 citations


Journal ArticleDOI
TL;DR: In this paper, an interpretation of the LP residual is presented by considering the effects of the shape of glottal pulses, inaccurate estimation of formants and bandwidths, phase angles of formants at the instants of excitation, and zeros in the vocal tract system.
Abstract: In voiced speech analysis epochal information is useful in accurate estimation of pitch periods and the frequency response of the vocal tract system. Ideally, linear prediction (LP) residual should give impulses at epochs. However, there are often ambiguities in the direct use of LP residual since samples of either polarity occur around epochs. Further, since the digital inverse filter does not compensate the phase response of the vocal tract system exactly, there is an uncertainty in the estimated epoch position. In this paper we present an interpretation of LP residual by considering the effect of the following factors: 1) the shape of glottal pulses, 2) inaccurate estimation of formants and bandwidths, 3) phase angles of formants at the instants of excitation, and 4) zeros in the vocal tract system. A method for the unambiguous identification of epochs from LP residual is then presented. The accuracy of the method is tested by comparing the results with the epochs obtained from the estimated glottal pulse shapes for several vowel segments. The method is used to identify the closed glottis interval for the estimation of the true frequency response of the vocal tract system.

291 citations


Journal ArticleDOI
H. Sakoe1
TL;DR: A general principle of connected word recognition is given based on pattern matching between unknown continuous speech and artificially synthesized connected reference patterns and Computation time and memory requirement are both proved to be within reasonable limits.
Abstract: This paper reports a pattern matching approach to connected word recognition. First, a general principle of connected word recognition is given based on pattern matching between unknown continuous speech and artificially synthesized connected reference patterns. Time-normalization capability is provided by a dynamic-programming-based time-warping technique (DP-matching). Then, it is shown that the matching process is efficiently carried out by breaking it down into two steps. The derived algorithm is extensively subjected to recognition experiments. It is shown in a talker-adapted recognition experiment that digit data (one to four digits) connectedly spoken by five persons are recognized with as high as 99.6 percent accuracy. Computation time and memory requirement are both proved to be within reasonable limits.

289 citations
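The DP-matching step at the core of the system is ordinary dynamic time warping. The sketch below computes a single template-to-utterance distance; it is a deliberate simplification, since the paper's two-level algorithm matches against synthesized concatenations of reference patterns and uses slope constraints not shown here.

```python
def dtw_distance(a, b, dist=lambda p, q: abs(p - q)):
    """Dynamic-programming time warping: minimum summed frame distance
    over all monotonic alignments of sequences a and b."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(a[i - 1], b[j - 1]) + min(
                D[i - 1][j],       # stretch a
                D[i][j - 1],       # stretch b
                D[i - 1][j - 1])   # step both
    return D[n][m]
```

Isolated-word recognition picks the reference with the smallest warped distance; connected-word recognition extends the same recursion across word boundaries.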


Journal ArticleDOI
TL;DR: A speaker-independent isolated word recognition system is described which is based on the use of multiple templates for each word in the vocabulary, and shows error rates that are comparable to, or better than, those obtained with speaker-trained isolated word recognition systems.
Abstract: A speaker-independent isolated word recognition system is described which is based on the use of multiple templates for each word in the vocabulary. The word templates are obtained from a statistical clustering analysis of a large database consisting of 100 replications of each word (i.e., once by each of 100 talkers). The recognition system, which accepts telephone quality speech input, is based on an LPC analysis of the unknown word, dynamic time warping of each reference template to the unknown word (using the Itakura LPC distance measure), and the application of a K-nearest neighbor (KNN) decision rule. Results for several test sets of data are presented. They show error rates that are comparable to, or better than, those obtained with speaker-trained isolated word recognition systems.

245 citations
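The decision stage can be sketched independently of the LPC front end. The following is an illustrative KNN rule over labeled templates; the `dist` argument stands in for the time-warped Itakura LPC distance, and the scalar distance used below is purely for illustration.

```python
from collections import Counter

def knn_classify(templates, labels, query, dist, k=3):
    """K-nearest-neighbor rule over per-word reference templates:
    rank all templates by distance to the unknown token, then take
    a majority vote among the k closest."""
    order = sorted(range(len(templates)),
                   key=lambda i: dist(templates[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

Multiple templates per word let the KNN vote absorb talker-to-talker variation, which is the point of the clustering analysis described above.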


Journal ArticleDOI
TL;DR: In this paper, the authors present a technique for overcoming the problem of tissue absorption in emission tomography by solving a linear partial-differential equation that links the derivatives of the spectrum with respect to absorption and spatial frequencies.
Abstract: This paper presents a technique for overcoming the problem of tissue absorption in emission tomography. Given a set of equispaced projections in the interval (0, 2π), it is possible to derive an exact formula for recovering the spectrum of the image. The formula is obtained by solving a linear partial-differential equation that links the derivatives of the spectrum with respect to absorption and spatial frequencies. Examples of the application of the technique to synthetic and real data are given.

229 citations


Journal ArticleDOI
J. Treichler1
TL;DR: The eigenvalue-eigenvector technique is used to evaluate the ALE's performance as an adaptive prewhitener for autoregressive (AR) models with white observation noise and to quantify the convergence time and characteristics of the ALE.
Abstract: The adaptive line enhancer (ALE) was first described as a practical technique for separating the periodic from the broad-band components of an input signal and for detecting the presence of a sinusoid in white noise. Subsequent work has shown that this adaptive filtering structure is applicable to spectral estimation, predictive deconvolution, speech processing, interference rejection, and other applications which have historically used matrix inversion or Levinson's algorithm techniques. This paper uses an eigenvalue-eigenvector analysis of the expected ALE impulse response vector to demonstrate properties of the convergent filter and to quantify the convergence time and characteristics of the ALE. In particular the ALE's response to a sinusoid plus white noise input is derived and compared to a computer simulation of the ALE with such an input. The eigenvalue-eigenvector technique is then used to evaluate the ALE's performance as an adaptive prewhitener for autoregressive (AR) models with white observation noise. A method is demonstrated which prevents the problem of spectral estimation bias which usually accrues from the observation noise.

220 citations
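A minimal LMS realization of the ALE shows the structure the analysis applies to. This is an illustrative sketch only: the tap count, delay, and step size are arbitrary choices, not values from the paper.

```python
import math

def adaptive_line_enhancer(x, L=8, delay=1, mu=0.02):
    """LMS adaptive line enhancer: an L-tap filter predicts x[n] from
    samples delayed by `delay`. Narrow-band (predictable) components
    appear at the filter output y; broad-band noise stays in the
    prediction error e."""
    w = [0.0] * L
    y_out, e_out = [], []
    for n in range(len(x)):
        # reference vector: delayed input samples
        u = [x[n - delay - k] if n - delay - k >= 0 else 0.0
             for k in range(L)]
        y = sum(wi * ui for wi, ui in zip(w, u))
        e = x[n] - y
        # LMS weight update
        w = [wi + 2.0 * mu * e * ui for wi, ui in zip(w, u)]
        y_out.append(y)
        e_out.append(e)
    return y_out, e_out
```

The delay decorrelates broad-band noise between reference and input while a sinusoid, being predictable, passes to the output; the paper's eigenvalue-eigenvector analysis characterizes the convergence of exactly this recursion.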


Journal ArticleDOI
TL;DR: In this article, the authors present a theoretical framework for the design of subband and transform coder for low bit-rate speech decoding, which is based on spectral estimation and models of speech production and perception.
Abstract: Frequency domain techniques for speech coding have recently received considerable attention. The basic concept of these methods is to divide the speech into frequency components by a filter bank (sub-band coding), or by a suitable transform (transform coding), and then encode them using adaptive PCM. Three basic factors are involved in the design of these coders: 1) the type of the filter bank or transform, 2) the choice of bit allocation and the noise shaping properties it entails, and 3) the control of the step-size of the encoders. This paper reviews the basic aspects of the design of these three factors for sub-band and transform coders. Concepts of short-time analysis/synthesis are first discussed and used to establish a basic theoretical framework. It is then shown how practical realizations of sub-band and transform coding are interpreted within this framework. Principles of spectral estimation and models of speech production and perception are then discussed and used to illustrate how the "side information" can be most efficiently represented and utilized in the design of the coder (particularly the adaptive transform coder) to control the dynamic bit allocation and quantizer step-sizes. Recent developments and examples of the "vocoder-driven" adaptive transform coder for low bit-rate applications are then presented.

Journal ArticleDOI
TL;DR: A three-dimensional homomorphic model for human color vision is outlined and applications to color image enhancement, transmission, coding, and to the definition of a distortion measure between such images are shown.
Abstract: Evidence that the way brightness and chromatic information are processed by the human visual system is relevant to the field of image processing is presented. A three-dimensional homomorphic model for human color vision is outlined. Applications of this model to color image enhancement, transmission, coding, and to the definition of a distortion measure between such images are shown.

Journal ArticleDOI
TL;DR: It is shown that, by incorporating pitch frequency information into a frequency-scaling process based on STFA, it is possible, to a good approximation, to perform this operation in the time domain with very few arithmetic operations.
Abstract: Frequency scaling of speech signals by methods based on short-time Fourier analysis (STFA), analytic rooting, and harmonic compression using a bank of filters, is a complex operation which requires a large amount of computation in a digital implementation. It is shown in this paper that, by incorporating pitch frequency information into a frequency-scaling process based on STFA, it is possible, to a good approximation, to perform this operation in the time domain with very few arithmetic operations (one multiplication and two additions per output sample, in most applications). The derivation of the time-domain harmonic scaling (TDHS) algorithms, selection of parameters, and, in particular, the determination of an appropriate weighting function used in the algorithms, as well as several potential applications, are detailed in the paper. Two proposed applications are discussed in greater detail. These are 1) a vocoder system which incorporates waveform coding of the frequency divided signal (by a factor of up to 3), and 2) a computer-based isolated-word recognition system in which all input utterances are compressed to the same duration at the preprocessing phase effecting an overall computation reduction by a factor of up to 3. Computer simulation results which demonstrate the TDHS algorithms' performance are included.

Journal ArticleDOI
Steven Kay1
TL;DR: In this paper, it was shown that the effect of white noise on the autoregressive spectral estimate is to produce a smoothed spectrum, which is a result of the introduction of spectral zeros due to the noise.
Abstract: The autoregressive power spectral density estimator possesses excellent resolution properties. However, it has been shown that for the case of a sinusoidal autoregressive process the addition of noise to the time series results in a decrease in spectral resolution. It is proven that, in general, the effect of white noise on the autoregressive spectral estimate is to produce a smoothed spectrum. This smoothing is a result of the introduction of spectral zeros due to the noise. Finally, the use of a large-order autoregressive model to combat the effects of noise is discussed.
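The smoothing effect is visible even in the AR(1) case: added white noise leaves the lag-1 autocovariance unchanged in expectation but inflates the lag-0 term, shrinking the estimated coefficient and flattening (smoothing) the implied spectrum. The seeded simulation below is my own illustration, not an experiment from the paper.

```python
import random

def ar1_yule_walker(x):
    """AR(1) coefficient estimate from lag-0 and lag-1 autocorrelations."""
    r0 = sum(v * v for v in x) / len(x)
    r1 = sum(x[n] * x[n + 1] for n in range(len(x) - 1)) / len(x)
    return r1 / r0

rng = random.Random(0)
# synthesize an AR(1) process x[n] = 0.9 x[n-1] + w[n]
x, prev = [], 0.0
for _ in range(20000):
    prev = 0.9 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

a_clean = ar1_yule_walker(x)
# add white observation noise: r1 is unchanged, r0 grows,
# so the estimated pole moves toward the origin (flatter spectrum)
noisy = [v + rng.gauss(0.0, 2.0) for v in x]
a_noisy = ar1_yule_walker(noisy)
```

The shrunken pole is the one-coefficient analogue of the spectral zeros the paper identifies; fitting a larger-order AR model is the remedy the paper then discusses.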

Journal ArticleDOI
TL;DR: An integrated circuit realized in 4Φ-dynamic NMOS technology is presented which has been designed especially for echo cancellation in baseband modems and can pave the way to the use of similar circuits in the local area of the future digital telephone.
Abstract: In analogy with analog speech transmission, simultaneous two-way ("full-duplex") transmission of data signals over two-wire circuits can, in principle, be achieved by the application of hybrid couplers. In practice the resulting imperfect isolation between transmitter and receiver at the same end of a connection is disastrous for the detection of the data from the other end and has to be nullified in some way. A solution to this problem can be found in echo cancellation by means of an adaptive digital filter. In this paper an integrated circuit realized in 4Φ-dynamic NMOS technology is presented which has been designed especially for echo cancellation in baseband modems. A number of design aspects are treated in detail: the choice of adaptation algorithm, the interaction between the digital filter and its analog environment, interpolation, finite-precision implementation, and the natural occurrence of an effect akin to dithering. The realized IC can be used for the economical upgrading of existing baseband modems to full-duplex service and can pave the way to the use of similar circuits in the local area of the future digital telephone.

Journal ArticleDOI
TL;DR: Optimum filters as discussed by the authors are designed to enhance the estimation of time delay between signals received at two spatially separate sensors, and the resulting filters are placed in the predetection stage of a basic cross correlator.
Abstract: Optimum filters are designed to enhance the estimation of time delay between signals received at two spatially separate sensors. The resulting filters are placed in the predetection stage of a basic cross correlator. The transfer functions of the optimum filters are derived to operate according to two stated performance criteria. One selected criterion maximizes the expected peak at the time delay relative to the total background noise, and the other minimizes the difference between the incoming signal at the time delay and its estimated value. The flexibility and simplicity of the derivations lend themselves to the generalization of results and clarity of interpretation. The derived filters are contrasted with those found in the literature. One of the present filters is identical to that obtained by other authors when using the criterion of maximum likelihood. The reason for the difference between the present and maximum likelihood filters from the Eckart filter is pinpointed. Finally, the relationships among the various optimum and ad hoc filters are clarified.
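The prefilters feed a basic cross correlator, whose unfiltered form is easy to sketch. The function below simply picks the lag that maximizes the cross-correlation; the paper's optimum transfer functions would be applied to both channels before this step, which is omitted here.

```python
def estimate_delay(x, y, max_lag):
    """Estimate the delay of y relative to x as the lag maximizing
    the (unfiltered) cross-correlation of the two sensor signals."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        s = sum(x[n] * y[n + lag]
                for n in range(len(x)) if 0 <= n + lag < len(y))
        if s > best_val:
            best_val, best_lag = s, lag
    return best_lag
```

The two criteria in the paper correspond to different prefilter pairs that sharpen this correlation peak relative to the background before the maximum is taken.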

Journal ArticleDOI
J. Cadzow1
TL;DR: It will be shown that the basic extrapolation operation is feasible for only a particular subset of the class of band-limited signals, and an efficient algorithmic method for achieving the desired extrapolation on this subset is presented.
Abstract: In this paper, the task of extrapolating a time-truncated version of a band-limited signal shall be considered. It will be shown that the basic extrapolation operation is feasible for only a particular subset of the class of band-limited signals (i.e., the operation is well-posed mathematically). An efficient algorithmic method for achieving the desired extrapolation on this subset is then presented. This algorithm is structured so that all necessary signal manipulations involve signals which are everywhere zero except possibly on a finite "observation time" set. As a consequence, its implementation is straightforward and can be carried out in real time. This is to be contrasted with many existing extrapolation algorithms which theoretically involve operations on signals that are nonzero for almost all values of time. Their numerical implementation thereby necessitates an error producing time-truncation and a resultant deleterious effect on the corresponding extrapolation. Using straightforward algebraic operations, a convenient one-step extrapolation procedure is next developed. This is noteworthy in that this procedure thereby enables one to effectively circumvent any potentially slow convergence rate difficulties which typically characterize extrapolation algorithms. The effectiveness of this one-step procedure is demonstrated by means of two examples.
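For contrast with the one-step procedure, the iterative baseline the paper improves upon can be sketched as Gerchberg–Papoulis-style alternating projections. The discrete setting below (DFT band limitation, a small signal of my choosing) is an illustration, not the paper's formulation.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]

def band_limit(x, K):
    """Project onto signals whose DFT bins satisfy |k| <= K."""
    N = len(x)
    return idft([Xk if min(k, N - k) <= K else 0.0
                 for k, Xk in enumerate(dft(x))])

def extrapolate(observed, idx, N, K, iters=200):
    """Alternate band-limiting with re-imposing the observed samples."""
    x = [0.0] * N
    for i, v in zip(idx, observed):
        x[i] = v
    for _ in range(iters):
        x = band_limit(x, K)
        for i, v in zip(idx, observed):
            x[i] = v
    return x
```

Each pass projects onto the band-limited set and then restores the known samples; Cadzow's one-step procedure reaches the answer algebraically, sidestepping the potentially slow convergence of this loop.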

Journal ArticleDOI
TL;DR: A criterion which sufficiently guarantees the absence of overflow oscillations in two-dimensional digital filters in the state space is given, and this criterion is used to identify a certain class of two-dimensional filters for which overflow oscillations are proved to be absent.
Abstract: A criterion which sufficiently guarantees the absence of overflow oscillations in two-dimensional digital filters in the state space is given. This criterion is used to identify a certain class of two-dimensional filters for which overflow oscillations are proved to be absent. Such filters, however, are noncanonic.

Journal ArticleDOI
TL;DR: Research to code speech at 16 kbit/s with the goal of having the quality of the coded speech be equal to that of the original is reported, finding that the pitch predictor is not cost-effective on balance and may be eliminated.
Abstract: We report on research to code speech at 16 kbit/s with the goal of having the quality of the coded speech be equal to that of the original. Some of the original speech had been corrupted by noise and distortions typical of long-distance telephone lines. The basic structure chosen for our system was adaptive predictive coding. However, the rigorous requirements of this work led to a new outlook on the different aspects of adaptive predictive coding. We have found that the pitch predictor is not cost-effective on balance and may be eliminated. Solutions are presented to deal with the two types of quantization noise: clipping and granular noise. The clipping problem is completely eliminated by allowing the number of quantizer levels to increase indefinitely. An appropriate self-synchronizing variable-length code is proposed to minimize the average data rate; the coding scheme seems to be adequate for all speech and all conditions tested. The granular noise problem is treated by modifying the predictive coding system in a novel manner to include an adaptive noise spectral shaping filter. A design for such a filter is proposed that effectively eliminates the perception of granular noise.

Journal ArticleDOI
TL;DR: In this article, the authors developed a new homomorphic deconvolution system that is useful when the signal to be deconvolved is a convolution of two components, one of which is a train of pulses and for the other we have some a priori information on its approximate number of poles and zeros.
Abstract: In this paper, we develop a new homomorphic deconvolution system that is useful when the signal to be deconvolved is a convolution of two components, one of which is a train of pulses and for the other we have some a priori information on its approximate number of poles and zeros. The system that we develop is essentially the same as the logarithmic homomorphic deconvolution system (LHDS) [1]-[3] except that the logarithmic and exponential operations are replaced with (·)^γ and (·)^(1/γ) operations. By theoretical and/or empirical results, we illustrate that under appropriate conditions, the new system with the proper choice of γ can potentially yield a better result than the LHDS and that the results obtained by the new system with the value of γ close to 0 are essentially the same as those obtained by the LHDS. As a potential area of practical application, a new speech analysis/synthesis system is developed based on the new homomorphic deconvolution system discussed in this paper. Some preliminary results on the performance of such a system are also discussed.

Journal ArticleDOI
TL;DR: It is demonstrated that clustering can be a powerful tool for selecting reference templates for speaker-independent word recognition by identifying coarse structure, fine structure, overlap of, and outliers from clusters.
Abstract: It is demonstrated that clustering can be a powerful tool for selecting reference templates for speaker-independent word recognition. We describe a set of clustering techniques specifically designed for this purpose. These interactive procedures identify coarse structure, fine structure, overlap of, and outliers from clusters. The techniques have been applied to a large speech data base consisting of four repetitions of a 39 word vocabulary (the letters of the alphabet, the digits, and three auxiliary commands) spoken by 50 male and 50 female speakers. The results of the cluster analysis show that the data are highly structured containing large prominent clusters. Some statistics of the analysis and their significance are presented.

Journal ArticleDOI
TL;DR: The ALE output is shown to be the sum of two uncorrelated components, one arising from optimum finite-lag Wiener filtering of the narrow-band input components, and the other arising from the misadjustment error associated with the adaptation process.
Abstract: The adaptive line enhancer (ALE) is an adaptive digital filter designed to suppress uncorrelated components of its input, while passing any narrow-band components with little attenuation. The purpose of this paper is to analyze the second-order output statistics of the ALE in steady-state operation, for input samples consisting of weak narrow-band signals in white Gaussian noise. The ALE output is shown to be the sum of two uncorrelated components, one arising from optimum finite-lag Wiener filtering of the narrow-band input components, and the other arising from the misadjustment error associated with the adaptation process. General expressions are given for the output auto-correlation function and power spectrum with arbitrary narrow-band input signals, and the case of a single sinusoid in white noise is worked out as an example. Finally, the significance of these results to practical applications of the ALE is mentioned.

Journal ArticleDOI
TL;DR: It is shown that the complementary structure of the DIT and the DIF formulations makes possible the application of the pruning algorithms simultaneously at the input, as well as at the output, for either of the formulations.
Abstract: When an input data sequence has a large number of zeros and the number of output samples required to be computed is small, significant time saving can be achieved by a judicious combination of the pruning algorithms for decimation-in-time (DIT) and decimation-in-frequency (DIF). It is shown that the complementary structure of the DIT and the DIF formulations makes possible the application of the pruning algorithms simultaneously at the input, as well as at the output, for either of the formulations. For a given number of input and output points, a choice between the two formulations can be made based on the amount of time saved in each. Also, a simple assembly language modification is shown by which the bit reversal is made significantly faster.

Journal ArticleDOI
TL;DR: In this paper, a new algorithm for scaling in residue number systems (RNSs) is presented for applying residue number theory to recursive digital filtering, which provides an efficient method for scaling the output of each recursive filter section for use in subsequent iterations of recursion.
Abstract: A new algorithm for scaling in residue number systems (RNS's) is presented for applying residue number theory to recursive digital filtering. The algorithm provides an efficient method for scaling the output of each recursive filter section for use in subsequent iterations of the recursion. Four classes of residue systems are described in which scaling is simple and quantization errors are minimized, thereby combining good quantization error performance with the advantages of high-speed residue arithmetic. A computer analysis of the scaling quantization errors is presented, as well as some results from a recursive residue simulation. Three hardware architectures are described for the realization of recursive residue filters.
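The appeal of residue arithmetic is that addition and multiplication split into small, carry-free channels, and only reconstruction and scaling couple the channels. The sketch below shows the number system itself; the paper's scaling algorithm is not reproduced here.

```python
def to_rns(x, moduli):
    """Represent x by its residues; each channel then works independently."""
    return tuple(x % m for m in moduli)

def rns_op(a, b, moduli, op):
    """Carry-free, channel-parallel add or multiply."""
    return tuple(op(ai, bi) % m for ai, bi, m in zip(a, b, moduli))

def from_rns(res, moduli):
    """Chinese-remainder reconstruction of the integer from its residues."""
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for r, m in zip(res, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return x % M
```

Scaling, i.e., dividing by a constant and rounding, has no such channel-wise form; that is why an efficient scaling algorithm and a good choice of moduli are the crux of applying RNS arithmetic to recursive filters.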


Journal ArticleDOI
TL;DR: In this article, a new class of recursive digital filters for sampling rate reduction is discussed, which brings together the advantages of finite-duration impulse response (FIR) and elliptic designs by having only powers of zDin the denominator (D is the decimation ratio).
Abstract: A new class of recursive digital filters for sampling rate reduction is discussed. These filters present equiripple behavior in the magnitude response, with all their zeros located on the unit circle. These new filters bring together, to some extent, the advantages of finite-duration impulse response (FIR) and elliptic designs by having only powers of z^D in the denominator (D is the decimation ratio). Only every Dth output has to be computed, as in the FIR case; while some feedback terms, as in the elliptic case, are also present. The design and some optimality properties of these filters are discussed. Some characteristics of filters with only powers of z^D in the denominator, such as pole-zero location, group delay, and coefficient sensitivity are discussed and compared with elliptic designs. It is shown how these new filters require significantly fewer multiplications per second than equivalent FIR and elliptic designs.

Journal ArticleDOI
Henri J. Nussbaumer1, P. Quandalle1
TL;DR: In this paper, two methods are presented for computing the discrete Fourier transform (DFT) by polynomial transforms; these are particularly well adapted to multidimensional DFT's as well as to some one-dimensional DFT's.
Abstract: Polynomial transforms, defined in rings of polynomials, have been introduced recently and have been shown to give efficient algorithms for the computation of two-dimensional convolutions. In this paper we present two methods for computing discrete Fourier transforms (DFT) by polynomial transforms. We show that these techniques are particularly well adapted to multidimensional DFT's as well as to some one-dimensional DFT's and yield algorithms that are, in many instances, more efficient than the fast Fourier transform (FFT) or the Winograd Fourier transform algorithm (WFTA). We also describe new split nesting and split prime factor techniques for computing large DFT's from a small set of short DFT's with a minimum number of operations.

Journal ArticleDOI
TL;DR: The state of the art of estimating vocal-tract area functions by linear prediction of speech is surveyed, together with proposed solutions to the method's main limitations and some current developments pertaining to them.
Abstract: The state of the art of the method for estimating vocal-tract area functions based on the linear prediction of speech is discussed. Limitations and problems involved in this method are: 1) frequency band limitation of the signal, 2) source characteristics, 3) boundary conditions, 4) energy losses within the vocal tract, 5) vocal-tract length estimation, 6) dynamics of the vocal tract, and 7) evaluation of the method [1]. The proposed solutions for these problems are discussed together with some of the current developments pertaining to these problems. Advantages and disadvantages of this method are also discussed in comparison with the lip impulse response method.

Journal ArticleDOI
TL;DR: A method is presented for determining the harmonic components of a noisy signal by nonlinear extrapolation beyond the data interval by an algorithm that adaptively reduces the spectral components due to noise.
Abstract: A method is presented for determining the harmonic components of a noisy signal by nonlinear extrapolation beyond the data interval. The method is based on an algorithm that adaptively reduces the spectral components due to noise.

Journal ArticleDOI
L. Siegel1
TL;DR: In the training procedure, covering and satisfaction were attained on successively larger sets of speakers, and a classifier was obtained which could correctly make the V/UV decision for all of the speakers used in testing, including those not used in the training process.
Abstract: A classifier to make the voiced/unvoiced (V/UV) decision in speech analysis which performs with an error rate of less than half of a percent is presented. The decision making process is viewed as a pattern recognition problem in which a number of features can be used to make the classification. Training is accomplished using a nonparametric, nonstatistical technique. In order to obtain a classifier which would make the correct decision for a variety of speakers and to determine which of the features under consideration should be used, a procedure for interleaving the contributions of the feature and speaker sets was developed. This procedure is presented in terms of the notions of covering and satisfaction. The failure of a classifier to cover a set of speakers indicates that more training information from those speakers is necessary to define the classifier. The failure of the classifier to satisfy a set of speakers indicates that the performance of the classifier could be improved by the use of more features in making the V/UV decision. In the training procedure, covering and satisfaction were attained on successively larger sets of speakers, with the result that a classifier was obtained which could correctly make the V/UV decision for all of the speakers used in testing, including those not used in the training process.
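The abstract does not enumerate the features actually used, so the sketch below only illustrates the pattern-recognition framing with two classic V/UV features, short-time energy and zero-crossing rate, and hand-set thresholds; both the features and the thresholds are assumptions for illustration, not the paper's trained classifier.

```python
import math

def frame_features(frame):
    """Two classic V/UV features: log short-time energy and
    zero-crossing rate (voiced speech: high energy, few crossings)."""
    energy = sum(v * v for v in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) \
        / (len(frame) - 1)
    return math.log10(energy + 1e-12), zcr

def classify_vuv(frame, e_thresh=-2.0, z_thresh=0.3):
    """Toy threshold rule; the paper instead trains a nonparametric
    classifier over many features and speakers."""
    e, z = frame_features(frame)
    return "V" if (e > e_thresh and z < z_thresh) else "U"
```

The covering/satisfaction procedure described above replaces these fixed thresholds: failures to cover call for more training speakers, and failures to satisfy call for more features.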