Showing papers in "IEEE Transactions on Acoustics, Speech, and Signal Processing in 1974"
TL;DR: The implementation of the AMDF pitch extractor (nonreal-time simulation and real-time) is described and experimental results presented to illustrate its basic measurement properties.
Abstract: This paper describes a method for using the average magnitude difference function (AMDF) and associated decision logic to estimate the pitch period of voiced speech sounds. The AMDF is a variation on autocorrelation analysis where, instead of correlating the input speech at various delays (where multiplications and summations are formed at each value of delay), a difference signal is formed between the delayed speech and the original and, at each delay, the absolute magnitude of the difference is taken. The difference signal is always zero at delay = 0, and exhibits deep nulls at delays corresponding to the pitch period of voiced sounds. Some of the reasons the AMDF is attractive include the following: 1) it is a simple measurement which gives a good estimate of pitch contour, 2) it has no multiply operations, 3) its dynamic range characteristics are suitable for implementation on a 16-bit machine, and 4) the nature of its operations makes it suitable for implementation on a programmable processor or in special purpose hardware. The implementation of the AMDF pitch extractor (nonreal-time simulation and real-time) is described and experimental results presented to illustrate its basic measurement properties.
562 citations
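The AMDF idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the fixed-length summation window, the lag search range, and the synthetic test frame are all assumptions, and the paper's decision logic is not reproduced.

```python
import math

def amdf(frame, max_lag):
    """Average magnitude difference for lags 0..max_lag.

    The same number of terms is summed at every lag so the values are
    comparable. The inner sum uses only subtractions and absolute values;
    the division by n is an optional normalization added for this sketch.
    """
    n = len(frame) - max_lag
    if n <= 0:
        raise ValueError("frame must be longer than max_lag")
    return [sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n
            for lag in range(max_lag + 1)]

def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the deepest AMDF null."""
    d = amdf(frame, max_lag)
    return min(range(min_lag, max_lag + 1), key=lambda lag: d[lag])

# Hypothetical synthetic "voiced" frame with a period of 50 samples.
PERIOD = 50
frame = [math.sin(2 * math.pi * i / PERIOD)
         + 0.5 * math.sin(4 * math.pi * i / PERIOD)
         for i in range(400)]

print(amdf(frame, 80)[0])                    # zero at delay = 0
print(estimate_pitch_period(frame, 20, 80))  # lag of the deepest null
```

The lower search bound (`min_lag`) matters because the function is trivially zero at delay 0; a real extractor would also need voicing decisions and protection against picking a multiple of the true period.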
TL;DR: This approach capitalizes on recent advances in semiconductor memory technology and is shown to offer significant reductions in cost and power consumption for the same speed of operation as that of existing realizations.
Abstract: A new approach to the implementation problem of digital filters is presented. This approach capitalizes on recent advances in semiconductor memory technology and is shown to offer significant reductions in cost and power consumption for the same speed of operation as that of existing realizations. Furthermore, this approach makes possible speeds of operation which cannot be achieved by existing realizations. The proposed approach yields a very flexible hardware configuration and a discussion of the various options is presented together with a comparison to existing realizations. The mean-squared error resulting from the use of finite word length is analyzed.
529 citations
TL;DR: In this paper, a Fermat number transform (FNT) is proposed for digital computation, requiring on the order of N log N additions, subtractions and bit shifts, but no multiplications.
Abstract: The structure of transforms having the convolution property is developed. A particular transform is proposed that is defined on a finite ring of integers with arithmetic carried out modulo Fermat numbers. This Fermat number transform (FNT) is ideally suited to digital computation, requiring on the order of N log N additions, subtractions and bit shifts, but no multiplications. In addition to being efficient, the Fermat number transform implementation of convolution is exact, i.e., there is no roundoff error. There is a restriction on sequence length imposed by word length but multi-dimensional techniques are discussed which overcome this limitation. Results of an implementation on the IBM 370/155 are presented and compared with the fast Fourier transform (FFT) showing a substantial improvement in efficiency and accuracy.
292 citations
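The exactness claim in the abstract above can be checked on a toy case. The sketch below is an assumption-laden illustration rather than the paper's algorithm: it picks the Fermat number 257 with root 2 and length 16, and it evaluates the transform as a direct O(N^2) sum using `pow` for readability instead of the fast shift-and-add structure. In hardware, the multiplications by powers of 2 would be bit shifts; the pointwise products in the transform domain remain ordinary modular multiplications.

```python
F = 257    # Fermat number F_3 = 2**8 + 1 (chosen for this sketch)
N = 16     # transform length: 2 has multiplicative order 16 modulo 257
ALPHA = 2  # transform root; its powers are powers of two

def fnt(x, root=ALPHA):
    """Direct (non-fast) number theoretic transform modulo F."""
    return [sum(x[n] * pow(root, n * k, F) for n in range(N)) % F
            for k in range(N)]

def ifnt(X):
    """Inverse transform; inverses via Fermat's little theorem (257 is prime)."""
    inv_n = pow(N, F - 2, F)
    inv_root = pow(ALPHA, F - 2, F)
    return [(inv_n * v) % F for v in fnt(X, inv_root)]

def cyclic_convolve(a, b):
    """Cyclic convolution through the transform domain, in integer arithmetic."""
    A, B = fnt(a), fnt(b)
    return ifnt([(p * q) % F for p, q in zip(A, B)])

def direct_cyclic(a, b):
    """Reference cyclic convolution computed term by term."""
    return [sum(a[m] * b[(n - m) % N] for m in range(N)) for n in range(N)]

# Zero-padded inputs whose convolution values stay below the modulus.
a = [1, 2, 3, 4] + [0] * 12
b = [5, 6, 7] + [0] * 13
print(cyclic_convolve(a, b) == direct_cyclic(a, b))
```

The result is exact only while every true output value fits in the range representable modulo F, which is the word-length restriction the abstract mentions.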
TL;DR: An algorithm is presented which finds the frequency and amplitude of the first three formants during all vowel-like segments of continuous speech, using as input the peaks of the linear prediction spectra and a segmentation parameter to indicate energy and voicing.
Abstract: An algorithm is presented which finds the frequency and amplitude of the first three formants during all vowel-like segments of continuous speech. It uses as input the peaks of the linear prediction spectra and a segmentation parameter to indicate energy and voicing. Ideally, the first three peaks are the first three formants. Frequently, however, two peaks merge, or spurious peaks appear, and the difficult part is to recognize such situations and deal with them. The general method is to fill formant slots with the available peaks at each frame, based on frequency position relative to an educated guess. Then, if a peak is left over and/or a slot is unfilled, special routines are called to decide how to deal with them. Included is a formant enhancement technique, analogous to a similar technique which has been implemented via the chirp-z transform [8], which usually succeeds in separating two merged formants. Processing begins at the middle of each high volume voiced segment, where formants are most likely to be correct, and branches outward from there in both directions in time, using the most recently found formant frequencies as the educated guess for the current frame. The algorithm has been implemented at Lincoln Laboratory on the Univac 1219 and the Fast Digital Processor, a programmable processor [9], and has been tested on a large number of unrestricted sentences.
247 citations
TL;DR: In this article, a spectral-flatness measure is introduced to give a quantitative measure of "whiteness" of a spectrum, and it is shown that maximizing the spectral flatness of an inverse filter output or linear predictor error is equivalent to the autocorrelation method of linear prediction.
Abstract: The purpose of this paper is to introduce a spectral-flatness measure into the study of linear prediction analysis of speech. A spectral-flatness measure is introduced to give a quantitative measure of "whiteness" of a spectrum. It is shown that maximizing the spectral flatness of an inverse filter output or linear predictor error is equivalent to the autocorrelation method of linear prediction. Theoretical properties of the flatness measure are derived, and compared with experimental results. It is shown that possible ill-conditioning of the analysis problem is directly related to the spectral-flatness measure and that prewhitening by a simple first-order linear predictor to increase spectral flatness can greatly reduce the amount of ill-conditioning.
160 citations
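The abstract above does not state the formula for the flatness measure. A minimal sketch, assuming the commonly used definition (ratio of the geometric mean to the arithmetic mean of the power spectrum samples), might look like this; the function name and the example spectra are hypothetical.

```python
import math

def spectral_flatness(power_spectrum):
    """Geometric mean over arithmetic mean of strictly positive power samples.

    Assumed definition: 1.0 for a perfectly flat ("white") spectrum,
    approaching 0 as the spectrum becomes more peaked.
    """
    n = len(power_spectrum)
    geometric_mean = math.exp(sum(math.log(p) for p in power_spectrum) / n)
    arithmetic_mean = sum(power_spectrum) / n
    return geometric_mean / arithmetic_mean

flat = [2.0] * 8             # white spectrum
peaky = [100.0] + [1.0] * 7  # one dominant resonance

print(spectral_flatness(flat))
print(spectral_flatness(peaky))
```

Under this definition, a prewhitening step that reduces spectral tilt moves the value toward 1, which is the direction the abstract associates with better conditioning.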
TL;DR: The formulation is very general and includes block processing and sectioning as special cases and, when used with various fast algorithms for short length convolutions, results in improved multiplication efficiency.
Abstract: This paper presents two formulations of multi-dimensional digital signals from one-dimensional digital signals so that multidimensional convolution will implement one-dimensional convolution of the original signals. This has reduced an important word length restriction when used with the Fermat number transform. The formulation is very general and includes block processing and sectioning as special cases and, when used with various fast algorithms for short length convolutions, results in improved multiplication efficiency.
137 citations
TL;DR: The method is extended to recursive filters and a comparison is made with existing techniques of implementing digital filters for the needs in computation and storage hardware: a specific example of design underlines the reduction in computation speed achieved in practice through this method.
Abstract: Any digital filter can be decomposed into two basic subsets, an extrapolator the output of which is sampled at a frequency depending only on the filter bandwidth and an interpolator delivering the filtered signal at the imposed output sampling rate. Redundancy in extrapolator and interpolator is removed by introducing half-band nonrecursive filtering elements for which definition, performance figures and efficient implementation are supplied. They reduce significantly the necessary computation and storage at the cost of a slight group delay increase. A formula is given for the amount of multiplications to be carried out every second in a filter; it depends on the filter bandwidth, signal to distortion ratio, and input-output sampling rate. The method is extended to recursive filters and a comparison is made with existing techniques of implementing digital filters for the needs in computation and storage hardware: a specific example of design underlines the reduction in computation speed achieved in practice through this method, which brings digital filters in a most favorable position for their competition against analog filters in many application fields.
133 citations
TL;DR: In this paper, a two-pass recursive scheme is proposed for realizing zero phase shift filters with arbitrary magnitude characteristics, where the first pass is performed in forward time and the second in reverse time.
Abstract: A two-pass recursive scheme is proposed for realizing zero phase shift filters with arbitrary magnitude characteristics. The first pass is performed in forward time and the second in reverse time. The effect of initial and reverse time transients is discussed, and a scheme for quasi on-line adaptation is presented.
109 citations
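The two-pass idea can be illustrated with a very small recursive filter. This is a sketch under stated assumptions: the one-pole low-pass, its coefficient, and the zero initial conditions are choices made for the example, and the paper's treatment of transients and quasi on-line adaptation is not reproduced.

```python
def one_pole_lowpass(x, a=0.5):
    """Recursive filter y[n] = (1 - a) * x[n] + a * y[n - 1], zero initial state."""
    y, state = [], 0.0
    for sample in x:
        state = (1.0 - a) * sample + a * state
        y.append(state)
    return y

def zero_phase_filter(x, a=0.5):
    """Filter forward in time, then filter the time-reversed result, then reverse back."""
    forward = one_pole_lowpass(x, a)
    backward = one_pole_lowpass(forward[::-1], a)
    return backward[::-1]

# Unit impulse in the middle of a 101-sample record.
impulse = [0.0] * 101
impulse[50] = 1.0

single_pass = one_pole_lowpass(impulse)  # one-sided, causal response
two_pass = zero_phase_filter(impulse)    # response symmetric about sample 50

print(single_pass[49], single_pass[51])
print(two_pass[49], two_pass[51])
```

A response that is symmetric about the impulse position has zero phase shift, and its magnitude response is the square of the single-pass magnitude. Near the ends of a finite record the symmetry is only approximate, because each pass starts from an assumed initial state.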
TL;DR: The problem of designing recursive digital filters whose frequency response approximates an arbitrarily prescribed function in the Chebyshev sense on a single interval is considered, and an algorithm is given for determining the best Chebyshev as well as the best equiripple approximation.
Abstract: The problem of designing recursive digital filters whose frequency response approximates an arbitrarily prescribed function in the Chebyshev sense on a single interval is considered. Certain degenerate cases where the best Chebyshev approximation is not equiripple are studied in detail, and an algorithm is given for determining the best Chebyshev as well as the best equiripple approximation. Finally, a number of examples illustrating applications of this algorithm are given.
109 citations
TL;DR: An optimization algorithm is developed to minimize the p-error criterion under the constraint that the resulting filter be stable, and several examples are solved to illustrate the technique.
Abstract: In this paper a design technique for the two-dimensional filters is proposed. An optimization algorithm is developed to minimize the p-error criterion under the constraint that the resulting filter be stable. Design of one-dimensional filter may be considered as a special case to which the proposed algorithm is applicable. Several examples are solved to illustrate the technique.
108 citations
TL;DR: In this paper, a technique for designing stable two-dimensional recursive filters whose magnitude response is approximately circularly symmetric is presented, which is achieved by cascading a number of elementary filters which are called rotated filters.
Abstract: The digital filtering of two-dimensional signals offers the many advantages characteristic of digital computers, such as flexibility and accuracy. Applications exist in the processing of images and geophysical data. A technique is presented for designing stable two-dimensional recursive filters whose magnitude response is approximately circularly symmetric. This is achieved by cascading a number of elementary filters which are called rotated filters because they are designed by rotating one-dimensional continuous filters and using the two-dimensional z-transform to obtain the corresponding digital filter. Stability of these filters is considered in detail and the results obtained are stated in two corollaries. In particular it is proved that rotated filters are stable if the angle of rotation is between 270° and 360°. Finally, methods of analysis and design of the shape, circular symmetry, and cutoff frequency of two-dimensional recursive filters are discussed.
TL;DR: In this article, the use of linear programming techniques for the design of infinite impulse response (IIR) digital filters was discussed and it was shown that, in theory, a weighted equiripple approximation to an arbitrary magnitude function can be obtained in a predictable number of applications of the simplex algorithm.
Abstract: This paper discusses the use of linear programming techniques for the design of infinite impulse response (IIR) digital filters. In particular, it is shown that, in theory, a weighted equiripple approximation to an arbitrary magnitude function can be obtained in a predictable number of applications of the simplex algorithm of linear programming. When one implements the design algorithm, certain practical difficulties (e.g., coefficient sensitivity) limit the range of filters which can be designed using this technique. However, a fairly large number of IIR filters have been successfully designed and several examples will be presented to illustrate the range of problems for which we found this technique to be useful.
TL;DR: Experimental results are presented which illustrate both the capabilities and limitations of linear prediction vocoders.
Abstract: A detailed discussion of the computer simulation of a linear prediction vocoder system is presented. The basic technique used for analysis is the autocorrelation method of linear prediction. New results include modifications to the simplified inverse filter tracking (SIFT) algorithm for more efficient pitch extraction, coding algorithms for low-bit rate transmission, a simplified synthesizer gain calculation, and a bias correction for the synthesizer driving function. Experimental results are presented which illustrate both the capabilities and limitations of linear prediction vocoders.
TL;DR: An algorithm is given for stability test of filters of arbitrary dimension and complexity, based on the generation of a number of multivariable polynomials, reduction of each of these into several single variable polynomials, and a scheme of repeated single variable polynomial factorizations and back substitutions.
Abstract: An algorithm with a view towards computer implementation is given for stability test of filters of arbitrary dimension and complexity. The algorithm is based on the generation of a number of multivariable polynomials, reduction of each of these into several single variable polynomials by a finite number of rational operations, and a scheme of repeated single variable polynomial factorizations and back substitutions.
TL;DR: In this paper, the exact impulse response of field parameters for any field point on or off axis for the case where a circular disc radiator face is subjected to a displacement step corresponding to a velocity impulse is reviewed.
Abstract: The exact impulse response of field parameters for any field point on or off axis for the case where a circular disc radiator face is subjected to a displacement step corresponding to a velocity impulse is reviewed. By convolution, the transient field pattern for any arbitrary motion of the disc can be obtained. The exact response for a half-sine monopulse is computed. An approximate representation of the transient pressure response to the velocity impulse input at the disc is derived, and it is shown to correspond to the replica pulses described previously. The regions of validity of the approximation are quite limited and the replica pulses are displaced in time from the positions formerly attributed to them. The displaced replica approximation is applied to an examination of the structure of the near field for continuous sinusoidal excitation and a plot of positions of extrema is produced. It is shown that this approximation gives good agreement with the exact values and is superior to the previous published approach in this regard. For short sinusoidal pulses the effect of pulse length on the field pattern, and of field point on the time history of a transient wave are shown. When the excitation is a short sinusoidal pulse the effect of the pulse length and field point position on the field pattern and wave shape are demonstrated.
TL;DR: A new method of tracking the fundamental frequency of voiced speech is described, shown to be of similar accuracy as the Cepstrum technique and to be faster than the SIFT algorithm.
Abstract: A new method of tracking the fundamental frequency of voiced speech is described. The method is shown to be of similar accuracy as the Cepstrum technique. Since the method involves only additions, no multiplication, it is shown to be faster than the SIFT algorithm. The basis of the method is searching for a minimum in the magnitude of the difference between a speech segment and a delayed speech segment. This is shown to be equivalent to selecting the comb filter which best annihilates the input signal.
TL;DR: In this article, a systematic procedure to test for stability of three-dimensional filters (discrete and continuous) is presented, based on repeated applications of an extended Hermite or Schur-Cohn formulation, and use of Sturm's theorem to determine the content of a system of polynomial inequalities.
Abstract: In this paper, a systematic procedure to test for stability of three-dimensional filters (discrete and continuous) is presented. The test is based on repeated applications of an extended Hermite or Schur-Cohn formulation, and use of Sturm's theorem to determine the content of a system of polynomial inequalities in a single indeterminate. The need for generating a constructive algorithm for stability tests for higher than three-dimensional filters using Tarski's generalization of Sturm's theorem is discussed. Application of certain combinatorial rules for transforming the multidimensional digital filter problem to the multidimensional continuous filter problem (or vice versa) is made.
TL;DR: The application to unequal bandwidth and vernier spectrum analysis of a technique referred to as digital frequency warping is discussed and a comparison is presented between the bandwidth as a function of frequency for the digital warping technique and proportional bandwidth analysis.
Abstract: The application to unequal bandwidth and vernier spectrum analysis of a technique referred to as digital frequency warping is discussed. In this technique a sequence is transformed in such a way that the Fourier transforms of the original and transformed sequences are related by a nonlinear transformation of the frequency axis. An equal bandwidth analysis carried out on the transformed sequence then corresponds to an unequal bandwidth analysis of the original sequence. A comparison is presented between the bandwidth as a function of frequency for the digital warping technique and proportional bandwidth analysis. An analysis of the effects of finite register length in implementing digital frequency warping is also presented.
TL;DR: The philosophy adopted is that for a given FIR filter structure, the filter coefficients can be designed to provide a minimum mean-squared error (MMSE) estimate of a random signal sequence imbedded in a random noise sequence.
Abstract: The problem of designing a finite duration impulse response (FIR) digital filter to approximate a desired spectral response is treated in this paper. The philosophy adopted is that for a given FIR filter structure, the filter coefficients can be designed to provide a minimum mean-squared error (MMSE) estimate of a random signal sequence (the design-signal) imbedded in a random noise sequence. By treating the signal and noise covariance functions as design parameters, one can design FIR filters with spectral responses that approximate the power spectral density of the design-signal. For signal processing applications that require some attention to signal fidelity, as well as noise rejection, the MMSE philosophy seems appropriate (as opposed to a maximum signal-to-noise ratio philosophy, for example). Several practical designs are presented that emphasize the simplicity of the design technique and illustrate the selection of design parameters. The designs show quite dramatically that the MMSE design technique can be competitive with existing low-pass and bandpass design techniques. Finally, considerable attention is given to an efficient Toeplitz matrix inversion algorithm that permits rapid inversion of the covariance matrices that arise in the MMSE design. The resulting computation times for the design of high-order filters (N = 128, e.g.) appear to be shorter than computation times for competing algorithms.
TL;DR: A specific representation of two-dimensional sequences as one- dimensional sequences is presented, which is valid both for signals of finite extent and for the more general class of signals with rational Z-transforms.
Abstract: A number of signal processing techniques which have been developed for processing one-dimensional sequences do not generalize to the processing of two-dimensional signals, largely due to the absence of a two-dimensional factorization theorem. In an attempt to circumvent this problem, a specific representation of two-dimensional sequences as one-dimensional sequences is presented in this paper. Using this mapping several two-dimensional problems can be viewed as one-dimensional problems and approached using one-dimensional techniques. This representation is valid both for signals of finite extent and for the more general class of signals with rational Z-transforms. In this paper we consider applications of these techniques for high speed convolution, processing of drum scans, and two-dimensional finite impulse response (FIR) filter design.
TL;DR: For pitch synchronous analysis, nonstationarity is a better assumption than stationarity, but for pitch asynchronous analysis and large analysis segment size the performance of both formulations in representing the speech waveform is practically the same.
Abstract: The purpose of this paper is to present the theoretical differences and results of experimental comparison of the stationary (autocorrelation) and nonstationary (covariance) linear prediction formulations when applied to voiced speech analysis. In this experimental study the three criteria used for comparison purposes are: 1) total minimum normalized squared error, 2) accuracy in estimating speech spectrum, and 3) accuracy in estimating formant parameters. The results of linear prediction pitch synchronous as well as pitch asynchronous analyses of synthetic and natural speech are given. The influence of analysis segment size and its position on the estimated formant parameters and total minimum normalized squared error has been investigated. For pitch synchronous analysis, nonstationarity is a better assumption than stationarity, but for pitch asynchronous analysis and large analysis segment size (20-25 ms) the performance of both formulations in representing the speech waveform is practically the same.
TL;DR: Initial applications of the voice response system are in computer aided voice wiring, automatic directory assistance, and experiments on speaker verification, but the system is sufficiently modular to adapt readily to other applications.
Abstract: In this paper we discuss the issues involved in implementing an automatic computer voice response system which is capable of serving up to ten independent output channels in real time. The system has been implemented on a Data General NOVA-800 minicomputer. Individual isolated words and phrases are coded at a rate of 24 000 bits/s using a hardware adaptive, differential pulse-code modulation (ADPCM) coder, and stored on a fixed-head disk as a random access vocabulary. By exploiting the features of ADPCM coding, it is possible to create and edit automatically a vocabulary for the system from an analog tape recording of the spoken entries, with minimal operator intervention. To provide ten simultaneous output lines of speech which are independent of each other required the use of an efficient scheduling algorithm. Such an algorithm was provided by the computer manufacturer in their real-time multitasking system which was part of their Fortran software. Thus almost all the programming required to implement this real-time system was in Fortran, thereby providing flexibility and ease in making changes in the system. Initial applications of the voice response system are in computer aided voice wiring, automatic directory assistance, and experiments on speaker verification, but the system is sufficiently modular to adapt readily to other applications.
TL;DR: The parametrically controlled analyzer (PCA) is a large PL/I program which has been designed to perform spectral analysis of speech signals and features parametric selection of several analysis methods, including discrete Fourier transformation and linear predictive coding.
Abstract: The parametrically controlled analyzer (PCA) is a large PL/I program which has been designed to perform spectral analysis of speech signals. PCA features parametric selection of several analysis methods, including discrete Fourier transformation and linear predictive coding. Also, selection may be made among various smoothing, normalization, and interpolation methods. PCA develops high-quality spectrographic representations of speech for standard line printers and CRT displays. The PCA is described and numerous examples of various parameter settings are presented and discussed.
TL;DR: This paper describes techniques that utilize the time samples of the desired response as target values for an iterative minimization, leading to recursive filter designs requiring little computer time.
Abstract: The nonlinear minimization problem that results from recursive digital filter design with phase constraints is simplified somewhat by working in the time domain. This paper describes techniques that utilize the time samples of the desired response as target values for an iterative minimization. Initial values for the α and β (feedforward and feedback) coefficients can be obtained by one of several reliable methods and fed into iterative routines that lead to a locally optimal solution for the coefficients. The initial guess procedures, stemming from regressionlike equations, only require the solution of a set of linear equations. In addition, the iteration procedures described in this paper lead to recursive filter designs requiring little computer time. Examples are presented to illustrate a range of applications.
TL;DR: A simple method of calculating the steady-state value of the variance of the output noise of a digital filter due to the input quantization noise or internally generated noise from product round-off is presented.
Abstract: A simple method of calculating the steady-state value of the variance of the output noise of a digital filter due to the input quantization noise or internally generated noise from product round-off is presented. The output noise is expressed as a sum of simpler terms belonging to one of four basic groups. Explicit expressions have been developed for rapid evaluation of these terms in the expansion. The method is illustrated by means of examples.
TL;DR: The application of the "branch and bound" technique for nonlinear discrete optimization, due to Dakin, to the problem of finding the coefficients of a recursive digital filter with prescribed number of bits, to meet arbitrary response specifications of the magnitude characteristic is investigated.
Abstract: The application of the "branch and bound" technique for nonlinear discrete optimization, due to Dakin, to the problem of finding the coefficients of a recursive digital filter with prescribed number of bits, to meet arbitrary response specifications of the magnitude characteristic, is investigated. Due to the fact that the objective function is nonlinear and the stability constraints are linear with respect to the parameter, the recent algorithm for nonlinear programming due to Best and Ritter is used. Based on the ideas presented, a general computer program has been developed. Numerical experience with the present approach is also presented.
TL;DR: In this article, the power spectral effects of spline interpolators of all orders were investigated, and a general technique was given for finding the steady-state spectral effect of splines, when applied following uniform sampling of the input function.
Abstract: This paper discusses the power spectral effects of spline interpolators. A general technique is given for finding the steady-state spectral effects of splines of all orders, when applied following uniform sampling of the input function. The following observations are made: 1) the even order splines that were examined (second and fourth order) possessed divergent steady-state frequency transfer functions, 2) the degree of preservation of the power spectral density of the input process increased with the order of the (odd order) spline used for interpolation, and 3) the reconstruction of a stationary random process over a finite record length will, on the average, have less power than indicated by the steady-state transfer function.
TL;DR: In a wave digital filter, attenuation sensitivity can be defined in two different ways, one relevant to attenuation distortion and the other to roundoff noise; structures are discussed for which both definitions coincide, so that both sensitivities can be kept low simultaneously.
Abstract: In a wave digital (WD) filter, attenuation, and thus attenuation sensitivity, can be defined in two different ways. The first type of sensitivity is of importance from the point of view of attenuation distortion and can easily be kept small. The second, which differs from the first at most by an additive constant, is important from the point of view of roundoff noise. Structures are discussed for which both definitions coincide, thus ensuring the possibility of keeping both sensitivities simultaneously low.
TL;DR: In this article, the authors examine the theoretical and practical issues of designing multiband filters and present several strategies for choosing the input parameters for the McClellan et al. filter-design algorithm to yield reasonable filters which meet arbitrary specifications.
Abstract: Although much has been learned about the relationships between design parameters for finite impulse-response (FIR) low-pass digital filters, very little is known about the relationships between the parameters of multiband filters. Thus given a set of design specifications for a multiband FIR filter (e.g., filter band edge frequencies and desired ripples in each of the bands) it is difficult to choose a set of modified parameters which will yield an acceptable filter using a standard FIR design algorithm. By an acceptable filter we mean one with monotonic behavior of the frequency response in the DON'T-CARE or transition regions between bands and one providing at least the desired attenuation (or ripple) in each of the bands. In this paper, we examine the theoretical and practical issues of designing multiband filters and present several strategies for choosing the input parameters for the McClellan et al. filter-design algorithm to yield reasonable filters which meet arbitrary specifications.
TL;DR: Various results obtained, in particular, under different assumptions concerning the distribution of the mantissa are presented on a common basis and in such a way that they are independent of the number of digits in the mantissa, thus facilitating comparisons.
Abstract: The probability density and related properties of the relative error due to rounding after floating-point arithmetic operations can be computed from the distributions of the mantissa and its absolute error. Various results obtained, in particular, under different assumptions concerning the distribution of the mantissa are presented on a common basis and in such a way that they are independent of the number of digits in the mantissa, thus facilitating comparisons.