Showing papers in "IEEE Transactions on Acoustics, Speech, and Signal Processing in 1988"

Open Access

Journal Article•DOI•

Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression

John Daugman¹•Institutions (1)

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A three-layered neural network based on interlaminar interactions involving two layers with fixed weights and one layer with adjustable weights finds coefficients for complete conjoint 2-D Gabor transforms without restrictive conditions for image analysis, segmentation, and compression.

Abstract: A three-layered neural network is described for transforming two-dimensional discrete signals into generalized nonorthogonal 2-D Gabor representations for image analysis, segmentation, and compression. These transforms are conjoint spatial/spectral representations, which provide a complete image description in terms of locally windowed 2-D spectral coordinates embedded within global 2-D spatial coordinates. In the present neural network approach, based on interlaminar interactions involving two layers with fixed weights and one layer with adjustable weights, the network finds coefficients for complete conjoint 2-D Gabor transforms without restrictive conditions. In wavelet expansions based on a biologically inspired log-polar ensemble of dilations, rotations, and translations of a single underlying 2-D Gabor wavelet template, image compression is illustrated with ratios up to 20:1. Also demonstrated is image segmentation based on the clustering of coefficients in the complete 2-D Gabor transform.

1,977 citations

Journal Article•DOI•

Maximum likelihood localization of multiple sources by alternating projection

I. Ziskind¹, Mati Wax¹•Institutions (1)

Rafael Advanced Defense Systems¹

01 Oct 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: An algorithm, referred to as APM, for computing the maximum-likelihood estimator of the locations of simple sources in passive sensor arrays is presented and the convergence of the algorithm to the global maximum is demonstrated for a variety of scenarios.

Abstract: An algorithm, referred to as APM, for computing the maximum-likelihood estimator of the locations of simple sources in passive sensor arrays is presented. The algorithm is equally applicable to the case of coherent signals and to the case of a single snapshot. The algorithm is iterative; the maximum of the likelihood function is computed by successive approximations. The convergence of the algorithm to the global maximum is demonstrated for a variety of scenarios. The key to this global convergence is the initialization scheme.

1,310 citations

Journal Article•DOI•

Efficient bit allocation for an arbitrary set of quantizers (speech coding)

Y. Shoham¹, Allen Gersho²•Institutions (2)

Bell Labs¹, University of California, Santa Barbara²

01 Sep 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: In this article, a bit allocation algorithm that is capable of efficiently allocating a given quota of bits to an arbitrary set of different quantizers is proposed, which produces an optimal or very nearly optimal allocation, while allowing the set of admissible bit allocation values to be constrained to nonnegative integers.

Abstract: A bit allocation algorithm that is capable of efficiently allocating a given quota of bits to an arbitrary set of different quantizers is proposed. This algorithm is useful in any coding scheme which uses bit allocation or, more generally, codebook allocation. It produces an optimal or very nearly optimal allocation, while allowing the set of admissible bit allocation values to be constrained to nonnegative integers. It is particularly useful in cases where the quantizer performance versus rate is irregular and changing in time, a situation that cannot be handled by conventional allocation algorithms.

822 citations
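
To make the allocation idea in the abstract above concrete, here is a minimal sketch of a Lagrangian-style integer bit allocation. It is not taken from the paper: the function name `allocate_bits`, the distortion tables, and the bisection on the multiplier are illustrative assumptions. Each quantizer is described only by a table of measured distortion versus bits, which may be irregular.

```python
def allocate_bits(dist_tables, budget, iters=60):
    """Pick one integer rate per quantizer with total bits <= budget.

    dist_tables[i][b] is a hypothetical measured distortion of
    quantizer i when it is given b bits (b = 0, 1, 2, ...).
    """
    def pick(lam):
        # For a fixed multiplier, each quantizer is minimized independently.
        return [min(range(len(d)), key=lambda b: d[b] + lam * b)
                for d in dist_tables]

    # Assuming non-negative distortions, a multiplier above the largest
    # zero-bit distortion makes every quantizer choose 0 bits (feasible).
    lo, hi = 0.0, max(d[0] for d in dist_tables) + 1.0
    best = pick(hi)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        alloc = pick(mid)
        if sum(alloc) <= budget:
            best, hi = alloc, mid   # feasible: try a smaller multiplier
        else:
            lo = mid                # over budget: penalize bits more
    return best


# Illustrative tables only; not data from the paper.
tables = [[16.0, 4.0, 1.0, 0.25], [4.0, 1.0, 0.25, 0.0625]]
allocation = allocate_bits(tables, budget=3)
```

A search over the multiplier alone only reaches allocations on the convex hull of the rate-distortion points, so it can leave part of the budget unused (with these tables, a budget of 4 still yields a 3-bit allocation). A practical scheme would need an extra step to spend the remaining bits.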

Journal Article•DOI•

Parameter estimation of superimposed signals using the EM algorithm

Meir Feder¹, E. Weinstein²•Institutions (2)

Massachusetts Institute of Technology¹, Woods Hole Oceanographic Institution²

01 Apr 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A computationally efficient algorithm for parameter estimation of superimposed signals based on the two-step iterative EM (estimate-and-maximize, with an E step and an M step) algorithm is developed.

Abstract: A computationally efficient algorithm for parameter estimation of superimposed signals based on the two-step iterative EM (estimate-and-maximize, with an E step and an M step) algorithm is developed. The idea is to decompose the observed data into their signal components and then to estimate the parameters of each signal component separately. The algorithm iterates back and forth, using the current parameter estimates to decompose the observed data better and thus increase the likelihood of the next parameter estimates. The application of the algorithm to the multipath time delay and multiple-source location estimation problems is considered.

814 citations

Journal Article•DOI•

Inverse filtering of room acoustics

Masato Miyoshi, Yutaka Kaneda

01 Feb 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: In this article, a novel method is proposed for realizing exact inverse filtering of acoustic impulse responses in a room, based on the principle called the multiple-input/output inverse theorem (MINT).

Abstract: A novel method is proposed for realizing exact inverse filtering of acoustic impulse responses in a room. This method is based on the principle called the multiple-input/output inverse theorem (MINT). The inverse is constructed from multiple finite-impulse response (FIR) filters (transversal filters) by adding some extra acoustic signal-transmission channels produced by multiple loudspeakers or microphones. The coefficients of these FIR filters can be computed by the well-known rules of matrix algebra. Inverse filtering in a sound field is investigated experimentally. It is shown that the proposed method is greatly superior to previous methods that use only one acoustic signal-transmission channel. The results prove the possibility of sound reproduction and sound reception without any distortion caused by reflected sounds.

734 citations

Journal Article•DOI•

Signal enhancement-a composite property mapping algorithm

J. Cadzow¹•Institutions (1)

Arizona State University¹

01 Jan 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A signal enhancement algorithm is developed that seeks to recover a signal from noise-contaminated distorted measurements made on that signal by utilizing a set of properties which the signal is known or is hypothesized as possessing.

Abstract: A signal enhancement algorithm is developed that seeks to recover a signal from noise-contaminated distorted measurements made on that signal. This object is achieved by utilizing a set of properties which the signal is known or is hypothesized as possessing. The measured signal is modified to the smallest degree necessary to sequentially possess each of the individual properties. Conditions for the algorithm's convergence are established in which the primary requirement is that a composite property mapping be closed. This is a relatively unrestricted condition in comparison to that required of most existing signal-enhancement algorithms.

703 citations

Journal Article•DOI•

Multiband excitation vocoder

Daniel W. Griffin¹, Jae Lim¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A speech model, referred to as the multiband excitation model, is presented where the band around each harmonic of the fundamental frequency is declared voiced or unvoiced and methods to synthesize speech from the model parameters are described.

Abstract: A speech model, referred to as the multiband excitation model, is presented. In this model the band around each harmonic of the fundamental frequency is declared voiced or unvoiced. Estimation methods for the parameters of the model are developed and methods to synthesize speech from the model parameters are described. To illustrate a potential application of the speech model, an 8 kb/s vocoder is developed and its performance is evaluated. Both informal listening and intelligibility tests show that the vocoder has very good performance both in speech quality and intelligibility, particularly for noisy speech.

586 citations

Journal Article•DOI•

Focussing matrices for coherent signal-subspace processing

H. Hung¹, Mostafa Kaveh¹•Institutions (1)

Iowa State University¹

01 Jan 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A class of focussing matrices proposed for use in the coherent signal-subspace method (CSM) leads to performance substantially better than those suggested in previous studies on direction-of-arrival estimation.

Abstract: A class of focussing matrices is proposed for use in the coherent signal-subspace method (CSM) (H. Wang and M. Kaveh, ibid., vol. ASSP-33, Aug. 1985). When the directions-of-arrival of wideband sources fall into more than one group of one beamwidth each, this class leads to performance substantially better than those suggested in previous studies on direction-of-arrival estimation. New insight into the structures of various focussing matrices and their effect on the performance of CSM is presented. The performance of CSM is compared for several classes of these matrices on the simulations as well as relative sufficiency analyses.

467 citations

Journal Article•DOI•

An analog electronic cochlea

Richard F. Lyon¹, Carver A. Mead¹•Institutions (1)

California Institute of Technology¹

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: An analog electronic cochlea has been built in CMOS VLSI technology using micropower techniques, and measurements on the test chip suggest that the circuit matches both the theory and observations from real cochleas.

Abstract: An analog electronic cochlea has been built in CMOS VLSI technology using micropower techniques. The key point of the model and circuit is that a cascade of simple, nearly linear, second-order filter stages with controllable Q parameters suffices to capture the physics of the fluid-dynamic traveling-wave system in the cochlea, including the effects of adaptation and active gain involving the outer hair cells. Measurements on the test chip suggest that the circuit matches both the theory and observations from real cochleas.

439 citations

Journal Article•DOI•

Image restoration using a neural network

Y.-T. Zhou¹, Rama Chellappa¹, A. Vaid¹, B.K. Jenkins¹•Institutions (1)

University of Southern California¹

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: An approach for restoration of gray level images degraded by a known shift invariant blur function and additive noise is presented using a neural computational network and a high-quality image is obtained using this approach.

Abstract: An approach for restoration of gray level images degraded by a known shift invariant blur function and additive noise is presented using a neural computational network. A neural network model is used to represent a possibly nonstationary image whose gray level function is the simple sum of the neuron state variables. The restoration procedure consists of two stages: estimation of the parameters of the neural network model and reconstruction of images. Owing to the model's fault-tolerant nature and computation capability, a high-quality image is obtained using this approach. A practical algorithm with reduced computational complexity is also presented. A procedure for learning the blur parameters from prototypes of original and degraded images is outlined.

409 citations

Journal Article•DOI•

Layered neural nets for pattern recognition

Bernard Widrow¹, Rodney Winter¹, Robert Baxter¹•Institutions (1)

Stanford University¹

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A pattern recognition concept involving first an 'invariance net' and second a 'trainable classifier' is proposed. The same basic approach is expected to be effective for speech recognition, where insensitivity to certain aspects of speech signals and at the same time sensitivity to other aspects of speech signals will be required.

Abstract: A pattern recognition concept involving first an 'invariance net' and second a 'trainable classifier' is proposed. The invariance net can be trained or designed to produce a set of outputs that are insensitive to translation, rotation, scale change, perspective change, etc., of the retinal input pattern. The outputs of the invariance net are scrambled, however. When these outputs are fed to a trainable classifier, the final outputs are descrambled and the original patterns are reproduced in standard position, orientation, scale, etc. It is expected that the same basic approach will be effective for speech recognition, where insensitivity to certain aspects of speech signals and at the same time sensitivity to other aspects of speech signals will be required. The entire recognition system is a layered network of ADALINE neurons. The ability to adapt a multilayered neural net is fundamental. An adaptation rule is proposed for layered nets which is an extension of the MADALINE rule of the 1960s. The new rule, MRII, is a useful alternative to the backpropagation algorithm.

Journal Article•DOI•

Lattice structures for optimal design and robust implementation of two-channel perfect-reconstruction QMF banks

P.P. Vaidyanathan¹, P.-Q. Hoang¹•Institutions (1)

California Institute of Technology¹

01 Jan 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A lattice structure and an algorithm are presented for the design of two-channel QMF (quadrature mirror filter) banks, satisfying a sufficient condition for perfect reconstruction.

Abstract: A lattice structure and an algorithm are presented for the design of two-channel QMF (quadrature mirror filter) banks, satisfying a sufficient condition for perfect reconstruction. The structure inherently has the perfect-reconstruction property, while the algorithm ensures a good stopband attenuation for each of the analysis filters. Implementations of such lattice structures are robust in the sense that the perfect-reconstruction property is preserved in spite of coefficient quantization. The lattice structure has the hierarchical property that a higher order perfect-reconstruction QMF bank can be obtained from a lower order perfect-reconstruction QMF bank, simply by adding more lattice sections. Several numerical examples are provided in the form of design tables.
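
The structural perfect-reconstruction claim in the abstract above can be illustrated with a small polyphase sketch. This is an assumption-laden toy and not the paper's design procedure: the section form [[1, a], [-a, 1]], the sign convention, the normalization, and all function names are chosen here for illustration, and the coefficients are arbitrary rather than optimized for stopband attenuation. The input is assumed to have even length.

```python
def rot(a, p, q):
    """Apply the lattice section [[1, a], [-a, 1]] sample by sample."""
    return ([pi + a * qi for pi, qi in zip(p, q)],
            [-a * pi + qi for pi, qi in zip(p, q)])


def rot_t(a, p, q):
    """Apply the transposed section [[1, -a], [a, 1]]."""
    return ([pi - a * qi for pi, qi in zip(p, q)],
            [a * pi + qi for pi, qi in zip(p, q)])


def delay(p):
    """Delay a branch by one (downsampled) sample, keeping its length."""
    return [0.0] + p[:-1]


def analysis(x, coeffs):
    """Split x into even/odd phases and run them through the lattice."""
    p, q = x[0::2], x[1::2]
    p, q = rot(coeffs[0], p, q)
    for a in coeffs[1:]:
        q = delay(q)
        p, q = rot(a, p, q)
    return p, q


def synthesis(p, q, coeffs):
    """Undo the lattice section by section, then interleave the phases."""
    for a in reversed(coeffs[1:]):
        p, q = rot_t(a, p, q)
        p = delay(p)
    p, q = rot_t(coeffs[0], p, q)
    gain = 1.0
    for a in coeffs:
        gain *= 1.0 + a * a   # each section contributes a factor (1 + a^2)
    return [v / gain for pair in zip(p, q) for v in pair]


x = [1.0, -2.0, 3.0, 0.5, 4.0, -1.0, 2.0, 7.0]
coeffs = [0.5, -0.25]                 # arbitrary, coarsely "quantized" values
y = synthesis(*analysis(x, coeffs), coeffs)
```

Because each section times its transpose is a scaled identity, the output is the input delayed by `2 * (len(coeffs) - 1)` samples for any coefficient values, which is why rounding the coefficients does not break reconstruction in this sketch. Appending another value to `coeffs` adds one more section to both banks.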

Journal Article•DOI•

An improved spatial smoothing technique for bearing estimation in a multipath environment

R.T. Williams¹, Surendra Prasad¹, A.K. Mahalanabis¹, L.H. Sibul¹•Institutions (1)

Pennsylvania State University¹

01 Apr 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: It is shown that under certain conditions, the modified algorithm may fail to yield the desired increase in array aperture, and some simulation results concerning the sensitivity of the modified spatial smoothing algorithm to these conditions are provided.

Abstract: It is well known that signal subspace algorithms perform poorly when coherent or highly correlated signals are present. Recently, the so-called spatial smoothing technique was devised to preprocess the array covariance matrix so that signal subspace algorithms can be applied irrespective of the signal correlation. Unfortunately, the application of this technique reduces the effective aperture of the array. A modified spatial smoothing technique that is capable of increasing the effective aperture of the array over that of conventional spatial smoothing methods is explored. It is shown that under certain conditions, the modified algorithm may fail to yield the desired increase in array aperture, and some simulation results concerning the sensitivity of the modified spatial smoothing algorithm to these conditions are provided.

Journal Article•DOI•

Regularized iterative image restoration with ringing reduction

Reginald L. Lagendijk¹, Jan Biemond¹, D.E. Boekee¹•Institutions (1)

Delft University of Technology¹

01 Dec 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A regularized iterative image restoration algorithm is proposed in which both ringing reduction methods are included by making use of the theory of the projections onto convex sets and the concept of norms in a weighted Hilbert space.

Abstract: Linear space-invariant image restoration algorithms often introduce ringing effects near sharp intensity transitions. It is shown that these artifacts are attributable to the regularization of the ill-posed image restoration problem. Two possible methods to reduce the ringing effects in restored images are proposed. The first method incorporates deterministic a priori knowledge about the original image into the restoration algorithm. The second method locally regulates the severity of the noise magnification and the ringing phenomenon, depending on the edge information in the image. A regularized iterative image restoration algorithm is proposed in which both ringing reduction methods are included by making use of the theory of the projections onto convex sets and the concept of norms in a weighted Hilbert space. Both the numerical performance and the visual evaluation of the results are improved by the use of ringing reduction.

Journal Article•DOI•

Learned classification of sonar targets using a massively parallel network

R.P. Gorman¹, Terrence J. Sejnowski²•Institutions (2)

AlliedSignal¹, Johns Hopkins University²

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: The performance of a three-layered network was better than trained human listeners and the network generalized better than a nearest-neighbor classifier.

Abstract: Massively parallel learning networks are applied to the classification of sonar returns from two undersea targets and the ability of networks to correctly classify both training and testing examples is studied. Networks with an intermediate layer of hidden processing units achieved a classification accuracy as high as 100% on a training set of 104 returns. These networks correctly classified a test set of 104 returns not contained in the training set with an accuracy of up to 90.4%. Networks without an intermediate layer of processing units achieved only 73.1% correct on the same test set. Performance improved and the variability due to the initial conditions for training decreased with the number of hidden units. The effect of training set design on test set performance was also examined. The performance of a three-layered network was better than trained human listeners and the network generalized better than a nearest-neighbor classifier.

Journal Article•DOI•

Experiments on neural net recognition of spoken and written text

D.J. Burr

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Results indicate that neural networks and nearest-neighbor classifiers perform at near the same level of accuracy and a novel handwriting encoder is described.

Abstract: The problems of recognizing handprinted and spoken digits and the handprinted and spoken English alphabet are discussed. Four such experiments were conducted and the results were compared to a conventional nearest-neighbor classifier trained on the same data. Results indicate that neural networks and nearest-neighbor classifiers perform at near the same level of accuracy. For each task, a critical number of neurons can be determined experimentally which yields highest recognition accuracy with least hardware. This number can also measure the classification efficiency of the input feature encoder. Several techniques for optimizing the performance of layered networks are discussed. A constant level added to the input signal biases patterns into the range where the learning rate is highest. Eliminating near-zero weights after learning results in little loss of accuracy. Finally, a novel handwriting encoder is described.

Journal Article•DOI•

On the use of instantaneous and transitional spectral information in speaker recognition

F.K. Soong¹, Aaron E. Rosenberg¹•Institutions (1)

Bell Labs¹

01 Jun 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: The experimental results show that the instantaneous and transitional representations are relatively uncorrelated, thus providing complementary information for speaker recognition, and simple transmission channel variations are shown to affect both the instantaneous spectral representations and the corresponding recognition performance significantly.

Abstract: The use of instantaneous and transitional spectral representations of spoken utterances for speaker recognition is investigated. Linear-predictive-coding (LPC)-derived cepstral coefficients are used to represent instantaneous spectral information, and best linear fits of each cepstral coefficient over a specified time window are used to represent transitional information. An evaluation has been carried out using a database of isolated digit utterances over dialed-up telephone lines by 10 talkers. Two vector quantization (VQ) codebooks, instantaneous and transitional, were constructed from each speaker's training utterances. The experimental results show that the instantaneous and transitional representations are relatively uncorrelated, thus providing complementary information for speaker recognition. A rectangular window of approximately 100 ms duration provides an effective estimate of the transitional spectral features for speaker recognition. Also, simple transmission channel variations are shown to affect both the instantaneous spectral representations and the corresponding recognition performance significantly, while the transitional representations and performance are relatively resistant.

Journal Article•DOI•

A new statistical approach for the automatic segmentation of continuous speech signals

Régine André-Obrecht

01 Jan 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A statistical approach for the segmentation of a continuous speech signal to detect acoustic events is presented and a comparison between the experimental results of automatic and handmade segmentations demonstrates the potential acoustic-phonetic classification capability of the proposed algorithms.

Abstract: A statistical approach for the segmentation of a continuous speech signal to detect acoustic events is presented. Experiments are carried out to test the segmentation algorithms. Reasonable results are obtained with speech signals, although these are not exactly piecewise stationary. A comparison between the experimental results of automatic and handmade segmentations demonstrates the potential acoustic-phonetic classification capability of the proposed algorithms.

Journal Article•DOI•

Adaptive eigensubspace algorithms for direction or frequency estimation and tracking

Jar-Ferr Yang¹, Mostafa Kaveh¹•Institutions (1)

University of Minnesota¹

01 Jan 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: The authors present an adaptive estimator of the complete noise or signal subspace of a sample covariance matrix as well as the estimator's practical implementations and simulation results show that the adaptive subspace algorithms perform substantially better than P.A. Thompson's (1980) adaptive version of V.F. Pisarenko's technique in estimating frequencies or directions of arrival (DOA) of plane waves.

Abstract: The authors present an adaptive estimator of the complete noise or signal subspace of a sample covariance matrix as well as the estimator's practical implementations. The general formulation of the proposed estimator results from an asymptotic argument, which shows the signal or noise subspace computation to be equivalent to a constrained gradient search procedure. A highly parallel algorithm, denoted the inflation method, is introduced for the estimation of the noise subspace. The simulation results of these adaptive estimators show that the adaptive subspace algorithms perform substantially better than P.A. Thompson's (1980) adaptive version of V.F. Pisarenko's technique (1973) in estimating frequencies or directions of arrival (DOA) of plane waves. For tracking nonstationary parameters, the simulation results also show that the adaptive subspace algorithms are better than direct eigendecomposition methods for which computational complexity is much higher than the adaptive versions.

Journal Article•DOI•

FIR-median hybrid filters with predictive FIR substructures

P.J. Heinonen¹, Yrjö Neuvo²•Institutions (2)

Nokia¹, Tampere University of Technology²

01 Jun 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A class of finite-impulse response (FIR) median hybrid (FMH) filters that contain linear FIR substructures to estimate the current signal value using forward and backward prediction is introduced, and predictors maximizing the signal-to-noise ratio on signal sections described by an lth-order polynomial are derived.

Abstract: A class of finite-impulse response (FIR) median hybrid (FMH) filters that contain linear FIR substructures to estimate the current signal value using forward and backward prediction is introduced. The output of the overall filter is the median of the predicted values and the actual signal value in the middle of the filter window. Predictors maximizing the signal-to-noise ratio on signal sections described by an lth-order polynomial are derived. The ramp enhancement filters are shown to attenuate the noise on a ramp signal better than the standard median (SM) filters. The new predictive FMH filters are shown to have root signals which do not exist for the SM filters, e.g. triangular waves. By combining the level and the ramp enhancement FMH filters, a filter is obtained which attenuates noise on constant and ramp signals. The noise attenuation on ramp signals is better than with the SM filter, and the predictive FMH filter has novel and meaningful root structures. The number of arithmetic operations needed to implement the predictive FMH filter grows linearly with the length of the filter.
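
The "median of the predicted values and the actual signal value" rule in the abstract above can be sketched in a few lines. The predictors below are an assumption made for brevity: a two-point linear extrapolation from each side, which is exact on ramps, rather than the SNR-maximizing predictors derived in the paper. The function name `fmh_ramp_filter` is hypothetical.

```python
from statistics import median


def fmh_ramp_filter(x):
    """Predictive FIR-median hybrid filter with two-point linear predictors.

    The two samples at each end are passed through unfiltered.
    """
    y = list(x)
    for n in range(2, len(x) - 2):
        forward = 2 * x[n - 1] - x[n - 2]    # extrapolate from past samples
        backward = 2 * x[n + 1] - x[n + 2]   # extrapolate from future samples
        y[n] = median([forward, x[n], backward])
    return y


noisy_ramp = [0, 1, 2, 9, 4, 5, 6]   # ramp with one impulsive outlier
cleaned = fmh_ramp_filter(noisy_ramp)
triangle = [0, 1, 2, 1, 0]
```

In this toy setting the outlier on the ramp is replaced by the value both predictors agree on, and the triangular wave passes through unchanged, which matches the abstract's remark that triangular waves are root signals of the predictive filters.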

Journal Article•DOI•

A digital method of modeling quadratically nonlinear systems with a general random input

K.I. Kim, E.J. Powers

01 Nov 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Without assuming particular statistics of the input, a practical digital method of estimating linear and quadratic transfer functions of a nonlinear time-invariant system that can be described by Volterra series of up to second order is presented.

Abstract: Without assuming particular statistics of the input, a practical digital method of estimating linear and quadratic transfer functions of a nonlinear time-invariant system that can be described by Volterra series of up to second order is presented. The method is tested and validated by analyzing input-output data of a known quadratically nonlinear system. It is used when there is little knowledge about the input statistics or the input is non-Gaussian. It is also noted that the ordinary coherence functions cannot be used in explaining the input-output power transfer relationship of a quadratic system excited by a non-Gaussian input signal. With respect to the practical application of the method, the relationship between the mean square errors involved in the transfer function estimates and the number of averages taken from the spectral estimation is qualitatively discussed.

Journal Article•DOI•

Analysis of the asymptotic relative efficiency of the MUSIC algorithm

Boaz Porat¹, Benjamin Friedlander•Institutions (1)

Technion – Israel Institute of Technology¹

01 Apr 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: An analytical performance evaluation of the errors of the direction-of-arrival estimates obtained by the MUSIC algorithm for uncorrelated sources confirms empirical evidence of the excellent performance of the MUSIC algorithm for narrowband signals.

Abstract: An analytical performance evaluation of the errors of the direction-of-arrival estimates obtained by the MUSIC algorithm for uncorrelated sources is provided. Explicit asymptotic formulas are derived for the means and the covariance of the estimates. The covariances are then compared to the Cramer-Rao lower bound. It is shown that for a single source, the MUSIC algorithm is asymptotically efficient. For multiple sources, the algorithm is not efficient in general. However, it approaches asymptotic efficiency when the SNRs (signal-to-noise ratios) of all sources tend to infinity. It is illustrated by several test cases that the relative efficiency of the MUSIC algorithm is nearly one under a wide range of parameter variations. The analytic performance evaluation thus confirms empirical evidence of the excellent performance of the MUSIC algorithm for narrowband signals.

Journal Article•DOI•

Stack filters and the mean absolute error criterion

E.J. Coyle¹, J.-H. Lin¹•Institutions (1)

Purdue University¹

01 Aug 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: It is shown that optimal stack filtering under the mean-absolute-error criterion is analogous to optimal linear filtering under the mean-squared-error criterion: both linear filters and stack filters are defined by superposition properties, both classes are implementable, and both have tractable procedures for finding the optimal filter under an appropriate error criterion.

Abstract: A method to determine the stack filter which minimizes the mean absolute error between its output and a desired signal, given noisy observations of this desired signal, is presented. Specifically, an optimal window-width-b stack filter can be determined with a linear program with $O(b2^b)$ variables. This algorithm is efficient since the number of different inputs to a window-width-b filter is $M^b$ if the filter has M-valued input and the number of stack filters grows faster than 2 raised to the $2^{b/2}$ power. It is shown that optimal stack filtering under the mean-absolute-error criterion is analogous to optimal linear filtering under the mean-squared-error criterion: both linear filters and stack filters are defined by superposition properties, both classes are implementable, and both have tractable procedures for finding the optimal filter under an appropriate error criterion.
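
The superposition property mentioned in the abstract above refers to threshold decomposition: an M-valued signal is split into binary signals, one per level, the same positive Boolean function is applied at every level, and the binary outputs are summed. The sketch below shows only this filtering structure, not the paper's linear-programming design of the optimal filter. The names `stack_filter` and `majority` are hypothetical, and majority voting is used because it reproduces the familiar median filter.

```python
def majority(bits):
    """Positive Boolean function that yields the median filter."""
    return 1 if sum(bits) > len(bits) // 2 else 0


def stack_filter(x, width, pbf, levels):
    """Filter an integer signal with values in 0..levels-1.

    `width` is an odd window width; only fully covered positions are output.
    """
    half = width // 2
    out = []
    for n in range(half, len(x) - half):
        window = x[n - half:n + half + 1]
        total = 0
        for t in range(1, levels):
            # Threshold decomposition: binary window for level t.
            bits = [1 if v >= t else 0 for v in window]
            total += pbf(bits)      # same Boolean function at every level
        out.append(total)           # summing the levels restores the value
    return out


signal = [0, 3, 1, 2, 2, 0, 3]       # 4-valued example signal
filtered = stack_filter(signal, width=3, pbf=majority, levels=4)
```

Swapping `majority` for another positive Boolean function gives a different member of the stack filter class while keeping the same decomposition and summation steps.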

Journal Article•DOI•

Note on the use of the Wigner distribution for time-frequency signal analysis

Boualem Boashash¹•Institutions (1)

University of Queensland¹

01 Sep 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: It is shown that a correct use of the Wigner distribution for time-frequency signal analysis requires use of the analytic signal, and this version, often referred to as the Wigner-Ville distribution (WVD), is straightforward to compute, does not exhibit any aliasing problem, and introduces no frequency artifacts.

Abstract: It is shown that a correct use of the Wigner distribution (WD) for time-frequency signal analysis requires use of the analytic signal. This version, often referred to as the Wigner-Ville distribution (WVD), is straightforward to compute, does not exhibit any aliasing problem, and introduces no frequency artifacts. The problems introduced by the use of the Wigner distribution with a real signal are clarified.
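
The analytic signal this abstract refers to can be formed from a real sequence by zeroing its negative-frequency DFT bins and doubling the positive ones. The Python sketch below shows only that step, using a direct O(N^2) DFT from the standard library so that it stays self-contained. It does not compute the Wigner-Ville distribution itself, and the function names are assumptions made for this illustration.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(N^2), fine for short examples)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def analytic_signal(x):
    """Keep DC (and the Nyquist bin for even N), double the positive-frequency
    bins, and zero the negative-frequency bins."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        positive_bins = range(1, n // 2)
    else:
        positive_bins = range(1, (n + 1) // 2)
    for k in positive_bins:
        gain[k] = 2.0
    return idft([g * v for g, v in zip(gain, dft(x))])


# A real cosine at bin 3 of a 16-point frame becomes a single complex exponential.
N, K0 = 16, 3
real_tone = [math.cos(2 * math.pi * K0 * t / N) for t in range(N)]
z = analytic_signal(real_tone)
assert all(abs(z[t] - cmath.exp(2j * math.pi * K0 * t / N)) < 1e-9 for t in range(N))
```

The real part of the result equals the input, and the spectrum has no negative-frequency content, which is the property the abstract relies on.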

Journal Article•DOI•

LPC speech coding based on variable-length segment quantization

Y. Shiraki, M. Honda

01 Sep 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A low-bit-rate linear predictive coder (LPC) based on variable-length segment quantization is presented, and its performance for voice coding is compared to that of fixed-length segment quantization and vector quantization.

Abstract: A low-bit-rate linear predictive coder (LPC) that is based on variable-length segment quantization is presented. In this vocoder, the speech spectral-parameter sequence is represented as the concatenation of variable-length spectral segments generated by linearly time-warping fixed-length code segments. Both the sequence of code segments and the segment lengths are efficiently determined using a dynamic programming procedure. This procedure minimizes the spectral distance measured between the original and the coded spectral sequence in a given interval. An iterative algorithm is developed for designing fixed-length code segments for the training spectral sequence. It updates the segment boundaries of the training spectral sequence using an a priori codebook and updates the codebook using these segment sequences. The convergence of this algorithm is discussed theoretically and experimentally. In experiments, the performance of variable-length segment quantization for voice coding is compared to that of fixed-length segment quantization and vector quantization.

Journal Article•DOI•

Statistical analysis of effective singular values in matrix rank determination

K. Konstantinides¹, K. Yao²•Institutions (2)

Hewlett-Packard¹, University of California, Los Angeles²

01 May 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A major problem in using SVD (singular-value decomposition) as a tool in determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values.

Abstract: A major problem in using SVD (singular-value decomposition) as a tool in determining the effective rank of a perturbed matrix is that of distinguishing between significantly small and significantly large singular values. To this end, confidence regions are derived for the perturbed singular values of matrices with noisy observation data. The analysis is based on the theories of perturbations of singular values and statistical significance tests. Threshold bounds for perturbation due to finite-precision and i.i.d. random models are evaluated. In random models, the threshold bounds depend on the dimension of the matrix, the noise variance, and a predefined statistical level of significance. Results applied to the problem of determining the effective order of a linear autoregressive system from the approximate rank of a sample autocorrelation matrix are considered. Various numerical examples illustrating the usefulness of these bounds and comparisons to other previously known approaches are given.

Journal Article•DOI•

Hidden Markov model for Mandarin lexical tone recognition

Wu-Ji Yang¹, J.-C. Lee, Yueh-chin Chang, H.-C. Wang•Institutions (1)

National Tsing Hua University¹

01 Jul 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Lexical tone recognition for Mandarin speech is discussed using a combination of vector quantization and hidden Markov modeling, with vector quantization converting the observation sequence into a symbol sequence for the hidden Markov models.

Abstract: A case of lexical tone recognition for Mandarin speech is discussed using a combination of vector quantization and hidden Markov modeling techniques. The observation sequence was a sequence of vectorized parameters consisting of a logarithmic pitch interval and its first derivative. The vector quantization was applied to convert the observation sequence into a symbol sequence for hidden Markov modeling. The speech database was provided by seven male and seven female college students, with each pronouncing 72 isolated monosyllabic utterances. A probabilistic model for each of the four tones was generated. A series of tonal recognition tests was then conducted to evaluate the effects of pitch reference base, codebook size, and tonal model topology. Future consideration of Mandarin speech recognition is also discussed.
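
The vector quantization step described in this abstract, which turns each two-component observation (logarithmic pitch interval and its first derivative) into a discrete symbol, can be sketched in Python as a nearest-codeword search. The codebook values and the squared Euclidean distance below are assumptions for illustration only; the abstract does not give the paper's codebook or distance measure.

```python
def quantize(observations, codebook):
    """Map each observation vector to the index of its nearest codeword
    under squared Euclidean distance (an assumed distance measure)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [min(range(len(codebook)), key=lambda i: sq_dist(obs, codebook[i]))
            for obs in observations]


# Hypothetical 3-entry codebook of (log pitch interval, first derivative):
# index 0 ~ falling pitch, 1 ~ level pitch, 2 ~ rising pitch.
codebook = [(0.0, -1.0), (0.0, 0.0), (0.0, 1.0)]

# Hypothetical observation sequence for one utterance.
observations = [(0.1, 0.9), (0.0, 0.1), (-0.2, -0.8)]

symbols = quantize(observations, codebook)
assert symbols == [2, 1, 0]
```

The resulting symbol sequence is the kind of discrete input a discrete-observation hidden Markov model would consume, one model per tone.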

Journal Article•DOI•

The complex cepstrum of higher order cumulants and nonminimum phase system identification

R. Pan¹, Chrysostomos L. Nikias¹•Institutions (1)

Northeastern University¹

01 Feb 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A computationally efficient identification procedure is proposed for a non-Gaussian white-noise-driven linear, time-invariant, nonminimum phase system and is flexible enough to be applied to autoregressive (AR), moving average (MA), or ARMA systems without a priori knowledge of the type of the system.

Abstract: A computationally efficient identification procedure is proposed for a non-Gaussian white-noise-driven linear, time-invariant, nonminimum phase system. The method is based on the idea of computing the complex cepstrum of higher order cumulants of the system output. In particular, the differential cepstrum parameters of the nonminimum phase impulse response are estimated directly from higher-order cumulants by least-squares solution or two-dimensional FFT operations. The method reconstructs the minimum-phase and maximum-phase impulse response components separately. It is flexible enough to be applied to autoregressive (AR), moving average (MA), or ARMA systems without a priori knowledge of the type of the system. Benchmark simulation examples demonstrate the effectiveness of the method even with short data records.

Journal Article•DOI•

On detecting edges in speckle imagery

Alan C. Bovik

01 Jan 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Methods for detecting sustained intensity changes in images corrupted by speckle are analyzed, and a ratio-of-averages edge detector is used in conjunction with the Laplacian-of-a-Gaussian (LoG) edge detector; the combined scheme is found to be much more effective than either of the individual edge detectors.

Abstract: Methods for detecting sustained intensity changes in images corrupted by speckle are analyzed. The problem is complicated by the nature of the speckle, which is characterized by a high degree of correlation and (approximately multiplicative) signal dependence. These characteristics make the automated extraction of edges in speckle very difficult for applications requiring the location and identification of objects (e.g., synthetic aperture radar). In particular, the Laplacian-of-a-Gaussian (LoG) edge detector for this problem is analyzed. While the LoG is found to be effective for detecting meaningful edges, use of the LoG also gives rise to numerous extraneous edges having no physical correlate. To alleviate these effects, a ratio-of-averages edge detector is used in conjunction with the LoG. The combined scheme is found to be much more effective than either of the individual edge detectors. Actual SEASAT synthetic aperture radar (SAR) images are used to demonstrate the effectiveness of each technique.
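
The idea behind a ratio-of-averages edge measure can be sketched in one dimension with Python. This is a simplified illustration under stated assumptions (strictly positive intensities, equal-width adjacent windows, the larger of the two ratios as the score), not the detector as defined in the paper, and it omits the combination with the LoG. The function name and the test signal are hypothetical.

```python
def ratio_of_averages(signal, half_width):
    """Score each boundary between samples i-1 and i by the ratio of the
    mean of the half_width samples on either side, taking whichever of
    left/right or right/left is larger. Assumes all samples are > 0."""
    scores = []
    for i in range(half_width, len(signal) - half_width + 1):
        left_mean = sum(signal[i - half_width:i]) / half_width
        right_mean = sum(signal[i:i + half_width]) / half_width
        scores.append(max(left_mean / right_mean, right_mean / left_mean))
    return scores


# Hypothetical noise-free step from intensity 2 to intensity 8.
step = [2, 2, 2, 2, 8, 8, 8, 8]
scores = ratio_of_averages(step, half_width=2)

# The score peaks at the true boundary (between samples 3 and 4).
assert scores.index(max(scores)) + 2 == 4

# Scaling the whole signal leaves the scores essentially unchanged.
scaled = ratio_of_averages([10 * v for v in step], half_width=2)
assert all(abs(a - b) < 1e-12 for a, b in zip(scores, scaled))
```

Because the score is a ratio, it does not depend on the local mean intensity, which is why a ratio is a natural fit for approximately multiplicative noise such as speckle.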

Journal Article•DOI•

Time delay estimation in unknown Gaussian spatially correlated noise

Chrysostomos L. Nikias¹, R. Pan¹•Institutions (1)

Northeastern University¹

01 Nov 1988-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: It is demonstrated that estimation techniques based on higher-order cumulants suppress the effect of correlated Gaussian noise sources and therefore exhibit improved performance over generalized cross-correlation methods.

Abstract: A novel class of methods that estimate the difference in arrival time between signals corrupted by spatially correlated Gaussian noise sources of unknown cross correlation is presented. The methods are based on the idea of comparing the similarities between the two sensor measurements in higher-order spectrum domains (bispectrum) rather than in the cross-correlation domain. It is demonstrated that estimation techniques based on higher-order cumulants suppress the effect of correlated Gaussian noise sources and therefore exhibit improved performance over generalized cross-correlation methods. Results are reported for different types of signals, lengths of data records, and signal-to-noise ratios.
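
For context, the cross-correlation baseline that this abstract compares against can be sketched in a few lines of Python. This is plain, unweighted cross-correlation, not the bispectrum-based method the paper proposes and not a generalized (weighted) cross-correlator. The function name and the sample signals are hypothetical.

```python
def estimate_delay(x, y, max_lag):
    """Return the integer lag d in [-max_lag, max_lag] that maximizes
    sum_n x[n] * y[n + d], i.e. how many samples y lags behind x."""
    def xcorr(lag):
        return sum(x[n] * y[n + lag]
                   for n in range(len(x)) if 0 <= n + lag < len(y))

    return max(range(-max_lag, max_lag + 1), key=xcorr)


# Hypothetical noise-free example: sensor 2 sees the same pulse 3 samples later.
sensor_1 = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
sensor_2 = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]

assert estimate_delay(sensor_1, sensor_2, max_lag=5) == 3
```

When the noise at the two sensors is itself correlated, that noise adds its own peak to this cross-correlation, which is the failure mode the higher-order-cumulant methods in the abstract are meant to avoid.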
