Showing papers in "IEEE Signal Processing Letters in 1999"


Journal ArticleDOI
TL;DR: A robust voice activity detector (VAD) with an effective hang-over scheme, which considers the previous observations through first-order Markov process modeling of speech occurrences, is proposed; it performs significantly better than the G.729B VAD in low signal-to-noise ratio (SNR) and vehicular noise environments.
Abstract: In this letter, we develop a robust voice activity detector (VAD) for the application to variable-rate speech coding. The developed VAD employs the decision-directed parameter estimation method for the likelihood ratio test. In addition, we propose an effective hang-over scheme which considers the previous observations by a first-order Markov process modeling of speech occurrences. According to our simulation results, the proposed VAD shows significantly better performances than the G.729B VAD in low signal-to-noise ratio (SNR) and vehicular noise environments.

1,341 citations
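
The likelihood ratio test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes complex Gaussian models for the noise-only and speech-plus-noise DFT coefficients, which the abstract does not state, and it takes the a priori SNR `xi` as given rather than implementing the decision-directed estimator or the hang-over scheme. The function names and the `threshold` parameter are hypothetical.

```python
import math

def mean_log_likelihood_ratio(gamma, xi):
    """Average per-bin log likelihood ratio for one frame.

    gamma: a posteriori SNR per frequency bin (|X_k|^2 / noise variance).
    xi:    a priori SNR per bin (speech variance / noise variance).
    Under the assumed Gaussian models, each bin contributes
    gamma*xi/(1+xi) - log(1+xi).
    """
    terms = [g * x / (1.0 + x) - math.log1p(x) for g, x in zip(gamma, xi)]
    return sum(terms) / len(terms)

def is_speech(gamma, xi, threshold=0.0):
    # threshold is a hypothetical tuning parameter, not a value from the letter.
    return mean_log_likelihood_ratio(gamma, xi) > threshold
```

A frame whose observed power is well above the noise floor (large `gamma`) pushes the statistic up, while a frame at the noise level pulls it below zero when speech was expected.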


Journal ArticleDOI
TL;DR: A simple spatially adaptive statistical model for wavelet image coefficients is introduced and applied to image denoising. The model is inspired by a recent wavelet image compression algorithm, the estimation-quantization (EQ) coder.
Abstract: We introduce a simple spatially adaptive statistical model for wavelet image coefficients and apply it to image denoising. Our model is inspired by a recent wavelet image compression algorithm, the estimation-quantization (EQ) coder. We model wavelet image coefficients as zero-mean Gaussian random variables with high local correlation. We assume a marginal prior distribution on wavelet coefficients variances and estimate them using an approximate maximum a posteriori probability rule. Then we apply an approximate minimum mean squared error estimation procedure to restore the noisy wavelet image coefficients. Despite the simplicity of our method, both in its concept and implementation, our denoising results are among the best reported in the literature.

847 citations
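
The two-step idea in the abstract above (estimate a local variance for each coefficient, then apply a minimum mean squared error estimate) can be sketched as follows. This is a simplified one-dimensional illustration, not the method of the letter: it uses a windowed maximum-likelihood style variance estimate instead of the approximate MAP rule, and the names `local_wiener_shrink`, `noise_var` and `half_window` are assumptions.

```python
def local_wiener_shrink(coeffs, noise_var, half_window=2):
    """Shrink noisy zero-mean coefficients using a locally estimated variance.

    noise_var must be positive. Each coefficient is scaled by
    signal_var / (signal_var + noise_var), the linear MMSE gain for a
    zero-mean Gaussian signal in additive Gaussian noise.
    """
    n = len(coeffs)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        window = coeffs[lo:hi]
        # Local signal variance: mean squared value minus noise power, clipped at zero.
        signal_var = max(0.0, sum(c * c for c in window) / len(window) - noise_var)
        out.append(signal_var / (signal_var + noise_var) * coeffs[i])
    return out

noisy = [0.1, -0.2, 10.0, 0.1, 0.0]
denoised = local_wiener_shrink(noisy, noise_var=1.0, half_window=1)
```

Coefficients in low-energy neighborhoods are driven to zero, while a large coefficient is only slightly attenuated.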


Journal ArticleDOI
TL;DR: It is demonstrated that three speech signals can be separated with good fidelity given only two mixtures of the three signals.
Abstract: Empirical results were obtained for the blind source separation of more sources than mixtures using a previously proposed framework for learning overcomplete representations. This technique assumes a linear mixing model with additive noise and involves two steps: (1) learning an overcomplete representation for the observed data and (2) inferring sources given a sparse prior on the coefficients. We demonstrate that three speech signals can be separated with good fidelity given only two mixtures of the three signals. Similar results were obtained with mixtures of two speech signals and one music signal.

455 citations


Journal ArticleDOI
TL;DR: A novel approach for the problem of estimating the data model of independent component analysis (or blind source separation) in the presence of Gaussian noise is introduced and a modification of the fixed-point (FastICA) algorithm is introduced.
Abstract: A novel approach for the problem of estimating the data model of independent component analysis (or blind source separation) in the presence of Gaussian noise is introduced. We define the Gaussian moments of a random variable as the expectations of the Gaussian function (and some related functions) with different scale parameters, and show how the Gaussian moments of a random variable can be estimated from noisy observations. This enables us to use Gaussian moments as one-unit contrast functions that have no asymptotic bias even in the presence of noise, and that are robust against outliers. To implement the maximization of the contrast functions based on Gaussian moments, a modification of the fixed-point (FastICA) algorithm is introduced.

226 citations


Journal ArticleDOI
TL;DR: Once the signal and noise subspaces are estimated, any subspace based approach, including the multiple signal classification (MUSIC) algorithm, can be applied for direction of arrival (DOA) estimation.
Abstract: A new method for the estimation of the signal subspace and noise subspace based on time-frequency signal representations is introduced. The proposed approach consists of the joint block-diagonalization (JBD) of a set of spatial time-frequency distribution matrices. Once the signal and noise subspaces are estimated, any subspace based approach, including the multiple signal classification (MUSIC) algorithm, can be applied for direction of arrival (DOA) estimation. Performance of the proposed time-frequency MUSIC (TF-MUSIC) for an impinging chirp signal using three different kernels is numerically evaluated.

182 citations


Journal ArticleDOI
TL;DR: A novel design criterion for data-dependent narrowband filters that are of interest in temporal or spatial spectral analysis applications is introduced and the solution to the design problem considered is shown to coincide with the previously introduced amplitude and phase estimation (APES) filter.
Abstract: We introduce a novel design criterion for data-dependent narrowband filters that are of interest in temporal or spatial spectral analysis applications. The solution to the design problem considered is shown to coincide with the previously introduced amplitude and phase estimation (APES) filter. The new derivation of APES in this article sheds more light on the properties of APES and provides some intuitive explanation of the performance superiority of the APES filter over the Capon filter.

170 citations


Journal ArticleDOI
TL;DR: Generalizations of the Perona-Malik (1990) equation are introduced and an edge enhancing functional is proposed for direct edge enhancement and a number of super diffusion operators are introduced for fast and effective smoothing.
Abstract: This article introduces generalizations of the Perona-Malik (1990) equation. An edge enhancing functional is proposed for direct edge enhancement. A number of super diffusion operators is introduced for fast and effective smoothing. Statistical information is utilized for robust edge-stopping. Numerical integration is conducted by using a previously developed quasi-interpolating wavelet method. Computer experiments indicate that the present algorithm is very efficient for edge-detecting and noise-removing.

160 citations


Journal ArticleDOI
TL;DR: It is shown that the data-supported grid search of the LF provides a performance similar to that achieved by a genetic algorithm, but at a significantly lower computational cost.
Abstract: After reviewing the main existing methods for determining the maximum-likelihood (ML) estimates of the direction-of-arrival (DOA) parameters in array signal processing applications, we introduce a new conceptually simple and computationally effective approach that consists of maximizing the likelihood function (LF) over a set of points derived from the data. We show that the data-supported grid search of the LF provides a performance similar to that achieved by a genetic algorithm, but at a significantly lower computational cost. We use an ESPRIT-like algorithm to obtain the grid points with support in the data, although our approach is not limited to this choice.

148 citations


Journal ArticleDOI
TL;DR: In this article, a new set of speech feature parameters based on multirate signal processing and the Teager energy operator is introduced, which have robust speech recognition performance in the presence of car engine noise.
Abstract: In this letter, a new set of speech feature parameters based on multirate signal processing and the Teager energy operator is introduced. The speech signal is first divided into nonuniform subbands in mel-scale using a multirate filterbank, then the Teager energies of the subsignals are estimated. Finally, the feature vector is constructed by log-compression and inverse discrete cosine transform (DCT) computation. The new feature parameters have robust speech recognition performance in the presence of car engine noise.

129 citations
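
The Teager energy operator used in the abstract above has a simple discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The sketch below shows only that operator; the mel-scale multirate filterbank, log-compression and inverse DCT stages are not implemented, and the function name is hypothetical.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(omega*n + phi) the operator is constant: A^2 * sin(omega)^2.
A, omega, phi = 2.0, 0.3, 0.7
tone = [A * math.cos(omega * n + phi) for n in range(64)]
psi = teager_energy(tone)
expected = (A * math.sin(omega)) ** 2
```

Because the output depends on both amplitude and frequency, the operator tracks the energy of the oscillation rather than only the squared amplitude.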


Journal ArticleDOI
TL;DR: A novel method for estimating the shape factor of a generalized Gaussian probability density function that relies on matching the entropy of the modeled distribution with that of the empirical data is presented and assessed.
Abstract: A novel method for estimating the shape factor of a generalized Gaussian probability density function (PDF) is presented and assessed. It relies on matching the entropy of the modeled distribution with that of the empirical data. The entropic approach is suitable for real-time applications and yields results that are accurate also for low values of the shape factor and small data sample. Modeling of wavelet coefficients for entropy coding is addressed and experimental results on true image data are reported and discussed.

93 citations


Journal ArticleDOI
TL;DR: A natural gradient is derived on the manifold of overdetermined mixtures using the isometry of a Riemannian metric, and a new learning algorithm for blind separation based on the minimization of mutual information is presented.
Abstract: We study the natural gradient approach to blind separation of overdetermined mixtures. First we introduce a Lie group on the manifold of overdetermined mixtures, and endow a Riemannian metric on the manifold based on the property of the Lie group. Then we derive the natural gradient on the manifold using the isometry of the Riemannian metric. Using the natural gradient, we present a new learning algorithm based on the minimization of mutual information.

Journal ArticleDOI
D.B. Ward, Gary W. Elko
TL;DR: In this letter, a robustness analysis of the two-channel crosstalk canceler is presented, and optimum loudspeaker positions are derived.
Abstract: Acoustic crosstalk cancellation is a signal processing technique whereby two (or possibly more) loudspeakers are used to deliver desired signals exactly at the listener's ears. Such a system is useful for three-dimensional (3-D) audio applications, and removes the requirement for the listener to wear headphones. However, crosstalk cancelers are notoriously nonrobust to slight movements in head position. Recently, a system consisting of two closely spaced loudspeakers (referred to as the "stereo-dipole") has been proposed to improve the robustness of the crosstalk canceler. In this letter we present a robustness analysis of the two-channel crosstalk canceler, and derive optimum loudspeaker positions.

Journal ArticleDOI
TL;DR: It is proved that instantaneous frequency equals the average frequency at each time only when there is symmetry in the instantaneous spectrum, as previous empirical evidence has suggested.
Abstract: The interpretation of instantaneous frequency has been a subject of interest for many years. One interpretation is that it is the average frequency at each time in the signal. We prove that instantaneous frequency equals the average frequency at each time only when there is symmetry in the instantaneous spectrum, as previous empirical evidence has suggested. Also, when there is such symmetry, the average frequency at each time equals the median frequency at each time.

Journal ArticleDOI
TL;DR: A new maximum likelihood estimate of the Laplacian parameter is derived using only the quantized coefficients available at the decoder, and extensive simulations show that the benefits of biased reconstruction are very close to the best possible resulting from centroid reconstruction.
Abstract: Assuming a Laplacian distribution, there exists a well known method for optimally biasing the reconstruction levels for the quantized ac discrete cosine transform (DCT) coefficients in the JPEG decoder. This, however, requires an estimate of the Laplacian distribution parameter. We derive a new, maximum likelihood estimate of the Laplacian parameter using only the quantized coefficients available at the decoder. We quantify the benefits of biased reconstruction through extensive simulations and demonstrate that such improvements are very close to the best possible resulting from centroid reconstruction.
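
The centroid reconstruction that the abstract above uses as a reference point can be written in closed form for a Laplacian density. The sketch below assumes a uniform quantizer with rounding, so that index k covers [(|k|-0.5)*step, (|k|+0.5)*step), and it takes the Laplacian parameter `lam` as known. It does not implement the maximum likelihood estimator derived in the letter, and all names are hypothetical.

```python
import math

def laplacian_centroid(k, step, lam):
    """Centroid of the density (lam/2)*exp(-lam*|x|) over the bin of index k."""
    if k == 0:
        return 0.0  # the zero bin is symmetric, so its centroid is zero
    lower = (abs(k) - 0.5) * step
    # Mean of an exponential density truncated to an interval of width `step`.
    offset = 1.0 / lam - step / math.expm1(lam * step)
    return math.copysign(lower + offset, k)

c = laplacian_centroid(1, 16.0, 0.1)
c_flat = laplacian_centroid(1, 16.0, 1e-6)
```

The reconstruction point is biased toward zero relative to the bin center k*step, and the bias vanishes as `lam` approaches zero and the density becomes flat across the bin.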

Journal ArticleDOI
TL;DR: This letter presents an alternative solution that does not require the explicit estimation of the noise and the driving process variances; it is a new formulation of the approach proposed within a control literature framework by Mehra (1970).
Abstract: A great deal of attention has been paid to speech enhancement using a single microphone system. The various approaches, based on the Kalman filter, operate in two steps: (1) the noise variances and the parameters of the speech model are estimated, and (2) the speech signal is retrieved using standard Kalman filtering. This letter presents an alternative solution that does not require the explicit estimation of the noise and the driving process variances. This deals with a new formulation of the approach proposed within a control literature framework by Mehra (1970).

Journal ArticleDOI
TL;DR: Simulations show that ETDE with the filter modulated to the signal center frequency significantly outperforms conventional ETDE.
Abstract: This letter addresses the problem of on-line sub-sample time delay estimation of narrowband signals of known center frequency. A new form of the Lagrange interpolator filter is presented in this letter, which is incorporated into the explicit time delay estimator (ETDE) method. Simulations show that ETDE with the filter modulated to the signal center frequency significantly outperforms conventional ETDE.
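
For context, the conventional Lagrange interpolator that the abstract above builds on can be sketched as a fractional-delay FIR filter. This is the standard baseband form, not the modulated filter introduced in the letter, and the function name and parameters are assumptions.

```python
def lagrange_fd_coeffs(delay, order):
    """FIR taps h[k] = prod over m != k of (delay - m) / (k - m), k = 0..order.

    The taps approximate a delay of `delay` samples (fractional values allowed).
    """
    coeffs = []
    for k in range(order + 1):
        h = 1.0
        for m in range(order + 1):
            if m != k:
                h *= (delay - m) / (k - m)
        coeffs.append(h)
    return coeffs

linear = lagrange_fd_coeffs(0.25, 1)   # first order: linear interpolation
cubic = lagrange_fd_coeffs(1.4, 3)     # third order, delay of 1.4 samples
```

With an integer delay the taps collapse to a unit impulse at that delay, and for any delay the taps sum to one, so a constant input passes through unchanged.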

Journal ArticleDOI
TL;DR: An alternative definition for the Wigner distribution is proposed, which has a clear extension to discrete signals, and it is shown that the Wigner distribution does not exist for certain classes of discrete signals.
Abstract: Among the myriad of time-frequency distributions, the Wigner distribution stands alone in satisfying many desirable mathematical properties. Attempts to extend definitions of the Wigner distribution to discrete signals have not been completely successful. In this letter, we propose an alternative definition for the Wigner distribution, which has a clear extension to discrete signals. Under this definition, we show that the Wigner distribution does not exist for certain classes of discrete signals.

Journal ArticleDOI
TL;DR: An efficient technique that combines two popular adaptive filtering techniques, namely adaptive noise cancellation and adaptive signal enhancement, in a single recurrent neural network is proposed for the adaptive removal of ocular artifacts from EEG.
Abstract: The electroencephalogram (EEG) is susceptible to various large signal contaminations or artifacts. Ocular artifacts act as a major source of noise, making it difficult to distinguish normal brain activities from the abnormal ones. In this letter, an efficient technique that combines two popular adaptive filtering techniques, namely adaptive noise cancellation and adaptive signal enhancement, in a single recurrent neural network is proposed for the adaptive removal of ocular artifacts from EEG. A real time recurrent learning algorithm is employed for training the proposed neural network which converges faster to a lower mean squared error. This technique is suitable for real-time processing.

Journal ArticleDOI
TL;DR: This work points out some important properties of the normalized fourth-order cumulant and emphasizes the relation between the signal distribution and the sign of the kurtosis, which gives theoretical explanation to techniques, like nonpermanent adaptation, used in nonstationary situations.
Abstract: In this work, we point out some important properties of the normalized fourth-order cumulant (i.e., the kurtosis). In addition, we emphasize the relation between the signal distribution and the sign of the kurtosis. One should mention that in many situations, authors claim that the sign of the kurtosis depends on the nature of the signal (i.e., over- or sub-Gaussian). For a unimodal probability density function, that claim is true and is clearly proved in the letter. But for more complex distributions, it has been shown that the kurtosis sign may change with parameters and does not depend only on the asymptotic behavior of the distributions. Finally, these results give theoretical explanation to techniques, like nonpermanent adaptation, used in nonstationary situations.
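
The link between distribution shape and kurtosis sign discussed in the abstract above can be checked numerically on two textbook cases. The sketch below is only an illustration with simulated samples; it does not reproduce the proof or the multimodal counterexamples of the letter, and the helper name is hypothetical.

```python
import random

def normalized_kurtosis(samples):
    """Sample estimate of E[(x-mean)^4] / E[(x-mean)^2]^2 - 3 (zero for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0

rng = random.Random(0)  # fixed seed so the run is repeatable
uniform = [rng.uniform(-1.0, 1.0) for _ in range(50_000)]
laplace = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(50_000)]
k_uniform = normalized_kurtosis(uniform)   # theoretical value: -1.2 (sub-Gaussian)
k_laplace = normalized_kurtosis(laplace)   # theoretical value: +3 (over-Gaussian)
```

The flat-topped uniform density gives a negative value and the heavy-tailed Laplacian a positive one, matching the sub- and over-Gaussian labels.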

Journal ArticleDOI
TL;DR: A novel method is proposed to continuously estimate the SNR across the frequency bands without the need for a speech detector, based on a sinusoidal model for speech and a Gaussian assumption about the noise.
Abstract: This article addresses the problem of instantaneous signal-to-noise ratio (SNR) estimation during speech activity for the purpose of improving the performance of speech enhancement algorithms. It is shown that the kurtosis of noisy speech may be used to individually estimate speech and noise energies when speech is divided into narrow bands. Based on this concept, a novel method is proposed to continuously estimate the SNR across the frequency bands without the need for a speech detector. The derivations are based on a sinusoidal model for speech and a Gaussian assumption about the noise. Experimental results using recorded speech and noise show that the model and the derivations are valid, though not entirely accurate across the whole spectrum; it is also found that many noise types encountered in mobile telephony are not far from Gaussianity as far as higher statistics are concerned, making this scheme quite effective.

Journal ArticleDOI
TL;DR: An efficient implementation of the backward greedy algorithm is proposed that yields a significant improvement in computational efficiency over the standard implementation and an efficient algorithm for the case in which the transform matrix is too large to be stored is proposed.
Abstract: Recent work in sparse signal reconstruction has shown that the backward greedy algorithm can select the optimal subset of unknowns if the perturbation of the data is sufficiently small. We propose an efficient implementation of the backward greedy algorithm that yields a significant improvement in computational efficiency over the standard implementation. Furthermore, we propose an efficient algorithm for the case in which the transform matrix is too large to be stored. We analyze the computational complexity and compare the algorithms, and we illustrate the improved efficiency with examples.

Journal ArticleDOI
TL;DR: The reasons for the failure of the traditional definition of instantaneous frequency (IF_t) in the multicomponent case are determined and this enables us to understand and integrate all previously reported cases in a simple unified theory.
Abstract: The failure of the traditional definition of instantaneous frequency (IF_t) in the multicomponent case has been often reported. We determine the reasons for the failure of this definition. This enables us to understand and integrate all previously reported cases in a simple unified theory. As a direct consequence, we are able to extrapolate and predict the behavior of the traditional definition for any type of multicomponent signal.

Journal ArticleDOI
TL;DR: It is demonstrated that this method can be adapted for cross QAM constellations, if alternative fourth-order statistics are used, but a four quadrant inverse tangent function must now be used.
Abstract: Blind phase recovery for square QAM communication systems using higher order statistics is well established. It is demonstrated that this method can be adapted for cross QAM constellations, if alternative fourth-order statistics are used. However, a four quadrant inverse tangent function must now be used. Monte Carlo simulation provides evidence of the usefulness of the approach.

Journal ArticleDOI
TL;DR: A new algorithm, based on a tree-structured Markov random field model, is proposed to carry out the unsupervised classification of images; it is adaptive to the local characteristics of the image and provides useful side information about the segmentation process.
Abstract: We propose a new algorithm, based on a tree-structured Markov random field (MRF) model, to carry out the unsupervised classification of images. It presents several appealing features; due to the MRF model, it takes into account spatial dependencies, yet is computationally light because only binary MRFs are used and a progressive refinement of information takes place. Moreover, it is adaptive to the local characteristics of the image and provides useful side information about the segmentation process.

Journal ArticleDOI
TL;DR: An iterative quadratic minimum distance (IQMD) algorithm for computing the reduced-rank Wiener filter is presented, shown to be globally and exponentially convergent under some weak conditions.
Abstract: The reduced-rank Wiener filter (RRWF) is a generic tool for data compression and filtering. This letter presents an iterative quadratic minimum distance (IQMD) algorithm for computing the RRWF. Although it is iterative in nature, the IQMD algorithm is shown to be globally and exponentially convergent under some weak conditions. While the conventional algorithms for computing the RRWF require an order of n^3 flops, the IQMD algorithm requires only an order of n^2 flops at each iteration where n is the dimension of data. The number of iterations required in practice is often small due to the exponential convergence rate of the IQMD.

Journal ArticleDOI
TL;DR: A new technique is introduced for the implementation of context based adaptive arithmetic entropy coding, based on the prediction of the value of the current transform coefficient, using a weighted least squares method, in order to achieve appropriate context selection for arithmetic coding.
Abstract: Significant progress has recently been made in lossless image compression using discrete wavelet transforms. The overall performance of these schemes may be further improved by properly designing efficient entropy coders. A new technique is introduced for the implementation of context based adaptive arithmetic entropy coding. This technique is based on the prediction of the value of the current transform coefficient, using a weighted least squares method, in order to achieve appropriate context selection for arithmetic coding. Experimental results illustrate and evaluate the performance of the proposed technique.

Journal ArticleDOI
TL;DR: This work demonstrates that filterbanks induce a special form of spectral redundancy even if the input is deterministic, which leads to frequency domain subspace methods for the identification of AR, MA and ARMA channels.
Abstract: This letter studies the problem of identifying a single-input single-output channel by linearly precoding the input. Although it is known that filterbanks induce cyclostationarity if the input is stationary, this work demonstrates that filterbanks induce a special form of spectral redundancy even if the input is deterministic. This frequency domain interpretation of filterbanks nicely characterizes all linear precoders. It not only helps to identify the strengths and aid in the design of linear precoders, it leads to frequency domain subspace methods for the identification of AR, MA and ARMA channels.

Journal ArticleDOI
TL;DR: An efficient direct method for the computation of a length-N discrete cosine transform (DCT) given two adjacent length-(N/2) DCT coefficients is presented; its computational complexity is lower than that of the traditional approach for lengths N>8.
Abstract: An efficient direct method for the computation of a length-N discrete cosine transform (DCT) given two adjacent length-(N/2) DCT coefficients, is presented. The computational complexity of the proposed method is lower than the traditional approach for lengths N>8. Savings of N memory locations and 2N data transfers are also achieved.

Journal ArticleDOI
TL;DR: It is found that normalizing all the coefficients of the denominator filter by the first coefficient after each adaptation removes the bias and leads to unbiased estimates, and the proposed method can indeed produce unbiased parameter estimates in the presence of noise.
Abstract: We present a novel way to remove the bias in equation-error based adaptive infinite impulse response (IIR) filtering by conceiving a scheme called monic normalization. It is found that normalizing all the coefficients of the denominator filter by the first coefficient after each adaptation removes the bias and leads to unbiased estimates. The analysis of stationary points is presented to show that the proposed method can indeed produce unbiased parameter estimates in the presence of noise. The computer simulation results also demonstrate that the proposed method performs better than or comparable to existing algorithms, while requiring much lower computational complexity.

Journal ArticleDOI
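TL;DR: Information theoretic criteria are applied to the ordered eigenvalues of the sample data covariance matrix to estimate the number of sources; for any finite number of samples there exist physical conditions for which the proposed approach outperforms the traditional one, while the two approaches converge asymptotically as the number of snapshots tends to infinity.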
Abstract: A new approach for estimating the number of radiating, not fully correlated sources using the data received by an array of sensors is presented. The common approach is to apply information theoretic criteria, such as the minimum description length (MDL) or the Akaike information criterion (AIC), on the received data. Alternatively, we suggest to apply these criteria on the ordered eigenvalues of the sample data covariance matrix. While asymptotically, as the number of snapshots tends to infinity, the two approaches converge, we demonstrate that for any finite number of samples there exist physical conditions for which the proposed approach outperforms the traditional one. These cases are associated with spatially close sources, or with highly correlated sources, or with the case of sources with very different signal-to-noise ratio (SNR).