Showing papers in "IEEE Signal Processing Letters in 2000"


Journal ArticleDOI
TL;DR: A method for ROI coding that does not require any shape information to be transmitted to the decoder is described, which makes it possible to have arbitrarily shaped ROIs without the need for shape information or ROI mask generation at the decoder.
Abstract: The general method for generating the region-of-interest (ROI) mask needed for encoding ROIs in the upcoming JPEG2000 still image coding standard is presented. A simple method for the generation of the ROI mask for rectangular ROIs is then proposed. Finally, to simplify the decoder when dealing with arbitrarily shaped ROIs, a method for ROI coding that does not require any shape information to be transmitted to the decoder is described. A small coding penalty is associated with this method, while it makes it possible to have arbitrarily shaped ROIs without the need for shape information or ROI mask generation at the decoder. The proposed methods have been included in the Final Committee Draft of JPEG2000 Part 1.

246 citations


Journal ArticleDOI
TL;DR: It is shown that under certain conditions the performance of a suboptimal detector may be improved by adding noise to the received data.
Abstract: It is shown that under certain conditions the performance of a suboptimal detector may be improved by adding noise to the received data. The reasons for this counterintuitive result are explained and a computer simulation example given.

207 citations


Journal ArticleDOI
TL;DR: This paper elaborates on an orthonormal version of the PAST algorithm (OPAST) for fast estimation and tracking of the principal subspace and/or principal components of a vector sequence; it guarantees the orthonormality of the weight matrix at each iteration.
Abstract: Subspace decomposition has proven to be an important tool in adaptive signal processing. A number of algorithms have been proposed for tracking the dominant subspace. Among the most robust and most efficient methods is the projection approximation and subspace tracking (PAST) method. This paper elaborates on an orthonormal version of the PAST algorithm for fast estimation and tracking of the principal subspace and/or principal components of a vector sequence. The orthonormal PAST (OPAST) algorithm guarantees the orthonormality of the weight matrix at each iteration. Moreover, it has a linear complexity like the PAST algorithm and a global convergence property like the natural power (NP) method.

189 citations


Journal ArticleDOI
TL;DR: A family of fast biorthogonal block transforms called binDCT that can be implemented using only shift and add operations is presented, based on a VLSI-friendly lattice structure that robustly enforces both linear phase and perfect reconstruction properties.
Abstract: This paper presents a family of fast biorthogonal block transforms called binDCT that can be implemented using only shift and add operations. The transform is based on a VLSI-friendly lattice structure that robustly enforces both linear phase and perfect reconstruction properties. The lattice coefficients are parameterized as a series of dyadic lifting steps providing fast, efficient, in-place computation of the transform coefficients as well as the ability to map integers to integers. The new 8×8 transforms all approximate the popular 8×8 DCT closely, attaining a coding gain range of 8.77-8.82 dB, despite requiring as few as 14 shifts and 31 additions per eight input samples. Application of the binDCT in both lossy and lossless image coding yields very competitive results compared to the performance of the original floating-point DCT.

182 citations
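
As a rough illustration of the dyadic lifting idea described in the binDCT abstract above, the following sketch chains three lifting steps whose multipliers are dyadic rationals, so every product reduces to shifts and adds and integers map to integers. The multipliers 3/8 and 11/16 and the function names are illustrative assumptions, not the coefficients or structure published in the paper.

```python
def times3(x):
    # 3*x using one shift and one add
    return x + (x << 1)


def times11(x):
    # 11*x = x + 2x + 8x, using two shifts and two adds
    return x + (x << 1) + (x << 3)


def lift_forward(x0, x1):
    """Three dyadic lifting steps (hypothetical multipliers 3/8 and 11/16)."""
    x0 = x0 + (times3(x1) >> 3)    # x0 += floor(3/8 * x1)
    x1 = x1 - (times11(x0) >> 4)   # x1 -= floor(11/16 * x0)
    x0 = x0 + (times3(x1) >> 3)    # x0 += floor(3/8 * x1)
    return x0, x1


def lift_inverse(y0, y1):
    """Undo the steps in reverse order with opposite signs."""
    y0 = y0 - (times3(y1) >> 3)
    y1 = y1 + (times11(y0) >> 4)
    y0 = y0 - (times3(y1) >> 3)
    return y0, y1
```

Each step changes one value by a rounded function of the other, so the inverse can recompute exactly the same rounded quantity and subtract it. This is why reconstruction stays perfect despite the rounding.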


Journal ArticleDOI
TL;DR: A novel speech enhancement technique based on global soft decision that provides a unified framework for such procedures as speech absence probability computation, spectral gain modification, and noise spectrum estimation using the same statistical model assumption.
Abstract: In this letter, we propose a novel speech enhancement technique based on global soft decision. The proposed approach provides a unified framework for such procedures as speech absence probability (SAP) computation, spectral gain modification, and noise spectrum estimation using the same statistical model assumption. Performances of the proposed enhancement algorithm are evaluated by subjective tests under various environments and show better results compared with the IS-127 standard enhancement method.

153 citations


Journal ArticleDOI
John Platt
TL;DR: Optimal filters are derived that produce subpixel values from a high-resolution input image by minimizing an error metric inspired by psychophysical experiments; the minimization yields a linear system of equations that can be expressed as a set of filters.
Abstract: Displays with repeating patterns of colored subpixels gain spatial resolution by setting individual subpixels rather than by setting entire pixels. This paper describes optimal filtering that produces subpixel values from a high-resolution input image. The optimal filtering is based on an error metric inspired by psychophysical experiments. Minimizing the error metric yields a linear system of equations, which can be expressed as a set of filters. These filters provide the same quality of font display as standard anti-aliasing at a point size 25% smaller. This optimization forms the filter design framework for Microsoft's ClearType.

136 citations


Journal ArticleDOI
Arie Yeredor
TL;DR: It is shown that substantial improvement over SOBI can be attained when the joint diagonalization is transformed into a properly weighted nonlinear least squares problem.
Abstract: Blind separation of Gaussian sources with different spectra can be attained using second-order statistics. The second-order blind identification (SOBI) algorithm, proposed by Belouchrani et al. (1997), uses approximate joint diagonalization. We show that substantial improvement over SOBI can be attained when the joint diagonalization is transformed into a properly weighted nonlinear least squares problem. We provide an iterative solution and derive the optimal weights for our weights-adjusted SOBI (WASOBI) algorithm. The improvement is demonstrated by analysis and simulations.

135 citations


Journal ArticleDOI
TL;DR: Simulation results showed that the RLM algorithm performs better than the conventional RLS, NRLS, and the OSFKF algorithms when the desired and input signals are corrupted by impulses.
Abstract: This paper proposes a recursive least M-estimate (RLM) algorithm for robust adaptive filtering in impulse noise. It employs an M-estimate cost function, which is able to suppress the effect of impulses on the filter weights. Simulation results showed that the RLM algorithm performs better than the conventional RLS, NRLS, and the OSFKF algorithms when the desired and input signals are corrupted by impulses. Its initial convergence, steady-state error, computational complexity, and robustness to sudden system change are comparable to the conventional RLS algorithm in the presence of Gaussian noise alone.

104 citations


Journal ArticleDOI
TL;DR: A new STFD-based wideband root-MUSIC estimator is proposed that employs an extended coherent signal-subspace principle involving coherent averaging over a pre-selected set of time-frequency points rather than the conventional frequency-only averaging procedure.
Abstract: The recently developed concept of narrowband spatial time-frequency distributions (STFDs) is extended to the wide-band case. A new STFD-based wideband root-MUSIC estimator is proposed. This technique employs an extended coherent signal-subspace (CSS) principle involving coherent averaging over a pre-selected set of time-frequency points rather than the conventional frequency-only averaging procedure.

95 citations


Journal ArticleDOI
TL;DR: This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis, entitled wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC).
Abstract: This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are entitled wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared to traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The stressed speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. An overall monophone recognition improvement of 20.4% and 17.2% is achieved for loud and angry stressed speech, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.

88 citations


Journal ArticleDOI
TL;DR: This letter presents an efficient algorithm for sampling rate conversion (SRC) from 44.1 kHz compact disc (CD) to 48 kHz digital audio tape (DAT) that requires fewer million instructions per second (MIPS) and less memory than existing methods.
Abstract: This letter presents an efficient algorithm for sampling rate conversion (SRC) from 44.1 kHz compact disc (CD) to 48 kHz digital audio tape (DAT). This method involves upsampling the input signal by two, and then passing the interpolated signal through a fractional delay filter that employs a simple decimation. This method can also be used for SRC from DAT to CD without changing the filter coefficients. The proposed algorithm is simulated in Matlab and can be implemented in a real-time digital signal processor (DSP). Compared with other existing methods, the proposed method has the advantage that it requires fewer million instructions per second (MIPS) and less memory.
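
The rate arithmetic behind the CD-to-DAT conversion above can be sketched as follows. This is only a worked calculation under the assumption that each output sample is read off the 2x-upsampled grid at a fractional position; it does not implement the letter's fractional delay filter, and the function name is hypothetical.

```python
from fractions import Fraction

CD_RATE = 44_100    # Hz
DAT_RATE = 48_000   # Hz

# Overall conversion ratio in lowest terms.
overall = Fraction(DAT_RATE, CD_RATE)

# After upsampling by two the grid runs at 88.2 kHz, so consecutive
# 48 kHz output samples are this many upsampled samples apart.
step = Fraction(2 * CD_RATE, DAT_RATE)


def output_positions(n_out):
    """Return (integer index, fractional delay) on the 2x grid per output sample."""
    positions = []
    for n in range(n_out):
        pos = n * step
        idx = pos.numerator // pos.denominator  # floor of the position
        positions.append((idx, pos - idx))      # remainder is the fractional delay
    return positions
```

Under these assumptions the fractional delays repeat every 80 output samples, because 80 steps of 147/80 land exactly on upsampled index 147.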

Journal ArticleDOI
TL;DR: Based on the relation between the ambiguity function represented in a quasipolar coordinate system and the fractional power spectra, the fractional Fourier transform (FT) moments are introduced and their applications for signal analysis are discussed.
Abstract: Based on the relation between the ambiguity function represented in a quasipolar coordinate system and the fractional power spectra, the fractional Fourier transform (FT) moments are introduced. Important equalities for the global second order fractional FT moments are derived, and their applications for signal analysis are discussed. The connection between the local moments and the angle derivative of the fractional power spectra is established. This permits us to solve the phase retrieval problem if only two close fractional power spectra are known.

Journal ArticleDOI
TL;DR: It is shown that a finite impulse response and multi-input-multi-output (FIR-MIMO) system with colored input is blindly identifiable up to a permutation and scaling using the second order statistics of the system's output if (a) the system function is irreducible, and (b) the input signals are uncorrelated from each other and have distinct power spectra.
Abstract: We show that a finite impulse response and multi-input-multi-output (FIR-MIMO) system with colored input is blindly identifiable up to a permutation and scaling using the second order statistics (SOS) of the system's output if (a) the system function is irreducible, and (b) the input signals are uncorrelated from each other and have distinct power spectra. Condition (a) is weaker than several conditions reported previously. It suggests a further potential of developing more robust blind algorithms.

Journal ArticleDOI
TL;DR: An orthogonalized version of the Oja algorithm (OOja) is proposed that can be used for the estimation of minor and principal subspaces of a vector sequence and offers such advantages as orthogonality of the weight matrix at each iteration and numerical stability.
Abstract: In this letter, we propose an orthogonalized version of the Oja algorithm (OOja) that can be used for the estimation of minor and principal subspaces of a vector sequence. The new algorithm offers, as compared to Oja, such advantages as orthogonality of the weight matrix, which is ensured at each iteration, numerical stability, and a quite similar computational complexity.

Journal ArticleDOI
TL;DR: This letter presents an efficient algorithm to determine multiple frequencies from multiple undersampled waveforms with sampling rates below the Nyquist rates.
Abstract: Frequency estimation/determination has applications in various areas, where the sampling rate is usually above the Nyquist rate. In some applications, it is preferred that the range of the frequencies is as large as possible for a given sampling rate, and in some applications, the sampling rate is below the Nyquist rate. In both cases, frequency estimation from undersampled waveforms is needed. In this letter, we present an efficient algorithm to determine multiple frequencies from multiple undersampled waveforms with sampling rates below the Nyquist rates.
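
To make the undersampling problem above concrete, here is a deliberately simplified sketch: a single noise-free complex tone at an integer frequency, observed at two pairwise coprime integer sampling rates, so each observation reveals only the frequency modulo its rate. Recovering the frequency by sieving over congruences is an assumption chosen for illustration; it is not the letter's multiple-frequency algorithm, and both function names are hypothetical.

```python
from math import gcd


def aliased(f, fs):
    """Apparent frequency of a complex tone at f Hz sampled at fs Hz."""
    return f % fs


def recover_frequency(residues, rates):
    """Smallest non-negative integer f with f % fs == r for every (r, fs).

    Assumes the rates are pairwise coprime, so f is unique below their product.
    """
    f, modulus = 0, 1
    for r, fs in zip(residues, rates):
        if gcd(modulus, fs) != 1:
            raise ValueError("sampling rates must be pairwise coprime")
        # Step in multiples of `modulus` so earlier congruences stay satisfied.
        for _ in range(fs):
            if f % fs == r % fs:
                break
            f += modulus
        modulus *= fs
    return f
```

For example, rates of 97 Hz and 101 Hz are each far below a 5000 Hz tone, yet together they determine any integer frequency below 97 × 101 = 9797 Hz.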

Journal ArticleDOI
TL;DR: It is shown that most of the gray-level histogram statistics of the images do not have any direct effect on the lossy coding performance, and image activity measure (IAM) is the only feature that has a negative correlation with the PSNR value.
Abstract: When a variety of multimedia images of different types (natural, synthetic, compound, medical, etc.) are compressed using a fixed wavelet filter, it is observed that the peak SNR (PSNR) values for a given compression ratio vary widely by as much as 30 dB from image to image. In this letter, it is shown that most of the gray-level histogram statistics of the images do not have any direct effect on the lossy coding performance, and image activity measure (IAM) is the only feature that has a negative correlation with the PSNR value. We determine the best measure of such image activity and show that one of these IAMs is not only very effective in differentiating between various images but also correlates well with the PSNR. We establish this relationship in the form of the IAM-PSNR equation.

Journal ArticleDOI
TL;DR: It is shown that a nonlinear detector based on the sample mean of two-state quantized data can achieve a smaller probability of error than the linear (sample-mean) detector, especially for non-Gaussian noise with heavy tails or a leptokurtic character.
Abstract: We compare two simple test statistics that a detector can compute from multiple noisy data in a binary decision problem based on a maximum a posteriori probability (MAP) criterion. One of these statistics is the standard sample mean of the data (linear detector), which allows one to minimize the probability of detection error when the noise is Gaussian. The other statistic is even simpler and consists of a sample mean of a two-state quantized version of the data (nonlinear detector). Although simpler to compute, we show that this nonlinear detector can achieve smaller probability of error compared to the linear detector. This especially occurs for non-Gaussian noises with heavy tails or a leptokurtic character.
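
The comparison in the abstract above can be explored with a small Monte Carlo sketch. The setup is an assumption made for illustration: equally likely signal levels of ±1 (so the MAP threshold is zero by symmetry), standard Cauchy noise as an example of a heavy-tailed distribution, and hypothetical function names. It is not the letter's analysis.

```python
import math
import random


def cauchy(rng):
    """Standard Cauchy sample via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def error_rates(n_samples=25, n_trials=2000, seed=0):
    """Return (linear detector error rate, sign detector error rate)."""
    rng = random.Random(seed)
    err_linear = err_sign = 0
    for _ in range(n_trials):
        s = rng.choice((-1.0, 1.0))                # transmitted level
        data = [s + cauchy(rng) for _ in range(n_samples)]
        linear_stat = sum(data) / n_samples        # sample mean
        # Sample mean of the two-state quantized data; n_samples is odd, so no ties.
        sign_stat = sum(1 if x > 0 else -1 for x in data) / n_samples
        err_linear += (linear_stat > 0) != (s > 0)
        err_sign += (sign_stat > 0) != (s > 0)
    return err_linear / n_trials, err_sign / n_trials
```

Cauchy noise is an extreme case: the mean of independent Cauchy samples has the same distribution as a single sample, so averaging does not help the linear detector, while the quantized statistic keeps improving as more samples are combined.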

Journal ArticleDOI
TL;DR: An innovative interpolator is presented that performs high-quality 2× interpolation on both synthetic and real-world images and provides edge-sensitive data interpolation so that sharp, artifact-free images are obtained at a reasonable computational cost.
Abstract: In this paper, we present an innovative interpolator that performs high-quality 2× interpolation on both synthetic and real-world images. Its structure, which is based on a rational operator, provides edge-sensitive data interpolation so that sharp, artifact-free images are obtained at a reasonable computational cost.

Journal ArticleDOI
TL;DR: An unsupervised method is presented for segmenting video sequences degraded by noise using a Markov random field, and the energy function of each MRF is minimized by chromosomes that evolve using distributed genetic algorithms.
Abstract: An unsupervised method is presented for segmenting video sequences degraded by noise. Each frame in a sequence is modeled using a Markov random field (MRF), and the energy function of each MRF is minimized by chromosomes that evolve using distributed genetic algorithms. To improve the computational efficiency, only unstable chromosomes corresponding to moving object parts are evolved. Experimental results show the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: Using a few unrestrictive physics-based assumptions on the environment, a generic multiplicative noise data model is derived and new direction finding techniques are proposed.
Abstract: The performances of high resolution array processing methods are known to degrade in random inhomogeneous media. Such a degradation is caused by random amplitude and phase variations of source wavefronts. In this letter, the popular covariance matching approach to direction finding in the presence of multiplicative noise is extended to the multiple source case. Using a few unrestrictive physics-based assumptions on the environment, we derive a generic multiplicative noise data model. Based on this model, new direction finding techniques are proposed.

Journal ArticleDOI
TL;DR: This letter investigates the problem of fusion of data collected by sensors having different ground and wavelength resolutions and shows how to design the filter banks so that spectra from different signals can be integrated with a minimum distortion.
Abstract: In this letter, we investigate the problem of fusion of data collected by sensors having different ground and wavelength resolutions. This is a typical problem encountered in the interpretation of remotely sensed images. The approach that is proposed here is based on the use of cosine-modulated uniform filter banks. We assume that the ratio of the sampling periods of the input data is not integer and show how to design the filter banks so that spectra from different signals can be integrated with a minimum distortion.

Journal ArticleDOI
TL;DR: The technique presented here is useful for high resolution 2-D spectral analysis applications and the creation of high-resolution spotlight-mode synthetic aperture radar (SAR) imagery, as is illustrated.
Abstract: This paper presents an extension of the one-dimensional (1-D) lattice (reflection coefficient) technique of linear prediction parameter estimation, first popularized by Burg, to the two-dimensional (2-D) case. The resulting fast recursive 2-D algorithm is a significant computational simplification over and an estimation improvement on previous attempts to extend the 1-D Burg linear prediction algorithm to 2-D by exploiting some newly discovered matrix structures. The technique presented here is useful for high resolution 2-D spectral analysis applications and the creation of high-resolution spotlight-mode synthetic aperture radar (SAR) imagery, as is illustrated.

Journal ArticleDOI
TL;DR: Using both the amplitude and phase information of the harmonics generated by block artefacts in MPEG2 coded video, it is possible to estimate the blockiness in MPEG2 coded video accurately without the reference signal, achieving correlation of 0.88.
Abstract: Using both the amplitude and phase information of the harmonics generated by block artefacts in MPEG2 coded video, it is possible to estimate the blockiness in MPEG2 coded video accurately without the reference signal, achieving correlation of 0.88.

Journal ArticleDOI
TL;DR: The channel capacity under a constraint on individual signal power is formulated, and it is shown that to achieve the capacity, the shifts should be normally distributed, have maximum power, and adjacent shifts should be negatively correlated.
Abstract: We previously proposed watermarking text documents by slightly shifting certain text lines. Such a text line represents a noisy channel, and marking represents the transmission of a signal through this channel. The power of the signal represents the size of the shift and must be small for the marks to be imperceptible. We formulate the channel capacity under a constraint on individual signal power. We show that to achieve the capacity, the shifts should be normally distributed, have maximum power, and adjacent shifts should be negatively correlated.

Journal ArticleDOI
TL;DR: Analysis of time frequency distributions, as the instantaneous frequency estimators for low noise, is extended to high noise and the crucial parameter is the ratio of auto-term (AT) magnitude and distribution standard deviation.
Abstract: Analysis of time frequency (TF) distributions, as the instantaneous frequency (IF) estimators for low noise, has been previously carried out. In this letter, we extend the analysis to high noise. This noise causes a specific error, which can dominate over all other studied errors. The crucial parameter is the ratio of auto-term (AT) magnitude and distribution standard deviation.

Journal ArticleDOI
TL;DR: A sliding window adaptive RLS-like algorithm for filtering alpha-stable noise that behaves much like the RLS algorithm in terms of convergence speed and computational complexity compared to previously introduced stochastic gradient-based algorithms, which behave like the LMS algorithm.
Abstract: We introduce a sliding window adaptive RLS-like algorithm for filtering alpha-stable noise. Unlike previously introduced stochastic gradient-type algorithms, the new adaptation algorithm minimizes the L_p norm of the error exactly in a sliding window of fixed size. Therefore, it behaves much like the RLS algorithm in terms of convergence speed and computational complexity compared to previously introduced stochastic gradient-based algorithms, which behave like the LMS algorithm. It is shown that the new algorithm achieves superior convergence rate at the expense of increased computational complexity.

Journal ArticleDOI
TL;DR: The new constrained conjugate gradient (CCG) algorithm is derived from the condition for equivalence between linearly constrained minimum-variance filters and their generalized sidelobe canceler (GSC) implementations.
Abstract: Based on the condition for equivalence between linearly constrained minimum-variance (LCMV) filters and their generalized sidelobe canceler (GSC) implementations, we derive the new constrained conjugate gradient (CCG) algorithm. We discuss the use of orthogonal and nonorthogonal blocking matrices for the GSC structure and how the choice of this matrix may affect the relationship with the LCMV counterpart. The newly derived algorithm was tested in a computer experiment for adaptive multiuser detection and showed excellent results.

Journal ArticleDOI
T. Vlachos
TL;DR: A novel algorithm for the detection of cuts in video sequences is proposed that uses phase correlation to obtain a measure of content similarity for temporally adjacent frames and responds very well to scene cuts.
Abstract: A novel algorithm for the detection of cuts in video sequences is proposed. The algorithm uses phase correlation to obtain a measure of content similarity for temporally adjacent frames and responds very well to scene cuts. The algorithm is insensitive to the presence of global illumination changes and noise and outperforms established methods for cut detection. As the proposed scheme is implemented in the frequency domain, the availability of fast hardware makes the scheme attractive for interactive and on-line applications.

Journal ArticleDOI
TL;DR: The novel LT's coding performance consistently surpasses that of the much more complex 9/7-tap biorthogonal wavelet with floating-point coefficients, and its block-based nature facilitates one-pass sequential block coding, region-of-interest coding/decoding, and parallel processing.
Abstract: This paper introduces a class of multiband linear-phase lapped biorthogonal transforms with fast, VLSI-friendly implementations via lifting steps called the LiftLT. The transform is based on a lattice structure that robustly enforces both linear phase and perfect reconstruction properties. The lattice coefficients are parameterized as a series of lifting steps, providing fast, efficient, in-place computation of the transform coefficients. The new transform is designed for applications in image and video coding. Compared to the popular 8×8 DCT, the 8×16 LiftLT only requires one more multiplication, 22 more additions, and six more shifting operations. However, image coding examples show that the LiftLT is far superior to the DCT in both objective and subjective coding performance. Thanks to properly designed overlapping basis functions, the LiftLT can completely eliminate annoying blocking artifacts. In fact, the novel LT's coding performance consistently surpasses that of the much more complex 9/7-tap biorthogonal wavelet with floating-point coefficients. More importantly, the transform's block-based nature facilitates one-pass sequential block coding, region-of-interest coding/decoding, and parallel processing.

Journal ArticleDOI
TL;DR: This letter introduces the spatial ambiguity functions (SAFs) and discusses their applications to direction finding and source separation problems and emphasizes two properties of SAFs that make them an attractive tool for array signal processing.
Abstract: This letter introduces the spatial ambiguity functions (SAFs) and discusses their applications to direction finding and source separation problems. We emphasize two properties of SAFs that make them an attractive tool for array signal processing.