
Showing papers on "Modified discrete cosine transform" published in 2003


Journal ArticleDOI
W.C. Chu
TL;DR: A DCT-based image watermarking algorithm is described in which the original image is not required for watermark recovery; this is achieved by inserting the watermark in subimages obtained through subsampling.
Abstract: A DCT-based image watermarking algorithm is described in which the original image is not required for watermark recovery. This is achieved by inserting the watermark in subimages obtained through subsampling.

303 citations


Journal ArticleDOI
TL;DR: This approach results in new understanding of the MDCT/IMDCT, enables the development of new algorithms, and makes clear the connection between the algorithms.
Abstract: This paper presents a systematic investigation of the modified discrete cosine transform/inverse modified discrete cosine transform (MDCT/IMDCT) algorithm using a matrix representation. This approach results in new understanding of the MDCT/IMDCT, enables the development of new algorithms, and makes clear the connection between the algorithms. We represent the IMDCT in matrix form as the product of the type-IV DCT with simple scaling, sign-changing, and permutation operations, so that fast algorithms for the type-IV DCT can be simply modified for the IMDCT, and vice versa. Then, the simple symmetry and inversion properties of the type-IV DCT are used to develop new algorithms and establish the connection between existing fast IMDCT algorithms. This approach also enables us to show that the MDCT and IMDCT share a common core operation and to present an efficient architecture for implementing both the MDCT and the IMDCT in a single piece of hardware.

68 citations
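The link between the MDCT and the type-IV DCT mentioned in the abstract above can be illustrated with the textbook "folding" identity. The sketch below is not the paper's factorization; it assumes the common unnormalized definitions of the MDCT and DCT-IV, a block length that is a multiple of 4, and the helper names `mdct_direct`, `dct4` and `fold` are made up for illustration. Both transforms are evaluated by direct O(N^2) summation for clarity, not speed.

```python
import math

def mdct_direct(x):
    """Direct MDCT: 2N input samples -> N coefficients (unnormalized)."""
    n = len(x) // 2
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def dct4(u):
    """Direct type-IV DCT of N samples (unnormalized)."""
    n = len(u)
    return [
        sum(u[t] * math.cos(math.pi / n * (t + 0.5) * (k + 0.5))
            for t in range(n))
        for k in range(n)
    ]

def fold(x):
    """Fold 2N samples into N using only sign changes, permutations and adds.

    Writing the block as quarters (a, b, c, d), the folded vector is
    (-c_r - d, a - b_r), where _r denotes time reversal.
    """
    n = len(x) // 2
    h = n // 2  # quarter-block length, assumes len(x) is a multiple of 4
    u = [0.0] * n
    for t in range(h):
        u[t] = -x[3 * h - 1 - t] - x[3 * h + t]
    for t in range(h, n):
        u[t] = x[t - h] - x[3 * h - 1 - t]
    return u

if __name__ == "__main__":
    x = [math.sin(0.37 * t) + 0.1 * t for t in range(16)]
    worst = max(abs(p - q) for p, q in zip(mdct_direct(x), dct4(fold(x))))
    print("largest difference:", worst)
```

Because the fold step costs only additions, any fast DCT-IV routine substituted for `dct4` would give a fast MDCT under these assumptions.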


Journal Article
Ye Wang, Miikka Vilermo
TL;DR: In this paper, a study of the modified discrete cosine transform (MDCT) and its implications for audio coding and error concealment is presented from the perspective of Fourier frequency analysis.
Abstract: A study of the modified discrete cosine transform (MDCT) and its implications for audio coding and error concealment is presented from the perspective of Fourier frequency analysis. A relationship between MDCT and DFT via shifted discrete Fourier transform (SDFT) is established, which provides a possible fast implementation of MDCT employing a fast Fourier transform (FFT) routine. The concept of time-domain alias cancellation (TDAC) and the symmetric and nonorthogonal properties of MDCT are analyzed and illustrated with intuitive examples. New insights are given for innovative solutions in audio codec design and MDCT-domain audio processing such as error concealment.

65 citations
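Time-domain alias cancellation, as discussed in the abstract above, can be demonstrated numerically. The following is a minimal sketch and not taken from the paper: it assumes a sine window (which satisfies the Princen-Bradley condition), a hop size of N samples, and a 2/N scaling of the inverse transform; the function names are illustrative only.

```python
import math

def mdct(block):
    """Direct MDCT: 2N windowed samples -> N coefficients."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs):
    """Direct IMDCT: N coefficients -> 2N time-aliased samples (2/N scaling)."""
    n = len(coeffs)
    return [
        2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                      for k in range(n))
        for t in range(2 * n)
    ]

def sine_window(n):
    """Length-2N sine window; w[t]^2 + w[t+N]^2 == 1 for every t < N."""
    return [math.sin(math.pi * (t + 0.5) / (2 * n)) for t in range(2 * n)]

def analysis_synthesis(signal, n):
    """Window, MDCT, IMDCT, window again, then overlap-add with hop size N."""
    w = sine_window(n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        block = [signal[start + t] * w[t] for t in range(2 * n)]
        y = imdct(mdct(block))
        for t in range(2 * n):
            out[start + t] += y[t] * w[t]
    return out

if __name__ == "__main__":
    n = 8
    signal = [math.sin(0.3 * t) + 0.05 * t for t in range(5 * n)]
    out = analysis_synthesis(signal, n)
    interior = max(abs(out[t] - signal[t]) for t in range(n, 4 * n))
    print("largest interior error:", interior)
```

A single block on its own is not invertible, since 2N samples are mapped to N coefficients. Only samples covered by two overlapping blocks are recovered, which is why the first and last N samples are excluded from the comparison.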


Book
21 Oct 2003
TL;DR: Contents: Introduction; Polynomial Transforms and Their Fast Algorithms; Fast Fourier Transform Algorithms; Fast Algorithms for 1D Discrete Hartley Transforms; Fast Algorithms for MD Discrete Hartley Transforms; Fast Algorithms for 1D Discrete Cosine Transforms; Fast Algorithms for MD Discrete Cosine Transform; Integer Transforms and Fast Algorithms; New Methods of Time-Frequency Analysis; Index.
Abstract: Introduction; Polynomial Transforms and Their Fast Algorithms; Fast Fourier Transform Algorithms; Fast Algorithms for 1D Discrete Hartley Transforms; Fast Algorithms for MD Discrete Hartley Transforms; Fast Algorithms for 1D Discrete Cosine Transforms; Fast Algorithms for MD Discrete Cosine Transform; Integer Transforms and Fast Algorithms; New Methods of Time-Frequency Analysis; Index.

55 citations


Proceedings ArticleDOI
27 Dec 2003
TL;DR: Two face-based methods for face recognition with an image database are described: one combines the wavelet transform with the fast Fourier transform, the other combines it with the discrete cosine transform, and both use the wavelet transform to extract the features least sensitive to facial expression variations.
Abstract: This paper describes two methods for the face recognition problem with an image database (DB). The two methods are based on the face-based approach. The first method combines the wavelet transform (WT) and fast Fourier transform (FFT), while the second method combines the WT and discrete cosine transform (DCT). Using the virtue of the WT, we extract the features least sensitive to facial expression variations. The first method is proven to be very efficient with face images of different features, illuminations and small occlusion. The second method has proven to be good with the above variations of the first method together with face images of different scales, poses and rotated images (±20). Two DBs are used to perform our work: the Yale DB and the Olivetti DB. The two introduced methods are compared to two known methods, the template method and the eigenface method, on the same DBs.

51 citations


Journal ArticleDOI
TL;DR: The analyzed results show that the proposed recursive infinite-impulse response (IIR) structures possess advantages of high efficiency and high throughput rate.
Abstract: In this paper, we present efficient recursive architectures for realizing the modified discrete cosine transform (MDCT) and the inverse MDCT (IMDCT) required in many audio coding systems. After data rearrangement, the MDCT and IMDCT can be represented in terms of Chebyshev polynomials such that we can efficiently implement them in recursive structures. For verification, we design an ASIC to realize the recursive IMDCT. The analyzed results show that the proposed recursive infinite-impulse response (IIR) structures possess advantages of high efficiency and high throughput rate. The high regularity and modularity of the proposed recursive IMDCT and MDCT algorithms are other merits for very large scale integration implementation.

43 citations
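To give a feel for how an MDCT coefficient can come out of a second-order recursive (IIR) loop, here is a Goertzel-style sketch. It is a generic illustration, not the architecture proposed in the paper above: the recurrence s[n] = x[n] + 2cos(theta)s[n-1] - s[n-2] is the same three-term recurrence that Chebyshev polynomials satisfy, and the names `mdct_direct` and `mdct_recursive` are hypothetical.

```python
import math

def mdct_direct(x):
    """Reference MDCT by direct summation: 2N samples -> N coefficients."""
    n = len(x) // 2
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def mdct_recursive(x):
    """MDCT where each coefficient is produced by one second-order recursion."""
    n = len(x) // 2
    out = []
    for k in range(n):
        theta = math.pi * (k + 0.5) / n      # per-sample phase increment
        phi = theta * (0.5 + n / 2)          # constant phase offset of the MDCT
        feedback = 2.0 * math.cos(theta)     # the only multiplier inside the loop
        s1 = s2 = 0.0
        # Feeding samples in reverse order yields sum x[t] * exp(+j*theta*t).
        for sample in reversed(x):
            s1, s2 = sample + feedback * s1 - s2, s1
        # Final "feed-forward" step: y = s1 - exp(-j*theta) * s2.
        re = s1 - math.cos(theta) * s2
        im = math.sin(theta) * s2
        # Real part of exp(j*phi) * y is the MDCT coefficient.
        out.append(math.cos(phi) * re - math.sin(phi) * im)
    return out

if __name__ == "__main__":
    x = [math.cos(0.21 * t) - 0.03 * t for t in range(16)]
    worst = max(abs(p - q) for p, q in zip(mdct_direct(x), mdct_recursive(x)))
    print("largest difference:", worst)
```

The inner loop needs one multiplication and two additions per sample, which is the kind of regularity that makes recursive structures attractive for hardware. For long blocks, floating-point error in such recursions can grow, so this sketch is only checked on short inputs.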


Patent
Manu Mathew
02 Sep 2003
TL;DR: A digital audio encoding method using an advanced psychoacoustic model is provided, in which the window type is determined from the characteristic of the input audio signal, a complex modified discrete cosine transform (CMDCT) spectrum and a fast Fourier transform (FFT) spectrum are generated according to that window type, and both spectra are used in the psychoacoustic model analysis.
Abstract: A digital audio encoding method using an advanced psychoacoustic model is provided. The audio encoding method includes determining the type of a window according to the characteristic of an input audio signal; generating a complex modified discrete cosine transform (CMDCT) spectrum from the input audio signal according to the determined window type; generating a fast Fourier transform (FFT) spectrum from the input audio signal, by using the determined window type; and performing a psychoacoustic model analysis by using the generated CMDCT spectrum and FFT spectrum.

30 citations


Proceedings ArticleDOI
10 Nov 2003
TL;DR: The algorithm is based on a novel contrast measure that is defined for each DCT coefficient. It can be applied to the enhancement of images compressed with JPEG and is especially useful for enhancing the direction contrast of the images.
Abstract: In this paper a new algorithm is presented for image enhancement in the discrete cosine transform (DCT) domain. The algorithm is based on a novel contrast measure that is defined for each DCT coefficient. This algorithm can be applied to the enhancement of images compressed with JPEG and it is especially useful when it is applied to enhance the direction contrast of the images. Experimental results show the effectiveness of the proposed algorithm.

27 citations


Journal ArticleDOI
TL;DR: Careful analysis of the regular structure of the new fast MDCT algorithm makes it possible to extract a new DCT-IV/DST-IV computational structure and to suggest a new sparse matrix factorization of the DCT-IV matrix.

24 citations


01 Jan 2003
TL;DR: It is shown that the MDCT can also be used as an analysis tool, by extracting the frequency of a pure sine wave with some simple combinations of MDCT coefficients, and studying the performance of this estimation in ideal (noiseless) conditions.
Abstract: The Modified Discrete Cosine Transform (MDCT) is a broadly used transform for audio coding, since it allows an orthogonal time-frequency transform without blocking effects. In this article, we show that the MDCT can also be used as an analysis tool. This is illustrated by extracting the frequency of a pure sine wave with some simple combinations of MDCT coefficients. We studied the performance of this estimation in ideal (noiseless) conditions, as well as the influence of additive noise (white noise / quantization noise). This forms the basis of a low-level feature extraction directly in the compressed domain.

17 citations


01 Jan 2003
TL;DR: There exists a trade-off between computational complexity and coding efficiency when it is applied in the MPEG2 AAC based lossless audio coding scheme and one can reduce the computational complexity of the IntMDCT while a certain level of coding efficiency is maintained in the scheme.
Abstract: Recently, an MPEG2 AAC [1] based lossless audio codec with the Integer MDCT (IntMDCT) was proposed [2]. The IntMDCT was constructed using the lifting scheme [3] to hold the perfect reconstruction (PR) property. In this paper, we evaluate the IntMDCT implemented by fixed-point arithmetic with quantized lifting coefficients in the MPEG2 AAC based lossless audio coding. The results indicate that there exists a trade-off between the computational complexity of the IntMDCT and coding efficiency when it is applied in the MPEG2 AAC based lossless audio coding scheme, and that one can reduce the computational complexity of the IntMDCT while a certain level of coding efficiency is maintained in the scheme.
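The reason lifting preserves perfect reconstruction even with quantized coefficients can be seen on a single plane rotation, the basic building block of integer transforms of this kind. The sketch below is a generic illustration under stated assumptions, not the codec's actual implementation: the rotation is factored into three lifting steps, the coefficients are quantized to `frac_bits` fractional bits, and the names `make_lifting_rotation`, `forward` and `inverse` are hypothetical.

```python
import math

def make_lifting_rotation(theta, frac_bits):
    """Build an integer-to-integer rotation by theta from three lifting steps.

    Coefficients are stored as fixed-point integers with frac_bits fractional
    bits (frac_bits >= 1). Fewer bits mean cheaper multipliers but a coarser
    approximation of the true rotation.
    """
    scale = 1 << frac_bits
    p = round((math.cos(theta) - 1.0) / math.sin(theta) * scale)
    s = round(math.sin(theta) * scale)
    half = scale >> 1

    def mul(coef, value):
        # Fixed-point multiply with rounding; deterministic for any sign.
        return (coef * value + half) >> frac_bits

    def forward(x, y):
        x += mul(p, y)
        y += mul(s, x)
        x += mul(p, y)
        return x, y

    def inverse(x, y):
        # Undo the steps in reverse order; each step is exactly invertible
        # because the rounded term depends only on the untouched variable.
        x -= mul(p, y)
        y -= mul(s, x)
        x -= mul(p, y)
        return x, y

    return forward, inverse

if __name__ == "__main__":
    forward, inverse = make_lifting_rotation(math.pi / 5, frac_bits=12)
    rotated = forward(100, -37)
    print(rotated, inverse(*rotated))
```

Reducing `frac_bits` never breaks reversibility in this sketch; it only moves the integer output further from the exact rotation, which is one way to picture the complexity versus coding-efficiency trade-off the abstract describes.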

Proceedings ArticleDOI
14 Oct 2003
TL;DR: Experimental results show better recognition accuracies and reduced computational burden for biometric identification based on frontal face images using a discrete cosine transform instead of the eigenfaces method.
Abstract: This paper proposes the use of a discrete cosine transform (DCT) instead of the eigenfaces method (Karhunen-Loeve Transform) for biometric identification based on frontal face images. Experimental results show better recognition accuracies and reduced computational burden. This paper includes results with different classifiers and a combination of them.

Proceedings ArticleDOI
17 Sep 2003
TL;DR: This paper proposes an ECG signal compressor based on optimum quantization of discrete cosine transform (DCT) coefficients and Golomb-Rice coding, and assesses the performance of the compressor at various distortion levels.
Abstract: This paper proposes an ECG signal compressor based on optimum quantization of discrete cosine transform (DCT) coefficients and Golomb-Rice coding. The ECG to be compressed is initially partitioned in blocks, and each DCT block is quantized using a quantization step size vector and a zeroing threshold vector. These vectors are defined so that the estimated entropy is minimized for a target distortion in the reconstructed signal or, alternatively, the distortion is minimized for a target entropy. The final step of the compressor is based on Golomb-Rice coding. To assess the performance of the compressor, records of the MIT-BIH Arrhythmia Database were compressed at various distortion levels, measured by the percent root-mean-square difference (PRD), and compression ratios (CR) were computed. An average CR of 10.4:1 was achieved for PRD equal to 2.5%.
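The last two stages described above, quantization followed by Golomb-Rice coding, can be sketched as follows. This is only an illustration: the paper's optimization of the step-size and threshold vectors is not reproduced, the dead-zone rule in `quantize` (zero a coefficient whose magnitude is below its threshold, otherwise divide by the step and round) is an assumption, and the zigzag mapping of signed values plus the bit-string output are choices made here for readability.

```python
def quantize(coeffs, steps, thresholds):
    """Assumed dead-zone quantizer: zero small coefficients, round the rest."""
    return [
        0 if abs(c) < t else round(c / q)
        for c, q, t in zip(coeffs, steps, thresholds)
    ]

def zigzag(v):
    """Map signed integers to non-negative ones: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return 2 * v if v >= 0 else -2 * v - 1

def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

def rice_encode(values, k):
    """Golomb-Rice code with parameter k: unary quotient, then k-bit remainder."""
    bits = []
    for v in values:
        u = zigzag(v)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("1" * quotient + "0")
        if k:
            bits.append(format(remainder, "0{}b".format(k)))
    return "".join(bits)

def rice_decode(bits, k, count):
    """Decode `count` values from a bit string produced by rice_encode."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the terminating "0" of the unary part
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values

if __name__ == "__main__":
    levels = quantize([0.4, 2.6, -7.2], [1.0, 1.0, 2.0], [0.5, 0.5, 0.5])
    code = rice_encode(levels, k=1)
    print(levels, code, rice_decode(code, 1, len(levels)))
```

Small values get short codewords, so the more coefficients the zeroing threshold pushes to zero, the fewer bits the block costs, at the price of higher distortion.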

Patent
Ho-Jin Ha
07 Nov 2003
TL;DR: An MPEG audio encoding method, a method for determining a window type when encoding MPEG audio, a psychoacoustic modeling method when encoding MPEG audio, and the corresponding apparatuses are described.
Abstract: An MPEG audio encoding method, a method for determining a window type when encoding MPEG audio, a psychoacoustic modeling method when encoding MPEG audio, an MPEG audio encoding apparatus, an apparatus for determining a window type when encoding MPEG audio, and a psychoacoustic modeling apparatus in an MPEG audio encoding system are provided. The MPEG audio encoding method comprises performing modified discrete cosine transform (MDCT) on an input audio signal in a time domain; running a psychoacoustic model with the resulting MDCT coefficients as input; and, using the result of the psychoacoustic model, performing quantization and packing a bitstream. According to the method, complexity of computation can be reduced and waste of bits can be prevented.

Proceedings ArticleDOI
17 Jun 2003
TL;DR: A new semi-blind digital watermarking technique is proposed using the modified discrete cosine transformation (MDCT), which shows that the quality of the watermarked image is high and is robust to compression, noise, filtering and geometric transformations.
Abstract: A new semi-blind digital watermarking technique is proposed using the modified discrete cosine transformation (MDCT). The results show that the quality of the watermarked image is high and is robust to compression, noise, filtering and geometric transformations.

Journal ArticleDOI
TL;DR: Joint structures for audio coding and echo cancellation are investigated, utilizing standard audio coders, and convergence properties of the proposed echo canceller structures are shown using simulations with real-life recorded speech.
Abstract: Joint structures for audio coding and echo cancellation are investigated, utilizing standard audio coders. Two types of audio coders are considered, coders based on cosine modulated filterbanks and coders based on the modified discrete cosine transform (MDCT). For the first coder type, two methods for combining such a coder with a subband echo canceller are proposed. The two methods are: a modified audio coder filterbank that is suitable for echo cancellation but still generates the same final decomposition as the standard audio coder filterbank, and another that converts subband signals between an audio coder filterbank and a filterbank designed for echo cancellation. For the MDCT based audio coder, a joint structure with a frequency-domain adaptive filter based echo canceller is considered. Computational complexity and transmission delay for the different coder/echo canceller combinations are presented. Convergence properties of the proposed echo canceller structures are shown using simulations with real-life recorded speech.

01 Jan 2003
TL;DR: The proposed algorithm reduces the amount of computations for MP3 encoder while retaining the audio quality and makes use of complex modified discrete cosine transform of the filter-bank outputs for generating MDCT coefficients as well as the frequency spectrum.
Abstract: MPEG-1 Layer-3, popularly known as MP3, has revolutionized the digital music domain. MP3 makes use of psychoacoustic modeling to achieve compression through the removal of perceptually irrelevant components of digital audio. The psychoacoustic model is the key element of perceptual coding and requires intensive FFT computation for calculating the frequency spectrum. This spectrum is used to compute masking thresholds. Thus, the original MP3 algorithm computes the modified discrete cosine transform (MDCT) and the FFT in parallel. The proposed algorithm is an alternative to this. We make use of the complex modified discrete cosine transform (CMDCT) of the filter-bank outputs for generating MDCT coefficients as well as the frequency spectrum. This method requires fewer computations than the original method. A novel method of window switching, based on the filter-bank output, is used to simplify the overall algorithm. The proposed algorithm reduces the amount of computation for the MP3 encoder while retaining the audio quality.

Proceedings ArticleDOI
06 Jul 2003
TL;DR: The intMDCT is suitable for both lossless and lossy audio coding, and inherits most of the attractive properties of the MDCT, including a good spectral representation of the audio signal, critical sampling and overlapping of blocks.
Abstract: In this paper, an efficient implementation of the forward and inverse MDCT is proposed for even-length MDCT. The algorithm uses the discrete cosine transform of type II (DCT-II) to compute the forward MDCT and its inverse, the DCT-III, to compute the inverse MDCT. The lifting scheme is used to approximate multiplications appearing in the MDCT lattice structure, where the dynamic range of the lifting coefficients can be controlled by proper choices of the lifting factorizations. The new structure requires fewer multiplications and additions than previously reported algorithms. The new transform is an integer-to-integer mapping and is reversible. Moreover, it inherits most of the attractive properties of the MDCT, including a good spectral representation of the audio signal, critical sampling and overlapping of blocks. Hence, the intMDCT is suitable for both lossless and lossy audio coding.

Journal ArticleDOI
TL;DR: The generalized recursive structure for the one-dimensional discrete cosine transform and discrete sine transform is discussed, and the results suggest that it can be utilized in applications such as data compression and VLSI implementations that use the quick discrete cosine and sine transforms.
Abstract: This paper discusses the generalized recursive structure for the one-dimensional discrete cosine transform and discrete sine transform. This results in quicker computation of the discrete cosine and sine transform coefficients. The paper also looks at the relations among the family of discrete cosine and sine transforms and presents the mapping relationships for various discrete cosine and sine transforms. The results suggest that they can be utilized in various applications like data compression and VLSI implementations that utilize the quick discrete cosine transform and the quick discrete sine transform.

Proceedings ArticleDOI
06 Jul 2003
TL;DR: The proposed algorithm reduces the amount of computations for MP3 encoder while retaining the audio quality and makes use of complex modified discrete cosine transform of the filter-bank outputs for generating MDCT coefficients as well as the frequency spectrum.
Abstract: MPEG-1 layer-3, popularly known as MP3, has revolutionized the digital music domain. MP3 makes use of psychoacoustic modeling to achieve compression through the removal of perceptually irrelevant components of digital audio. The psychoacoustic model is the key element of perceptual coding and requires intensive FFT computation for calculating the frequency spectrum. This spectrum is used to compute masking thresholds. Thus, the original MP3 algorithm computes the modified discrete cosine transform (MDCT) and the FFT in parallel. The proposed algorithm is an alternative to this. We make use of the complex modified discrete cosine transform (CMDCT) of the filter-bank outputs for generating MDCT coefficients as well as the frequency spectrum. This method requires fewer computations than the original method. A novel method of window switching, based on the filter-bank output, is used to simplify the overall algorithm. The proposed algorithm reduces the amount of computation for the MP3 encoder while retaining the audio quality.

Journal Article
TL;DR: This paper presents a scalable lossless enhancement of MPEG-4 Advanced Audio Coding using the Integer Modified Discrete Cosine Transform (IntMDCT), which is an integer approximation of the MDCT providing perfect reconstruction.
Abstract: This paper presents a scalable lossless enhancement of MPEG-4 Advanced Audio Coding (AAC). Scalability is achieved in the frequency domain using the Integer Modified Discrete Cosine Transform (IntMDCT), which is an integer approximation of the MDCT providing perfect reconstruction. With this transform, and only minor extension of the bitstream syntax, the MPEG-4 AAC Scalable codec can be extended to a lossless operation. The system provides bit-exact reconstruction of the input signal independent of the implementation accuracy of the AAC core coder. Furthermore, scalability in sampling rate and reconstruction word length is supported.

Journal ArticleDOI
TL;DR: This paper represents the cosine coefficient in binary form and realizes multiplications using a new serial–parallel multiplier architecture that results in a simple structure for VLSI realization.
Abstract: This paper presents an efficient serial–parallel multiplier algorithm, which realizes the input data bit-serially, for implementation of the discrete cosine transform (DCT). First, the DCT equation is split into a few groups of equations by using some mathematical techniques, and index tables are constructed to facilitate efficient data permutations. A new formulation of the DCT is then derived. Second, we represent the cosine coefficient in binary form and realize multiplications using a new serial–parallel multiplier architecture that results in a simple structure for VLSI realization.

Proceedings ArticleDOI
16 Mar 2003
TL;DR: This work investigates the use of MDCT in digital watermarking and finds that it has better coding performance compared to DCT and the computational complexity has been reduced in recent years compared to wavelets.
Abstract: A new digital watermarking technique is proposed for still images in the modified discrete cosine transform (MDCT) domain. Most of the existing image and video watermarking algorithms use block transformations that introduce blocking artifacts, causing perceptible distortions. MDCT has better coding performance compared to DCT and also the computational complexity of MDCT has been reduced in recent years compared to wavelets. We investigate the use of MDCT in digital watermarking.

Proceedings ArticleDOI
25 May 2003
TL;DR: This paper evaluates the performance of the integer implementation of the MDCT by incorporating it in the MPEG layer III codec and comparing it with a standard implementation; the complexity of the lossless structure is also evaluated.
Abstract: The MPEG audio layer III is an efficient audio coding scheme which gives perceptually high quality audio while achieving very high compression. The modified discrete cosine transform (MDCT) is a filterbank used by the MPEG audio layer III to provide finer spectral resolution by subdividing in frequency the subband outputs from the previous layers. Recently fast and lossless structures for the MDCT used in the layer III have been developed which use integer coefficients for their implementation. In this paper we evaluate the performance of the integer implementation of the MDCT by incorporating in the MPEG layer III codec and compare with that of a standard implementation. The comparison is done based on the decoded waveform, spectrogram and by conducting a subjective survey. The complexity of the lossless structure is also evaluated.

Proceedings ArticleDOI
19 Jun 2003
TL;DR: It is shown that if an operator, connected with the Discrete Fourier Transform (DFT), is referred to an appropriate basis it takes block-diagonal form, which allows the full structure of the DCT/DST to be investigated and fast realizations to be constructed.
Abstract: In this paper, two types of the discrete cosine (and sine) transforms (DCT/DST) are analyzed on the basis of the linear representations of finite groups and a geometrical approach. These transforms are useful for multirate systems, adaptive filtering and compression of speech signals and images. It is shown that if an operator, connected with the Discrete Fourier Transform (DFT), is referred to an appropriate basis it takes block-diagonal form. These blocks coincide with DCT-4/DST-4 for even dimensions of the signals' space and with DCT-8/DST-8 for odd ones. The results allow the full structure of the DCT/DST to be investigated and fast realizations to be constructed.

Proceedings ArticleDOI
24 Nov 2003
TL;DR: Experimental results show that the proposed algorithm provides the resized image quality similar to an existing method with significantly lower computational complexity.
Abstract: An efficient image resizing algorithm in the compressed domain with mixed field/frame-mode macroblocks is proposed. A 16×16 field/frame-mode macroblock is converted into an 8×8 reduced block in the discrete cosine transform (DCT) domain using a modified inverse DCT (IDCT) kernel which performs stronger lowpass filtering than the simple truncation in the DCT domain. Experimental results show that the proposed algorithm provides the resized image quality similar to an existing method with significantly lower computational complexity.

Proceedings ArticleDOI
01 Jan 2003
TL;DR: The proposed analog VLSI architecture, capable of computing the discrete cosine transform (DCT) using switched capacitor circuits and a resistor ladder, is very simple to implement and well suited to cases where silicon area and power must be minimized with some compromise on accuracy.
Abstract: This paper describes an analog VLSI architecture, capable of computing the discrete cosine transform (DCT), using switched capacitor circuits and a resistor ladder. The scheme operates from the general expression of the DCT, where the input samples are multiplied by all the DCT coefficients simultaneously using the resistor ladder. These multiplied values are then switched properly, with the help of a cross-point switch, to different integrators for performing the necessary addition/subtraction. Analog multiplications are done here with the help of a simple resistor ladder. The proposed architecture is very simple to implement and well suited to cases where silicon area and power must be minimized with some compromise on accuracy.

Journal ArticleDOI
TL;DR: A new feature representation approach, which uses the real and imaginary Fourier components simultaneously and takes into account the covariance between them, was compared with the cosine transform approach for Gaussian recognition.

Proceedings ArticleDOI
25 May 2003
TL;DR: New recursive structures for computing radix-r two-dimensional discrete cosine transform (2-D DCT) are proposed, based on the same indices of transform bases, and the regular pre-add preprocess is established and the recursive structures are derived without involving any transposition procedure.
Abstract: In this paper, new recursive structures for computing the radix-r two-dimensional discrete cosine transform (2-D DCT) are proposed. Based on the same indices of transform bases, the regular pre-add preprocess is established and the recursive structures for the 2-D DCT, which can be realized in a second-order infinite-impulse response (IIR) filter, are derived without involving any transposition procedure. For computation of the 2-D DCT, the proposed structures have fewer recursive loops than one-dimensional DCT recursive structures, which need data transposition to achieve the so-called row-column approach. With the advantages of fewer recursive loops and no transposition, the proposed recursive structures achieve more accurate results than the existing methods. The regular and modular properties are suitable for VLSI implementation.

Dissertation
01 Jan 2003
TL;DR: The polynomial transform was presented as an efficient way to map multidimensional transforms (like DCT and DFT) and convolutions to one-dimensional ones.
Abstract: The polynomial transform was presented as an efficient way to map multidimensional transforms (like DCT and DFT) and convolutions to one-dimensional ones. The resulting algorithms have significantly lower computational complexity, especially for convolutions and DCT. If an optimal one-dimensional DCT were available, it would be possible to obtain optimal multidimensional algorithms by using polynomial transforms. As demonstrated in the thesis, a polynomial transform based DCT can be implemented in significantly fewer logic resources than the row-column distributed arithmetic algorithm. Polynomial transform computation is performed as a series of additions and subtractions, so the finite-register length effects can be avoided, resulting in higher accuracy. The second dimension of the distributed arithmetic DCT implementation was more problematic due to the larger input word length. The resulting longer carry chain, especially the one in the large accumulators at the output, was the major obstacle to achieving a higher operating frequency. The implementation of the polynomial transform DCT avoids this obstacle by using a butterfly of adders, and no accumulators are needed at the output. The accuracy of both implementations has been extensively measured and detailed results were reported, more detailed than those available in previous work. Once the strong foundations for comparison were built, other factors, like resource usage, maximum achievable frequency, latency and throughput, were compared. Another contribution of the thesis is the proposal of designing the butterfly stage of such highly complex algorithms by scheduling the computation from the last stage, where the designer has no flexibility. This approach has yielded a very efficient FPGA implementation, superior to others reported.