
Showing papers on "Modified discrete cosine transform" published in 2012


Journal ArticleDOI
TL;DR: In this article, a low-complexity 8-point orthogonal approximate discrete cosine transform (DCT) is introduced. The proposed transform requires no multiplications or bit-shift operations.
Abstract: A low-complexity 8-point orthogonal approximate discrete cosine transform (DCT) is introduced. The proposed transform requires no multiplications or bit-shift operations. The derived fast algorithm requires only 14 additions, fewer than any existing DCT approximation. Moreover, in several image compression scenarios, the proposed transform could outperform the well-known signed DCT, as well as state-of-the-art algorithms.
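For context, the signed DCT that the abstract uses as a baseline can be sketched as follows. This is not the proposed 14-addition transform, whose matrix is not given in this listing; it is only the well-known baseline, which keeps the sign of each exact DCT-II entry. The helper names (`dct2_matrix`, `signed_dct_matrix`, `apply`) and the sample input are assumptions made for illustration.

```python
import math

def dct2_matrix(n):
    """Orthonormal DCT-II matrix (row k = frequency index, column j = sample index)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def signed_dct_matrix(n):
    """Signed DCT: keep only the sign (-1, 0 or +1) of each exact DCT-II entry."""
    return [[(v > 0) - (v < 0) for v in row] for row in dct2_matrix(n)]

def apply(matrix, x):
    """Matrix-vector product; with +/-1 entries each term is an add or a subtract."""
    return [sum(c * v for c, v in zip(row, x)) for row in matrix]

T = signed_dct_matrix(8)
x = [10, 12, 14, 13, 11, 9, 8, 8]   # hypothetical 8-sample input block
print(apply(T, x))
```

Because every entry is +1 or -1, the transform needs no multiplications, which is the property that low-complexity approximations of this kind exploit. The naive product above uses far more additions than a factorized fast algorithm would.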

91 citations


PatentDOI
TL;DR: This disclosure presents techniques for a fast algorithm that computes odd-type DCTs and DSTs, including a mapping from the real-valued data sequence to an intermediate sequence that is used as the input to a DFT.
Abstract: This disclosure presents techniques for implementing a fast algorithm for computing odd-type DCTs and DSTs. The techniques include the computation of an odd-type transform on any real-valued sequence of data (e.g., residual values in a video coding process or a block of pixel values in an image coding process) by mapping the odd-type transform to a discrete Fourier transform (DFT). The techniques include a mapping from the real-valued data sequence to an intermediate sequence to be used as an input to a DFT. Using this intermediate sequence, an odd-type transform may be achieved by calculating a DFT of odd size. Fast algorithms for a DFT may then be used, and as such, the odd-type transform may be calculated in a fast manner.

53 citations


Journal ArticleDOI
TL;DR: A new method for securing color images using the discrete cosine transform in gyrator-transform-domain structured-phase encoding is proposed, and its effectiveness is demonstrated against chosen-plaintext and known-plaintext attacks.

48 citations


Proceedings ArticleDOI
31 Aug 2012
TL;DR: A digital watermarking algorithm for gray images, based on two-dimensional discrete wavelet and cosine transforms, is proposed to protect digital media copyright efficiently; it is robust to common signal processing techniques.
Abstract: Digital watermarking is one of the valid methods for copyright protection. In this paper, we propose a digital watermarking algorithm for gray images based on two-dimensional discrete wavelet and cosine transforms in order to protect digital media copyright efficiently. We apply the discrete wavelet transform to the image three times and split the resulting subband, which is low-frequency in the horizontal direction and high-frequency in the vertical direction, into sub-blocks; every block is then transformed into the discrete cosine domain. The watermark components, which are also transformed into the discrete cosine domain, are embedded into the cover image. Finally, the secret image is obtained by the inverse transforms of the cosine and wavelet domains. The experimental results show that the watermark is robust to common signal processing techniques, including JPEG compression, noise, lowpass filtering and cropping.

34 citations


Proceedings ArticleDOI
25 Mar 2012
TL;DR: Extensive experimental results show that the proposed method can effectively identify whether a given audio signal has previously been compressed with MP3 and/or WMA, and can further estimate the hidden compression rate, even when that rate is as high as 128 kbps.
Abstract: Compression history identification plays a very important role in digital multimedia forensics. However, most existing literature focuses on digital image forensics, and only a few works consider digital audio. In this paper, we investigate two popular digital audio compression schemes, MP3 and WMA, and try to reveal the compression history of a questionable audio signal in the uncompressed WAV format by analyzing statistical characteristics of its modified discrete cosine transform coefficients. Extensive experimental results show that the proposed method can effectively identify whether the given audio has previously been compressed with MP3 and/or WMA, and can further estimate the hidden compression rate, even when that rate is as high as 128 kbps (kilobits per second).

32 citations


Journal ArticleDOI
TL;DR: This study is completed with several computer simulations in mobile broadband wireless communication scenarios, considering the presence of carrier frequency offset (CFO), and results indicate that the proposed systems outperform the standardized ones based on the DFT.
Abstract: In this correspondence, the conditions to use any kind of discrete cosine transform (DCT) for multicarrier data transmission are derived. The symmetric convolution-multiplication property of each DCT implies that when symmetric convolution is performed in the time domain, an element-by-element multiplication is performed in the corresponding discrete trigonometric domain. Therefore, appending symmetric redundancy (as prefix and suffix) into each data symbol to be transmitted, and also enforcing symmetry for the equivalent channel impulse response, the linear convolution performed in the transmission channel becomes a symmetric convolution in those samples of interest. Furthermore, the channel equalization can be carried out by means of a bank of scalars in the corresponding discrete cosine transform domain. The expressions for obtaining the value of each scalar corresponding to these one-tap per subcarrier equalizers are presented. This study is completed with several computer simulations in mobile broadband wireless communication scenarios, considering the presence of carrier frequency offset (CFO). The obtained results indicate that the proposed systems outperform the standardized ones based on the DFT.

31 citations


Patent
27 Apr 2012
TL;DR: An efficient content classification and gated loudness estimation method for audio signal encoding is proposed, based on a spectral representation of the audio signal.
Abstract: Efficient Content Classification and Gated Loudness Estimation. The present document relates to methods and systems for encoding an audio signal. The method comprises determining a spectral representation of the audio signal. This step may comprise determining modified discrete cosine transform (MDCT) coefficients or a Quadrature Mirror Filter (QMF) filter bank representation of the audio signal. The method further comprises encoding the audio signal using the determined spectral representation, and classifying parts of the audio signal as speech or non-speech based on the determined spectral representation. Finally, a loudness measure for the audio signal is determined based on the speech parts.

30 citations


Journal ArticleDOI
TL;DR: A family of integer transforms, Loeffler, Ligtenberg, and Moschytz (LLM) integer cosine transform, is derived using this method, which is not only very close to the DCT but also has excellent coding performance.
Abstract: Existing video coding standards use only 4 × 4 and 8 × 8 transforms for energy compaction. Recent research has found that the use of larger transforms, such as 16 × 16, together with the existing transforms can improve coding performance especially in high-definition (HD) videos which are becoming more and more common. This raises the interest of seeking high-performance higher-order transforms with low computation requirement. In this paper, a method to derive orthogonal integer cosine transforms is proposed. The order-2N transform is defined using the order-N transform. A family of these integer transforms, Loeffler, Ligtenberg, and Moschytz (LLM) integer cosine transform, is derived using this method. Its fast algorithm structure is the same as LLM fast discrete cosine transform (DCT) algorithm but requires integer operations only. This new family of transforms is not only very close to the DCT but also has excellent coding performance.

30 citations


Proceedings ArticleDOI
11 Dec 2012
TL;DR: A closed-form relationship between the 16×16 transform and arbitrary smaller-sized transforms is presented, enabling this architecture to compute transforms of size 4·2^p × 4·2^q, where 0 ≤ p, q ≤ 2.
Abstract: The discrete cosine transform (DCT) is widely employed in image and video coding applications due to its high energy compaction. In addition to the 4×4 and 8×8 transforms utilized in earlier video coding standards, the proposed HEVC standard suggests the use of larger transform sizes, including 16×16 and 32×32 transforms, in order to obtain higher coding gains. Further, it also proposes the use of non-square transform sizes as well as the use of the discrete sine transform (DST) in certain intra-prediction modes. The decision on the type of transform used in a given prediction scenario is made dynamically, to obtain the required compression rates. This motivated the proposed digital VLSI architecture for a multitransform engine capable of computing a 16×16 approximate 2-D DCT/DST transform with null multiplicative complexity. The relationship between DCT-II and DST-II is employed to compute both transforms using the same digital core, leading to reductions in both area and power. A closed-form relationship between the 16×16 transform and arbitrary smaller-sized transforms is presented, enabling this architecture to compute transforms of size 4·2^p × 4·2^q, where 0 ≤ p, q ≤ 2.
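One standard identity linking the two transforms is sketched below: the DST-II of a sequence equals the DCT-II of the sign-alternated sequence, read in reverse order. Whether the paper's architecture uses exactly this identity is an assumption; the abstract only states that a DCT-II/DST-II relationship is exploited. The function names are hypothetical and the transforms are unnormalized O(N²) reference versions.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: C[k] = sum_j x[j] * cos(pi*(2j+1)*k / (2N))."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                for j in range(n)) for k in range(n)]

def dst2(x):
    """Unnormalized DST-II: S[k] = sum_j x[j] * sin(pi*(2j+1)*(k+1) / (2N))."""
    n = len(x)
    return [sum(x[j] * math.sin(math.pi * (2 * j + 1) * (k + 1) / (2 * n))
                for j in range(n)) for k in range(n)]

def dst2_via_dct2(x):
    """DST-II computed with the DCT-II core only.

    Step 1: negate every odd-indexed input sample.
    Step 2: run the DCT-II.
    Step 3: reverse the output order, so S[k] = C[N-1-k].
    """
    alternated = [(-1) ** j * v for j, v in enumerate(x)]
    return dct2(alternated)[::-1]

x = [math.sin(0.3 * j) + 0.1 * j for j in range(16)]   # arbitrary test input
max_err = max(abs(a - b) for a, b in zip(dst2(x), dst2_via_dct2(x)))
print(max_err)
```

The extra work is only sign flips and index reversal, which is why a single transform core can serve both transforms in hardware.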

24 citations


Proceedings ArticleDOI
01 Sep 2012
TL;DR: This paper develops two algorithms to solve the problem of selecting the best transform for each block that leads to the best energy compaction, and develops a locally optimal and globally optimal solution.
Abstract: With a proper transform, an image or motion-compensated residual can be represented quite accurately with a small fraction of the transform coefficients. This is referred to as the energy compaction property. When multiple block transforms are used, selecting the best transform for each block that leads to the best energy compaction is difficult. In this paper, we develop two algorithms to solve this problem. The first algorithm, which is computationally simple, leads to a locally optimal solution. The second algorithm, which is more computationally intensive, gives a globally optimal solution. We discuss the algorithms and their performances. Two-dimensional discrete cosine transform (2-D DCT) and direction-adaptive one-dimensional discrete cosine transforms (1-D DCTs) are used to evaluate the performance of our algorithms. Results obtained are consistent with those from previous research.

14 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed method can achieve higher security and less computational complexity by reusing the MDCT coefficients obtained in MP3.
Abstract: In this paper, we propose an advanced partial encryption scheme that combines watermarking and scrambling using the magnitude information of the modified discrete cosine transform (MDCT). In MPEG-1 Audio Layer III (MP3), the magnitude and phase information of the MDCT coefficients is encrypted. The proposed method uses both watermarking and scrambling, and aims at protecting the content against eavesdropping and, moreover, against illegal mass distribution after descrambling. Experimental results show that the proposed method can achieve higher security and lower computational complexity by reusing the MDCT coefficients obtained in MP3.

Journal ArticleDOI
TL;DR: Two types of fast algorithms for implementing LD-TDAC filterbanks in AAC-ELD are presented and a new fast factorization of 15-point DCT-II is presented that requires only 14 irrational multiplications, 3 dyadic rational multiplications and 67 additions.
Abstract: The MPEG committee has completed development of a new audio coding standard called "MPEG-4 advanced audio coding-enhanced low delay" (AAC-ELD). AAC-ELD uses low delay spectral band replication (LD-SBR) technology together with a low delay time domain alias cancellation (LD TDAC) filterbank in the encoder to achieve both high coding efficiency and low algorithmic delay. In this paper, we present fast algorithms for implementing LD-TDAC filterbanks in AAC-ELD. Two types of fast algorithms are presented. In the first, we map LD-TDAC analysis and synthesis filterbanks to modified discrete cosine transform (MDCT) and inverse modified discrete cosine transform (IMDCT), respectively. Since MDCT/IMDCT are already extensively used in AAC and they have many fast algorithms, this mapping not only provides a fast implementation but also allows a common implementation of the filterbanks in AAC Low Complexity (AAC-LC), AAC Low Delay (AAC-LD) and AAC-ELD codecs. In the second algorithm, we provide a mapping to discrete Cosine transform of type II. The mapping to DCT-II allows the merger of the matrix operations with the windowing stage that precedes or follows them. This further reduces the number of multiplications and leads to an algorithm with the lowest known arithmetic complexity. For filterbanks of lengths 1024 and 960, we also present a new fast factorization of 15-point DCT-II that requires only 14 irrational multiplications, 3 dyadic rational multiplications and 67 additions.

Journal Article
TL;DR: An improved mixed-radix DIF fast MDCT algorithm both in terms of the regularity and computational complexity is described, based on observed simple algebraic identities in the original proposed algorithm ShuBao, resulting in a very regular computational structure.
Abstract: Recently, a mixed-radix decimation in frequency (DIF) fast MDCT algorithm only for the mixed-radix decompositions or composite lengths N = 3^m.2, m>0, has been proposed in ShuBao. An improved mixed-radix DIF fast MDCT algorithm both in terms of the regularity and computational complexity is described. Based on observed simple algebraic identities in the original proposed algorithm ShuBao, new formulas are derived resulting in a very regular computational structure. Consequently, the number of arithmetic operations is reduced significantly. Moreover, the improved algorithm is extended to all composite lengths N = 3^m.2^p, m, p>0. The improved algorithm defines new sparse matrix factorizations of the MDCT matrix for the composite lengths N = 3^m.2^p, m, p>0, and finally it provides new implementations of the forward/backward MDCT in MPEG-1/2 layer III (MP3) audio coding standard.

Proceedings ArticleDOI
01 Sep 2012
TL;DR: This work presents a motion detection algorithm based on a change detection filter matrix derived from the discrete cosine transform; it is two orders of magnitude faster than the previous algorithm, performs better, and is fundamentally robust to sudden illumination changes.
Abstract: We present a motion detection algorithm based on a change detection filter matrix derived from the discrete cosine transform. Recently, a Fourier reconstruction scheme showed good results for motion detection. However, its computational cost is a major drawback. We revisit the problem and achieve a speedup of two orders of magnitude over the previous algorithm, with better performance. The proposed algorithm runs at about 800 frames per second for VGA-resolution images on consumer hardware by using only integer matrix multiplication and the symmetry of the change detection filter matrix. In addition, our algorithm is fundamentally robust to sudden illumination changes because it works on edge information. We verify our algorithm with challenging datasets that contain strong and sudden illumination changes.

Proceedings ArticleDOI
25 Sep 2012
TL;DR: The results show that the frequency based image steganography using Discrete Cosine Transformation (DCT) and JPEG can reduce the computation time and increase the capacity of the secret messages while maintaining the image quality and the size of JPEG stego-image.
Abstract: Image steganography is the art of hiding the existence of information in an image. Steganography can be categorized into two types: spatial domain and frequency domain. In this study, frequency-based image steganography using the discrete cosine transform (DCT) and JPEG is proposed. Instead of using 8x8-pixel blocks with the 8x8 quantization table, a larger block of size 32x32 is used with a corresponding 32x32 quantization table created by cubic interpolation. Our results show that we can reduce the computation time and increase the capacity for secret messages while maintaining the image quality and the size of the JPEG stego-image.

Journal ArticleDOI
TL;DR: The proposed algorithm not only can simultaneously compute MDCT and MDST (or IMDCT and IMDST) coefficients by adopting a compact recursive structure but also can increase the peak signal-to-noise ratio (PSNR) value by selecting the optimal q factor.
Abstract: This brief presents a novel low-cost and high-accuracy design for recursive modified discrete cosine transform (MDCT), modified discrete sine transform (MDST), inverse MDCT (IMDCT), and inverse MDST (IMDST) algorithms. The proposed algorithm not only can simultaneously compute MDCT and MDST (or IMDCT and IMDST) coefficients by adopting a compact recursive structure but also can increase the peak signal-to-noise ratio (PSNR) value by selecting the optimal q factor. The PSNR exceeds 78 dB for 256- and 512-point window lengths. In a complexity comparison with Nikolajevic and Fettweis's algorithm, the proposed algorithm reduces multiplications by 50.21%, additions by 24.97%, and computational cycles by 50% for 512-point MDCT and MDST. The FPGA implementation results show that the proposed design can support 7.92 sound-channel encoding and decoding for Dolby AC-3 at a sampling rate of 48 kHz while the clock rate is set to 97 MHz.

Journal Article
TL;DR: In this paper, the authors use filters derived from the 16 DTTs, instead of wavelet convolution filters, to realize the decomposition stage of the Mallat scheme; windowing in the time domain reduces the computations by up to 34 times, with the accompanying distortion assessed at a PSNR of about 40 dB.
Abstract: This paper describes the assumptions of a computational experiment, and the results obtained, which positively verify the hypothesis that the decomposition stage of the Mallat scheme can be realized efficiently with filters derived from the 16 discrete trigonometric transforms (DTTs) instead of wavelet convolution filters. Introducing windowing in the time domain resulted in a considerable reduction of computations, up to 34 times, while the accompanying distortion was assessed at a PSNR of about 40 dB. The investigation was performed with statistically defined signals based on a first-order Markov process with an assumed intersymbol correlation. (Decomposition in a wavelet-like scheme using windowed filters defined for discrete trigonometric transforms.)

Proceedings ArticleDOI
01 Sep 2012
TL;DR: A multipurpose blind watermarking algorithm for color images based on the discrete wavelet transform and the discrete cosine transform is proposed; the robust watermark and the semi-fragile watermark are embedded using dither modulation (DM) and spread transform dither modulation (STDM), respectively.
Abstract: In this paper, a multipurpose blind watermarking algorithm for color images based on the discrete wavelet transform (DWT) and the discrete cosine transform (DCT) is proposed, motivated by the lack of multipurpose blind watermarking algorithms. In this algorithm, the robust watermark and the semi-fragile watermark are embedded using dither modulation (DM) and spread transform dither modulation (STDM), respectively. Simulation results demonstrate that the quality of the watermarked image is good, that the robust watermark has a certain robustness to common attacks, and that the semi-fragile watermark has a certain robustness to common attacks and can localize familiar attacks well.

Proceedings ArticleDOI
24 Dec 2012
TL;DR: Experimental results indicate that the proposed watermarking method resists various attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression, and outperforms conventional methods in terms of imperceptibility and robustness.
Abstract: In this paper, we propose an audio watermarking method in transform domain based on singular value decomposition (SVD) and quantization for copyright protection of audio data. In our proposed method, initially the original audio is segmented into non-overlapping frames. Discrete wavelet transformation (DWT) is applied to each frame and detail coefficients are formulated. Discrete cosine transformation (DCT) is performed on the detail coefficients and the obtained DCT coefficients are reshaped. SVD is applied on the DCT coefficients and watermark information is then embedded into the highest singular value by quantization. Watermark is extracted by comparing the largest singular value of original and attacked watermarked DCT coefficients obtained from DWT sub bands of each audio frame. Experimental results indicate that the proposed watermarking method resists various attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression. Moreover, it outperforms conventional methods in terms of imperceptibility and robustness. Our proposed method achieves signal-to-noise ratio (SNR) values ranging from 38 to 40 dB, in contrast to conventional methods which achieve SNR values ranging from only 10 to 26 dB.

Proceedings ArticleDOI
22 Oct 2012
TL;DR: A Modified Discrete Cosine Transform (MDCT) based compression scheme is proposed to compress vibration signals in order to increase sampling rates and save valuable power in wireless sensor networks.
Abstract: Wireless sensor networks have attracted growing attention in recent years due to their application in many fields, such as health monitoring and machinery fault diagnosis. However, they suffer from limited bandwidth compared to wired networks or cables, which affects their usage in data streaming tasks, especially when high sampling rates are required. Therefore, data compression becomes a relevant solution, not only to increase sampling rates but also to save valuable power. In this work, a Modified Discrete Cosine Transform (MDCT) based compression scheme is proposed to compress vibration signals. Some enhancements to the implementation of the MDCT are proposed to make it more suitable for the limited resources of wireless nodes. Three wireless nodes, made by the authors, are employed to host the compression scheme. The accuracy and efficiency of the proposed scheme are investigated by conducting several experimental tests on actual vibration signals generated by a machinery vibration simulator (MVS-1).

Posted Content
TL;DR: In this article, the unique real root of cos(x) = x, referred to as the Dottie number, is expressed as an iteral of cosine, and its properties are studied using the derivatives of iterals.
Abstract: The unique real root of cos(x) = x, recently referred to as the Dottie number, is expressed as an iteral of cosine. Using the derivatives of iterals, it is shown why this number is achieved starting from any real number, when the iterates of cosine successfully approach infinity, and how this affects the Maclaurin series of the iterals. Properties of the iterals of cosine and sine and their derivatives are considered. A C++ template for iteral is applied for computation of Julia sets.
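The convergence claim in this abstract is easy to observe numerically. The sketch below simply iterates the cosine from several starting points until successive values agree. It illustrates plain fixed-point iteration only, not the paper's iteral formalism or its C++ template; the function name `dottie`, the tolerance and the iteration cap are assumptions.

```python
import math

def dottie(x0=0.0, tol=1e-12, max_iter=1000):
    """Iterate x <- cos(x) from x0 until two successive iterates differ by < tol."""
    x = x0
    for _ in range(max_iter):
        nxt = math.cos(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x  # best estimate if the tolerance was not reached

# Different starting points land on the same fixed point of cos.
for start in (0.0, -3.0, 100.0):
    print(start, dottie(start))
```

After the first step every iterate lies in [-1, 1], where |sin(x)| < 1 makes the cosine a contraction, so the iteration settles at about 0.739085 regardless of the starting value.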

Proceedings ArticleDOI
01 Dec 2012
TL;DR: The proposed method solves the problem with no decline in quality by sequentially updating sums instead of integral images and by improving look-up tables, which accomplishes a one-pass approximation with much less workspace.
Abstract: This paper presents an approximate Gaussian filter which can run in one pass with high accuracy based on spectrum sparsity. This method is a modification of the cosine integral image (CII), which decomposes a filter kernel into a few cosine terms and convolves each cosine term with an input image in constant time per pixel by using integral images and look-up tables. However, these require much workspace and incur a high access cost. The proposed method solves the problem with no decline in quality by sequentially updating sums instead of integral images and by improving the look-up tables, which accomplishes a one-pass approximation with much less workspace. A specialization for tiny kernels is also discussed for faster calculation. Experiments on image filtering show that the proposed method can run nearly two times faster than CII, and also faster than convolution even with a small kernel.

Proceedings ArticleDOI
21 Mar 2012
TL;DR: It is concluded that the proposed framework has a potential use for noise filtering and risk management in quantitative finance and discrete cosine transform offers comparable performance to Karhunen-Loeve transform for decomposition of empirical correlation matrix of a given portfolio.
Abstract: We present a Toeplitz approximation to the symmetric empirical correlation matrix of asset returns by auto-regressive order one, AR(1), signal source modeling. The AR(1) approximation provides an analytical framework where the corresponding eigenvalues and eigenvectors are defined in closed form. Furthermore, we show that the discrete cosine transform (DCT) offers performance comparable to the Karhunen-Loeve transform (KLT) for the decomposition of the empirical correlation matrix of a given portfolio, while the former is significantly more efficient to implement. It is concluded that the proposed framework has potential use for noise filtering and risk management in quantitative finance.
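The near-equivalence of the DCT and the KLT for AR(1) sources can be illustrated with a small numerical sketch. It builds the Toeplitz AR(1) correlation matrix and measures how much of its energy lands on the diagonal after a DCT-II change of basis (the KLT would place all of it there). The matrix size, the correlation coefficient 0.9 and all helper names are assumptions chosen for illustration, not values from the paper.

```python
import math

def ar1_correlation(n, rho):
    """Toeplitz AR(1) correlation matrix: R[i][j] = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def dct2_matrix(n):
    """Orthonormal DCT-II matrix (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def diagonal_energy_fraction(m):
    """Share of the squared Frobenius norm that sits on the main diagonal."""
    total = sum(v * v for row in m for v in row)
    diag = sum(m[i][i] ** 2 for i in range(len(m)))
    return diag / total

R = ar1_correlation(8, 0.9)
C = dct2_matrix(8)
R_dct = matmul(matmul(C, R), transpose(C))   # correlation matrix in the DCT basis

print(diagonal_energy_fraction(R))       # original basis: mostly off-diagonal
print(diagonal_energy_fraction(R_dct))   # DCT basis: almost entirely diagonal
```

The diagonal of `R_dct` holds the variances of the DCT coefficients. The closer the matrix is to diagonal, the closer the fixed DCT basis is to the data-dependent KLT eigenvectors.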

Patent
18 Jan 2012
TL;DR: In this paper, an audio watermarking method based on the MP3 encoding principle is proposed, which can effectively resist common audio attacks, and also has good robustness on desynchronization attacks (such as shearing attacks, time scaling and the like).
Abstract: The invention discloses an audio watermarking method based on the MP3 encoding principle, which belongs to the technical field of multimedia digital watermarks. The method includes two processes, watermark embedding and watermark extraction. A watermark is embedded into low-frequency MDCT (Modified Discrete Cosine Transform) coefficients while the encoding process is carried out. In order to enhance the robustness of the watermark, an appropriate audio segment is chosen for embedding by taking into account the frequency-domain masking effect of the human auditory system; in order to resist desynchronization attacks, a synchronization mechanism is introduced. Because the watermark is injected during the encoding process itself, the method can effectively resist common audio attacks and is also robust to desynchronization attacks (such as shearing attacks, time scaling and the like). While taking audio watermark robustness into consideration, the method guarantees that the watermark is inaudible in the audio content, the computational complexity of the algorithm is low, and the method is easy to implement.

Journal ArticleDOI
TL;DR: Speaker Identification is proposed using the frequency distribution of various transforms like DFT, DCT, DST, Hartley, Walsh, Haar and Kekre transforms to extract the feature vectors in the training and the matching phases.
Abstract: In this paper, we propose Speaker Identification using the frequency distribution of various transforms like DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), Hartley, Walsh, Haar and Kekre transforms. The speech signal spoken by a particular speaker is converted into frequency domain by applying the different transform techniques. The distribution in the transform domain is utilized to extract the feature vectors in the training and the matching phases. The results obtained by using all the seven transform techniques have been analyzed and compared. It can be seen that DFT, DCT, DST and Hartley transform give comparatively similar results (Above 96%). The results obtained by using Haar and Kekre transform are very poor. The best results are obtained by using DFT (97.19% for a feature vector of size 40).

Patent
09 May 2012
TL;DR: In this article, a method for audio parameter balance and an audio parameter balancer is proposed, which comprises the following steps: converting an input time-domain audio signal into a frequency-domain modified discrete cosine transform (MDCT) spectral coefficient by MDCT conversion.
Abstract: The invention provides a method for audio parameter balance and an audio parameter balancer. The method comprises the following steps: converting an input time-domain audio signal into frequency-domain modified discrete cosine transform (MDCT) spectral coefficients by the MDCT; determining the range of MDCT spectral coefficients corresponding to each frequency band according to the center frequency and the bandwidth of each band set by a user, and grouping the MDCT spectral coefficients into frequency bands accordingly; applying the corresponding gain adjustment to the MDCT spectral coefficients in each frequency band according to the gain of each band set by the user; and converting the gain-adjusted MDCT spectral coefficients back into a time-domain audio signal by the inverse modified discrete cosine transform (IMDCT). With this method and the audio parameter balancer, the amount of computation required for the balance processing can be reduced; moreover, the realization is simple.
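The MDCT, per-band gain, IMDCT pipeline described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: it uses a direct O(N²) MDCT, a sine window applied at both analysis and synthesis, 50% overlap-add, and bands given directly as MDCT bin ranges rather than as center frequency and bandwidth. All function names and parameters are hypothetical.

```python
import math

def sine_window(length):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(frame):
    """Direct MDCT: 2N windowed samples -> N coefficients."""
    n = len(frame) // 2
    return [sum(frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n)) for k in range(n)]

def imdct(coeffs):
    """Direct IMDCT: N coefficients -> 2N time-aliased samples (scale 2/N)."""
    n = len(coeffs)
    return [(2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                            for k in range(n)) for t in range(2 * n)]

def equalize(signal, n, band_gains):
    """Apply linear gains to MDCT bins; band_gains = [(first_bin, end_bin, gain), ...]."""
    window = sine_window(2 * n)
    # Zero-pad so every input sample is covered by two overlapping frames.
    padded = [0.0] * n + list(signal) + [0.0] * (n + (-len(signal)) % n)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        frame = [padded[start + t] * window[t] for t in range(2 * n)]
        coeffs = mdct(frame)
        for first, end, gain in band_gains:
            for k in range(first, end):
                coeffs[k] *= gain
        # Synthesis window plus overlap-add cancels the time-domain aliasing.
        for t, v in enumerate(imdct(coeffs)):
            out[start + t] += v * window[t]
    return out[n:n + len(signal)]

signal = [math.sin(0.2 * i) + 0.5 * math.cos(1.3 * i) for i in range(40)]
flat = equalize(signal, 8, [])               # no gain change: reconstructs the input
boosted = equalize(signal, 8, [(0, 8, 2.0)])  # gain 2 on every bin of an 8-bin MDCT
print(max(abs(a - b) for a, b in zip(flat, signal)))
```

With frame size N and sampling rate fs, bin k is centered at (k + 0.5)·fs/(2N), which is how a user-specified center frequency and bandwidth could be mapped to a bin range. Gains that differ between bins leave some aliasing uncancelled, so practical designs usually keep the gain curve smooth across neighbouring bins.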

Patent
08 Feb 2012
TL;DR: In this paper, an audio-frequency fingerprint method of a compressed domain based on Zernike moment was proposed, which combines the frequency and time information of the modified discrete cosine transform coefficient (the MDCT coefficient) of the data of the compressed domain of an MP3.
Abstract: The invention belongs to the technical field of content-based music search, and in particular relates to a compressed-domain audio fingerprinting method based on Zernike moments. The invention combines the frequency and time information of the modified discrete cosine transform (MDCT) coefficients of MP3 compressed-domain data with several properties of Zernike moments: low-order moments represent the overall properties of a signal, high-order moments represent its detailed properties, and Zernike moments are invariant to rotation, scaling and translation. As a result, the compressed-domain audio fingerprint that is finally formed is robust against a variety of time-domain signal processing operations.

Journal ArticleDOI
TL;DR: Packet loss concealment methods for MP3 audio proved that both of the improvement methods for lower and higher dimensions effectively improved the subjective audio quality.
Abstract: This paper describes packet loss concealment methods for MP3 audio. The proposed methods are based on estimation of modified discrete cosine transform (MDCT) coefficients of the lost packets. The estimation of MDCT coefficients of lower dimensions is performed by switching two concealment methods: the sign correction method and the correlation-based method. The concealment methods are switched based on redundant side information calculated subband-by-subband for reducing MDCT prediction errors. Next, a method for improving estimation of MDCT coefficients of higher dimensions was proposed. The method estimates the absolute value and sign of an MDCT coefficient independently. The subjective evaluation experiment proved that both of the improvement methods for lower and higher dimensions effectively improved the subjective audio quality.

Journal Article
TL;DR: A watermarking algorithm based on the discrete wavelet transform and discrete cosine transform and has strong robustness to JPEG loss compression, shear, and noise interference attacks.
Abstract: A watermarking algorithm based on the discrete wavelet transform and the discrete cosine transform is proposed. A bar code is scrambled and used as the watermark. The carrier image is transformed by the discrete wavelet and discrete cosine transforms, and the scrambled watermark is then embedded. When the watermark is extracted, the image is adjusted using the Radon transform. The extracted watermark image is then corrected according to the features of the binary bar code. The experimental results show that the algorithm works well and that the embedded watermark is not easily perceived. It has strong robustness to JPEG lossy compression, shear, and noise interference attacks.

Journal ArticleDOI
TL;DR: This paper presents a generalized mixed-radix decimation-in-time (DIT) fast algorithm for computing the modified discrete cosine transform (MDCT) of the composite lengths N = 2 × q^m, m ≥ 2, where q is an odd positive integer.