Showing papers on "Modified discrete cosine transform" published in 2019


Proceedings ArticleDOI
01 May 2019
TL;DR: In this paper, a trainable adaptive window switching (AWS) method was proposed for speech enhancement in the modified discrete cosine transform domain, and the windowing function of each time frame was manipulated using a DNN depending on the input signal.
Abstract: This study proposes a trainable adaptive window switching (AWS) method and applies it to a deep neural network (DNN) for speech enhancement in the modified discrete cosine transform domain. Time-frequency (T-F) mask processing in the short-time Fourier transform (STFT) domain is a typical speech enhancement method. To recover the target signal precisely, DNN-based short-time frequency transforms have recently been investigated and used instead of the STFT. However, since such a fixed-resolution short-time frequency transform has a T-F resolution problem arising from the uncertainty principle, not only the short-time frequency transform but also the length of the windowing function should be optimized. To overcome this problem, we incorporate AWS into the speech enhancement procedure, and the windowing function of each time frame is manipulated using a DNN depending on the input signal. We confirmed that the proposed method achieved a higher signal-to-distortion ratio than conventional speech enhancement methods in fixed-resolution frequency domains.
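The fixed-resolution baseline that the abstract contrasts with AWS can be sketched as a windowed MDCT with overlap-add. This is a minimal sketch, not the authors' method: the sine window, the function names, and the fixed half-length N are assumptions, and the DNN-driven per-frame window switching is not reproduced.

```python
import math

def sine_window(N):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def _basis(n, k, N):
    # MDCT cosine kernel with the usual phase offset of N/2 + 1/2.
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def mdct(frame, window):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    N = len(frame) // 2
    return [sum(window[n] * frame[n] * _basis(n, k, N) for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    N = len(coeffs)
    return [window[n] * (2.0 / N) * sum(coeffs[k] * _basis(n, k, N) for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(signal, N):
    """Transform frames with hop N, invert them, and overlap-add the result."""
    window = sine_window(N)
    tail = (-len(signal)) % N
    # Pad with N zeros on each side so every sample is covered by two frames.
    padded = [0.0] * N + list(signal) + [0.0] * (tail + N)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        rec = imdct(mdct(padded[start:start + 2 * N], window), window)
        for n, value in enumerate(rec):
            out[start + n] += value  # time-domain aliasing cancels here
    return out[N:N + len(signal)]
```

Calling analysis_synthesis with N = 4 and with N = 8 reconstructs the same signal. The window length therefore changes only the time-frequency resolution of the coefficients, which is the quantity a window-switching scheme would adapt per frame.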

10 citations


Proceedings ArticleDOI
02 Jul 2019
TL;DR: A universal joint embedding distortion function (JED) is proposed to improve the undetectability and imperceptibility of MP3 steganography, which can be applied to Huffman codeword mapping (HCM) and sign bit flipping (SBF).
Abstract: In this paper, a universal joint embedding distortion function (JED) is proposed to improve the undetectability and imperceptibility of MP3 steganography, which can be applied to Huffman codeword mapping (HCM) and sign bit flipping (SBF). Content-aware and statistical distortions are synthetically modeled to formulate the atom modification of the quantified modified discrete cosine transform (QMDCT) coefficients. On the one hand, to retain auditory imperceptibility, the absolute threshold of hearing is employed to measure the auditory sensitivity of each QMDCT coefficient. On the other hand, considering that most existing universal MP3 steganalysis features are designed based on correlations, the forward and backward transition probabilities are utilized to characterize the correlations between adjacent QMDCT coefficients. Furthermore, we present an implementation of JED in the sign-bit domain. Experimental results demonstrate that our method is able to achieve higher embedding capacity and better imperceptibility. The detection accuracy of the proposed scheme is about 75% at a bitrate of 320 kbps and an embedding rate of 11 kbit/s, which is 9.54% to 16.94% lower than that of existing MP3 steganographic methods.
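As an illustration of the absolute threshold of hearing mentioned in the abstract, the sketch below evaluates the widely used Terhardt approximation at the centre frequency of each coefficient. It is a hedged example only: the choice of this approximation, the 576-coefficient granule, the 44.1 kHz sample rate, and all function names are assumptions, and the paper's actual distortion function is not reproduced.

```python
import math

def absolute_threshold_of_hearing_db(freq_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL."""
    f = freq_hz / 1000.0  # the formula is expressed in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def coefficient_center_frequencies(num_coeffs, sample_rate_hz):
    """Centre frequency of each of num_coeffs uniformly spaced MDCT bins."""
    return [(k + 0.5) * sample_rate_hz / (2.0 * num_coeffs)
            for k in range(num_coeffs)]

def ath_per_coefficient(num_coeffs=576, sample_rate_hz=44100):
    """Threshold in quiet for every coefficient of one (assumed) granule."""
    return [absolute_threshold_of_hearing_db(f)
            for f in coefficient_center_frequencies(num_coeffs, sample_rate_hz)]
```

Coefficients whose threshold is high (very low and very high frequencies) tolerate larger modifications before they become audible, so a content-aware distortion could assign them a lower cost.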

7 citations


Proceedings ArticleDOI
07 Mar 2019
TL;DR: This paper proposes an algorithm for the compression of audio signals using the Discrete Cosine Transform (DCT) with temporal auditory masking (TAM) and evaluates it using its compression ratio, peak signal-to-noise ratio (PSNR), and transmission delay (TD).
Abstract: In recent years, audio compression has become a very popular area of research. Today, innovations in audio signal processing are used in numerous applications, which include advanced audio coding (AAC), perceptual audio coding schemes (MP3 encoding), internet radio and other lossless audio coding schemes. This paper proposes an algorithm for the compression of audio signals using the Discrete Cosine Transform (DCT) with temporal auditory masking (TAM). Furthermore, we have developed a system for audio data compression and evaluated the proposed algorithm using its compression ratio (CR), peak signal-to-noise ratio (PSNR) and transmission delay (TD). The experimental results show that the proposed algorithm gives a compression ratio (CR) of 4:1 relative to the original signal. All the simulations are carried out using the Octave signal processing toolkit, and the code is written in Octave with the C programming language.
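The evaluation metrics named in the abstract can be illustrated with a toy DCT coder. This is a sketch under stated assumptions, not the paper's algorithm: coefficients are simply kept or zeroed by magnitude, temporal auditory masking is not modelled, and the function names are hypothetical.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a list of samples."""
    N = len(x)
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                               for n in range(N)))
    return out

def idct2(X):
    """Inverse of dct2 (orthonormal DCT-III)."""
    N = len(X)
    out = []
    for n in range(N):
        s = X[0] * math.sqrt(1.0 / N)
        s += sum(math.sqrt(2.0 / N) * X[k] * math.cos(math.pi * (n + 0.5) * k / N)
                 for k in range(1, N))
        out.append(s)
    return out

def keep_largest(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]

def psnr_db(original, reconstructed, peak):
    """Peak signal-to-noise ratio in dB for a given peak amplitude."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

Keeping one coefficient in four gives a nominal 4:1 ratio in coefficient count only. A real coder would also have to account for quantization, entropy coding, and the side information needed to locate the kept coefficients.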

7 citations


Proceedings ArticleDOI
01 Nov 2019
TL;DR: This paper investigates the performance of the Smoothed-l0 norm (SL0) algorithm in reconstructing an audio signal and compares its performance to the two most used algorithms in audio CS: l1-magic and Orthogonal Matching Pursuit.
Abstract: Compressive Sensing (CS) is a new approach in signal processing that aims at acquiring and compressing a signal simultaneously. CS suggests that if a signal is sparse, the original signal can be reconstructed from a few random measurements using reconstruction algorithms. In this paper, we investigate the performance of the Smoothed-l0 norm (SL0) algorithm in reconstructing an audio signal and compare its performance to the two most used algorithms in audio CS: l1-magic and Orthogonal Matching Pursuit (OMP). This study adopts the Modified Discrete Cosine Transform (MDCT) for the sparse representation and a random Gaussian matrix as the measurement matrix. These algorithms are evaluated using signal-to-noise ratio (SNR) and computational complexity for different numbers of measurements. Results show that the SL0 algorithm performs better in both reconstruction quality and computational complexity.
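One of the baselines named in the abstract, Orthogonal Matching Pursuit, can be sketched in a few lines. This is a generic textbook OMP under stated assumptions, not the paper's experimental setup: the dictionary is passed as a list of columns, the helper names are hypothetical, and the normal-equation solver is chosen for brevity rather than numerical robustness. SL0 and l1-magic are not shown.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def least_squares(columns, y):
    """Solve min ||sum_j c[j] * columns[j] - y|| through the normal equations."""
    k = len(columns)
    gram = [[dot(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    rhs = [dot(columns[i], y) for i in range(k)]
    for col in range(k):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(gram[r][col]))
        gram[col], gram[pivot] = gram[pivot], gram[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, k):
            factor = gram[row][col] / gram[col][col]
            for c in range(col, k):
                gram[row][c] -= factor * gram[col][c]
            rhs[row] -= factor * rhs[col]
    coeffs = [0.0] * k
    for row in range(k - 1, -1, -1):  # back substitution
        acc = rhs[row] - sum(gram[row][c] * coeffs[c] for c in range(row + 1, k))
        coeffs[row] = acc / gram[row][row]
    return coeffs

def omp(columns, y, sparsity, tol=1e-10):
    """Greedy recovery of a vector with at most `sparsity` non-zeros."""
    support, coeffs, residual = [], [], list(y)
    for _ in range(sparsity):
        if math.sqrt(dot(residual, residual)) <= tol:
            break
        candidates = [j for j in range(len(columns)) if j not in support]
        # Pick the atom most correlated with the residual (normalised).
        best = max(candidates, key=lambda j: abs(dot(columns[j], residual))
                   / math.sqrt(dot(columns[j], columns[j])))
        support.append(best)
        chosen = [columns[j] for j in support]
        coeffs = least_squares(chosen, y)  # re-fit all selected atoms
        residual = [y[i] - sum(c * col[i] for c, col in zip(coeffs, chosen))
                    for i in range(len(y))]
    x = [0.0] * len(columns)
    for j, c in zip(support, coeffs):
        x[j] = c
    return x

def gaussian_columns(num_measurements, num_atoms, seed=0):
    """Columns of a random Gaussian matrix (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_measurements)]
            for _ in range(num_atoms)]
```

For audio, the columns would be those of the product of the Gaussian measurement matrix and the MDCT synthesis basis, so that the recovered vector holds the sparse MDCT coefficients.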

4 citations


Journal ArticleDOI
TL;DR: Improvements to the scalable wideband speech codec based on the iLBC are presented, employing the wavelet packet transform (WPT) instead of the modified discrete cosine transform (MDCT) to enhance the quality, and the proposed codec outperforms G.729.1 at most bit rates according to the objective quality.

3 citations


Patent
13 Aug 2019
TL;DR: In this article, a universal steganalysis method and system of audio based on spectrograms and deep residual network is presented, which includes extracting spectrogram features of a recompressed original audio signal by comprehensively considering common MDCT features in audio compression standards.
Abstract: The invention discloses a universal steganalysis method and system for audio based on spectrograms and a deep residual network. Addressing the current situation in which existing steganographic algorithms based on audio compression standards perform steganography by modifying different audio compression parameters while no universal steganalysis algorithm exists, the method of the invention includes extracting spectrogram features of a recompressed original audio signal by comprehensively considering common MDCT (modified discrete cosine transform) features in AAC (advanced audio coding) and other compression coding standards, mining inherent distribution features of the audio signal via the deep residual network S-ResNet, and extracting classification features to construct a universal audio steganalysis unit. The advantages of the method and system are that they are not limited to a single coding standard or parameter domain and that they offer good universality and good steganalysis test performance.

1 citation


Book ChapterDOI
18 Apr 2019

1 citation


Patent
07 Nov 2019
TL;DR: In this paper, sign changes in the modified discrete cosine transform (MDCT) coefficients in the received frames are analyzed by determining the number of sign changes between the corresponding MDCT coefficients in sub-vectors of successive error-free frames not containing the transient process.
Abstract: FIELD: physics. SUBSTANCE: the invention relates to concealing frame loss in an audio decoder. For this purpose, sign changes in the modified discrete cosine transform (MDCT) coefficients of the received frames are analyzed by determining the number of sign changes between corresponding MDCT coefficients in the sub-vectors of successive error-free frames that do not contain a transient, each sub-vector comprising the coefficients of a frequency band; accumulating the number of sign changes in corresponding bands of consecutive frames; and reconstructing the lost frame by copying the MDCT coefficients from the preceding frame, but with the signs of the MDCT coefficients reversed in the bands whose accumulated number of sign changes exceeds a predetermined threshold. EFFECT: improved concealment of frame errors without transmitting additional side parameters or introducing the additional delay required by interpolation. 10 cl, 16 dwg, 1 tbl
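The three steps in the abstract (count, accumulate, copy with sign reversal) can be sketched as follows. This is a simplified reading of the abstract, not the patented procedure: the band layout, the number of history frames, the threshold value, and the function names are assumptions, and transient detection is left out.

```python
def count_sign_changes(prev_frame, curr_frame, bands):
    """Per band, count coefficients whose sign differs between two frames."""
    return [sum(1 for k in range(lo, hi) if prev_frame[k] * curr_frame[k] < 0)
            for lo, hi in bands]

def conceal_lost_frame(good_frames, bands, threshold):
    """Rebuild a lost frame from the preceding error-free frames.

    good_frames: list of MDCT coefficient lists, oldest first.
    bands:       list of (start, stop) index pairs, stop exclusive.
    threshold:   a band is sign-reversed if its accumulated count exceeds this.
    """
    accumulated = [0] * len(bands)
    for prev_frame, curr_frame in zip(good_frames, good_frames[1:]):
        counts = count_sign_changes(prev_frame, curr_frame, bands)
        for b, count in enumerate(counts):
            accumulated[b] += count
    last = good_frames[-1]
    concealed = list(last)  # start from a plain copy of the previous frame
    for b, (lo, hi) in enumerate(bands):
        if accumulated[b] > threshold:
            for k in range(lo, hi):
                concealed[k] = -last[k]  # reverse signs in this band
    return concealed
```

With frames [1, 1, 1, 1], [-1, -1, 1, 1], [1, 1, 1, 1], bands (0, 2) and (2, 4), and a threshold of 3, the first band accumulates four sign changes and is reversed, while the second band is copied unchanged.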

Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this paper, power-complementary windows are introduced into IntMDCT to achieve a lower delay in samples, and a method that uses denser inputs to reduce quantization noise can contribute to saving bit rate.
Abstract: The Modified Discrete Cosine Transform (MDCT) is widely used in current audio codecs for its excellent frequency selectivity under critical sampling. To avoid the lossy coding that results from floating-point MDCT coefficients computed from integer input samples, the Integer Modified Discrete Cosine Transform (IntMDCT), a derivative of the MDCT, is used in MPEG-4 SLS to generate integer outputs, which makes lossless coding possible. However, IntMDCT also causes a decoder delay of one frame length in samples and large quantization noise. In this paper, power-complementary windows are introduced into IntMDCT to achieve a lower delay in samples. A method that uses denser inputs to reduce quantization noise can contribute to saving bit rate.
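What "power-complementary" means for an MDCT window can be checked numerically. The sketch below assumes the usual definition, w[n]^2 + w[n+N]^2 = 1 for a window of length 2N, and tests it on the sine window and the Vorbis window, with a Hann-shaped window as a counterexample. The windows and function names are illustrative choices; the windows used in the paper and the IntMDCT lifting steps are not reproduced.

```python
import math

def sine_window(N):
    """Sine window of length 2N."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def vorbis_window(N):
    """Vorbis power-complementary window of length 2N."""
    return [math.sin(math.pi / 2 * math.sin(math.pi * (n + 0.5) / (2 * N)) ** 2)
            for n in range(2 * N)]

def hann_window(N):
    """Hann-shaped window of length 2N (amplitude-, not power-complementary)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) ** 2 for n in range(2 * N)]

def power_complementary_error(window):
    """Largest deviation of w[n]^2 + w[n+N]^2 from 1 over the overlap region."""
    N = len(window) // 2
    return max(abs(window[n] ** 2 + window[n + N] ** 2 - 1.0) for n in range(N))
```

The error is at rounding level for the sine and Vorbis windows and close to 0.5 for the Hann-shaped one, which is why the latter cannot serve as both analysis and synthesis window in a perfectly reconstructing MDCT.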