
Showing papers on "Modified discrete cosine transform" published in 2021


Journal ArticleDOI
Juan Wen1, Hao Zeng1, Yuzhu Wang1, Shurong Liu1, Yiming Xue1 
TL;DR: A robust speech steganographic scheme based on Singular Value Decomposition (SVD) and Modified Discrete Cosine Transform (MDCT) that resists common robustness attacks and state-of-the-art steganalysis attacks while maintaining good imperceptibility.
Abstract: Speech is one of the essential ways of communication. The study of speech steganography provides great value in information security. To improve the imperceptibility and robustness of speech steganography, the characteristics of speech signals should be fully taken into account. In this paper, a robust speech steganographic scheme based on Singular Value Decomposition (SVD) and Modified Discrete Cosine Transform (MDCT) is proposed. First, a Voice Activity Detector (VAD) is used to detect voiced frames in the speech signal, and an MDCT with a Kaiser-Bessel-derived (KBD) window is performed on each frame. Then the MDCT coefficients are selected from a certain frequency range and divided into a pair of segments. The two largest singular values of the paired segments are modified according to their difference in value to embed the secret message. The thresholds are adaptively adjusted according to the largest singular values. Extensive experiments are carried out to compare the proposed method with three other methods in terms of imperceptibility, robustness, capacity, and security. The experimental results show that under the simulation parameters β = 320, Nk = 58, fl = 100 Hz, fh = 3 kHz, and α = 0.61, the proposed method has striking advantages in resisting common robustness attacks and state-of-the-art steganalysis attacks while maintaining good imperceptibility.
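The framing step described in this abstract (an MDCT with a KBD window applied to each frame) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the frame size, and the window parameter `alpha=4.0` are assumptions chosen for illustration, and the VAD and SVD embedding stages are omitted. The sketch only shows that a KBD window satisfies the Princen-Bradley condition, so overlap-adding windowed inverse transforms recovers the input.

```python
import math

def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kbd_window(length, alpha=4.0):
    """Kaiser-Bessel-derived window of even length (alpha is an assumed value)."""
    half = length // 2
    beta = math.pi * alpha
    kaiser = [bessel_i0(beta * math.sqrt(max(0.0, 1.0 - (2.0 * j / half - 1.0) ** 2)))
              for j in range(half + 1)]
    total = sum(kaiser)
    rising, acc = [], 0.0
    for j in range(half):
        acc += kaiser[j]
        rising.append(math.sqrt(acc / total))
    return rising + rising[::-1]  # symmetric window

def mdct(frame, window):
    """Forward MDCT: 2*m windowed samples -> m coefficients."""
    m = len(frame) // 2
    return [sum(window[n] * frame[n]
                * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                for n in range(2 * m))
            for k in range(m)]

def imdct(coeffs, window):
    """Inverse MDCT with synthesis windowing: m coefficients -> 2*m samples."""
    m = len(coeffs)
    return [2.0 / m * window[n]
            * sum(coeffs[k] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                  for k in range(m))
            for n in range(2 * m)]

def mdct_round_trip(signal, m, alpha=4.0):
    """Analyse and resynthesise a signal whose length is a multiple of m."""
    window = kbd_window(2 * m, alpha)
    padded = [0.0] * m + list(signal) + [0.0] * m  # zero-pad so every sample is covered twice
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * m + 1, m):  # hop size m (50% overlap)
        rec = imdct(mdct(padded[start:start + 2 * m], window), window)
        for n, value in enumerate(rec):
            out[start + n] += value  # overlap-add cancels the time-domain aliasing
    return out[m:m + len(signal)]
```

In a scheme like the one described, the coefficients returned by `mdct` would be the values selected by frequency range and modified before resynthesis; that embedding rule is not reproduced here.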

5 citations


Proceedings ArticleDOI
05 Apr 2021
TL;DR: In this paper, a deep convolutional neural network (CNN) is used to detect audio manipulations, namely pitch-shifted and amplitude-varied signals, in order to recognize forgery in voice recordings.
Abstract: Attestation of audio signals for recognition of forgery in voice is a challenging task. In this research work, a deep convolutional neural network (CNN) is utilized to detect audio operations, i.e., pitch-shifted and amplitude-varied signals. Short-time Fourier transform (STFT) and Modified Discrete Cosine Transform (MDCT) features are chosen for audio processing, and their plotted patterns are fed to the CNN. Experimental results show that our model can successfully distinguish tampered signals to facilitate audio authentication on the TIMIT dataset. The proposed CNN architecture can distinguish spoofed voices with shifted pitch with an accuracy of 97.55% and with varied amplitude with an accuracy of 98.85%.

4 citations


Posted Content
TL;DR: In this article, a deep convolutional GAN is proposed to generate high-quality audio samples with long-range coherence using a Modified Discrete Cosine Transform (MDCT) data representation.
Abstract: We present a deep convolutional GAN which leverages techniques from MP3/Vorbis audio compression to produce long, high-quality audio samples with long-range coherence. The model uses a Modified Discrete Cosine Transform (MDCT) data representation, which includes all phase information. Phase generation is hence an integral part of the model. We leverage the auditory masking and psychoacoustic perception limit of the human ear to widen the true distribution and stabilize the training process. The model architecture is a deep 2D convolutional network, where each subsequent generator model block increases the resolution along the time axis and adds a higher octave along the frequency axis. The deeper layers are connected with all parts of the output and have the context of the full track. This enables generation of samples which exhibit long-range coherence. We use MP3net to create 95 s stereo tracks with a 22 kHz sample rate after training for 250 h on a single Cloud TPUv2. An additional benefit of the CNN-based model architecture is that generation of new songs is almost instantaneous.

2 citations


Journal ArticleDOI
TL;DR: A method for detecting Advanced Audio Coding (AAC) compression in suspicious WAV files, in which the variance of the Modified Discrete Cosine Transform (MDCT) characterises four compression bitrates. The method can detect previous AAC compression and can potentially be used when it is infeasible to recover the suspicious signal completely.
Abstract: Audio files are frequent targets of malicious users who seek illegal profit trading with fake-quality content. For increasing the confidence in the integrity of audio files, the detection of fake-q...

1 citation


Journal ArticleDOI
TL;DR: It is shown that the hybrid transform achieves a greater reduction in PAPR and a lower Bit Error Rate in the OFDM communication system when compared with basic OFDM, the Hadamard transform, and the companding technique.
Abstract: The performance of orthogonal frequency-division multiplexing (OFDM) communication systems is affected by the peak-to-average power ratio (PAPR). This research work uses a hybrid transform to reduce the PAPR. In recent years, OFDM has become more prevalent in research, particularly in wireless communication. OFDM systems are found to have a high PAPR, which leads to nonlinear distortions that result in low power efficiency. Hence there is a need to minimize the PAPR. The Hadamard transform is a commonly used technique for PAPR reduction. However, greater efficiency can be achieved by using a hybrid approach to reduce the PAPR of OFDM signals. The Modified Discrete Cosine Transform (MDCT) combined with a nonlinear companding technique, described as a hybrid transform, is suggested to minimize the PAPR. It is shown that the hybrid transform achieves a greater reduction in PAPR and a lower Bit Error Rate (BER) in the OFDM communication system when compared with basic OFDM, the Hadamard transform, and the companding technique.
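The quantity at the centre of this abstract can be made concrete with a short sketch. PAPR is the ratio of peak instantaneous power to mean power of the time-domain OFDM symbol, and a nonlinear compander lowers it by boosting small amplitudes relative to the peak. The abstract does not say which companding law is used, so the μ-law curve below, the value `mu=255.0`, and all function names are assumptions for illustration; the MDCT stage of the hybrid scheme is not reproduced.

```python
import cmath
import math

def idft(symbols):
    """Inverse DFT: maps subcarrier symbols to one time-domain OFDM symbol."""
    n = len(symbols)
    return [sum(s * cmath.exp(2j * math.pi * k * t / n) for k, s in enumerate(symbols)) / n
            for t in range(n)]

def papr_db(samples):
    """Peak-to-average power ratio in decibels."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))

def mu_law_compress(samples, mu=255.0):
    """Compress magnitudes with a mu-law curve, keeping phase and peak unchanged."""
    peak = max(abs(s) for s in samples)
    out = []
    for s in samples:
        mag = abs(s)
        if mag == 0.0:
            out.append(0j)
            continue
        new_mag = peak * math.log(1.0 + mu * mag / peak) / math.log(1.0 + mu)
        out.append(s / mag * new_mag)
    return out, peak

def mu_law_expand(samples, peak, mu=255.0):
    """Invert mu_law_compress at the receiver (peak must be known)."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag == 0.0:
            out.append(0j)
            continue
        old_mag = peak / mu * ((1.0 + mu) ** (mag / peak) - 1.0)
        out.append(s / mag * old_mag)
    return out

# Deterministic QPSK-like symbols on 16 subcarriers (illustrative data only).
symbols = [complex(1 if (i * 7) % 3 else -1, 1 if (i * 5) % 4 < 2 else -1) for i in range(16)]
time_signal = idft(symbols)
companded, peak = mu_law_compress(time_signal)
restored = mu_law_expand(companded, peak)
print(f"PAPR before: {papr_db(time_signal):.2f} dB, after: {papr_db(companded):.2f} dB")
```

Because the μ-law curve never lowers a magnitude and leaves the peak fixed, the mean power can only rise, so the companded PAPR is never higher than the original. The trade-off is that receiver noise is reshaped by the expansion step.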

1 citation


Journal ArticleDOI
TL;DR: In this article, a Frequency Domain Joint Harmonics Prediction (FDJHP) method was proposed for low-delay transform domain general audio coders, which operates directly in the Modified Discrete Cosine Transform (MDCT) domain and can enhance the coding efficiency.
Abstract: In this paper we propose a long-term prediction method for low-delay transform-domain general audio coders. This Frequency Domain Joint Harmonics Prediction (FDJHP) method operates directly in the Modified Discrete Cosine Transform (MDCT) domain and can enhance the coding efficiency, even under very low frequency resolutions. We compare this new method with state-of-the-art MDCT-based methods by analyzing bitrate savings and by a listening test using test signals with strong harmonic components. The results indicate that it outperforms an existing method, which also operates directly in the frequency domain. Additionally, we show how it can be combined with the existing techniques into an adaptive system, where the different methods can complement each other.

Patent
11 Jan 2021
TL;DR: Synthesis based on a modified discrete cosine transform (MDCT) converts a frequency-domain (FD) representation of an information signal, or a processed version of it, into a time-domain (TD) representation using a synthesis window function (90) that has a meandering section (94) intersecting a linear function at at least four points.
Abstract: FIELD: computer equipment. SUBSTANCE: the invention relates to computer engineering for processing audio data. The technical result is achieved by performing synthesis based on a modified discrete cosine transform (MDCT) to convert an FD representation of an information signal, or its processed version, into a time-domain representation (TD representation) using a synthesis window function (90) having a meandering section (94) that intersects a linear function at at least four points (No. 1, No. 2, No. 3, No. 4), wherein the synthesis window function (90, 290) is defined such that, in the meandering section, it exceeds the linear function in a first interval between the first and second intersection points, is lower than the linear function in a second interval between the second and third intersection points, and exceeds the linear function in a third interval between the third and fourth intersection points. EFFECT: the technical result consists in excluding a suboptimal frequency characteristic by eliminating discontinuities in the differentiability of the information signal. 43 cl, 33 dwg