Book Chapter DOI

Hiding Audio in Images: A Deep Learning Approach.

TL;DR: A deep generative model consisting of an auto-encoder generator and a discriminator is trained to embed the message, while a dedicated extractor network with an audio discriminator is trained to recover the hidden message from the encoded host signal.
Abstract: In this work, we propose an end-to-end trainable Generative Adversarial Network (GAN) model engineered to hide audio data in images. Owing to the non-stationary nature of audio signals and the lack of powerful tools, hiding audio in images has not been well explored. We devised a deep generative model consisting of an auto-encoder as the generator, together with a discriminator, trained to embed the message, while a dedicated extractor network with an audio discriminator is trained to recover the hidden message from the encoded host signal. The encoded image is subjected to a few common attacks, and we establish that the hidden message is not destroyed, making the proposed method robust to blurring, rotation, noise, and cropping. A remarkable feature of our method is that it can be trained to recover from various attacks, and hence it can also be used for watermarking.
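For intuition, the classical baseline that learned methods like this aim to improve on is least-significant-bit (LSB) substitution. The sketch below is not the paper's GAN method; it is a minimal illustrative baseline (all function names are ours) that hides raw audio bytes in an image's pixel LSBs:

```python
import numpy as np

def embed_lsb(cover, payload_bits):
    """Write payload bits into the least significant bit of the first pixels."""
    flat = cover.flatten()  # flatten() copies, so the cover is untouched
    n = payload_bits.size
    flat[:n] = (flat[:n] & 0xFE) | payload_bits  # clear LSB, then set it
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bits):
    """Read back the first n_bits LSBs of the stego image."""
    return stego.flatten()[:n_bits] & 1

# round-trip demo: hide 100 bytes of "audio" in a 64x64 grayscale cover
rng = np.random.default_rng(1)
cover = rng.integers(0, 256, (64, 64), dtype=np.uint8)
audio = rng.integers(0, 256, 100, dtype=np.uint8)
bits = np.unpackbits(audio)
stego = embed_lsb(cover, bits)
recovered = np.packbits(extract_lsb(stego, bits.size))
```

Unlike the GAN approach, this baseline changes each used pixel by at most 1 but is trivially destroyed by blurring, cropping, or re-compression — exactly the attacks the learned model is trained to survive.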
Citations
Journal Article DOI
TL;DR: The main focus of the work is to generate realistic images that do not exist in reality but are synthesised from random noise by the proposed model.
Abstract: Generative art uses digital technologies as part of the generative or creative process. With the advent of digital currencies and NFTs (Non-Fungible Tokens), the demand for digital art is growing rapidly. In this manuscript, we advocate using deep generative networks with adversarial training for stable and varied art generation. The work focuses on the Deep Convolutional Generative Adversarial Network (DC-GAN) and explores techniques to address the common pitfalls of GAN training. We compare various DC-GAN architectures and designs to arrive at a recommendable design choice for stable and realistic generation. The main goal is to generate realistic images that do not exist in reality but are synthesised from random noise by the proposed model. We provide visual results of generated animal-face images (some showing a blend of species) along with recommendations for training, architecture, and design choices. We also show how training-image preprocessing plays a massive role in GAN training.
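The preprocessing point generalises: DC-GAN generators typically end in a tanh layer, so training images must be scaled into [-1, 1] to match the output range. A minimal numpy sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def preprocess(img_uint8):
    """Scale 8-bit pixels into [-1, 1], the range of a tanh generator output."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

def deprocess(x):
    """Map generator output in [-1, 1] back to displayable 8-bit pixels."""
    return np.rint(np.clip((x + 1.0) * 127.5, 0.0, 255.0)).astype(np.uint8)
```

Mismatched ranges (e.g. feeding [0, 255] images against a tanh output) are one of the common training pitfalls the abstract alludes to.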

1 citation

Book Chapter DOI
01 Jan 2023
TL;DR: In this article, the authors propose end-to-end trainable models of Generative Adversarial Networks (GAN) for hiding video data inside images, a relatively new topic that, to the best of their knowledge, has never been attempted before.
Abstract: This work proposes end-to-end trainable models of Generative Adversarial Networks (GAN) for hiding video data inside images. Hiding video inside images is a relatively new topic and has never been attempted earlier to our best knowledge. We propose two adversarial models that hide video data inside images: a base model with Recurrent Neural Networks and a novel model with 3D-spatiotemporal Convolutional Neural Networks. Both the models have two distinct networks: (1) An embedder to extract features from the time variate video data and inject them into the deep latent representations of the image. (2) An extractor that reverse-engineers the embedder function to extract the hidden data inside the encoded image. A multi-discriminator GAN framework with multi-objective training for multimedia hiding is one of the novel contributions of this work.
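A 3D spatiotemporal convolution differs from its 2D counterpart only in sliding the kernel along time as well as space, which is how the second model extracts features from time-variate video. A naive single-channel numpy sketch of the operation (loop-based for clarity; this is the generic operation, not the paper's network):

```python
import numpy as np

def conv3d_valid(video, kernel):
    """Naive single-channel 3D cross-correlation over a (T, H, W) volume."""
    T, H, W = video.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):          # slide along time
        for j in range(out.shape[1]):      # slide along height
            for k in range(out.shape[2]):  # slide along width
                out[i, j, k] = np.sum(video[i:i + t, j:j + h, k:k + w] * kernel)
    return out
```

Each output value summarises a small spatiotemporal patch, so motion across frames is captured directly, in contrast to the recurrent baseline that processes frames sequentially.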
Book Chapter DOI
01 Jan 2023
TL;DR: In this paper, the authors proposed a novel training process for a pre-existing architecture of GANs to enable task-sharing or multi-tasking of sub-modules.
Abstract: This manuscript proposes a novel training process for a pre-existing architecture of GANs to enable task-sharing or multi-tasking of sub-modules. We explore the application of data hiding to analyse the model's performance. Share-GAN consists of an embedder network (to encode secret messages into a cover), a U-Net autoencoder comprising an encoder and a decoder. The embedder's encoder network is custom-trained to act as an extractor network (to extract the hidden message from the encoded image). The multi-tasking of the embedder's encoder has, to our knowledge, never been explored prior to this work. The encoded image is subjected to multiple attacks to analyse the noise sensitivity of the model. The proposed method shows inherent robustness towards attacks like Gaussian blurring, rotation, noise, and cropping. However, the model can be trained on any possible attacks to reduce noise sensitivity further. In this manuscript, we considered images as both messages and containers; however, the method can be extended to any combination of multimedia data.
Book Chapter DOI
01 Jan 2023
TL;DR: In this paper, an end-to-end trainable model, VAE-GAN, was proposed to hide a message (image) inside a container (image); it consists of an embedder network (to hide the message inside the container) and an extractor network (to extract the hidden message from the encoded image).
Abstract: This manuscript proposes an end-to-end trainable model, VAE-GAN, engineered to hide a message (image) inside a container (image). The model consists of an embedder network (to hide a message inside the container) and an extractor network (to extract the hidden message from the encoded image). In the proposed method, we employ the generative power of a variational autoencoder with adversarial training to embed images. At the extractor, a vanilla convolutional network with adversarial training provided the best results with clean extracted images. To analyse the noise sensitivity of the model, the encoded image is subjected to multiple attacks, and it is established that the proposed method is inherently robust to attacks like Gaussian blurring, rotation, noise, and cropping. However, the model can be trained on any possible attacks to reduce noise sensitivity further. In this manuscript, we explore the application of hiding images inside images, but the method can be extended to other combinations of multimedia data.
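The variational half of such a model relies on the reparameterization trick to keep sampling differentiable: a latent code z is computed as a deterministic function of the encoder's mean and log-variance plus external noise, so gradients flow through mu and log_var. A generic numpy sketch of the trick (standard VAE machinery, not the paper's code):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Randomness lives only in eps, so mu and log_var stay differentiable.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

In a full VAE-GAN, z produced this way is decoded into the encoded image, with adversarial losses applied on top of the usual reconstruction and KL terms.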
References
More filters
Journal Article DOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
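The minimax game described above can be written compactly, and makes explicit why D converges to ½:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)]
  + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]
```

For a fixed G, the optimal discriminator is $D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$; at the unique solution the generator matches the data distribution, $p_g = p_{\text{data}}$, so $D^*(x) = \tfrac{1}{2}$ everywhere.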

38,211 citations

Proceedings Article DOI
21 Jul 2017
TL;DR: It is concluded that the NTIRE 2017 challenge pushes the state-of-the-art in single-image super-resolution, reaching the best results to date on the popular Set5, Set14, B100, Urban100 datasets and on the authors' newly proposed DIV2K.
Abstract: This paper introduces a novel large dataset for example-based single image super-resolution and studies the state-of-the-art as emerged from the NTIRE 2017 challenge. The challenge is the first challenge of its kind, with 6 competitions, hundreds of participants and tens of proposed solutions. Our newly collected DIVerse 2K resolution image dataset (DIV2K) was employed by the challenge. In our study we compare the solutions from the challenge to a set of representative methods from the literature and evaluate them using diverse measures on our proposed DIV2K dataset. Moreover, we conduct a number of experiments and draw conclusions on several topics of interest. We conclude that the NTIRE 2017 challenge pushes the state-of-the-art in single-image super-resolution, reaching the best results to date on the popular Set5, Set14, B100, Urban100 datasets and on our newly proposed DIV2K.

2,388 citations

Proceedings Article
01 Jan 2016
TL;DR: Deep convolutional generative adversarial networks (DCGANs) as discussed by the authors learn a hierarchy of representations from object parts to scenes in both the generator and discriminator for unsupervised learning.
Abstract: In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.

2,205 citations

Journal Article DOI
TL;DR: An algorithm to estimate a signal from its modified short-time Fourier transform (STFT) by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT magnitude is presented.
Abstract: In this paper, we present an algorithm to estimate a signal from its modified short-time Fourier transform (STFT). This algorithm is computationally simple and is obtained by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT. Using this algorithm, we also develop an iterative algorithm to estimate a signal from its modified STFT magnitude. The iterative algorithm is shown to decrease, in each iteration, the mean squared error between the STFT magnitude of the estimated signal and the modified STFT magnitude. The major computation involved in the iterative algorithm is the discrete Fourier transform (DFT) computation, and the algorithm appears to be real-time implementable with current hardware technology. The algorithm developed in this paper has been applied to the time-scale modification of speech. The resulting system generates very high-quality speech, and appears to be better in performance than any existing method.
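The iteration described above alternates between imposing the target magnitude and taking the least-squares signal estimate from the modified STFT. A compact numpy sketch of this Griffin–Lim-style loop (window, hop, and FFT size are illustrative choices, not values from the paper):

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Windowed short-time Fourier transform, frames along axis 0."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def istft(S, n_fft=256, hop=64):
    """Least-squares inverse: windowed overlap-add divided by window energy."""
    win = np.hanning(n_fft)
    n = hop * (S.shape[0] - 1) + n_fft
    x = np.zeros(n)
    norm = np.zeros(n)
    frames = np.fft.irfft(S, n=n_fft, axis=1)
    for k in range(S.shape[0]):
        x[k * hop:k * hop + n_fft] += frames[k] * win
        norm[k * hop:k * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-12)

def griffin_lim(mag, n_iter=30, n_fft=256, hop=64):
    """Estimate a signal whose STFT magnitude matches mag, iterating phase."""
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    x = istft(mag * phase, n_fft, hop)
    for _ in range(n_iter):
        S = stft(x, n_fft, hop)
        # keep the current phase estimate, impose the target magnitude
        x = istft(mag * np.exp(1j * np.angle(S)), n_fft, hop)
    return x

# demo: recover a 440 Hz tone from its magnitude spectrogram alone
t = np.arange(4096) / 8000.0
sig = np.sin(2 * np.pi * 440.0 * t)
mag = np.abs(stft(sig))
rec = griffin_lim(mag)
```

Each pass through the loop performs the paper's core update, whose dominant cost is the DFT computation; the mean squared error between the estimate's STFT magnitude and the target is non-increasing across iterations.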

1,899 citations

Book Chapter DOI
08 Sep 2018
TL;DR: This work finds that neural networks can learn to use invisible perturbations to encode a rich amount of useful information, and demonstrates that adversarial training improves the visual quality of encoded images.
Abstract: Recent work has shown that deep neural networks are highly sensitive to tiny perturbations of input images, giving rise to adversarial examples. Though this property is usually considered a weakness of learned models, we explore whether it can be beneficial. We find that neural networks can learn to use invisible perturbations to encode a rich amount of useful information. In fact, one can exploit this capability for the task of data hiding. We jointly train encoder and decoder networks, where given an input message and cover image, the encoder produces a visually indistinguishable encoded image, from which the decoder can recover the original message. We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression. Even though JPEG is non-differentiable, we show that a robust model can be trained using differentiable approximations. Finally, we demonstrate that adversarial training improves the visual quality of encoded images.
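The robustness described above comes from inserting differentiable noise layers between the encoder and decoder during training. As a concrete illustration of the pixel-wise dropout layer mentioned in the abstract, a numpy sketch (names and the fraction parameter are ours):

```python
import numpy as np

def dropout_noise(encoded, cover, p, rng):
    """Pixel-wise dropout attack: replace a random fraction p of encoded
    pixels with the corresponding cover pixels, forcing the decoder to
    read the message from whichever pixels survive."""
    mask = rng.random(encoded.shape) < p
    return np.where(mask, cover, encoded)
```

During training, the decoder only ever sees the noised output of layers like this, so it cannot rely on any single pixel carrying the message.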

420 citations