Book Chapter DOI

Hiding Audio in Images: A Deep Learning Approach.

TL;DR: A deep generative model consisting of an auto-encoder generator and a discriminator is trained to embed the message, while a dedicated extractor network with an audio discriminator is trained to recover the hidden message from the encoded host signal.
Abstract: In this work, we propose an end-to-end trainable Generative Adversarial Network (GAN) model engineered to hide audio data in images. Owing to the non-stationary nature of audio signals and the lack of powerful tools, hiding audio in images has not been well explored. We devised a deep generative model consisting of an auto-encoder as the generator, together with a discriminator, trained to embed the message, while a dedicated extractor network with an audio discriminator is trained to recover the hidden message from the encoded host signal. The encoded image is subjected to a few common attacks, and we establish that the hidden message is not destroyed, making the proposed method robust to blurring, rotation, noise, and cropping. A remarkable feature of our method is that it can be trained to recover from various attacks, and hence it can also be used for watermarking.
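For intuition, the classical baseline that learned methods like this aim to improve on is least-significant-bit (LSB) substitution. The sketch below is not the paper's GAN method; it is a minimal illustrative baseline (all function names are ours) that hides raw audio bytes in an image's pixel LSBs:

```python
import numpy as np

def embed_lsb(cover, payload_bits):
    """Write payload bits into the least significant bit of the first pixels."""
    flat = cover.flatten()  # flatten() copies, so the cover is untouched
    n = payload_bits.size
    flat[:n] = (flat[:n] & 0xFE) | payload_bits  # clear LSB, then set it
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bits):
    """Read back the first n_bits LSBs of the stego image."""
    return stego.flatten()[:n_bits] & 1

# round-trip demo: hide 100 bytes of "audio" in a 64x64 grayscale cover
rng = np.random.default_rng(1)
cover = rng.integers(0, 256, (64, 64), dtype=np.uint8)
audio = rng.integers(0, 256, 100, dtype=np.uint8)
bits = np.unpackbits(audio)
stego = embed_lsb(cover, bits)
recovered = np.packbits(extract_lsb(stego, bits.size))
```

Unlike the GAN approach, this baseline changes each used pixel by at most 1 but is trivially destroyed by blurring, cropping, or re-compression — exactly the attacks the learned model is trained to survive.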
Citations
Journal Article DOI
TL;DR: The main focus of the work is to generate realistic images that do not exist in reality but are synthesised from random noise by the proposed model.
Abstract: Generative art uses digital technologies as part of the generative or creative process. With the advent of digital currencies and NFTs (Non-Fungible Tokens), the demand for digital art is growing rapidly. In this manuscript, we advocate using deep generative networks with adversarial training for stable and varied art generation. The work focuses on the Deep Convolutional Generative Adversarial Network (DC-GAN) and explores techniques to address the common pitfalls of GAN training. We compare various DC-GAN architectures and designs to arrive at a recommendable design choice for stable and realistic generation. The main goal is to generate realistic images that do not exist in reality but are synthesised from random noise by the proposed model. We provide visual results of generated animal-face images (some showing a blend of species) along with recommendations for training, architecture, and design choices. We also show how training-image preprocessing plays a massive role in GAN training.
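The preprocessing point generalises: DC-GAN generators typically end in a tanh layer, so training images must be scaled into [-1, 1] to match the output range. A minimal numpy sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def preprocess(img_uint8):
    """Scale 8-bit pixels into [-1, 1], the range of a tanh generator output."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

def deprocess(x):
    """Map generator output in [-1, 1] back to displayable 8-bit pixels."""
    return np.rint(np.clip((x + 1.0) * 127.5, 0.0, 255.0)).astype(np.uint8)
```

Mismatched ranges (e.g. feeding [0, 255] images against a tanh output) are one of the common training pitfalls the abstract alludes to.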

1 citation

Book Chapter DOI
01 Jan 2023
TL;DR: In this article, the authors propose end-to-end trainable models of Generative Adversarial Networks (GAN) for hiding video data inside images, a relatively new topic that, to the best of their knowledge, has never been attempted before.
Abstract: This work proposes end-to-end trainable models of Generative Adversarial Networks (GAN) for hiding video data inside images. Hiding video inside images is a relatively new topic and has never been attempted earlier to our best knowledge. We propose two adversarial models that hide video data inside images: a base model with Recurrent Neural Networks and a novel model with 3D-spatiotemporal Convolutional Neural Networks. Both the models have two distinct networks: (1) An embedder to extract features from the time variate video data and inject them into the deep latent representations of the image. (2) An extractor that reverse-engineers the embedder function to extract the hidden data inside the encoded image. A multi-discriminator GAN framework with multi-objective training for multimedia hiding is one of the novel contributions of this work.
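A 3D spatiotemporal convolution differs from its 2D counterpart only in sliding the kernel along time as well as space, which is how the second model extracts features from time-variate video. A naive single-channel numpy sketch of the operation (loop-based for clarity; this is the generic operation, not the paper's network):

```python
import numpy as np

def conv3d_valid(video, kernel):
    """Naive single-channel 3D cross-correlation over a (T, H, W) volume."""
    T, H, W = video.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):          # slide along time
        for j in range(out.shape[1]):      # slide along height
            for k in range(out.shape[2]):  # slide along width
                out[i, j, k] = np.sum(video[i:i + t, j:j + h, k:k + w] * kernel)
    return out
```

Each output value summarises a small spatiotemporal patch, so motion across frames is captured directly, in contrast to the recurrent baseline that processes frames sequentially.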
Book Chapter DOI
01 Jan 2023
TL;DR: In this paper, the authors proposed a novel training process for a pre-existing architecture of GANs to enable task-sharing or multi-tasking of sub-modules.
Abstract: This manuscript proposes a novel training process for a pre-existing architecture of GANs to enable task-sharing or multi-tasking of sub-modules. We explore the application of data hiding to analyse the model's performance. Share-GAN consists of an embedder network (to encode secret messages into a cover), a U-Net autoencoder comprising an encoder and a decoder. The embedder's encoder network is custom-trained to act as an extractor network (to extract the hidden message from the encoded image). The multi-tasking of the embedder's encoder has, to our knowledge, never been explored prior to this work. The encoded image is subjected to multiple attacks to analyse the noise sensitivity of the model. The proposed method shows inherent robustness towards attacks like Gaussian blurring, rotation, noise, and cropping. However, the model can be trained on any possible attacks to reduce noise sensitivity further. In this manuscript, we considered images as both messages and containers; however, the method can be extended to any combination of multimedia data.
Book Chapter DOI
01 Jan 2023
TL;DR: In this paper, an end-to-end trainable model, VAE-GAN, was proposed to hide a message (image) inside a container (image); it consists of an embedder network (to hide the message inside the container) and an extractor network (to extract the hidden message from the encoded image).
Abstract: This manuscript proposes an end-to-end trainable model, VAE-GAN, engineered to hide a message (image) inside a container (image). The model consists of an embedder network (to hide a message inside the container) and an extractor network (to extract the hidden message from the encoded image). In the proposed method, we employ the generative power of a variational autoencoder with adversarial training to embed images. At the extractor, a vanilla convolutional network with adversarial training provided the best results with clean extracted images. To analyse the noise sensitivity of the model, the encoded image is subjected to multiple attacks, and it is established that the proposed method is inherently robust to attacks like Gaussian blurring, rotation, noise, and cropping. However, the model can be trained on any possible attacks to reduce noise sensitivity further. In this manuscript, we explore the application of hiding images inside images, but the method can be extended to other combinations of multimedia data.
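The variational half of such a model relies on the reparameterization trick to keep sampling differentiable: a latent code z is computed as a deterministic function of the encoder's mean and log-variance plus external noise, so gradients flow through mu and log_var. A generic numpy sketch of the trick (standard VAE machinery, not the paper's code):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Randomness lives only in eps, so mu and log_var stay differentiable.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

In a full VAE-GAN, z produced this way is decoded into the encoded image, with adversarial losses applied on top of the usual reconstruction and KL terms.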
References
More filters
Journal Article DOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
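The minimax game described above can be written compactly, and makes explicit why D converges to ½:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)]
  + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]
```

For a fixed G, the optimal discriminator is $D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$; at the unique solution the generator matches the data distribution, $p_g = p_{\text{data}}$, so $D^*(x) = \tfrac{1}{2}$ everywhere.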

38,211 citations

Proceedings Article DOI
21 Jul 2017
TL;DR: It is concluded that the NTIRE 2017 challenge pushes the state-of-the-art in single-image super-resolution, reaching the best results to date on the popular Set5, Set14, B100, Urban100 datasets and on the authors' newly proposed DIV2K.
Abstract: This paper introduces a novel large dataset for example-based single image super-resolution and studies the state-of-the-art as emerged from the NTIRE 2017 challenge. The challenge is the first challenge of its kind, with 6 competitions, hundreds of participants and tens of proposed solutions. Our newly collected DIVerse 2K resolution image dataset (DIV2K) was employed by the challenge. In our study we compare the solutions from the challenge to a set of representative methods from the literature and evaluate them using diverse measures on our proposed DIV2K dataset. Moreover, we conduct a number of experiments and draw conclusions on several topics of interest. We conclude that the NTIRE 2017 challenge pushes the state-of-the-art in single-image super-resolution, reaching the best results to date on the popular Set5, Set14, B100, Urban100 datasets and on our newly proposed DIV2K.

2,388 citations

Proceedings Article
01 Jan 2016
TL;DR: Deep convolutional generative adversarial networks (DCGANs) as discussed by the authors learn a hierarchy of representations from object parts to scenes in both the generator and discriminator for unsupervised learning.
Abstract: In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.

2,205 citations

Journal Article DOI
TL;DR: An algorithm to estimate a signal from its modified short-time Fourier transform (STFT) by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT magnitude is presented.
Abstract: In this paper, we present an algorithm to estimate a signal from its modified short-time Fourier transform (STFT). This algorithm is computationally simple and is obtained by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT. Using this algorithm, we also develop an iterative algorithm to estimate a signal from its modified STFT magnitude. The iterative algorithm is shown to decrease, in each iteration, the mean squared error between the STFT magnitude of the estimated signal and the modified STFT magnitude. The major computation involved in the iterative algorithm is the discrete Fourier transform (DFT) computation, and the algorithm appears to be real-time implementable with current hardware technology. The algorithm developed in this paper has been applied to the time-scale modification of speech. The resulting system generates very high-quality speech, and appears to be better in performance than any existing method.
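The iteration described above alternates between imposing the target magnitude and taking the least-squares signal estimate from the modified STFT. A compact numpy sketch of this Griffin–Lim-style loop (window, hop, and FFT size are illustrative choices, not values from the paper):

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Windowed short-time Fourier transform, frames along axis 0."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def istft(S, n_fft=256, hop=64):
    """Least-squares inverse: windowed overlap-add divided by window energy."""
    win = np.hanning(n_fft)
    n = hop * (S.shape[0] - 1) + n_fft
    x = np.zeros(n)
    norm = np.zeros(n)
    frames = np.fft.irfft(S, n=n_fft, axis=1)
    for k in range(S.shape[0]):
        x[k * hop:k * hop + n_fft] += frames[k] * win
        norm[k * hop:k * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-12)

def griffin_lim(mag, n_iter=30, n_fft=256, hop=64):
    """Estimate a signal whose STFT magnitude matches mag, iterating phase."""
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    x = istft(mag * phase, n_fft, hop)
    for _ in range(n_iter):
        S = stft(x, n_fft, hop)
        # keep the current phase estimate, impose the target magnitude
        x = istft(mag * np.exp(1j * np.angle(S)), n_fft, hop)
    return x

# demo: recover a 440 Hz tone from its magnitude spectrogram alone
t = np.arange(4096) / 8000.0
sig = np.sin(2 * np.pi * 440.0 * t)
mag = np.abs(stft(sig))
rec = griffin_lim(mag)
```

Each pass through the loop performs the paper's core update, whose dominant cost is the DFT computation; the mean squared error between the estimate's STFT magnitude and the target is non-increasing across iterations.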

1,899 citations

Book Chapter DOI
08 Sep 2018
TL;DR: This work finds that neural networks can learn to use invisible perturbations to encode a rich amount of useful information, and demonstrates that adversarial training improves the visual quality of encoded images.
Abstract: Recent work has shown that deep neural networks are highly sensitive to tiny perturbations of input images, giving rise to adversarial examples. Though this property is usually considered a weakness of learned models, we explore whether it can be beneficial. We find that neural networks can learn to use invisible perturbations to encode a rich amount of useful information. In fact, one can exploit this capability for the task of data hiding. We jointly train encoder and decoder networks, where given an input message and cover image, the encoder produces a visually indistinguishable encoded image, from which the decoder can recover the original message. We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression. Even though JPEG is non-differentiable, we show that a robust model can be trained using differentiable approximations. Finally, we demonstrate that adversarial training improves the visual quality of encoded images.
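The robustness described above comes from inserting differentiable noise layers between the encoder and decoder during training. As a concrete illustration of the pixel-wise dropout layer mentioned in the abstract, a numpy sketch (names and the fraction parameter are ours):

```python
import numpy as np

def dropout_noise(encoded, cover, p, rng):
    """Pixel-wise dropout attack: replace a random fraction p of encoded
    pixels with the corresponding cover pixels, forcing the decoder to
    read the message from whichever pixels survive."""
    mask = rng.random(encoded.shape) < p
    return np.where(mask, cover, encoded)
```

During training, the decoder only ever sees the noised output of layers like this, so it cannot rely on any single pixel carrying the message.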

420 citations