Journal ArticleDOI

Saliency detection via conditional adversarial image-to-image network

17 Nov 2018-Neurocomputing (Elsevier)-Vol. 316, pp 357-368
TL;DR: This work proposes to perform saliency detection with a conditional adversarial network under the cGAN framework, in which saliency map prediction is recast as a saliency segmentation task using pair-wise image-to-ground-truth saliency maps.
About: This article was published in Neurocomputing on 2018-11-17 and has received 34 citations to date. It focuses on the topic: Salience (neuroscience).
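For illustration only, the sketch below shows how a pix2pix-style conditional GAN could be wired up for paired image-to-saliency-map prediction. The class names, the weight of 100 on the L1 term, and the toy shapes are assumptions made for this sketch, not the authors' implementation.

```python
# Minimal sketch (assumed architecture, not the paper's code): a conditional GAN
# whose generator maps an RGB image to a saliency map and whose discriminator
# judges (image, saliency) pairs, trained on paired image/ground-truth data.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Toy encoder-decoder: 3-channel image -> 1-channel saliency map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    """Scores concatenated (image, saliency) pairs, PatchGAN-style logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(16, 1, 3, padding=1))

    def forward(self, img, sal):
        return self.net(torch.cat([img, sal], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

img = torch.rand(2, 3, 64, 64)   # dummy input images
gt  = torch.rand(2, 1, 64, 64)   # dummy ground-truth saliency maps

# Discriminator step: real (image, ground truth) pairs -> 1, generated pairs -> 0.
fake = G(img).detach()
pred_real, pred_fake = D(img, gt), D(img, fake)
d_loss = bce(pred_real, torch.ones_like(pred_real)) + \
         bce(pred_fake, torch.zeros_like(pred_fake))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator and stay close to the ground truth (L1).
fake = G(img)
pred = D(img, fake)
g_loss = bce(pred, torch.ones_like(pred)) + 100.0 * l1(fake, gt)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```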
Citations
Journal ArticleDOI
TL;DR: A novel model called ternary adversarial networks with self-supervision (TANSS), inspired by zero-shot learning, is proposed to overcome the limitations of existing methods on the challenging task of zero-shot cross-modal retrieval.
Abstract: Given a query instance from one modality (e.g., image), cross-modal retrieval aims to find semantically similar instances from another modality (e.g., text). To perform cross-modal retrieval, existing approaches typically learn a common semantic space from a labeled source set and directly produce common representations in the learned space for the instances in a target set. These methods commonly require that the instances of both sets share the same classes. Intuitively, they may not generalize well to the more practical scenario of zero-shot cross-modal retrieval, in which the instances of the target set contain unseen classes whose semantics are inconsistent with the seen classes in the source set. Inspired by zero-shot learning, we propose a novel model called ternary adversarial networks with self-supervision (TANSS) in this paper to overcome the limitations of existing methods on this challenging task. Our TANSS approach consists of three parallel subnetworks: 1) two semantic feature learning subnetworks that capture the intrinsic data structures of different modalities and preserve the modality relationships via semantic features in the common semantic space; 2) a self-supervised semantic subnetwork that leverages the word vectors of both seen and unseen labels as guidance to supervise the semantic feature learning and enhance knowledge transfer to unseen labels; and 3) an adversarial learning scheme that maximizes the consistency and correlation of the semantic features between the different modalities. The three subnetworks are integrated in TANSS into an end-to-end network architecture that enables efficient iterative parameter optimization. Comprehensive experiments on three cross-modal datasets show the effectiveness of our TANSS approach compared with state-of-the-art methods for zero-shot cross-modal retrieval.
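To make the adversarial modality-alignment idea concrete, here is a heavily simplified sketch under assumed feature dimensions and layer sizes (none of which come from the TANSS paper): a modality discriminator tries to tell image features from text features, while the two feature subnetworks are trained with flipped labels so the shared semantic space becomes modality-invariant.

```python
# Illustrative sketch only; dimensions and layers are assumptions, not TANSS itself.
import torch
import torch.nn as nn

img_net = nn.Sequential(nn.Linear(4096, 512), nn.ReLU(), nn.Linear(512, 300))
txt_net = nn.Sequential(nn.Linear(1000, 512), nn.ReLU(), nn.Linear(512, 300))
modality_disc = nn.Sequential(nn.Linear(300, 64), nn.ReLU(), nn.Linear(64, 1))
bce = nn.BCEWithLogitsLoss()

img_feat = img_net(torch.rand(8, 4096))   # dummy image descriptors
txt_feat = txt_net(torch.rand(8, 1000))   # dummy text descriptors

# Discriminator objective: label image-derived features 1, text-derived features 0.
d_loss = bce(modality_disc(img_feat.detach()), torch.ones(8, 1)) + \
         bce(modality_disc(txt_feat.detach()), torch.zeros(8, 1))

# Adversarial term for the feature subnetworks: flip the labels so that features
# from both modalities become indistinguishable in the common semantic space.
g_loss = bce(modality_disc(img_feat), torch.zeros(8, 1)) + \
         bce(modality_disc(txt_feat), torch.ones(8, 1))
```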

185 citations


Cites methods from "Saliency detection via conditional ..."

  • ...GAN has been extended and applied to various unimodal application areas, for example, image translation [48], image segmentation [53], saliency detection [54], etc....


Journal ArticleDOI
TL;DR: This paper reviews the major deep learning concepts pertinent to biomedical applications, provides concise overviews of the Omics and the BBMI, and concludes with a critical discussion, interpretation, and relevant open challenges.
Abstract: Deep neural networks represent, nowadays, the most effective machine learning technology in the biomedical domain. In this domain, the different areas of interest concern the Omics (the study of the genome and its products: genomics, transcriptomics, proteomics, and metabolomics), bioimaging (the study of biological cells and tissues), medical imaging (the study of the human organs by creating visual representations), the BBMI (the study of the brain and body machine interface), and public and medical health management (PmHM). This paper reviews the major deep learning concepts pertinent to such biomedical applications. Concise overviews are provided for the Omics and the BBMI. We end our analysis with a critical discussion, interpretation and relevant open challenges.

124 citations

Journal ArticleDOI
TL;DR: The proposed CSGAN uses a new objective function, the Cyclic-Synthesized (CS) loss, computed between the synthesized image of one domain and the cycled image of another domain, and it exhibits promising and comparable performance on the Facades dataset in terms of both qualitative and quantitative measures.
Abstract: The primary motivation of image-to-image transformation is to convert an image from one domain to another. The Generative Adversarial Network (GAN) is the recent trend for image-to-image transformation. The existing GAN models suffer from the lack of proper synthesis objectives. In this paper, we propose a new Cyclic-Synthesized Generative Adversarial Network (CSGAN) for the development of expert and intelligent systems for image-to-image transformation. The proposed CSGAN uses a new objective function based on the proposed cyclic-synthesized loss between the synthesized image of one domain and the cycled image of another domain. The proposed CSGAN enforces the mapping from one domain to another more accurately by limiting the scope of redundant transformation with the help of the cyclic-synthesized loss. The performance of the proposed CSGAN is evaluated on four benchmark image-to-image transformation datasets, including the CUHK Face, WHU-IIP Thermal-Visible Face, CMP Facades, and NYU-Depth datasets. The results are computed using widely used evaluation metrics such as MSE, SSIM, PSNR, and LPIPS. The experimental results of the proposed CSGAN approach are compared with the latest state-of-the-art approaches, such as GAN, Pix2Pix, DualGAN, CycleGAN, and PS2GAN. The proposed CSGAN technique outperforms all the compared methods on the CUHK, WHU-IIP, and NYU-Depth datasets, and exhibits promising and comparable performance on the Facades dataset in terms of both qualitative and quantitative measures. The code is available at https://github.com/KishanKancharagunta/CSGAN.
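As a rough reading of the cyclic-synthesized loss described above (a sketch under assumed generator interfaces, not the released CSGAN code), an L1 term can be taken between each domain's synthesized image and its cycled counterpart:

```python
# Hedged sketch of a cyclic-synthesized style L1 term; generator names and the
# pairing below are one plausible reading of the abstract, not the official code.
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cyclic_synthesized_loss(G_ab, G_ba, real_a, real_b):
    """G_ab maps domain A -> B, G_ba maps domain B -> A."""
    syn_b = G_ab(real_a)      # image synthesized into domain B from a real A image
    syn_a = G_ba(real_b)      # image synthesized into domain A from a real B image
    cyc_a = G_ba(syn_b)       # A image cycled through domain B
    cyc_b = G_ab(syn_a)       # B image cycled through domain A
    return l1(syn_a, cyc_a) + l1(syn_b, cyc_b)

# Toy call with identity "generators", just to show the interface.
loss = cyclic_synthesized_loss(nn.Identity(), nn.Identity(),
                               torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32))
```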

40 citations

Journal ArticleDOI
TL;DR: A combination of an encoder-decoder generator for semantic image inpainting and a multi-layer convolutional network for seamless image fusion, which is capable of restoring images effectively and seamlessly.
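Since only the TL;DR is available here, the following is a very loose sketch of the described two-stage idea under assumed shapes and layer choices (nothing below is taken from the cited paper): an encoder-decoder fills the masked region, and a small convolutional net blends the coarse result with the unmasked pixels.

```python
# Very rough sketch; all shapes, masks, and layer choices are assumptions.
import torch
import torch.nn as nn

inpaint_net = nn.Sequential(                      # toy encoder-decoder generator
    nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())
fusion_net = nn.Sequential(                       # toy multi-layer fusion network
    nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid())

img  = torch.rand(1, 3, 64, 64)
mask = (torch.rand(1, 1, 64, 64) > 0.5).float()   # 1 marks a missing pixel

coarse  = inpaint_net(torch.cat([img * (1 - mask), mask], dim=1))
pasted  = img * (1 - mask) + coarse * mask        # naive cut-and-paste composite
blended = fusion_net(torch.cat([pasted, img * (1 - mask)], dim=1))
```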

32 citations

References
Journal ArticleDOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
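For reference, the two-player minimax game summarized above is commonly written as the value function from the original GAN paper:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

At the unique equilibrium described in the abstract, G reproduces the training data distribution and D(x) = 1/2 everywhere.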

38,211 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Abstract: Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet [20], the VGG net [31], and GoogLeNet [32]) into fully convolutional networks and transfer their learned representations by fine-tuning [3] to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes less than one fifth of a second for a typical image.
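The skip architecture mentioned above can be illustrated with a small sketch; the channel counts, class count, and bilinear upsampling here are assumptions (the original FCN learns its upsampling filters). A coarse prediction from a deep layer is upsampled and summed with a prediction from a shallower, finer layer before the final upsampling to image resolution.

```python
# Illustrative FCN-style skip fusion; shapes and interpolation mode are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_classes = 21
score_deep    = nn.Conv2d(512, n_classes, 1)   # 1x1 scoring conv on deep features
score_shallow = nn.Conv2d(256, n_classes, 1)   # 1x1 scoring conv on shallower features

deep_feat    = torch.rand(1, 512, 8, 8)        # coarse but semantically strong
shallow_feat = torch.rand(1, 256, 16, 16)      # finer spatial detail

coarse = F.interpolate(score_deep(deep_feat), scale_factor=2,
                       mode='bilinear', align_corners=False)
fused  = coarse + score_shallow(shallow_feat)  # skip connection fuses the two
logits = F.interpolate(fused, scale_factor=16,
                       mode='bilinear', align_corners=False)  # back to image size
```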

28,225 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
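The learned objective the abstract alludes to combines a conditional adversarial term with an L1 reconstruction term; in the notation of the pix2pix paper (x input image, y target, z noise):

```latex
\mathcal{L}_{cGAN}(G, D) =
  \mathbb{E}_{x, y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x, z}\big[\log\big(1 - D(x, G(x, z))\big)\big],
\qquad
\mathcal{L}_{L1}(G) = \mathbb{E}_{x, y, z}\big[\lVert y - G(x, z) \rVert_1\big],
\qquad
G^{*} = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda\, \mathcal{L}_{L1}(G)
```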

11,958 citations

Journal ArticleDOI
TL;DR: This work addresses the task of semantic image segmentation with deep learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or ‘atrous convolution’, as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed “DeepLab” system sets the new state of the art on the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7 percent mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
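As a concrete but assumed illustration of ASPP, the sketch below applies parallel 3x3 convolutions with different dilation rates to the same feature map and merges the responses; the rates, channel counts, and summation are choices for this sketch, not the DeepLab reference implementation.

```python
# Rough ASPP-style module; rates, channels, and the sum-merge are assumptions.
import torch
import torch.nn as nn

class TinyASPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=256, rates=(6, 12, 18, 24)):
        super().__init__()
        # One dilated 3x3 convolution per sampling rate; padding=r keeps the size.
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates])

    def forward(self, x):
        # Merge the multi-rate responses (here by summation).
        return torch.stack([branch(x) for branch in self.branches], dim=0).sum(dim=0)

feat = torch.rand(1, 256, 32, 32)
out = TinyASPP()(feat)   # spatial size is preserved: (1, 256, 32, 32)
```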

11,856 citations

Proceedings ArticleDOI
01 Oct 2017
TL;DR: CycleGAN as discussed by the authors learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution of Y, using an adversarial loss.
Abstract: Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F : Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
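The cycle-consistency constraint described above amounts to an L1 penalty on both reconstruction directions; the sketch below shows that term in isolation (adversarial losses omitted, names assumed, not the CycleGAN code).

```python
# Minimal cycle-consistency term; G: X -> Y and F: Y -> X are assumed callables.
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G, F, real_x, real_y):
    """Encourage F(G(x)) ~= x and G(F(y)) ~= y; adversarial terms are omitted."""
    return l1(F(G(real_x)), real_x) + l1(G(F(real_y)), real_y)

# Toy call with identity mappings, just to show the interface.
loss = cycle_consistency_loss(nn.Identity(), nn.Identity(),
                              torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32))
```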

11,682 citations