Open access · Proceedings Article · DOI: 10.1109/RO-MAN46459.2019.8956310

A Conditional Adversarial Network for Scene Flow Estimation

01 Oct 2019 · pp. 1–6
Abstract: The problem of scene flow estimation in depth videos has been attracting the attention of machine vision researchers due to its potential applications in various areas of robotics. Conventional scene flow estimation methods are difficult to use in real-time applications because of their high computational overhead. We propose SceneFlowGAN, a conditional adversarial network for scene flow estimation. SceneFlowGAN applies loss functions at both ends: the generator and the discriminator. The proposed network is a first attempt to estimate scene flow using generative adversarial networks, and it estimates both optical flow and disparity simultaneously from the input stereo images. The method is evaluated on a large RGB-D scene flow benchmark dataset.

Topics: Optical flow (59%)

Journal Article · DOI: 10.1109/TIP.2021.3084073
Tongtong Che, Yuanjie Zheng, Yunshuai Yang, Sujuan Hou +3 more (3 institutions)
Abstract: There is a growing consensus in computer vision that symmetric optical flow estimation is a better model than a generic asymmetric one, because it is independent of the choice of source/target image. Yet convolutional neural networks (CNNs), the de facto standard vision model, handle only the asymmetric case in most cutting-edge CNN-based optical flow techniques. We bridge this gap by introducing a novel model named SDOF-GAN: symmetric dense optical flow with generative adversarial networks (GANs). SDOF-GAN enforces consistency between the forward mapping (source-to-target) and the backward one (target-to-source) by ensuring that they are inverses of each other, using an inverse network. In addition, SDOF-GAN leverages a GAN model in which the generator estimates symmetric optical flow fields while the discriminator differentiates the “real” ground-truth flow field from a “fake” estimation by assessing the flow warping error. Finally, SDOF-GAN is trained in a semi-supervised fashion so that both scarce labeled data and large amounts of unlabeled data are fully exploited. We demonstrate significant performance benefits of SDOF-GAN on five publicly available datasets against several representative state-of-the-art optical flow estimation models.
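The forward/backward symmetry constraint described in this abstract can be illustrated with a toy sketch (not the SDOF-GAN code; the flow representation here is a hypothetical dictionary of per-pixel displacements): if the backward flow truly inverts the forward flow, following a pixel forward and then backward must return it to its starting position.

```python
# Toy illustration of forward/backward flow consistency. A real model predicts
# dense flow fields; here a dict maps a pixel to its displacement vector.

def warp(p, flow):
    """Move point p = (x, y) by the flow vector stored at p."""
    dx, dy = flow[p]
    return (p[0] + dx, p[1] + dy)

# Forward flow maps each source pixel to its target location; a consistent
# backward flow stores the exact inverse displacement at the target location.
forward = {(0, 0): (1, 2), (3, 1): (-1, 0)}
backward = {warp(p, forward): (-d[0], -d[1]) for p, d in forward.items()}

def consistency_error(p):
    """L1 distance between p and its forward-then-backward image (0 if consistent)."""
    q = warp(p, forward)
    p_back = warp(q, backward)
    return abs(p_back[0] - p[0]) + abs(p_back[1] - p[1])
```

In SDOF-GAN this constraint is enforced by a learned inverse network rather than by construction as above; a nonzero `consistency_error` is exactly the kind of violation such a model penalizes.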

Topics: Optical flow (58%), Flow (mathematics) (53%), Convolutional neural network (51%)

Open access · Proceedings Article
Diederik P. Kingma, Jimmy Ba (2 institutions)
01 Jan 2015
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, requires little memory, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. It is also appropriate for non-stationary objectives and for problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, by which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
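The update rule this abstract summarizes can be sketched in a few lines (a minimal 1-D version, not the authors' reference code; the hyper-parameter names `lr`, `beta1`, `beta2`, `eps` follow common convention): exponential moving averages of the gradient and its square, with bias correction for their initialization at zero.

```python
# Minimal 1-D Adam sketch: moment estimates with bias correction.

def adam_minimize(grad, x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=300):
    """Run Adam on a scalar objective given its gradient function."""
    m, v = 0.0, 0.0                              # first/second moment estimates
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g          # biased first moment
        v = beta2 * v + (1 - beta2) * g * g      # biased second moment
        m_hat = m / (1 - beta1 ** t)             # bias-corrected estimates
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (v_hat ** 0.5 + eps)   # parameter update
    return x

# Minimizing f(x) = (x - 3)^2 with gradient 2(x - 3); iterates approach x = 3.
x_star = adam_minimize(lambda x: 2 * (x - 3), x=0.0)
```

Note the rescaling invariance the abstract mentions: the step `m_hat / sqrt(v_hat)` is roughly unit-scaled, so the effective step size is governed by `lr` rather than by the raw gradient magnitude.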

Topics: Stochastic optimization (63%), Convex optimization (54%), Rate of convergence (52%)

78,539 Citations

Open access · Journal Article · DOI: 10.3156/JSOFT.29.5_177_2
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu +4 more (2 institutions)
08 Dec 2014
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
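The claim that D equals ½ everywhere at the unique solution follows from the shape of the minimax value function; a small numeric check over a finite sample space (an illustrative sketch, not the paper's code) makes it concrete: for a fixed generator, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)), which collapses to ½ once p_g matches p_data.

```python
# Numeric check of the GAN value function V(D, G) on a 3-point sample space.
import math

def value(p_data, p_g, d):
    """V(D, G) = sum_x [ p_data(x) log D(x) + p_g(x) log(1 - D(x)) ]."""
    return sum(pd * math.log(d[x]) + pg * math.log(1 - d[x])
               for x, (pd, pg) in enumerate(zip(p_data, p_g)))

p_data = [0.2, 0.5, 0.3]

# Optimal discriminator when the generator has converged (p_g == p_data):
d_star = [pd / (pd + pg) for pd, pg in zip(p_data, p_data)]

# d_star is 1/2 everywhere, and any other D scores strictly lower.
best = value(p_data, p_data, d_star)
worse = value(p_data, p_data, [0.6, 0.4, 0.5])
```

At this point `best` equals −2 log 2, the saddle-point value the paper derives, and perturbing D in either direction only decreases V.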

Topics: Generative model (64%), Discriminative model (54%), Approximate inference (53%)

29,410 Citations

Open access · Proceedings Article · DOI: 10.1109/CVPR.2017.632
21 Jul 2017
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that would traditionally require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

Topics: Image translation (58%)

9,134 Citations

Open access · Posted Content
06 Nov 2014 · arXiv: Learning
Abstract: Generative Adversarial Nets [8] were recently introduced as a novel way to train generative models. In this work we introduce the conditional version of generative adversarial nets, which can be constructed by simply feeding the data y we wish to condition on to both the generator and the discriminator. We show that this model can generate MNIST digits conditioned on class labels. We also illustrate how this model could be used to learn a multi-modal model, and we provide preliminary examples of an application to image tagging, demonstrating how this approach can generate descriptive tags that are not part of the training labels.
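The conditioning step this abstract describes is mechanically simple; a sketch under assumed shapes (flat vectors, one-hot class labels; function names here are illustrative, not from the paper's code) shows what "feeding y to both networks" amounts to:

```python
# Conditioning a GAN on a class label y: one-hot encode y and concatenate it
# with the generator's noise input and with the discriminator's data input.

def one_hot(y, num_classes):
    """One-hot encode a class index as a float vector."""
    return [1.0 if i == y else 0.0 for i in range(num_classes)]

def conditioned_generator_input(z, y, num_classes=10):
    """G receives [z ; one_hot(y)] instead of z alone."""
    return z + one_hot(y, num_classes)

def conditioned_discriminator_input(x, y, num_classes=10):
    """D likewise receives [x ; one_hot(y)]."""
    return x + one_hot(y, num_classes)

z = [0.3, -1.2, 0.7]   # noise vector
x = [0.0] * 784        # flattened 28x28 MNIST image (placeholder values)
g_in = conditioned_generator_input(z, y=5)
d_in = conditioned_discriminator_input(x, y=5)
```

Because both networks see y, the generator is pushed to produce samples of the requested class and the discriminator judges sample/label pairs rather than samples alone.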

Topics: Generative model (65%), Generative Design (60%), MNIST database (51%)

6,062 Citations

Open access · Proceedings Article · DOI: 10.1109/CVPR.2017.19
Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero +7 more (3 institutions)
21 Jul 2017
Abstract: Despite breakthroughs in the accuracy and speed of single-image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they often lack high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors. To achieve this, we propose a perceptual loss function that consists of an adversarial loss and a content loss. The adversarial loss pushes our solution toward the natural image manifold using a discriminator network trained to differentiate between super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows substantial gains in perceptual quality with SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
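The structure of the perceptual loss described above can be sketched as follows (a simplified illustration, not the paper's VGG-based implementation: the feature vectors and the `adv_weight` value stand in for the real feature extractor and loss weighting):

```python
# Perceptual loss = content loss in a feature space + weighted adversarial term.
import math

def content_loss(feat_sr, feat_hr):
    """MSE between feature representations of the super-resolved and HR images
    (perceptual similarity, rather than pixel-space similarity)."""
    return sum((a - b) ** 2 for a, b in zip(feat_sr, feat_hr)) / len(feat_sr)

def adversarial_loss(d_sr):
    """-log D(G(LR)): low when the discriminator rates the SR image as real,
    pushing the generator toward the natural image manifold."""
    return -math.log(d_sr)

def perceptual_loss(feat_sr, feat_hr, d_sr, adv_weight=1e-3):
    """Total generator loss combining both terms."""
    return content_loss(feat_sr, feat_hr) + adv_weight * adversarial_loss(d_sr)
```

The small `adv_weight` reflects the usual design choice: the content term anchors the reconstruction while the adversarial term nudges textures toward realism rather than dominating training.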

Topics: Image texture (54%), Image resolution (53%), Convolutional neural network (52%)

5,157 Citations

[Chart: number of citations received by the paper in previous years]