Author

Max Basler

Bio: Max Basler is an academic researcher from Swisscom. The author has contributed to research on the topics of Pixel & Multi-task learning, has an h-index of 3, and has co-authored 4 publications receiving 69 citations.

Papers
Proceedings ArticleDOI
01 Oct 2019
TL;DR: In this paper, the authors optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms, which results in more realistic textures and sharper edges.
Abstract: By benefiting from perceptual losses, recent studies have significantly improved the performance of the super-resolution task, where a high-resolution image is resolved from its low-resolution counterpart. Although such objective functions generate near-photorealistic results, their capability is limited, since they estimate the reconstruction error for an entire image in the same way, without considering any semantic information. In this paper, we propose a novel method to benefit from perceptual loss in a more objective way. We optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms. In particular, the proposed method leverages our proposed OBB (Object, Background and Boundary) labels, generated from segmentation labels, to estimate a suitable perceptual loss for boundaries, while considering texture similarity for backgrounds. We show that our proposed approach results in more realistic textures and sharper edges, and outperforms other state-of-the-art algorithms in terms of both qualitative results on standard benchmarks and extensive user studies.
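As a rough illustration of the targeted objective described above, the sketch below combines per-region loss terms using precomputed OBB masks. The mask layout, the use of VGG features, and the specific per-region terms (pixel loss for objects, feature loss for boundaries, Gram-matrix texture loss for backgrounds) are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch of a region-targeted loss, assuming precomputed OBB masks
# (object / background / boundary) of shape (B, 1, H, W) and a callable
# `vgg_features` mapping an image to a feature map. All names here are
# illustrative placeholders, not the authors' implementation.
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """Gram matrix of a feature map, a common proxy for texture similarity."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def obb_targeted_loss(sr, hr, masks, vgg_features):
    # Pixel-wise fidelity on object regions.
    l_obj = F.l1_loss(sr * masks['object'], hr * masks['object'])
    # Perceptual (feature-space) loss emphasizing boundary regions.
    l_bnd = F.mse_loss(vgg_features(sr * masks['boundary']),
                       vgg_features(hr * masks['boundary']))
    # Texture similarity (Gram matrices) on background regions.
    l_bg = F.mse_loss(gram_matrix(vgg_features(sr * masks['background'])),
                      gram_matrix(vgg_features(hr * masks['background'])))
    return l_obj + l_bnd + l_bg  # weighting coefficients omitted for brevity
```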

113 citations

Journal ArticleDOI
TL;DR: The authors propose a decoder architecture able to extract and use semantic information to super-resolve a given image via multitask learning, performed simultaneously for image super-resolution and semantic segmentation.

13 citations

Posted Content
TL;DR: This paper presents a decoder architecture able to extract and use semantic information to super-resolve a given image via multitask learning, performed simultaneously for image super-resolution and semantic segmentation, and shows that it outperforms state-of-the-art methods.
Abstract: Despite significant progress toward super-resolving more realistic images with deeper convolutional neural networks (CNNs), reconstructing fine and natural textures remains a challenging problem. Recent works on single image super-resolution (SISR) are mostly based on optimizing pixel- and content-wise similarity between recovered and high-resolution (HR) images and do not benefit from the recognizability of semantic classes. In this paper, we introduce a novel approach using categorical information to tackle the SISR problem: we present a decoder architecture able to extract and use semantic information to super-resolve a given image via multitask learning, performed simultaneously for image super-resolution and semantic segmentation. To explore categorical information during training, the proposed decoder employs only one shared deep network for two task-specific output layers. At run time, only the layers producing the HR image are used, and no segmentation label is required. Extensive perceptual experiments and a user study on images randomly selected from the COCO-Stuff dataset demonstrate the effectiveness of our proposed method, which outperforms the state-of-the-art methods.
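The shared-network-with-two-heads idea lends itself to a compact sketch. The layer sizes, depths, and head designs below are placeholders chosen for brevity, not the paper's architecture; the point is the control flow: both heads during training, only the SR head at run time.

```python
# Minimal sketch of one shared trunk feeding two task-specific output layers.
import torch.nn as nn

class MultitaskSRNet(nn.Module):
    def __init__(self, n_classes, scale=4):
        super().__init__()
        self.shared = nn.Sequential(                 # trunk shared by both tasks
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.sr_head = nn.Sequential(                # upsamples to the HR image
            nn.Conv2d(64, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )
        self.seg_head = nn.Conv2d(64, n_classes, 1)  # per-pixel class logits

    def forward(self, lr, with_seg=True):
        feats = self.shared(lr)
        sr = self.sr_head(feats)
        if with_seg:                                 # training: both outputs
            return sr, self.seg_head(feats)
        return sr                                    # run time: SR output only
```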

6 citations

Posted Content
TL;DR: The authors optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms, resulting in more realistic textures and sharper edges and outperforming other state-of-the-art algorithms.
Abstract: By benefiting from perceptual losses, recent studies have significantly improved the performance of the super-resolution task, where a high-resolution image is resolved from its low-resolution counterpart. Although such objective functions generate near-photorealistic results, their capability is limited, since they estimate the reconstruction error for an entire image in the same way, without considering any semantic information. In this paper, we propose a novel method to benefit from perceptual loss in a more objective way. We optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms. In particular, the proposed method leverages our proposed OBB (Object, Background and Boundary) labels, generated from segmentation labels, to estimate a suitable perceptual loss for boundaries, while considering texture similarity for backgrounds. We show that our proposed approach results in more realistic textures and sharper edges, and outperforms other state-of-the-art algorithms in terms of both qualitative results on standard benchmarks and extensive user studies.

6 citations


Cited by
Proceedings ArticleDOI
Cheng Ma, Yongming Rao, Yean Cheng, Ce Chen, Jiwen Lu, Jie Zhou
14 Jun 2020
TL;DR: Ma et al. propose a structure-preserving super-resolution method that alleviates undesired structural distortions in the recovered images by exploiting gradient maps of images to guide the recovery in two aspects.
Abstract: Structures matter in single image super-resolution (SISR). Recent studies benefiting from generative adversarial networks (GANs) have promoted the development of SISR by recovering photo-realistic images. However, there are always undesired structural distortions in the recovered images. In this paper, we propose a structure-preserving super-resolution method to alleviate this issue while maintaining the merits of GAN-based methods in generating perceptually pleasant details. Specifically, we exploit gradient maps of images to guide the recovery in two aspects. On the one hand, we restore high-resolution gradient maps with a gradient branch to provide additional structure priors for the SR process. On the other hand, we propose a gradient loss that imposes a second-order restriction on the super-resolved images. Along with the previous image-space loss functions, the gradient-space objectives help generative networks concentrate more on geometric structures. Moreover, our method is model-agnostic and can potentially be used with off-the-shelf SR networks. Experimental results show that we achieve the best PI and LPIPS performance while maintaining comparable PSNR and SSIM against state-of-the-art perceptual-driven SR methods. Visual results demonstrate our superiority in restoring structures while generating natural SR images.
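One plausible reading of the gradient-space objective is a loss on finite-difference gradient maps, sketched below. The exact gradient operator and loss form are assumptions; the paper's formulation may differ.

```python
# Minimal sketch: penalize the L1 distance between gradient maps of the
# super-resolved (sr) and ground-truth (hr) images, both (B, C, H, W).
import torch
import torch.nn.functional as F

def gradient_map(img):
    """Per-channel gradient magnitude via horizontal/vertical differences."""
    dx = img[..., :, 1:] - img[..., :, :-1]      # horizontal differences
    dy = img[..., 1:, :] - img[..., :-1, :]      # vertical differences
    dx = F.pad(dx, (0, 1, 0, 0))                 # pad width back to W
    dy = F.pad(dy, (0, 0, 0, 1))                 # pad height back to H
    return torch.sqrt(dx ** 2 + dy ** 2 + 1e-6)  # eps keeps sqrt stable at 0

def gradient_loss(sr, hr):
    return F.l1_loss(gradient_map(sr), gradient_map(hr))
```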

143 citations

Posted Content
Cheng Ma, Yongming Rao, Yean Cheng, Ce Chen, Jiwen Lu, Jie Zhou
TL;DR: A structure-preserving super-resolution method that exploits gradient maps of images to guide the recovery in two aspects, including a gradient loss that imposes a second-order restriction on the super-resolved images.
Abstract: Structures matter in single image super-resolution (SISR). Recent studies benefiting from generative adversarial networks (GANs) have promoted the development of SISR by recovering photo-realistic images. However, there are always undesired structural distortions in the recovered images. In this paper, we propose a structure-preserving super-resolution method to alleviate this issue while maintaining the merits of GAN-based methods in generating perceptually pleasant details. Specifically, we exploit gradient maps of images to guide the recovery in two aspects. On the one hand, we restore high-resolution gradient maps with a gradient branch to provide additional structure priors for the SR process. On the other hand, we propose a gradient loss that imposes a second-order restriction on the super-resolved images. Along with the previous image-space loss functions, the gradient-space objectives help generative networks concentrate more on geometric structures. Moreover, our method is model-agnostic and can potentially be used with off-the-shelf SR networks. Experimental results show that we achieve the best PI and LPIPS performance while maintaining comparable PSNR and SSIM against state-of-the-art perceptual-driven SR methods. Visual results demonstrate our superiority in restoring structures while generating natural SR images.

127 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: In this paper, the authors optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms, which results in more realistic textures and sharper edges.
Abstract: By benefiting from perceptual losses, recent studies have significantly improved the performance of the super-resolution task, where a high-resolution image is resolved from its low-resolution counterpart. Although such objective functions generate near-photorealistic results, their capability is limited, since they estimate the reconstruction error for an entire image in the same way, without considering any semantic information. In this paper, we propose a novel method to benefit from perceptual loss in a more objective way. We optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms. In particular, the proposed method leverages our proposed OBB (Object, Background and Boundary) labels, generated from segmentation labels, to estimate a suitable perceptual loss for boundaries, while considering texture similarity for backgrounds. We show that our proposed approach results in more realistic textures and sharper edges, and outperforms other state-of-the-art algorithms in terms of both qualitative results on standard benchmarks and extensive user studies.

113 citations

Journal ArticleDOI
TL;DR: The authors propose a dynamic selection network (DSNet) that distinguishes corrupted regions from valid ones throughout the entire network architecture, helping make full use of the information in the known area.
Abstract: Image inpainting is a challenging computer vision task that aims to fill in missing regions of corrupted images with realistic content. With the development of convolutional neural networks, many deep learning models have been proposed to solve image inpainting by learning from large amounts of data. In particular, existing algorithms usually follow an encoder-decoder architecture in which some operations with standard schemes are employed, such as static convolution, which considers only pixels on a fixed grid, and a monotonous normalization style (e.g., batch normalization). However, these techniques are not well suited to image inpainting, because the randomly corrupted regions in the input images tend to mislead the inpainting process and generate unreasonable content. In this paper, we propose a novel dynamic selection network (DSNet) to solve this problem. The principal idea of DSNet is to distinguish corrupted regions from valid ones throughout the entire network architecture, which helps make full use of the information in the known area. Specifically, DSNet has two novel dynamic selection modules, namely the validness migratable convolution (VMC) and regional composite normalization (RCN) modules, which share a dynamic selection mechanism that helps utilize valid pixels better. By replacing vanilla convolution with the VMC module, spatial sampling locations are dynamically selected in the convolution phase, resulting in a more flexible feature-extraction process. Besides, the RCN module not only combines several normalization methods but also normalizes feature regions selectively. Therefore, DSNet can produce realistic and finely detailed images by adaptively selecting features and normalization styles. Experimental results on three public datasets show that our proposed method outperforms state-of-the-art methods both quantitatively and qualitatively.
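The "use only valid pixels" idea behind VMC can be approximated with a partial-convolution-style layer, sketched below: responses are renormalized by the fraction of valid pixels under each kernel window, and the validity mask is updated. The real VMC additionally selects spatial sampling locations dynamically (deformable-convolution style), which this simplified sketch omits.

```python
# Minimal mask-aware convolution sketch; a simplification of the VMC idea,
# not the authors' module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskAwareConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
        # Fixed all-ones kernel that counts valid pixels in each window.
        self.register_buffer('ones', torch.ones(1, 1, k, k))
        self.window = k * k

    def forward(self, x, mask):
        # mask: (B, 1, H, W) with 1 = valid pixel, 0 = corrupted pixel.
        valid = F.conv2d(mask, self.ones, padding=self.ones.shape[-1] // 2)
        out = self.conv(x * mask)                   # ignore corrupted inputs
        scale = self.window / valid.clamp(min=1.0)  # renormalize by coverage
        out = out * scale * (valid > 0).float()     # zero fully-invalid windows
        return out, (valid > 0).float()             # updated validity mask
```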

86 citations

Posted Content
Cheng Ma, Zhenyu Jiang, Yongming Rao, Jiwen Lu, Jie Zhou
TL;DR: Results show that the proposed FSR method, built on iterative collaboration between two recurrent networks focused on facial image recovery and landmark estimation, significantly outperforms state-of-the-art FSR methods in recovering high-quality face images.
Abstract: Recent works based on deep learning and facial priors have succeeded in super-resolving severely degraded facial images. However, the prior knowledge is not fully exploited in existing methods, since facial priors such as landmark and component maps are always estimated from low-resolution or coarsely super-resolved images, which may be inaccurate and thus affect the recovery performance. In this paper, we propose a deep face super-resolution (FSR) method with iterative collaboration between two recurrent networks, which focus on facial image recovery and landmark estimation, respectively. In each recurrent step, the recovery branch utilizes the prior knowledge of landmarks to yield higher-quality images, which in turn facilitate more accurate landmark estimation. The iterative information interaction between the two processes therefore progressively boosts the performance of both. Moreover, a new attentive fusion module is designed to strengthen the guidance of landmark maps, where facial components are generated individually and aggregated attentively for better restoration. Quantitative and qualitative experimental results show that the proposed method significantly outperforms state-of-the-art FSR methods in recovering high-quality face images.
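The iterative collaboration reduces to a simple alternating loop, sketched below with stand-in sub-networks. Here `recover` and `estimate` are hypothetical stubs (the attentive fusion module is folded into `recover`), and both are assumed to operate at a fixed resolution for brevity.

```python
# Minimal sketch of the alternating recovery / landmark-estimation loop.
import torch.nn as nn

class IterativeFSR(nn.Module):
    def __init__(self, recover: nn.Module, estimate: nn.Module, steps: int = 3):
        super().__init__()
        self.recover = recover    # stub: (image, landmarks) -> refined image
        self.estimate = estimate  # stub: image -> landmark heatmaps
        self.steps = steps

    def forward(self, face):
        img = face
        landmarks = self.estimate(img)          # coarse initial landmarks
        for _ in range(self.steps):
            img = self.recover(img, landmarks)  # landmark-guided recovery
            landmarks = self.estimate(img)      # better image -> better landmarks
        return img, landmarks
```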

70 citations