Author

Hanwen Liu

Bio: Hanwen Liu is an academic researcher from Hong Kong Polytechnic University. The author has contributed to research in topics: Multigrid method & Image restoration. The author has an h-index of 8 and has co-authored 21 publications receiving 253 citations. Previous affiliations of Hanwen Liu include ETH Zurich & Politehnica University of Timișoara.

Papers
Proceedings ArticleDOI
16 Jun 2019
TL;DR: Reviews the 3rd NTIRE challenge on single-image super-resolution (restoration of rich details in a low-resolution image), with a focus on proposed solutions and results, and gauges the state-of-the-art in real-world single-image super-resolution.
Abstract: This paper reviews the 3rd NTIRE challenge on single-image super-resolution (restoration of rich details in a low-resolution image) with a focus on proposed solutions and results. The challenge had one track, aimed at the real-world single-image super-resolution problem with an unknown scaling factor. Participants mapped low-resolution images captured by a DSLR camera at a shorter focal length to high-resolution images of the same scenes captured at a longer focal length. With this challenge, we introduced a novel real-world super-resolution dataset (RealSR). The track had 403 registered participants, and 36 teams competed in the final testing phase. The results gauge the state-of-the-art in real-world single-image super-resolution.

118 citations

Proceedings ArticleDOI
16 Jun 2019
TL;DR: Reviews the first NTIRE challenge on perceptual image enhancement, with a focus on proposed solutions and results for the real-world photo enhancement problem of mapping low-quality photos from the iPhone 3GS to the same photos captured with a Canon 70D DSLR camera.
Abstract: This paper reviews the first NTIRE challenge on perceptual image enhancement with a focus on proposed solutions and results. The participating teams were solving a real-world photo enhancement problem, where the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a Canon 70D DSLR camera. The considered problem embraced a number of computer vision subtasks, such as image denoising, image resolution and sharpness enhancement, and image color/contrast/exposure adjustment. The target metric used in this challenge combined PSNR and SSIM scores with the solutions' perceptual quality as measured in a user study. The proposed solutions significantly improved baseline results, defining the state-of-the-art for practical image enhancement.

45 citations
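To make the fidelity side of such a metric concrete, the sketch below combines PSNR and SSIM for one image pair. It is a minimal illustration, assuming 8-bit RGB numpy arrays and scikit-image >= 0.19; the 50 dB normalization and the 0.5 weighting are arbitrary choices for this example, not the challenge's official formula, and the user-study component of the real metric is omitted entirely.

```python
# Minimal fidelity score combining PSNR and SSIM for one image pair.
# Assumes 8-bit RGB numpy arrays and scikit-image >= 0.19. The 50 dB
# normalization and the 0.5 weighting are illustrative choices, not
# the challenge's official formula.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def fidelity_score(enhanced: np.ndarray, target: np.ndarray,
                   psnr_weight: float = 0.5) -> float:
    psnr = peak_signal_noise_ratio(target, enhanced, data_range=255)
    ssim = structural_similarity(target, enhanced,
                                 channel_axis=-1, data_range=255)
    # Map PSNR onto a rough [0, 1] scale so it can be blended with SSIM.
    return psnr_weight * min(psnr / 50.0, 1.0) + (1 - psnr_weight) * ssim
```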

Proceedings ArticleDOI
01 Jan 2019
TL;DR: This paper reviews the extreme image super-resolution challenge from the AIM 2019 workshop, with emphasis on submitted solutions and results.
Abstract: This paper reviews the AIM 2019 challenge on extreme image super-resolution, the problem of restoring rich details in a low-resolution image. Compared to previous challenges, this one focuses on an extreme upscaling factor, ×16, and employs the novel DIVerse 8K resolution (DIV8K) dataset. This report focuses on the proposed solutions and final results. The challenge had two tracks. The goal in Track 1 was to generate a super-resolution result with high fidelity, using the conventional PSNR as the primary metric to evaluate different methods. Track 2 instead focused on generating visually more pleasant super-resolution results, evaluated using subjective opinions. The two tracks had 71 and 52 registered participants, respectively, and 9 teams competed in the final testing phase. This report establishes the experimental protocol and baselines for the extreme image super-resolution task.

34 citations
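For a sense of what the ×16 setting entails, the sketch below implements the trivial bicubic baseline that learned methods in such a challenge are measured against. It assumes Pillow is installed; the file names are placeholders, not actual DIV8K assets.

```python
# Naive bicubic x16 upscaling, the trivial baseline for extreme SR.
# File names are placeholders standing in for DIV8K inputs.
from PIL import Image

SCALE = 16

def bicubic_upscale(path_in: str, path_out: str) -> None:
    lr = Image.open(path_in).convert("RGB")
    hr = lr.resize((lr.width * SCALE, lr.height * SCALE), Image.BICUBIC)
    hr.save(path_out)

if __name__ == "__main__":
    bicubic_upscale("div8k_lr_0001.png", "div8k_sr_0001.png")
```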

Proceedings ArticleDOI
01 Jun 2019
TL;DR: This paper reviews the second NTIRE challenge on image dehazing (restoration of rich details in a hazy image) with a focus on proposed solutions and results, and gauges the state-of-the-art in image dehazing.
Abstract: This paper reviews the second NTIRE challenge on image dehazing (restoration of rich details in a hazy image) with a focus on proposed solutions and results. The training data consists of 55 hazy images (with dense haze generated in an indoor or outdoor environment) and their corresponding ground-truth (haze-free) images of the same scene. The dense haze has been produced using a professional haze/fog generator that imitates the real conditions of hazy scenes. The evaluation consists of comparing the dehazed images with the ground-truth images. The dehazing process was learnable through provided pairs of haze-free and hazy training images. There were ~270 registered participants, and 23 teams competed in the final testing phase. The results gauge the state-of-the-art in image dehazing.

34 citations
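Since the task is learned from hazy/haze-free pairs, it reduces to standard supervised image-to-image training. The sketch below shows the general pattern only; `DehazeNet` and `paired_loader` are hypothetical placeholders, not code from the challenge or any participant.

```python
# Minimal supervised training loop over hazy / haze-free image pairs.
# `DehazeNet` is a toy stand-in; `paired_loader` is any DataLoader
# yielding (hazy, clear) tensor batches.
import torch
import torch.nn as nn

class DehazeNet(nn.Module):
    """Toy 3-layer CNN standing in for a real dehazing model."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)

def train(model, paired_loader, epochs=10, lr=1e-4, device="cpu"):
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.L1Loss()  # L1 is a common choice for restoration
    for _ in range(epochs):
        for hazy, clear in paired_loader:
            hazy, clear = hazy.to(device), clear.to(device)
            opt.zero_grad()
            loss = loss_fn(model(hazy), clear)
            loss.backward()
            opt.step()
    return model
```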

Book ChapterDOI
08 Sep 2018
TL;DR: In this paper, a discriminator for adversarial training is proposed that is multi-scale, resembling a progressive GAN, and includes a new layer to capture significant statistics of natural images.
Abstract: We describe our solution for the PIRM Super-Resolution Challenge 2018, where we achieved the 2nd best perceptual quality for average RMSE ≤ 16, 5th best for RMSE ≤ 12.5, and 7th best for RMSE ≤ 11.5. We modify a recently proposed Multi-Grid Back-Projection (MGBP) architecture to work as a generative system with an input parameter that can control the amount of artificial detail in the output. We propose a discriminator for adversarial training with the following novel properties: it is multi-scale, resembling a progressive GAN; it is recursive, balancing the architecture of the generator; and it includes a new layer to capture significant statistics of natural images. Finally, we propose a training strategy that avoids conflicts between reconstruction and perceptual losses. Our configuration uses only 281k parameters and upscales each image of the competition in 0.2 s on average.

27 citations
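To illustrate the multi-scale idea named above, here is a heavily simplified PyTorch sketch of a discriminator that scores the same image at several downsampled resolutions. It shows the general pattern only; it is not the authors' recursive MGBP discriminator, and the layer sizes are arbitrary.

```python
# Simplified multi-scale discriminator: one small PatchGAN-style head
# scores the image at each of several progressively coarser scales,
# echoing the progressive-GAN-like idea. Not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleHead(nn.Module):
    """Small conv head applied at one resolution."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),  # per-patch realness scores
        )

    def forward(self, x):
        return self.net(x)

class MultiScaleDiscriminator(nn.Module):
    """Scores an image at several scales; coarser heads see more global structure."""
    def __init__(self, num_scales=3):
        super().__init__()
        self.heads = nn.ModuleList(ScaleHead() for _ in range(num_scales))

    def forward(self, x):
        scores = []
        for head in self.heads:
            scores.append(head(x))
            x = F.avg_pool2d(x, kernel_size=2)  # move to the next coarser scale
        return scores  # one score map per scale

if __name__ == "__main__":
    d = MultiScaleDiscriminator()
    maps = d(torch.randn(1, 3, 128, 128))
    print([m.shape for m in maps])
```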


Cited by
Posted Content
TL;DR: Shows the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that HRNet is a stronger backbone for computer vision problems.
Abstract: High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named the High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) connect the high-to-low resolution convolution streams in parallel; (ii) repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the code is available at this https URL.

1,278 citations
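The two characteristics can be made concrete with a toy two-resolution exchange unit: keep a high- and a low-resolution stream in parallel, then fuse each stream with a resized, projected copy of the other. This is an illustrative sketch only, not the HRNet reference implementation (linked in the journal version below); channel counts are arbitrary.

```python
# Toy exchange step between a high- and a low-resolution stream, in the
# spirit of HRNet. Illustrative only, not the reference implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExchangeUnit(nn.Module):
    def __init__(self, ch_hi=32, ch_lo=64):
        super().__init__()
        self.hi = nn.Conv2d(ch_hi, ch_hi, 3, padding=1)  # high-res branch
        self.lo = nn.Conv2d(ch_lo, ch_lo, 3, padding=1)  # low-res branch
        self.lo_to_hi = nn.Conv2d(ch_lo, ch_hi, 1)       # project, then upsample
        self.hi_to_lo = nn.Conv2d(ch_hi, ch_lo, 1)       # project, then downsample

    def forward(self, x_hi, x_lo):
        h = F.relu(self.hi(x_hi))
        l = F.relu(self.lo(x_lo))
        # Cross-resolution fusion: each stream receives a resized copy of
        # the other, so information flows both ways at every step.
        h_out = h + F.interpolate(self.lo_to_hi(l), size=h.shape[-2:],
                                  mode="bilinear", align_corners=False)
        l_out = l + F.interpolate(self.hi_to_lo(h), size=l.shape[-2:],
                                  mode="bilinear", align_corners=False)
        return h_out, l_out

if __name__ == "__main__":
    unit = ExchangeUnit()
    h, l = unit(torch.randn(1, 32, 64, 64), torch.randn(1, 64, 32, 32))
    print(h.shape, l.shape)  # (1, 32, 64, 64) (1, 64, 32, 32)
```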

Journal ArticleDOI
TL;DR: The High-Resolution Network (HRNet) as mentioned in this paper maintains high-resolution representations through the whole process by connecting the high-to-low resolution convolution streams in parallel and repeatedly exchanging the information across resolutions.
Abstract: High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named the High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) connect the high-to-low resolution convolution streams in parallel and (ii) repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the code is available at https://github.com/HRNet.

1,162 citations

Book ChapterDOI
08 Sep 2018
TL;DR: This paper reports on the 2018 PIRM challenge on perceptual super-resolution (SR), held in conjunction with the Perceptual Image Restoration and Manipulation (PIRM) workshop at ECCV 2018, and concludes with an analysis of the current trends in perceptual SR, as reflected in the leading submissions.
Abstract: This paper reports on the 2018 PIRM challenge on perceptual super-resolution (SR), held in conjunction with the Perceptual Image Restoration and Manipulation (PIRM) workshop at ECCV 2018. In contrast to previous SR challenges, our evaluation methodology jointly quantifies accuracy and perceptual quality, thereby enabling perceptually driven methods to compete alongside algorithms that target PSNR maximization. Twenty-one participating teams introduced algorithms which substantially improved upon the existing state-of-the-art methods in perceptual SR, as confirmed by a human opinion study. We also analyze popular image quality measures and draw conclusions regarding which of them correlates best with human opinion scores. We conclude with an analysis of the current trends in perceptual SR, as reflected in the leading submissions.

428 citations
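One way to read "jointly quantifies accuracy and perceptual quality" is: bin submissions into regions by a distortion threshold (RMSE), then rank within each region by a no-reference perceptual score. The schematic below sketches that two-stage ranking; the thresholds mirror the RMSE bounds quoted in the MGBP entry above, `perceptual_index` stands in for measures such as NIQE, and all entries are fabricated for illustration.

```python
# Schematic two-stage ranking: group methods into regions by RMSE
# threshold, then order each region by a perceptual index (lower is
# better). `perceptual_index` stands in for no-reference measures such
# as NIQE; the entries below are made up.
from dataclasses import dataclass

@dataclass
class Entry:
    name: str
    rmse: float
    perceptual_index: float

def rank_by_region(entries, thresholds=(11.5, 12.5, 16.0)):
    regions = {t: [] for t in thresholds}
    for e in entries:
        for t in thresholds:
            if e.rmse <= t:
                regions[t].append(e)  # assign to the tightest region
                break
    # Entries with RMSE above the largest threshold do not qualify.
    return {t: sorted(es, key=lambda e: e.perceptual_index)
            for t, es in regions.items()}

if __name__ == "__main__":
    submissions = [Entry("A", 11.2, 2.9), Entry("B", 15.1, 2.3),
                   Entry("C", 12.1, 3.4)]
    for t, ranked in rank_by_region(submissions).items():
        print(f"RMSE <= {t}:", [e.name for e in ranked])
```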

Book ChapterDOI
23 Aug 2020
TL;DR: MIRNet as mentioned in this paper proposes a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention-based multi-scale feature aggregation.
Abstract: With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, and medical imaging. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present an architecture with the collective goals of maintaining spatially precise high-resolution representations through the entire network and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention-based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named MIRNet, achieves state-of-the-art results for image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNet.

357 citations
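Element (d), attention-based aggregation of parallel streams, can be sketched compactly: squeeze the summed streams into per-channel weights, then blend the streams with a softmax over those weights. This is a toy version in the spirit of MIRNet's selective kernel feature fusion, not the repository code; channel counts are arbitrary.

```python
# Toy attention-based fusion of two parallel streams: squeeze the summed
# features into channel weights, then softmax-blend the streams.
# Illustrative sketch only, not the MIRNet repository code.
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    def __init__(self, channels=64, reduction=8):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)  # global context per channel
        self.excite = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, 1),
        )

    def forward(self, a, b):
        # a, b: two streams already brought to the same resolution.
        w = self.excite(self.squeeze(a + b))  # (N, 2C, 1, 1)
        wa, wb = w.chunk(2, dim=1)
        weights = torch.softmax(torch.stack([wa, wb]), dim=0)
        return weights[0] * a + weights[1] * b

if __name__ == "__main__":
    fuse = AttentiveFusion()
    a, b = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
    print(fuse(a, b).shape)  # torch.Size([1, 64, 32, 32])
```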