Author
Baopu Li
Other affiliations: York University
Bio: Baopu Li is an academic researcher from Baidu. The author has contributed to research in topics: Real image & Artificial neural network. The author has an h-index of 7 and has co-authored 23 publications receiving 163 citations. Previous affiliations of Baopu Li include York University.
Papers
••
01 Jun 2020
TL;DR: This paper reviews the NTIRE 2020 challenge on real image denoising with focus on the newly introduced dataset, the proposed methods and their results, based on the SIDD benchmark.
Abstract: This paper reviews the NTIRE 2020 challenge on real image denoising with a focus on the newly introduced dataset, the proposed methods and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising that was based on the SIDD benchmark. This challenge is based on newly collected validation and testing image datasets and is hence named SIDD+. The challenge has two tracks for quantitatively evaluating image denoising performance in (1) the Bayer-pattern rawRGB and (2) the standard RGB (sRGB) color spaces. Each track had ~250 registered participants. A total of 22 teams, proposing 24 methods, competed in the final phase of the challenge. The methods proposed by the participating teams represent the current state-of-the-art performance in image denoising targeting real noisy images. The newly collected SIDD+ datasets are publicly available at: https://bit.ly/siddplus_data.
72 citations
•
TL;DR: This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results, defining the state-of-the-art for practical image signal processing pipeline modeling.
Abstract: This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results. The participating teams were solving a real-world RAW-to-RGB mapping problem, where the goal was to map the original low-quality RAW images captured by the Huawei P20 device to the same photos obtained with the Canon 5D DSLR camera. The considered task embraced a number of complex computer vision subtasks, such as image demosaicing, denoising, white balancing, color and contrast correction, demoireing, etc. The target metric used in this challenge combined fidelity scores (PSNR and SSIM) with solutions' perceptual results measured in a user study. The proposed solutions significantly improved the baseline results, defining the state-of-the-art for practical image signal processing pipeline modeling.
44 citations
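The fidelity scores named in the abstract above include PSNR. As a minimal sketch of that one score only: the function name `psnr`, the flat pixel-list input, and the 8-bit peak value of 255 are illustrative assumptions, and the listing does not say how the challenge combined PSNR with SSIM and the user study.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Assumption: pixels are given as flat sequences of numbers and
    max_value is the largest possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the estimate is closer to the reference; identical inputs give infinity because the mean squared error is zero.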
••
23 Aug 2020
TL;DR: The second AIM learned ISP challenge focused on a real-world RAW-to-RGB mapping problem, where the goal was to map the original low-quality RAW images captured by the Huawei P20 device to the same photos obtained with the Canon 5D DSLR camera.
Abstract: This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results. The participating teams were solving a real-world RAW-to-RGB mapping problem, where the goal was to map the original low-quality RAW images captured by the Huawei P20 device to the same photos obtained with the Canon 5D DSLR camera. The considered task embraced a number of complex computer vision subtasks, such as image demosaicing, denoising, white balancing, color and contrast correction, demoireing, etc. The target metric used in this challenge combined fidelity scores (PSNR and SSIM) with solutions’ perceptual results measured in a user study. The proposed solutions significantly improved the baseline results, defining the state-of-the-art for practical image signal processing pipeline modeling.
32 citations
•
Sun Yat-sen University, Harbin Institute of Technology, ETH Zurich, Baidu, Huawei, Yonsei University, Dalian Maritime University, Los Alamos National Laboratory, KAIST, Amazon.com, Peking University, Jiangnan University, Karlsruhe Institute of Technology, Tongji University, Guangdong University of Technology, University of Science and Technology of China, École Polytechnique, Hong Kong Polytechnic University, Fuzhou University, University of Udine, Indian Institute of Technology Kharagpur, Université libre de Bruxelles
TL;DR: This paper introduces the real image Super-Resolution challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2020, and gauges the state-of-the-art approaches for real image SR in terms of PSNR and SSIM.
Abstract: This paper introduces the real image Super-Resolution (SR) challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2020. This challenge involves three tracks to super-resolve an input image for $\times$2, $\times$3 and $\times$4 scaling factors, respectively. The goal is to attract more attention to realistic image degradation for the SR task, which is much more complicated and challenging, and contributes to real-world image super-resolution applications. 452 participants were registered for three tracks in total, and 24 teams submitted their results. They gauge the state-of-the-art approaches for real image SR in terms of PSNR and SSIM.
32 citations
••
04 May 2020
TL;DR: A novel rank selection scheme is proposed, which is inspired by reinforcement learning, to automatically select ranks in recently studied tensor ring decomposition in each convolutional layer for effectively compressing deep neural networks while maintaining comparable accuracy.
Abstract: Tensor decomposition has been proved to be effective for solving many problems in signal processing and machine learning[1]. Recently, tensor decomposition has also proved advantageous for compressing deep neural networks. In many applications of deep neural networks, it is critical to reduce the number of parameters and the computation workload to accelerate inference when the network is deployed. A modern deep neural network consists of multiple layers with multi-array weights, for which tensor decomposition is a natural way to perform compression. Compression is achieved by decomposing the weight tensors in convolutional layers or fully-connected layers with specified tensor ranks (e.g. canonical ranks, tensor train ranks). Conventional tensor decomposition for compressing deep neural networks selects the ranks manually, which requires tedious human effort to fine-tune the performance. To overcome this issue, we propose a novel rank selection scheme, which is inspired by reinforcement learning, to automatically select ranks in the recently studied tensor ring decomposition in each convolutional layer. Experimental results validate that our learning-based rank selection significantly outperforms hand-crafted rank selection heuristics on a number of benchmark datasets, for the purpose of effectively compressing deep neural networks while maintaining comparable accuracy.
22 citations
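To see why the choice of ranks in the entry above matters, it can help to count parameters. The sketch below assumes the standard tensor ring format, in which a d-way tensor is stored as d cores of shape (r_k, n_k, r_{k+1}) with the last rank wrapping around to the first. The function names and the example layer shape are hypothetical, and the reinforcement-learning rank selection itself is not shown.

```python
from math import prod


def tensor_ring_params(mode_sizes, ranks):
    """Parameter count of tensor ring cores G_k with shape (r_k, n_k, r_{k+1}).

    The ring closes on itself, so the rank after the last core is ranks[0].
    """
    if len(mode_sizes) != len(ranks):
        raise ValueError("need exactly one rank per tensor mode")
    d = len(mode_sizes)
    return sum(ranks[k] * mode_sizes[k] * ranks[(k + 1) % d] for k in range(d))


def compression_ratio(mode_sizes, ranks):
    """Dense parameter count divided by the tensor ring parameter count."""
    return prod(mode_sizes) / tensor_ring_params(mode_sizes, ranks)


# Hypothetical example: a 64x64x3x3 convolution kernel viewed as modes (64, 64, 9).
small = compression_ratio([64, 64, 9], [4, 4, 4])   # aggressive compression
large = compression_ratio([64, 64, 9], [8, 8, 8])   # milder compression
```

Smaller ranks give a higher compression ratio but leave less capacity to represent the original weights, which is the trade-off an automatic rank selection scheme has to balance.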
Cited by
••
01 Jun 2021
TL;DR: Non-local sparse attention (NLSA) is designed to retain the long-range modeling capability of the non-local operation while enjoying the robustness and efficiency of sparse representation; it uses spherical locality sensitive hashing to partition the input space into hash buckets of related features.
Abstract: Both Non-Local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). In this paper, we investigate their combination and propose a novel Non-Local Sparse Attention (NLSA) with a dynamic sparse attention pattern. NLSA is designed to retain long-range modeling capability from NL operation while enjoying the robustness and high efficiency of sparse representation. Specifically, NLSA rectifies non-local attention with spherical locality sensitive hashing (LSH) that partitions the input space into hash buckets of related features. For every query signal, NLSA assigns a bucket to it and only computes attention within the bucket. The resulting sparse attention prevents the model from attending to locations that are noisy and less informative, while reducing the computational cost from quadratic to asymptotically linear with respect to the spatial size. Extensive experiments validate the effectiveness and efficiency of NLSA. With a few non-local sparse attention modules, our architecture, called non-local sparse network (NLSN), reaches state-of-the-art performance for SISR quantitatively and qualitatively.
216 citations
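The bucketing idea in the entry above can be sketched in a few lines. This is an illustration rather than the NLSA implementation: it assumes a common angular LSH variant (argmax over random projections and their negations), uses the features themselves as queries, keys and values, and hashes only once. The names `lsh_bucket` and `bucketed_attention` are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lsh_bucket(vec, projections):
    """Angular LSH code: index of the largest entry among [p.v] and [-p.v]."""
    scores = [dot(p, vec) for p in projections]
    scores = scores + [-s for s in scores]
    return max(range(len(scores)), key=scores.__getitem__)


def bucketed_attention(features, num_projections=2, seed=0):
    """Softmax attention restricted to features that share a hash bucket."""
    rng = random.Random(seed)
    dim = len(features[0])
    projections = [[rng.gauss(0, 1) for _ in range(dim)]
                   for _ in range(num_projections)]
    # Group feature indices by their hash code.
    buckets = {}
    for idx, feat in enumerate(features):
        buckets.setdefault(lsh_bucket(feat, projections), []).append(idx)
    output = [None] * len(features)
    for members in buckets.values():
        for q in members:
            # Attention logits are computed only against same-bucket members.
            logits = [dot(features[q], features[k]) for k in members]
            peak = max(logits)
            weights = [math.exp(l - peak) for l in logits]
            total = sum(weights)
            output[q] = [sum(w * features[k][d] for w, k in zip(weights, members)) / total
                         for d in range(dim)]
    return output, buckets
```

Because each query only compares against the members of its own bucket, the work per query depends on the bucket size rather than on the total number of locations, which is where the cost reduction described in the abstract comes from.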
•
TL;DR: This work designs a lightweight convolutional neural network for image super resolution with a newly proposed pixel attention scheme that could achieve similar performance as the lightweight networks - SRResNet and CARN, but with only 272K parameters.
Abstract: This work aims at designing a lightweight convolutional neural network for image super resolution (SR). With simplicity in mind, we construct a concise and effective network with a newly proposed pixel attention scheme. Pixel attention (PA) is similar to channel attention and spatial attention in formulation. The difference is that PA produces 3D attention maps instead of a 1D attention vector or a 2D map. This attention scheme introduces fewer additional parameters but generates better SR results. On the basis of PA, we propose two building blocks for the main branch and the reconstruction branch, respectively. The first, the SC-PA block, has the same structure as the Self-Calibrated convolution but with our PA layer. This block is much more efficient than conventional residual/dense blocks, owing to its two-branch architecture and attention scheme. The second, the U-PA block, combines nearest-neighbor upsampling, convolution and PA layers. It improves the final reconstruction quality with little parameter cost. Our final model, PAN, achieves performance similar to the lightweight networks SRResNet and CARN, but with only 272K parameters (17.92% of SRResNet and 17.09% of CARN). The effectiveness of each proposed component is also validated by an ablation study. The code is available at https://github.com/zhaohengyuan1/PAN.
128 citations
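The 3D attention map described in the entry above can be illustrated with a small sketch. It assumes that the map is produced by a 1x1 convolution followed by a sigmoid and then multiplied element-wise with the input; the abstract does not spell this out, so treat it as an assumption. The function name and the nested-list layout [C][H][W] are hypothetical choices for readability.

```python
import math


def pixel_attention(x, weight, bias):
    """Gate every element of x with its own attention value.

    x:      features as nested lists with layout [C][H][W]
    weight: [C][C] kernel of an assumed 1x1 convolution
    bias:   [C] bias of that convolution
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for co in range(channels):
        for i in range(height):
            for j in range(width):
                # 1x1 convolution: mix the channels at this single pixel.
                pre = bias[co] + sum(weight[co][ci] * x[ci][i][j]
                                     for ci in range(channels))
                gate = 1.0 / (1.0 + math.exp(-pre))  # sigmoid, in (0, 1)
                # One gate per (channel, row, column): a 3D attention map.
                out[co][i][j] = gate * x[co][i][j]
    return out
```

Channel attention would instead produce one gate per channel (a 1D vector) and spatial attention one gate per pixel location (a 2D map); here every element gets its own gate.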
••
16 Jun 2019
TL;DR: The 3rd NTIRE challenge on single-image super-resolution (restoration of rich details in a low-resolution image) is reviewed with a focus on proposed solutions and results and the state-of-the-art in real-world single image super-resolution.
Abstract: This paper reviewed the 3rd NTIRE challenge on single-image super-resolution (restoration of rich details in a low-resolution image) with a focus on proposed solutions and results. The challenge had 1 track, which was aimed at the real-world single image super-resolution problem with an unknown scaling factor. Participants were mapping low-resolution images captured by a DSLR camera with a shorter focal length to their high-resolution images captured at a longer focal length. With this challenge, we introduced a novel real-world super-resolution dataset (RealSR). The track had 403 registered participants, and 36 teams competed in the final testing phase. They gauge the state-of-the-art in real-world single image super-resolution.
118 citations
•
ETH Zurich, Hong Kong Polytechnic University, Shanghai Jiao Tong University, Microsoft, Korea University, Ajou University, École Polytechnique Fédérale de Lausanne, University of Udine, Dalian Maritime University, Tencent, Peking University, North China University of Technology, Huawei, Fuzhou University, Samsung, Ulsan National Institute of Science and Technology, Sardar Vallabhbhai National Institute of Technology, Surat, Norwegian University of Science and Technology
TL;DR: The NTIRE 2020 challenge addresses the real world setting, where paired true high and low-resolution images are unavailable, and the ultimate goal is to achieve the best perceptual quality, evaluated using a human study.
Abstract: This paper reviews the NTIRE 2020 challenge on real world super-resolution. It focuses on the participating methods and final results. The challenge addresses the real world setting, where paired true high and low-resolution images are unavailable. For training, only one set of source input images is therefore provided along with a set of unpaired high-quality target images. In Track 1: Image Processing artifacts, the aim is to super-resolve images with synthetically generated image processing artifacts. This allows for quantitative benchmarking of the approaches with respect to a ground-truth image. In Track 2: Smartphone Images, real low-quality smartphone images have to be super-resolved. In both tracks, the ultimate goal is to achieve the best perceptual quality, evaluated using a human study. This is the second challenge on the subject, following AIM 2019, aiming to advance the state-of-the-art in super-resolution. To measure the performance we use the benchmark protocol from AIM 2019. In total 22 teams competed in the final testing phase, demonstrating new and innovative solutions to the problem.
92 citations
••
23 Aug 2020
TL;DR: Zhao et al. designed a lightweight convolutional neural network with a pixel attention scheme, which produces 3D attention maps instead of a 1D attention vector or a 2D map.
Abstract: This work aims at designing a lightweight convolutional neural network for image super resolution (SR). With simplicity in mind, we construct a concise and effective network with a newly proposed pixel attention scheme. Pixel attention (PA) is similar to channel attention and spatial attention in formulation. The difference is that PA produces 3D attention maps instead of a 1D attention vector or a 2D map. This attention scheme introduces fewer additional parameters but generates better SR results. On the basis of PA, we propose two building blocks for the main branch and the reconstruction branch, respectively. The first, the SC-PA block, has the same structure as the Self-Calibrated convolution but with our PA layer. This block is much more efficient than conventional residual/dense blocks, owing to its two-branch architecture and attention scheme. The second, the U-PA block, combines nearest-neighbor upsampling, convolution and PA layers. It improves the final reconstruction quality with little parameter cost. Our final model, PAN, achieves performance similar to the lightweight networks SRResNet and CARN, but with only 272K parameters (17.92% of SRResNet and 17.09% of CARN). The effectiveness of each proposed component is also validated by an ablation study. The code is available at https://github.com/zhaohengyuan1/PAN.
80 citations