Journal ArticleDOI

Fast and Full-Resolution Light Field Deblurring Using a Deep Neural Network

TL;DR: This work generates a complex blurry light field dataset and proposes a learning-based deblurring approach that is about 16K times faster than Srinivasan et al. [22].
Abstract: Restoring a sharp light field image from its blurry input has become essential due to the increasing popularity of parallax-based image processing. State-of-the-art blind light field deblurring methods suffer from several issues such as slow processing, reduced spatial size, and a limited motion blur model. In this work, we address these challenging problems by generating a complex blurry light field dataset and proposing a learning-based deblurring approach. In particular, we model the full 6-degree-of-freedom (6-DOF) light field camera motion, which is used to create the blurry dataset using a combination of real light fields captured with a Lytro Illum camera and synthetic light field renderings of 3D scenes. Furthermore, we propose a light field deblurring network that is built with the capability of large receptive fields. We also introduce a simple strategy of angular sampling to train effectively on the large-scale blurry light field. We evaluate our method through both quantitative and qualitative measurements and demonstrate superior performance compared to the state-of-the-art method, with a massive speedup in execution time. Our method is about 16K times faster than Srinivasan et al. [22] and can deblur a full-resolution light field in less than 2 seconds.
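
The 6-DOF blur model lends itself to a short simulation. Below is a minimal sketch, not the paper's pipeline: it assumes the scene is a single fronto-parallel plane at depth d, samples a small 6-DOF camera trajectory, warps one sharp view by the plane-induced homography at each pose, and averages the warped frames into a blurry frame. All names and the trajectory are illustrative.

```python
import numpy as np
import cv2  # pip install opencv-python

def small_rotation(rx, ry, rz):
    """First-order rotation matrix for small angles (radians)."""
    return np.array([[1.0, -rz,  ry],
                     [ rz, 1.0, -rx],
                     [-ry,  rx, 1.0]])

def synthesize_blur(sharp, K, poses, depth):
    """Average plane-induced homography warps along a 6-DOF trajectory.

    sharp: HxWx3 float32 image, K: 3x3 intrinsics,
    poses: list of (R, t) camera poses, depth: plane depth in metres.
    """
    n = np.array([0.0, 0.0, 1.0])            # fronto-parallel plane normal
    acc = np.zeros_like(sharp)
    for R, t in poses:
        # Homography induced by the plane at 'depth': K (R + t n^T / d) K^-1
        H = K @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K)
        acc += cv2.warpPerspective(sharp, H, sharp.shape[1::-1])
    return acc / len(poses)

# Hypothetical usage: a short shake mixing rotation and translation.
img = np.random.rand(256, 256, 3).astype(np.float32)  # stand-in sub-aperture view
K = np.array([[400.0, 0, 128], [0, 400, 128], [0, 0, 1]])
poses = [(small_rotation(0.0, 0.002 * s, 0.0), np.array([0.01 * s, 0.0, 0.0]))
         for s in np.linspace(-1, 1, 15)]
blurry = synthesize_blur(img, K, poses, depth=2.0)
```

The paper's actual model also varies the blur per sub-aperture view and per scene depth; this single-plane approximation only conveys why 6-DOF motion produces spatially varying blur.
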
Citations
Journal ArticleDOI
TL;DR: A comprehensive and timely survey of recently published deep-learning-based image deblurring approaches, in which the authors discuss common causes of image blur, introduce benchmark datasets and performance metrics, and summarize different problem formulations.
Abstract: Image deblurring is a classic problem in low-level computer vision with the aim to recover a sharp image from a blurred input image. Advances in deep learning have led to significant progress in solving this problem, and a large number of deblurring networks have been proposed. This paper presents a comprehensive and timely survey of recently published deep-learning based image deblurring approaches, aiming to serve the community as a useful literature review. We start by discussing common causes of image blur, introduce benchmark datasets and performance metrics, and summarize different problem formulations. Next, we present a taxonomy of methods using convolutional neural networks (CNN) based on architecture, loss function, and application, offering a detailed review and comparison. In addition, we discuss some domain-specific deblurring applications including face images, text, and stereo image pairs. We conclude by discussing key challenges and future research directions.

65 citations

Journal ArticleDOI
01 Nov 2022
TL;DR: A high-quality and challenging urban scene dataset, containing 1074 samples composed of real-world and synthetic light field images as well as pixel-wise annotations for 14 semantic classes, is proposed, believed to be the largest and the most diverse light field dataset for semantic segmentation.
Abstract: As one of the fundamental technologies for scene understanding, semantic segmentation has been widely explored in the last few years. Light field cameras encode the geometric information by simultaneously recording the spatial information and angular information of light rays, which provides us with a new way to solve this issue. In this paper, we propose a high-quality and challenging urban scene dataset, containing 1074 samples composed of real-world and synthetic light field images as well as pixel-wise annotations for 14 semantic classes. To the best of our knowledge, it is the largest and the most diverse light field dataset for semantic segmentation. We further design two new semantic segmentation baselines tailored for light field and compare them with state-of-the-art RGB, video and RGB-D-based methods using the proposed dataset. The outperforming results of our baselines demonstrate the advantages of the geometric information in light field for this task. We also provide evaluations of super-resolution and depth estimation methods, showing that the proposed dataset presents new challenges and supports detailed comparisons among different methods. We expect this work to inspire new research directions and stimulate scientific progress in related fields. The complete dataset is available at https://github.com/HAWKEYE-Group/UrbanLF.

37 citations

Journal ArticleDOI
TL;DR: A deep neural network L3Fnet is proposed for Low-Light Light Field (L3F) restoration, which not only performs visual enhancement of each LF view but also preserves the epipolar geometry across views and can be used for low-light enhancement of single-frame images, despite it being engineered for LF data.
Abstract: Light Field (LF) offers unique advantages such as post-capture refocusing and depth estimation, but low-light conditions limit these capabilities. To restore low-light LFs, we should harness the geometric cues present in different LF views, which is not possible using single-frame low-light enhancement techniques. We, therefore, propose a deep neural network for Low-Light Light Field (L3F) restoration, which we refer to as L3Fnet. The proposed L3Fnet not only performs the necessary visual enhancement of each LF view but also preserves the epipolar geometry across views. We achieve this by adopting a two-stage architecture for L3Fnet. Stage-I looks at all the LF views to encode the LF geometry. This encoded information is then used in Stage-II to reconstruct each LF view. To facilitate learning-based techniques for low-light LF imaging, we collected a comprehensive LF dataset of various scenes. For each scene, we captured four LFs, one with near-optimal exposure and ISO settings and the others at different levels of low-light conditions varying from low to extreme low-light settings. The effectiveness of the proposed L3Fnet is supported by both visual and numerical comparisons on this dataset. To further analyze the performance of low-light reconstruction methods, we also propose an L3F-wild dataset that contains LFs captured late at night with almost zero lux values. No ground truth is available in this dataset. To perform well on the L3F-wild dataset, any method must adapt to the light level of the captured scene. To do this we propose a novel pre-processing block that makes L3Fnet robust to various degrees of low-light conditions. Lastly, we show that L3Fnet can also be used for low-light enhancement of single-frame images, despite being engineered for LF data. We do so by converting the single-frame DSLR image into a form suitable for L3Fnet, which we call a pseudo-LF.
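
A minimal PyTorch sketch of the two-stage idea described above, not the published L3Fnet architecture: Stage-I sees all views jointly to produce a shared geometry encoding, and Stage-II restores each view conditioned on it. Layer counts, widths, and names are placeholders.

```python
import torch
import torch.nn as nn

class TwoStageLFRestorer(nn.Module):
    """Hypothetical two-stage restorer in the spirit of L3Fnet's description."""
    def __init__(self, n_views, feat=32):
        super().__init__()
        # Stage-I: encode LF geometry from all views stacked along channels.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3 * n_views, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Stage-II: restore one view given that view plus the shared encoding.
        self.stage2 = nn.Sequential(
            nn.Conv2d(3 + feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    def forward(self, lf):
        # lf: (B, V, 3, H, W) low-light light field.
        b, v, c, h, w = lf.shape
        code = self.stage1(lf.reshape(b, v * c, h, w))   # shared geometry code
        views = [self.stage2(torch.cat([lf[:, i], code], dim=1)) for i in range(v)]
        return torch.stack(views, dim=1)                 # (B, V, 3, H, W)

restored = TwoStageLFRestorer(n_views=25)(torch.rand(1, 25, 3, 64, 64))
```
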

26 citations


Cites background from "Fast and Full-Resolution Light Fiel..."

  • ...This includes tasks such as spatial super-resolution [24]–[27], deblurring [28]–[30], denoising [31]–[35], and depth estimation [7]–[9]....

Journal ArticleDOI
TL;DR: Wang et al. propose a multitask learning mechanism for remote sensing image motion deblurring that contains an image restoration subtask and an image texture complexity recognition subtask; the recognition branch provides auxiliary texture-complexity information to help optimize the restoration branch.
Abstract: As a fundamental preprocessing technique, remote sensing image motion deblurring is important for visual understanding tasks. Most conventional approaches formulate the image motion deblurring task as a kernel estimation. Because kernel estimation is a highly ill-posed problem, many priors have been applied to model the images and kernels. Even though these methods have obtained relatively good performance, they are usually time-consuming and not robust across different conditions. To address this problem, we propose a multitask learning mechanism for remote sensing image motion deblurring in this article, which contains an image restoration subtask and an image texture complexity recognition subtask. First, we consider the image motion deblurring problem as a domain transformation problem, from the blurred domain to a clear one. Specifically, the blurred domain represents the data space consisting of blurred images, and the clear domain is defined similarly. Second, we design a novel weighted attention map loss to enhance the reconstruction capability of the restoration subbranch for difficult local regions. Third, based on the restoration subbranch, a recognition subbranch is incorporated into the framework to guide the deblurring process, providing auxiliary texture complexity information to help optimize the restoration subbranch. Additionally, in order to optimize the proposed network, we construct three large-scale datasets, where each sample contains a clear image, a blurred image, and a texture label derived from the corresponding texture complexity. Finally, experimental results on the three constructed datasets demonstrate the robustness and effectiveness of the proposed method.
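
The abstract does not give the exact form of the weighted attention map loss, so the following PyTorch sketch is one plausible instantiation of the stated goal, upweighting reconstruction error in detail-rich local regions; the weight-map construction is an assumption.

```python
import torch
import torch.nn.functional as F

def weighted_attention_l1(pred, gt, alpha=2.0):
    """L1 loss with extra weight on detail-rich regions (hypothetical form).

    pred, gt: (B, C, H, W). The weight map uses local high-frequency
    energy of the ground truth as a proxy for 'difficult' regions.
    """
    detail = (gt - F.avg_pool2d(gt, kernel_size=5, stride=1, padding=2)).abs()
    w = 1.0 + alpha * detail / (detail.mean() + 1e-8)      # attention map
    return (w * (pred - gt).abs()).mean()

loss = weighted_attention_l1(torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64))
```
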

7 citations

Posted Content
TL;DR: Zhang et al. introduce a large-scale dataset that enables versatile applications for RGB, RGB-D and light field saliency detection, containing 102 classes and 4204 samples.
Abstract: Light field data exhibit favorable characteristics conducive to saliency detection. The success of learning-based light field saliency detection is heavily dependent on how a comprehensive dataset can be constructed for higher generalizability of models, how high-dimensional light field data can be effectively exploited, and how a flexible model can be designed to achieve versatility for desktop computers and mobile devices. To answer these questions, first we introduce a large-scale dataset to enable versatile applications for RGB, RGB-D and light field saliency detection, containing 102 classes and 4204 samples. Second, we present an asymmetrical two-stream model consisting of the Focal stream and RGB stream. The Focal stream is designed to achieve higher performance on desktop computers and transfer focusness knowledge to the RGB stream, relying on two tailor-made modules. The RGB stream guarantees the flexibility and memory/computation efficiency on mobile devices through three distillation schemes. Experiments demonstrate that our Focal stream achieves state-of-the-art performance. The RGB stream achieves Top-2 F-measure on DUTLF-V2 while reducing the model size by 83% and increasing FPS by 5 times compared with the best-performing method. Furthermore, our proposed distillation schemes are applicable to RGB saliency models, achieving impressive performance gains while ensuring flexibility.
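
The three distillation schemes are not detailed in the abstract; this PyTorch sketch only illustrates the general mechanism of distilling a focal-stream teacher into an RGB-stream student at the feature and output levels. All tensors, names, and weights are placeholders.

```python
import torch
import torch.nn.functional as F

def distill_losses(stu_feat, tea_feat, stu_logits, tea_logits, gt_mask,
                   w_feat=0.5, w_soft=0.5):
    """Generic two-level distillation (illustrative, not the paper's schemes)."""
    task = F.binary_cross_entropy_with_logits(stu_logits, gt_mask)  # GT supervision
    feat = F.mse_loss(stu_feat, tea_feat.detach())                  # feature mimicry
    soft = F.binary_cross_entropy_with_logits(
        stu_logits, torch.sigmoid(tea_logits).detach())             # soft targets
    return task + w_feat * feat + w_soft * soft

loss = distill_losses(torch.rand(2, 64, 32, 32), torch.rand(2, 64, 32, 32),
                      torch.randn(2, 1, 32, 32), torch.randn(2, 1, 32, 32),
                      torch.rand(2, 1, 32, 32))
```
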

5 citations

References
Proceedings ArticleDOI
02 Nov 2016
TL;DR: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.
Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
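
The dataflow-graph model is easy to see in modern TensorFlow, where tf.function traces Python code into a graph that the runtime can optimize and place across devices; the paper predates this API, so the snippet below is only a present-day illustration.

```python
import tensorflow as tf

@tf.function  # traces the Python body into a TensorFlow dataflow graph
def dense_layer(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

x = tf.random.normal([4, 3])
w = tf.Variable(tf.random.normal([3, 2]))  # shared, mutable state
b = tf.Variable(tf.zeros([2]))

y = dense_layer(x, w, b)

# Inspect the traced graph: nodes are operations, edges are tensors.
graph = dense_layer.get_concrete_function(x, w, b).graph
print([op.name for op in graph.get_operations()])
```
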

10,913 citations

Posted Content
TL;DR: This work presents a highly accurate single-image super-resolution (SR) method that uses a very deep convolutional network inspired by the VGG-net used for ImageNet classification, with extremely high learning rates enabled by adjustable gradient clipping.
Abstract: We present a highly accurate single-image super-resolution (SR) method. Our method uses a very deep convolutional network inspired by VGG-net used for ImageNet classification (Simonyan and Zisserman, 2015). We find increasing our network depth shows a significant improvement in accuracy. Our final model uses 20 weight layers. By cascading small filters many times in a deep network structure, contextual information over large image regions is exploited in an efficient way. With very deep networks, however, convergence speed becomes a critical issue during training. We propose a simple yet effective training procedure. We learn residuals only and use extremely high learning rates (10^4 times higher than SRCNN; Dong et al., 2015) enabled by adjustable gradient clipping. Our proposed method performs better than existing methods in accuracy, and visual improvements in our results are easily noticeable.
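
Two ingredients above, residual-only learning and adjustable gradient clipping (clipping gradients to a range scaled by the current learning rate), can be sketched in PyTorch. Depth, theta, and optimizer settings below are placeholders; the paper's model is 20 layers deep.

```python
import torch
import torch.nn as nn

class ResidualSR(nn.Module):
    """VDSR-style net: predict the residual and add it to the input."""
    def __init__(self, depth=8, feat=64):    # the paper uses 20 weight layers
        super().__init__()
        layers = [nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(feat, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)              # residual learning

model = ResidualSR()
lr, theta = 0.1, 0.01                        # high LR; theta is the clipping budget
opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)

x, y = torch.rand(4, 1, 41, 41), torch.rand(4, 1, 41, 41)
loss = nn.functional.mse_loss(model(x), y)
opt.zero_grad()
loss.backward()
# Adjustable gradient clipping: clip to [-theta/lr, theta/lr] so the
# effective parameter update stays bounded even at very high learning rates.
torch.nn.utils.clip_grad_value_(model.parameters(), theta / lr)
opt.step()
```
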

3,628 citations


"Fast and Full-Resolution Light Fiel..." refers methods in this paper

  • ...We draw on a previous study [9] by predicting and adding the residual image with the input to produce the deblurred result....

Posted Content
TL;DR: A small change in the stylization architecture results in a significant qualitative improvement in the generated images, and can be used to train high-performance architectures for real-time image generation.
Abstract: In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and applying the latter at both training and testing time. The resulting method can be used to train high-performance architectures for real-time image generation. The code is made available on GitHub at this https URL. The full paper can be found at arXiv:1701.02096.
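
The described change is a one-line swap in most frameworks. A PyTorch sketch (the framework choice is mine; the final Tanh head without normalization mirrors the citing paper's description quoted below):

```python
import torch.nn as nn

def conv_block(cin, cout, norm="instance"):
    """Conv + normalization + ReLU; 'instance' vs 'batch' is the only change."""
    Norm = nn.InstanceNorm2d if norm == "instance" else nn.BatchNorm2d
    # InstanceNorm2d normalizes each sample's feature maps independently and,
    # by default, behaves identically at training and test time.
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         Norm(cout),
                         nn.ReLU(inplace=True))

# Final layer without normalization, with Tanh activation.
head = nn.Sequential(nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
```
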

3,118 citations


"Fast and Full-Resolution Light Fiel..." refers methods in this paper

  • ...The residual blocks follow the traditional ResNet model [7] and simple modification is made by applying instance normalization [26] instead of batch normalization to normalize the features’ contrasts....

  • ...This strategy is supported by applying instance normalization [26] and ReLU activation on every 2D convolution layer, except for the last layer 2D Conv4, which utilizes Tanh activation without any normalization....

Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work proposes a multi-scale convolutional neural network that restores sharp images in an end-to-end manner where blur is caused by various sources and presents a new large-scale dataset that provides pairs of realistic blurry image and the corresponding ground truth sharp image that are obtained by a high-speed camera.
Abstract: Non-uniform blind deblurring for general dynamic scenes is a challenging computer vision problem, as blurs arise not only from multiple object motions but also from camera shake and scene depth variation. To remove these complicated motion blurs, conventional energy-optimization-based methods rely on simple assumptions, such as the blur kernel being partially uniform or locally linear. Moreover, recent machine-learning-based methods also depend on synthetic blur datasets generated under these assumptions. This makes conventional deblurring methods fail to remove blurs where the blur kernel is difficult to approximate or parameterize (e.g. object motion boundaries). In this work, we propose a multi-scale convolutional neural network that restores sharp images in an end-to-end manner where blur is caused by various sources. Together with it, we present a multi-scale loss function that mimics conventional coarse-to-fine approaches. Furthermore, we propose a new large-scale dataset that provides pairs of realistic blurry images and the corresponding ground-truth sharp images obtained by a high-speed camera. With the proposed model trained on this dataset, we demonstrate empirically that our method achieves state-of-the-art performance in dynamic scene deblurring, not only qualitatively but also quantitatively.
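
The multi-scale loss mimicking coarse-to-fine deblurring can be sketched as follows, assuming the network emits a prediction at each scale, coarsest first; the equal scale weighting and MSE choice here are placeholders.

```python
import torch
import torch.nn.functional as F

def multiscale_loss(preds, sharp):
    """Average of per-scale MSE terms against downsampled ground truth.

    preds: list of (B, C, Hi, Wi) outputs, coarsest first.
    sharp: (B, C, H, W) full-resolution ground-truth sharp image.
    """
    total = 0.0
    for p in preds:
        gt = F.interpolate(sharp, size=p.shape[-2:], mode="bilinear",
                           align_corners=False)
        total = total + F.mse_loss(p, gt)
    return total / len(preds)

gt = torch.rand(1, 3, 256, 256)
preds = [torch.rand(1, 3, s, s) for s in (64, 128, 256)]
loss = multiscale_loss(preds, gt)
```
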

1,560 citations


"Fast and Full-Resolution Light Fiel..." refers background or methods in this paper

  • ...Network Architecture We use the combination of convolution-deconvolutional and residual styles on the network, which is proven to produce satisfying results on image deblurring [10, 17]....

  • ...Recent blur datasets are only available for 2D or 3D image (video) deblurring [10, 17, 24]....

  • ...Earlier works have achieved state of the art performance for deblurring 2D images and 3D videos [10, 17, 24]....

  • ...These recent works apply image deblurring directly without blur kernel estimation [10, 12, 17, 24]....

Proceedings ArticleDOI
29 Jul 2007
TL;DR: A simple modification to a conventional camera is proposed to insert a patterned occluder within the aperture of the camera lens, creating a coded aperture, and introduces a criterion for depth discriminability which is used to design the preferred aperture pattern.
Abstract: A conventional camera captures blurred versions of scene information away from the plane of focus. Camera systems have been proposed that allow for recording all-focus images, or for extracting depth, but to record both simultaneously has required more extensive hardware and reduced spatial resolution. We propose a simple modification to a conventional camera that allows for the simultaneous recovery of both (a) high resolution image information and (b) depth information adequate for semi-automatic extraction of a layered depth representation of the image. Our modification is to insert a patterned occluder within the aperture of the camera lens, creating a coded aperture. We introduce a criterion for depth discriminability which we use to design the preferred aperture pattern. Using a statistical model of images, we can recover both depth information and an all-focus image from single photographs taken with the modified camera. A layered depth map is then extracted, requiring user-drawn strokes to clarify layer assignments in some cases. The resulting sharp image and layered depth map can be combined for various photographic applications, including automatic scene segmentation, post-exposure refocusing, or re-rendering of the scene from an alternate viewpoint.
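
The key physical fact, that an out-of-focus point spreads into a scaled copy of the aperture pattern, is easy to simulate. A NumPy/SciPy sketch follows; the random binary pattern and nearest-neighbor scaling are illustrative, whereas the paper designs its pattern with a depth-discriminability criterion.

```python
import numpy as np
from scipy.signal import fftconvolve

def scaled_psf(pattern, radius):
    """Resize a binary aperture pattern to a (2r+1)^2 PSF and normalize it."""
    size = 2 * radius + 1
    idx = (np.arange(size) * pattern.shape[0] / size).astype(int)
    psf = pattern[np.ix_(idx, idx)].astype(np.float64)
    return psf / psf.sum()

def defocus(img, pattern, radius):
    """Blur a (H, W) image as if its plane were 'radius' pixels out of focus."""
    if radius == 0:
        return img
    return fftconvolve(img, scaled_psf(pattern, radius), mode="same")

# Toy 7x7 coded pattern (a real design would optimize depth discriminability).
pattern = (np.random.rand(7, 7) > 0.5).astype(np.float64)
pattern[3, 3] = 1.0                              # keep the pattern non-empty
img = np.random.rand(128, 128)
blurred_near = defocus(img, pattern, radius=2)   # slightly defocused plane
blurred_far = defocus(img, pattern, radius=6)    # strongly defocused plane
```

Since the PSF at each depth is a distinct scaled copy of the code, depth can be estimated by testing which scale best explains the observed local blur, which is the inverse problem the paper solves with a statistical image model.
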

1,489 citations


"Fast and Full-Resolution Light Fiel..." refers background in this paper

  • ...[15] introduced a sparse derivative prior that concentrates on the derivatives of low intensity pixels....
