
Showing papers on "High-dynamic-range imaging published in 2022"


Journal ArticleDOI
TL;DR: The proposed HDR imaging approach aggregates the information from multiple LDR images with guidance from the image gradient domain, generating artifact-free images by integrating the image gradient information and the image context information in the pixel domain.

18 citations


Journal ArticleDOI
TL;DR: This paper introduces the first approach (to the best of the authors' knowledge) to the reconstruction of high-resolution, high-dynamic-range color images from raw photographic bursts captured by a handheld camera with exposure bracketing.
Abstract: Photographs captured by smartphones and mid-range cameras have limited spatial resolution and dynamic range, with noisy response in underexposed regions and color artefacts in saturated areas. This paper introduces the first approach (to the best of our knowledge) to the reconstruction of high-resolution, high-dynamic-range color images from raw photographic bursts captured by a handheld camera with exposure bracketing. This method uses a physically-accurate model of image formation to combine an iterative optimization algorithm for solving the corresponding inverse problem with a learned image representation for robust alignment and a learned natural image prior. The proposed algorithm is fast, with low memory requirements compared to state-of-the-art learning-based approaches to image restoration, and its features are learned end to end from synthetic yet realistic data. Extensive experiments demonstrate its excellent performance with super-resolution factors of up to ×4 on real photographs taken in the wild with hand-held cameras, and high robustness to low-light conditions, noise, camera shake, and moderate object motion.
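To fix ideas, here is a minimal sketch of the kind of physically-motivated forward model such a method inverts; the grayscale assumption, box-filter decimation, and noise constants are illustrative choices, not the paper's.

```python
import numpy as np

def forward_model(hdr, exposure, downsample=2, read_noise=1e-3, rng=None):
    """Render one raw burst frame from a latent grayscale HDR image:
    expose, blur/decimate, add shot and read noise, then clip (saturate)."""
    rng = np.random.default_rng(0) if rng is None else rng
    exposed = hdr * exposure
    h, w = exposed.shape
    h, w = h - h % downsample, w - w % downsample
    # Box-filter decimation stands in for the camera's blur + sampling.
    lowres = exposed[:h, :w].reshape(h // downsample, downsample,
                                     w // downsample, downsample).mean(axis=(1, 3))
    photons = rng.poisson(np.maximum(lowres, 0.0) * 1e3) / 1e3   # shot noise
    noisy = photons + rng.normal(0.0, read_noise, lowres.shape)  # read noise
    return np.clip(noisy, 0.0, 1.0)                              # saturation
```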

7 citations


Proceedings ArticleDOI
23 May 2022
TL;DR: In this article, an efficient multi-exposure fusion (MEF) approach with a simple yet effective weight extraction method relying on principal component analysis, adaptive well-exposedness and saliency maps was proposed.
Abstract: High dynamic range (HDR) imaging makes it possible to capture natural scenes close to the way they are perceived by human observers. With regular low dynamic range (LDR) capture/display devices, significant details may not be preserved in images due to the huge dynamic range of natural scenes. To minimize the information loss and produce high quality HDR-like images for LDR screens, this study proposes an efficient multi-exposure fusion (MEF) approach with a simple yet effective weight extraction method relying on principal component analysis, adaptive well-exposedness and saliency maps. These weight maps are later refined through a guided filter and the fusion is carried out by employing a pyramidal decomposition. Experimental comparisons with existing techniques demonstrate that the proposed method produces very strong statistical and visual results.
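For orientation, the sketch below implements the pyramidal-decomposition fusion step, substituting a plain well-exposedness weight for the paper's PCA/saliency weights and omitting the guided-filter refinement; it assumes grayscale float images in [0, 1] with dimensions divisible by 2**levels.

```python
import cv2
import numpy as np

def well_exposedness(img, sigma=0.2):
    """Gaussian weight favouring mid-tone pixels (img is float in [0,1])."""
    return np.exp(-((img - 0.5) ** 2) / (2 * sigma ** 2))

def gauss_pyr(img, levels):
    pyr = [img]
    for _ in range(levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def lap_pyr(img, levels):
    g = gauss_pyr(img, levels)
    return [g[i] - cv2.pyrUp(g[i + 1]) for i in range(levels)] + [g[-1]]

def fuse_exposures(images, levels=4):
    """Blend grayscale exposures with Laplacian-pyramid fusion."""
    ws = np.stack([well_exposedness(im) for im in images])
    ws /= ws.sum(axis=0, keepdims=True) + 1e-12   # normalise weights per pixel
    w_pyrs = [gauss_pyr(w, levels) for w in ws]
    l_pyrs = [lap_pyr(im, levels) for im in images]
    blended = [sum(wp[l] * lp[l] for wp, lp in zip(w_pyrs, l_pyrs))
               for l in range(levels + 1)]
    out = blended[-1]                      # coarsest level
    for l in range(levels - 1, -1, -1):    # collapse the pyramid
        out = cv2.pyrUp(out) + blended[l]
    return np.clip(out, 0.0, 1.0)
```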

5 citations


Journal ArticleDOI
TL;DR: Zhang et al. propose an effective weight map extraction framework that relies on principal component analysis, adaptive well-exposedness, and saliency maps; a blended output image is obtained via pyramidal decomposition.

5 citations


Journal ArticleDOI
TL;DR: An end-to-end deformable HDR imaging network, called DHDRNet, is proposed, which attempts to alleviate ghosting and saturation artifacts by building an effective aligning module and adopting self-guided attention, and which is robust to challenging scenes with large-scale motions and severe saturation.
Abstract: Two key challenges exist in high dynamic range (HDR) imaging from multi-exposure low dynamic range (LDR) images for dynamic scenes: 1) aligning the input images with large-scale foreground motions and 2) recovering large saturated regions from a limited number of input LDR images. To tackle these challenges, several deep convolutional neural networks have been proposed that have made significant progress. However, these methods tend to suffer from ghosting and saturation artifacts when applied to some challenging scenes. In this article, we propose an end-to-end deformable HDR imaging network, called DHDRNet, which attempts to alleviate these problems by building an effective aligning module and adopting self-guided attention. First, we analyze the alignment process in the HDR imaging task and correspondingly design a pyramidal deformable module (PDM) that aligns LDR images at multiple scales and reconstructs aligned features in a coarse-to-fine manner. In this way, the proposed DHDRNet can handle large-scale complex motions and suppress ghosting artifacts caused by misalignments. Moreover, we adopt self-guided attention to reduce the influence of saturated regions during the aligning and merging processes, which helps suppress artifacts and retain fine details in the final HDR image. Extensive qualitative and quantitative comparisons demonstrate that the proposed model outperforms the existing state-of-the-art methods and that it is robust to challenging scenes with large-scale motions and severe saturation. The source code is available at: https://github.com/Tx000/DHDRNet.

4 citations


Journal ArticleDOI
01 Jan 2022
TL;DR: The authors propose APNT-Fusion, an attention-guided progressive neural texture fusion HDR restoration model that aims to address content association ambiguities caused by saturation and motion, and the various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur.
Abstract: High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms. Despite recent hardware and algorithmic innovations, challenges remain over content association ambiguities caused by saturation, motion, and various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur. In this work, we propose an Attention-guided Progressive Neural Texture Fusion (APNT-Fusion) HDR restoration model which aims to address these issues within one framework. An efficient two-stream structure is proposed which separately focuses on texture feature transfer over saturated regions and multi-exposure tonal and texture feature fusion. A neural feature transfer mechanism is proposed which establishes spatial correspondence between different exposures based on multi-scale VGG features in the masked saturated HDR domain for discriminative contextual clues over the ambiguous image areas. A progressive texture blending module is designed to blend the encoded two-stream features in a multi-scale and progressive manner. In addition, we introduce several novel attention mechanisms: the motion attention module detects and suppresses the content discrepancies among the reference images; the saturation attention module facilitates differentiating the misalignment caused by saturation from that caused by motion; and the scale attention module ensures texture blending consistency between different encoder/decoder scales. We carry out comprehensive qualitative and quantitative evaluations and ablation studies, which validate that these novel modules work coherently under the same framework and outperform state-of-the-art methods.

4 citations


Journal ArticleDOI
TL;DR: In this article, a spatially varying convolution (SVC) is designed to process the Bayer images carried with varying exposures, and an exposure-guidance method is proposed against the interference from over- and under-exposed pixels.
Abstract: Spatially varying exposure (SVE) is a promising choice for high-dynamic-range (HDR) imaging (HDRI). The SVE-based HDRI, which is called single-shot HDRI, is an efficient solution to avoid ghosting artifacts. However, it is very challenging to restore a full-resolution HDR image from a real-world image with SVE because: a) only one-third of pixels with varying exposures are captured by the camera in a Bayer pattern, b) some of the captured pixels are over- and under-exposed. For the former challenge, a spatially varying convolution (SVC) is designed to process the Bayer images carried with varying exposures. For the latter, an exposure-guidance method is proposed against the interference from over- and under-exposed pixels. Finally, a joint demosaicing and HDRI deep learning framework is formalized to include the two novel components and to realize an end-to-end single-shot HDRI. Experiments indicate that the proposed end-to-end framework avoids the problem of cumulative errors and surpasses the related state-of-the-art methods.
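To make the SVE capture model concrete, the following sketch simulates a Bayer measurement with spatially varying exposure from a linear RGB image; the RGGB layout and the row-pair exposure tiling are assumptions for illustration, not necessarily the paper's sensor configuration.

```python
import numpy as np

def simulate_sve_bayer(rgb_linear, exposures=(1.0, 4.0)):
    """Mosaic a linear RGB image into an RGGB Bayer pattern, then apply a
    row-pair-interleaved spatially varying exposure with clipping, so only
    one colour channel (one-third of the data) survives at each pixel."""
    h, w, _ = rgb_linear.shape
    bayer = np.empty((h, w), dtype=np.float32)
    bayer[0::2, 0::2] = rgb_linear[0::2, 0::2, 0]  # R
    bayer[0::2, 1::2] = rgb_linear[0::2, 1::2, 1]  # G
    bayer[1::2, 0::2] = rgb_linear[1::2, 0::2, 1]  # G
    bayer[1::2, 1::2] = rgb_linear[1::2, 1::2, 2]  # B
    # Alternate the exposure every two rows (one full Bayer period).
    short_rows = (np.arange(h) // 2) % 2 == 0
    gain = np.where(short_rows[:, None], exposures[0], exposures[1])
    return np.clip(bayer * gain, 0.0, 1.0), gain  # gain records each pixel's exposure
```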

3 citations


Proceedings ArticleDOI
01 Jun 2022
TL;DR: In this article, a gamma-enhanced spatial attention network (GSANet) is proposed for reconstructing HDR images from one or multiple input Low Dynamic Range (LDR) images.
Abstract: High dynamic range (HDR) imaging is the task of recovering an HDR image from one or multiple input Low Dynamic Range (LDR) images. In this paper, we present the Gamma-enhanced Spatial Attention Network (GSANet), a novel framework for reconstructing HDR images. This problem comprises two intractable challenges: how to tackle overexposed and underexposed regions, and how to overcome the paradox of the performance and complexity trade-off. To address the former, after applying gamma correction on the LDR images, we adopt a spatial attention module to adaptively select the most appropriate regions of the various-exposure low dynamic range images for fusion. For the latter, we propose an efficient channel attention module, which only involves a handful of parameters while bringing a clear performance gain. Experimental results show that the proposed method achieves better visual quality on the HDR dataset. The code will be available at: https://github.com/fancyicookie/GSANet
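In this setting, "applying gamma correction" usually means mapping each LDR frame into the linear HDR domain before fusion; a minimal sketch follows, with gamma = 2.2 as a conventional assumption rather than the paper's exact preprocessing.

```python
import numpy as np

def ldr_to_hdr_domain(ldr, exposure_time, gamma=2.2):
    """Undo the display gamma and normalise by exposure time, mapping an
    LDR frame in [0,1] into the linear HDR domain."""
    return np.power(ldr, gamma) / exposure_time

def network_input(ldr, exposure_time):
    """A common convention in multi-exposure fusion networks: concatenate
    each LDR frame with its HDR-domain version channel-wise, so attention
    can reason in both domains."""
    return np.concatenate([ldr, ldr_to_hdr_domain(ldr, exposure_time)], axis=-1)
```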

2 citations


Book ChapterDOI
01 Jan 2022
TL;DR: In this article, an additional linear polarizer is mounted in front of one of the two cameras, raising the degree of polarization of rays captured by the sensor, which leads to a larger attenuation range between channels regardless of the scene lighting conditions.
Abstract: High Dynamic Range (HDR) imaging techniques aim to increase the range of luminance values captured from a scene. The literature counts many approaches to get HDR images out of low-range camera sensors; however, most of them rely on multiple acquisitions, producing ghosting effects when moving objects are present. In this paper we propose a novel HDR reconstruction method exploiting stereo Polarimetric Filter Array (PFA) cameras to simultaneously capture the scene with different polarized filters, producing intensity attenuations that can be related to the light polarization state. An additional linear polarizer is mounted in front of one of the two cameras, raising the degree of polarization of rays captured by the sensor. This leads to a larger attenuation range between channels regardless of the scene lighting conditions. By merging the data acquired by the two cameras, we can compute the actual light attenuation observed by a pixel at each channel and derive an equivalent exposure time, producing an HDR picture from a single polarimetric shot. The proposed technique achieves results comparable to classic HDR approaches using multiple exposures, with the advantage of being a one-shot method.
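The equivalent-exposure idea can be sketched from Malus' law: for partially polarized light, each polarization channel sees a known attenuation, which acts like a shortened exposure time. The function names and the four PFA angles below are standard but illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def channel_attenuations(dop, aolp, angles_deg=(0, 45, 90, 135)):
    """Malus' law for partially polarized light: the fraction of intensity
    passing each PFA micro-polarizer, given the degree (dop) and angle
    (aolp, radians) of linear polarization at that pixel."""
    angles = np.deg2rad(np.asarray(angles_deg, dtype=np.float64))
    return 0.5 * (1.0 + dop * np.cos(2.0 * (angles - aolp)))

def equivalent_exposures(t_shot, dop, aolp):
    """Each polarization channel behaves like a capture with a shorter
    'equivalent' exposure time, after which standard multi-exposure HDR
    merging applies."""
    return t_shot * channel_attenuations(dop, aolp)
```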

2 citations


Journal ArticleDOI
TL;DR: In this article, a set of high quality reference images was developed by rendering image features in terms of contrast, sharpness, and colorfulness, to achieve good rendering for each image.
Abstract: With the advancement of imaging technology, high dynamic range (HDR) images can now be captured and displayed to produce realistic effects. Tone mapping operators (TMOs) are used to map HDR radiance to the displayable range. A reliable TMO would play a significant role in the accurate reproduction of HDR scenes. The present study aimed to establish an image quality metric based on external references to evaluate various TMOs. Two psychophysical experiments were conducted to develop reference images and to investigate the performance of TMOs. In experiment 1, a set of high quality reference images was developed by rendering image features in terms of contrast, sharpness, and colorfulness, to achieve good rendering for each image. The images were used as reference in experiment 2 to evaluate the performance of 14 TMOs using a six-point categorical judgment method. The TMOs were evaluated using four scales, i.e., contrast, sharpness, colorfulness, and overall performance. The hierarchical relationship among TMOs was established. The results were further compared with previous studies, and high correlation was found between the current experiments and previous studies.

2 citations



Proceedings ArticleDOI
01 Jun 2022
TL;DR: In this article, the authors proposed a multi-bracket HDR pipeline combining a standard camera with an event camera, which shows better overall robustness when using events, with improvements in PSNR by up to 5dB on synthetic data and up to 0.7dB on real-world data.
Abstract: Modern high dynamic range (HDR) imaging pipelines align and fuse multiple low dynamic range (LDR) images captured at different exposure times. While these methods work well in static scenes, dynamic scenes remain a challenge since the LDR images still suffer from saturation and noise. In such scenarios, event cameras would be a valid complement, thanks to their higher temporal resolution and dynamic range. In this paper, we propose the first multi-bracket HDR pipeline combining a standard camera with an event camera. Our results show better overall robustness when using events, with improvements in PSNR by up to 5dB on synthetic data and up to 0.7dB on real-world data. We also introduce a new dataset containing bracketed LDR images with aligned events and HDR ground truth.

Journal ArticleDOI
Bozhi Liu
01 Mar 2022
TL;DR: In this paper, a region-adaptive self-supervised deep learning (RASSDL) technique for high dynamic range (HDR) image tone mapping is presented.
Abstract: This paper presents a region-adaptive self-supervised deep learning (RASSDL) technique for high dynamic range (HDR) image tone mapping. The RASSDL tone mapping operator (TMO) is a convolutional neural network (CNN) trained on local image regions that can seamlessly tone map images of arbitrary sizes. The training of RASSDL TMO is through the design of a self-supervising target that automatically adapts to the local image regions based on their information contents. The self-supervising target is designed to ensure the tone-mapped output achieves a balance between preserving the relative contrast of the original scene and the visibilities of the fine details to achieve faithful reproduction of the HDR scene. Distinguishing from many existing TMOs that require manual tuning of parameters, RASSDL is parameter-free and completely automatic. Experimental results demonstrate that RASSDL TMO can achieve state-of-the-art performance in terms of preserving overall contrasts, revealing fine details, and being free from visual artifacts.

Journal ArticleDOI
TL;DR: In this paper, a spatially-adaptive normalization was proposed to restore the high frequency component by applying different normalization parameters to each element in the feature map according to the characteristics of the input image.
Abstract: Deep convolutional neural networks (CNNs) have recently made significant advances in the inverse tone mapping technique, which generates a high dynamic range (HDR) image from a single low dynamic range (LDR) image that has lost information in over- and under-exposed regions. The end-to-end inverse tone mapping approach specifies the dynamic range in advance, thereby limiting dynamic range expansion. In contrast, the method of generating multiple exposure LDR images from a single LDR image and subsequently merging them into an HDR image enables flexible dynamic range expansion. However, existing methods for generating multiple exposure LDR images require an additional network for each exposure value to be changed or a process of recursively inferring images that have different exposure values. Therefore, the number of parameters increases significantly due to the use of additional networks, and an error accumulation problem arises due to recursive inference. To solve this problem, we propose a novel network architecture that can control arbitrary exposure values without adding networks or applying recursive inference. The training method of the auxiliary classifier-generative adversarial network structure is employed to generate the image conditioned on the specified exposure. The proposed network uses a newly designed spatially-adaptive normalization to address the limitation of existing methods that cannot sufficiently restore image detail due to the spatially equivariant nature of the convolution. Spatially-adaptive normalization facilitates restoration of the high frequency component by applying different normalization parameters to each element in the feature map according to the characteristics of the input image. Experimental results show that the proposed method outperforms state-of-the-art methods, yielding a 5.48dB higher average peak signal-to-noise ratio, a 0.05 higher average structural similarity index, a 0.28 higher average multi-scale structural similarity index, and a 7.36 higher average HDR-VDP-2 score for various datasets.
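The described layer resembles the well-known SPADE normalization; a minimal PyTorch sketch of that family is given below, with illustrative channel sizes and a single-conv predictor that are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatiallyAdaptiveNorm(nn.Module):
    """Normalize features, then modulate every spatial element with
    gamma/beta maps predicted from the input image, so the normalization
    parameters vary with local image characteristics."""
    def __init__(self, feat_ch, guide_ch=3, hidden=64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_ch, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(guide_ch, hidden, 3, padding=1), nn.ReLU(inplace=True))
        self.gamma = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_ch, 3, padding=1)

    def forward(self, feat, guide):
        # Resize the guiding LDR image to the feature map's resolution.
        guide = F.interpolate(guide, size=feat.shape[-2:],
                              mode="bilinear", align_corners=False)
        h = self.shared(guide)
        return self.norm(feat) * (1 + self.gamma(h)) + self.beta(h)
```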

Journal ArticleDOI
01 Jan 2022
TL;DR: In this paper , an attention-driven attention map is proposed to estimate the alignment and exposure uncertainties to produce high-quality HDR results, which is typically achieved by merging multiple low dynamic range images taken at different exposures.
Abstract: High dynamic range (HDR) imaging is of fundamental importance in modern digital photography pipelines and used to produce a high-quality photograph with well exposed regions despite varying illumination across the image. This is typically achieved by merging multiple low dynamic range (LDR) images taken at different exposures. However, over-exposed regions and misalignment errors due to poorly compensated motion result in artefacts such as ghosting. In this paper, we present a new HDR imaging technique that specifically models alignment and exposure uncertainties to produce high quality HDR results. We introduce a strategy that learns to jointly align and assess the alignment and exposure reliability using an HDR-aware, uncertainty-driven attention map that robustly merges the frames into a single high quality HDR image. Further, we introduce a progressive, multi-stage image fusion approach that can flexibly merge any number of LDR images in a permutation-invariant manner. Experimental results show our method can produce better quality HDR images with up to 1.1dB PSNR improvement to the state-of-the-art, and subjective improvements in terms of better detail, colours, and fewer artefacts.

Proceedings ArticleDOI
01 Jan 2022
TL;DR: In this article, well-aligned multi-exposure features are generated by reformulating the motion alignment problem as a simple brightness adjustment problem, and a coarse-to-fine merging strategy with explicit saturation compensation is proposed.
Abstract: High dynamic range (HDR) imaging is a highly challenging task since a large amount of information is lost due to the limitations of camera sensors. For HDR imaging, some methods capture multiple low dynamic range (LDR) images with altering exposures to aggregate more information. However, these approaches introduce ghosting artifacts when significant inter-frame motions are present. Moreover, although multi-exposure images are given, we have little information in severely over-exposed areas. Most existing methods focus on motion compensation, i.e., alignment of multiple LDR shots to reduce the ghosting artifacts, but they still produce unsatisfying results. These methods also rather overlook the need to restore the saturated areas. In this paper, we generate well-aligned multi-exposure features by reformulating a motion alignment problem into a simple brightness adjustment problem. In addition, we propose a coarse-to-fine merging strategy with explicit saturation compensation. The saturated areas are reconstructed with similar well-exposed content using adaptive contextual attention. We demonstrate that our method outperforms the state-of-the-art methods regarding qualitative and quantitative evaluations.
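One simple way to realise such a brightness adjustment is to re-expose each non-reference frame to the reference exposure in the linear domain; the gamma response assumed below is a conventional stand-in for a calibrated camera response, not the paper's method.

```python
import numpy as np

def match_exposure(ldr_src, t_src, t_ref, gamma=2.2):
    """Re-expose a non-reference LDR frame to the reference exposure time in
    the linear domain; after this, well-exposed content should differ from
    the reference only by motion and saturation."""
    linear = np.power(np.clip(ldr_src, 0.0, 1.0), gamma) / t_src
    return np.clip(np.power(linear * t_ref, 1.0 / gamma), 0.0, 1.0)
```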

Proceedings ArticleDOI
04 Mar 2022
TL;DR: In this article, a spatially varying exposure (SVE) approach is proposed to boost the dynamic range of a given imaging sensor: exposure levels of adjacent pixels are spatially multiplexed using an appropriate optical mask and the full image is reconstructed from the sampled SVE image, yielding a boosted dynamic range with only a small sacrifice in resolution.
Abstract: A traditional limitation of holographic displays has been their image quality. Recent advances in computer-generated holography using a camera-in-the-loop (CITL) approach have demonstrated that these issues can be overcome to improve image fidelity by treating the system as a negative feedback control loop. Such an approach demands high bit-depth camera sensors to realise high system bandwidth. Here, we explore boosting the dynamic range of a given imaging sensor by using a spatially varying exposure (SVE) approach. The exposure levels of adjacent pixels are spatially multiplexed using an appropriate optical mask and the full image reconstructed from the sampled SVE image, resulting in a boosted dynamic range with only a small sacrifice in resolution. This technique is well-tailored to CITL requirements as it promises to boost the dynamic range of the imaging sensor in a single image acquisition. We present our findings on the viability of this approach within the context of CGH displays.

Proceedings ArticleDOI
01 Mar 2022
TL;DR: In this paper, a perceptually-based tone mapping technique was proposed to generate compensated projection images, which minimizes clipping artifacts and contrast degradation under challenging conditions, such as bright environmental lighting and high contrast textures.
Abstract: Radiometric compensation techniques have been proposed to manipulate the appearance of arbitrarily textured surfaces using projectors. However, due to the limited dynamic range of the projectors, these compensation techniques often fail under bright environmental lighting or when the projection surface contains high contrast textures, resulting in clipping artifacts. To address this issue, we propose to apply a perceptually-based tone mapping technique to generate compensated projection images. The experimental results demonstrated that our approach minimizes the clipping artifacts and contrast degradation under challenging conditions.

Journal ArticleDOI
01 Apr 2022-Optik
TL;DR: Wang et al. propose a multi-scale pyramid framework-based exposure fusion method for high dynamic range image enhancement, with prospective applications in astronomical exploration, biomedical imaging, and other optical imaging fields.

Journal ArticleDOI
26 Jul 2022-Leukos
TL;DR: In this article, the authors review recent research on HDR techniques as it pertains to illuminance measurement and describe the results of a pilot study comparing illuminance values captured and calculated using HDR against illuminance measurements collected with a calibrated illuminance meter, from a turf-based surface.
Abstract: High Dynamic Range (HDR) imaging has traditionally been used to create photorealistic images by combining multiple Low Dynamic Range (LDR) images. The application of HDR imaging has since been expanded to the study of the lighting environment through the extraction of metrics such as luminance, illuminance, and glare. While luminance mapping of HDR images has been extensively studied in the recent past, research on illuminance mapping has been limited because of its strong dependency on surface materiality. This document reviews recent research on HDR techniques as it pertains to illuminance measurement and describes the results of a pilot study comparing illuminance values captured and calculated using HDR against illuminance measurements collected with a calibrated illuminance meter, from a turf-based surface.
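For context, a common recipe for luminance (and, with known surface reflectance, illuminance) extraction from a calibrated HDR image is sketched below; the Rec. 709 weights and the 179 lm/W luminous efficacy follow the Radiance convention, and the Lambertian-surface assumption is precisely the "materiality" dependency the review highlights.

```python
import numpy as np

LUMINOUS_EFFICACY = 179.0  # lm/W, the convention used by Radiance

def luminance_cd_m2(rgb_radiance):
    """Luminance from a linear, radiometrically calibrated HDR image,
    assuming Rec. 709 primaries."""
    r, g, b = rgb_radiance[..., 0], rgb_radiance[..., 1], rgb_radiance[..., 2]
    return LUMINOUS_EFFICACY * (0.2126 * r + 0.7152 * g + 0.0722 * b)

def illuminance_lux(luminance, reflectance):
    """Invert the Lambertian relation L = rho * E / pi to estimate the
    illuminance E falling on the surface from its measured luminance;
    the surface reflectance rho must be known or assumed."""
    return np.pi * luminance / reflectance
```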

Proceedings ArticleDOI
31 May 2022
TL;DR: This paper investigates the theoretical possibilities of combining CNN architectures utilized for HDR images and videos, in order to enhance the outputs of HDR light field image reconstruction.
Abstract: High dynamic range imaging has become a technological trend in the past couple of decades, particularly through its integration into many applications. Numerous attempts were made to reconstruct HDR images from low-dynamic-range data. Such reconstruction techniques can be classified into single-camera and multi-camera approaches. Single-camera setups are less expensive, yet multi-camera setups are more efficient. At the time of this paper, there are already a great number of algorithms for single-camera HDR image reconstruction, but there are only a few for HDR video reconstruction. The latter takes into account the temporal coherence between consecutive video frames, leading to better results. For light field images, this remains a challenging open issue, as the HDR video reconstruction methods do not work as efficiently for light field images as HDR image reconstruction algorithms do. However, analogously to 2D videos, where consecutive frames have temporal coherence, many similarities can be found between the adjacent views of light field contents. In this paper, we investigate the theoretical possibilities of combining CNN architectures utilized for HDR images and videos, in order to enhance the outputs of HDR light field image reconstruction. The concept of our work is to exploit the similarities between light field images since they all visualize the same scene from different angular perspectives.

Journal ArticleDOI
TL;DR: Zhang et al. propose an integrated HDR imaging scheme including patch-based registration and matching (HDR-RMP) for deghosting in complex dynamic scenes, where the combination of Affine Transformation (AT) and Normalized Mutual Information (NMI) has an implicit registration effect on the input images.

Proceedings ArticleDOI
16 Oct 2022
TL;DR: In this paper, a single-shot solution for high dynamic range (HDR) ToF imaging is proposed, where modulo sampling at each ToF pixel is considered, and HDR signals are folded back into the conventional dynamic range.
Abstract: Time-of-Flight (ToF) imagers, e.g. Microsoft Kinect, are active devices that offer a portable, efficient, and consumer-grade solution to three-dimensional imaging problems. As the name suggests, in ToF imaging, back-scattered light from an active illumination source (typically a sinusoid) is used to measure the ToF, thus resulting in depth information. Despite its prevalence in applications such as autonomous navigation and scientific imaging, current ToF sensors are limited in their dynamic range. Computational imaging solutions enabling high dynamic range (HDR) ToF imaging are largely unexplored. We take a step in this direction by proposing a novel architecture for HDR ToF imaging; we combine ToF imaging with the recently introduced Unlimited Sensing Framework. By considering modulo sampling at each ToF pixel, HDR signals are folded back into the conventional dynamic range. Our work offers a single-shot solution for HDR ToF imaging. We report a sampling density criterion that guarantees inversion of the modulo non-linearity. Furthermore, we also present a new algorithm for ToF recovery that circumvents the need for unfolding of modulo samples. Numerical examples based on the Stanford 3D Scanning Repository highlight the merits of our approach, thus paving a path for a novel imaging architecture.
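A toy 1D version of modulo sampling with the simplest first-difference unwrapping is sketched below; note the paper's ToF recovery algorithm deliberately avoids explicit unfolding, so this shows only the baseline idea.

```python
import numpy as np

def modulo_sample(x, threshold):
    """Fold a signal into [-threshold, threshold), as a modulo sensor would."""
    return np.mod(x + threshold, 2 * threshold) - threshold

def unwrap(y, threshold):
    """First-difference unwrapping: exact whenever consecutive samples of
    the true signal differ by less than the folding threshold, which is the
    kind of guarantee a sampling-density criterion provides. The signal is
    recovered up to a global multiple of 2*threshold."""
    d = np.diff(y)
    d -= 2 * threshold * np.round(d / (2 * threshold))
    return np.concatenate(([y[0]], y[0] + np.cumsum(d)))

t = np.linspace(0.0, 1.0, 400)
x = 5.0 * np.sin(2 * np.pi * t)      # "HDR" signal spanning [-5, 5]
y = modulo_sample(x, 1.0)            # folded into [-1, 1)
x_hat = unwrap(y, 1.0)               # matches x up to a 2*threshold offset
```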

Proceedings ArticleDOI
26 Sep 2022
TL;DR: In this article, the authors propose an efficient method to extend an image TMO to a video TMO, which takes care of temporally coherent intensity variation between frames while addressing the well-known issue of flickering in tone-mapped video.
Abstract: Tone mapping is necessary for Low Dynamic Range (LDR) devices to display High Dynamic Range (HDR) images and videos. Multiple video Tone Mapping Operators (vTMOs) have been devised for HDR videos. The majority of vTMOs apply an image TMO to each video frame. This is followed by pre/post filtering to ensure temporal coherence. However, this destroys the natural temporal variation in intensity that is inherent in a changing scene. Furthermore, in these methods the computational complexity of an image TMO is scaled up in proportion to the number of frames. We propose an efficient method to extend an image TMO to a video TMO. The proposed method is general and takes care of temporally coherent intensity variation between frames while addressing the well-known issue of flickering in tone-mapped video. Additionally, it lowers computational complexity as a new tone mapping curve (TMC) is not generated on a per-frame basis. The proposed vTMO can be used to extend any state-of-the-art global TMO that is deemed to generate a TMC. A fresh TMC is generated only when a hard cut in the video is detected or the global change in illumination in the HDR video becomes large enough. Visual comparisons and objective evaluations with three well-known image TMOs demonstrate that the suggested extension method generates high quality LDR video generally at low computational cost when compared to existing vTMOs. Further work on efficient implementation of embedding an image TMO in our vTMO algorithm is expected to yield even better computational efficiency.
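A skeleton of the curve-reuse strategy might look as follows; `make_tmc` is a hypothetical factory returning a global TMO's tone mapping curve as a callable, and the log-luminance threshold is an illustrative stand-in for the paper's illumination-change and hard-cut detectors.

```python
import numpy as np

def log_mean_luminance(frame):
    """Geometric-mean luminance, a standard 'key' statistic for global TMOs."""
    return float(np.exp(np.mean(np.log(frame + 1e-6))))

def tone_map_video(frames, make_tmc, relight_thresh=0.5):
    """Apply a global image TMO to video by reusing its tone mapping curve
    (TMC) across frames, regenerating it only when the global illumination
    changes strongly; a hard-cut detector would trigger the same branch."""
    tmc, key_ref, out = None, None, []
    for frame in frames:
        key = log_mean_luminance(frame)
        if tmc is None or abs(np.log(key / key_ref)) > relight_thresh:
            tmc, key_ref = make_tmc(frame), key  # fresh curve on large changes
        out.append(tmc(frame))
    return out
```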

Posted ContentDOI
04 Aug 2022
TL;DR: Zhang et al. propose a deep progressive feature aggregation network for improving HDR imaging quality in dynamic scenes, which implicitly samples high-correspondence features and aggregates them in a coarse-to-fine manner for alignment.
Abstract: High dynamic range (HDR) imaging is an important task in image processing that aims to generate well-exposed images in scenes with varying illumination. Although existing multi-exposure fusion methods have achieved impressive results, generating high-quality HDR images in dynamic scenes is still difficult. The primary challenges are ghosting artifacts caused by object motion between low dynamic range images and distorted content in under and overexposed regions. In this paper, we propose a deep progressive feature aggregation network for improving HDR imaging quality in dynamic scenes. To address the issues of object motion, our method implicitly samples high-correspondence features and aggregates them in a coarse-to-fine manner for alignment. In addition, our method adopts a densely connected network structure based on the discrete wavelet transform, which aims to decompose the input features into multiple frequency subbands and adaptively restore corrupted contents. Experiments show that our proposed method can achieve state-of-the-art performance under different scenes, compared to other promising HDR imaging methods. Specifically, the HDR images generated by our method contain cleaner and more detailed content, with fewer distortions, leading to better visual quality.

Proceedings ArticleDOI
07 Jan 2022
TL;DR: A deep residual network is proposed for HDR image generation: three low dynamic range images of the same scene with different exposure times are aligned to the intermediate-exposure image and used as input to the network.
Abstract: This paper proposes a new HDR image generation method using a deep residual network. The proposed method aligns three input low dynamic range (LDR) images of the same scene, captured with different exposure times, to the image with the intermediate exposure time and uses them as input to the network. The network takes the three LDR images together with their three conversions into the high dynamic range (HDR) domain and outputs three artifact-reduced LDR images and an alpha map, from which the HDR image is generated. Experimental results show that the proposed method produces HDR images without color distortion or ghosting artifacts when compared to existing methods, and that it performs well compared to the HDR function used in consumer imaging systems.

Posted ContentDOI
06 Jul 2022
TL;DR: Zhang et al. propose a deep network that learns multi-scale feature flow guided by a regularized loss and then aligns features from non-reference images.
Abstract: Reconstructing ghosting-free high dynamic range (HDR) images of dynamic scenes from a set of multi-exposure images is a challenging task, especially with large object motion and occlusions, leading to visible artifacts using existing methods. To address this problem, we propose a deep network that tries to learn multi-scale feature flow guided by the regularized loss. It first extracts multi-scale features and then aligns features from non-reference images. After alignment, we use residual channel attention blocks to merge the features from different images. Extensive qualitative and quantitative comparisons show that our approach achieves state-of-the-art performance and produces excellent results where color artifacts and geometric distortions are significantly reduced.

Journal ArticleDOI
01 Jan 2022
TL;DR: In this article, the authors show that the performance of HDR metrics is worse than that of a classic, simple standard dynamic range (SDR) metric applied directly to the HDR content, and that the chrominance metrics specifically developed for HDR/WCG imaging have poor correlation with observer scores.
Abstract: In the quality evaluation of high dynamic range and wide color gamut (HDR/WCG) images, a number of works have concluded that native HDR metrics, such as HDR visual difference predictor (HDR-VDP), HDR video quality metric (HDR-VQM), or convolutional neural network (CNN)-based visibility metrics for HDR content, provide the best results. These metrics consider only the luminance component, but several color difference metrics have been specifically developed for, and validated with, HDR/WCG images. In this paper, we perform subjective evaluation experiments in a professional HDR/WCG production setting, under a real use case scenario. The results are quite relevant in that they show, firstly, that the performance of HDR metrics is worse than that of a classic, simple standard dynamic range (SDR) metric applied directly to the HDR content; and secondly, that the chrominance metrics specifically developed for HDR/WCG imaging have poor correlation with observer scores and are also outperformed by an SDR metric. Based on these findings, we show how a very simple framework for creating color HDR metrics, that uses only luminance SDR metrics, transfer functions, and classic color spaces, is able to consistently outperform, by a considerable margin, state-of-the-art HDR metrics on a varied set of HDR content, for both perceptual quantization (PQ) and Hybrid Log-Gamma (HLG) encoding, luminance and chroma distortions, and on different color spaces of common use.
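The "simple framework" the paper describes can be sketched as: encode both HDR signals with a perceptual transfer function, then score with an ordinary SDR metric. Below, PQ (SMPTE ST 2084) encoding feeds a plain PSNR; the choice of PSNR as the SDR metric is illustrative, not the paper's exact pipeline.

```python
import numpy as np

def pq_encode(lum_nits):
    """SMPTE ST 2084 (PQ) inverse EOTF: absolute luminance in cd/m^2 to a
    perceptually uniform [0,1] signal."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    y = np.clip(lum_nits / 10000.0, 0.0, 1.0) ** m1
    return ((c1 + c2 * y) / (1 + c3 * y)) ** m2

def hdr_psnr_via_sdr_metric(ref_nits, test_nits):
    """Encode both HDR signals with a transfer function, then apply a plain
    SDR metric; PSNR stands in here for whichever SDR metric is preferred."""
    a, b = pq_encode(ref_nits), pq_encode(test_nits)
    mse = np.mean((a - b) ** 2)
    return 10.0 * np.log10(1.0 / max(mse, 1e-12))
```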

Posted ContentDOI
28 Oct 2022
TL;DR: In this article, a weakly supervised learning method was proposed to invert the physical image formation process for HDR reconstruction via learning to generate multiple exposures from a single image, which achieved state-of-the-art performance on the DrTMO dataset.
Abstract: High dynamic range (HDR) imaging is an indispensable technique in modern photography. Traditional methods focus on HDR reconstruction from multiple images, solving the core problems of image alignment, fusion, and tone mapping, yet still lacking a perfect solution due to ghosting and other visual artifacts in the reconstruction. Recent attempts at single-image HDR reconstruction show a promising alternative: by learning to map pixel values to their irradiance using a neural network, one can bypass the align-and-merge pipeline completely yet still obtain a high-quality HDR image. In this work, we propose a weakly supervised learning method that inverts the physical image formation process for HDR reconstruction via learning to generate multiple exposures from a single image. Our neural network can invert the camera response to reconstruct pixel irradiance before synthesizing multiple exposures and hallucinating details in under- and over-exposed regions from a single input image. To train the network, we propose a representation loss, a reconstruction loss, and a perceptual loss applied on pairs of under- and over-exposure images and thus do not require HDR images for training. Our experiments show that our proposed model can effectively reconstruct HDR images. Our qualitative and quantitative results show that our method achieves state-of-the-art performance on the DrTMO dataset. Our code is available at https://github.com/VinAIResearch/single_image_hdr.
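A toy version of the generate-multiple-exposures step, assuming a fixed gamma camera response (the paper instead learns the inverse response and hallucinates detail in clipped regions, so this is only the skeleton of the idea):

```python
import numpy as np

def synthesize_exposure(ldr, ev_shift, gamma=2.2):
    """Invert an assumed gamma camera response to recover relative
    irradiance, rescale the exposure by 2**ev_shift, and re-apply the
    response; clipped regions stay clipped without learned hallucination."""
    irradiance = np.power(np.clip(ldr, 0.0, 1.0), gamma)
    return np.clip(np.power(irradiance * 2.0 ** ev_shift, 1.0 / gamma), 0.0, 1.0)
```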

Proceedings ArticleDOI
01 Aug 2022
TL;DR: In this article, a log-modulo encoding scheme is proposed for high-dynamic-range (HDR) scene reconstruction, where the HDR scene is computed by an irradiance unwrapping algorithm from the wrapped low-dynamic-range (LDR) sensor image.
Abstract: The ability to image high-dynamic-range (HDR) scenes is crucial in many computer vision applications. The dynamic range of conventional sensors, however, is fundamentally limited by their well capacity, resulting in saturation of bright scene parts. To overcome this limitation, emerging sensors offer in-pixel processing capabilities to encode the incident irradiance. Among the most promising encoding schemes is modulo wrapping, which results in a computational photography problem where the HDR scene is computed by an irradiance unwrapping algorithm from the wrapped low-dynamic-range (LDR) sensor image. Here, we design a neural network-based algorithm that outperforms previous irradiance unwrapping methods and we design a perceptually inspired “mantissa,” or log-modulo, encoding scheme that more efficiently wraps an HDR scene into an LDR sensor. Combined with our reconstruction framework, MantissaCam achieves state-of-the-art results among modulo-type snapshot HDR imaging approaches. We demonstrate the efficacy of our method in simulation and show benefits of our algorithm on modulo images captured with a prototype implemented with a programmable sensor.