DehazeNet: An End-to-End System for Single Image Haze Removal

© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses,
in any current or future media, including reprinting/republishing this material for advertising or promotional
purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted
component of this work in other works.

DehazeNet: An End-to-End System for Single Image
Haze Removal
Bolun Cai, Xiangmin Xu, Member, IEEE, Kui Jia, Member, IEEE, Chunmei Qing, Member, IEEE,
and Dacheng Tao, Fellow, IEEE
Abstract—Single image haze removal is a challenging ill-posed
problem. Existing methods use various constraints/priors to get
plausible dehazing solutions. The key to achieve haze removal is
to estimate a medium transmission map for an input hazy image.
In this paper, we propose a trainable end-to-end system called
DehazeNet, for medium transmission estimation. DehazeNet takes
a hazy image as input, and outputs its medium transmission map
that is subsequently used to recover a haze-free image via atmo-
spheric scattering model. DehazeNet adopts Convolutional Neural
Networks (CNN) based deep architecture, whose layers are
specially designed to embody the established assumptions/priors
in image dehazing. Specifically, layers of Maxout units are used
for feature extraction, which can generate almost all haze-relevant
features. We also propose a novel nonlinear activation function
in DehazeNet, called Bilateral Rectified Linear Unit (BReLU),
which is able to improve the quality of recovered haze-free image.
We establish connections between components of the proposed
DehazeNet and those used in existing methods. Experiments
on benchmark images show that DehazeNet achieves superior
performance over existing methods, yet keeps efficient and easy
to use.
Keywords: Dehaze, image restoration, deep CNN, BReLU.
I. INTRODUCTION
HAZE is a traditional atmospheric phenomenon where dust, smoke and other dry particles obscure the clarity of the atmosphere.
of the atmosphere. Haze causes issues in the area of terrestrial
photography, where the light penetration of dense atmosphere
may be necessary to image distant subjects. This results in the
visual effect of a loss of contrast in the subject, due to the
effect of light scattering through the haze particles. For these
reasons, haze removal is desired in both consumer photography
and computer vision applications.
Haze removal is a challenging problem because the haze
transmission depends on the unknown depth which varies at
different positions. Various techniques of image enhancement
have been applied to the problem of removing haze from a
single image, including histogram-based [1], contrast-based [2]
B. Cai, X. Xu (corresponding author) and C. Qing are with the School of Electronic and Information Engineering, South China University of Technology, Wushan RD., Tianhe District, Guangzhou, P.R. China. E-mail: {caibolun@gmail.com, xmxu@scut.edu.cn, qchm@scut.edu.cn}.
K. Jia is with the Department of Electrical and Computer Engineering,
Faculty of Science and Technology, University of Macau, Macau 999078,
China. E-mail: {kuijia@gmail.com}.
D. Tao is with Centre for Quantum Computation & Intelligent Sys-
tems, Faculty of Engineering & Information Technology, University of Tech-
nology Sydney, 235 Jones Street, Ultimo, NSW 2007, Australia. E-mail:
{dacheng.tao@uts.edu.au}.
and saturation-based [3]. In addition, methods using multiple
images or depth information have also been proposed. For
example, polarization based methods [4] remove the haze
effect through multiple images taken with different degrees of
polarization. In [5], multi-constraint based methods are applied
to multiple images capturing the same scene under different
weather conditions. Depth-based methods [6] require some
depth information from user inputs or known 3D models. In
practice, depth information or multiple hazy images are not
always available.
Single image haze removal has made significant progress
recently, due to the use of better assumptions and priors.
Specifically, under the assumption that the local contrast of a
haze-free image is much higher than that of a hazy image, a local
contrast maximizing method [7] based on Markov Random
Fields (MRF) is proposed for haze removal. Although the contrast
maximizing approach is able to achieve impressive results, it
tends to produce over-saturated images. In [8], Independent
Component Analysis (ICA) based on minimal input is pro-
posed to remove the haze from color images, but the approach
is time-consuming and cannot be used to deal with dense-haze
images. Inspired by the dark-object subtraction technique, the Dark
Channel Prior (DCP) [9] is discovered based on empirical
statistics from experiments on haze-free images, which show
that in most non-haze patches at least one color channel has
some pixels with very low intensities. With the dark channel
prior, the thickness of haze is estimated and removed by the
atmospheric scattering model. However, DCP loses dehazing
quality in sky regions and is computationally intensive.
Some improved algorithms are proposed to overcome these
limitations. To improve dehazing quality, Kratz and Nishino
[10] model the image with a factorial MRF to estimate
the scene radiance more accurately; Meng et al. [11] propose
an effective regularization dehazing method to restore the haze-
free image by exploring the inherent boundary constraint. To
improve computational efficiency, standard median filtering
[12], median of median filter [13], guided joint bilateral
filtering [14] and guided image filter [15] are used to replace
the time-consuming soft matting [16]. In recent years, haze-
relevant priors are investigated in machine learning framework.
Tang et al. [17] combine four types of haze-relevant features
with Random Forests to estimate the transmission. Zhu et al.
[18] create a linear model for estimating the scene depth of
the hazy image under the color attenuation prior, and learn the
parameters of the model with a supervised method. Despite the
remarkable progress, these state-of-the-art methods are limited
by the very same haze-relevant priors or heuristic cues, and they
are often less effective for some images.

arXiv:1601.07661v2 [cs.CV] 17 May 2016

Haze removal from a single image is a difficult vision task.
In contrast, the human brain can quickly identify the hazy area
from the natural scenery without any additional information.
One might be tempted to propose biologically inspired models
for image dehazing, by following the success of bio-inspired
CNNs for high-level vision tasks such as image classification
[19], face recognition [20] and object detection [21]. In fact,
there have been a few (convolutional) neural network based
deep learning methods that are recently proposed for low-level
vision tasks of image restoration/reconstruction [22], [23],
[24]. However, these methods cannot be directly applied to
single image haze removal.
Note that apart from estimation of a global atmospheric light
magnitude, the key to achieve haze removal is to recover an
accurate medium transmission map. To this end, we propose
DehazeNet, a trainable CNN based end-to-end system for
medium transmission estimation. DehazeNet takes a hazy
image as input, and outputs its medium transmission map that
is subsequently used to recover the haze-free image by a simple
pixel-wise operation. Design of DehazeNet borrows ideas from
established assumptions/principles in image dehazing, while
parameters of all its layers can be automatically learned from
training hazy images. Experiments on benchmark images show
that DehazeNet gives superior performance over existing meth-
ods, yet keeps efficient and easy to use. Our main contributions
are summarized as follows.
1) DehazeNet is an end-to-end system. It directly learns
and estimates the mapping relations between hazy im-
age patches and their medium transmissions. This is
achieved by special design of its deep architecture to
embody established image dehazing principles.
2) We propose a novel nonlinear activation function in
DehazeNet, called Bilateral Rectified Linear Unit
(BReLU)^1. BReLU extends the Rectified Linear Unit
(ReLU) and demonstrates its significance in obtaining
accurate image restoration. Technically, BReLU uses
a bilateral restraint to reduce the search space and
improve convergence.
3) We establish connections between components of De-
hazeNet and those assumptions/priors used in existing
dehazing methods, and explain that DehazeNet im-
proves over these methods by automatically learning
all these components from end to end.
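As a concrete illustration of contribution 2), BReLU can be sketched as a two-sided clipping nonlinearity. The following is a minimal NumPy sketch, not the paper's implementation; the bounds t_min = 0 and t_max = 1 are illustrative choices matching the range of a medium transmission map:

```python
import numpy as np

def brelu(x, t_min=0.0, t_max=1.0):
    """Bilateral Rectified Linear Unit: identity on [t_min, t_max],
    saturated to the nearer bound outside it."""
    return np.minimum(t_max, np.maximum(t_min, x))
```

Unlike ReLU, which is unbounded above, the two-sided saturation restricts the regression output to the valid transmission range.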
The remainder of this paper is organized as follows. In
Section II, we review the atmospheric scattering model and
haze-relevant features, which provides background knowledge
to understand the design of DehazeNet. In Section III, we
present details of the proposed DehazeNet, and discuss how
it relates to existing methods. Experiments are presented in
Section IV, before the conclusion is drawn in Section V.

^1 During the preparation of this manuscript (in December 2015), we found that a nonlinear activation function called the adjustable bounded rectifier was proposed in [25] (arXived in November 2015), which is almost identical to BReLU. The adjustable bounded rectifier is motivated by the objective of image recognition, whereas BReLU is proposed here to improve image restoration accuracy. It is interesting that we arrived at the same activation function from completely different initial objectives; this may also suggest the general usefulness of the proposed BReLU.
II. RELATED WORKS
Many image dehazing methods have been proposed in the
literature. In this section, we briefly review some important
ones, paying attention to those proposing the atmospheric
scattering model, which is the basic underlying model of
image dehazing, and those proposing useful assumptions for
computing haze-relevant features.
A. Atmospheric Scattering Model
To describe the formation of a hazy image, the atmospheric
scattering model was first proposed by McCartney [26] and
further developed by Narasimhan and Nayar [27], [28]. The
atmospheric scattering model can be formally written as
I(x) = J(x) t(x) + α (1 − t(x)),    (1)
where I (x) is the observed hazy image, J (x) is the real scene
to be recovered, t (x) is the medium transmission, α is the
global atmospheric light, and x indexes pixels in the observed
hazy image I. Fig. 1 gives an illustration. There are three
unknowns in equation (1), and the real scene J (x) can be
recovered after α and t (x) are estimated.
The medium transmission map t(x) describes the light
portion that is not scattered and reaches the camera. t(x) is
defined as

t(x) = e^(−β d(x)),    (2)

where d(x) is the distance from the scene point to the camera,
and β is the scattering coefficient of the atmosphere. Equation
(2) suggests that when d(x) goes to infinity, t(x) approaches
zero. Together with equation (1) we have

α = I(x),  d(x) → ∞.    (3)
In practical imaging of a distant view, d(x) cannot be
infinity, but is rather a long distance that gives a very low
transmission t_0. Instead of relying on equation (3) to get the
global atmospheric light α, it is more stably estimated based
on the following rule:

α = max_{y ∈ {x | t(x) ≤ t_0}} I(y).    (4)
The discussion above suggests that to recover a clean scene
(i.e., to achieve haze removal), the key is to estimate an
accurate medium transmission map.
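Equations (1)-(4) imply that, once t(x) and α are known, haze removal reduces to a pixel-wise inversion of equation (1). A minimal sketch of that inversion, assuming a NumPy image layout; the transmission floor t0 = 0.1 is an illustrative choice to keep the division stable, not a value prescribed by the paper:

```python
import numpy as np

def recover_scene(I, t, alpha, t0=0.1):
    """Invert eq. (1), I(x) = J(x) t(x) + alpha (1 - t(x)), for J(x).
    I: H x W x 3 hazy image;  t: H x W transmission map;  alpha: scalar.
    t is clamped below by t0 so the division stays stable."""
    t = np.maximum(t, t0)[..., None]   # broadcast over the color axis
    return (I - alpha * (1.0 - t)) / t
```

This is exactly the "simple pixel-wise operation" referred to later in Section III.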
B. Haze-relevant features
Image dehazing is an inherently ill-posed problem. Based
on empirical observations, existing methods propose various
assumptions or prior knowledge that are utilized to compute
intermediate haze-relevant features. Final haze removal can be
achieved based on these haze-relevant features.

(a) The process of imaging in hazy weather. The transmission attenuation
J(x) t(x), caused by the reduction in reflected energy, leads to low brightness
intensity. The airlight α (1 − t(x)), formed by the scattering of the environ-
mental illumination, enhances the brightness and reduces the saturation.
(b) Atmospheric scattering model. The observed hazy image I(x) is gener-
ated by the real scene J(x), the medium transmission t(x) and the global
atmospheric light α.
Fig. 1. Imaging in hazy weather and the atmospheric scattering model.
1) Dark Channel: The dark channel prior is based on a
wide observation on outdoor haze-free images: in most of the
haze-free patches, at least one color channel has some pixels
whose intensity values are very low, even close to zero.
The dark channel [9] is defined as the minimum of all pixel
colors in a local patch:

D(x) = min_{y ∈ Ω_r(x)} ( min_{c ∈ {r,g,b}} I^c(y) ),    (5)

where I^c is an RGB color channel of I and Ω_r(x) is a
local patch centered at x with the size of r × r. The dark
channel feature has a high correlation to the amount of haze
in the image, and is used to estimate the medium transmission
t(x) ∝ 1 − D(x) directly.
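Densely extracting the dark channel in (5) is a per-pixel channel minimum followed by a spatial minimum filter. A sketch using SciPy's minimum_filter (an implementation choice of this sketch, not the authors'); the patch size is illustrative:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(I, r=15):
    """Eq. (5): per-pixel minimum over {r, g, b}, then an r x r
    spatial minimum filter. I: H x W x 3 with values in [0, 1]."""
    per_pixel_min = I.min(axis=2)           # min over color channels
    return minimum_filter(per_pixel_min, size=r)  # min over the patch
```

The rough transmission estimate noted above is then t(x) ≈ 1 − D(x).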
2) Maximum Contrast: According to the atmospheric scat-
tering model, the contrast of the image is reduced by the haze
transmission as Σ_x ‖∇I(x)‖ = t Σ_x ‖∇J(x)‖ ≤ Σ_x ‖∇J(x)‖.
Based on this observation, the local contrast [7] is defined as the
variance of pixel intensities in an s × s local patch Ω_s with
respect to the center pixel, and the local maximum of local
contrast values in an r × r region Ω_r is defined as:

C(x) = max_{y ∈ Ω_r(x)} √( (1 / |Ω_s(y)|) Σ_{z ∈ Ω_s(y)} ‖I(z) − I(y)‖² ),    (6)

where |Ω_s(y)| is the cardinality of the local neighborhood.
The correlation between the contrast feature and the medium
transmission t is visually obvious, so the visibility of the image
can be enhanced by maximizing the local contrast as in (6).
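Equation (6) can be computed directly, if inefficiently, in two passes: the per-pixel contrast over Ω_s, then a local maximum over Ω_r. A grayscale NumPy sketch with illustrative patch sizes (the color norm in (6) reduces here to an absolute difference):

```python
import numpy as np

def local_contrast(I, s=5, r=7):
    """Eq. (6) on a grayscale image: RMS deviation from the patch
    center over an s x s patch, then the max over an r x r region."""
    hs, hr = s // 2, r // 2
    pad = np.pad(I, hs, mode='edge')
    contrast = np.zeros_like(I, dtype=float)
    for y in range(I.shape[0]):
        for x in range(I.shape[1]):
            patch = pad[y:y + s, x:x + s]      # s x s patch around (y, x)
            contrast[y, x] = np.sqrt(np.mean((patch - I[y, x]) ** 2))
    padc = np.pad(contrast, hr, mode='edge')
    C = np.zeros_like(contrast)
    for y in range(I.shape[0]):
        for x in range(I.shape[1]):
            C[y, x] = padc[y:y + r, x:x + r].max()  # max over r x r region
    return C
```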
3) Color Attenuation: The saturation I^s(x) of a patch
decreases sharply while the color of the scene fades under
the influence of the haze, and the brightness value I^v(x)
increases at the same time, producing a high value for their
difference. According to the above color attenuation prior [18],
the difference between the brightness and the saturation is
utilized to estimate the concentration of the haze:

A(x) = I^v(x) − I^s(x),    (7)

where I^v(x) and I^s(x) can be expressed in the HSV
color space as I^v(x) = max_{c ∈ {r,g,b}} I^c(x) and
I^s(x) = ( max_{c ∈ {r,g,b}} I^c(x) − min_{c ∈ {r,g,b}} I^c(x) ) / max_{c ∈ {r,g,b}} I^c(x).
The color attenuation feature is proportional to the scene
depth, d(x) ∝ A(x), and is easily used for transmission
estimation.
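The color attenuation feature in (7) needs only the per-pixel maximum and minimum over the RGB channels. A minimal sketch; the small epsilon guarding the division by zero for black pixels is an addition of this sketch:

```python
import numpy as np

def color_attenuation(I):
    """Eq. (7): A(x) = brightness - saturation, computed from RGB.
    I: H x W x 3 with values in [0, 1]."""
    v = I.max(axis=2)                                # HSV value (brightness)
    s = (v - I.min(axis=2)) / np.maximum(v, 1e-12)   # HSV saturation
    return v - s
```

A bright, desaturated (hazy-looking) pixel yields a large A(x); a dark or saturated pixel yields a small one.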
4) Hue Disparity: Hue disparity between the origi-
nal image I(x) and its semi-inverse image, I_si(x) =
max[I^c(x), 1 − I^c(x)] with c ∈ {r, g, b}, has been used to
detect haze. For haze-free images, pixel values in the three
channels of their semi-inverse images will not all flip, resulting
in large hue changes between I_si(x) and I(x). In [29], the
hue disparity feature is defined as:

H(x) = |I^h_si(x) − I^h(x)|,    (8)

where the superscript h denotes the hue channel of the
image in the HSV color space. According to (8), the medium
transmission t(x) is inversely proportional to H(x).
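The hue disparity feature in (8) can be sketched with the standard-library colorsys conversion; the per-pixel loop is for clarity, not speed:

```python
import numpy as np
import colorsys

def hue_disparity(I):
    """Eq. (8): H(x) = |hue of semi-inverse - hue of original| per pixel,
    where the semi-inverse is max(I_c, 1 - I_c) in each channel."""
    I_si = np.maximum(I, 1.0 - I)          # semi-inverse image
    H = np.zeros(I.shape[:2])
    for y in range(I.shape[0]):
        for x in range(I.shape[1]):
            h_orig = colorsys.rgb_to_hsv(*I[y, x])[0]
            h_si = colorsys.rgb_to_hsv(*I_si[y, x])[0]
            H[y, x] = abs(h_si - h_orig)
    return H
```

For a saturated (haze-free) pixel the semi-inverse flips only some channels, so the hue changes markedly; for a bright hazy pixel nothing flips and H(x) stays near zero.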
III. THE PROPOSED DEHAZENET
The atmospheric scattering model in Section II-A suggests
that estimation of the medium transmission map is the most
important step to recover a haze-free image. To this end,
we propose DehazeNet, a trainable end-to-end system that
explicitly learns the mapping relations between raw hazy
images and their associated medium transmission maps. In
this section, we present layer designs of DehazeNet, and
discuss how these designs are related to ideas in existing
image dehazing methods. The final pixel-wise operation to
get a recovered haze-free image from the estimated medium
transmission map will be presented in Section IV.
A. Layer Designs of DehazeNet
The proposed DehazeNet consists of cascaded convolutional
and pooling layers, with appropriate nonlinear activation func-
tions employed after some of these layers. Fig. 2 shows the
architecture of DehazeNet. Layers and nonlinear activations

Fig. 2. The architecture of DehazeNet. DehazeNet conceptually consists of four sequential operations (feature extraction, multi-scale mapping, local extremum
and non-linear regression), constructed from three convolution layers, a max-pooling operation, a Maxout unit and a BReLU activation function.
of DehazeNet are designed to implement four sequential op-
erations for medium transmission estimation, namely, feature
extraction, multi-scale mapping, local extremum, and nonlinear
regression. We detail these designs as follows.
1) Feature Extraction: To address the ill-posed nature of
the image dehazing problem, existing methods propose various
assumptions and based on these assumptions, they are able
to extract haze-relevant features (e.g., dark channel, hue dis-
parity, and color attenuation) densely over the image domain.
Note that densely extracting these haze-relevant features is
equivalent to convolving an input hazy image with appropriate
filters, followed by nonlinear mappings. Inspired by extremum
processing in color channels of those haze-relevant features,
an unusual activation function called Maxout unit [30] is
selected as the non-linear mapping for dimension reduction.
Maxout unit is a simple feed-forward nonlinear activation
function used in multi-layer perceptron or CNNs. When used
in CNNs, it generates a new feature map by taking a pixel-wise
maximization operation over k affine feature maps. Based on
the Maxout unit, we design the first layer of DehazeNet as follows:

F_1^i(x) = max_{j ∈ [1,k]} f_1^{i,j}(x),   f_1^{i,j} = W_1^{i,j} ∗ I + B_1^{i,j},    (9)

where W_1 = {W_1^{i,j}}_{(i,j)=(1,1)}^{(n_1,k)} and B_1 = {B_1^{i,j}}_{(i,j)=(1,1)}^{(n_1,k)}
represent the filters and biases respectively, and ∗ denotes the
convolution operation. Here, there are n_1 output feature maps
in the first layer. W_1^{i,j} ∈ R^{3 × f_1 × f_1} is one of the k × n_1
convolution filters in total, where 3 is the number of channels in the
input image I(x), and f_1 is the spatial size of a filter (detailed
in Table I). The Maxout unit maps each kn_1-dimensional vector
into an n_1-dimensional one, and extracts the haze-relevant
features by automatic learning rather than the heuristic ways of
existing methods.
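The first layer in (9) amounts to k · n1 convolutions followed by a pixel-wise maximum over groups of k feature maps. A sketch with SciPy correlations standing in for learned convolutions (cross-correlation replaces flipped convolution, as is common in CNN implementations; all filter and bias values are illustrative):

```python
import numpy as np
from scipy.signal import correlate2d

def maxout_conv_layer(I, W, B, k):
    """First DehazeNet layer (eq. 9): k * n1 convolutions over the
    3-channel input, then a pixel-wise max over each group of k maps.
    I: H x W x 3;  W: (n1*k) x 3 x f x f;  B: (n1*k,)."""
    maps = []
    for w, b in zip(W, B):
        # sum the per-channel correlations to get one affine feature map
        m = sum(correlate2d(I[..., c], w[c], mode='valid')
                for c in range(3)) + b
        maps.append(m)
    maps = np.stack(maps)                       # (n1*k, H', W')
    n1 = maps.shape[0] // k
    # Maxout: max over each consecutive group of k maps
    return maps.reshape(n1, k, *maps.shape[1:]).max(axis=1)
```

With zero filters and distinct biases, each output map equals the larger bias of its group, which makes the group-wise maximum easy to verify.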
2) Multi-scale Mapping: In [17], multi-scale features have
been proven effective for haze removal, which densely com-
pute features of an input image at multiple spatial scales.
Multi-scale feature extraction is also effective to achieve
scale invariance. For example, the inception architecture in
GoogLeNet [31] uses parallel convolutions with varying filter
sizes, and better addresses the issue of aligning objects in input
images, resulting in state-of-the-art performance in ILSVRC14
[32]. Motivated by these successes of multi-scale feature ex-
traction, we choose to use parallel convolutional operations in
the second layer of DehazeNet, where size of any convolution
filter is among 3 × 3, 5 × 5 and 7 × 7, and we use the same
number of filters for these three scales. Formally, the output
of the second layer is written as
F_2^i = W_2^{⌈i/3⌉,(i∖3)} ∗ F_1 + B_2^{⌈i/3⌉,(i∖3)},    (10)

where W_2 = {W_2^{p,q}}_{(p,q)=(1,1)}^{(3,n_2/3)} and B_2 = {B_2^{p,q}}_{(p,q)=(1,1)}^{(3,n_2/3)}
contain n_2 pairs of parameters that are broken up into 3 groups.
n_2 is the output dimension of the second layer, and i ∈ [1, n_2]
indexes the output feature maps. ⌈·⌉ takes the integer upwardly
and ∖ denotes the remainder operation.
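The multi-scale layer in (10) can be read as three groups of n2/3 parallel convolutions, one group per filter size. A shape-level sketch; the 'same' zero padding that keeps the three scales spatially aligned is an implementation assumption of this sketch:

```python
import numpy as np
from scipy.signal import correlate2d

def multi_scale_map(F1, filters, biases):
    """Second DehazeNet layer (eq. 10): parallel convolutions with
    3x3, 5x5 and 7x7 filters; 'same' padding keeps all scales aligned.
    F1: n1 x H x W;  filters[s]: list of (n1, s, s) filters."""
    out = []
    for s in (3, 5, 7):
        for w, b in zip(filters[s], biases[s]):
            # each filter spans all n1 input maps; sum their responses
            m = sum(correlate2d(F1[c], w[c], mode='same')
                    for c in range(F1.shape[0])) + b
            out.append(m)
    return np.stack(out)  # n2 maps: one group of n2/3 per scale
```

On a constant input, an all-ones 3x3 filter simply counts the in-bounds window entries, so interior pixels of that map equal 9.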
3) Local Extremum: To achieve spatial invariance, the cor-
tical complex cells in the visual cortex receive responses
from the simple cells for linear feature integration. Ilan et al.
[33] proposed that spatial integration properties of complex
cells can be described by a series of pooling operations.
According to the classical architecture of CNNs [34], the
neighborhood maximum is taken at each pixel to
overcome local sensitivity. In addition, the local extremum is in
accordance with the assumption that the medium transmission
is locally constant, and it is commonly used to overcome noise
in transmission estimation. Therefore, we use a local extremum
operation in the third layer of DehazeNet:

F_3^i(x) = max_{y ∈ Ω(x)} F_2^i(y),    (11)

where Ω(x) is an f_3 × f_3 neighborhood centered at x, and
the output dimension of the third layer is n_3 = n_2. In contrast
to max-pooling in CNNs, which usually reduces the resolution of
feature maps, the local extremum operation here is densely

With this lightweight architecture, DehazeNet achieves dramatically high efficiency and outstanding dehazing effects than the state-of-the-art methods.