
Classification of weather situations on single color images

Martin Roser, Frank Moosmann
- pp 798-803
Abstract
Present vision based driver assistance systems are designed to perform under good-natured weather conditions. However, limited visibility caused by heavy rain or fog strongly affects vision systems. To improve machine vision in bad weather situations, a reliable detection system is necessary as a ground base. We present an approach that is able to distinguish between multiple weather situations based on the classification of single monocular color images, without any additional assumptions or prior knowledge. The proposed image descriptor clearly outperforms existing descriptors for that task. Experimental results on real traffic images are characterized by high accuracy, efficiency, and versatility with respect to driver assistance systems.


Classification of Weather Situations on Single Color Images
Martin Roser and Frank Moosmann
Institut für Mess- und Regelungstechnik
Universität Karlsruhe (TH)
D-76131 Karlsruhe, Germany
Email: {roser, moosmann}@mrt.uka.de
I. INTRODUCTION
Vision based driver assistance systems (DAS) are currently designed to perform under good-natured weather conditions. Unfortunately, limited visibility often occurs in daily life (e.g. heavy rain or fog). As this strongly affects the accuracy or even the general function of vision systems, the actual weather condition is valuable information for assistance systems. Based on the results of weather classification, specialized approaches for each class can be invoked to improve cognition. This will be a key factor in expanding the application of DAS from selected environmental conditions to an overall approach.
Little work has been done on weather related issues for in-vehicle camera systems so far. Nayar [8] studied the visual effects of rain and came up with a photometric rain drop model that describes refraction and reflection of light by a rain drop. Additionally, they determined the effect of camera parameters on image disturbance and developed an approach for detecting and removing rain from videos. Narasimhan [14], [15], [16] analyzed images taken under poor static weather conditions. They used the Koschmieder model to estimate scattering coefficients of the atmosphere and restore the contrast of weather degraded images.
Even though Nayar as well as Narasimhan reported impressive results for their specific problems, these approaches cannot be easily transferred to automobile applications. All approaches assume a static observer whereas in automobile applications, egomotion of the camera is obviously the normal case. In addition, the numerous shapes of rain drops on a windshield will complicate the employed models significantly.
Work on weather related issues in automobile applications has been conducted by [9] and [10]. Hautiere [9] estimated the visibility distance using charge-coupled device cameras. Kurihata [10] used a machine learning approach with raindrop templates, so called eigendrops, to detect rain drops on windshields. However, both lack a holistic approach to deal with all kinds of adverse weather conditions.
In this contribution, we propose a general approach suitable for any kind of weather situation and for any egomotion. As a first step, we present an image classification method that reliably distinguishes between certain weather conditions.
Whereas little work has been done on DAS for bad weather situations, scientific research in image classification or categorization is very broad. Generally, the goal is to decide whether an image belongs to a certain category or not. Depending on the application, categories can include various natural scenes [11], but often images are tested for the presence of a certain object category, e.g. [5], [13], [17], [19]. All modern approaches are based on the extraction of local image features, as global features turned out to be not robust enough to deal with variations in view, lighting, occlusion and object variations. Different kinds of local features have been proposed, with histogram-based features like SIFT [12], HOG [3], and shape context [1] being among the most discriminant. However, these features perform weakly for the intended task.
Based on local features, machine learning classification approaches are often proposed that range from simple decision trees [13] up to the introduction of additional semantic layers such as the very popular bag-of-features approach [2], [19].
While these approaches have achieved remarkable results for generic image categorization, no such system has been proposed yet to distinguish weather situations. Additionally, most existing features are based on grayscale images and only a few approaches have tried to use color features, e.g. [20]. We believe that color casts due to atmospheric effects may provide valuable, additional information.
The key contributions of this paper are the development of robust histogram features for the task of weather recognition, and their application in an efficient and effective image classification framework. The method we propose works on single monoscopic color images from in-vehicle cameras and extracts robust and meaningful histogram features, as described in section II. In section III, we then apply a support vector machine (SVM) on the features to classify the image into one of the classes C = {clear weather, light rain, heavy rain}. Section IV shows detailed experimental classification results, discusses the significance of the proposed features and compares them to standard image descriptors.
2008 IEEE Intelligent Vehicles Symposium
Eindhoven University of Technology
Eindhoven, The Netherlands, June 4-6, 2008
978-1-4244-2569-3/08/$20.00 ©2008 IEEE.
First published in:
EVA-STAR (Elektronisches Volltextarchiv – Scientific Articles Repository)
http://digbib.ubka.uni-karlsruhe.de/volltexte/1000011965

[Fig. 1 diagram: the ROI and its sub-ROIs yield brightness, contrast, sharpness, saturation and hue histograms; these are stacked into the vector v = (v_1, ..., v_n), which the SVM maps to clear, light rain, heavy rain, ...]
Fig. 1. Proposed method: Several histogram features (like brightness, contrast, etc.) are calculated in different regions of the image and gathered in one large vector. The vector is then classified by a Support Vector Machine to obtain the estimated weather situation.
II. FEATURE EXTRACTION
A robust weather classification technique depends on reliable, strong environmental features. Single images, in principle, provide sufficient information for that task.
We first define regions of interest (ROI) which will be used for feature extraction. As depicted in Fig. 1, one global ROI gathers information about the overall effect of weather on the image. In addition, this region is divided into twelve sub-ROIs that cover the local (and distance-dependent) effects in more detail. Within each ROI, several features are evaluated: local contrast, minimum brightness, sharpness, hue and saturation, detailed below. As some features cannot be computed pixelwise, a ROI is subdivided into 10x10 pixel blocks, and each feature is computed in each block. All features return values between 0 and 1 and vote for one bin of their ROI- and feature-dependent histogram. Thus, there are 13 ROIs with 5 histograms each.
Histogram bins do not describe the actual local features very accurately, but on the other hand they are very robust in terms of outliers and noise. As we are interested in the overall distribution of features within the image, histogram bins describe the image information appropriately. Beyond that, the quantity of extracted features directly influences the complexity of classification in terms of accuracy, computational time and number of required training images. We discretize the feature histograms into 10 bins as it turned out to be a good compromise between the descriptor's accuracy and classification effort. The proposed features are presented in greater detail in the following.
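The block-wise voting described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the helper names `block_means` and `vote_histogram` are hypothetical, the block mean stands in for any of the five per-block features, and the histogram is kept as raw counts because the text does not say whether it is normalized.

```python
def block_means(image, block=10):
    """Mean value of each non-overlapping block x block tile of a 2-D list.

    The mean is only a stand-in for a per-block feature in [0, 1].
    """
    rows, cols = len(image), len(image[0])
    means = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            tile = [image[r][c]
                    for r in range(r0, r0 + block)
                    for c in range(c0, c0 + block)]
            means.append(sum(tile) / len(tile))
    return means


def vote_histogram(values, bins=10):
    """Each value in [0, 1] votes for exactly one of `bins` equal-width bins."""
    hist = [0] * bins
    for v in values:
        idx = min(int(v * bins), bins - 1)  # v == 1.0 falls into the last bin
        hist[idx] += 1
    return hist


# Toy ROI of 20 x 30 pixels: left third dark (0.05), the rest bright (0.95).
roi = [[0.05 if c < 10 else 0.95 for c in range(30)] for _ in range(20)]
hist = vote_histogram(block_means(roi))
```

The toy ROI contains six blocks, so the histogram holds six votes, split between the darkest and the brightest bin.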
A. Contrast and Minimum Brightness
In clear weather conditions, the radiance from a scene point reaches the observer unaltered. However, in adverse weather conditions, atmospheric effects can no longer be neglected. In recent literature [14], [15], [16], [9], the Koschmieder model has been established as a description of the atmospheric effects of weather on the observer
$$E = I_\infty \rho\, e^{-\beta(\lambda) d} + I_\infty \left(1 - e^{-\beta(\lambda) d}\right), \qquad (1)$$
where $E$ is the pixel brightness, $I_\infty$ is the background intensity, $\rho$ is the normalized radiance of a scene point [15], $d$ is the distance and $\beta(\lambda)$ is the scattering coefficient. Note that $\beta$ is a function of the wavelength $\lambda$ whose relationship is given by Rayleigh's law [14], [16]. For small atmospheric particles like fog or haze, $\beta$ can be assumed to be constant.
Equation (1) implies that the irradiance and thus the brightness observed by each pixel of the sensor is altered by two fundamental scattering phenomena: attenuation and scattered light. In other words, light directly transmitted from the scene point will be exponentially attenuated and superimposed by the environmental illumination that will be refracted towards the observer. For scene points with a low normalized radiance $\rho$, the direct transmission term can be neglected. Hence, we expect an increasing pixel brightness due to scattered light:
$$E_{\min} \propto \left(1 - e^{-\beta d}\right). \qquad (2)$$
In other words, the minimum local pixel brightness will increase with $\beta$ and $d$ according to the second term on the right-hand side of (1).
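To make the behaviour of the minimum brightness concrete, the following sketch evaluates the Koschmieder model (1) for a perfectly dark scene point at several distances. The function name `koschmieder`, the value of `beta` and the distances are illustrative assumptions, not values from the paper.

```python
import math


def koschmieder(rho, d, beta, i_inf=1.0):
    """Pixel brightness E of eq. (1) for normalized radiance rho at distance d."""
    transmission = math.exp(-beta * d)
    direct = i_inf * rho * transmission        # attenuated scene radiance
    airlight = i_inf * (1.0 - transmission)    # scattered environmental light
    return direct + airlight


# Hypothetical values: beta = 0.02 per unit distance, rho = 0 (dark scene point).
distances = (0, 50, 100, 200)
dark = [koschmieder(0.0, d, beta=0.02) for d in distances]
```

With `rho = 0` only the second term of (1) remains, so `dark` starts at zero and grows towards the background intensity as the distance increases.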
The local contrast $C$ can be defined as
$$C = \frac{E_{\max} - E_{\min}}{E_{\max} + E_{\min}}, \qquad (3)$$
where $E_{\max}$ and $E_{\min}$ are the local extrema of the pixel brightness. To increase robustness of contrast estimation, we determine brightness extrema by averaging the darkest and brightest pixels within the ROIs. Inserting (1) in (3) yields
$$C = \frac{\rho_{\max} - \rho_{\min}}{\rho_{\max} + \rho_{\min} + 2\left(e^{\beta d} - 1\right)}. \qquad (4)$$
As a result, the local contrast solely depends on scene point properties (which remain constant), distance $d$ and the scattering coefficient $\beta$.
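The step from (3) to (4) amounts to dividing numerator and denominator by $I_\infty e^{-\beta d}$, which cancels the background intensity. The sketch below checks this numerically by comparing the contrast computed from brightness values with the closed form (4). Function names and parameter values are illustrative assumptions.

```python
import math


def brightness(rho, d, beta, i_inf=1.0):
    """Eq. (1) with a constant scattering coefficient beta."""
    t = math.exp(-beta * d)
    return i_inf * rho * t + i_inf * (1.0 - t)


def contrast_from_brightness(rho_max, rho_min, d, beta, i_inf=1.0):
    """Eq. (3) applied to the brightness extrema predicted by eq. (1)."""
    e_max = brightness(rho_max, d, beta, i_inf)
    e_min = brightness(rho_min, d, beta, i_inf)
    return (e_max - e_min) / (e_max + e_min)


def contrast_closed_form(rho_max, rho_min, d, beta):
    """Eq. (4): the background intensity has cancelled out."""
    return (rho_max - rho_min) / (rho_max + rho_min + 2.0 * (math.exp(beta * d) - 1.0))


# Hypothetical scene: rho_max = 0.9, rho_min = 0.1, beta = 0.01 per unit distance.
near = contrast_closed_form(0.9, 0.1, d=0.0, beta=0.01)
far = contrast_closed_form(0.9, 0.1, d=100.0, beta=0.01)
```

Both routes give the same value for any background intensity, and the contrast drops as either the distance or the scattering coefficient grows.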
B. Sharpness
Clearly distinguishable objects under friendly weather conditions are expected to have sharp edges with large contrast differences. In addition to the contrast feature discussed above, a gradient based method, called the Tenengrad criterion [18], is used to determine the sharpness of the test images. It is based on an average of the Sobel gradient magnitude
$$T = \frac{\sum_i \sqrt{S_X^2(i) + S_Y^2(i)}}{\sum_i 1}, \qquad (5)$$
where $i$ runs over all pixels and $S_X$, $S_Y$ are the Sobel filter responses. This method originates from autofocusing tasks where two images with identical scene information are evaluated according to their sharpness. It fails when applied to images with different content because of two effects: contrast variance and edge quantity variance. Ferzli and Karam [6] proposed a perceptual-based sharpness metric which is invariant to contrast and edge quantity. Similar to their approach, but with slight variations in detecting edge pixels and weighting the influence of contrast, we define the Advanced Tenengrad criterion as
$$T_{adv} = \frac{\sum_i \delta_i\, \rho(i) \sqrt{S_X^2(i) + S_Y^2(i)}}{\sum_i \delta_i}, \qquad (6)$$
where $\delta_i = 1$ if pixel $i$ is an edge pixel (0 otherwise) and $\rho(i)$ is a weighting factor that is assumed to be inversely proportional to the local contrast.
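A dependency-free sketch of (5) and (6) is given below. It rests on several assumptions that the text leaves open: border pixels are skipped, a pixel counts as an edge pixel when its gradient magnitude exceeds a hypothetical `edge_threshold`, and the contrast weight `rho` is supplied by the caller (constant 1 by default) because its exact form is not stated.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitudes(img):
    """Return (row, col, magnitude) for every interior pixel of a 2-D list."""
    out = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    p = img[r - 1 + i][c - 1 + j]
                    gx += SOBEL_X[i][j] * p
                    gy += SOBEL_Y[i][j] * p
            out.append((r, c, math.hypot(gx, gy)))
    return out


def tenengrad(img):
    """Eq. (5): mean Sobel gradient magnitude over the evaluated pixels."""
    mags = [m for _, _, m in sobel_magnitudes(img)]
    return sum(mags) / len(mags)


def advanced_tenengrad(img, edge_threshold=0.5, rho=lambda r, c: 1.0):
    """Eq. (6): weighted mean magnitude over edge pixels only.

    Assumed here: delta_i = 1 when the magnitude exceeds `edge_threshold`,
    and `rho(r, c)` is a caller-supplied contrast weight.
    """
    edges = [(r, c, m) for r, c, m in sobel_magnitudes(img) if m > edge_threshold]
    if not edges:
        return 0.0
    return sum(rho(r, c) * m for r, c, m in edges) / len(edges)


sharp = [[0.0] * 3 + [1.0] * 3 for _ in range(6)]      # hard vertical step
ramp = [[c / 5 for c in range(6)] for _ in range(6)]   # same range, smooth ramp
```

On these toy images the hard step scores higher than the smooth ramp under (5), and restricting the average to edge pixels as in (6) removes the dependence on how many flat pixels surround the edge.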
C. Color Features
Grayscale features are widely used for image processing tasks that range from low level algorithms to highly sophisticated modules, though there is growing attention to color information [20], [7] in feature extraction and tracking. When dealing with adverse weather conditions and limited visibility, where the significance of features decreases, we attach high importance to additional color information. We extract hue and saturation from the HSV color space. For robustness, local mean values are taken in each 10x10 pixel block.
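As a small illustration of the color features, the sketch below converts the pixels of one block to HSV with Python's standard `colorsys` module and averages hue and saturation. The function name `block_hue_saturation` is hypothetical. Note that hue is a circular quantity, so a plain arithmetic mean (used here because the text speaks of local mean values) can be misleading for blocks that straddle the red wrap-around.

```python
import colorsys


def block_hue_saturation(rgb_block):
    """Mean hue and saturation of a block of (r, g, b) tuples in [0, 1]."""
    hs = [colorsys.rgb_to_hsv(r, g, b)[:2] for (r, g, b) in rgb_block]
    mean_h = sum(h for h, _ in hs) / len(hs)
    mean_s = sum(s for _, s in hs) / len(hs)
    return mean_h, mean_s


# One 10x10 block of pure blue and one of mid gray, given as flat pixel lists.
blue_h, blue_s = block_hue_saturation([(0.0, 0.0, 1.0)] * 100)
gray_h, gray_s = block_hue_saturation([(0.5, 0.5, 0.5)] * 100)
```

Both returned values already lie in [0, 1], so they can vote directly for a histogram bin.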
For all ROIs and features, their values are extracted block-wise and summarized by a histogram. Then, we combine all histograms into one extended descriptor vector, so we get vector $v = (v_1, ..., v_n)$ with $n = (13\ \text{ROIs}) \cdot (5\ \text{features}) \cdot (10\ \text{bins}) = 650$ scalar elements describing the image.
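The concatenation can be sketched as below. The ordering of ROIs and features inside the vector is an assumption made for illustration; the text only fixes the total length of 650.

```python
FEATURES = ("contrast", "min_brightness", "sharpness", "hue", "saturation")
N_ROIS, N_BINS = 13, 10


def build_descriptor(histograms):
    """Flatten histograms[roi][feature] (each a list of N_BINS values) into v."""
    v = []
    for roi in range(N_ROIS):
        for name in FEATURES:
            h = histograms[roi][name]
            if len(h) != N_BINS:
                raise ValueError(f"ROI {roi}, feature {name}: expected {N_BINS} bins")
            v.extend(h)
    return v


# Placeholder histograms (all zeros) just to show the resulting dimensionality.
dummy = [{name: [0] * N_BINS for name in FEATURES} for _ in range(N_ROIS)]
v = build_descriptor(dummy)
```

Any fixed ordering works, as long as training and classification use the same one.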
III. CLASSIFICATION
In this section, we will use the extended descriptor vector $v$ as described in the last section to decide on the image class. In our case, the classes correspond to weather situations which we divide into C = {clear weather, light rain, heavy rain}. Thus, the problem of classification can be thought of as finding some function $f$ that maps from descriptor space $D$ into the classes $C$ with $c = f(v)$, where $f : D \to C$.
For a descriptor space with a small number of dimensions, such a function $f$ can be designed by hand, whereas for high-dimensional descriptor spaces (e.g. the problem discussed here: $D = \mathbb{R}^{650}$) this becomes nearly impossible for a human. The machine learning framework can be used to find such a function from training examples. Numerous methods have been proposed [4] using techniques like k-Nearest-Neighbor, Decision Trees, Neural Networks and Support Vector Machines (SVM).
As SVMs are simple, fast, and powerful, we decided to use them as our learning and classification method. In principle, a linear SVM generates a hyperplane in the descriptor space $D$ and classifies descriptors by calculating on which side of the hyperplane the descriptor vector (= point) lies. Mathematically, the hyperplane is represented by its normal vector $w$ with offset $b$. For a given descriptor $v$, a score is calculated by $s = w^T v - b$ and the final decision is $(s \gtrless 0)$. As a hyperplane can only separate two classes, several hyperplanes are needed for the multiclass case, and the scores from each hyperplane have to be combined to get the final classification.
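The scoring step can be sketched as follows. The text does not say how the per-hyperplane scores are combined, so this example assumes a one-vs-rest scheme in which the class with the highest score wins; the weights are toy values in a 2-D descriptor space, not trained parameters.

```python
def score(w, b, v):
    """Signed score s = w^T v - b of descriptor v against one hyperplane."""
    return sum(wi * vi for wi, vi in zip(w, v)) - b


def classify(hyperplanes, v):
    """Assumed one-vs-rest rule: return the class whose hyperplane scores highest.

    `hyperplanes` maps a class name to its (w, b) pair.
    """
    return max(hyperplanes, key=lambda c: score(*hyperplanes[c], v))


# Toy 2-D hyperplanes, purely illustrative.
hyperplanes = {
    "clear weather": ([1.0, 0.0], 0.5),
    "light rain": ([0.0, 1.0], 0.5),
    "heavy rain": ([-1.0, -1.0], -0.5),
}
label_a = classify(hyperplanes, [1.0, 0.0])
label_b = classify(hyperplanes, [0.0, 0.0])
```

Other combination schemes, such as pairwise (one-vs-one) voting, would fit the description in the text equally well.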
The hyperplane parameters $w$ and $b$ are optimized in the learning stage to separate the two classes as far as possible. After training, the weight vector $w$ can be evaluated to get the significance of single features for the classification outcome. Large values correspond to discriminant features, whereas small values indicate weak features.
One of the advantages of SVMs is that kernel methods can be incorporated in the algorithm. With them, non-linear decision boundaries can be found. We tested two very common kernels, linear and RBF (Radial Basis Functions), with the result that RBF may outperform the linear kernel. However, since one parameter for the RBF kernel has to be optimized manually and our descriptor space is big enough that linear separation is sufficient, we preferred applying a linear kernel.
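For reference, the two kernels in their standard textbook form are sketched below. The name `gamma` for the RBF width is an assumption; the text only states that one RBF parameter has to be tuned by hand.

```python
import math


def linear_kernel(x, y):
    """K(x, y) = x^T y."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2); gamma must be chosen by the user."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


k_lin = linear_kernel([1.0, 2.0], [3.0, 4.0])
k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
k_far = rbf_kernel([1.0, 2.0], [3.0, 4.0], gamma=0.5)
```

The linear kernel needs no tuning, which matches the reason given above for preferring it.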
IV. EXPERIMENTS
In order to overcome the problem of limited image data within the widespread field of adverse weather, we built up a database of video sequences (currently 150 sequences with 500000 single images), labeled according to their particular weather conditions. We randomly selected images from the database to build up our fixed training and testing data sets. We ensured that no image is used for both training and testing, and that each class contains an equal number of images.
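A balanced, disjoint selection of this kind could look like the sketch below. The function name, the seed handling and the toy image identifiers are all hypothetical; the text does not describe the sampling procedure beyond random selection.

```python
import random


def balanced_split(images_by_class, n_train, n_test, seed=0):
    """Draw n_train + n_test distinct images per class and split them."""
    rng = random.Random(seed)
    train, test = [], []
    for label, images in images_by_class.items():
        picked = rng.sample(images, n_train + n_test)  # no image is drawn twice
        train += [(img, label) for img in picked[:n_train]]
        test += [(img, label) for img in picked[n_train:]]
    return train, test


# Toy database with hypothetical image identifiers.
database = {
    label: [f"{label}_{i:03d}" for i in range(50)]
    for label in ("clear weather", "light rain", "heavy rain")
}
train, test = balanced_split(database, n_train=10, n_test=10)
```

Because each class is sampled without replacement before splitting, the two sets cannot share an image and every class contributes the same number of images to each set.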
[Fig. 2 images: example frames in three rows labeled clear, light rain and heavy rain]
Fig. 2. Some example images from our image database.
TABLE I
CLASSIFICATION RESULTS FOR SUBSET 1 (EXPRESSWAY ONLY), SUBSET 2 (+ RURAL SCENES) AND SUBSET 3 (+ RURAL AND URBAN SCENES). ROWS CONTAIN THE CORRECT CLASSES (THEY SUM UP TO 180, 300 AND 450 RESPECTIVELY), COLUMNS THEIR CLASSIFICATION RESULTS.
SUBSET 1 (180 images for each class)
              clear   light rain   heavy rain
clear           178            2            0
light rain        2          178            0
heavy rain        0            7          173
Total error rate: 2.04% (correct: 529, wrong: 11)
SUBSET 2 (300 images for each class)
              clear   light rain   heavy rain
clear           275           24            1
light rain       17          253           30
heavy rain        0           24          276
Total error rate: 10.67% (correct: 804, wrong: 96)
SUBSET 3 (450 images for each class)
              clear   light rain   heavy rain
clear           411           39            0
light rain       67          341           42
heavy rain        5           47          398
Total error rate: 14.81% (correct: 1150, wrong: 200)
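The totals in Table I follow directly from the confusion matrices: correct classifications lie on the diagonal and everything else is an error. The short check below recomputes them from the tabulated counts (the helper name `error_summary` is illustrative).

```python
# Rows: correct class; columns: predicted class (clear, light rain, heavy rain).
CONFUSION = {
    "SUBSET 1": [[178, 2, 0], [2, 178, 0], [0, 7, 173]],
    "SUBSET 2": [[275, 24, 1], [17, 253, 30], [0, 24, 276]],
    "SUBSET 3": [[411, 39, 0], [67, 341, 42], [5, 47, 398]],
}


def error_summary(matrix):
    """Return (correct, wrong, error rate in percent) for a confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    wrong = total - correct
    return correct, wrong, 100.0 * wrong / total


summaries = {name: error_summary(m) for name, m in CONFUSION.items()}
for name, (correct, wrong, rate) in summaries.items():
    print(f"{name}: correct {correct}, wrong {wrong}, error rate {rate:.2f}%")
```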
A. Classification results for linear SVM
We compose three subsets with increasing demands on the classification by expanding the environmental conditions from expressway only to all possible scenery:
SUBSET 1: This subset is limited to expressway scenes only with altogether 1080 images.
SUBSET 2: Here, we expanded the experiments to rural environments, taking 900 expressway scenes and 900 rural scenes into account.
SUBSET 3: The last subset is similar to SUBSET 2 but with additional 900 images of urban environments.
Fig. 2 shows some example images which illustrate the difficulty of the task. Each category contains images of a large range of brightness and color values, so any single feature would not be sufficient to detect the weather situation with an acceptable confidence.
Experiments reveal that for expressway scenes accurate classification is achieved. We investigate the results in Table I in more detail by applying binary classification to the image sets, that means we only take images from 2 classes. Images that were previously assigned to the omitted class are reassigned to the remaining two classes. It turns out that for subset 1 we achieved an error-free classification between clear and heavy rain (correct: 360, wrong: 0, error rate: 0%). The error rate between clear and light rain is 1.1% (correct: 356, wrong: 4). The most uncertain decision is between light rain and heavy rain (correct: 353, wrong: 7, error rate: 1.9%). Even humans would not unanimously agree on the correct category of images of these two classes, as the border between light and heavy rain is fluid. Fig. 3(a) shows the corresponding ROC curves, which emphasize the quality of the classification result.
With increasing demands due to changing environments (subset 2 and subset 3), the accuracy decreases. This is because distance dependent features can hardly be extracted from rural scenes, since the sub-ROIs do not reflect a robust distance estimation anymore (obstacles in front of the vehicle, closed scene with surrounding objects...).
Classwise comparison of the results for subset 2 shows error rates up to 10.2%. Remarkably, classification between clear weather and heavy rain is still very accurate (correct: 597, wrong: 3, error rate: 0.5%). In subset 3, classification between clear and heavy rain again remains at a low error rate of less than 1% (correct: 892, wrong: 8, error rate: 0.9%).
[Fig. 3 plots: six ROC panels showing P(TP) from 0.50 to 1.00 over P(FP) from 0.00 to 0.50. Panels (a) subset 1 (expressway), (b) subset 2 (expressway & rural) and (c) subset 3 (expressway, rural & city) each compare light vs. heavy, clear vs. heavy and clear vs. light. Panels (d) clear vs. light rain, (e) light rain vs. heavy rain and (f) clear vs. heavy rain each compare subsets 1 to 3.]
Fig. 3. Receiver Operating Characteristic (ROC) curves for classification results. Axes cropped to show the top left quarter only.
Most misclassifications arise from situations that are hard to define, i.e. that are somewhere in between clear and light rain or light rain and heavy rain. The second main source of error is outlier images where bridges and other objects confuse the extracted image statistics. However, since weather conditions do not change instantly, it is possible to classify multiple times and combine the results to improve accuracy. Further optimization is possible by using non-linear SVMs that better suit the application.
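One simple way to combine repeated classifications is a majority vote over the most recent frames. The text does not specify a combination rule, so the sliding window, its length and the class name `TemporalVote` below are assumptions chosen for illustration.

```python
from collections import Counter, deque


class TemporalVote:
    """Majority vote over the last `window` per-frame classification results."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)  # oldest result drops out automatically

    def update(self, label):
        """Add one per-frame result and return the current majority label."""
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]


# A single outlier frame (e.g. a bridge) does not flip the combined result.
vote = TemporalVote(window=5)
frames = ["clear", "clear", "heavy rain", "clear", "clear"]
smoothed = [vote.update(f) for f in frames]
```

A longer window suppresses more outliers but reacts more slowly when the weather really changes.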
Execution times are 1.8s per image on a Centrino 2.4GHz running Matlab. Nearly all of this time is spent on feature extraction, which can be reduced significantly by using optimized code (possibly on the GPU for real time usage). In any case, since high measuring rates > 0.5Hz are not necessary, this approach is already applicable for DAS (1.8s per image corresponds to roughly 0.56Hz).
B. Feature evaluation
To benchmark our approach against existing methods, the proposed features are evaluated with regard to their significance for the classification decision as well as their overall performance. However, studying the effects of omitted features on classification results leads to a known problem with linear SVM kernels: if the dimensionality of descriptor space $D$ drops below a lower bound, a linear separation of the features is no longer possible. For that reason, we used a non-linear RBF kernel for the subsequent evaluation.
In section II, we proposed a novel image descriptor for the task of weather classification. In order to benchmark its performance, we additionally extracted color wavelets as well as a combination of SIFT features and color histograms and compared the classification results. As depicted in Fig. 4(a), the proposed features clearly outperform both standard image descriptors.
Low error rates in SVM classification can only be achieved with optimal feature selection. As mentioned in section III, parameter $w$ of the SVM tells us the significance of each dimension of descriptor space $D$. In our experiments, all feature weights are evenly distributed. That means no single feature and no combination of only some features is able to achieve high discrimination; the descriptiveness lies in the combination of all proposed features. We verified these results by omitting single features and running the tests again. Altogether, there are $\sum_{i=1}^{5} \binom{5}{i} = 31$ possible combinations of the proposed features. Fig. 4(b) shows the classification error for all possible feature combinations. It can be observed that the classification errors of combinations with the same number of features are close to their mean, whereas a classification improvement can only be achieved by taking additional features into account.
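The count of 31 is the number of non-empty subsets of five features, i.e. $2^5 - 1$. The sketch below enumerates them with the standard library; the feature names are shorthand labels for the five features of section II.

```python
from itertools import combinations
from math import comb

FEATURES = ("contrast", "min_brightness", "sharpness", "hue", "saturation")

# All non-empty feature subsets, grouped by how many features they contain.
subsets = [c for k in range(1, len(FEATURES) + 1) for c in combinations(FEATURES, k)]
per_size = {k: comb(len(FEATURES), k) for k in range(1, len(FEATURES) + 1)}

print(len(subsets))   # total number of combinations
print(per_size)       # number of combinations for each subset size
```

Grouping by subset size mirrors the comparison in Fig. 4(b), where errors are compared among combinations with the same number of features.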
References
Histograms of oriented gradients for human detection
TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Object recognition from local scale-invariant features
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.
Video Google: a text retrieval approach to object matching in videos
TL;DR: An approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video, represented by a set of viewpoint invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion.
Frequently Asked Questions (11)
Q1. What are the contributions in "Classification of weather situations on single color images" ?

The authors present an approach that is able to distinguish between multiple weather situations based on the classification of single monocular color images, without any additional assumptions or prior knowledge. 

Future work will expand C by adding other weather situations like fog to their database. Improvements of the overall classification results could be achieved by further in-depth studies to non linear SVM-kernels. 

The approach achieves low error rates of less than 1% for the distinction between clear weather and heavy rain and even acceptable error rates for the three-class-case. 

Specialized methods on certain weather situations can then be invoked based on the classification result to improve existing vision algorithms. 
