
Driver Fatigue Classification With Independent Component by Entropy Rate Bound Minimization Analysis in an EEG-Based System

Abstract
This paper presents a two-class electroencephalography (EEG)-based classification of driver fatigue (fatigue state versus alert state) using data from 43 healthy participants. The system uses independent component by entropy rate bound minimization analysis (ERBM-ICA) for source separation, autoregressive (AR) modeling for feature extraction, and a Bayesian neural network as the classification algorithm. The classification results demonstrate a sensitivity of 89.7%, a specificity of 86.8%, and an accuracy of 88.2%. The combination of ERBM-ICA (source separator), AR (feature extractor), and Bayesian neural network (classifier) provides the best outcome (p-value < 0.05), with the highest area under the receiver operating curve (AUC-ROC = 0.93) compared with other methods such as power spectral density (PSD) as feature extractor (AUC-ROC = 0.81). The results of this study suggest the method could be utilized effectively for a countermeasure device for driver fatigue identification and other adverse event applications.


© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for
all other uses, in any current or future media, including reprinting/republishing this material for
advertising or promotional purposes, creating new collective works, for resale or redistribution to
servers or lists, or reuse of any copyrighted component of this work in other works.

Index Terms: electroencephalography (EEG), driver fatigue,
autoregressive (AR) model, independent component analysis,
entropy rate bound minimization, Bayesian neural network.
Manuscript received June 15, 2015; revised January 21, 2016; revised
February 10, 2016; accepted February 15, 2016.
Rifai Chai, Ganesh R. Naik, Tuan N. Nguyen, Sai Ho Ling, and Hung T.
Nguyen are with the Centre for Health Technologies, Faculty of Engineering and
Information Technology, University of Technology, Sydney, NSW 2007,
Australia (e-mail: Rifai.Chai@uts.edu.au; Ganesh.Naik@uts.edu.au;
TuanNghia.Nguyen@uts.edu.au; Steve.Ling@uts.edu.au;
Hung.Nguyen@uts.edu.au).
Yvonne Tran is with the Centre for Health Technologies, University of
Technology, Sydney and the Kolling Institute of Medical Research, the
University of Sydney (e-mail: Yvonne.Tran@uts.edu.au).
Ashley Craig is with the Kolling Institute of Medical Research, Sydney
Medical School, The University of Sydney (e-mail: a.craig@sydney.edu.au).
I. INTRODUCTION
Driver-related fatigue is a leading factor in road accidents
that can lead to serious injuries and fatalities in
transportation [1]. Driver fatigue has been described as a
feeling of tiredness and reduced alertness when driving which
is associated with drowsiness, and impairs capability and
willingness to perform the driving task [2]. The symptoms of
driver fatigue include increased feelings of tiredness
(yawning, sore or heavy eyes), slower reaction time, lack
of concentration during driving and reduced control of the speed
of the vehicle [3, 4]. Fatigue is believed to contribute to 14-20%
of motor vehicle accidents [5, 6], and it not only poses a
risk to drivers themselves in terms of injuries and fatalities,
but could also result in injury to passengers, other vehicle
drivers, cyclists and pedestrians. As a result, an automated
driver fatigue countermeasure/monitoring system with
robust and reliable fatigue classification accuracy is required
as a strategy to reduce fatigue-related risks on the road [6, 7,
8].
Currently, measurements of fatigue include the following:
(i) psychological measurements employing psychometric
questionnaires that assess an individual’s self-reported fatigue
[9, 10], (ii) video measurement as an indicator of performance
such as facial expression, reaction time, steering errors and
lane deviation [11], and (iii) physiological measurements such
as electroencephalography (EEG) [4, 8, 12] for brain signal
measurement, electrooculography (EOG) [13, 14] and other
eye tracking systems [15] for eye movement detection, and
finally electrocardiography (ECG) to detect the heart rate or
heart rate variability changes associated with fatigue [16, 17].
Using psychological self-report for a fatigue countermeasure
device during driving would arguably be
problematic, as it relies on an individual’s potentially
unreliable or biased subjective feedback, the strategy itself
may distract the driver, and it takes time to validate the
person’s questionnaire responses as an overall indicator of the
fatigue symptom [1, 18]. Moreover, video recording the
driver’s face during driving is an indirect measure of
fatigue and may lead to privacy issues. Physiological
measurements of fatigue using EOG, ECG and EEG have
been explored widely [19]. For example, an increase in eye
blink rates using EOG during driving may indicate impending
fatigue [13]. Changes in heart rate variability (HRV) have
been shown to be related to fatigue [16] while detection of the
changes in brain activity using EEG has also been shown to be
related to the fatigue state [4, 8]. EEG is considered to be a
significant and reliable method of detecting fatigue, as it
directly measures neurophysiological activity in the human
brain [4]. Accordingly, this paper explores strategies for
improving the fatigue vs. alert classification in an EEG-based
system.
Rifai Chai, Member, IEEE, Ganesh R. Naik, Senior Member, IEEE,
Tuan N. Nguyen, Senior Member, IEEE, Sai Ho Ling, Senior Member, IEEE, Yvonne Tran,
Ashley Craig and Hung T. Nguyen, Senior Member, IEEE
EEG provides a high temporal resolution of brain activity in
which multiple neural generators may be simultaneously
active [20]. As a result, multivariate techniques can unearth
more complex connections between the dependent and
independent variables in EEG data. Independent components
analysis (ICA) is one of these multivariate techniques and is
typically used as a source separation and estimation tool [21,
22]. ICA is one of the so-called blind source separation (BSS)
techniques that utilize both lower- and higher-order statistics to
decompose sets of linearly mixed variables into their independent
components (ICs). In the recent past, ICA has been
extensively used for EEG signal processing, especially for
brain-computer interface (BCI) applications [23]. One of the
advantages of using ICA for EEG is that it decomposes the
linearly mixed neural activities into their constituent ICs. Moreover,
an advantage of using ICA-based methods for EEG analysis is
that no explicit prior knowledge about brain activity is needed
to estimate the source components [24, 25].
Most of the existing ICA algorithms exploit higher-order
and second-order statistics to separate the sources on the basis
of their non-Gaussianity. Recently, ICA by entropy rate bound
minimization (ICA-ERBM) has emerged as an effective
source separation technique. The algorithm takes both the
non-Gaussian property and sample correlation into account by
minimizing the mutual information rate. It was originally
introduced as a full BSS algorithm that accommodates sources
with a more general temporal structure, e.g., sources that are
second-order white but higher-order correlated [26]. This paper uses
ICA-ERBM for the separation of EEG fatigue data.
The basic functional components of EEG-based fatigue
classification are similar to those of other EEG-based classifications
and consist of several elements, including (i) brain signal
measurement and data acquisition using EEG technology and (ii)
computational intelligence algorithms for pre-processing,
feature extraction and classification [27, 28]. For
feature extraction in EEG analysis, power spectral density
(PSD) has been used widely, especially in the study of fatigue
[4]. Power spectrum estimation converts the EEG data from the
time domain into the frequency domain. This study explores
the autoregressive (AR) model, as it is an effective tool for EEG
feature extraction, and compares it with
the PSD method [29, 30, 31]. For the
classification algorithm, a Bayesian neural network, which is capable of
providing an optimal structure [32], is used to classify the
two-state output (fatigue state vs. alert state).
The main contribution of this paper is the novel
combination of source separation (ICA-ERBM) and EEG feature
extraction components, which has not been explored
previously for fatigue classification, with the goal of improving
classification accuracy. These components/algorithms include
the use of entropy rate bound minimization as the source
separation technique, AR modelling as the feature
extraction algorithm and the Bayesian neural network as the
classification algorithm.
The structure of this paper is as follows: section II covers
the methodology: general structure, data collection, feature
extraction methods and classification algorithms. Section III
describes results, followed by section IV for discussion and
section V for the conclusions.
II. BACKGROUND AND METHODOLOGY
A. General Structure
The components of the EEG-based fatigue classification
system presented in this paper are shown in Fig. 1. First, EEG
data were collected in a simulated driver fatigue study.
Three signal pre-processing modules followed: the first removes
EEG artifacts, the second performs additional source
separation, and the third performs window segmentation.
Next, a feature extraction module transforms the
signals into useful features. The features are passed to a
classification algorithm that includes optimization, training
and classification tasks. The classification comprises two-state
outputs: fatigue state or alert (non-fatigue) state. The
desired output of the neural network is one for the fatigue
state and zero for the non-fatigue (alert) state.
Fig. 1. Components of EEG-based fatigue classification system
B. EEG Experiment and Pre-processing
This EEG experiment used the dataset of forty-three healthy
participants aged 18 to 55 years, obtained from a previous
experimental study [4, 12]. Participants were given
information about the study, and informed consent was obtained
before they commenced the experiment. This study was
approved by the Human Research Ethics Committee of the
University of Technology Sydney (UTS). The study involved
a repeated-measures experimental intervention in which
baseline EEG and subjective levels of fatigue (using
psychometric assessment) were recorded, followed by an intervention
consisting of a monotonous simulated driving task, and then post-task
EEG measures and post-task subjective levels of fatigue [12]. The
Divided Attention Steering Simulator (DASS) from Stowood
Scientific Instruments was used as a driver simulator task.
Experiments were conducted in a noise-, stimulus-, and
temperature-controlled laboratory. Participants were asked to
ensure that they kept driving at the centre of the road in the
simulation task. The DASS also required participants to
perform a reaction-time response to a target number that
appeared in any of the four corners of the computer screen;
the targets were shown at random times during driving.
The experiment was terminated when decrements in performance
were detected, such as when participants drove off the road in the
simulated driving task for more than 15 seconds, or if
participants showed consistent facial signs indicating fatigue
(such as head nodding and extended eye closure) on video
monitoring. The ‘15 seconds off the road’ condition
was used to allow a reasonable time for the video monitoring
assessment to check the fatigue occurrence. Although the
experiment was terminated if the deviations off the road
occurred for greater than 15 seconds, this may not be where
the fatigue onset had occurred. In fact, fatigue was most likely
to have occurred before this point. Video monitoring was used
during the real time recording to check for consistent
physiological signs of fatigue such as tired eyes, head nodding
and extended eye closure. The video monitoring is a
subjective assessment. Therefore a post recording validation is
needed to verify fatigue occurrence [4, 33].
Fatigue occurrence was validated using three methods: (i)
monitoring for consistent physiological signs of fatigue such
as tired eyes, nodding and extended eye closure verified
further with EOG analysis of blink rate and eye closure, (ii)
performance decrements such as deviation off the road and
(iii) subjective psychometric measures using a validated
fatigue questionnaire called the Chalder Fatigue Scale and the
Stanford Sleepiness scale, which measures a person’s
perception of how drowsy they feel [12]. The validation of
fatigue in these participants has been previously reported in
studies [4, 12]. The maximum time for the simulated driving
was specified at 2 hours. EEG signals were recorded by
attaching a 32-channel EEG system, the Active-Two system
from Biosemi with the electrode positions based on the
International 10-20 system. These positions are: FP1, AF3, F7,
F3, FC1, FC5, T7, C3, CP1, CP5, P7, P3, PZ, PO3, O1, OZ,
O2, PO4, P4, P8, CP6, CP2, C4, T8, FC6, FC2, F4, F8, AF4,
FP2, FZ and CZ. The recorded EEG data were downsampled
from 2048 Hz to 256 Hz.
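The paper does not state how the downsampling was carried out. As a minimal sketch only, one common approach is to low-pass filter the signal below the new Nyquist frequency (128 Hz) and then keep every 8th sample. The function name `downsample`, the windowed-sinc filter design and the filter length are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def downsample(x, factor=8, taps_per_factor=20):
    """Low-pass filter a 1-D signal, then keep every `factor`-th sample.

    With factor=8 this maps 2048 Hz data to 256 Hz. The FIR design
    (Hamming-windowed sinc) is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    numtaps = taps_per_factor * factor + 1        # odd length keeps the filter symmetric
    n = np.arange(numtaps) - (numtaps - 1) / 2    # tap indices centred on zero
    fc = 0.5 / factor                             # cutoff in cycles/sample (new Nyquist)
    h = 2 * fc * np.sinc(2 * fc * n) * np.hamming(numtaps)
    h /= h.sum()                                  # unit gain at DC
    filtered = np.convolve(x, h, mode="same")     # centred output, no net delay
    return filtered[::factor]

# Example: 2 s of a 10 Hz tone plus a 500 Hz tone sampled at 2048 Hz.
fs_in = 2048
t = np.arange(2 * fs_in) / fs_in
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 500 * t)
eeg_256 = downsample(raw, factor=8)
```

For multichannel recordings the same function would be applied to each channel in turn. The anti-aliasing filter matters here because any content above 128 Hz would otherwise fold back into the retained band.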
For the alert group of data, the first 5 min of EEG data
recorded at the start of the driving simulation task were chosen, while the
fatigue group of data was selected from the last 5 min of EEG
data before the task was terminated, where consistent signs of
fatigue were identified and additionally verified as fatigue
using the EOG signals. Then, in each group of data (alert and
fatigue), 20 s segments were taken, and the first 20 s segment
with the least movement artifact was selected for further analysis.
As a result, 20 s of alert state and 20 s of fatigue state
data were available from each participant.
The data were then processed in the signal pre-processing
module. Here, second-order blind identification (SOBI)
and canonical correlation analysis (CCA) were used to
remove artifacts and disturbances from sources such as
eye activity, muscle activity and heart signals. The data were
then fed to the source separation stage using ICA-ERBM. The novel
combination of the source separation (ICA-ERBM) and EEG
feature extraction method used for the classification is
explained in detail next.
C. Source separation using ICA-ERBM
ICA by ERBM exploits both sample correlation and non-Gaussianity by minimizing the mutual information rate. It was originally introduced as a full BSS (FBSS) algorithm [26]. Let $N$ statistically independent, zero-mean sources $s(t) = [s_1(t), \ldots, s_N(t)]^T$ be mixed with an $N \times N$ mixing matrix $A$ such that we obtain the mixture $x(t) = [x_1(t), \ldots, x_N(t)]^T$ as $x(t) = A s(t)$, where $T$ and $t$ denote the transpose and the time index, respectively. ICA separates the mixture using $y(t) = W x(t)$, where $y(t) = [y_1(t), \ldots, y_N(t)]^T$ and $W$ is the unmixing or separation matrix. ICA assumes the sources are independent and identically distributed (IID); hence, a cost for achieving the separation of these $N$ independent sources is the mutual information $I(y_1; \ldots; y_N)$ among the $N$ random variables $y_n$, $n = 1, \ldots, N$, which is:
$$ I(y_1; \ldots; y_N) = \sum_{n=1}^{N} H(y_n) - \log\lvert\det(W)\rvert - H(x) \tag{1} $$
where $H(y_n)$ represents the entropy of the $n$th individual separated source, and the entropy of the observation $H(x)$ is constant with respect to the unmixing matrix $W$. $H(x)$ is the entropy of any individual member of the separated process.
Fig. 2. Data Segmentation of Fatigue study (channels Ch1 to Ch32; 20 s record; 2 s windows numbered 1 to 73; 0.25 s gap).
However, a new cost function is needed since Eq. (1) cannot capture most of the temporal information of the sources. The new cost function is therefore given as:
$$ I_r(y_1; \ldots; y_N) = \sum_{n=1}^{N} H_r(y_n) - \log\lvert\det(W)\rvert - H_r(x) \tag{2} $$
where $H_r(y_n) = \lim_{t \to \infty} H[y_n(1), \ldots, y_n(t)]/t$ is the entropy rate of the $n$th process $y_n$, and the entropy rate $H_r(x) = \lim_{t \to \infty} H[x(1), \ldots, x(t)]/t$ of the observed vector-valued process $x$ is constant with respect to the unmixing matrix $W$. $H_r(x)$ is the entropy rate of the separated process of the individual. Equation (1) is modified using the method proposed by Li and Adali [34] to obtain a new entropy estimator and cost function. The new cost function is expressed as:
$$ J(W, p_1, \ldots, p_N) = \sum_{n=1}^{N} H(v_n) - \log\lvert\det(W)\rvert \tag{3} $$
where $v_n(t) = \sum_{q=0}^{p-1} a_n(q)\, y_n(t-q)$, $y_n = w_n^T x$ is the $n$th separated source, and $a_n = [a_n(0), \ldots, a_n(p-1)]^T$ are the filter coefficients. The algorithm is then optimized to obtain a new $W$ that minimizes the mutual information rate. The ICs $\hat{s}(t)$ are recovered using $\hat{s}(t) = W x(t)$, where $W$ and $x(t)$ are the unmixing matrix and the recordings (mixtures), respectively. More detail of the algorithm is provided elsewhere [26].
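For readers who want the intermediate step behind the mutual information cost, the following short derivation may help. It uses only standard properties of differential entropy and assumes that $W$ is square and invertible. It is offered as a sketch of the usual argument and is not reproduced from [26].

```latex
% Mutual information among the outputs, by definition:
I(y_1; \ldots; y_N) = \sum_{n=1}^{N} H(y_n) - H(y)

% Differential entropy under the invertible linear map y = W x:
H(y) = H(x) + \log\lvert\det(W)\rvert

% Substituting the second line into the first:
I(y_1; \ldots; y_N) = \sum_{n=1}^{N} H(y_n) - \log\lvert\det(W)\rvert - H(x)
```

Because $H(x)$ does not depend on $W$, minimizing the mutual information over $W$ is equivalent to minimizing $\sum_{n} H(y_n) - \log\lvert\det(W)\rvert$, which is the form that the entropy rate cost functions build on.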
D. Data Segmentation and Feature Extraction
Before performing feature extraction, the ICA-ERBM separated
data are segmented as illustrated in Fig. 2. A moving window
of 2 s with an overlap of 1.75 s was applied to the 20 s segments,
which provided 73 overlapping segments for each state. With
the 43 participants, a total of 3139 data units were
formed for the alert state and another 3139 for the fatigue
state.
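The segment counts above follow directly from the window arithmetic, which can be sketched as below. The function name `sliding_windows` and the array layout (channels by samples) are assumptions made for illustration; the window length, step and sampling rate come from the text.

```python
import numpy as np

def sliding_windows(data, fs=256, win_s=2.0, step_s=0.25):
    """Cut a (channels, samples) array into overlapping windows.

    A 2 s window with 1.75 s overlap advances by 0.25 s per step.
    Returns an array of shape (n_windows, channels, window_samples).
    """
    win = int(round(win_s * fs))      # 512 samples per window at 256 Hz
    step = int(round(step_s * fs))    # 64 samples between window starts
    n_samples = data.shape[-1]
    starts = range(0, n_samples - win + 1, step)
    return np.stack([data[..., s:s + win] for s in starts])

# Example: one 20 s record with 32 channels at 256 Hz.
record = np.tile(np.arange(20 * 256), (32, 1))
windows = sliding_windows(record)
n_per_state = windows.shape[0]       # (20 - 2) / 0.25 + 1 windows
n_total = n_per_state * 43           # pooled over the 43 participants
```

Note that windows from the same 20 s record share most of their samples, so they are not statistically independent; this is worth keeping in mind when splitting data into training and test sets.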
An autoregressive (AR) model was applied as the feature
extraction algorithm in combination with the ICA-ERBM
separated sources in this study. AR modelling has been used in
EEG studies as an alternative to Fourier-based methods [29, 30,
31]. The advantage of AR modelling is its inherent capacity to
model the peak spectra that are characteristic of EEG
signals; it is an all-pole model, making it efficient for
resolving sharp changes in the spectra. The fast Fourier
transform (FFT) is a widely used nonparametric approach that
can provide accurate and efficient results, but it has limited
spectral resolution for short data segments [35]. Further,
previous EEG classification studies have shown that
AR modelling achieved better results [29, 36, 37].
In AR modelling, each sample of the signal is expressed as a
linear combination of its previous values plus an error term, which
is assumed to be a random process that is independent of the
previous values of the signal. The Burg method is the most popular
of the AR methods; it recursively estimates the reflection
coefficients of an AR lattice filter by minimizing the mean of the
forward and backward least-squares linear prediction errors. This
method is used in this paper to estimate the AR coefficients.
AR modelling requires the selection of the model order
number. The best AR model order number represents a
consideration of both the signal complexity and the sampling
rate. If the AR model order is too low, the whole signal cannot
be captured in the model. On the other hand, if the model
order is too high, then more noise is captured [38]. In this
study, different AR order numbers were tested and the order
providing the best classification accuracy was the chosen AR
order number. The calculation of the AR modelling for ICA-
ERBM separated sources is as follows:
$$ \hat{s}(t) = \sum_{k=1}^{P} a(k)\, \hat{s}(t-k) + e(t) \tag{4} $$
where $\hat{s}(t)$ represents the ICA-ERBM separated EEG data (sources) at time $t$, $P$ is the order of the AR model, $e(t)$ represents zero-mean white noise with finite variance, and $a(k)$ represents the AR coefficients, which need to be estimated from the finite samples of data $\hat{s}(1), \hat{s}(2), \ldots, \hat{s}(N)$.
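As an illustration of how AR coefficients can be estimated with the Burg recursion, the sketch below updates forward and backward prediction errors stage by stage and builds the coefficients with the Levinson update. It is a textbook-style implementation written for this example, not the authors' code. The function name `burg_ar` is hypothetical, and the returned coefficients assume the sign convention $\hat{s}(t) = \sum_k a(k)\hat{s}(t-k) + e(t)$.

```python
import numpy as np

def burg_ar(x, order):
    """Estimate AR coefficients a(1..P) and noise variance with Burg's method."""
    x = np.asarray(x, dtype=float)
    ef = x[1:].copy()                 # forward prediction errors
    eb = x[:-1].copy()                # backward prediction errors (delayed by one)
    poly = np.array([1.0])            # prediction-error filter [1, c1, ..., cP]
    noise_var = np.dot(x, x) / len(x)
    for _ in range(order):
        # Reflection coefficient minimising forward + backward error power.
        k = -2.0 * np.dot(eb, ef) / (np.dot(ef, ef) + np.dot(eb, eb))
        # Levinson update of the prediction-error filter.
        poly = np.concatenate([poly, [0.0]])
        poly = poly + k * poly[::-1]
        # Lattice update of the error sequences, then realign them in time.
        ef, eb = (ef + k * eb)[1:], (eb + k * ef)[:-1]
        noise_var *= 1.0 - k * k
    # Flip signs so that s(t) = sum_k a(k) s(t-k) + e(t).
    return -poly[1:], noise_var

# Example on a synthetic AR(2) process with known coefficients.
rng = np.random.default_rng(0)
true_a = np.array([0.75, -0.5])
e = rng.standard_normal(5000)
s = np.zeros_like(e)
for t in range(2, len(s)):
    s[t] = true_a[0] * s[t - 1] + true_a[1] * s[t - 2] + e[t]
a_hat, noise_var = burg_ar(s, order=2)
```

In a feature extraction setting, the `order` coefficients estimated for each channel of a window would be concatenated into one feature vector; the order itself is a tuning choice, as the text notes.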
For comparison purposes, power spectral density (PSD), a
popular feature extractor in fatigue studies, is also used in this
paper [4, 8]. The PSD of the Welch spectrum is given by:
$$ \hat{P}_w(f) = \frac{1}{S} \sum_{l=1}^{S} \hat{P}_l(f) \tag{5} $$
where $\hat{P}_w(f)$ denotes the Welch PSD estimate, $\hat{P}_l(f)$ denotes the periodogram estimate of the $l$-th segment and $S$ denotes the number of segments.
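The averaging of segment periodograms can be sketched as follows. The Hann window, the 50% overlap and the segment length are assumptions chosen for the example, since the text does not specify them, and `welch_psd` is a hypothetical helper name.

```python
import numpy as np

def welch_psd(x, fs, seg_len, overlap=0.5):
    """Welch PSD: the mean of windowed periodograms over S segments."""
    x = np.asarray(x, dtype=float)
    step = int(seg_len * (1 - overlap))
    window = np.hanning(seg_len)
    scale = fs * np.sum(window ** 2)              # density scaling (units^2 / Hz)
    periodograms = []
    for start in range(0, len(x) - seg_len + 1, step):
        seg = x[start:start + seg_len] * window
        periodograms.append(np.abs(np.fft.rfft(seg)) ** 2 / scale)
    psd = np.mean(periodograms, axis=0)           # the average over segments
    psd[1:-1] *= 2                                # one-sided spectrum (even seg_len)
    freqs = np.fft.rfftfreq(seg_len, d=1 / fs)
    return freqs, psd

# Example: a 2 s window containing a 10 Hz sinusoid sampled at 256 Hz.
fs = 256
t = np.arange(2 * fs) / fs
freqs, psd = welch_psd(np.sin(2 * np.pi * 10 * t), fs, seg_len=256)
peak_hz = freqs[np.argmax(psd)]
total_power = np.sum(psd) * (freqs[1] - freqs[0])
```

With a 2 s window, the choice of segment length trades frequency resolution against the number of periodograms averaged, which is the short-segment limitation of Fourier-based estimates mentioned earlier.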
E. Classification Algorithm
One of the crucial issues in developing a neural network is
generalization, defined by how well the network can make
predictions for new cases that are not in the training data. A
network that is not complex enough may fail to capture the
structure in the data, leading to “under-fitting”, while a network
that is too complex may fit the noise in the training data rather
than just the underlying signal, leading to “over-fitting”. The
complexity of the network depends on the network architecture
and the magnitudes of the network weights and biases.
Several frameworks have been proposed to prevent MLP
networks from under-fitting or over-fitting such as growing,
pruning, global searches, and early stopping. However, these
frameworks require intensive searching for network
parameters or do not make maximum use of the available data
[32, 39].
The Bayesian neural network uses a three-layered
feed-forward structure and is modeled by:
$$ z_k(x, w) = f\left(b_k + \sum_{j=1}^{l} w_{kj}\, f\left(b_j + \sum_{i=1}^{m} w_{ji}\, x_i\right)\right) \tag{6} $$
where $f(\cdot)$ denotes the transfer function (the hyperbolic
tangent function is used in this paper),
$m$ denotes the input
