Proceedings ArticleDOI

A hybrid model combining neural networks and decision tree for comprehension detection.

08 Jul 2018-pp 1-7
TL;DR: This paper investigates a hybrid model comprising multiple artificial neural networks with a final C4.5 decision tree classifier, to assess whether classification decisions can be explained through production rules; the decision tree achieves higher accuracy, but its significant size calls the transparency of the rules to a human into question.
Abstract: The Artificial Neural Network is generally considered to be an effective classifier, but also a “Black Box” component whose internal behavior cannot be understood by human users. This lack of transparency forms a barrier to acceptance in high-stakes applications by the general public. This paper investigates the use of a hybrid model comprising multiple artificial neural networks with a final C4.5 decision tree classifier to investigate the potential of explaining the classification decision through production rules. Two large datasets collected from comprehension studies are used to investigate the value of the C4.5 decision tree as the overall comprehension classifier in terms of accuracy and decision transparency. Empirical trials show that higher accuracies are achieved through using a decision tree classifier, but the significant tree size questions the rule transparency to a human.

Summary (3 min read)

Introduction

  • The Artificial Neural Network is generally considered to be an effective classifier, but also a “Black Box” component whose internal behavior cannot be understood by human users.
  • Keywords—knowledge rule extraction, artificial neural networks, decision trees, backpropagation, comprehension, FATHOM, Silent Talker, non-verbal behavior.
  • Many studies have been conducted to compare decision trees with neural networks – a more recent study of multiple classifiers can be found in Delgado et al. [13].
  • The experimental study known as “Termites”, reported in [15], was used to identify whether multiple channels of non-verbal behaviour associated with high and low human comprehension reside within a video-recorded British (UK-based/English speaking) sample of participants in a classroom environment.

C. FATHOM

  • FATHOM utilises a bank of BPANNs to capture, monitor and detect multiple channels of human non-verbal behaviour continuously.
  • The NVBs identified are then coded into individual channels and group channels, e.g. all channels associated with eye behaviour.
  • States are typically collated over a time interval, e.g. 3 seconds grouped into one vector for Silent Talker, but this can be varied depending on the problem domain; FATHOM uses a 1-second interval.
  • Each vector is passed to the final BPANN Comprehension classifier which outputs a value between +1 and -1, indicating whether the person exhibits high comprehension (+1) or low comprehension (-1) during that period of time.
  • FATHOM simultaneously monitors 40 non-verbal behavioural channels that include 20 channels capturing facial features such as blushing and 16 channels capturing eye movement such as right eye looking left.

B. Study 2: Termites

  • Prior to the study a short learning topic was selected, which was a factual digital video on Termites with a total duration of 8 minutes 40 seconds.
  • The experts agreed both the question difficulty levels and the contents of the answer that the participants should provide.
  • Forty participants were selected to participate in the study, from academic and technical staff at the Manchester Metropolitan University (MMU) in the UK.
  • Each participant was invited to engage individually in a short learning task, which was comprised of watching a short video on Termites and then answering a small set of associated assessment questions whilst being video recorded.
  • The experimental methodology was to take the pair of datasets outlined in Section III and use them to train and evaluate C4.5 decision trees to replace the final stage BPANN classifier.

A. Using a Back Propagation ANN as the final classifier

  • For each study, FATHOM’s object locators and pattern detectors were used to extract and collate the non-verbal vector-based dataset for the purpose of training the final BPANN classifier.
  • For both studies, HIV Informed Consent and Termites, each vector in the final dataset covered a 1-second time period and represented the state changes for the compiled non-verbal channels over the period.
  • Each channel was normalised in the range +1 to -1.
  • The last attribute in each vector was the desired classification, with discrete values of +1 for comprehension and -1 for non-comprehension.
  • Weight initialisation: automatic range (0±1/sqrt(fan-in)), where fan-in represents the number of inputs entering the neuron.

B. Using a C4.5 Decision Tree as the final classifier

  • In order to use a decision tree as a comprehension classifier, the final BPANN comprehension classifier shown in Figure 1 was replaced by the C4.5 decision tree algorithm.
  • The Weka implementation of C4.5, known as J48, was used.
  • 4) Finally, further experiments were performed varying confidence interval and MNO independently, to find the most severely pruned tree for each dataset that was not significantly worse than the baseline in terms of CA.

A. BPANN Comprehension Classifier

  • Table I shows the overall best BPANN Classifiers for both studies.
  • Comprehension (C%) and Non-comprehension (NC%) are the percentages of comprehension and non-comprehension vectors, respectively, which were classified correctly.
  • Overall % is the total normalised percentage of comprehension and non-comprehension vectors classified correctly.

B. C4.5 Comprehension Classifier

  • Table II shows the results of varying the Confidence Interval used for pruning in decision tree construction (Pruning CI).
  • The corrected re-sampled t-test was used for significance testing of the following results.
  • None of the other pruning levels produces an increase in CA.
  • Table III shows the results obtained from varying the minNumObj – the minimum number of instances per leaf.
  • There was a significant drop in the overall accuracy when the minimum number of objects increased to 5.
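The corrected re-sampled t-test mentioned above can be sketched as follows. The source does not give the formula; this sketch assumes the usual form of the correction (due to Nadeau and Bengio), in which the variance of the per-run accuracy differences is inflated by the ratio of test to training set size. The function name and the example numbers are illustrative only.

```python
import math
from statistics import mean, variance


def corrected_resampled_t(diffs, n_train, n_test):
    """Corrected re-sampled t statistic for paired accuracy differences.

    diffs   : per-run differences in accuracy between two classifiers
    n_train : number of training instances in each run
    n_test  : number of test instances in each run
    The statistic is compared against a t distribution with len(diffs) - 1
    degrees of freedom.
    """
    k = len(diffs)
    d_bar = mean(diffs)
    var = variance(diffs)  # sample variance (k - 1 in the denominator)
    # The n_test / n_train term compensates for overlapping training sets.
    return d_bar / math.sqrt((1.0 / k + n_test / n_train) * var)


# Illustrative numbers only, not results from the paper.
t_stat = corrected_resampled_t([1.0, 2.0, 3.0], n_train=90, n_test=10)
```

With 10-fold cross-validation the ratio n_test / n_train is 1/9, so the correction noticeably widens the denominator compared with a standard paired t-test.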

C. Discussion

  • Also a further experiment showed that an over-trained tree (rote-learning of the dataset) for the Termites dataset has a size of 6489 nodes; more than double the size of the optimally pruned tree.
  • These approaches should contribute to a move to AI classifiers, which are themselves comprehensible by the human population.

ACKNOWLEDGMENT

  • For Study 1, the authors wish to thank the women in Tanzania who participated in this research and especially for their willingness to be video-recorded.
  • Finally, the authors wish to thank Dr Fiona Buckingham for preprocessing the data sets used in this study.
  • Rothwell, J., Bandar, Z., O’Shea, J., McLean, D. “Charting the behavioural state of a person using a backpropagation neural network”.
  • Kintsch, W. “Comprehension: A paradigm for cognition”.


Crockett, KA ORCID: https://orcid.org/0000-0003-1941-6201,
O’Shea, James ORCID: https://orcid.org/0000-0001-5645-2370,
Khan, Wasiq ORCID: https://orcid.org/0000-0002-7511-3873
and Bandar, Zuhair (2018) A hybrid model combining neural networks and
decision tree for comprehension detection. In: 2018 International Joint
Conference on Neural Networks (IJCNN), 08 July 2018 - 13 July 2018, Rio de
Janeiro, Brazil.
Downloaded from:
https://e-space.mmu.ac.uk/624526/
Publisher: IEEE
DOI: https://doi.org/10.1109/IJCNN.2018.8489621
Please cite the published version
https://e-space.mmu.ac.uk

A hybrid model combining neural networks and
decision tree for comprehension detection.
James O’Shea (1), Keeley Crockett (1), Wasiq Khan (1), Zuhair Bandar (2)
(1) School of Computing, Mathematics and Digital Technology,
Manchester Metropolitan University,
Chester Street, Manchester, M1 5GD, UK
(2) Silent Talker Ltd, Manchester, UK
J.D.OShea@mmu.ac.uk
Abstract: The Artificial Neural Network is generally
considered to be an effective classifier, but also a “Black Box”
component whose internal behavior cannot be understood by
human users. This lack of transparency forms a barrier to
acceptance in high-stakes applications by the general public. This
paper investigates the use of a hybrid model comprising multiple
artificial neural networks with a final C4.5 decision tree classifier
to investigate the potential of explaining the classification
decision through production rules. Two large datasets collected
from comprehension studies are used to investigate the value of
the C4.5 decision tree as the overall comprehension classifier in
terms of accuracy and decision transparency. Empirical trials
show that higher accuracies are achieved through using a
decision tree classifier, but the significant tree size questions the
rule transparency to a human.
Keywords: knowledge rule extraction, artificial neural
networks, decision trees, backpropagation, comprehension,
FATHOM, Silent Talker, non-verbal behavior.
I. INTRODUCTION
Non-Verbal Behaviour (NVB) was first studied systematically
by Charles Darwin and it has become a well-established part of
sciences such as biology and psychology. NVB consists of all
of the signs and signals - visual, audio, tactile and chemical -
used by human beings to express themselves apart from speech
and manual sign language. It has been postulated that NVB
features are indicators of internal mental states, in particular
that they can be used to detect deception during interviews [1].
The first system to classify deceptive behaviour automatically,
Silent Talker (ST), used Artificial Neural Networks [2].
The Silent Talker architecture is highly flexible, and has been
adapted to monitor human comprehension in clinical trials
using non-verbal behaviour, employing ANN classifiers [3].
This version is known as FATHOM and currently work is
underway to incorporate FATHOM in an intelligent tutoring
system to provide round-the-clock support in the form of
learner-adaptive online teaching and learning tutorials. For
both FATHOM and ST there has been great interest in how the
system works i.e. which non-verbal indicators are actually
conveying the information to perform the classification. This is
particularly true for Lie Detection where interrogators are
looking for techniques they can apply during questioning and
suspects are looking for countermeasures they can use to avoid
being detected, for example, the well-known myth that looking
up and to the right indicates lying. Unfortunately, although
ANNs are powerful and versatile components in the AI
toolbox, they are also black boxes with no ready explanations
of how they achieve their ends and this has been a concern for
decades [4].
There are many other fields than education in which ANNs
may make high-stakes decisions and some progress has been
made in extracting rules from ANNs, although the degree to
which solutions to reasonably complex problems could be
understood by a non-AI specialist remains debatable. These
include classifying incipient faults in a power transformer [5],
hydrological modelling [6], Credit-Risk Evaluation [7] and
software cost estimation [8]. Some progress has been made in
extracting rules from recurrent neural networks by
transforming them to finite state machines [9], and [10] has
attempted to unify various neuro-fuzzy rule approaches for
rule generation from recurrent and feedforward neural
networks in a single soft computing framework. Nevertheless,
in analysing a study of using neural networks to predict
academic performance of college students one year in advance,
Schneider et al. [11] observed that the basic problem of
communicating how they reach their conclusions in
meaningful terms has yet to be solved. They highlighted the
problem of explaining how a combination of currently high
subject performances could lead to an anticipated decrease in
the student’s achievement.
Decision trees [12] are highly effective for classification
tasks. They are also considered inherently transparent in
explaining how they reach their conclusions and may be
expressed in the form of production rules, which are generated
by learning and reasoning from feature-based examples. Many
studies have been conducted to compare decision trees with
neural networks; a more recent study of multiple classifiers
can be found in Delgado et al. [13]. In general, ANNs take
longer to train than decision trees due to the large number of
iterations required to ensure training reaches its full potential
[14]. Classification accuracy is largely dependent on the
dataset, but the transparent nature of decision trees gives
insight into the relationships between features [14]. In such a
domain as the analysis of NVB for comprehension detection,
decision trees would provide an insight into key behaviours
and their interactions.
In the FATHOM architecture to date, classification of
comprehension / non-comprehension has been performed by a
single, final back propagation artificial neural network
(BPANN), preceded by layers of BPANNs that process
individual features. It is this final stage in which an
intervention should be possible to explain how these features
indicate comprehension / non-comprehension. Therefore, the
research questions addressed in the work presented in this
paper are:
1. Can the final ANN classifier be replaced by a decision tree
without loss of performance?
2. Can the decision tree be converted into comprehensible
production rules?
For a comprehensible rule set to be possible, there must be a
limited number of rules for the human user to interpret and
these are proportional to the number of nodes in the tree.
Consequently, the primary interest in answering question 2 is
whether or not the tree has a manageable number of nodes.
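To make the link between tree size and rule count concrete, the following minimal sketch converts a toy binary decision tree into production rules, one rule per leaf. The tree, its thresholds, and the helper names are hypothetical and are not taken from the trees trained in the paper; the channel names merely echo examples given elsewhere in the text.

```python
from typing import Union

# A leaf is a class label (str); an internal node is a tuple
# (channel_name, threshold, left_subtree, right_subtree).
Tree = Union[str, tuple]


def extract_rules(tree: Tree, conditions: tuple = ()) -> list:
    """Return one IF-THEN production rule per leaf (root-to-leaf path)."""
    if isinstance(tree, str):  # leaf: emit the accumulated conditions
        antecedent = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {antecedent} THEN {tree}"]
    channel, threshold, left, right = tree
    return (
        extract_rules(left, conditions + (f"{channel} <= {threshold}",))
        + extract_rules(right, conditions + (f"{channel} > {threshold}",))
    )


def count_nodes(tree: Tree) -> int:
    """Count internal nodes plus leaves."""
    if isinstance(tree, str):
        return 1
    _, _, left, right = tree
    return 1 + count_nodes(left) + count_nodes(right)


# Hypothetical toy tree; thresholds are illustrative only.
toy_tree = (
    "left_eye_half_open", 0.0,
    "non-comprehension",
    ("right_eye_looking_left", 0.5, "comprehension", "non-comprehension"),
)
rules = extract_rules(toy_tree)
n_nodes = count_nodes(toy_tree)
```

For a binary tree the number of rules equals the number of leaves, which is (nodes + 1) / 2, so a tree with thousands of nodes yields thousands of rules for a human to read.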
To answer these research questions, two datasets collected
from FATHOM studies have been used. The experimental
study known as “Termites”, reported in [15], was used to
identify whether multiple channels of non-verbal behaviour
associated with high and low human comprehension reside
within a video-recorded British (UK-based/English speaking)
sample of participants in a classroom environment. The
Termites exploratory study builds upon lessons learned in prior
work [3] where evidence was found that comprehension / non-
comprehension could be detected in an African female
population sample using a BPANN. This earlier study is
known as HIV Informed Consent.
This paper continues as follows: Section II reviews related
work in non-verbal behaviour and comprehension, and then
describes the FATHOM comprehension monitoring system
that uses BPANNs. Section III describes the comprehension
scenarios from which the two datasets used in this study were
obtained. Sections IV and V describe the experimental
methodology and results. Conclusions and recommendations
for future work are presented in section VI.
II. RELATED WORK
A. Non-verbal Behaviour and Comprehension
Non-verbal behaviour comprises all of the signals or cues,
which human beings use to communicate, including visual,
audio, tactile and chemical components [16, 17]. During a
spoken dialogue, humans will often transmit non-verbal cues
before the verbal component [17], which can be used to detect
the sender’s state. It has been recognised that the face is a
source of rich information in terms of exhibiting meaningful
non-verbal behaviour. Little work has been done in the
automatic detection or classification of non-verbal behaviour.
Traditional methods employed human judges to code each
channel [18, 19]. However, each judge needs to be trained and
will provide a subjective opinion on the behaviour being
delivered by a particular channel. The process is time
consuming, and it is impossible for a human to monitor
more than a limited number of channels accurately.
Two recent research strategies for acquiring non-verbal
behavioural cues have attracted attention in the literature; these
are Facial Microexpressions [1] and using the Microsoft Kinect
computer vision algorithm [20]. Micro-expressions are said to
be a small universal set of expressions of extreme emotion:
disgust, anger, fear, sadness, happiness, surprise, and contempt,
and a formalised method of encoding them was defined by
Ekman. The weaknesses of this technique are: its results are
largely based on highly artificial “posed” images using actors
or students provided with highly specific instructions [21,22]
or even training in how to produce facial actions [23,24], low
numbers of detectable Ekman micro-expressions in
spontaneous interviews [25] and a low Classification
Accuracy (CA) for those micro-expressions actually found
[26].
The Microsoft Kinect is primarily aimed at observing
whole body gestures in commercial video game applications.
However, there has been some interest in adapting it for NVB
research. For example, facial expressions have been
investigated as indicators of happiness, anger, sadness and
surprise that are integrated with the head pose changing
information to conceive the human interaction with 3D sensing
technology [27]. It should be noted, however, that the
experimental results show emotional and head position change
rather than discrete-level accuracy in terms of emotional
classification for the four aforementioned emotions. Likewise,
it was tested on a limited number of participants (20) and an
insufficient number of facial channels (12). Typically, psychological
experiments do not provide any methodology for applying
these population sample differences to classify particular
individuals.
FATHOM (described in Section II, C) is distinguished
from these two techniques in three respects. Firstly, it uses
large numbers of features at a much finer level of granularity
than body gestures or facial expressions. Secondly, the domain
it operates in, human comprehension, has not been a previous
subject of AI research. Thirdly, it does classify an individual
person’s state of comprehension / non-comprehension based on
the non-verbal behaviour. FATHOM does not rely on high
frame-rate cameras or constrained recording environments that
facilitate the setup of the technology, nor does it depend on
specialised hardware whose future availability may be
dependent on market forces (such as the game-oriented Kinect),
making it suited for everyday classroom use.
B. Non-Comprehension
Non-comprehension is regarded as “a state of knowledge that
ranges from uncertainty to complete lack of understanding of
the materials under discussion” [28], i.e. an absence of
comprehension. The vast majority of research on
comprehension concerns reading and the understanding of
written text, initially by identifying the main ideas in the text
[28, 29]. A further elaboration is the view that successful
comprehension depends on the construction of a coherent
representation of text in memory [30]. Despite the traditional
bias towards reading texts, there has been interest for some
time in comprehending audio and video materials in language
teaching [31] and the informed consent process [32]. At a
more abstract level, the comprehension of metaphors requires
thinking beyond the literal meaning in order to understand the
figurative meaning of the sentence [33], yet metaphors and
similes are frequently used by good teachers to convey
complex ideas. In the completely independent field of
advertising, a controlled degree of cognitive complexity is
considered desirable, where confronting an audience with a
cognitive challenge generates an appreciative payoff if they
can solve the challenge [34]. So the non-comprehension state
may be characterized as an inability to extract and characterize
the salient elements of information received, an inability to
model such information in a more abstract form or an inability
to generalize from a specific meaning to more abstract
thoughts about such a communication.
C. FATHOM
FATHOM utilises a bank of BPANNs to capture, monitor
and detect multiple channels of human non-verbal behaviour
continuously. FATHOM has been successfully shown to detect
non-verbal behaviour associated with comprehension in two
studies [3] [15].
Input to FATHOM is currently offline through recorded
videos, which are streamed into FATHOM where a series of
BPANN facial object locators identify the location in a video
frame of key visual features such as the eyes. For each
non-verbal behavioural feature identified from a specific visual
feature, the BPANN facial object pattern detectors identify its
state, e.g. the left eye is half-open. The NVBs identified are then
coded into individual channels and group channels, e.g. all
channels associated with eye behaviour.
States are typically collated over a time interval, e.g. 3
seconds grouped into one vector for Silent Talker, but this can
be varied depending on the problem domain; FATHOM
uses a 1-second interval. Classification features (patterns) are
extracted from aggregated video-streamed frames over the time
interval and compiled to form a vector. Each vector is passed
to the final BPANN Comprehension classifier which outputs a
value between +1 and -1, indicating whether the person
exhibits high comprehension (+1) or low comprehension (-1)
during that period of time. If there were insufficient
information in the vector during a specific time slot, FATHOM
would recognise this and categorise the timeslot as
unclassifiable. At the end of a session, e.g. a tutorial, the overall
comprehension/non-comprehension classification level is
displayed.
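The per-slot decision and end-of-session summary described above might be sketched as below. The 0.0 decision threshold follows the accept rule stated in Section IV.A (output >= 0.0 equals comprehension). Representing an unclassifiable slot as None and summarising a session as the fraction of comprehension slots are assumptions made for illustration; the source does not say how FATHOM computes the overall level.

```python
from typing import Optional, Sequence


def classify_slot(output: Optional[float]) -> str:
    """Map one 1-second classifier output in [-1, +1] to a label.

    None is a stand-in (assumption) for a slot with insufficient information.
    """
    if output is None:
        return "unclassifiable"
    return "comprehension" if output >= 0.0 else "non-comprehension"


def session_summary(outputs: Sequence[Optional[float]]) -> dict:
    """Summarise a session as the share of classified slots showing comprehension."""
    labels = [classify_slot(o) for o in outputs]
    classified = [label for label in labels if label != "unclassifiable"]
    fraction = (
        classified.count("comprehension") / len(classified) if classified else None
    )
    return {
        "comprehension_fraction": fraction,  # assumed summary statistic
        "unclassifiable_slots": len(labels) - len(classified),
    }


# Hypothetical outputs for a four-second session.
summary = session_summary([0.8, -0.3, None, 0.0])
```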
FATHOM simultaneously monitors 40 non-verbal
behavioural channels that include 20 channels capturing facial
features such as blushing and 16 channels capturing eye
movement such as right eye looking left. An overview of the
FATHOM architecture can be seen in Figure 1.
The work presented in this paper investigates the
consequences of replacing the BPANN comprehension
classifier in the FATHOM system by a C4.5 decision tree [12],
to answer questions about their relative performance and
transparency.
Fig. 1. FATHOM Architecture
III. COMPREHENSION SCENARIOS
This section outlines the two comprehension scenarios
used to collect the data.
A. Study 1: HIV Informed Consent
The first comprehension study was undertaken in Tanzania in
Africa by FHI-360 [35] in collaboration with the National
Institute for Medical Research (NIMR) [36]. NIMR enlisted
sexually active women aged 18-35, who were native Kiswahili-
speakers. 292 participants took part in the study. Two different
experimental conditions (tasks) were used for data collection:
condition A was designed to be familiar and easy-to-
comprehend (condom use) and condition B was designed to be
unfamiliar and intentionally hard-to-comprehend (the effects of
HIV viral mutation on antiretroviral treatment). Each
participant listened to a short learning task script and then
received the associated ten closed and open-ended questions
with randomisation applied. Task order was also randomised so
that half of the participants completed task A followed by task
B and the other half completed task B followed by task A.
B. Study 2: Termites
Prior to the study a short learning topic was selected, which
was a factual digital video on Termites with a total duration of
8 minutes 40 seconds. The Termite video was targeted at the
general public with no age restriction and covered: functional
architectural aspects of the termite mounds, roles within the
social structure of a termite colony and locations where termite
colonies thrive. Two experts (Academic Professors in the field)
on the subject area were recruited to develop ten difficult
(hard) questions and ten easy questions related to the video
content. The experts agreed both the question difficulty levels
and the contents of the answer that the participants should
provide. The experts were required to devise five open
questions and five closed questions within each set of hard and easy
questions. At the same time, the experts noted down the correct
answer(s) for each question, which were later incorporated into
a scoring scheme.

Forty participants were selected to participate in the study,
from academic and technical staff at the Manchester
Metropolitan University (MMU) in the UK. The sample was
composed of 20 males and 20 females. The males had a mean
age of 41 years old (SD = 14 years) and the females had a
mean age of 39 years old (SD = 14 years). Each participant was
invited to engage individually in a short learning task, which
was comprised of watching a short video on Termites and then
answering a small set of associated assessment questions whilst
being video recorded.
IV. EXPERIMENTAL METHODOLOGY
The experimental methodology was to take the pair of
datasets outlined in Section III and use them to train and
evaluate C4.5 decision trees to replace the final stage BPANN
classifier.
A. Using a Back Propagation ANN as the final classifier
For each study, FATHOM’s object locators and pattern
detectors were used to extract and collate the non-verbal
vector-based dataset for the purpose of training the final
BPANN classifier. For both studies, HIV Informed Consent
and Termites, each vector in the final dataset covered a
1-second time period and represented the state changes for the
compiled non-verbal channels over the period. Each channel
was normalised in the range +1 to -1. The last attribute in each
vector was the desired classification, with discrete values of +1
for comprehension and -1 for non-comprehension. The
following training parameters (determined from previous
exploratory cross-validation sessions) were used to train the
single hidden layer neural network in the FATHOM training
application:
  • Topology: 40:20:1
  • Accept value: 1.0 (output >= 0.0 equals comprehension AND output < 0.0 equals non-comprehension)
  • Maximum epochs: 10,000
  • Checking epochs: 250, i.e. at every 250th epoch the total classification accuracy (CA) was checked and if there was no improvement training was terminated.
  • Learning rate (η): 0.005
  • Weight initialisation: automatic range (0±1/sqrt(fan-in)), where fan-in represents the number of inputs entering the neuron.
  • Cross-validation: 10 folds
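Two of the parameters above, the fan-in weight range and the 250-epoch check, can be illustrated with a short sketch. It assumes that "0±1/sqrt(fan-in)" means a uniform draw from [-1/sqrt(fan-in), +1/sqrt(fan-in)], that bias weights are omitted, and that "no improvement" means the latest checked CA does not exceed the best CA seen at earlier checks. None of these details is confirmed by the source, and the function names are illustrative.

```python
import math
import random


def init_layer(fan_in: int, n_neurons: int, rng: random.Random) -> list:
    """Draw each weight uniformly from 0 +/- 1/sqrt(fan_in) (assumed distribution)."""
    limit = 1.0 / math.sqrt(fan_in)
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_in)]
        for _ in range(n_neurons)
    ]


def should_stop(ca_checks: list) -> bool:
    """ca_checks holds the CA recorded at each 250-epoch check, oldest first.

    Stop when the latest check does not beat the best earlier check
    (one possible reading of "no improvement").
    """
    return len(ca_checks) >= 2 and ca_checks[-1] <= max(ca_checks[:-1])


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_weights = init_layer(fan_in=40, n_neurons=20, rng=rng)  # 40:20 layer
output_weights = init_layer(fan_in=20, n_neurons=1, rng=rng)   # 20:1 layer
```

Under this reading, hidden-layer weights start within about ±0.158 and output-layer weights within about ±0.224, so neurons with more inputs start with smaller individual weights.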
For study 1, eighty randomly selected participant videos
(from the 292 obtained in the study) comprised the HIV
Informed Consent dataset containing 71,787 vectors with
63.5% comprehension and 36.5% non-comprehension. For
study 2, the forty participant videos yielded 16,951
comprehension vectors and 23,857 non-comprehension
vectors. The study 2 Termites dataset was composed of 40,808
vectors with 41.5% in the comprehension class.
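As a quick consistency check on the Termites figures, the sketch below recomputes the dataset size and class share from the two vector counts, and derives the accuracy a trivial majority-class predictor would reach. The majority-class baseline is not discussed in the source; it is included only as a reference point for reading classification accuracies on an imbalanced dataset.

```python
# Vector counts reported for Study 2 (Termites).
termites_comprehension = 16_951
termites_non_comprehension = 23_857

total_vectors = termites_comprehension + termites_non_comprehension
comprehension_share = 100 * termites_comprehension / total_vectors

# Accuracy of always predicting the larger class (reference point only).
majority_baseline = (
    100 * max(termites_comprehension, termites_non_comprehension) / total_vectors
)

print(f"total={total_vectors}, comprehension={comprehension_share:.1f}%, "
      f"majority baseline={majority_baseline:.1f}%")
```

The same reasoning applied to the HIV Informed Consent dataset gives a majority-class baseline of 63.5%, the reported share of comprehension vectors.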
B. Using a C4.5 Decision Tree as the final classifier
In order to use a decision tree as a comprehension
classifier, the final BPANN comprehension classifier
shown in Figure 1 was replaced by the C4.5 decision tree
algorithm.
1) The experiment consisted of a series of trials using
different degrees of pruning to find the optimal C4.5
decision trees and determine the extent to which they
could be pruned. The Weka implementation of C4.5,
known as J48, was used. The trials began by
establishing a baseline decision tree for each scenario,
setting the pruning parameters to Confidence Interval
(CI) = 0.25 and minimum number of objects = 2 cases
per leaf (i.e. the default settings).
2) This was followed by fixing the minimum number of
objects (MNO) at 2 and conducting a series of trials over
a range of confidence interval values to determine which
provides the greatest improvement in CA over the
baseline tree performance.
3) Then the complementary process was performed, fixing
the CI at 0.25 and conducting a series of trials over a
range of values of MNO.
4) Finally, further experiments were performed varying
confidence interval and MNO independently, to find the
most severely pruned tree for each dataset that was
not significantly worse than the baseline in terms of CA.
The initial ranges used for the experiments were, for CI: 0.25,
0.2, 0.15, 0.1, 0.05, and for MNO: 2, 5, 10, 15, 20.
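The first three steps of this procedure amount to two one-dimensional sweeps around a shared baseline. The sketch below enumerates those configurations as J48 option strings, assuming the paper's "Confidence Interval" corresponds to J48's -C option and MNO to its -M (minNumObj) option. It only builds the option strings; running Weka and comparing CA values is left out, and the helper name is illustrative.

```python
# Baseline pruning settings (the J48 defaults cited in the text).
BASELINE_CI = 0.25
BASELINE_MNO = 2

# Initial ranges given in the text.
CI_VALUES = [0.25, 0.2, 0.15, 0.1, 0.05]
MNO_VALUES = [2, 5, 10, 15, 20]


def j48_options(ci: float, mno: int) -> str:
    """Build a J48 option string: -C is the pruning confidence, -M the minimum
    number of instances per leaf (mapping to the paper's terms is assumed)."""
    return f"-C {ci} -M {mno}"


# Step 2: vary CI with MNO fixed at the baseline value.
ci_sweep = [j48_options(ci, BASELINE_MNO) for ci in CI_VALUES]
# Step 3: vary MNO with CI fixed at the baseline value.
mno_sweep = [j48_options(BASELINE_CI, mno) for mno in MNO_VALUES]

for options in ci_sweep + mno_sweep[1:]:  # skip the duplicated baseline
    print(options)
```

Both sweeps include the baseline configuration, so the initial ranges define nine distinct trees per dataset before the further experiments of step 4.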
V. RESULTS
A. BPANN Comprehension Classifier
Table I shows the overall best BPANN Classifiers for both
studies. Comprehension (C%) and Non-comprehension (NC%)
are the percentages of comprehension and non-comprehension
vectors, respectively, which were classified correctly. Overall
% is the total normalised percentage of comprehension and
non-comprehension vectors classified correctly.
TABLE I: BPANN RESULTS
                              | C%    | NC%   | Overall CA %
Study 1: HIV Informed Consent | 88.08 | 87.44 |
Study 2: Termites             | 72.77 | 78.43 |
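The three quantities reported in Table I can be computed from actual and predicted labels as sketched below. The text does not specify how the "total normalised percentage" is calculated, so the sketch returns two candidate readings: the pooled accuracy over all vectors and the mean of C% and NC%. The function name and the toy labels are illustrative and are not data from either study.

```python
def comprehension_metrics(actual, predicted):
    """actual/predicted: sequences of +1 (comprehension) or -1 (non-comprehension)."""
    pairs = list(zip(actual, predicted))
    comp_preds = [p for a, p in pairs if a == 1]
    non_preds = [p for a, p in pairs if a == -1]
    c_pct = 100 * sum(p == 1 for p in comp_preds) / len(comp_preds)
    nc_pct = 100 * sum(p == -1 for p in non_preds) / len(non_preds)
    pooled = 100 * sum(a == p for a, p in pairs) / len(pairs)
    return {
        "C%": c_pct,
        "NC%": nc_pct,
        "pooled CA%": pooled,                         # all vectors weighted equally
        "mean of C% and NC%": (c_pct + nc_pct) / 2,   # classes weighted equally
    }


# Toy labels for illustration only.
toy = comprehension_metrics(
    actual=[1, 1, 1, 1, -1, -1],
    predicted=[1, 1, 1, -1, -1, 1],
)
```

The two overall figures differ whenever the classes are imbalanced, as they are in both datasets, which is why the choice of definition matters when comparing classifiers.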
B. C4.5 Comprehension Classifier
Table II shows the results of varying the Confidence Interval
used for pruning in decision tree construction (Pruning CI).

Citations
Journal ArticleDOI
TL;DR: This paper critically examines a recently developed proposal for a border control system called iBorderCtrl, designed to detect deception based on facial recognition technology and the measurement of micro-expressions, termed 'biomarkers of deceit'.
Abstract: This paper critically examines a recently developed proposal for a border control system called iBorderCtrl, designed to detect deception based on facial recognition technology and the measurement ...

25 citations

01 Jan 2015
TL;DR: The results of this study indicate that the negativity associated with a particular 5-HTTLPR genotype may be due to decreased processing of positive emotion rather than increased processing of negative emotion.
Abstract: Facial mimicry has been considered an automatic, spontaneous process. However, recent research suggests that facial mimicry is dependent on the context of the social interaction, with increased mimicry occurring when the understanding of another’s emotional states is important. In this study, we examined the social context of facial mimicry of positive and negative facial expressions of emotion, and how mimicry relates to common variants in the serotonin transporter genotype 5-HTTLPR, which has been found to relate to proneness to negativity and to social sensitivity. Overall, the results of this study indicate that the negativity associated with a particular 5-HTTLPR genotype may be due to decreased processing of positive emotion rather than increased processing of negative emotion.

10 citations

Journal ArticleDOI
TL;DR: In this article, the authors compared the performance of three data mining techniques: Artificial Neural Network (ANN), Genetic Algorithm (GA), and Tobit Regression (Tobit) in determining the credit risk of local government units in Croatia.
Abstract: Over the past few decades, data mining techniques, especially artificial neural networks, have been used for modelling many real-world problems. This paper aims to test the performance of three methods: (1) an artificial neural network (ANN), (2) a hybrid artificial neural network and genetic algorithm approach (ANN-GA), and (3) the Tobit regression approach in determining the credit risk of local government units in Croatia. The evaluation of credit risk and prediction of debtor bankruptcy have long been regarded as an important topic in accounting and finance literature. In this research, credit risk is modelled under a regression approach unlike typical credit risk analysis, which is generally viewed as a classification problem. Namely, a standard evaluation of credit risk is not possible due to a lack of bankruptcy data. Thus, the credit risk of a local unit is approximated using the ratio of outstanding liabilities maturing in a given year to total expenditure of the local unit in the same period. The results indicate that the ANN-GA hybrid approach performs significantly better than the Tobit model by providing a significantly smaller average mean squared error. This work is beneficial to researchers and the government in evaluating a local government unit’s credit score.

4 citations

Book ChapterDOI
18 Sep 2018
TL;DR: The chapter concludes by examining the future of explainable decision making through proposing a new Hierarchy of Explainability and Empowerment that allows information and decision-making complexity to be explained at different levels depending on a person’s abilities.
Abstract: Adaptive Psychological Profiling systems use artificial intelligence algorithms to analyze a person’s non-verbal behavior in order to determine a specific mental state such as deception. One such system, known as Silent Talker, combines image processing and artificial neural networks to classify multiple non-verbal signals, mainly from the face, during a verbal exchange (i.e. an interview) to produce an accurate and comprehensive time-based profile of a subject’s psychological state. Artificial neural networks are typically black-box algorithms; hence, it is difficult to understand how the classification of a person’s behaviour is obtained. The new European Data Protection Legislation (GDPR) states that individuals who are automatically profiled have the right to an explanation of how the “machine” reached its decision and to receive meaningful information on the logic involved in how that decision was reached. This is practically difficult from a technical perspective, whereas from a legal point of view, it remains unclear whether this is sufficient to safeguard the data subject’s rights. This chapter is an extended version of a previously published paper in IJCCI 2019 [35] which examines the new European Data Protection Legislation and how it impacts on an application of psychological profiling within an Automated Deception Detection System (ADDS), which is one component of a smart border control system known as iBorderCtrl. ADDS detects deception through an avatar border guard interview during a participant’s pre-registration, and is used here to demonstrate the challenges faced in trying to obtain explainable decisions from models derived through computational intelligence techniques. The chapter concludes by examining the future of explainable decision making through proposing a new Hierarchy of Explainability and Empowerment that allows information and decision-making complexity to be explained at different levels depending on a person’s abilities.

3 citations

Posted Content
TL;DR: This paper critically examines a recently developed proposal for a border control system called iBorderCtrl, designed to detect deception based on facial recognition technology and the measurement of micro-expressions, termed "biomarkers of deceit".
Abstract: This paper critically examines a recently developed proposal for a border control system called iBorderCtrl, designed to detect deception based on facial recognition technology and the measurement of micro-expressions, termed 'biomarkers of deceit'. Funded under the European Commission's Horizon 2020 programme, we situate our analysis in the wider political economy of 'emotional AI' and the history of deception detection technologies. We then move on to interrogate the design of iBorderCtrl using publicly available documents and assess the assumptions and scientific validation underpinning the project design. Finally, drawing on a Bayesian analysis we outline statistical fallacies in the foundational premise of mass screening and argue that it is very unlikely that the model that iBorderCtrl provides for deception detection would work in practice. By interrogating actual systems in this way, we argue that we can begin to question the very premise of the development of data-driven systems, and emotional AI and deception detection in particular, pushing back on the assumption that these systems are fulfilling the tasks they claim to be attending to and instead ask what function such projects carry out in the creation of subjects and management of populations. This function is not merely technical but, rather, we argue, distinctly political and forms part of a mode of governance increasingly shaping life opportunities and fundamental rights.

3 citations

References
Book
13 Jan 1998
TL;DR: This work proposes a new model of comprehension processes, the construction-integration model, and covers cognition and representation, propositional representations, and the role of working memory in comprehension.
Abstract: Preface Acknowledgements 1. Introduction Part I. The Theory: 2. Cognition and representation 3. Propositional representations 4. Modeling comprehension processes: the construction-integration model Part II. Models of Comprehension: 5. Word identification in discourse 6. Textbases and situation models 7. The role of working memory in comprehension 8. Memory for text 9. Learning from text 10. Word problems 11. Beyond text References name index Subject index.

3,809 citations


"A hybrid model combining neural net..." refers background in this paper

  • ...The vast majority of research on comprehension concerns reading and the understanding of written text, initially by identifying the main ideas in the text [28, 29]....

    [...]

Book
01 Jan 1972
TL;DR: This book discusses nonverbal communication, including the ability to receive and send nonverbal signals and the effects of the environment, physical characteristics, and the communicators' behavior on human communication.
Abstract: Preface. Part I: AN INTRODUCTION TO THE STUDY OF NONVERBAL COMMUNICATION. 1. Nonverbal Communication: Basic Perspectives. 2. The Roots of Nonverbal Behavior. 3. The Ability to Receive and Send Nonverbal Signals. Part II: THE COMMUNICATION ENVIRONMENT. 4. The Effects of the Environment on Human Communication. 5. The Effects of Territory and Personal Space on Human Communication. Part III: THE COMMUNICATORS. 6. The Effects of Physical Characteristics on Human Communication. Part IV: The Communicators' Behavior. 7. The Effects of Gesture and Posture on Human Communication. 8. The Effects of Touch on Human Communication. 9. The Effects of the Face on Human Communication. 10. The Effects of Eye Behavior on Human Communication. 11. The Effects of Vocal Cues That Accompany Spoken Words. Part V: COMMUNICATING IMPORTANT MESSAGES. 12. Using Nonverbal Behavior in Daily Interaction. 13. Nonverbal Messages in Special Contexts.

1,989 citations

Journal ArticleDOI
TL;DR: The career of metaphor hypothesis offers a unified theoretical framework that can resolve the debate between comparison and categorization models of metaphor and suggests that whether metaphors are processed directly or indirectly and whether they operate at the level of individual concepts or entire conceptual domains, will depend both on their degree of conventionality and on their linguistic form.
Abstract: A central question in metaphor research is how metaphors establish mappings between concepts from different domains. The authors propose an evolutionary path based on structure-mapping theory. This hypothesis--the career of metaphor--postulates a shift in mode of mapping from comparison to categorization as metaphors are conventionalized. Moreover, as demonstrated by 3 experiments, this processing shift is reflected in the very language that people use to make figurative assertions. The career of metaphor hypothesis offers a unified theoretical framework that can resolve the debate between comparison and categorization models of metaphor. This account further suggests that whether metaphors are processed directly or indirectly, and whether they operate at the level of individual concepts or entire conceptual domains, will depend both on their degree of conventionality and on their linguistic form.

944 citations

Book ChapterDOI
23 Dec 2008
TL;DR: A new 3D face database that includes a rich set of expressions, systematic variation of poses and different types of occlusions is presented, which can be a very valuable resource for development and evaluation of algorithms on face recognition under adverse conditions and facial expression analysis as well as for facial expression synthesis.
Abstract: A new 3D face database that includes a rich set of expressions, systematic variation of poses and different types of occlusions is presented in this paper. This database is unique from three aspects: i) the facial expressions are composed of judiciously selected subset of Action Units as well as the six basic emotions, and many actors/actresses are incorporated to obtain more realistic expression data; ii) a rich set of head pose variations are available; and iii) different types of face occlusions are included. Hence, this new database can be a very valuable resource for development and evaluation of algorithms on face recognition under adverse conditions and facial expression analysis as well as for facial expression synthesis.

819 citations


"A hybrid model combining neural net..." refers background in this paper

  • ...The weaknesses of this technique are: its results are largely based on highly artificial “posed” images using actors or students provided with highly specific instructions [21,22] or even training in how to produce facial actions [23,24], low numbers of detectable Ekman micro-expressions in spontaneous interviews [25] and a low Classification Accuracy (CA) for those micro-expressions actually found [26]....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors compared the performance of feed-forward back-propagation artificial neural network (ANN) with random forest (RF), an ensemble-based method gaining popularity in prediction, for predicting the hourly HVAC energy consumption of a hotel in Madrid, Spain.
Abstract: Energy prediction models are used in buildings as a performance evaluation engine in advanced control and optimisation, and in making informed decisions by facility managers and utilities for enhanced energy efficiency. Simplified and data-driven models are often the preferred option where pertinent information for detailed simulation is not available and where fast responses are required. We compared the performance of the widely-used feed-forward back-propagation artificial neural network (ANN) with random forest (RF), an ensemble-based method gaining popularity in prediction, for predicting the hourly HVAC energy consumption of a hotel in Madrid, Spain. Incorporating social parameters such as the numbers of guests marginally increased prediction accuracy in both cases. Overall, ANN performed marginally better than RF with root-mean-square error (RMSE) of 4.97 and 6.10 respectively. However, the ease of tuning and modelling with categorical variables offers ensemble-based algorithms an advantage for dealing with multi-dimensional complex data, typical in buildings. RF performs internal cross-validation (i.e. using out-of-bag samples) and only has a few tuning parameters. Both models have comparable predictive power and are nearly equally applicable in building energy applications.

600 citations

Frequently Asked Questions (16)
Q1. What are the contributions in "A hybrid model combining neural networks and decision tree for comprehension detection"?

This paper investigates the use of a hybrid model comprising multiple artificial neural networks with a final C4.5 decision tree classifier to investigate the potential of explaining the classification decision through production rules.

Conclusions and future work: The authors propose that future work should explore several options to simplify the decision trees and their representation. Second, investigation of the potential to reduce the number of input channels: through empirical experiment, by identifying the potentially lowest contributing channels through calculating information content, and by grouping channels.

For each study, FATHOM’s object locators and pattern detectors were used to extract and collate the non-verbal vector-based dataset for the purpose of training the final BPANN classifier. 

Non-verbal behaviour comprises all of the signals or cues, which human beings use to communicate, including visual, audio, tactile and chemical components [16, 17]. 

by using fuzzy rule extraction or random forest techniques to reduce the rule sets extracted from the more efficient trees to a more tractable size. 

Forty participants were selected to participate in the study, from academic and technical staff at the Manchester Metropolitan University (MMU) in the UK. 

Input to FATHOM is currently offline through recorded videos, which are streamed into FATHOM, where a series of BPANN facial object locators identify the location in a video frame of key visual features such as the eyes.

The initial ranges used for the experiments were, for CI: 0.25, 0.2, 0.15, 0.1, 0.05, and for MNO: 2, 5, 10, 15, 20.
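The two ranges quoted above can be laid out as a parameter grid. The following is a minimal sketch only: `CI` and `MNO` are used as opaque labels exactly as they appear in the text, the helper name `parameter_grid` is hypothetical, and pairing every CI value with every MNO value is an assumption, since the excerpt does not say whether the full grid was explored.

```python
from itertools import product

# Initial ranges quoted in the text above.
CI_VALUES = [0.25, 0.2, 0.15, 0.1, 0.05]
MNO_VALUES = [2, 5, 10, 15, 20]


def parameter_grid(ci_values, mno_values):
    """Return every (CI, MNO) pairing as a list of dicts.

    Assumption: the two settings are swept jointly; the source
    only lists the individual ranges.
    """
    return [{"CI": ci, "MNO": mno} for ci, mno in product(ci_values, mno_values)]


grid = parameter_grid(CI_VALUES, MNO_VALUES)
print(len(grid))  # 25
```

Under this assumption each dataset would need 25 decision tree runs, one per pairing, before any refinement of the ranges.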

Cross-validation: 10-fold. For study 1, eighty randomly selected participant videos (from the 292 obtained in the study) comprised the HIV Informed Consent dataset containing 71,787 vectors with 63.5% comprehension and 36.5% non-comprehension.
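As a rough illustration of what these figures imply, the sketch below computes the fold sizes for a 10-fold split of 71,787 vectors and the accuracy of always predicting the majority class. Both helper names are hypothetical, the baseline is derived arithmetic rather than a result reported in the source, and splitting by individual vector (rather than by participant) is an assumption.

```python
def fold_sizes(n_samples, n_folds=10):
    """Sizes of n_folds near-equal folds that together cover n_samples."""
    base, extra = divmod(n_samples, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]


def majority_baseline(class_shares):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_shares.values())


# Figures quoted in the text above.
N_VECTORS = 71_787
SHARES = {"comprehension": 0.635, "non-comprehension": 0.365}

sizes = fold_sizes(N_VECTORS)
baseline = majority_baseline(SHARES)
print(sizes[0], sizes[-1], sum(sizes))  # 7179 7178 71787
print(baseline)  # 0.635
```

Because the classes are imbalanced, a classification accuracy on this dataset is only informative to the extent that it exceeds the 63.5% majority-class figure.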

The following training parameters (determined from previous exploratory cross-validation sessions) were used to train the single hidden layer neural network in the FATHOM training application: Topology: 40:20:1; Accept value: 1.0 (output >= 0.0 equals comprehension and output < 0.0 equals non-comprehension); Maximum epochs: 10,000; Checking epochs: 250, i.e. at every 250th epoch the total classification accuracy (CA) was checked and, if there was no improvement, training was terminated.
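The stopping rule and output threshold described above can be sketched as follows. This is not the FATHOM implementation: `train_one_epoch` and `evaluate_ca` are hypothetical callables standing in for the real training and evaluation steps, and "no improvement" is interpreted here as the CA failing to exceed the best value seen at an earlier check, which is an assumption.

```python
def classify(output, threshold=0.0):
    """Map a network output to a class label using the quoted threshold."""
    return "comprehension" if output >= threshold else "non-comprehension"


def train_with_checks(train_one_epoch, evaluate_ca,
                      max_epochs=10_000, check_every=250):
    """Train until max_epochs, stopping early when CA stops improving.

    Assumption: CA is compared against the best value from earlier checks.
    Returns (epochs_run, best_ca).
    """
    best_ca = float("-inf")
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        if epoch % check_every == 0:
            ca = evaluate_ca()
            if ca <= best_ca:
                return epoch, best_ca  # no improvement since last check
            best_ca = ca
    return max_epochs, best_ca


# Toy usage with made-up accuracies (illustrative values, not results).
fake_ca = iter([0.60, 0.70, 0.70])
epochs_run, best = train_with_checks(lambda: None, lambda: next(fake_ca))
print(epochs_run, best)  # 750 0.7
```

In the toy run the accuracy rises at the second check and is flat at the third, so training stops at epoch 750.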

pre-processing the data to cleanse it, particularly removing outliers, noise and conflicting records, all of which might be better handled by the BPANN than DTs.

Each participant was invited to engage individually in a short learning task, which comprised watching a short video on Termites and then answering a small set of associated assessment questions whilst being video recorded.

The work presented in this paper investigates the consequences of replacing the BPANN comprehension classifier in the FATHOM system by a C4.5 decision tree [12], to answer questions about their relative performance and transparency. 

The experimental methodology was to take the pair of datasets outlined in Section III and use them to train and evaluate C4.5 decision trees to replace the final stage BPANN classifier.