A Study Into the Factors That Influence the Understandability of Business Process Models

01 May 2011-Vol. 41, Iss: 3, pp 449-462
TL;DR: Findings are that both types of investigated factors affect model understanding, while personal factors seem to be the more important of the two.
Abstract: Business process models are key artifacts in the development of information systems. While one of their main purposes is to facilitate communication among stakeholders, little is known about the factors that influence their comprehension by human agents. On the basis of a sound theoretical foundation, this paper presents a study into these factors. Specifically, the effects of both personal and model factors are investigated. Using a questionnaire, students from three different universities evaluated a set of realistic process models. Our findings are that both types of investigated factors affect model understanding, while personal factors seem to be the more important of the two. The results have been validated in a replication that involves professional modelers.

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART A 1
A Study into the Factors that Influence the
Understandability of Business Process Models
Hajo A. Reijers and Jan Mendling
Abstract—Business process models are key artifacts in the
development of information systems. While one of their main
purposes is to facilitate communication among stakeholders, little
is known about the factors that influence their comprehension by
human agents. On the basis of a sound theoretical foundation, this
paper presents a study into these factors. Specifically, the effects
of both personal and model factors are investigated. Using a
questionnaire, students from three different universities evaluated
a set of realistic process models. Our findings are that both types
of investigated factors affect model understanding, while personal
factors seem to be the more important of the two. The results
have been validated in a replication that involves professional
modelers.
Index Terms—Business Process Modeling, Process models,
Human information processing, Complexity measures.
I. INTRODUCTION
Since the 1960s, conceptual models have been in use to facilitate
the early detection and correction of system development
errors [1]. In more recent years, the primary focus of con-
ceptual modeling efforts has shifted to business processes [2].
Models resulting from such efforts are commonly referred to
as business process models, or process models for short. They
are used to support the analysis and design of, for example,
process-aware information systems [3], service-oriented archi-
tectures [4], and web services [5].
Process models typically capture in some graphical notation
what tasks, events, states, and control flow logic constitute a
business process. A business process that is in place to deal
with complaints may, for example, contain a task to evaluate
the complaint, which is followed by another one specifying
that the customer in question is to be contacted. Similar to
other conceptual models, process models are first and foremost
required to be intuitive and easily understandable, especially
in IS project phases that are concerned with requirements doc-
umentation and communication [6]. Today, many companies
design and maintain several thousand process models, often
also involving non-expert modelers [7]. It has been observed
that such large model collections exhibit serious quality issues
in industry practice [8].
Hajo A. Reijers is with the School of Industrial Engineering, Eindhoven
University of Technology, Eindhoven, The Netherlands, e-mail:
h.a.reijers@tue.nl.
Jan Mendling is with the School of Business and Economics,
Humboldt-Universität zu Berlin, Germany, e-mail: jan.mendling@wiwi.hu-berlin.de.
Manuscript submitted April 19, 2009.
Against this background it is problematic that few insights
exist into what influences the quality of process models, in
particular with respect to their understandability. The most
important insight is that the size of the model is of notable
impact. An empirical study provides evidence that larger,
real-world process models tend to have more formal flaws (such as
deadlocks or unreachable end states) than smaller models
[9]. A likely explanation for this phenomenon would be that
human modelers lose track of the interrelations in large and
complex models due to their limited cognitive capabilities (cf.
[10]). They then introduce errors that they would not insert in
a small model, which will make the model less effective for
communication purposes.
There is both an academic and a practical motivation to
look beyond this insight. To start with the former, it can be
virtually ruled out that size is the sole factor that plays a
role in understanding a process model. To illustrate, a purely
sequential model will be easier to understand than a model that
is similar in size but where tasks are interrelated in various
ways. This raises interest in the other factors that play
a role here. A more practical motivation is that model size
is often determined by the modeling domain or context. So,
process modelers will find it difficult to affect the size metric
of a process model towards creating a more readable version:
they cannot simply skip relevant parts of the model.
The aim of this paper is to investigate whether factors can
be determined, beyond the size of a process model, that influence
its understanding. In that respect, we distinguish between
model factors and personal factors. Model factors relate to
the process model itself and refer to characteristics such as
a model’s density or structuredness. Personal factors relate to
the reader of such a model, for example with respect to one’s
educational background or the perceptions that are held about
a process model. While insights are missing into the impact
of any such factors beyond size on process model
understandability, research has suggested the importance of
similar factors in other conceptual data models [11], [12].
To investigate the impact of personal and model factors,
the research that is presented here takes on a survey design.
Using a questionnaire that has been filled out by 73 students
from three different universities, hypothesized relations
between model and personal factors and the understanding of a set
of process models are investigated. Some exploratory findings
from this data are reported in [13], [14], which essentially
confirm the significance of the two types of factors. The
contribution of this paper is quite different. We develop a
sound theoretical foundation for discussing individual model
understanding that is rooted in cognitive research on computer
program comprehension. We use this foundation to establish
hypotheses on the connection between understandability on
the one hand, and personal and model factors on the other
hand. For these tests we use the aforementioned survey
data. Beyond that, we provide an extensive validation of our
findings and our instruments addressing two major challenges.
First, there has been little research on construct validity of
understandability measures. We use Cronbach’s alpha to check
the consistency of our questions that are used in calculating
the understandability score. Furthermore, we address potential
threats to external validity. We report on a replication of our
survey with practitioners and investigate whether the results differ
from those of the students.
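As an illustration of the consistency check mentioned above, the following minimal sketch computes Cronbach's alpha from a table of per-question scores. The function name `cronbach_alpha` and the binary example data are assumptions made for illustration only; they are not the survey data or the authors' implementation.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows.

    Each row holds one numeric score per question (e.g. 1 = correct, 0 = wrong).
    Uses the standard formula k/(k-1) * (1 - sum(item variances) / variance(totals)).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two questions are needed")
    # Sample variance of each question (column) across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the per-respondent total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to three questions.
example = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 0, 1]]
print(round(cronbach_alpha(example), 3))
```

Values close to 1 indicate that the questions behave consistently as a single scale. Note that the formula is undefined when all respondents obtain the same total score.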
The rest of this paper is organized in accordance with the
presented aims. Section II introduces the theoretical founda-
tions of process model understanding. We identify matters of
process model understanding and respective challenges. This
leads us to factors of understanding. Section III describes the
setup of our survey design and the motivations behind it. Sec-
tion IV then presents the data analysis and the interpretation.
Section V discusses threats to validity and how we addressed
them. Section VI concludes the article. We use Appendix A
to summarize our survey design.
II. BACKGROUND
This section introduces the theoretical background of our
empirical research. Section II-A gives a brief overview of the
information content of a process model, defines a notion of
understandability, and summarizes related work on process
model understanding. Section II-B investigates potential fac-
tors of understandability. We utilize insights from cognitive
research into computer program comprehension in order to
derive propositions about the significance of personal and
model factors for understanding.
A. Matters of Process Model Understanding
Before considering a notion of understandability we first
have to discuss matters that can be understood from a process
model. We are focusing on so-called activity-based or control-
flow-based process models (in contrast to goal-oriented [15]
and choreography-oriented languages [16]). Figure 1 shows an
example of such a process model in a notation that we will
use throughout this paper. This notation essentially covers the
commonalities of Event-driven Process Chains (EPCs) [17],
[18] and the Business Process Modeling Notation (BPMN)
[19], which are two of the most frequently used notations for
process modeling. Such a process model describes the control
flow between different activities (A, B, I, J, K, L, M, N, and
O in Figure 1) using arcs. So-called connectors (XOR, AND,
OR) define complex routing constraints of splits (multiple
outgoing arcs) and joins (multiple ingoing arcs). XOR-splits
represent exclusive choices and XOR-joins capture respective
merges without synchronization. AND-splits introduce concur-
rency of all outgoing branches while AND-joins synchronize
all incoming arcs. OR-splits define inclusive choices in a 1-to-all
fashion. OR-joins synchronize such multiple choices, which
requires a quite sophisticated implementation (see [18], [20]).
Furthermore, there are specific nodes to indicate start and end.
Fig. 1. Part of a process model (nodes shown: Start; activities A, B, I, J, K, L, M, N, O; connectors XOR, AND, OR)
In this paper we consider formal statements that can be
derived about the behavior described by such a process model,
ignoring the (informal) content of activity labels. This formal
focus has the advantage that we can unambiguously evaluate
whether an individual has grasped a particular model aspect
correctly. In particular, we focus on binary relationships
between two activities in terms of execution order, exclusiveness,
concurrency, and repetition. These relationships play an
important role for reading, modifying, and validating the model.
• Execution Order relates to whether the execution of one activity $a_i$ eventually leads to the execution of another activity $a_j$. In Figure 1, the execution of J leads to the execution of L.
• Exclusiveness means that two activities $a_i$ and $a_j$ can never be executed in the same process instance. In Figure 1, J and K are mutually exclusive.
• The concurrency relation covers two activities $a_i$ and $a_j$ if they can potentially be executed concurrently. In Figure 1, L and M are concurrent.
• A single activity $a$ is called repeatable if it is possible to execute it more than once for a process instance. In Figure 1, among others, K, N, and I are repeatable.
Statements such as “Executing activity $a_i$ implies that
$a_j$ will be executed later” can be easily verified using the
reachability graph of the process model. A reachability graph
captures all states and transitions represented by the process
model and it can be (automatically) generated from it. For
some classes of models, several relationships can be calculated
more efficiently without the reachability graph [21]. For
instance, these relations can be constructed for those process
models that map to free-choice Petri nets in $O(n^3)$ time [22],
[23].
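The relations described above can be sketched on a small example. The code below builds the reachability graph of a toy Petri-net-like model by breadth-first search and derives exclusiveness, concurrency, and repeatability from it. The net `NET`, the helper names, and the restriction to safe nets (at most one token per place, so a marking is a set of places) are assumptions made for illustration; the net does not reproduce Figure 1, and this brute-force approach is not the efficient algorithm of [22], [23].

```python
from collections import deque

# Hypothetical toy net: transition -> (input places, output places).
NET = {
    "A": ({"p0"}, {"p1"}),
    "J": ({"p1"}, {"p2", "p3"}),   # J is followed by two parallel branches
    "K": ({"p1"}, {"p4"}),         # exclusive alternative to J
    "L": ({"p2"}, {"p5"}),
    "M": ({"p3"}, {"p6"}),
    "N": ({"p4"}, {"p7"}),
    "redo": ({"p7"}, {"p4"}),      # loop back, so N can be executed again
    "O": ({"p5", "p6"}, {"p8"}),   # synchronizes the two branches
}
START = frozenset({"p0"})


def reachability_graph(net, start):
    """Return {marking: [(transition, next_marking), ...]} for a safe net."""
    graph, queue = {start: []}, deque([start])
    while queue:
        marking = queue.popleft()
        for name, (pre, post) in net.items():
            if pre <= marking:  # transition is enabled
                nxt = frozenset((marking - pre) | post)
                graph[marking].append((name, nxt))
                if nxt not in graph:
                    graph[nxt] = []
                    queue.append(nxt)
    return graph


def reachable_from(graph, marking):
    """All markings reachable from the given marking, including itself."""
    seen, stack = {marking}, [marking]
    while stack:
        for _, nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def can_follow(graph, a, b):
    """True if some run executes a and, at some later point, b."""
    for edges in graph.values():
        for name, nxt in edges:
            if name == a and any(
                later == b
                for state in reachable_from(graph, nxt)
                for later, _ in graph[state]
            ):
                return True
    return False


def exclusive(graph, a, b):
    """True if no run contains both a and b."""
    return not can_follow(graph, a, b) and not can_follow(graph, b, a)


def concurrent(net, graph, a, b):
    """True if a and b are enabled together without competing for a token."""
    pre_a, pre_b = net[a][0], net[b][0]
    return any(pre_a <= m and pre_b <= m and not (pre_a & pre_b) for m in graph)


def repeatable(graph, a):
    """True if a can be executed more than once in a single run."""
    return can_follow(graph, a, a)


GRAPH = reachability_graph(NET, START)
print(exclusive(GRAPH, "J", "K"), concurrent(NET, GRAPH, "L", "M"), repeatable(GRAPH, "N"))
```

The sketch enumerates every reachable marking, which is only practical for small models. It covers the existential relations; a statement of the form "executing $a_i$ implies that $a_j$ will be executed later" additionally requires checking all runs rather than some run.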
B. Factors of Process Model Understanding
Throughout this paper, we use the term understandability in
order to refer to the degree to which information contained in
a process model can be easily understood by a reader of that
model. This definition already implies that understandability
can be investigated from two major angles: personal factors
related to the model reader and factors that relate to the model
itself. We discuss the relevance of both categories using the
cognitive dimensions framework as a theoretical foundation.
The cognitive dimensions framework is a set of aspects
that have empirically been proven to be significant for the
comprehension of computer programs and visual notations
[24]. There are two major findings that the framework builds
upon: A representation always emphasizes certain information
at the expense of other information, and there has to be a fit
between the mental task at hand and the notation [25], [26].
The implications of these insights are reflected by cognitive
dimensions that are relevant for process model reading.
• Abstraction Gradient refers to the grouping capabilities of a notation. In a single process model, there is no mechanism to group activities. Therefore, flow languages are called abstraction-hating [24]. As a consequence, the more complex the model gets, the more difficult it becomes for the model reader to identify those parts that closely relate. Presumably, expert model readers will be more efficient in finding the related parts.
• Hard Mental Operations relates to an over-proportional increase in difficulty when elements are added to a representation. This is indeed the case for the behavioral semantics of a process model. In the general case, calculating the reachability graph for a process model is NP-complete [27]. Therefore, a larger process model is over-proportionally more difficult to interpret than a simple model. On the other hand, experts are more likely to know decomposition strategies, e.g. as described in [28], to cope with complexity.
• Hidden Dependencies refer to interdependencies that are not fully visible. In process models such hidden dependencies exist between split and join connectors: each split should have a matching join connector of the same type, e.g. to synchronize concurrent paths. In complex models, distant split-join pairs can be quite difficult to trace. In general such interdependencies can be analyzed using the reachability graph, but many analyses can also be performed structurally (see [29]). Expert modelers tend to use structural heuristics for investigating the behavior of a process model.
• Secondary Notation refers to any piece of extra information that is not part of the formalism. In process models secondary notation is an important matter, among others in terms of labeling conventions [30] or layout strategies [31]. For models of increasing complexity, secondary notation also gains in importance for making the hidden dependencies more visible. On the other hand, it has been shown that experts’ performance is less dependent on secondary notation than that of novices [32].
Personal factors have also been recognized as important
factors in engineering and design [33], [34]. In particular,
the matter of expertise is clearly established by prior research
on human-computer interaction. While research on perceptual
quality and perceptual expertise has only recently emerged in
conceptual modeling (see [35], [36]), there are some strong
insights into the factors of expert performance in different
areas. A level of professional expertise is assumed to take at
least 1,000 to 5,000 hours of continuous training [37, p.563].
In this context, it is not only important that the expert has
worked on a certain matter for years, but also that practicing
has taken place on a daily basis [38]. Such regular training is
needed to build up experience, knowledge, and the ability to
recognize patterns [39]. Furthermore, the way information is
processed by humans is influenced by cognitive styles, which
can be related to personality. There are persons who prefer
verbal over image information and who rather grasp the whole
instead of analytically decomposing a matter, or the other way
round [40]. As models enable reasoning through visualization,
perceptual capabilities of a person are also relevant [41].
Clearly, these capabilities differ between persons with different
process modeling expertise.
We conclude from this theoretical discussion that model
features and personal characteristics are indeed likely to be
relevant factors of process model understandability.
C. Related work
In this section we present related work grouped into three
categories: model factors, personal factors, and other factors.
The importance of model characteristics was intuitively
assumed by early work into process model metrics. Such
metrics quantify structural properties of a process model,
inspired by prior work in software engineering on lines of
code, cyclomatic number, or object-oriented metrics [42]–[44].
Early contributions by Lee and Yoon, Nissen, and Morasca
[45]–[47] focus on defining metrics. More recently, different
metrics have been validated empirically. The work of Cardoso
is centered around an adaptation of the cyclomatic number
for business processes he calls control-flow complexity (CFC)
[48]. This metric was validated with respect to its correlation
with perceived complexity of process models [49]. The research
conducted by a group including Canfora, Rolón, and
García analyzes understandability as an aspect of maintainability.
They include different metrics of size, complexity, and coupling
in a set of experiments, and identify several correlations
[50], [51]. Further metrics take their motivation from cognitive
research, e.g. [14], or are based on concepts of modularity, e.g.
[52], [53]. Most notably, an extensive set of metrics has been
validated as factors of error probability [9], a symptom of bad
understanding. The different validations clearly show that size
is an important model factor for understandability, but does
not fully determine phenomena of understanding: additional
metrics like structuredness help to improve the explanatory
power significantly [18].
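To make the idea of a structural metric concrete, the sketch below computes the control-flow complexity (CFC) mentioned above, using the commonly cited definition: an XOR-split with n outgoing arcs contributes n, an OR-split contributes 2^n - 1, and an AND-split contributes 1. The input format (a list of connector type and fan-out pairs) and the function name are assumptions chosen for illustration, not part of the cited work.

```python
def control_flow_complexity(splits):
    """Sum the CFC contributions of all split connectors in a model.

    `splits` is an iterable of (connector_type, number_of_outgoing_arcs) pairs.
    """
    total = 0
    for kind, fan_out in splits:
        if kind == "XOR":
            total += fan_out            # one state per exclusive branch
        elif kind == "OR":
            total += 2 ** fan_out - 1   # every non-empty subset of branches
        elif kind == "AND":
            total += 1                  # all branches are always taken
        else:
            raise ValueError(f"unknown connector type: {kind}")
    return total


# Hypothetical model with one XOR-split, one AND-split, and one OR-split.
print(control_flow_complexity([("XOR", 2), ("AND", 3), ("OR", 2)]))
```

Only splits are counted; joins do not contribute under this definition, which is one reason why metrics of this kind capture only part of what makes a model hard to read.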
Personal factors have been less intensively researched as
factors of process model understanding. The experiment by
Recker and Dreiling operationalizes the notion of process
modeling expertise as the level of familiarity with a particular
modeling notation [54]. In a survey by Mendling, Reijers, and
Cardoso participants are characterized based on the number
of process models they created and the years of modeling
experience they have [13]. Mendling and Strembeck measure
theoretical knowledge of the participants in another survey
using six yes/no questions [55]. Most notable are two results
that point to the importance of theoretical process modeling
knowledge. In the Mendling, Reijers, and Cardoso survey the
participants from TU Eindhoven with strong Petri net education
scored best, and in the Mendling and Strembeck survey,
there was a high correlation between theoretical knowledge
and the understandability score.
There are other factors that also might have an impact on
process model understanding. We briefly discuss model pur-
pose, problem domain, modeling notation, and visual layout.
Model purpose: The understanding of a model may be
affected by the specific purpose the modeler had in mind. The
best example is that some process models are not intended
to be used on a day-to-day basis by people but instead are
explicitly created for automatic enactment. In such a case,
less care will be given to make them comprehensible to
humans. The differences between process models as a result
of different modeling purposes are mentioned, for example, in
[6]. Empirical research into this factor is missing.
Problem domain: People may find it easier to read a model
about the domain they are familiar with than other models.
While this has not been established for process models, it
is known from software engineering that domain knowledge
affects the understanding of particular code [56].
Modeling notation: In the presence of many different nota-
tions for process models, e.g. UML Activity diagrams, EPCs,
BPMN, and Petri nets, it cannot be ruled out that some of these
are inherently more suitable to convey meaning to people than
others. Empirical research that has explored this difference is,
for example, reported in [57]. According to these publications,
the impact of the notation being used is not very high, maybe
because the languages are too similar. Other research that
compares notations of a different focus identifies a significant
impact on understanding [58], [59].
Visual layout: Semantically equivalent models can be ar-
ranged in different ways. For example, different line drawing
algorithms can be used or models may be split up into
different submodels. The effect of layout on process model
understanding was already noted in the early 1990s [60]. With
respect to graphs, it is a well-known result that edge crossings
negatively affect understanding [61]. Also, for process models,
the use of modularity can improve understanding [62].
Given that, as we argued, the insights into the understanding
of process models are limited, this is probably not a complete
set of factors. But even at this number, it would be difficult to
investigate them all together. In this study, we restrict ourselves
to the first two categories, i.e. personal and model factors. In
the definition of this survey, which will be explained in the
next section, we will discuss how we aimed to neutralize the
potential effects of the other categories.
D. Summary
From cognitive research into program understanding we can
expect that personal and model factors are likely to be factors
of process model understandability. The impact of size as an
important metric has been established by prior research. Yet, it
only partially explains phenomena of understanding. Personal
factors also appear to be relevant. Theoretical knowledge
was found to be a significant factor, but so far only in
student experiments. Furthermore, research into the relative
importance of personal and model factors is missing. In the
following sections, we present a survey to investigate this
question and analyze threats to validity.
III. DEFINITION, PLANNING, AND OPERATION OF THE
SURVEY DESIGN
This section explains the definition, the planning, and the
operation of a survey design on personal and model-related
factors of understanding.
A. Definition
According to the theoretical background we provided, both
the characteristics of the reader of a process model and those
of the process model itself affect the understanding that such
a reader may gain from studying that model. Both types of
characteristics can be considered as independent variables,
while the understanding gained from studying a process model
constitutes the dependent variable. Beyond this, there are other
potential factors of influence which we wish to neutralize,
i.e. model purpose, problem domain, modeling notation, and
visual layout (see Section II-C). To explore the relations that
interest us, the idea is to expose a group of respondents to a set
of process models and then test their understanding of these
models using a self-administered questionnaire. Such a design
shares characteristics with a survey where personal and model
parameters are recorded, but without predefined factor levels.
We use a convenience sample of students. From the analysis
perspective it can be classified as a correlational study that
seeks to identify statistical connections between measurements
of interest. Similar designs have been used to investigate
metrics in software engineering, e.g. in [63]. Conclusions on
causality are limited, though.
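A correlational analysis of this kind can be sketched with a rank correlation, which is one common choice for ordinal measurements such as self-assessments. The example below implements Spearman's rank correlation with average ranks for ties. The data values are hypothetical, and the choice of Spearman's coefficient is an assumption for illustration; it is not a statement about the statistical techniques applied in this study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: self-assessed knowledge (1-5) and an understanding score.
theory = [1, 2, 2, 4, 5]
score = [0.50, 0.55, 0.70, 0.80, 0.90]
print(round(spearman(theory, score), 3))
```

A coefficient near 1 or -1 indicates a strong monotonic association, but, as noted above, it does not by itself support conclusions on causality.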
We strove to neutralize the influence of other relevant
factors. First of all, a set of process models from practice was
gathered that was specifically developed for documentation
purposes. Next, to eliminate the influence of domain knowl-
edge all the task labels in these process models were replaced
by neutral identifiers, i.e. capital letters A to W. In this way,
we also prevent a potential bias stemming from varying lengths
of natural activity labels (see [55]). Based on the observation
by [57] that EPCs appear to be easier to understand than
Petri nets, we chose an EPC-like notation without events.
The participants received a short informal description of the
semantics similar to [64, p. 25]. Finally, all models were
graphically presented on one page, without use of modularity,
and drawn in the same top-to-bottom layout with the start
element at the top and end element at the bottom.
Furthermore, in our exploration we wish to exclude one
particular process model characteristic, which is size. As we
argued in the introduction of this paper and our discussion
of related work, process model size is the one model characteristic
whose impact on both error proneness and
understanding is reasonably well understood. Because it is our
purpose to look beyond the impact of this particular aspect,
we have controlled the number of tasks in the process model.
Each of the included process models has the same number
of tasks. However, to allow for variation across the other
model characteristics, two additional variants were constructed
for each of the real process models. The variations were
established by changing one or two routing elements in each
of these models (e.g. turning a particular XOR-split into an AND-split).
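
As a rough illustration of how such variants keep size constant, the sketch below retypes selected connectors while leaving tasks and arcs untouched. The representation (a dictionary from connector identifiers to type names) and the function `make_variant` are assumptions made for this example, not the procedure used by the authors.

```python
def make_variant(connector_types, changes):
    """Return a copy of the connector typing with selected connectors retyped.

    Hypothetical helper: tasks and arcs are not represented here, so the
    number of tasks is unaffected by construction.
    """
    unknown = set(changes) - set(connector_types)
    if unknown:
        raise KeyError(f"unknown connectors: {sorted(unknown)}")
    variant = dict(connector_types)  # copy, so the base model stays intact
    variant.update(changes)
    return variant


# Illustrative connector identifiers and types.
base = {"c1": "XOR-split", "c2": "XOR-join"}
variant = make_variant(base, {"c1": "AND-split"})
```

Retyping only the split in this example leaves an AND-split paired with an XOR-join, which is the kind of change that alters connector-related characteristics without altering the task count.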
Having taken care of the various factors we wish to control,
at this point we can refine what personal and model factors
are taken into account and how these are measured in the
questionnaire. Note that a summary of the questionnaire is
presented in Appendix A.
For the personal factors, we take the following variables
into consideration:
THEORY: A person’s theoretical knowledge on process
modeling. This variable is measured as a self-assessment
by the respondents on a 5-point ordinal scale, with anchor
points “I have weak theoretical knowledge” and “I have
strong theoretical knowledge”.
PRACTICE: A person’s practical experience with pro-
cess modeling. This variable is a self-assessment by
the respondents. It is measured on a 4-point ordinal
scale. The scale has anchor points “I never use business
process modeling in practice” and “I use business process
modeling in practice every day”.
EDUCATION: A person’s educational background. This
categorical variable refers to the educational institute that
the respondent is registered at.
For the model factors, several variables are included. These
variables are all formally defined in [18, pp. 117-128], with
the exception of the cross-connectivity metric that is specified
in [14]. The model factors can be characterized as follows:
#NODES, #ARCS, #TASKS, #CONNECTORS, #AND-
SPLITS, #AND-JOINS, #XOR-SPLITS, #XOR-JOINS,
#OR-SPLITS, #OR-JOINS: These variables all relate to
the number of a particular type of elements in a pro-
cess model. These include counts for the number of
arcs (#ARCS) and nodes (#NODES). The latter can be
further subdivided into #TASKS on the one hand and
#CONNECTORS on the other. The most specific counts
are subcategorizations of the different types of logical
connectors, like #AND-SPLITS and #OR-JOINS.
DIAMETER: The length of the longest path from a start
node to an end node in the process model.
TOKEN SPLITS: The maximum number of paths in a
process model that may be concurrently initiated through
the use of AND-splits and OR-splits.
AVERAGE CONNECTOR DEGREE, MAXIMUM CONNEC-
TOR DEGREE: The AVERAGE CONNECTOR DEGREE ex-
presses the average of the number of both incoming
and outgoing arcs of the connector nodes in the process
model; the MAXIMUM CONNECTOR DEGREE expresses
the maximum sum of incoming and outgoing arcs of those
connector nodes.
CONTROL FLOW COMPLEXITY: A weighted sum of all
connectors that are used in a process model.
MISMATCH: The sum of connector pairs that do not match
with each other, e.g. when an AND-split is followed up
by an OR-join.
DEPTH: The maximum nesting of structured blocks in a
process model.
CONNECTIVITY, DENSITY: While CONNECTIVITY re-
lates to the ratio of the total number of arcs in a process
model to its total number of nodes, DENSITY relates to
the ratio of the total number of arcs in a process model
to the theoretically maximum number of arcs (i.e. when
all nodes are directly connected).
CROSS-CONNECTIVITY: The extent to which all the
nodes in a model are connected to each other.
SEQUENTIALITY: The degree to which the model is
constructed of pure sequences of tasks.
SEPARABILITY: The ratio of the number of cut-vertices
on the one hand, i.e. nodes that serve as bridges between
otherwise disconnected components, to the total number
of nodes in the process model on the other.
STRUCTUREDNESS: The extent to which a process model
can be built by nesting blocks of matching split and join
connectors.
CONNECTOR HETEROGENEITY: The extent to which dif-
ferent types of connectors are used in a process model.
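
A few of the simpler factors can be computed directly from a graph representation of a model. The following Python sketch follows only the informal descriptions above; the formal definitions are those in [18]. The `ProcessModel` class, the function names, and the example model are hypothetical, and the density denominator n(n-1) is an assumption for a directed graph without self-loops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessModel:
    """A process model as a directed graph (hypothetical structure)."""
    tasks: frozenset
    connectors: frozenset
    arcs: frozenset  # set of (source, target) pairs

    @property
    def nodes(self):
        return self.tasks | self.connectors


def connectivity(m):
    # Ratio of the number of arcs to the number of nodes.
    return len(m.arcs) / len(m.nodes)


def density(m):
    # Ratio of arcs to the assumed maximum n * (n - 1) for a directed
    # graph without self-loops.
    n = len(m.nodes)
    return len(m.arcs) / (n * (n - 1))


def connector_degrees(m):
    # Incoming plus outgoing arcs for every connector node.
    return {c: sum(1 for s, t in m.arcs if c in (s, t)) for c in m.connectors}


# Illustrative model: A, then an XOR block with B and C, then D.
model = ProcessModel(
    tasks=frozenset({"A", "B", "C", "D"}),
    connectors=frozenset({"split", "join"}),
    arcs=frozenset({
        ("A", "split"), ("split", "B"), ("split", "C"),
        ("B", "join"), ("C", "join"), ("join", "D"),
    }),
)

degrees = connector_degrees(model)
average_connector_degree = sum(degrees.values()) / len(degrees)
maximum_connector_degree = max(degrees.values())
```

For this six-node, six-arc example, both connectors have one arc on one side and two on the other, so the average and maximum connector degree coincide.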
To illustrate these factors, we refer the reader to Figure 2.
Shown here is a model of a loan request process expressed in
the EPC modeling notation, which is elaborated in [18, pp. 19-
20]. In addition to the standard EPC notational elements, tags
are added to identify sequence arcs, cut vertices, and cycle
nodes. Additionally, the numbers of incoming and outgoing
arcs are given for each node, as well as a bold arc that provides
the diameter of the model. All these notions are instrumental
in calculating the exact values of the model factors that were
presented previously. For this particular model, the values of
the model factors are given in Table I.
Having discussed the independent variables, we need to
address now how a process model’s understanding is captured.
Comprehension can be measured along various dimensions;
for an overview, see [65]. For our research, we
focus on a SCORE variable. SCORE is a quantification of a
respondent’s accurate understanding of a process model. This
ratio is determined by the set of correct answers to a set of
seven closed questions and one open question. The closed
questions confront the respondent with execution order, ex-
clusiveness, concurrency, and repeatability issues (see Section
II-A) which are linked to closed questions (yes/no/I don’t

Frequently Asked Questions (7)
Q1. What have the authors contributed in "A study into the factors that influence the understandability of business process models"?

On the basis of a sound theoretical foundation, this paper presents a study into these factors. 

Future research is needed for analyzing the relative importance of model size in comparison to personal expertise, and should explicitly consider potential interaction effects. The authors also plan to investigate the significance of those factors for understanding that they neutralized in this research. Finally, the case of model L points to research opportunities on the difficulty of understanding particular process model components. As certain components can be reduced because they are always correct, it might be interesting to investigate whether certain components can be understood with the same ease, even if they are moved to a different position in the process model. 

In particular, the authors focus on binary relationships between two activities in terms of execution order, exclusiveness, concurrency, and repetition. 

For models of increasing complexity, secondary notation also gains in importance for making the hidden dependencies better visible. 

Statements such as “Executing activity ai implies that aj will be executed later” can be easily verified using the reachability graph of the process model. 

The research conducted by a group including Canfora, Rolón, and García analyzes understandability as an aspect of maintainability.

This notation essentially covers the commonalities of Event-driven Process Chains (EPCs) [17], [18] and the Business Process Modeling Notation (BPMN) [19], which are two of the most frequently used notations for process modeling.