Exploiting Inductive Logic Programming
Techniques for Declarative Process Mining
Federico Chesani¹, Evelina Lamma², Paola Mello¹, Marco Montali¹, Fabrizio Riguzzi², and Sergio Storari²
¹ DEIS, Università di Bologna, viale Risorgimento, 2, 40136 Bologna, Italy
{federico.chesani,paola.mello,marco.montali}@unibo.it
² ENDIF, Università di Ferrara, Via Saragat, 1, 44100 Ferrara, Italy
{evelina.lamma,fabrizio.riguzzi,sergio.storari}@unife.it
Abstract. In the last few years, there has been a growing interest in
the adoption of declarative paradigms for modeling and verifying process
models. These paradigms provide an abstract and human-understandable
way of specifying constraints that must hold among activity executions
rather than focusing on a specific procedural solution. Mining such
declarative descriptions is still an open challenge. In this paper,
we present a logic-based approach for tackling this problem. It relies on
Inductive Logic Programming techniques and, in particular, on a modified
version of the Inductive Constraint Logic algorithm. We investigate
how, by properly tuning the learning algorithm, the approach can be
adopted to mine models expressed in the ConDec notation, a graphical
language for the declarative specification of business processes. Then, we
sketch how such a mining framework has been concretely implemented
as a ProM plug-in called DecMiner. We finally discuss the effectiveness
of the approach by means of an example which shows the ability of the
language to model concurrent activities and of DecMiner to learn such a
model.
1 Introduction
When facing the problem of defining and developing a Business Process (BP), we
can mainly identify two different and complementary roles: the business analyst,
a domain expert aiming at improving the performance of her company, and
the IT-expert, who has the responsibility of bringing business-level models to an
effective underlying implementation. The complementarity of these roles leads
to different perspectives about the process to be developed: while the IT-expert
typically adopts a procedural style of modeling, dealing with implementation
aspects and trying to obtain an executable process, the business analyst follows
a more declarative approach (see Figure 1). Indeed, at a business level it is very
important to represent in an intuitive and concise way the domain and problem
under study, rather than focusing on a specific solution. In this respect, the
model will typically involve business rules, covering best practices and internal
constraints as well as internal/external regulations and compliance requirements.

[Figure 1: diagram; labels: policies, regulations, business rules, modeling,
declarative model, procedural model, execution, execution traces, mining,
Declarative Process Mining]
Fig. 1. Declarative and procedural perspectives when modeling Business Processes

K. Jensen and W. van der Aalst (Eds.): ToPNoC II, LNCS 5460, pp. 278–295, 2009.
© Springer-Verlag Berlin Heidelberg 2009

The importance of adopting a declarative style of modeling has been recently
pointed out by van der Aalst and Pesic [18]: we agree with their claim that
declarative languages are better suited to complex, unpredictable processes,
where a good balance between support and flexibility is of key importance. To
this end, in [18] they propose a new graphical language for specifying process
flows in a declarative manner. The language, called ConDec, does not completely
fix the control flow among activities, but rather envisages a set of constraints
expressing policies/business rules that specify both what is forbidden and what
is mandatory in the process. Therefore, the approach is inherently open and
flexible, because workers can perform any action that is not explicitly
forbidden. ConDec is given an underlying semantics in terms of Linear Temporal
Logic (LTL), and can also be mapped onto a logic programming-based framework
called SCIFF (Social Constrained IFF) [2,4], which was originally developed for
the specification and verification of global interaction protocols in open
Multi-Agent Systems but has recently been applied in the context of BPs and SOA
(Service-Oriented Architecture) Choreographies. SCIFF provides a declarative
language based on Computational Logic, where constraints are imposed on
activities in terms of reactive rules (namely Integrity Constraints). Such
reactive rules mention in their body occurring activities, i.e., events, and
additional constraints on their variables in the style of Constraint Logic
Programming (CLP) [12]. SCIFF rules contain in their head expectations over the
course of events. Such expectations can be positive, when a certain activity is
required to happen, or negative, when a certain activity is forbidden to happen.
An important topic related to declarative process specifications, which is still
an open challenge, concerns their discovery starting from execution traces, i.e.,
declarative process mining. Indeed, up to now, the goal of process mining has
been the discovery of procedural process models (such as Petri Nets or
Event-driven Process Chains [21,24]). We argue that declarative models should be
mined as well, so that essential process constraints can be inferred in a form
that is easily understandable by business analysts and not affected by
procedural details.
In this paper, we present a logic-based approach to address this issue. It
relies on Inductive Logic Programming (ILP) techniques and, in particular, on
a modified version of the Inductive Constraint Logic (ICL) algorithm [15]. The
algorithm takes as input a set of process execution traces, previously labeled
as compliant or not, and produces a set of SCIFF rules which correctly classify
them. This algorithm has been further modified, by properly tuning it and relying
on the mapping presented in [4], for learning ConDec models. Then, we describe
how the whole approach has been implemented as a plug-in of the ProM [23]
process mining framework. The plug-in, called DecMiner, is capable of mining
ConDec models starting from a set of process execution traces. The plug-in
envisages different phases, ranging from the classification of traces into compliant
and non-compliant subsets to the choice of which ConDec constraints have to be
considered and finally to the presentation of the mined model. The effectiveness
of the approach is illustrated by considering an example inspired by the one
presented in [17] that involves the management of a hotel and spa.
Our previous papers on process mining [14,13] focused on the algorithm for
learning SCIFF rules and presented only a sketch of the technique for the
translation into ConDec. In this work we describe how we automated this process
and implemented it in the DecMiner ProM plug-in.
The paper is organized as follows. Section 2 describes the declarative languages
we consider, namely SCIFF and ConDec, and the mapping between ConDec
and a subset of SCIFF rules. Section 3 presents the learning process and the
DecMiner plug-in. Section 4 discusses the experiments performed for validating
the approach. Section 5 presents related work and, finally, Section 6 concludes
the paper and discusses future work.
2 Declarative Specification of Business Processes
In this section, we first briefly introduce the SCIFF language, a logic-based
language originally developed for specifying and verifying interaction protocols in
open Multi-Agent Systems [2]. We then briefly describe ConDec [18], a graphical
language supporting the intuitive modeling of declarative constraints on the flow
of activities. Finally, we sketch how SCIFF can be exploited to formalize ConDec
models as well as to extend its expressiveness, relying on the results presented
in [4].
2.1 An Overview of the SCIFF Framework
The SCIFF framework [2] is based on abduction, a reasoning paradigm which
allows one to formulate hypotheses (called abducibles) accounting for observations.
In most abductive frameworks, integrity constraints are imposed over possible
hypotheses in order to prevent inconsistent explanations. SCIFF considers a
set of interacting peers as an open society, formalizing interaction protocols by
means of a set of global rules (constraints) which constrain the external and
observable behavior of participants.
To represent that an event ev happened (i.e., an atomic activity has been
executed) at a certain time T, SCIFF uses the symbol $\mathbf{H}(ev, T)$, where
ev is a term and T is a variable or a number indicating the time. Hence, an
execution trace is modeled as a set of executed (happened) events. For example,
we could formalize that bob has performed activity a at time 5 as follows:
$\mathbf{H}(a(bob), 5)$. Furthermore, SCIFF introduces the concept of
expectation, which plays a key role when defining global interaction protocols,
choreographies, and more generally event-driven processes. It is quite natural,
in fact, to think of a process in terms of rules of the form: "if $ev_1$
happened, then $ev_2$ is expected to happen." Positive expectations are denoted
by $\mathbf{E}(ev, T)$, meaning that ev is expected to happen at time T. To
satisfy a positive expectation, an execution trace must contain a matching
happened event. Negative expectations are denoted by $\mathbf{EN}(ev, T)$,
meaning that ev is expected not to happen at time T. To satisfy a negative
expectation, an execution trace must not contain a matching happened event.
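The matching of expectations against a trace can be illustrated with a minimal Python sketch. It is not part of SCIFF: the encoding of happened events as `(event, time)` pairs, the function names, and the convention that `time=None` stands for an unbound time variable are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple

# Assumed encoding: a happened event H(ev, T) is the pair (ev, T);
# an execution trace is a set of such pairs.
Event = Tuple[str, int]
Trace = Set[Event]

def matches(trace: Trace, ev: str, time: Optional[int] = None) -> bool:
    """True if the trace contains ev at `time` (at any time when `time` is None)."""
    return any(e == ev and (time is None or t == time) for e, t in trace)

def holds_E(trace: Trace, ev: str, time: Optional[int] = None) -> bool:
    """Positive expectation E(ev, T): a matching happened event must exist."""
    return matches(trace, ev, time)

def holds_EN(trace: Trace, ev: str, time: Optional[int] = None) -> bool:
    """Negative expectation EN(ev, T): no matching happened event may exist."""
    return not matches(trace, ev, time)

trace = {("a(bob)", 5), ("b(alice)", 7)}
print(holds_E(trace, "b(alice)"))    # b(alice) expected at some time
print(holds_EN(trace, "c(mary)"))    # c(mary) expected never to happen
print(holds_E(trace, "a(bob)", 6))   # a(bob) expected exactly at time 6
```

Leaving the time unbound makes a positive expectation existential and a negative one universal over time, which mirrors the informal reading given above.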
SCIFF Integrity Constraints (ICs for short) are forward rules of the form
$\mathit{body} \rightarrow \mathit{head}$, where body can contain literals
(i.e., a logical atom or its negation) and happened events, and head contains a
disjunction of conjunctions of expectations and literals. In this paper, we
consider a syntax of ICs that is a subset of the one in [2]. In this simplified
syntax, an IC C is a logical formula of the form

$$\mathit{Body} \rightarrow \mathit{DisjE}_1 \lor \ldots \lor \mathit{DisjE}_n \lor \mathit{DisjEN}_1 \lor \ldots \lor \mathit{DisjEN}_m \qquad (1)$$

We will use Body(C) to indicate Body and Head(C) to indicate
$\mathit{DisjE}_1 \lor \ldots \lor \mathit{DisjE}_n \lor \mathit{DisjEN}_1 \lor \ldots \lor \mathit{DisjEN}_m$
of a rule C. Body is of the form $b_1 \land \ldots \land b_l$, where the $b_i$s
are literals. Some of the literals may be of the form $\mathbf{H}(ev, T)$,
meaning that event ev has happened at time T. $\mathit{DisjE}_j$ is a formula of
the form $\mathbf{E}(ev, T) \land d_1 \land \ldots \land d_k$, where ev is an
event and the $d_i$s are literals. All the formulas $\mathit{DisjE}_j$ in
Head(C) will be called positive disjuncts. $\mathit{DisjEN}_j$ is a formula of
the form $\mathbf{EN}(ev, T) \land d_1 \land \ldots \land d_k$, where ev is an
event and the $d_i$s are literals. All the formulas $\mathit{DisjEN}_j$ in
Head(C) will be called negative disjuncts.
The event ev can be a term. The literals $b_i$ and $d_i$ refer to predicates
defined in a SCIFF knowledge base. Variables in common to Body(C) and Head(C)
are universally quantified ($\forall$) with scope the whole IC. Variables
occurring only in positive disjuncts are existentially quantified ($\exists$)
with scope the disjunct itself. Variables occurring only in negative disjuncts
are universally quantified ($\forall$) with scope the disjunct itself. An
example of an IC is

$$\text{(IC.1)}\quad \mathbf{H}(a(bob), T) \land T < 10 \;\rightarrow\; \mathbf{E}(b(alice), T_1) \land T < T_1 \;\lor\; \mathbf{EN}(c(mary), T_2) \land T < T_2 \land T_2 < T + 10$$

The meaning of IC.1 is the following: if bob has executed action a at a time
$T < 10$, then we expect alice to execute action b at some time $T_1$ later
than T ($\exists T_1$) or we expect that mary does not execute action c at any
time $T_2$ ($\forall T_2$) within 9 time units after T.
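As an illustration of how IC.1 is read, the following Python sketch evaluates it directly on a finished trace. It is a hand-written check for this single rule under assumed conventions (events as `(event, time)` pairs, integer timestamps), not the SCIFF proof procedure; the function name `satisfies_ic1` is hypothetical.

```python
from typing import Set, Tuple

# Assumed encoding: happened events as (event, time) pairs with integer times.
Trace = Set[Tuple[str, int]]

def satisfies_ic1(trace: Trace) -> bool:
    """Check IC.1 on a complete trace."""
    for ev, t in trace:
        # Body: H(a(bob), T) and T < 10 (T is universally quantified)
        if ev != "a(bob)" or not t < 10:
            continue
        # Positive disjunct: there exists T1 with H(b(alice), T1) and T < T1
        alice_later = any(e == "b(alice)" and t < t1 for e, t1 in trace)
        # Negative disjunct: for all T2 with T < T2 < T + 10, c(mary) is absent
        mary_absent = not any(
            e == "c(mary)" and t < t2 < t + 10 for e, t2 in trace
        )
        if not (alice_later or mary_absent):
            return False  # the body fired but no disjunct of the head holds
    return True

print(satisfies_ic1({("a(bob)", 5), ("b(alice)", 8), ("c(mary)", 9)}))
print(satisfies_ic1({("a(bob)", 5), ("c(mary)", 9)}))
print(satisfies_ic1({("a(bob)", 12), ("c(mary)", 13)}))
```

The first trace satisfies the positive disjunct, the second violates both disjuncts, and in the third the body never fires because bob acts at a time not smaller than 10.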
The interpretation of an IC is the following: if there exists a substitution of
variables such that the body is true in an interpretation representing a trace,
then one of the disjuncts in the head must be true. A positive disjunct means
that we expect event ev to happen with T and its variables satisfying
$d_1 \land \ldots \land d_k$. Therefore the disjunct is true if there exists a
substitution of variables occurring in it such that ev is present in the trace
and the $d_i$s are satisfied. A negative disjunct means that we expect event ev
not to happen with T and its variables satisfying $d_1 \land \ldots \land d_k$.
Therefore the disjunct is true if for all substitutions of variables occurring
in it and not appearing in Body either ev does not happen or, if it happens, its
properties violate $d_1 \land \ldots \land d_k$.
The main and original application of the SCIFF framework and its proof
procedure is to verify whether an execution of the process concretely adheres to
the specification, i.e., to perform compliance checking. SCIFF can check
compliance both at run-time, by dynamically collecting and reasoning upon
occurring events, and a posteriori, by analyzing the log of an observed
execution trace.
Roughly speaking, SCIFF combines occurred events with the specified rules,
to suitably generate the corresponding expectations; then expectations are veri-
fied against the execution trace: a positive expectation must have a correspond-
ing matching event, whereas a negative expectation forbids the presence of a
matching event. If such conditions are not met (i.e., a positive/negative expec-
tation is not/is matched by a corresponding event), then the expectations are
violated, and the execution trace is evaluated as non-compliant.
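The two steps just described (generate expectations from the rules, then verify them against the trace) can be sketched in Python as follows. This is a simplified a posteriori check under stated assumptions, not the actual SCIFF proof procedure: rule heads are restricted to conjunctions of expectations (no disjunction), and the `Expectation` class, the rule functions, and the activity names `pay`, `ship`, and `cancel` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Trace = Set[Tuple[str, int]]  # assumed encoding: (event, time) pairs

@dataclass(frozen=True)
class Expectation:
    positive: bool                # True for E, False for EN
    event: str
    when: Callable[[int], bool]   # constraint on the time of the expected event

# Hypothetical encoding: a rule maps a trace to the expectations it triggers.
Rule = Callable[[Trace], List[Expectation]]

def pay_then_ship(trace: Trace) -> List[Expectation]:
    """If pay happened at T, ship is expected at some later time."""
    return [Expectation(True, "ship", lambda t2, t=t: t2 > t)
            for ev, t in trace if ev == "pay"]

def no_cancel_after_ship(trace: Trace) -> List[Expectation]:
    """If ship happened at T, cancel is expected not to happen afterwards."""
    return [Expectation(False, "cancel", lambda t2, t=t: t2 > t)
            for ev, t in trace if ev == "ship"]

def is_compliant(trace: Trace, rules: List[Rule]) -> bool:
    # Step 1: combine happened events with the rules to generate expectations.
    expectations = [x for rule in rules for x in rule(trace)]
    # Step 2: verify each expectation against the trace.
    for x in expectations:
        matched = any(ev == x.event and x.when(t) for ev, t in trace)
        if matched != x.positive:   # E unmatched, or EN matched: violation
            return False
    return True

rules = [pay_then_ship, no_cancel_after_ship]
print(is_compliant({("pay", 1), ("ship", 3)}, rules))
print(is_compliant({("pay", 1), ("ship", 3), ("cancel", 4)}, rules))
```

A run-time variant would have to keep positive expectations pending until the trace is closed; the sketch only covers the case where the whole log is available.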
A posteriori compliance checking has been wrapped into a ProM plug-in called
SCIFFChecker [3], which can be exploited to classify MXML execution traces
as compliant or non-compliant w.r.t. a high-level declarative criterion. Such a
criterion is specified by configuring reactive business rules expressed in a natural
language-like manner and by automatically mapping them onto the underlying
formalism.
2.2 ConDec and Its SCIFF Mapping
ConDec [18,16] is a graphical language suitable for the declarative specification
of flexible Business Processes. Flexibility is provided since ConDec does not fix
a completely specified process flow, but rather imposes only the (minimal) set
of constraints that must be satisfied when executing the process activities. Con-
straints are policies/business rules which can be exploited to describe both what
is mandatory and what is forbidden in the process. They are mainly organized
into three basic groups: (i) existence constraints, unary relationships constraining
the cardinality of activity executions; (ii) relation constraints, positive relation-
ships between two activities used to specify what should be executed when a
given situation holds; (iii) negation constraints, the negated version of relation
ones, imposed to forbid the execution of a certain activity when a given situation
holds.
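To make the three groups concrete, the sketch below checks one constraint from each group on a trace given as a list of activity names. The existence check follows the "at least n executions" reading, and responded existence between A and B requires B somewhere in the trace whenever A occurs, with no temporal ordering. The function names, the negated variant, and the activity names are illustrative assumptions, not ConDec tooling.

```python
from typing import List

Trace = List[str]  # assumed encoding: activity names in execution order

def existence(trace: Trace, a: str, n: int = 1) -> bool:
    """Existence constraint: activity a is executed at least n times."""
    return trace.count(a) >= n

def responded_existence(trace: Trace, a: str, b: str) -> bool:
    """Relation constraint: if a is executed, b is executed too (any order)."""
    return a not in trace or b in trace

def not_responded_existence(trace: Trace, a: str, b: str) -> bool:
    """Negation constraint (hypothetical name): if a is executed, b never is."""
    return a not in trace or b not in trace

# Hypothetical activity names for a hotel stay.
trace = ["check_in", "massage", "bill_massage", "check_out"]
print(existence(trace, "check_in"))
print(responded_existence(trace, "massage", "bill_massage"))
print(not_responded_existence(trace, "massage", "laundry"))
```

Note that both binary checks hold vacuously on traces in which the first activity never occurs, which is what keeps such models open and flexible.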
We have provided a complete mapping of ConDec relationships to SCIFF [4].
Table 1 shows some basic ConDec constraints, together with their corresponding
formalization. For example, the existence constraint specifies that the involved
activity must be executed at least once; this can be expressed in SCIFF by simply
stating that the activity is expected to happen. The responded existence between
A and B imposes the existence of B only if activity A is executed, without
putting any temporal condition between the two executions. Temporizing such

References
A Machine-Oriented Logic Based on the Resolution Principle
Negation as failure
Workflow mining: discovering process models from event logs
Inductive Logic Programming: Theory and Methods
Constraint logic programming: A survey