
Showing papers by "Laura Maruster published in 2003"



Journal ArticleDOI
01 Nov 2003
TL;DR: This paper introduces the concept of workflow mining, presents a common format for workflow logs, discusses the most challenging problems, and presents some of the workflow mining approaches available today.
Abstract: Many of today's information systems are driven by explicit process models. Workflow management systems, but also ERP, CRM, SCM, and B2B systems, are configured on the basis of a workflow model specifying the order in which tasks need to be executed. Creating a workflow design is a complicated, time-consuming process, and typically there are discrepancies between the actual workflow processes and the processes as perceived by management. To support the design of workflows, we propose the use of workflow mining. The starting point for workflow mining is a so-called "workflow log" containing information about the workflow process as it is actually being executed. In this paper, we introduce the concept of workflow mining and present a common format for workflow logs. We then discuss the most challenging problems and present some of the workflow mining approaches available today.
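As a rough illustration of what mining a workflow log involves, the sketch below extracts the direct-succession relation (task b directly follows task a in some case) from a log of events. The log format and function name are illustrative assumptions, not the common format proposed in the paper:

```python
from collections import defaultdict

def direct_succession(log):
    """Extract the direct-succession relation: all pairs (a, b) such
    that task b directly follows task a in at least one case."""
    # Group events into traces, one ordered task list per case.
    traces = defaultdict(list)
    for case_id, task in log:
        traces[case_id].append(task)
    # Collect every adjacent task pair across all traces.
    succession = set()
    for tasks in traces.values():
        for a, b in zip(tasks, tasks[1:]):
            succession.add((a, b))
    return succession

# Hypothetical log: two cases of a simple order-handling process.
log = [
    ("case1", "register"), ("case1", "check"), ("case1", "pay"),
    ("case2", "register"), ("case2", "pay"), ("case2", "check"),
]
succession = direct_succession(log)
```

Relations like this one are the raw material from which mining algorithms infer ordering, choice, and parallelism between tasks; here, "check" and "pay" occur in both orders, hinting that the two tasks may be concurrent.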

1,168 citations


DOI
01 Jan 2003
TL;DR: The goal of the thesis is to show that machine learning techniques can be used successfully for understanding a process on the basis of data, by means of clustering process-related measures, induction of predictive models, and process discovery.
Abstract: Business processes (in industry, administration, hospitals, etc.) are becoming increasingly complex, and it is difficult to have a complete understanding of them. The goal of the thesis is to show that machine learning techniques can be used successfully for understanding a process on the basis of data, by means of clustering process-related measures, induction of predictive models, and process discovery. This goal is achieved by means of two approaches: (i) classify process cases (e.g. patients) into logistically homogeneous groups and induce models that assign a new case to a logistic group, and (ii) discover the underlying process. By doing so, the process can be modelled, analysed and improved. Another benefit is that systems can be designed more efficiently to support and control the processes more effectively. We target the analysis of two sorts of data, namely aggregated data and sequence data. Aggregated data result from performing transformations on raw data, focusing on a specific concept that is not yet explicit in the raw data. This aggregation is similar to feature construction, as used in the machine learning domain. In this thesis, aggregated data are the variables that result from operationalizing the concept of process complexity. These aggregated data are used to develop logistically homogeneous clusters, meaning that elements in different clusters differ from the routing-complexity point of view. We show that developing homogeneous clusters for a given process is relevant in connection with the induction of predictive models: the routing in the process can be predicted using the logistic clusters. We do not aim to provide concrete directives for building control systems; rather, our models should be taken as indicative of their potential.
Sequence data describe the sequence of activities over time in a process execution. They are recorded in a process log during the execution of the process steps. Due to exceptions, missing or incomplete registration, and errors, the data can be noisy. By using sequence data, the goal is to derive a model explaining the recorded events. In situations without noise and with sufficient information, we provide a method for building a process model from the process log. Moreover, we discuss the class of models for which it is possible to accurately rediscover the model by looking at the process log. Machine learning techniques are especially useful when discovering a process model from noisy sequence data. Such a model can be further analysed and eventually improved, but these issues are beyond the scope of this thesis. Through the application of our proposed methods to different data (e.g. hospital data, workflow data and administrative governmental data), we have shown that our methods result in useful models that can subsequently be used in practice. We applied our methods to data sets for which (i) it was possible to aggregate relevant information and (ii) sequence data were available.
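The aggregation step described above (turning raw events into per-case complexity measures, then grouping cases) can be sketched minimally. Everything here is an illustrative assumption: the feature choice (trace length, distinct tasks) and the naive median split stand in for the thesis's actual operationalization and clustering:

```python
from collections import defaultdict
from statistics import median

def routing_features(log):
    """Aggregate raw (case_id, task) events into per-case features:
    (trace length, number of distinct tasks). This mirrors the idea of
    feature construction on data where the concept is not yet explicit."""
    traces = defaultdict(list)
    for case_id, task in log:
        traces[case_id].append(task)
    return {cid: (len(t), len(set(t))) for cid, t in traces.items()}

def split_by_complexity(features):
    """Partition cases into 'simple' vs 'complex' groups using the
    median trace length as a naive routing-complexity threshold."""
    cut = median(length for length, _ in features.values())
    groups = {"simple": [], "complex": []}
    for cid, (length, _) in features.items():
        groups["complex" if length > cut else "simple"].append(cid)
    return groups

# Hypothetical hospital-style event log.
log = [
    ("patient1", "intake"), ("patient1", "exam"), ("patient1", "lab"),
    ("patient1", "exam"), ("patient2", "intake"), ("patient2", "exam"),
]
groups = split_by_complexity(routing_features(log))
```

A predictive model could then be trained to assign a newly arriving case to one of these groups, which is the role the induced models play in approach (i).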

35 citations


Book ChapterDOI
01 Jan 2003
TL;DR: It is claimed that the overall distributed process can be induced by combining the partial information recorded in the information system of each supply chain party.
Abstract: Processes such as tendering, ordering, delivery, and paying are executed by several parties in almost all supply chains. However, none of these parties has a proper overview of the whole set of activities executed, so none of them can take the lead in business process redesign. Business processes are often not described in an explicit manner, and therefore they are not available for analysis. However, the information system of each supply chain party records partial information about the business process. We claim that the overall distributed process can be induced by using this partial information from all involved parties.
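A prerequisite for inducing the overall process is reassembling each party's partial, locally recorded events into one globally ordered stream. The sketch below does this by merging timestamp-sorted per-party logs; the party names, event tuples, and reliance on comparable timestamps are all assumptions for illustration, not the chapter's method:

```python
import heapq

def merge_partial_logs(party_logs):
    """Merge each party's locally recorded, timestamp-ordered event list
    into one globally ordered stream via an n-way sorted merge."""
    # Events are (timestamp, party, activity); heapq.merge assumes each
    # input list is already sorted by the key.
    return list(heapq.merge(*party_logs, key=lambda e: e[0]))

# Hypothetical partial logs of two supply chain parties.
supplier = [(1, "supplier", "receive_order"), (4, "supplier", "ship")]
buyer = [(0, "buyer", "tender"), (2, "buyer", "order"), (6, "buyer", "pay")]
global_log = merge_partial_logs([supplier, buyer])
```

The merged stream can then be fed to a process discovery technique as if it were a single party's workflow log; in practice, clock differences and case correlation across parties are the hard part, which the timestamp assumption here sidesteps.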

16 citations