Author

Alfredo Bolt

Bio: Alfredo Bolt is an academic researcher from Eindhoven University of Technology. The author has contributed to research in topics: Process mining & Business process discovery. The author has an h-index of 11 and has co-authored 16 publications receiving 358 citations.

Papers
Journal ArticleDOI
TL;DR: A novel approach to incrementally compute prefix-alignments is presented, paving the way for real-time online conformance checking.
Abstract: Companies often specify the intended behaviour of their business processes in a process model. Conformance checking techniques allow us to assess to what degree such process models and corresponding process execution data correspond to one another. In recent years, alignments have proven extremely useful for calculating conformance checking statistics. Existing techniques to compute alignments have been developed to be used in an offline, a posteriori setting. However, we are often interested in observing deviations at the moment they occur, rather than days, weeks or even months later. Hence, we need techniques that enable us to perform conformance checking in an online setting. In this paper, we present a novel approach to incrementally compute prefix-alignments, paving the way for real-time online conformance checking. Our experiments show that the reuse of previously computed prefix-alignments enhances memory efficiency, whilst preserving prefix-alignment optimality. Moreover, we show that, in case of computing approximate prefix-alignments, there is a clear trade-off between memory efficiency and approximation error.

80 citations

Posted Content
TL;DR: RapidProM, an extension of RapidMiner based on ProM, combines the best of both worlds: complex process mining workflows can be modeled and executed easily and subsequently reused for other data sets.
Abstract: The number of events recorded for operational processes is growing every year. This applies to all domains: from health care and e-government to production and maintenance. Event data are a valuable source of information for organizations that need to meet requirements related to compliance, efficiency, and customer service. Process mining helps to turn these data into real value: by discovering the real processes, by automatically identifying bottlenecks, by analyzing deviations and sources of non-compliance, by revealing the actual behavior of people, etc. Process mining is very different from conventional data mining and machine learning techniques. ProM is a powerful open-source process mining tool supporting hundreds of analysis techniques. However, ProM does not support analysis based on scientific workflows. RapidProM, an extension of RapidMiner based on ProM, combines the best of both worlds. Complex process mining workflows can be modeled and executed easily and subsequently reused for other data sets. Moreover, using RapidProM, one can benefit from combinations of process mining with other types of analysis available through the RapidMiner marketplace.

50 citations

Journal ArticleDOI
TL;DR: The results show how the technique is able to detect relevant differences that could not be captured using existing approaches, and the user is not overloaded with diagnostics on differences that are less significant.

44 citations

Book ChapterDOI
08 Jun 2015
TL;DR: This paper formalizes the notion of process cubes where the event data is presented and organized using different dimensions, where each cell in the cube corresponds to a set of events which can be used as an input by any process mining technique.
Abstract: Process mining techniques enable the analysis of processes using event data. For structured processes without too many variations, it is possible to show a relatively simple model and project performance and conformance information on it. However, if there are multiple classes of cases exhibiting markedly different behaviors, then the overall process will be too complex to interpret. Moreover, it will be impossible to see differences in performance and conformance for the different process variants. The different process variations should be analysed separately and compared to each other from different perspectives to obtain meaningful insights about the different behaviors embedded in the process. This paper formalizes the notion of process cubes where the event data is presented and organized using different dimensions. Each cell in the cube corresponds to a set of events which can be used as an input by any process mining technique. This notion is related to the well-known OLAP (Online Analytical Processing) data cubes, adapting the OLAP paradigm to event data through multidimensional process mining. This adaptation is far from trivial given the nature of event data which cannot be easily summarized or aggregated, conflicting with classical OLAP assumptions. For example, multidimensional process mining can be used to analyze the different versions of a sales process, where each version can be defined according to different dimensions such as location or time, and then the different results can be compared. This new way of looking at processes may provide valuable insights for process optimization.
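The cell idea in this abstract can be illustrated with a minimal sketch. It is not the paper's formalization: the event attributes (`case`, `activity`, `location`, `year`), the sample values, and the helper `build_cube` are all hypothetical names chosen for illustration. The sketch only shows events being grouped into cells by their values on chosen dimensions, so that each cell could be passed on as an ordinary set of events.

```python
from collections import defaultdict

# Hypothetical event records; attribute names and values are illustrative only.
events = [
    {"case": "c1", "activity": "create order", "location": "Eindhoven", "year": 2014},
    {"case": "c1", "activity": "ship order", "location": "Eindhoven", "year": 2014},
    {"case": "c2", "activity": "create order", "location": "Aachen", "year": 2014},
    {"case": "c3", "activity": "create order", "location": "Eindhoven", "year": 2015},
]


def build_cube(events, dimensions):
    """Group events into cells keyed by their values on the chosen dimensions."""
    cube = defaultdict(list)
    for event in events:
        cell = tuple(event[d] for d in dimensions)
        cube[cell].append(event)
    return dict(cube)


# One cell per (location, year) combination that occurs in the data.
cube = build_cube(events, ("location", "year"))

# A single cell is just a list of events that any analysis step could consume.
eindhoven_2014 = cube[("Eindhoven", 2014)]
```

Comparing two versions of a process then amounts to running the same analysis on two different cells, for example `cube[("Eindhoven", 2014)]` and `cube[("Eindhoven", 2015)]`.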

42 citations

01 Jan 2015
TL;DR: This paper structures the basic building blocks needed for process mining, describes various analysis scenarios, and implements RapidProM, a tool supporting scientific workflows for process mining.
Abstract: Over the last decade process mining emerged as a new analytical discipline able to answer a variety of questions based on event data. Event logs have a very particular structure; events have timestamps, refer to activities and resources, and need to be correlated to form process instances. Process mining results tend to be very different from classical data mining results, e.g., process discovery may yield end-to-end process models capturing different perspectives rather than decision trees or frequent patterns. A process-mining tool like ProM provides hundreds of different process mining techniques ranging from discovery and conformance checking to filtering and prediction. Typically, a combination of techniques is needed and, for every step, there are different techniques that may be very sensitive to parameter settings. Moreover, event logs may be huge and may need to be decomposed and distributed for analysis. These aspects make it very cumbersome to analyze event logs manually. Process mining should be repeatable and automated. Therefore, we propose a framework to support the analysis of process mining workflows. Existing scientific workflow systems and data mining tools are not tailored towards process mining and the artifacts used for analysis (process models and event logs). This paper structures the basic building blocks needed for process mining and describes various analysis scenarios. Based on these requirements we implemented RapidProM, a tool supporting scientific workflows for process mining. Examples illustrating the different scenarios are provided to show the feasibility of the approach.

42 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article
TL;DR: LNBIP reports state-of-the-art results in areas related to business information systems and industrial application software development – timely, at a high level, and in both printed and electronic form.
Abstract: LNBIP reports state-of-the-art results in areas related to business information systems and industrial application software development – timely, at a high level, and in both printed and electronic form. The type of material published includes:
• Proceedings (published in time for the respective event)
• Postproceedings (consisting of thoroughly revised and/or extended final papers)
• Other edited monographs (such as, for example, project reports or invited volumes)
• Tutorials (coherently integrated collections of lectures given at advanced courses, seminars, schools, etc.)
• Award-winning or exceptional theses
LNBIP is abstracted/indexed in DBLP, EI and Scopus. LNBIP volumes are also submitted for the inclusion in ISI Proceedings.

347 citations

Journal ArticleDOI
TL;DR: The results highlight gaps and unexplored tradeoffs in the field, including the lack of scalability of some methods and a strong divergence in their performance with respect to the different quality metrics used.
Abstract: Process mining allows analysts to exploit logs of historical executions of business processes to extract insights regarding the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes as input an event log, and produces as output a business process model that captures the control-flow relations between tasks that are observed in or implied by the event log. Various automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy, and complexity of the resulting models. However, these methods have been evaluated in an ad-hoc manner, employing different datasets, experimental setups, evaluation measures, and baselines, often leading to incomparable conclusions and sometimes unreproducible results due to the use of closed datasets. This article provides a systematic review and comparative evaluation of automated process discovery methods, using an open-source benchmark and covering 12 publicly-available real-life event logs, 12 proprietary real-life event logs, and nine quality metrics. The results highlight gaps and unexplored tradeoffs in the field, including the lack of scalability of some methods and a strong divergence in their performance with respect to the different quality metrics used.
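To make the input and output described in this abstract concrete, the following is a minimal sketch, not one of the methods evaluated in the article. It assumes a toy event log represented as a list of traces (each trace a list of activity names) and counts directly-follows pairs, a simple control-flow relation that can be read off a log. The function name `directly_follows` and the sample activities are hypothetical.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log:
        # Pair each activity with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Toy event log: each inner list is one process instance (trace).
log = [
    ["register", "check", "pay"],
    ["register", "pay"],
    ["register", "check", "pay"],
]

dfg = directly_follows(log)
```

In this toy log, `register` is followed directly by `check` in two traces and by `pay` in one. A discovery method would have to decide how to represent such alternatives in the resulting model.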

225 citations

Posted Content
TL;DR: A novel process mining library that aims to bridge the gap between commercial and open-source process mining tools, providing integration with state-of-the-art data science libraries, e.g., pandas, numpy, scipy and scikit-learn is presented.
Abstract: Process mining, i.e., a sub-field of data science focusing on the analysis of event data generated during the execution of (business) processes, has seen a tremendous change over the past two decades. Starting off in the early 2000s, with limited to no tool support, nowadays, several software tools, i.e., both open-source, e.g., ProM and Apromore, and commercial, e.g., Disco, Celonis, ProcessGold, etc., exist. The commercial process mining tools provide limited support for implementing custom algorithms. Moreover, both commercial and open-source process mining tools are often only accessible through a graphical user interface, which hampers their usage in large-scale experimental settings. Initiatives such as RapidProM provide process mining support in the scientific workflow-based data science suite RapidMiner. However, these offer limited to no support for algorithmic customization. In the light of the aforementioned, in this paper, we present a novel process mining library, i.e., Process Mining for Python (PM4Py), that aims to bridge this gap, providing integration with state-of-the-art data science libraries, e.g., pandas, numpy, scipy and scikit-learn. We provide a global overview of the architecture and functionality of PM4Py, accompanied by some representative examples of its usage.

141 citations