
Showing papers on "Workflow technology published in 2013"


Journal ArticleDOI
TL;DR: A characterization of workflows from six diverse scientific applications, including astronomy, bioinformatics, earthquake science, and gravitational-wave physics, is provided, based on novel workflow profiling tools that provide detailed information about the various computational tasks that are present in the workflow.

648 citations


Journal ArticleDOI
TL;DR: Two workflow scheduling algorithms are proposed which aim to minimize the workflow execution cost while meeting a deadline, and which have polynomial time complexity, making them suitable options for scheduling large workflows in IaaS Clouds.

580 citations


01 Jan 2013
TL;DR: This paper presents a framework for computing activity deadlines so that the overall process deadline is met and all external time constraints are satisfied.
Abstract: Time management is a critical component of workflow-based process management. Important aspects of time management include planning of workflow process execution in time, estimating workflow execution duration, avoiding deadline violations, and satisfying all external time constraints such as fixed-date constraints and upper and lower bounds for time intervals between activities. In this paper, we present a framework for computing activity deadlines so that the overall process deadline is met and all external time constraints are satisfied.
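The deadline-derivation idea lends itself to a compact illustration. The following is a minimal sketch, not the paper's actual framework: it derives per-activity deadlines for a purely sequential workflow by a backward pass from the overall process deadline. The durations and the single fixed-date constraint are hypothetical.

```python
# Minimal sketch (not the paper's framework): derive per-activity deadlines for a
# sequential workflow by walking backwards from the overall process deadline.
# Times are hours from workflow start; durations and fixed dates are hypothetical.

def activity_deadlines(durations, process_deadline, fixed_dates=None):
    """durations: activity durations in execution order.
    fixed_dates: optional {activity index: latest allowed finish}."""
    fixed_dates = fixed_dates or {}
    deadlines = [0.0] * len(durations)
    latest_finish = process_deadline
    # Each activity must finish early enough for all of its successors.
    for i in range(len(durations) - 1, -1, -1):
        latest_finish = min(latest_finish, fixed_dates.get(i, latest_finish))
        deadlines[i] = latest_finish
        latest_finish -= durations[i]
    if latest_finish < 0:
        raise ValueError("no schedule satisfies the deadline and constraints")
    return deadlines

print(activity_deadlines([4, 2, 6], process_deadline=16, fixed_dates={1: 8}))
# -> [6, 8, 16]: the fixed-date constraint on activity 1 tightens its deadline.
```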

198 citations


Journal ArticleDOI
TL;DR: This work advances the idea of service-oriented modeling by presenting a design for a modeling service that builds from the Open Geospatial Consortium Web Processing Service (WPS) protocol, and demonstrates how the WPS protocol can be used to create modeling services, and how these modeling services can be brought into workflow environments using generic client-side code.
Abstract: Environmental modeling often requires the use of multiple data sources, models, and analysis routines coupled into a workflow to answer a research question. Coupling these computational resources can be accomplished using various tools, each requiring the developer to follow a specific protocol to ensure that components are linkable. Despite these coupling tools, it is not always straightforward to create a modeling workflow due to platform dependencies, computer architecture requirements, and programming language incompatibilities. A service-oriented approach that enables individual models to operate and interact with others using web services is one method for overcoming these challenges. This work advances the idea of service-oriented modeling by presenting a design for a modeling service that builds from the Open Geospatial Consortium (OGC) Web Processing Service (WPS) protocol. We demonstrate how the WPS protocol can be used to create modeling services, and then demonstrate how these modeling services can be brought into workflow environments using generic client-side code. We implemented this approach within the HydroModeler environment, a model coupling tool built on the Open Modeling Interface standard (version 1.4), and show how a hydrology model can be hosted as a WPS web service and used within a client-side workflow. The primary advantage of this approach is that the server-side software follows an established standard that can be leveraged and reused within multiple workflow environments and decision support systems.
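As a rough illustration of the client side of this approach, the sketch below issues a key-value-pair WPS Execute request over HTTP with the Python requests library. It is not the paper's HydroModeler implementation; the endpoint URL, process identifier, and input names are hypothetical.

```python
# Minimal sketch: invoking a hypothetical WPS-hosted hydrology model from
# client-side workflow code. Endpoint, process id, and inputs are assumptions.
import requests

WPS_ENDPOINT = "http://example.org/wps"  # hypothetical server

params = {
    "service": "WPS",
    "version": "1.0.0",
    "request": "Execute",
    "identifier": "RunoffModel",                 # hypothetical process id
    "DataInputs": "precipitation=12.5;area=340",  # hypothetical model inputs
}
response = requests.get(WPS_ENDPOINT, params=params, timeout=60)
response.raise_for_status()
print(response.text)  # the server replies with an XML ExecuteResponse document
```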

137 citations


Journal ArticleDOI
TL;DR: This study proposes a new approach for multi-objective workflow scheduling in clouds, and presents the hybrid PSO algorithm to optimize the scheduling performance, based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption.
Abstract: We address the problem of scheduling workflow applications on heterogeneous computing systems like cloud computing infrastructures. In general, the cloud workflow scheduling is a complex optimization problem which requires considering different criteria so as to meet a large number of QoS (Quality of Service) requirements. Traditional research in workflow scheduling mainly focuses on the optimization constrained by time or cost without paying attention to energy consumption. The main contribution of this study is to propose a new approach for multi-objective workflow scheduling in clouds, and present the hybrid PSO algorithm to optimize the scheduling performance. Our method is based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption. This technique allows processors to operate in different voltage supply levels by sacrificing clock frequencies. This multiple voltage involves a compromise between the quality of schedules and energy. Simulation results on synthetic and real-world scientific applications highlight the robust performance of the proposed approach.
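The energy/time compromise behind DVFS can be illustrated with the common dynamic-power model P ≈ C·V²·f. The sketch below is not the paper's hybrid PSO scheduler; the task size, voltage/frequency levels, and capacitance are hypothetical numbers chosen only to show the trade-off.

```python
# Minimal sketch (not the paper's algorithm): energy vs. time under two
# hypothetical DVFS operating points, using P ~ C * V^2 * f.

def energy_and_time(cycles, voltage, frequency_hz, capacitance=1e-9):
    time_s = cycles / frequency_hz
    power_w = capacitance * voltage**2 * frequency_hz
    return power_w * time_s, time_s

task_cycles = 2e9
for label, (v, f) in {"high": (1.2, 2.0e9), "low": (0.9, 1.0e9)}.items():
    energy, time = energy_and_time(task_cycles, v, f)
    print(f"{label}: {time:.2f} s, {energy:.2f} J")
# The low setting cuts energy by roughly 44% here while doubling execution time,
# which is the compromise between schedule quality and energy that the
# multi-objective scheduler must balance.
```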

133 citations


Journal ArticleDOI
TL;DR: A dynamic critical‐path‐based adaptive workflow scheduling algorithm for grids, which determines efficient mapping of workflow tasks to grid resources dynamically by calculating the critical path in the workflow task graph at every step is proposed.
Abstract: Effective scheduling is a key concern for the execution of performance-driven grid applications such as workflows. In this paper, we first define the workflow scheduling problem and describe the existing heuristic-based and metaheuristic-based workflow scheduling strategies in grids. Then, we propose a dynamic critical-path-based adaptive workflow scheduling algorithm for grids, which determines efficient mapping of workflow tasks to grid resources dynamically by calculating the critical path in the workflow task graph at every step. Using simulation, we compared the performance of the proposed approach with the existing approaches discussed in this paper, for different types and sizes of workflows. The results demonstrate that the heuristic-based scheduling techniques can adapt to the dynamic nature of resources and avoid performance degradation in dynamically changing grid environments. Finally, we outline a hybrid heuristic combining the features of the proposed adaptive scheduling technique with metaheuristics for optimizing execution cost and time as well as meeting the users' requirements, to efficiently manage the dynamism and heterogeneity of the hybrid cloud environment.
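The quantity the algorithm recomputes at every step, the critical path of the task graph, is a longest-path computation over a topological order. The sketch below uses Python's graphlib with hypothetical task runtimes and dependencies; it is not the paper's scheduling algorithm itself.

```python
# Minimal sketch: critical path length of a workflow DAG (hypothetical tasks).
from graphlib import TopologicalSorter

runtime = {"A": 4, "B": 2, "C": 3, "D": 5}                   # estimated runtimes
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}   # task -> predecessors

finish = {}
for task in TopologicalSorter(preds).static_order():
    earliest_start = max((finish[p] for p in preds[task]), default=0)
    finish[task] = earliest_start + runtime[task]

print("critical path length:", max(finish.values()))  # A -> C -> D = 4 + 3 + 5 = 12
```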

124 citations


Journal ArticleDOI
01 Feb 2013
TL;DR: This paper investigates the application of process mining for workflow integration based on the concept of RM_WF_Net, a type of Petri net extended with resource and message factors, and discovers the coordination patterns between different organizations and the workflow models within each organization from running logs containing information about resource allocation.
Abstract: Today's enterprise business processes become increasingly complex given that they are often executed by geographically dispersed partners or different organizations. Designing and modeling such a cross-organizational workflow is a complicated, time-consuming process and requires that a designer has extensive experience. Workflow logs captured by different cross-organizational systems provide a very valuable source of information on how business processes are executed in reality and thus can be used to derive workflow models through process mining. In this paper, we investigate the application of process mining for workflow integration based on the concept of RM_WF_Net, a type of Petri net extended with resource and message factors. Four coordination patterns are defined for workflow integration. A process mining approach is presented to discover the coordination patterns between different organizations and the workflow models in different organizations from the running logs containing the information about resource allocation. A process integration approach is then presented to obtain the model for a cross-organizational workflow based on the model mined for each organization and the coordination patterns between different organizations.

74 citations


Journal ArticleDOI
TL;DR: This paper discusses basic concepts of scientific workflows and presents workflow system tools and frameworks used today for the implementation of applications in science and engineering on high-performance computers and distributed systems.
Abstract: The wide availability of high-performance computing systems, Grids and Clouds, allowed scientists and engineers to implement more and more complex applications to access and process large data repositories and run scientific experiments in silico on distributed computing platforms. Most of these applications are designed as workflows that include data analysis, scientific computation methods, and complex simulation techniques. Scientific applications require tools and high-level mechanisms for designing and executing complex workflows. For this reason, in the past years, many efforts have been devoted towards the development of distributed workflow management systems for scientific applications. This paper discusses basic concepts of scientific workflows and presents workflow system tools and frameworks used today for the implementation of applications in science and engineering on high-performance computers and distributed systems. In particular, the paper reports on a selection of workflow systems largely used for solving scientific problems and discusses some open issues and research challenges in the area.

68 citations


Journal ArticleDOI
TL;DR: This paper shows how the workflow algebra is efficiently implemented in Chiron, an algebraic based parallel scientific workflow engine that has a unique native distributed provenance mechanism that enables runtime queries in a relational database.
Abstract: Large-scale scientific experiments based on computer simulations are typically modeled as scientific workflows, which eases the chaining of different programs. These scientific workflows are defined, executed and monitored by Scientific Workflow Management Systems (SWfMS). As these experiments manage large amounts of data, it becomes critical to execute them in High Performance Computing (HPC) environments, such as clusters, grids and clouds. However, few SWfMS provide parallel support. The ones that do so are usually labor-intensive for workflow developers and have limited primitives to optimize workflow execution. To address these issues, we developed a workflow algebra to specify and enable the optimization of parallel execution of scientific workflows. In this paper, we show how the workflow algebra is efficiently implemented in Chiron, an algebraic based parallel scientific workflow engine. Chiron has a unique native distributed provenance mechanism that enables run-time queries in a relational database. We developed two studies to evaluate the performance of our algebraic approach implemented in Chiron; the first study compares Chiron with different approaches while the second one evaluates the scalability of Chiron. By analyzing the results, we conclude that Chiron is efficient in executing scientific workflows, with the benefits of declarative specification and runtime provenance support.
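To make the idea of runtime provenance queries concrete, the sketch below poses the kind of query the abstract describes against a hypothetical relational table of task activations, using SQLite. It is not Chiron's actual schema or engine.

```python
# Minimal sketch (hypothetical schema, not Chiron's): querying provenance of task
# activations while some of them are still running.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE activation
              (task TEXT, status TEXT, started REAL, finished REAL)""")
db.executemany("INSERT INTO activation VALUES (?, ?, ?, ?)", [
    ("filter", "FINISHED", 0.0, 3.2),
    ("align",  "FINISHED", 3.2, 9.8),
    ("align",  "RUNNING",  9.8, None),
])

# Progress report, available at runtime rather than only after the workflow ends.
for row in db.execute("""SELECT task, status, COUNT(*)
                         FROM activation GROUP BY task, status"""):
    print(row)
```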

64 citations


Proceedings ArticleDOI
18 Mar 2013
TL;DR: The benefits of representing and sharing runtime provenance data for improving the experiment management as well as the analysis of the scientific data are shown.
Abstract: Scientific workflows are commonly used to model and execute large-scale scientific experiments. They represent key resources for scientists and are enacted and managed by Scientific Workflow Management Systems (SWfMS). Each SWfMS has its particular approach to execute workflows and to capture and manage their provenance data. Due to the large scale of experiments, it may be unviable to analyze provenance data only after the end of the execution. A single experiment may demand weeks to run, even in high performance computing environments. Thus scientists need to monitor the experiment during its execution, and this can be done through provenance data. Runtime provenance analysis allows for scientists to monitor workflow execution and to take actions before the end of it (i.e. workflow steering). This provenance data can also be used to fine-tune the parallel execution of the workflow dynamically. We use the PROV data model as a basic framework for modeling and providing runtime provenance as a database that can be queried even during the execution. This database is agnostic of SWfMS and workflow engine. We show the benefits of representing and sharing runtime provenance data for improving the experiment management as well as the analysis of the scientific data.

64 citations


Patent
14 Mar 2013
TL;DR: In this article, a computer-implemented method for managing a release of a software product includes obtaining a request for the release, the request including workflow action parameter data to define a release pipeline involving a plurality of software engineering systems configured to process data indicative of the software product, and executing, with a processor, a workflow to implement the release pipeline in accordance with the workflow action parameters.
Abstract: A computer-implemented method for managing a release of a software product includes obtaining a request for the release, the request including workflow action parameter data to define a release pipeline involving a plurality of software engineering systems configured to process data indicative of the software product, and executing, with a processor, a workflow to implement the release pipeline in accordance with the workflow action parameter data. Executing the workflow includes sending a series of instructions to the plurality of software engineering systems. A successive instruction in the series of instructions is sent based on whether a gating rule for the release is met.
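A hedged sketch of the gating behaviour described above: instructions are sent to a series of engineering systems only while a gating rule evaluates to true. This illustrates the stated idea only, not the patented implementation; the system names and the gating rule are hypothetical.

```python
# Minimal sketch: send release-pipeline instructions while the gating rule holds.

def run_release_pipeline(instructions, gate):
    """instructions: list of (system, command); gate: callable returning bool."""
    for system, command in instructions:
        if not gate():
            print("gating rule not met; halting pipeline")
            return False
        print(f"sending '{command}' to {system}")
    return True

tests_passed = True  # hypothetical gating condition
run_release_pipeline(
    [("build-server", "build"), ("test-farm", "run-tests"), ("cdn", "publish")],
    gate=lambda: tests_passed,
)
```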

Patent
06 Mar 2013
TL;DR: In this article, a system receives data storage workflow activities that include computer-executable instructions for carrying out data storage workflows in a network data storage system, and the system deploys the workflow to one or more workflow engines that can execute the various data storage activities related to the workflow.
Abstract: A system receives data storage workflow activities that include computer-executable instructions for carrying out data storage workflow in a network data storage system. Once the workflow is received, the system deploys the workflow to one or more workflow engines that can execute the various data storage activities related to the workflow. Prior to executing a data storage activity, the system can determine which workflow engine to use based on an allocation scheme.

Journal ArticleDOI
TL;DR: This work presents two cases demonstrating the potential value of patient-oriented workflow models, which define meaningful system boundaries and can lead to HIT implementations that are more consistent with cooperative work and its emergent features.

Journal ArticleDOI
TL;DR: The aim of this work is to propose an alternative approach for flexible process support within PLM systems that deals with a service-oriented perspective rather than an activity-oriented one.
Abstract: Manufacturing industries collaborating to develop new products need to implement effective management of their design processes (DPs) and product information. Unfortunately, product lifecycle management (PLM) systems, which are dedicated to supporting design activities, are not as efficient as might be expected. Indeed, DPs are changing, emergent and non-deterministic, due to the business environment under which they are carried out. PLM systems are currently based on workflow technology, which does not support process agility. So, flexibility in process support is needed to facilitate coupling with the reality of this environment. Furthermore, service-oriented approaches (SOA) enhance the flexibility and adaptability of composed solutions. Systems based on SOA have the ability to be inherently evolvable. So, we can say that SOA can promote support for flexible DPs. The aim of this work is to propose an alternative approach for flexible process support within PLM systems. The objective is to specify, design and implement business processes (BPs) in a very flexible way so that business changes can rapidly be considered in PLM solutions. Unlike existing approaches, the proposed one deals with a service-oriented perspective rather than an activity-oriented one.

Journal ArticleDOI
TL;DR: A large class of applications needs to execute the same workflow on different datasets of identical size; the scheduling problem that encompasses the resulting forms of parallelism is called pipelined workflow scheduling.
Abstract: A large class of applications need to execute the same workflow on different datasets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task, data, pipelined, and/or replicated parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors, or optimization goals. This article surveys the field by summing up and structuring known results and approaches.
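A tiny worked example of the trade-off this survey covers: in a pipelined execution with one processor per stage, the latency for a single dataset is the sum of the stage times, while steady-state throughput is limited by the slowest stage. The stage times below are hypothetical.

```python
# Minimal sketch: latency vs. throughput in a pipelined workflow (hypothetical stages).
stage_times = [2.0, 5.0, 3.0]   # seconds per dataset, one stage per processor

latency = sum(stage_times)      # time for one dataset to traverse the pipeline
period = max(stage_times)       # steady-state time between completed datasets

print(f"latency = {latency} s, period = {period} s, "
      f"throughput = {1 / period:.2f} datasets/s")
```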

BookDOI
01 Jan 2013
TL;DR: This book follows the loose programming paradigm, a novel approach to user-level workflow design, which makes essential use of constraint-driven workflow synthesis: Constraints provide high-level, declarative descriptions of individual components and entire workflows, which are then used to automatically translate the high-level specifications into concrete workflows that conform to the constraints by design.
Abstract: Just as driving a car needs no engineer, steering a computer should need no programmer and the development of user-specific software should be in the hands of the user. Service-oriented and model-based approaches have become the methods of choice for user-centric development of variant-rich workflows in many application domains. Formal methods can be integrated to further support the workflow development process at different levels. Particularly effective with regard to user-level workflow design are constraint-based methods, where the key role of constraints is to capture intents about the developed applications in the user’s specific domain language. This book follows the loose programming paradigm, a novel approach to user-level workflow design, which makes essential use of constraint-driven workflow synthesis: Constraints provide high-level, declarative descriptions of individual components and entire workflows. Process synthesis techniques are then used to automatically translate the high-level specifications into concrete workflows that conform to the constraints by design. Loose programming is moreover characterized by its unique holistic perspective on workflow development: being fully integrated into a mature process development framework, it profits seamlessly from the availability of various already established features and methods. In this book, the applicability of this framework is evaluated with a particular focus on the bioinformatics application domain. For this purpose, the first reference implementation of the loose programming paradigm is applied to a series of real-life bioinformatics workflow scenarios, whose different characteristics allow for a detailed evaluation of the features, capabilities, and limitations of the approach. The applications show that the proposed approach to constraint-driven design of variant-rich workflows enables the user to effectively create and manage software processes in his specific domain language and frees him from dealing with the technicalities of the individual services and their composition. Naturally, the quality of the synthesis solutions crucially depends on the provided domain model and on the applied synthesis strategy and constraints.

Journal ArticleDOI
01 Sep 2013
TL;DR: This work presents the fine-grained interoperability solution proposed in the SHIWA European project that brings together four representative European workflow systems: ASKALON, MOTEUR, WS-PGRADE, and Triana, and proposes a generic Interoperable Workflow Intermediate Representation (IWIR) that can be used as a common bridge for translating workflows between different languages independent of the underlying distributed computing infrastructure.
Abstract: Today there exists a wide variety of scientific workflow management systems, each designed to fulfill the needs of a certain scientific community. Unfortunately, once a workflow application has been designed in one particular system it becomes very hard to share it with users working with different systems. Portability of workflows and interoperability between current systems barely exists. In this work, we present the fine-grained interoperability solution proposed in the SHIWA European project that brings together four representative European workflow systems: ASKALON, MOTEUR, WS-PGRADE, and Triana. The proposed interoperability is realised at two levels of abstraction: abstract and concrete. At the abstract level, we propose a generic Interoperable Workflow Intermediate Representation (IWIR) that can be used as a common bridge for translating workflows between different languages independent of the underlying distributed computing infrastructure. At the concrete level, we propose a bundling technique that aggregates the abstract IWIR representation and concrete task representations to enable workflow instantiation, execution and scheduling. We illustrate case studies using two real-world workflow applications designed in a native environment and then translated and executed by a foreign workflow system in a foreign distributed computing infrastructure.

Proceedings ArticleDOI
22 Apr 2013
TL;DR: A broker-based framework for running workflows in a multi-Cloud environment that allows an automatic selection of the target Clouds, a uniform access to the Clouds, and workflow data management with respect to user Service Level Agreement (SLA) requirements is presented.
Abstract: Computational science workflows have been successfully run on traditional HPC systems like clusters and Grids for many years. Today, users are interested to execute their workflow applications in the Cloud to exploit the economic and technical benefits of this new emerging technology. The deployment and management of workflows over the current existing heterogeneous and not yet interoperable Cloud providers, however, is still a challenging task for the workflow developers. In this paper, we present a broker-based framework for running workflows in a multi-Cloud environment. The framework allows an automatic selection of the target Clouds, a uniform access to the Clouds, and workflow data management with respect to user Service Level Agreement (SLA) requirements. Following a simulation approach, we evaluated the framework with a real scientific workflow application in different deployment scenarios. The results show that our framework offers benefits to users by executing workflows with the expected performance and service quality at lowest cost.
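A minimal sketch of the broker's selection step as described above, not the paper's actual framework: filter Cloud offers against a user SLA requirement and pick the cheapest feasible one. The provider names, prices, and SLA field are hypothetical.

```python
# Minimal sketch: SLA-constrained, cost-minimizing Cloud selection (hypothetical data).
offers = [
    {"provider": "cloud-a", "price_per_hour": 0.12, "max_runtime_h": 10},
    {"provider": "cloud-b", "price_per_hour": 0.09, "max_runtime_h": 30},
    {"provider": "cloud-c", "price_per_hour": 0.20, "max_runtime_h": 5},
]
sla = {"deadline_h": 8}   # hypothetical user requirement

feasible = [o for o in offers if o["max_runtime_h"] >= sla["deadline_h"]]
best = min(feasible, key=lambda o: o["price_per_hour"])
print("selected:", best["provider"])   # cheapest offer that still meets the SLA
```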

Proceedings ArticleDOI
15 Jul 2013
TL;DR: This work evaluated FlowFixer on 16 broken workflows from 5 real-world GUI applications written in Java and found that it produced significantly better results than two alternative approaches.
Abstract: A workflow is a sequence of UI actions to complete a specific task. In the course of a GUI application's evolution, changes ranging from a simple GUI refactoring to a complete rearchitecture can break an end-user's well-established workflow. It can be challenging to find a replacement workflow. To address this problem, we present a technique (and its tool implementation, called FlowFixer) that repairs a broken workflow. FlowFixer uses dynamic profiling, static analysis, and random testing to suggest a replacement UI action that fixes a broken workflow. We evaluated FlowFixer on 16 broken workflows from 5 real-world GUI applications written in Java. In 13 workflows, the correct replacement action was FlowFixer's first suggestion. In 2 workflows, the correct replacement action was FlowFixer's second suggestion. The remaining workflow was unrepairable. Overall, FlowFixer produced significantly better results than two alternative approaches.

Patent
15 Mar 2013
TL;DR: In this paper, the authors present systems and methods for use in creating an evaluation workflow defining a multiple step evaluation process for use by one or more users variously involved in an evidence-based evaluation.
Abstract: Several embodiments provide systems and methods for use in creating an evaluation workflow defining a multiple step evaluation process for use by one or more users variously involved in an evidence-based evaluation. The systems and methods allow the user to define the evaluation workflow and store the evaluation workflow in a database, allow the user to add a plurality of assessments to the evaluation workflow and store the plurality of assessments in association with the evaluation workflow in the database, each assessment defining an evaluation event at a given point in time to be assessed as part of the evaluation process spanning an evaluation period of time, and allow the user to add one or more parts to each of the plurality of assessments and store the one or more parts in association with the plurality of assessments in the database.

Book ChapterDOI
01 Jan 2013
TL;DR: Some of the most important research efforts and results in modeling temporal aspects of workflows, analysis of temporal properties of workflow models, computation of workflow execution schedules, and minimization of exceptions due to violation of temporal constraints are summarized.
Abstract: Time is an important aspect of business process management. Here we revisit the following contributions of early workflow time management approaches: representation of temporal information and temporal constraints, analysis of temporal constraint satisfiability, and computation of workflow execution plans that satisfy temporal constraints. In particular, we summarize some of the most important research efforts and results in: (a) modeling temporal aspects of workflows, (b) analysis of temporal properties of workflow models, (c) computation of workflow execution schedules, (d) minimization of exceptions due to violation of temporal constraints, (e) monitoring of temporal workflow aspects, and (f) modeling and calculation of temporal properties for distributed workflows and for guaranteeing Quality of Service in Web-service composition.

Journal ArticleDOI
TL;DR: A performance evaluation of SciPhylomics executions in a real cloud environment using two parallel execution approaches (SciCumulus and Hadoop) on the Amazon EC2 cloud reinforces the benefits of parallelizing data for the phylogenomic inference workflow using MapReduce-like parallel approaches in the cloud.

Proceedings ArticleDOI
23 Dec 2013
TL;DR: This paper presents two general approaches: one that exclusively uses object stores to store all the files accessed and generated by a workflow, while the other relies on the shared filesystem for caching intermediate data sets.
Abstract: Scientific workflows consist of tasks that operate on input data to generate new data products that are used by subsequent tasks. Workflow management systems typically stage data to computational sites before invoking the necessary computations. In some cases data may be accessed using remote I/O. There are limitations with these approaches, however. First, the storage at a computational site may be limited and not able to accommodate the necessary input and intermediate data. Second, even if there is enough storage, it is sometimes managed by a filesystem with limited scalability. In recent years, object stores have been shown to provide a scalable way to store and access large datasets; however, they provide a limited set of operations (retrieve, store and delete) that do not always match the requirements of the workflow tasks. In this paper, we show how scientific workflows can take advantage of the capabilities of object stores without requiring users to modify their workflow-based applications or scientific codes. We present two general approaches: one that exclusively uses object stores to store all the files accessed and generated by a workflow, while the other relies on the shared filesystem for caching intermediate data sets. We have implemented both of these approaches in the Pegasus Workflow Management System and have used them to execute workflows in a variety of execution environments ranging from traditional supercomputing environments that have a shared filesystem to dynamic environments like Amazon AWS and the Open Science Grid that only offer remote object stores. As a result, Pegasus users can easily migrate their applications from a shared filesystem deployment to one using object stores without changing their application codes.
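For readers unfamiliar with the retrieve/store interface the paper relies on, the sketch below stages a workflow input through Amazon S3 with boto3. It is a generic illustration, not Pegasus's data-staging implementation; the bucket and key names are hypothetical, and valid AWS credentials plus a local input file are assumed.

```python
# Minimal sketch: store an input in an object store, then retrieve it at the
# compute site before the job runs. Bucket/key/file names are hypothetical.
import boto3

s3 = boto3.client("s3")
bucket, key = "my-workflow-bucket", "run42/input.dat"

# Store: upload an input file produced on the submit host.
s3.upload_file("input.dat", bucket, key)

# Retrieve: a job at the compute site pulls the object before executing.
s3.download_file(bucket, key, "/tmp/input.dat")
```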

Journal ArticleDOI
TL;DR: The paper presents implementation details of the multithreaded workflow execution engine implemented in JEE, and performs tests for three different optimization goals for two business and scientific workflow applications.
Abstract: The paper presents a complete solution for modeling scientific and business workflow applications, static and just-in-time QoS selection of services and workflow execution in a real environment. The workflow application is modeled as an acyclic directed graph where nodes denote tasks and edges denote dependencies between the tasks. The BeesyCluster middleware is used to allow providers to publish services from sequential or parallel applications, from their servers or clusters. Optimization algorithms are proposed to select a capable service for each task so that a global criterion is optimized, such as a product of workflow execution time and cost, a linear combination of those, or minimization of the time with a cost constraint. The paper presents implementation details of the multithreaded workflow execution engine implemented in JEE. Several tests were performed for three different optimization goals for two business and scientific workflow applications. Finally, the overhead of the solution is presented.
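The selection criterion "product of workflow execution time and cost" can be illustrated with an exhaustive search over a toy two-task workflow. This is a sketch of the optimization goal only, not the BeesyCluster algorithms; the candidate services and their time/cost figures are hypothetical.

```python
# Minimal sketch: pick one service per task to minimize (total time * total cost).
from itertools import product

candidates = {
    "task1": [("s1a", 10, 2.0), ("s1b", 6, 5.0)],   # (service, time, cost)
    "task2": [("s2a", 8, 1.0), ("s2b", 4, 4.0)],
}

best = min(
    product(*candidates.values()),
    key=lambda combo: sum(t for _, t, _ in combo) * sum(c for _, _, c in combo),
)
total_time = sum(t for _, t, _ in best)
total_cost = sum(c for _, _, c in best)
print([name for name, _, _ in best], "time*cost =", total_time * total_cost)
# -> ['s1a', 's2a'] time*cost = 54.0
```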

Journal ArticleDOI
TL;DR: A Petri net based approach for resource requirements analysis, which can be used for more general purposes, is proposed; the concept of resource-oriented workflow nets (ROWN) is introduced and the transition firing rules of ROWN are presented.
Abstract: Petri nets are a powerful formalism in modeling workflows. A workflow determines the flow of work according to a pre-defined business process. In many situations, business processes are constrained by scarce resources. The lack of resources can cause contention, the need for some tasks to wait for others to complete, which slows down the accomplishment of larger goals. In our previous work, a resource-constrained workflow model was introduced and a resource requirement analysis approach was developed for emergency response workflows, in which support of on-the-fly workflow change is critical [14]. In this paper, we propose a Petri net based approach for resource requirements analysis, which can be used for more general purposes. The concept of resource-oriented workflow nets (ROWN) is introduced and the transition firing rules of ROWN are presented. Resource requirements for general workflows can be determined through reachability analysis. An efficient resource analysis algorithm is developed for a class of well-structured workflows, in which, once a task execution is started, it is guaranteed to finish successfully. For a task that may fail in the middle of execution, an equivalent non-failing task model in terms of resource consumption is developed.
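The firing rule of a resource-aware net can be sketched in a few lines: a transition is enabled only if all of its input places, including resource places, hold enough tokens, and firing moves those tokens. This illustrates the general idea rather than the ROWN formalism; the marking, place names, and weights are hypothetical.

```python
# Minimal sketch: enabledness and firing of a transition in a resource-aware
# Petri net (hypothetical emergency-response marking).
marking = {"ready": 1, "ambulance": 2, "dispatched": 0}

def fire(marking, consume, produce):
    if any(marking[p] < n for p, n in consume.items()):
        return False                      # not enabled: a required token is missing
    for p, n in consume.items():
        marking[p] -= n
    for p, n in produce.items():
        marking[p] += n
    return True

# "dispatch" needs one case token and one ambulance (resource) token.
print(fire(marking, consume={"ready": 1, "ambulance": 1}, produce={"dispatched": 1}))
print(marking)   # the ambulance resource token has been consumed
```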

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper proposes an approach to automatically obtain abstractions from low-level provenance data by finding common workflow fragments in workflow execution provenance and relating them to templates, and shows that by using these kinds of abstractions the authors can highlight the most common abstract methods used in the executions of a repository.
Abstract: Provenance plays a major role when understanding and reusing the methods applied in a scientific experiment, as it provides a record of inputs, the processes carried out and the use and generation of intermediate and final results. In the specific case of in-silico scientific experiments, a large variety of scientific workflow systems (e.g., Wings, Taverna, Galaxy, Vistrails) have been created to support scientists. All of these systems produce some sort of provenance about the executions of the workflows that encode scientific experiments. However, provenance is normally recorded at a very low level of detail, which complicates the understanding of what happened during execution. In this paper we propose an approach to automatically obtain abstractions from low-level provenance data by finding common workflow fragments on workflow execution provenance and relating them to templates. We have tested our approach with a dataset of workflows published by the Wings workflow system. Our results show that by using these kinds of abstractions we can highlight the most common abstract methods used in the executions of a repository, relating different runs and workflow templates with each other.

Proceedings Article
01 Aug 2013
TL;DR: This work introduces the proposed five step workflow for creating information extractors, the graph query based rule language, as well as the core features of the PROPMINER tool.
Abstract: The use of deep syntactic information such as typed dependencies has been shown to be very effective in Information Extraction. Despite this potential, the process of manually creating rule-based information extractors that operate on dependency trees is not intuitive for persons without an extensive NLP background. In this system demonstration, we present a tool and a workflow designed to enable initiate users to interactively explore the effect and expressivity of creating Information Extraction rules over dependency trees. We introduce the proposed five step workflow for creating information extractors, the graph query based rule language, as well as the core features of the PROPMINER tool.
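To give a flavour of pattern matching over typed dependencies (not the PROPMINER rule language itself), the sketch below extracts subject-verb-object triples from a dependency tree stored as (head, dependent, label) edges. The example sentence and the use of nsubj/dobj labels are illustrative assumptions.

```python
# Minimal sketch: match a subject-verb-object pattern over typed dependency edges.
edges = [("acquired", "Google", "nsubj"), ("acquired", "YouTube", "dobj")]

def match_svo(edges):
    subjects = {head: dep for head, dep, label in edges if label == "nsubj"}
    objects = {head: dep for head, dep, label in edges if label == "dobj"}
    return [(subjects[v], v, objects[v]) for v in subjects if v in objects]

print(match_svo(edges))   # [('Google', 'acquired', 'YouTube')]
```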

Journal ArticleDOI
TL;DR: A comprehensive framework tailored for flexible human-centric healthcare processes that improves the reliability of activity recognition data and presents a set of mechanisms that exploit the application knowledge encoded in workflows in order to reduce the uncertainty of this data, thus enabling unobtrusive robust healthcare workflows.
Abstract: Processes in the healthcare domain are characterized by coarsely predefined recurring procedures that are flexibly adapted by the personnel to suit specific situations. In this setting, a workflow management system that gives guidance and documents the personnel's actions can lead to a higher quality of care, fewer mistakes, and higher efficiency. However, most existing workflow management systems enforce rigid inflexible workflows and rely on direct manual input. Both are inadequate for healthcare processes. In particular, direct manual input is not possible in most cases since (1) it would distract the personnel even in critical situations and (2) it would violate fundamental hygiene principles by requiring disinfected doctors and nurses to touch input devices. The solution could be activity recognition systems that use sensor data (e.g., audio and acceleration data) to infer the current activities of the personnel and provide input to a workflow (e.g., informing it that a certain activity is finished now). However, state-of-the-art activity recognition technologies have difficulties in providing reliable information. We describe a comprehensive framework tailored for flexible human-centric healthcare processes that improves the reliability of activity recognition data. We present a set of mechanisms that exploit the application knowledge encoded in workflows in order to reduce the uncertainty of this data, thus enabling unobtrusive robust healthcare workflows. We evaluate our work based on a real-world case study and show that the robustness of unobtrusive healthcare workflows can be increased to an absolute value of up to 91% (compared to only 12% with a classical workflow system). This is a major breakthrough that paves the way towards future IT-enabled healthcare systems.
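One of the mechanisms described, using workflow knowledge to constrain noisy activity-recognition output, can be sketched as a simple filter over classifier scores. The scores, activity names, and allowed-transition table below are hypothetical; this is not the paper's framework.

```python
# Minimal sketch: keep only recognition hypotheses that the workflow allows next.
recognition_scores = {"suture": 0.41, "disinfect": 0.38, "write_report": 0.21}
allowed_next = {"incision": {"suture", "disinfect"}}   # per current workflow step

current_step = "incision"
plausible = {a: s for a, s in recognition_scores.items()
             if a in allowed_next[current_step]}
best = max(plausible, key=plausible.get)
print("accepted activity:", best)   # 'write_report' is ruled out by the workflow
```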

Proceedings ArticleDOI
27 Jun 2013
TL;DR: This paper proposes the generation of workflow description summaries in order to tackle workflow complexity, elaborates reduction primitives for summarizing workflows, and shows how primitives, as building blocks, can be used in conjunction with semantic workflow annotations to encode different summarization strategies.
Abstract: Scientific workflows have become the workhorse of Big Data analytics for scientists. As well as being repeatable and optimizable pipelines that bring together datasets and analysis tools, workflows make up an important part of the provenance of data generated from their execution. By faithfully capturing all stages in the analysis, workflows play a critical part in building up the audit-trail (a.k.a. provenance) metadata for derived datasets and contribute to the veracity of results. Provenance is essential for reporting results, reporting the method followed, and adapting to changes in the datasets or tools. These functions, however, are hampered by the complexity of workflows and consequently the complexity of the data-trails generated from their instrumented execution. In this paper we propose the generation of workflow description summaries in order to tackle workflow complexity. We elaborate reduction primitives for summarizing workflows, and show how primitives, as building blocks, can be used in conjunction with semantic workflow annotations to encode different summarization strategies. We report on the effectiveness of the method through experimental evaluation using real-world workflows from the Taverna system.
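A single reduction primitive of the kind described can be sketched as collapsing a chain of steps into one summary node. This is an illustrative stand-in, not the paper's primitives or Taverna's annotations; the workflow steps and the chain chosen for collapsing are hypothetical.

```python
# Minimal sketch: collapse a chain of workflow steps into a single summary node.
workflow = ["load", "convert", "normalise", "align", "plot"]
chain_to_collapse = ("convert", "normalise", "align")   # steps treated as one method

summary = []
i = 0
while i < len(workflow):
    if tuple(workflow[i:i + len(chain_to_collapse)]) == chain_to_collapse:
        summary.append("preprocess+align")               # summary label for the chain
        i += len(chain_to_collapse)
    else:
        summary.append(workflow[i])
        i += 1
print(summary)   # ['load', 'preprocess+align', 'plot']
```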

Journal ArticleDOI
TL;DR: This paper bridges the gap between provenance and geo-processing workflow through extending both workflow language and service interface, making it possible for the automatic capture of provenance information in the geospatial web service environment.
Abstract: Data provenance, also called data lineage, records the derivation history of a data product. In the earth science domain, geospatial data provenance is important because it plays a significant role in data quality and usability evaluation, data trail auditing, workflow replication, and product reproducibility. The generation of geospatial provenance metadata is usually coupled with the execution of a geo-processing workflow. Their symbiotic relationship makes them complementary to each other and promises great benefit once they are integrated. However, the heterogeneity of data and computing resources in the distributed environment constructed under the service-oriented architecture (SOA) brings a great challenge to resource integration. Specifically, issues such as the lack of interoperability and compatibility among provenance metadata models and between provenance and workflow create obstacles for the integration of provenance and geo-processing workflows. In order to tackle these issues, on the one hand, this paper addresses provenance heterogeneity by recording provenance information in a standard lineage model defined in the ISO 19115:2003 and ISO 19115-2:2009 standards. On the other hand, this paper bridges the gap between provenance and geo-processing workflow by extending both the workflow language and the service interface, making it possible to automatically capture provenance information in the geospatial web service environment. The proposed method is implemented in GeoBrain, a SOA-based geospatial web service system. Testing results from the implementation show that geospatial provenance information is successfully captured throughout the life cycle of geo-processing workflows and properly recorded in the ISO standard lineage model.