
Showing papers on "Workflow technology published in 2013"


Journal ArticleDOI
TL;DR: A characterization of workflows from six diverse scientific applications, including astronomy, bioinformatics, earthquake science, and gravitational-wave physics, is provided, based on novel workflow profiling tools that provide detailed information about the various computational tasks that are present in the workflow.

648 citations


Journal ArticleDOI
TL;DR: Two workflow scheduling algorithms are proposed which aim to minimize the workflow execution cost while meeting a deadline, and which have polynomial time complexity, making them suitable options for scheduling large workflows in IaaS Clouds.

580 citations


01 Jan 2013
TL;DR: This paper presents a framework for computing activity deadlines so that the overall process deadline is met and all external time constraints are satisfied.
Abstract: Time management is a critical component of workflow-based process management. Important aspects of time management include planning of workflow process execution in time, estimating workflow execution duration, avoiding deadline violations, and satisfying all external time constraints such as fixed-date constraints and upper and lower bounds for time intervals between activities. In this paper, we present a framework for computing activity deadlines so that the overall process deadline is met and all external time constraints are satisfied.
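The deadline-derivation idea lends itself to a compact illustration. The following is a minimal sketch, not the paper's actual framework: it derives per-activity deadlines for a purely sequential workflow by a backward pass from the overall process deadline. The durations and the single fixed-date constraint are hypothetical.

```python
# Minimal sketch (not the paper's framework): derive per-activity deadlines for a
# sequential workflow by walking backwards from the overall process deadline.
# Times are hours from workflow start; durations and fixed dates are hypothetical.

def activity_deadlines(durations, process_deadline, fixed_dates=None):
    """durations: activity durations in execution order.
    fixed_dates: optional {activity index: latest allowed finish}."""
    fixed_dates = fixed_dates or {}
    deadlines = [0.0] * len(durations)
    latest_finish = process_deadline
    # Each activity must finish early enough for all of its successors.
    for i in range(len(durations) - 1, -1, -1):
        latest_finish = min(latest_finish, fixed_dates.get(i, latest_finish))
        deadlines[i] = latest_finish
        latest_finish -= durations[i]
    if latest_finish < 0:
        raise ValueError("no schedule satisfies the deadline and constraints")
    return deadlines

print(activity_deadlines([4, 2, 6], process_deadline=16, fixed_dates={1: 8}))
# -> [6, 8, 16]: the fixed-date constraint on activity 1 tightens its deadline.
```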

198 citations


Journal ArticleDOI
TL;DR: This work advances the idea of service-oriented modeling by presenting a design for a modeling service that builds from the Open Geospatial Consortium Web Processing Service (WPS) protocol, and demonstrates how the WPS protocol can be used to create modeling services, and how these modeling services can be brought into workflow environments using generic client-side code.
Abstract: Environmental modeling often requires the use of multiple data sources, models, and analysis routines coupled into a workflow to answer a research question. Coupling these computational resources can be accomplished using various tools, each requiring the developer to follow a specific protocol to ensure that components are linkable. Despite these coupling tools, it is not always straightforward to create a modeling workflow due to platform dependencies, computer architecture requirements, and programming language incompatibilities. A service-oriented approach that enables individual models to operate and interact with others using web services is one method for overcoming these challenges. This work advances the idea of service-oriented modeling by presenting a design for a modeling service that builds from the Open Geospatial Consortium (OGC) Web Processing Service (WPS) protocol. We demonstrate how the WPS protocol can be used to create modeling services, and then demonstrate how these modeling services can be brought into workflow environments using generic client-side code. We implemented this approach within the HydroModeler environment, a model coupling tool built on the Open Modeling Interface standard (version 1.4), and show how a hydrology model can be hosted as a WPS web service and used within a client-side workflow. The primary advantage of this approach is that the server-side software follows an established standard that can be leveraged and reused within multiple workflow environments and decision support systems.
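As a rough illustration of the client side of this approach, the sketch below issues a key-value-pair WPS Execute request over HTTP with the Python requests library. It is not the paper's HydroModeler implementation; the endpoint URL, process identifier, and input names are hypothetical.

```python
# Minimal sketch: invoking a hypothetical WPS-hosted hydrology model from
# client-side workflow code. Endpoint, process id, and inputs are assumptions.
import requests

WPS_ENDPOINT = "http://example.org/wps"  # hypothetical server

params = {
    "service": "WPS",
    "version": "1.0.0",
    "request": "Execute",
    "identifier": "RunoffModel",                 # hypothetical process id
    "DataInputs": "precipitation=12.5;area=340",  # hypothetical model inputs
}
response = requests.get(WPS_ENDPOINT, params=params, timeout=60)
response.raise_for_status()
print(response.text)  # the server replies with an XML ExecuteResponse document
```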

137 citations


Journal ArticleDOI
TL;DR: This study proposes a new approach for multi-objective workflow scheduling in clouds, and presents the hybrid PSO algorithm to optimize the scheduling performance, based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption.
Abstract: We address the problem of scheduling workflow applications on heterogeneous computing systems like cloud computing infrastructures. In general, the cloud workflow scheduling is a complex optimization problem which requires considering different criteria so as to meet a large number of QoS (Quality of Service) requirements. Traditional research in workflow scheduling mainly focuses on the optimization constrained by time or cost without paying attention to energy consumption. The main contribution of this study is to propose a new approach for multi-objective workflow scheduling in clouds, and present the hybrid PSO algorithm to optimize the scheduling performance. Our method is based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption. This technique allows processors to operate in different voltage supply levels by sacrificing clock frequencies. This multiple voltage involves a compromise between the quality of schedules and energy. Simulation results on synthetic and real-world scientific applications highlight the robust performance of the proposed approach.
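The energy/time compromise behind DVFS can be illustrated with the common dynamic-power model P ≈ C·V²·f. The sketch below is not the paper's hybrid PSO scheduler; the task size, voltage/frequency levels, and capacitance are hypothetical numbers chosen only to show the trade-off.

```python
# Minimal sketch (not the paper's algorithm): energy vs. time under two
# hypothetical DVFS operating points, using P ~ C * V^2 * f.

def energy_and_time(cycles, voltage, frequency_hz, capacitance=1e-9):
    time_s = cycles / frequency_hz
    power_w = capacitance * voltage**2 * frequency_hz
    return power_w * time_s, time_s

task_cycles = 2e9
for label, (v, f) in {"high": (1.2, 2.0e9), "low": (0.9, 1.0e9)}.items():
    energy, time = energy_and_time(task_cycles, v, f)
    print(f"{label}: {time:.2f} s, {energy:.2f} J")
# The low setting cuts energy by roughly 44% here while doubling execution time,
# which is the compromise between schedule quality and energy that the
# multi-objective scheduler must balance.
```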

133 citations


Journal ArticleDOI
TL;DR: A dynamic critical‐path‐based adaptive workflow scheduling algorithm for grids, which determines efficient mapping of workflow tasks to grid resources dynamically by calculating the critical path in the workflow task graph at every step is proposed.
Abstract: Effective scheduling is a key concern for the execution of performance-driven grid applications such as workflows. In this paper, we first define the workflow scheduling problem and describe the existing heuristic-based and metaheuristic-based workflow scheduling strategies in grids. Then, we propose a dynamic critical-path-based adaptive workflow scheduling algorithm for grids, which determines efficient mapping of workflow tasks to grid resources dynamically by calculating the critical path in the workflow task graph at every step. Using simulation, we compared the performance of the proposed approach with the existing approaches discussed in this paper, for different types and sizes of workflows. The results demonstrate that the heuristic-based scheduling techniques can adapt to the dynamic nature of resources and avoid performance degradation in dynamically changing grid environments. Finally, we outline a hybrid heuristic combining the features of the proposed adaptive scheduling technique with metaheuristics for optimizing execution cost and time as well as meeting the users' requirements, to efficiently manage the dynamism and heterogeneity of the hybrid cloud environment.
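The quantity the algorithm recomputes at every step, the critical path of the task graph, is a longest-path computation over a topological order. The sketch below uses Python's graphlib with hypothetical task runtimes and dependencies; it is not the paper's scheduling algorithm itself.

```python
# Minimal sketch: critical path length of a workflow DAG (hypothetical tasks).
from graphlib import TopologicalSorter

runtime = {"A": 4, "B": 2, "C": 3, "D": 5}                   # estimated runtimes
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}   # task -> predecessors

finish = {}
for task in TopologicalSorter(preds).static_order():
    earliest_start = max((finish[p] for p in preds[task]), default=0)
    finish[task] = earliest_start + runtime[task]

print("critical path length:", max(finish.values()))  # A -> C -> D = 4 + 3 + 5 = 12
```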

124 citations


Journal ArticleDOI
01 Feb 2013
TL;DR: This paper investigates the application of process mining for workflow integration based on the concept of RM_WF_Net, a type of Petri net extended with resource and message factors, and discovers the coordination patterns between different organizations and the workflow models within each organization from running logs containing information about resource allocation.
Abstract: Today's enterprise business processes become increasingly complex given that they are often executed by geographically dispersed partners or different organizations. Designing and modeling such a cross-organizational workflow is a complicated, time-consuming process and requires that a designer has extensive experience. Workflow logs captured by different cross-organizational systems provide a very valuable source of information on how business processes are executed in reality and thus can be used to derive workflow models through process mining. In this paper, we investigate the application of process mining for workflow integration based on the concept of RM_WF_Net, a type of Petri net extended with resource and message factors. Four coordination patterns are defined for workflow integration. A process mining approach is presented to discover the coordination patterns between different organizations and the workflow models in different organizations from the running logs containing the information about resource allocation. A process integration approach is then presented to obtain the model for a cross-organizational workflow based on the model mined for each organization and the coordination patterns between different organizations.

74 citations


Journal ArticleDOI
TL;DR: This paper discusses basic concepts of scientific workflows and presents workflow system tools and frameworks used today for the implementation of applications in science and engineering on high-performance computers and distributed systems.
Abstract: The wide availability of high-performance computing systems, Grids and Clouds, allowed scientists and engineers to implement more and more complex applications to access and process large data repositories and run scientific experiments in silico on distributed computing platforms. Most of these applications are designed as workflows that include data analysis, scientific computation methods, and complex simulation techniques. Scientific applications require tools and high-level mechanisms for designing and executing complex workflows. For this reason, in the past years, many efforts have been devoted towards the development of distributed workflow management systems for scientific applications. This paper discusses basic concepts of scientific workflows and presents workflow system tools and frameworks used today for the implementation of applications in science and engineering on high-performance computers and distributed systems. In particular, the paper reports on a selection of workflow systems largely used for solving scientific problems and discusses some open issues and research challenges in the area.

68 citations


Journal ArticleDOI
TL;DR: This paper shows how the workflow algebra is efficiently implemented in Chiron, an algebraic based parallel scientific workflow engine that has a unique native distributed provenance mechanism that enables runtime queries in a relational database.
Abstract: Large-scale scientific experiments based on computer simulations are typically modeled as scientific workflows, which eases the chaining of different programs. These scientific workflows are defined, executed and monitored by Scientific Workflow Management Systems (SWfMS). As these experiments manage large amounts of data, it becomes critical to execute them in High Performance Computing (HPC) environments, such as clusters, grids and clouds. However, few SWfMS provide parallel support. The ones that do so are usually labor-intensive for workflow developers and have limited primitives to optimize workflow execution. To address these issues, we developed a workflow algebra to specify and enable the optimization of parallel execution of scientific workflows. In this paper, we show how the workflow algebra is efficiently implemented in Chiron, an algebraic based parallel scientific workflow engine. Chiron has a unique native distributed provenance mechanism that enables run-time queries in a relational database. We developed two studies to evaluate the performance of our algebraic approach implemented in Chiron; the first study compares Chiron with different approaches while the second one evaluates the scalability of Chiron. By analyzing the results, we conclude that Chiron is efficient in executing scientific workflows, with the benefits of declarative specification and runtime provenance support.
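To make the idea of runtime provenance queries concrete, the sketch below poses the kind of query the abstract describes against a hypothetical relational table of task activations, using SQLite. It is not Chiron's actual schema or engine.

```python
# Minimal sketch (hypothetical schema, not Chiron's): querying provenance of task
# activations while some of them are still running.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE activation
              (task TEXT, status TEXT, started REAL, finished REAL)""")
db.executemany("INSERT INTO activation VALUES (?, ?, ?, ?)", [
    ("filter", "FINISHED", 0.0, 3.2),
    ("align",  "FINISHED", 3.2, 9.8),
    ("align",  "RUNNING",  9.8, None),
])

# Progress report, available at runtime rather than only after the workflow ends.
for row in db.execute("""SELECT task, status, COUNT(*)
                         FROM activation GROUP BY task, status"""):
    print(row)
```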

64 citations


Proceedings ArticleDOI
18 Mar 2013
TL;DR: The benefits of representing and sharing runtime provenance data for improving the experiment management as well as the analysis of the scientific data are shown.
Abstract: Scientific workflows are commonly used to model and execute large-scale scientific experiments. They represent key resources for scientists and are enacted and managed by Scientific Workflow Management Systems (SWfMS). Each SWfMS has its particular approach to execute workflows and to capture and manage their provenance data. Due to the large scale of experiments, it may be unviable to analyze provenance data only after the end of the execution. A single experiment may demand weeks to run, even in high performance computing environments. Thus scientists need to monitor the experiment during its execution, and this can be done through provenance data. Runtime provenance analysis allows for scientists to monitor workflow execution and to take actions before the end of it (i.e. workflow steering). This provenance data can also be used to fine-tune the parallel execution of the workflow dynamically. We use the PROV data model as a basic framework for modeling and providing runtime provenance as a database that can be queried even during the execution. This database is agnostic of SWfMS and workflow engine. We show the benefits of representing and sharing runtime provenance data for improving the experiment management as well as the analysis of the scientific data.

64 citations


Patent
14 Mar 2013
TL;DR: In this article, a computer-implemented method for managing a release of a software product includes obtaining a request for the release, the request including workflow action parameter data to define a release pipeline involving a plurality of software engineering systems configured to process data indicative of the software product, and executing, with a processor, a workflow to implement the release pipeline in accordance with the workflow action parameters.
Abstract: A computer-implemented method for managing a release of a software product includes obtaining a request for the release, the request including workflow action parameter data to define a release pipeline involving a plurality of software engineering systems configured to process data indicative of the software product, and executing, with a processor, a workflow to implement the release pipeline in accordance with the workflow action parameter data. Executing the workflow includes sending a series of instructions to the plurality of software engineering systems. A successive instruction in the series of instructions is sent based on whether a gating rule for the release is met.
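A hedged sketch of the gating behaviour described above: instructions are sent to a series of engineering systems only while a gating rule evaluates to true. This illustrates the stated idea only, not the patented implementation; the system names and the gating rule are hypothetical.

```python
# Minimal sketch: send release-pipeline instructions while the gating rule holds.

def run_release_pipeline(instructions, gate):
    """instructions: list of (system, command); gate: callable returning bool."""
    for system, command in instructions:
        if not gate():
            print("gating rule not met; halting pipeline")
            return False
        print(f"sending '{command}' to {system}")
    return True

tests_passed = True  # hypothetical gating condition
run_release_pipeline(
    [("build-server", "build"), ("test-farm", "run-tests"), ("cdn", "publish")],
    gate=lambda: tests_passed,
)
```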

Patent
06 Mar 2013
TL;DR: In this article, a system receives data storage workflow activities that include computer-executable instructions for carrying out data storage workflows in a network data storage system, and the system deploys the workflow to one or more workflow engines that can execute the various data storage activities related to the workflow.
Abstract: A system receives data storage workflow activities that include computer-executable instructions for carrying out data storage workflow in a network data storage system. Once the workflow is received, the system deploys the workflow to one or more workflow engines that can execute the various data storage activities related to the workflow. Prior to executing a data storage activity, the system can determine which workflow engine to use based on an allocation scheme.

Journal ArticleDOI
TL;DR: This work presents two cases demonstrating the potential value of patient-oriented workflow models, which define meaningful system boundaries and can lead to HIT implementations that are more consistent with cooperative work and its emergent features.

Journal ArticleDOI
TL;DR: The aim of this work is to propose an alternative approach for flexible process support within PLM systems that deals with a service-oriented perspective rather than an activity-oriented one.
Abstract: Manufacturing industries collaborating to develop new products need to implement effective management of their design processes (DPs) and product information. Unfortunately, product lifecycle management (PLM) systems, which are dedicated to supporting design activities, are not as efficient as might be expected. Indeed, DPs are changing, emergent and non-deterministic, due to the business environment under which they are carried out. PLM systems are currently based on workflow technology, which does not support process agility. So, flexibility in process support is needed to facilitate coupling with the reality of this environment. Furthermore, service-oriented approaches (SOA) enhance the flexibility and adaptability of composed solutions. Systems based on SOA have the ability to be inherently evolvable. So, we can say that SOA can promote support for flexible DPs. The aim of this work is to propose an alternative approach for flexible process support within PLM systems. The objective is to specify, design and implement business processes (BPs) in a very flexible way so that business changes can rapidly be considered in PLM solutions. Unlike existing approaches, the proposed one deals with a service-oriented perspective rather than an activity-oriented one.

Journal ArticleDOI
TL;DR: A large class of applications needs to execute the same workflow on different datasets of identical size; the scheduling problem that encompasses the resulting forms of parallelism is called pipelined workflow scheduling.
Abstract: A large class of applications need to execute the same workflow on different datasets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task, data, pipelined, and/or replicated parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors, or optimization goals. This article surveys the field by summing up and structuring known results and approaches.
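A tiny worked example of the trade-off this survey covers: in a pipelined execution with one processor per stage, the latency for a single dataset is the sum of the stage times, while steady-state throughput is limited by the slowest stage. The stage times below are hypothetical.

```python
# Minimal sketch: latency vs. throughput in a pipelined workflow (hypothetical stages).
stage_times = [2.0, 5.0, 3.0]   # seconds per dataset, one stage per processor

latency = sum(stage_times)      # time for one dataset to traverse the pipeline
period = max(stage_times)       # steady-state time between completed datasets

print(f"latency = {latency} s, period = {period} s, "
      f"throughput = {1 / period:.2f} datasets/s")
```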

BookDOI
01 Jan 2013
TL;DR: This book follows the loose programming paradigm, a novel approach to user-level workflow design, which makes essential use of constraint-driven workflow synthesis: Constraints provide high-level, declarative descriptions of individual components and entire workflows, which are then used to automatically translate the high-level specifications into concrete workflows that conform to the constraints by design.
Abstract: Just as driving a car needs no engineer, steering a computer should need no programmer and the development of user-specific software should be in the hands of the user. Service-oriented and model-based approaches have become the methods of choice for user-centric development of variant-rich workflows in many application domains. Formal methods can be integrated to further support the workflow development process at different levels. Particularly effective with regard to user-level workflow design are constraint-based methods, where the key role of constraints is to capture intents about the developed applications in the user’s specific domain language. This book follows the loose programming paradigm, a novel approach to user-level workflow design, which makes essential use of constraint-driven workflow synthesis: Constraints provide high-level, declarative descriptions of individual components and entire workflows. Process synthesis techniques are then used to automatically translate the high-level specifications into concrete workflows that conform to the constraints by design. Loose programming is moreover characterized by its unique holistic perspective on workflow development: being fully integrated into a mature process development framework, it profits seamlessly from the availability of various already established features and methods. In this book, the applicability of this framework is evaluated with a particular focus on the bioinformatics application domain. For this purpose, the first reference implementation of the loose programming paradigm is applied to a series of real-life bioinformatics workflow scenarios, whose different characteristics allow for a detailed evaluation of the features, capabilities, and limitations of the approach. The applications show that the proposed approach to constraint-driven design of variant-rich workflows enables the user to effectively create and manage software processes in his specific domain language and frees him from dealing with the technicalities of the individual services and their composition. Naturally, the quality of the synthesis solutions crucially depends on the provided domain model and on the applied synthesis strategy and constraints.

Journal ArticleDOI
01 Sep 2013
TL;DR: This work presents the fine-grained interoperability solution proposed in the SHIWA European project that brings together four representative European workflow systems: ASKALON, MOTEUR, WS-PGRADE, and Triana, and proposes a generic Interoperable Workflow Intermediate Representation (IWIR) that can be used as a common bridge for translating workflows between different languages independent of the underlying distributed computing infrastructure.
Abstract: Today there exists a wide variety of scientific workflow management systems, each designed to fulfill the needs of a certain scientific community. Unfortunately, once a workflow application has been designed in one particular system it becomes very hard to share it with users working with different systems. Portability of workflows and interoperability between current systems barely exists. In this work, we present the fine-grained interoperability solution proposed in the SHIWA European project that brings together four representative European workflow systems: ASKALON, MOTEUR, WS-PGRADE, and Triana. The proposed interoperability is realised at two levels of abstraction: abstract and concrete. At the abstract level, we propose a generic Interoperable Workflow Intermediate Representation (IWIR) that can be used as a common bridge for translating workflows between different languages independent of the underlying distributed computing infrastructure. At the concrete level, we propose a bundling technique that aggregates the abstract IWIR representation and concrete task representations to enable workflow instantiation, execution and scheduling. We illustrate case studies using two real-world workflow applications designed in a native environment and then translated and executed by a foreign workflow system in a foreign distributed computing infrastructure.

Proceedings ArticleDOI
22 Apr 2013
TL;DR: A broker-based framework for running workflows in a multi-Cloud environment that allows an automatic selection of the target Clouds, a uniform access to the Clouds, and workflow data management with respect to user Service Level Agreement (SLA) requirements is presented.
Abstract: Computational science workflows have been successfully run on traditional HPC systems like clusters and Grids for many years. Today, users are interested to execute their workflow applications in the Cloud to exploit the economic and technical benefits of this new emerging technology. The deployment and management of workflows over the current existing heterogeneous and not yet interoperable Cloud providers, however, is still a challenging task for the workflow developers. In this paper, we present a broker-based framework for running workflows in a multi-Cloud environment. The framework allows an automatic selection of the target Clouds, a uniform access to the Clouds, and workflow data management with respect to user Service Level Agreement (SLA) requirements. Following a simulation approach, we evaluated the framework with a real scientific workflow application in different deployment scenarios. The results show that our framework offers benefits to users by executing workflows with the expected performance and service quality at lowest cost.
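A minimal sketch of the broker's selection step as described above, not the paper's actual framework: filter Cloud offers against a user SLA requirement and pick the cheapest feasible one. The provider names, prices, and SLA field are hypothetical.

```python
# Minimal sketch: SLA-constrained, cost-minimizing Cloud selection (hypothetical data).
offers = [
    {"provider": "cloud-a", "price_per_hour": 0.12, "max_runtime_h": 10},
    {"provider": "cloud-b", "price_per_hour": 0.09, "max_runtime_h": 30},
    {"provider": "cloud-c", "price_per_hour": 0.20, "max_runtime_h": 5},
]
sla = {"deadline_h": 8}   # hypothetical user requirement

feasible = [o for o in offers if o["max_runtime_h"] >= sla["deadline_h"]]
best = min(feasible, key=lambda o: o["price_per_hour"])
print("selected:", best["provider"])   # cheapest offer that still meets the SLA
```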

Proceedings ArticleDOI
15 Jul 2013
TL;DR: This work evaluated FlowFixer on 16 broken workflows from 5 real-world GUI applications written in Java and found that it produced significantly better results than two alternative approaches.
Abstract: A workflow is a sequence of UI actions to complete a specific task. In the course of a GUI application's evolution, changes ranging from a simple GUI refactoring to a complete rearchitecture can break an end-user's well-established workflow. It can be challenging to find a replacement workflow. To address this problem, we present a technique (and its tool implementation, called FlowFixer) that repairs a broken workflow. FlowFixer uses dynamic profiling, static analysis, and random testing to suggest a replacement UI action that fixes a broken workflow. We evaluated FlowFixer on 16 broken workflows from 5 real-world GUI applications written in Java. In 13 workflows, the correct replacement action was FlowFixer's first suggestion. In 2 workflows, the correct replacement action was FlowFixer's second suggestion. The remaining workflow was unrepairable. Overall, FlowFixer produced significantly better results than two alternative approaches.

Patent
15 Mar 2013
TL;DR: In this paper, the authors present systems and methods for use in creating an evaluation workflow defining a multiple step evaluation process for use by one or more users variously involved in an evidence-based evaluation.
Abstract: Several embodiments provide systems and methods for use in creating an evaluation workflow defining a multiple step evaluation process for use by one or more users variously involved in an evidence-based evaluation. The systems and methods allow the user to define the evaluation workflow and store the evaluation workflow in a database, allow the user to add a plurality of assessments to the evaluation workflow and store the plurality of assessments in association with the evaluation workflow in the database, each assessment defining an evaluation event at a given point in time to be assessed as part of the evaluation process spanning an evaluation period of time, and allow the user to add one or more parts to each of the plurality of assessments and store the one or more parts in association with the plurality of assessments in the database.

Book ChapterDOI
01 Jan 2013
TL;DR: Some of the most important research efforts and results in modeling temporal aspects of workflows, analysis of temporal properties of workflow models, computation of workflow execution schedules, and minimization of exceptions due to violation of temporal constraints are summarized.
Abstract: Time is an important aspect of business process management. Here we revisit the following contributions of early workflow time management approaches: representation of temporal information and temporal constraints, analysis of temporal constraint satisfiability, and computation of workflow execution plans that satisfy temporal constraints. In particular, we summarize some of the most important research efforts and results in: (a) modeling temporal aspects of workflows, (b) analysis of temporal properties of workflow models, (c) computation of workflow execution schedules, (d) minimization of exceptions due to violation of temporal constraints, (e) monitoring of temporal workflow aspects, and (f) modeling and calculation of temporal properties for distributed workflows and for guaranteeing Quality of Service in Web-service composition.

Journal ArticleDOI
TL;DR: A performance evaluation of SciPhylomics executions in a real cloud environment using two parallel execution approaches (SciCumulus and Hadoop) on the Amazon EC2 cloud reinforces the benefits of parallelizing data for the phylogenomic inference workflow using MapReduce-like parallel approaches in the cloud.

Proceedings ArticleDOI
23 Dec 2013
TL;DR: This paper presents two general approaches: one that exclusively uses object stores to store all the files accessed and generated by a workflow, while the other relies on the shared filesystem for caching intermediate data sets.
Abstract: Scientific workflows consist of tasks that operate on input data to generate new data products that are used by subsequent tasks. Workflow management systems typically stage data to computational sites before invoking the necessary computations. In some cases data may be accessed using remote I/O. There are limitations with these approaches, however. First, the storage at a computational site may be limited and not able to accommodate the necessary input and intermediate data. Second, even if there is enough storage, it is sometimes managed by a filesystem with limited scalability. In recent years, object stores have been shown to provide a scalable way to store and access large datasets; however, they provide a limited set of operations (retrieve, store and delete) that do not always match the requirements of the workflow tasks. In this paper, we show how scientific workflows can take advantage of the capabilities of object stores without requiring users to modify their workflow-based applications or scientific codes. We present two general approaches: one that exclusively uses object stores to store all the files accessed and generated by a workflow, while the other relies on the shared filesystem for caching intermediate data sets. We have implemented both of these approaches in the Pegasus Workflow Management System and have used them to execute workflows in a variety of execution environments ranging from traditional supercomputing environments that have a shared filesystem to dynamic environments like Amazon AWS and the Open Science Grid that only offer remote object stores. As a result, Pegasus users can easily migrate their applications from a shared filesystem deployment to one using object stores without changing their application codes.
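For readers unfamiliar with the retrieve/store interface the paper relies on, the sketch below stages a workflow input through Amazon S3 with boto3. It is a generic illustration, not Pegasus's data-staging implementation; the bucket and key names are hypothetical, and valid AWS credentials plus a local input file are assumed.

```python
# Minimal sketch: store an input in an object store, then retrieve it at the
# compute site before the job runs. Bucket/key/file names are hypothetical.
import boto3

s3 = boto3.client("s3")
bucket, key = "my-workflow-bucket", "run42/input.dat"

# Store: upload an input file produced on the submit host.
s3.upload_file("input.dat", bucket, key)

# Retrieve: a job at the compute site pulls the object before executing.
s3.download_file(bucket, key, "/tmp/input.dat")
```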

Journal ArticleDOI
TL;DR: The paper presents implementation details of the multithreaded workflow execution engine implemented in JEE, and performs tests for three different optimization goals for two business and scientific workflow applications.
Abstract: The paper presents a complete solution for modeling scientific and business workflow applications, static and just-in-time QoS selection of services and workflow execution in a real environment. The workflow application is modeled as an acyclic directed graph where nodes denote tasks and edges denote dependencies between the tasks. The BeesyCluster middleware is used to allow providers to publish services from sequential or parallel applications, from their servers or clusters. Optimization algorithms are proposed to select a capable service for each task so that a global criterion is optimized, such as a product of workflow execution time and cost, a linear combination of those, or minimization of the time with a cost constraint. The paper presents implementation details of the multithreaded workflow execution engine implemented in JEE. Several tests were performed for three different optimization goals for two business and scientific workflow applications. Finally, the overhead of the solution is presented.
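The selection criterion "product of workflow execution time and cost" can be illustrated with an exhaustive search over a toy two-task workflow. This is a sketch of the optimization goal only, not the BeesyCluster algorithms; the candidate services and their time/cost figures are hypothetical.

```python
# Minimal sketch: pick one service per task to minimize (total time * total cost).
from itertools import product

candidates = {
    "task1": [("s1a", 10, 2.0), ("s1b", 6, 5.0)],   # (service, time, cost)
    "task2": [("s2a", 8, 1.0), ("s2b", 4, 4.0)],
}

best = min(
    product(*candidates.values()),
    key=lambda combo: sum(t for _, t, _ in combo) * sum(c for _, _, c in combo),
)
total_time = sum(t for _, t, _ in best)
total_cost = sum(c for _, _, c in best)
print([name for name, _, _ in best], "time*cost =", total_time * total_cost)
# -> ['s1a', 's2a'] time*cost = 54.0
```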

Journal ArticleDOI
TL;DR: A Petri net based approach for resource requirements analysis, which can be used for more general purposes, is proposed; the concept of resource-oriented workflow nets (ROWN) is introduced and the transition firing rules of ROWN are presented.
Abstract: Petri nets are a powerful formalism in modeling workflows. A workflow determines the flow of work according to a pre-defined business process. In many situations, business processes are constrained by scarce resources. The lack of resources can cause contention, the need for some tasks to wait for others to complete, which slows down the accomplishment of larger goals. In our previous work, a resource-constrained workflow model was introduced and a resource requirement analysis approach was developed for emergency response workflows, in which support of on-the-fly workflow change is critical [14]. In this paper, we propose a Petri net based approach for resource requirements analysis, which can be used for more general purposes. The concept of resource-oriented workflow nets (ROWN) is introduced and the transition firing rules of ROWN are presented. Resource requirements for general workflows can be determined through reachability analysis. An efficient resource analysis algorithm is developed for a class of well-structured workflows, in which, once a task execution is started, it is guaranteed to finish successfully. For a task that may fail in the middle of execution, an equivalent non-failing task model in terms of resource consumption is developed.
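The firing rule of a resource-aware net can be sketched in a few lines: a transition is enabled only if all of its input places, including resource places, hold enough tokens, and firing moves those tokens. This illustrates the general idea rather than the ROWN formalism; the marking, place names, and weights are hypothetical.

```python
# Minimal sketch: enabledness and firing of a transition in a resource-aware
# Petri net (hypothetical emergency-response marking).
marking = {"ready": 1, "ambulance": 2, "dispatched": 0}

def fire(marking, consume, produce):
    if any(marking[p] < n for p, n in consume.items()):
        return False                      # not enabled: a required token is missing
    for p, n in consume.items():
        marking[p] -= n
    for p, n in produce.items():
        marking[p] += n
    return True

# "dispatch" needs one case token and one ambulance (resource) token.
print(fire(marking, consume={"ready": 1, "ambulance": 1}, produce={"dispatched": 1}))
print(marking)   # the ambulance resource token has been consumed
```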

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper proposes an approach to automatically obtain abstractions from low-level provenance data by finding common workflow fragments in workflow execution provenance and relating them to templates, and shows that by using these kinds of abstractions the authors can highlight the most common abstract methods used in the executions of a repository.
Abstract: Provenance plays a major role when understanding and reusing the methods applied in a scientific experiment, as it provides a record of inputs, the processes carried out and the use and generation of intermediate and final results. In the specific case of in-silico scientific experiments, a large variety of scientific workflow systems (e.g., Wings, Taverna, Galaxy, Vistrails) have been created to support scientists. All of these systems produce some sort of provenance about the executions of the workflows that encode scientific experiments. However, provenance is normally recorded at a very low level of detail, which complicates the understanding of what happened during execution. In this paper we propose an approach to automatically obtain abstractions from low-level provenance data by finding common workflow fragments on workflow execution provenance and relating them to templates. We have tested our approach with a dataset of workflows published by the Wings workflow system. Our results show that by using these kinds of abstractions we can highlight the most common abstract methods used in the executions of a repository, relating different runs and workflow templates with each other.

Proceedings Article
01 Aug 2013
TL;DR: This work introduces the proposed five step workflow for creating information extractors, the graph query based rule language, as well as the core features of the PROPMINER tool.
Abstract: The use of deep syntactic information such as typed dependencies has been shown to be very effective in Information Extraction. Despite this potential, the process of manually creating rule-based information extractors that operate on dependency trees is not intuitive for persons without an extensive NLP background. In this system demonstration, we present a tool and a workflow designed to enable initiate users to interactively explore the effect and expressivity of creating Information Extraction rules over dependency trees. We introduce the proposed five step workflow for creating information extractors, the graph query based rule language, as well as the core features of the PROPMINER tool.
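To give a flavour of pattern matching over typed dependencies (not the PROPMINER rule language itself), the sketch below extracts subject-verb-object triples from a dependency tree stored as (head, dependent, label) edges. The example sentence and the use of nsubj/dobj labels are illustrative assumptions.

```python
# Minimal sketch: match a subject-verb-object pattern over typed dependency edges.
edges = [("acquired", "Google", "nsubj"), ("acquired", "YouTube", "dobj")]

def match_svo(edges):
    subjects = {head: dep for head, dep, label in edges if label == "nsubj"}
    objects = {head: dep for head, dep, label in edges if label == "dobj"}
    return [(subjects[v], v, objects[v]) for v in subjects if v in objects]

print(match_svo(edges))   # [('Google', 'acquired', 'YouTube')]
```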

Journal ArticleDOI
TL;DR: A comprehensive framework tailored for flexible human-centric healthcare processes that improves the reliability of activity recognition data and presents a set of mechanisms that exploit the application knowledge encoded in workflows in order to reduce the uncertainty of this data, thus enabling unobtrusive robust healthcare workflows.
Abstract: Processes in the healthcare domain are characterized by coarsely predefined recurring procedures that are flexibly adapted by the personnel to suit specific situations. In this setting, a workflow management system that gives guidance and documents the personnel's actions can lead to a higher quality of care, fewer mistakes, and higher efficiency. However, most existing workflow management systems enforce rigid inflexible workflows and rely on direct manual input. Both are inadequate for healthcare processes. In particular, direct manual input is not possible in most cases since (1) it would distract the personnel even in critical situations and (2) it would violate fundamental hygiene principles by requiring disinfected doctors and nurses to touch input devices. The solution could be activity recognition systems that use sensor data (e.g., audio and acceleration data) to infer the current activities of the personnel and provide input to a workflow (e.g., informing it that a certain activity is finished now). However, state-of-the-art activity recognition technologies have difficulties in providing reliable information. We describe a comprehensive framework tailored for flexible human-centric healthcare processes that improves the reliability of activity recognition data. We present a set of mechanisms that exploit the application knowledge encoded in workflows in order to reduce the uncertainty of this data, thus enabling unobtrusive robust healthcare workflows. We evaluate our work based on a real-world case study and show that the robustness of unobtrusive healthcare workflows can be increased to an absolute value of up to 91% (compared to only 12% with a classical workflow system). This is a major breakthrough that paves the way towards future IT-enabled healthcare systems.
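One of the mechanisms described, using workflow knowledge to constrain noisy activity-recognition output, can be sketched as a simple filter over classifier scores. The scores, activity names, and allowed-transition table below are hypothetical; this is not the paper's framework.

```python
# Minimal sketch: keep only recognition hypotheses that the workflow allows next.
recognition_scores = {"suture": 0.41, "disinfect": 0.38, "write_report": 0.21}
allowed_next = {"incision": {"suture", "disinfect"}}   # per current workflow step

current_step = "incision"
plausible = {a: s for a, s in recognition_scores.items()
             if a in allowed_next[current_step]}
best = max(plausible, key=plausible.get)
print("accepted activity:", best)   # 'write_report' is ruled out by the workflow
```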

Proceedings ArticleDOI
27 Jun 2013
TL;DR: This paper proposes the generation of workflow description summaries in order to tackle workflow complexity, elaborates reduction primitives for summarizing workflows, and shows how primitives, as building blocks, can be used in conjunction with semantic workflow annotations to encode different summarization strategies.
Abstract: Scientific workflows have become the workhorse of Big Data analytics for scientists. As well as being repeatable and optimizable pipelines that bring together datasets and analysis tools, workflows make up an important part of the provenance of data generated from their execution. By faithfully capturing all stages in the analysis, workflows play a critical part in building up the audit-trail (a.k.a. provenance) metadata for derived datasets and contribute to the veracity of results. Provenance is essential for reporting results, reporting the method followed, and adapting to changes in the datasets or tools. These functions, however, are hampered by the complexity of workflows and consequently the complexity of the data-trails generated from their instrumented execution. In this paper we propose the generation of workflow description summaries in order to tackle workflow complexity. We elaborate reduction primitives for summarizing workflows, and show how primitives, as building blocks, can be used in conjunction with semantic workflow annotations to encode different summarization strategies. We report on the effectiveness of the method through experimental evaluation using real-world workflows from the Taverna system.
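A single reduction primitive of the kind described can be sketched as collapsing a chain of steps into one summary node. This is an illustrative stand-in, not the paper's primitives or Taverna's annotations; the workflow steps and the chain chosen for collapsing are hypothetical.

```python
# Minimal sketch: collapse a chain of workflow steps into a single summary node.
workflow = ["load", "convert", "normalise", "align", "plot"]
chain_to_collapse = ("convert", "normalise", "align")   # steps treated as one method

summary = []
i = 0
while i < len(workflow):
    if tuple(workflow[i:i + len(chain_to_collapse)]) == chain_to_collapse:
        summary.append("preprocess+align")               # summary label for the chain
        i += len(chain_to_collapse)
    else:
        summary.append(workflow[i])
        i += 1
print(summary)   # ['load', 'preprocess+align', 'plot']
```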

Journal ArticleDOI
TL;DR: This paper bridges the gap between provenance and geo-processing workflow through extending both workflow language and service interface, making it possible for the automatic capture of provenance information in the geospatial web service environment.
Abstract: Data provenance, also called data lineage, records the derivation history of a data product. In the earth science domain, geospatial data provenance is important because it plays a significant role in data quality and usability evaluation, data trail auditing, workflow replication, and product reproducibility. The generation of geospatial provenance metadata is usually coupled with the execution of a geo-processing workflow. Their symbiotic relationship makes them complementary to each other and promises great benefit once they are integrated. However, the heterogeneity of data and computing resources in the distributed environment constructed under the service-oriented architecture (SOA) brings a great challenge to resource integration. Specifically, issues such as the lack of interoperability and compatibility among provenance metadata models and between provenance and workflow create obstacles for the integration of provenance and geo-processing workflows. In order to tackle these issues, on the one hand, this paper addresses provenance heterogeneity by recording provenance information in a standard lineage model defined in the ISO 19115:2003 and ISO 19115-2:2009 standards. On the other hand, this paper bridges the gap between provenance and geo-processing workflow by extending both the workflow language and the service interface, making it possible to automatically capture provenance information in the geospatial web service environment. The proposed method is implemented in GeoBrain, a SOA-based geospatial web service system. Testing results from the implementation show that geospatial provenance information is successfully captured throughout the life cycle of geo-processing workflows and properly recorded in the ISO standard lineage model.