Topic

Workflow

About: Workflow is a research topic. Over the lifetime, 31996 publications have been published within this topic receiving 498339 citations.


Papers
Journal Article
TL;DR: This study develops an unceRtainty-aware Online Scheduling Algorithm (ROSA) to schedule dynamic and multiple workflows with deadlines. In simulation, ROSA performs better than the five compared algorithms with respect to costs, deviation, resource utilization, and fairness.
Abstract: Scheduling workflows in cloud service environments has attracted great interest, and various approaches have been reported. However, these approaches often ignore the uncertainties in the scheduling environment, such as uncertain task start/execution/finish times, uncertain data transfer times among tasks, and the sudden arrival of new workflows. Ignoring these uncertain factors often leads to violations of workflow deadlines and increases the service renting costs of executing workflows. This study is devoted to improving the performance of cloud service platforms by minimizing uncertainty propagation when scheduling workflow applications that have both uncertain task execution times and uncertain data transfer times. Specifically, a novel scheduling architecture is designed to control the count of workflow tasks directly waiting on each service instance (e.g., a virtual machine or container). Once a task is completed, its start/execution/finish times are known, so its uncertainties disappear and no longer affect the subsequent waiting tasks on the same service instance. Thus, controlling the count of waiting tasks on service instances can inhibit the propagation of uncertainties. Based on this architecture, we develop an unceRtainty-aware Online Scheduling Algorithm (ROSA) to schedule dynamic and multiple workflows with deadlines. ROSA integrates both proactive and reactive strategies. During the execution of the generated baseline schedules, the reactive strategy in ROSA is dynamically called to produce new proactive baseline schedules for dealing with uncertainties. Then, on the basis of real-world workflow traces, five groups of simulation experiments are carried out to compare ROSA with five typical algorithms. The comparison results reveal that ROSA performs better than the five compared algorithms with respect to costs (up to 56 percent), deviation (up to 70 percent), resource utilization (up to 37 percent), and fairness (up to 37 percent).
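The architectural idea in this abstract, capping how many tasks wait directly on each service instance, can be illustrated with a minimal sketch. This is not the ROSA algorithm itself: the class names `ServiceInstance` and `Dispatcher`, the `max_waiting` parameter, and the first-come dispatch order are all assumptions made for illustration. Tasks that do not fit on an instance stay in a shared pool until a completion frees a slot.

```python
from collections import deque


class ServiceInstance:
    """A VM or container with a bounded queue of directly waiting tasks (hypothetical)."""

    def __init__(self, name, max_waiting=1):
        self.name = name
        self.max_waiting = max_waiting  # assumed cap on directly waiting tasks
        self.running = None
        self.waiting = deque()

    def has_room(self):
        # Room exists if the instance is idle or its waiting queue is below the cap.
        return self.running is None or len(self.waiting) < self.max_waiting

    def accept(self, task):
        if self.running is None:
            self.running = task
        else:
            self.waiting.append(task)

    def complete(self):
        # The finished task's timing is now known; promote the next waiting task.
        done = self.running
        self.running = self.waiting.popleft() if self.waiting else None
        return done


class Dispatcher:
    """Holds ready tasks in a shared pool and places them only when an instance has room."""

    def __init__(self, instances):
        self.instances = instances
        self.pool = deque()

    def submit(self, task):
        self.pool.append(task)
        self.dispatch()

    def dispatch(self):
        for inst in self.instances:
            while self.pool and inst.has_room():
                inst.accept(self.pool.popleft())

    def on_complete(self, inst):
        done = inst.complete()
        self.dispatch()  # a freed slot lets one more pooled task move to an instance
        return done


vm = ServiceInstance("vm1", max_waiting=1)
dispatcher = Dispatcher([vm])
for task in ["t1", "t2", "t3", "t4"]:
    dispatcher.submit(task)
```

With a cap of one, only `t2` waits behind the running `t1`; `t3` and `t4` stay in the pool, so a delay in `t1` can directly affect at most one queued task on that instance.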

116 citations

Journal Article
TL;DR: This paper delves into the logical optimization of ETL processes, modeling it as a state-space search problem, and provides one exhaustive and two heuristic algorithms for minimizing the execution cost of an ETL workflow.
Abstract: Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a data warehouse. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions. Moreover, we provide one exhaustive and two heuristic algorithms for minimizing the execution cost of an ETL workflow. The heuristic algorithm with greedy characteristics significantly outperforms the other two algorithms for a large set of experimental cases.
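The state-space framing in this abstract can be sketched as a small greedy search. This is a hedged illustration, not the paper's algorithm: it assumes a linear workflow, a toy cost model (rows processed times a per-row cost, with each activity passing on a `selectivity` fraction of its input), and a single transition type (swapping two adjacent activities) that is treated as always valid. The paper restricts itself to correct transitions, which this sketch does not check. All function names are hypothetical.

```python
def workflow_cost(workflow, input_rows=1000.0):
    """Toy cost model (assumption): each activity pays cost_per_row for every input row."""
    rows, total = input_rows, 0.0
    for _name, selectivity, cost_per_row in workflow:
        total += rows * cost_per_row
        rows *= selectivity  # only this fraction of rows reaches the next activity
    return total


def neighbors(workflow):
    """States reachable by swapping two adjacent activities (assumed always valid here)."""
    for i in range(len(workflow) - 1):
        swapped = list(workflow)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield tuple(swapped)


def greedy_optimize(workflow):
    """Hill climbing: move to the cheapest neighboring state until none is cheaper."""
    current = tuple(workflow)
    current_cost = workflow_cost(current)
    while True:
        best = min(neighbors(current), key=workflow_cost, default=None)
        if best is None or workflow_cost(best) >= current_cost:
            return current, current_cost
        current, current_cost = best, workflow_cost(best)


# Each activity is (name, selectivity, cost_per_row); the values are made up.
initial = (("convert", 1.0, 2.0), ("filter", 0.5, 1.0))
optimized, cost = greedy_optimize(initial)
```

Under these assumed numbers the search moves the selective `filter` ahead of the more expensive `convert`, since fewer rows then reach the costly step. An exhaustive variant would enumerate every reachable ordering instead of following only the best local swap.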

115 citations

Patent
11 Sep 2006
TL;DR: In this article, the authors propose a data elevation architecture for automatically and dynamically surfacing to a user interface (UI) context-specific data based on specific workflow or content currently being worked on by a user.
Abstract: Data elevation architecture for automatically and dynamically surfacing to a user interface (UI) context-specific data based on specific workflow or content currently being worked on by a user. Data is broken down into data elements and stored at a data element level in a data catalog using metadata, attributes, and relationships. Data elements are automatically selected from a comprehensive collection of the data catalogs based on relevancy and correlation to the current user task. The data catalog stores and relates the data elements and metadata based on criteria specified by content matching based on business terms or specified in a business process in predefined relationships between forms or specified by the user as correlated. The UI displays the data automatically in forms dynamically selected, populated, and presented at the point of focus or user activity so that the user can interact or take action immediately.

115 citations

Journal Article
TL;DR: This work applies knowledge management (KM) to guarantee SLAs and low resource wastage in Clouds. It designs and implements two methods, Case-Based Reasoning (CBR) and a rule-based approach, which prove feasible as KM techniques and show major improvements towards CBR.

115 citations

Journal Issue
TL;DR: This experience paper describes lessons learnt during the development of the Taverna Workbench, which the myGrid project built for the composition and execution of workflows for the life sciences community.
Abstract: Life sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico experiments undertaken by these research groups can be represented as workflows involving the co-ordinated use of analysis programs and information repositories that may be globally distributed. With regard to Grid computing, the requirements relate to the sharing of analysis and information resources rather than sharing computational power. The myGrid project has developed the Taverna Workbench for the composition and execution of workflows for the life sciences community. This experience paper describes lessons learnt during the development of Taverna. A common theme is the importance of understanding how workflows fit into the scientists' experimental context. The lessons reflect an evolving understanding of life scientists' requirements on a workflow environment, which is relevant to other areas of data-intensive and exploratory science. Copyright © 2005 John Wiley & Sons, Ltd.

115 citations


Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 89% related
Information system: 107.5K papers, 1.8M citations, 84% related
The Internet: 213.2K papers, 3.8M citations, 82% related
Deep learning: 79.8K papers, 2.1M citations, 82% related
Cluster analysis: 146.5K papers, 2.9M citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    4,414
2022    9,010
2021    1,461
2020    1,579
2019    1,702