Topic

Workflow

About: Workflow is a research topic. Over its lifetime, 31,996 publications have been published within this topic, receiving 498,339 citations.


Papers
Journal ArticleDOI
TL;DR: This paper proposes to use workflow management as a mechanism to facilitate the teamwork in a collaborative product development environment where remote Web-based Decision Support Systems (TeleDSS) are extensively used by team members who are geographically distributed.
Abstract: Product development is collaborative, involving multi-disciplinary functions and heterogeneous tools. Teamwork is essential and depends on seamless tool integration and better co-ordination of human activities. This paper proposes to use workflow management as a mechanism to facilitate teamwork in a collaborative product development environment where remote Web-based Decision Support Systems (TeleDSS) are extensively used by team members who are geographically distributed. The workflow of a project is modelled as a network. Its nodes correspond to the work (packages), and its edges to flows of control and data. The concept of agents is introduced to define nodes and the concept of messages to define edges. As a sandwich layer, agents act as special-purpose application clients for the remote TeleDSS application servers. Agents are delegated to manipulate the corresponding TeleDSS on behalf of their human users. Details of the proceedings are recorded by agents as their properties for future reference or shared use by other team members. Through flow messages, agents are able to share input and output data and request remote services. One of the major contributions is that agents, once defined, can be reused for different projects without any changes.

118 citations
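
The network model in the abstract above (agents as nodes, messages as edges) can be illustrated with a minimal sketch. Everything named here (`Agent`, `Workflow`, `connect`, `execute`, and the three example work packages) is hypothetical and not taken from the paper; a plain Python callable stands in for a call to a remote TeleDSS server, and agents run in dependency order.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """A workflow node. `run` is a stand-in for driving a remote TeleDSS."""
    name: str
    run: Callable[[Dict[str, object]], object]  # inputs by sender name -> output
    log: List[str] = field(default_factory=list)  # recorded proceedings


@dataclass
class Workflow:
    agents: Dict[str, Agent] = field(default_factory=dict)
    edges: Dict[str, List[str]] = field(default_factory=dict)  # sender -> receivers

    def add_agent(self, agent: Agent) -> None:
        self.agents[agent.name] = agent
        self.edges.setdefault(agent.name, [])

    def connect(self, src: str, dst: str) -> None:
        """Declare that `src` sends its output to `dst` as a flow message."""
        self.edges[src].append(dst)

    def execute(self) -> Dict[str, object]:
        # Count, for every agent, how many messages it still waits for.
        pending = {name: 0 for name in self.agents}
        for receivers in self.edges.values():
            for dst in receivers:
                pending[dst] += 1
        inbox: Dict[str, Dict[str, object]] = {name: {} for name in self.agents}
        ready = deque(name for name, count in pending.items() if count == 0)
        results: Dict[str, object] = {}
        while ready:
            name = ready.popleft()
            agent = self.agents[name]
            output = agent.run(inbox[name])
            agent.log.append(f"ran with inputs from {sorted(inbox[name])}")
            results[name] = output
            for dst in self.edges[name]:
                inbox[dst][name] = output  # deliver the flow message
                pending[dst] -= 1
                if pending[dst] == 0:
                    ready.append(dst)
        return results


# Hypothetical three-package project: design feeds cost, both feed review.
wf = Workflow()
wf.add_agent(Agent("design", lambda inputs: 10))
wf.add_agent(Agent("cost", lambda inputs: inputs["design"] * 2))
wf.add_agent(Agent("review", lambda inputs: inputs["design"] + inputs["cost"]))
wf.connect("design", "cost")
wf.connect("design", "review")
wf.connect("cost", "review")
results = wf.execute()
```

Because each agent only sees the messages in its inbox, the same `Agent` definition could be added to a different `Workflow` unchanged, which loosely mirrors the reuse claim in the abstract.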

Journal ArticleDOI
01 Aug 2011
TL;DR: This work proposes an algebraic approach (inspired by relational algebra) and a parallel execution model that enable automatic optimization of scientific workflows and demonstrates performance improvements of up to 226% compared to an ad-hoc workflow implementation.
Abstract: Scientific workflows have emerged as a basic abstraction for structuring and executing scientific experiments in computational environments. In many situations, these workflows are computationally and data intensive, thus requiring execution in large-scale parallel computers. However, parallelization of scientific workflows remains low-level, ad-hoc and labor-intensive, which makes it hard to exploit optimization opportunities. To address this problem, we propose an algebraic approach (inspired by relational algebra) and a parallel execution model that enable automatic optimization of scientific workflows. We conducted a thorough validation of our approach using both a real oil exploitation application and synthetic data scenarios. The experiments were run in Chiron, a data-centric scientific workflow engine implemented to support our algebraic approach. Our experiments demonstrate performance improvements of up to 226% compared to an ad-hoc workflow implementation.

118 citations

Journal ArticleDOI
TL;DR: A specification language with the ability to express resources and resource allocation constraints and a scheduler module that contains a constraint solver in order to find correct resource assignments are core and novel parts of this architecture.

118 citations

Posted Content
TL;DR: ReStore is a system, implemented as an extension to the Pig dataflow system on top of Hadoop, that manages the storage and reuse of intermediate results from workflows of MapReduce jobs; it can also create additional reuse opportunities by materializing and storing the output of query execution operators that are executed within a MapReduce job.
Abstract: Analyzing large scale data has emerged as an important activity for many organizations in the past few years. This large scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop. Users of MapReduce often have analysis tasks that are too complex to express as individual MapReduce jobs. Instead, they use high-level query languages such as Pig, Hive, or Jaql to express their complex tasks. The compilers of these languages translate queries into workflows of MapReduce jobs. Each job in these workflows reads its input from the distributed file system used by the MapReduce system and produces output that is stored in this distributed file system and read as input by the next job in the workflow. The current practice is to delete these intermediate results from the distributed file system at the end of executing the workflow. One way to improve the performance of workflows of MapReduce jobs is to keep these intermediate results and reuse them for future workflows submitted to the system. In this paper, we present ReStore, a system that manages the storage and reuse of such intermediate results. ReStore can reuse the output of whole MapReduce jobs that are part of a workflow, and it can also create additional reuse opportunities by materializing and storing the output of query execution operators that are executed within a MapReduce job. We have implemented ReStore as an extension to the Pig dataflow system on top of Hadoop, and we experimentally demonstrate significant speedups on queries from the PigMix benchmark.

118 citations
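
The reuse idea in the abstract above (keep intermediate job outputs and serve them to later workflows) can be sketched as a small in-memory store. This is a simplification under stated assumptions, not ReStore's actual matching or storage logic: `ResultStore`, `run_job`, and `run_workflow` are hypothetical names, jobs are plain Python functions rather than MapReduce jobs, and a job is assumed to be fully identified by its operation name plus the keys of its inputs.

```python
import hashlib
import json
from typing import Callable, Dict, List, Sequence, Tuple


class ResultStore:
    """Keeps job outputs keyed by a fingerprint of the job and its inputs."""

    def __init__(self) -> None:
        self._store: Dict[str, object] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def fingerprint(op_name: str, input_keys: Sequence[str]) -> str:
        # Assumption: the same op name on the same inputs yields the same output.
        payload = json.dumps([op_name, list(input_keys)])
        return hashlib.sha256(payload.encode()).hexdigest()

    def run_job(self, op_name: str, fn: Callable, input_keys: Sequence[str],
                inputs: Sequence[object]) -> Tuple[str, object]:
        key = self.fingerprint(op_name, input_keys)
        if key in self._store:
            self.hits += 1  # reuse the stored intermediate result
        else:
            self.misses += 1
            self._store[key] = fn(*inputs)  # execute and keep the output
        return key, self._store[key]


def run_workflow(store: ResultStore, source_key: str, data: object,
                 jobs: List[Tuple[str, Callable]]) -> object:
    """Run a linear chain of jobs, each reading the previous job's output."""
    key, value = source_key, data
    for op_name, fn in jobs:
        key, value = store.run_job(op_name, fn, [key], [value])
    return value


# Two hypothetical workflows that share their first job.
store = ResultStore()
data = [3, 1, 2, 3]
dedupe = ("dedupe", lambda rows: sorted(set(rows)))
total = ("sum", lambda rows: sum(rows))
largest = ("max", lambda rows: max(rows))
first = run_workflow(store, "dataset-v1", data, [dedupe, total])
second = run_workflow(store, "dataset-v1", data, [dedupe, largest])
```

In this sketch the second workflow executes only its final job, because the shared `dedupe` step is served from the store. A real system would also need to decide which results are worth keeping and when to evict them, which this example leaves out.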

Patent
11 Apr 2005
TL;DR: This patent presents a system and method of modeling and evaluating workflows that provides workflow auto-generation and Hierarchical Dependence Graphs for workflows. It works by accessing a knowledge database containing service descriptions, generating valid workflow models, simulating workflows, and obtaining customer requirements through a Graphical User Interface.
Abstract: A system and method of modeling and evaluating workflows that provides workflow auto-generation and Hierarchical Dependence Graphs for workflows. Modeling and evaluation of workflows is accomplished by accessing a knowledge database (2) containing service descriptions, generating valid workflow models (4), simulating workflows (6) and obtaining customer requirements through a Graphical User Interface (8). This system and method generate and display workflows that satisfy a user's requirements. In addition, Hierarchical Dependence Graphs provide abstract views that support additional analysis and control of workflows.

118 citations


Network Information
Related Topics (5)
Software
130.5K papers, 2M citations
89% related
Information system
107.5K papers, 1.8M citations
84% related
The Internet
213.2K papers, 3.8M citations
82% related
Deep learning
79.8K papers, 2.1M citations
82% related
Cluster analysis
146.5K papers, 2.9M citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    4,414
2022    9,010
2021    1,461
2020    1,579
2019    1,702