Topic

Workflow

About: Workflow is a research topic. Over its lifetime, 31,996 publications have been published within this topic, receiving 498,339 citations.


Papers
Book Chapter
25 Mar 2004
TL;DR: This paper defines a generic framework for transforming heterogeneous data within scientific workflows. The framework relies on a formalized ontology that serves as a simple, unstructured global schema.
Abstract: Ecologists spend considerable effort integrating heterogeneous data for statistical analyses and simulations, for example, to run and test predictive models. Our research is focused on reducing this effort by providing data integration and transformation tools, allowing researchers to focus on “real science,” that is, discovering new knowledge through analysis and modeling. This paper defines a generic framework for transforming heterogeneous data within scientific workflows. Our approach relies on a formalized ontology, which serves as a simple, unstructured global schema. In the framework, inputs and outputs of services within scientific workflows can have structural types and separate semantic types (expressions of the target ontology). In addition, a registration mapping can be defined to relate input and output structural types to their corresponding semantic types. Using registration mappings, appropriate data transformations can then be generated for each desired service composition. Here, we describe our proposed framework and an initial implementation for services that consume and produce XML data.

148 citations
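
The registration-mapping idea in the abstract above can be illustrated with a minimal Python sketch. It assumes that records are plain dictionaries rather than XML, that a registration mapping is a dictionary from structural field names to ontology terms, and that two fields match only when their ontology terms are identical. The function name `generate_transformation`, the field names, and the ontology terms are all hypothetical; the paper's actual framework is not reproduced here.

```python
from typing import Callable, Dict

# Hypothetical registration mapping: structural field name -> ontology term.
Registration = Dict[str, str]


def generate_transformation(
    producer_out: Registration, consumer_in: Registration
) -> Callable[[dict], dict]:
    """Build a record transformation by matching fields on shared semantic types."""
    # Invert the producer mapping: ontology term -> producer field.
    by_term = {term: field for field, term in producer_out.items()}

    # Plan which producer field feeds each consumer field.
    plan = {}
    for field, term in consumer_in.items():
        if term not in by_term:
            raise ValueError(f"no producer field registered for {term!r}")
        plan[field] = by_term[term]

    def transform(record: dict) -> dict:
        # Rename and select fields according to the plan.
        return {target: record[source] for target, source in plan.items()}

    return transform


# Illustrative mappings (assumed, not taken from the paper).
producer_out = {"spp": "Species", "cnt": "Abundance", "site_id": "Location"}
consumer_in = {"taxon": "Species", "count": "Abundance"}

transform = generate_transformation(producer_out, consumer_in)
result = transform({"spp": "Quercus alba", "cnt": 12, "site_id": "A3"})
print(result)  # {'taxon': 'Quercus alba', 'count': 12}
```

Exact term matching is the simplest possible choice; a fuller treatment would presumably reason over the ontology instead of comparing strings.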

Journal Article
TL;DR: This article gives an overview of the algorithms implemented in the InWoLvE workflow mining system, summarizes the most important results of their experimental evaluation, and presents the experience gained in the first industrial application of InWoLvE.

148 citations

Patent
Steven D. Gadol
08 Dec 1995
TL;DR: A system and method for automating workflow by distributing the tasks required to execute the workflow over servers and clients connected on a network. The stages of the workflow are performed asynchronously: once a workflow started by a user has been initiated by a database server, the stages can be executed on the respective network clients without further interaction with the server.
Abstract: A system and method for automating workflow by distributing the tasks required for the execution of said workflow over servers and clients connected on a network. The disclosed system and method allow the stages of the workflow to be performed asynchronously, meaning that, once a workflow initiated by a user has been initiated by a database server, the stages of the workflow can be executed on respective network clients without further interaction with the server (i.e., without requiring a stateful connection between the clients and servers). This is accomplished through the use of a workflow courier that embodies all programs (encompassing rules governing the execution of the workflow) and forms needed by clients to complete stages of the workflow. The workflow courier also stores workflow state information that indicates which stages of the workflow have been completed. The executable programs are written in the platform-independent Java programming language and are therefore executable on any computer that has an installed Java browser. After each stage is executed, the client executing that stage updates the workflow courier and transmits the updated workflow courier to a client having an associated user who is authorized to perform the next step in the workflow. The updated state information indicates to the recipient of the workflow which stages remain to be completed.

148 citations
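
The workflow courier described in the abstract above can be sketched as a self-contained object that carries the stage logic, the form data, and the completion state from one client to the next. The sketch below is in Python, not the Java the patent describes, and every name in it (`Stage`, `WorkflowCourier`, `execute_next`, the users and stages) is a hypothetical illustration, not the patented design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Stage:
    name: str
    assignee: str                          # user authorized to perform this stage
    run: Callable[[Dict[str, str]], None]  # stage logic carried inside the courier


@dataclass
class WorkflowCourier:
    stages: List[Stage]
    form: Dict[str, str] = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)  # workflow state

    def next_stage(self) -> Optional[Stage]:
        """Return the first stage that has not been completed, if any."""
        for stage in self.stages:
            if stage.name not in self.completed:
                return stage
        return None

    def execute_next(self, user: str) -> Optional[str]:
        """Run the next stage as `user`; return the next recipient, or None when done."""
        stage = self.next_stage()
        if stage is None:
            return None
        if stage.assignee != user:
            raise PermissionError(f"{user} may not perform stage {stage.name!r}")
        stage.run(self.form)
        self.completed.append(stage.name)  # update state before forwarding
        upcoming = self.next_stage()
        return upcoming.assignee if upcoming else None


def fill(form: Dict[str, str]) -> None:
    form["amount"] = "120"


def approve(form: Dict[str, str]) -> None:
    form["approved"] = "yes"


courier = WorkflowCourier([Stage("submit", "alice", fill),
                           Stage("approve", "bob", approve)])
recipient = courier.execute_next("alice")  # courier would now be sent to this user
final = courier.execute_next("bob")
print(recipient, final, courier.completed)
```

Because the courier holds both the rules and the state, no server lookup is needed between stages in this sketch; the transport between clients is left out entirely.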

Journal Article
TL;DR: Apollo is an open source software package that enables researchers to efficiently inspect and refine the precise structure and role of genomic features in a graphical browser-based platform. It lets distributed users simultaneously edit the same encoded features while instantly seeing the updates made by other researchers on the same region.
Abstract: Genome annotation is the process of identifying the location and function of a genome's encoded features. Improving the biological accuracy of annotation is a complex and iterative process requiring researchers to review and incorporate multiple sources of information such as transcriptome alignments, predictive models based on sequence profiles, and comparisons to features found in related organisms. Because rapidly decreasing costs are enabling an ever-growing number of scientists to incorporate sequencing as a routine laboratory technique, there is widespread demand for tools that can assist in the deliberative analytical review of genomic information. To this end, we present Apollo, an open source software package that enables researchers to efficiently inspect and refine the precise structure and role of genomic features in a graphical browser-based platform. Some of Apollo's newer user interface features include support for real-time collaboration, allowing distributed users to simultaneously edit the same encoded features while also instantly seeing the updates made by other researchers on the same region in a manner similar to Google Docs. Its technical architecture enables Apollo to be integrated into multiple existing genomic analysis pipelines and heterogeneous laboratory workflow platforms. Finally, we consider the implications that Apollo and related applications may have on how the results of genome research are published and made accessible.

148 citations

Journal Article
TL;DR: In this article, a self-adaptive discrete particle swarm optimization algorithm with genetic algorithm operators (GA-DPSO) was proposed to optimize the data transmission time when placing data for a scientific workflow.
Abstract: Compared to traditional distributed computing environments such as grids, cloud computing provides a more cost-effective way to deploy scientific workflows. Each task of a scientific workflow requires several large datasets that are located in different datacenters, resulting in serious data transmission delays. Edge computing reduces the data transmission delays and supports the fixed storing manner for scientific workflow private datasets, but there is a bottleneck in its storage capacity. It is a challenge to combine the advantages of both edge computing and cloud computing to rationalize the data placement of scientific workflow, and optimize the data transmission time across different datacenters. In this study, a self-adaptive discrete particle swarm optimization algorithm with genetic algorithm operators (GA-DPSO) was proposed to optimize the data transmission time when placing data for a scientific workflow. This approach considered the characteristics of data placement combining edge computing and cloud computing. In addition, it considered the factors impacting transmission delay, such as the bandwidth between datacenters, the number of edge datacenters, and the storage capacity of edge datacenters. The crossover and mutation operators of the genetic algorithm were adopted to avoid the premature convergence of traditional particle swarm optimization algorithm, which enhanced the diversity of population evolution and effectively reduced the data transmission time. The experimental results show that the data placement strategy based on GA-DPSO can effectively reduce the data transmission time during workflow execution combining edge computing and cloud computing.

147 citations
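
Two ingredients named in the abstract above, a transmission-time objective and genetic crossover and mutation operators on a discrete placement, can be sketched in Python as follows. This is not the GA-DPSO algorithm itself: the particle swarm update, the self-adaptive parameters, and the edge storage capacity constraint are all omitted. The encoding (one datacenter index per dataset), the cost model (size divided by bandwidth), and all names and numbers are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Placement = List[int]  # placement[i] = index of the datacenter holding dataset i


def transmission_time(
    placement: Placement,
    sizes: List[float],
    tasks: List[Tuple[int, List[int]]],
    bandwidth: List[List[float]],
) -> float:
    """Sum the time to move each required dataset to the datacenter running the task.

    tasks: (datacenter index, dataset indices the task needs).
    bandwidth[a][b]: data units per second from datacenter a to b (assumed model).
    """
    total = 0.0
    for dc, needed in tasks:
        for d in needed:
            src = placement[d]
            if src != dc:  # local datasets cost nothing to access
                total += sizes[d] / bandwidth[src][dc]
    return total


def crossover(a: Placement, b: Placement, rng: random.Random) -> Placement:
    """Single-point crossover: prefix from a, suffix from b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(p: Placement, n_datacenters: int, rng: random.Random) -> Placement:
    """Reassign one randomly chosen dataset to a random datacenter."""
    child = list(p)
    child[rng.randrange(len(child))] = rng.randrange(n_datacenters)
    return child


sizes = [10.0, 20.0, 5.0]
bandwidth = [[0.0, 5.0], [5.0, 0.0]]  # diagonal is never used
tasks = [(0, [0, 1]), (1, [1, 2])]
rng = random.Random(0)

parent_a, parent_b = [0, 0, 1], [1, 1, 0]
child = mutate(crossover(parent_a, parent_b, rng), n_datacenters=2, rng=rng)
print(transmission_time(parent_a, sizes, tasks, bandwidth))  # 4.0
```

In a full optimizer, an objective like `transmission_time` would score each candidate placement, and the operators would generate new candidates to keep the population diverse.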


Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 89% related
Information system: 107.5K papers, 1.8M citations, 84% related
The Internet: 213.2K papers, 3.8M citations, 82% related
Deep learning: 79.8K papers, 2.1M citations, 82% related
Cluster analysis: 146.5K papers, 2.9M citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    4,414
2022    9,010
2021    1,461
2020    1,579
2019    1,702