
Showing papers on "Scientific workflow system" published in 2016


Journal Article · DOI
TL;DR: TimeStudio facilitates the reproduction and replication of scientific studies, increases the transparency of analyses, and reduces individual researchers’ analysis workload, making TimeStudio a flexible workbench for organizing and performing a wide range of analyses.
Abstract: This article describes a new open source scientific workflow system, the TimeStudio Project, dedicated to the behavioral and brain sciences. The program is written in MATLAB and features a graphical user interface for the dynamic pipelining of computer algorithms developed as TimeStudio plugins. TimeStudio includes both a set of general plugins (for reading data files, modifying data structures, visualizing data structures, etc.) and a set of plugins specifically developed for the analysis of event-related eyetracking data as a proof of concept. It is possible to create custom plugins to integrate new or existing MATLAB code anywhere in a workflow, making TimeStudio a flexible workbench for organizing and performing a wide range of analyses. The system also features an integrated sharing and archiving tool for TimeStudio workflows, which can be used to share workflows both during the data analysis phase and after scientific publication. TimeStudio thus facilitates the reproduction and replication of scientific studies, increases the transparency of analyses, and reduces individual researchers’ analysis workload. The project website (http://timestudioproject.com) contains the latest releases of TimeStudio, together with documentation and user forums.
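
The plugin pipelining described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and does not use TimeStudio's actual plugin interface; the `Plugin` type, `run_workflow`, the two sample plugins, and the sample values are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical plugin type: a function that takes the shared data structure
# and returns an updated copy. This is NOT TimeStudio's real plugin interface.
Plugin = Callable[[Dict[str, Any]], Dict[str, Any]]


def read_samples(data: Dict[str, Any]) -> Dict[str, Any]:
    """General-purpose plugin: load (here, hard-coded) gaze samples."""
    return {**data, "samples": [0.2, 0.4, 0.9, 0.1]}


def threshold_events(data: Dict[str, Any]) -> Dict[str, Any]:
    """Analysis plugin: keep samples above an assumed threshold."""
    limit = data.get("threshold", 0.3)
    return {**data, "events": [s for s in data["samples"] if s > limit]}


def run_workflow(plugins: List[Plugin], data: Dict[str, Any]) -> Dict[str, Any]:
    """Run plugins in order, passing each one's output to the next."""
    for plugin in plugins:
        data = plugin(data)
    return data


result = run_workflow([read_samples, threshold_events], {"threshold": 0.3})
print(result["events"])
```

Because every plugin shares one signature, a custom step can in principle be placed anywhere in the list, which is the flexibility the abstract attributes to custom plugins.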

60 citations


Patent
30 Mar 2016
TL;DR: A cloud computing platform-oriented scientific workflow system and method are presented, in which a customization module customizes a display layer, a workflow layer, an executive layer, and a computing environment, and an automatic deployment module automatically deploys the computing environment according to the customization module's abstract description of the computing environment and a corresponding automatic configuration script for the scientific software.
Abstract: The invention relates to a cloud computing platform-oriented scientific workflow system and method. A customization module customizes a display layer, a workflow layer, an executive layer, and a computing environment; an automatic deployment module automatically deploys the computing environment according to the customization module's abstract description of the computing environment and a corresponding automatic configuration script for the scientific software; and an executive module accurately dispatches the calculation steps of a scientific workflow and runs them in the cloud computing environment. With the system and the method, more customizable scientific workflow services can be provided for scientific research personnel; scientific workflow processes can be customized according to the demands of scientific experiments; computing resources in a cloud platform are rented as needed; the limitation of the computing resources in a lab's machine room is avoided; the limitations of manually installing software tools to deploy the computing environment are avoided; the calculation steps need not be tracked and executed manually; and the system and the method are suitable for large-scale scientific data analysis tasks.

13 citations


Book Chapter · DOI
07 Jun 2016
TL;DR: SisGExp is a provenance-based approach that helps researchers manage, share, and enact computational scientific workflows that encapsulate legacy R scripts; it transparently captures the provenance of those scripts and endows experiments with reproducibility.
Abstract: Reproducibility is a major feature of science. Even agronomic research of exemplary quality may yield irreproducible empirical findings because of random or systematic error. This work presents SisGExp, a provenance-based approach that helps researchers manage, share, and enact the computational scientific workflows that encapsulate legacy R scripts. SisGExp transparently captures the provenance of R scripts and endows experiments with reproducibility. SisGExp is non-intrusive: it does not require users to change the way they work, and it wraps agronomic experiments in a scientific workflow system.
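
One way to picture non-intrusive provenance capture is a wrapper that runs an unmodified script and records what went in and what came out. The sketch below is an assumption-laden illustration, not SisGExp's implementation: `run_with_provenance`, `sha256_of`, the record fields, and the JSON-lines log format are all hypothetical.

```python
import hashlib
import json
import subprocess
import time
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_with_provenance(command: List[str], inputs: List[Path],
                        outputs: List[Path], log_path: Path) -> Dict:
    """Run an unmodified script and append a provenance record to log_path.

    Hypothetical helper, not SisGExp's API. `command` could be, for example,
    ["Rscript", "analysis.R"]; the script itself is never edited.
    """
    record = {
        "command": command,
        "started": time.time(),
        "inputs": {str(p): sha256_of(p) for p in inputs},
    }
    completed = subprocess.run(command, capture_output=True, text=True)
    record["exit_code"] = completed.returncode
    record["finished"] = time.time()
    # Hash only the outputs the script actually produced.
    record["outputs"] = {str(p): sha256_of(p) for p in outputs if p.exists()}
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record
```

Comparing the recorded input and output hashes across two runs of the same command gives a simple check on whether a result was reproduced.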

5 citations


Journal Article · DOI
01 Jan 2016
TL;DR: The Security Analysis Package (SAP) leverages Kepler's Provenance Recorder (PR) to secure data flows from external input-based attacks, from access to unauthorized external sites, and from data integrity issues.
Abstract: We have developed a model for securing data-flow-based application chains. We have implemented the model as an add-on package for the Kepler scientific workflow system. Our Security Analysis Package (SAP) leverages Kepler's Provenance Recorder (PR). SAP secures data flows from external input-based attacks, from access to unauthorized external sites, and from data integrity issues. Unsurprisingly, real-time security costs a certain amount of run-time overhead. About half of the overhead appears to come from the use of the Kepler PR and the other half from the security functions added by SAP.

4 citations


DOI
13 Nov 2016
TL;DR: This paper discusses the design of the fault tolerance mechanism implemented in Pwrake, a lightweight workflow system that executes data-intensive many-task workflows with the help of the high-performance parallel I/O of the Gfarm file system.
Abstract: We have been developing a lightweight workflow system called Pwrake to execute data-intensive many-task workflows with the help of the high-performance parallel I/O of the Gfarm file system. This paper discusses the design of the fault tolerance mechanism implemented in Pwrake. To avoid aborting a workflow when a worker node fails, Pwrake detects a node failure based on the result of a task retry. To avoid losing files when a worker node fails, we make use of the automatic file replication of the Gfarm file system. To resume an interrupted workflow correctly, we introduce a Pwrake option to rename or remove the output file of a failed task. In the experiment, we confirmed that the overhead of Gfarm automatic file replication in workflow execution time is less than 10%, and that the workflow continues and returns correct results even after an artificial failure in a worker node.
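
The retry-based failure detection mentioned in this abstract might look roughly like the sketch below. The abstract does not spell out the policy, so the rule used here is an assumption: a node is marked as failed when a task fails on it but succeeds on a different node. The code is Python and not Pwrake's own; `run_with_retry`, the `Task` signature, and the node names are hypothetical.

```python
from typing import Callable, List, Set

# Hypothetical task signature: calling task(node) returns True on success.
Task = Callable[[str], bool]


def run_with_retry(task: Task, nodes: List[str], failed_nodes: Set[str]) -> str:
    """Run a task, retrying once on a different node if the first try fails.

    Assumed policy (not taken from Pwrake's source): if the retry succeeds
    elsewhere, the first node is treated as failed and excluded from later
    scheduling; if the retry also fails, the task itself is considered bad.
    Returns the node on which the task succeeded.
    """
    healthy = [n for n in nodes if n not in failed_nodes]
    if not healthy:
        raise RuntimeError("no healthy worker nodes left")
    first = healthy[0]
    if task(first):
        return first
    others = healthy[1:]
    if not others:
        raise RuntimeError("task failed and no other node is available for a retry")
    second = others[0]
    if task(second):
        failed_nodes.add(first)  # the retry result points at the node, not the task
        return second
    raise RuntimeError("task failed on two nodes; treating it as a task error")


def flaky(node: str) -> bool:
    """Simulated task that fails only on node-a, as if that node were down."""
    return node != "node-a"


failed: Set[str] = set()
print(run_with_retry(flaky, ["node-a", "node-b"], failed), failed)
```

Here the task fails on `node-a`, succeeds on `node-b`, and `node-a` ends up in `failed`, so later calls skip it. A fuller version would also rename or remove the failed task's partial output before the retry, as the abstract describes for resuming a workflow.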

1 citation