Open Access Proceedings Article
Best practices for workflow design: how to prevent workflow decay
Kristina Hettne,Katy Wolstencroft,Khalid Belhajjame,Carole Goble,Eleni Mina,Harish Dharuri,Lourdes Verdes-Montenegro,Julián Garrido,David De Roure,Marco Roos +9 more
- pp 23
TLDR
It is argued that good workflow design is a prerequisite for repairing a workflow, or redesigning an equivalent workflow pattern with new components, and the semantic tooling being developed in the Workflow4Ever project to support these best practices is presented.
Abstract:
In this position paper we present a set of best practices for workflow design to prevent workflow decay and increase reuse and re-purposing of scientific workflows. MyExperiment provides access to a large number of scientific workflows. However, scientists find it difficult to reuse or re-purpose these workflows for two main reasons: workflows suffer from decay over time and lack sufficient metadata to understand their purpose. We argue that good workflow design is a prerequisite for repairing a workflow, or redesigning an equivalent workflow pattern with new components. We present a set of best practices for workflow design and the semantic tooling that is being developed in the Workflow4Ever (Wf4Ever) project to support these best practices.
Citations
Journal Article
Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities
Sarah Cohen-Boulakia,Khalid Belhajjame,Olivier Collin,Jérôme Chopard,Christine Froidevaux,Alban Gaignard,Konrad Hinsen,Pierre Larmande,Yvan Le Bras,Frédéric Lemoine,Fabien Mareuil,Hervé Ménager,Christophe Pradal,Christophe Blanchet +13 more
TL;DR: This study characterizes and defines the criteria that reproducibility-friendly scientific workflow systems need to meet, and uses these criteria to place several representative and widely used workflow systems and companion tools within such a framework.
Journal Article
Structuring research methods and data with the research object model: genomics workflows as a case study
Kristina Hettne,Harish Dharuri,Jun Zhao,Katherine Wolstencroft,Khalid Belhajjame,Stian Soiland-Reyes,Eleni Mina,Mark Thompson,Don Cruickshank,Lourdes Verdes-Montenegro,Julián Garrido,David De Roure,Oscar Corcho,Graham Klyne,Reinout van Schouwen,Peter A C 't Hoen,Sean Bechhofer,Carole Goble,Marco Roos +19 more
TL;DR: In this paper, a workflow-centric Research Object (RO) model is proposed to aggregate and annotate the resources used in a bioinformatics experiment, making it possible to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data.
Journal Article
Structuring research methods and data with the Research Object model: genomics workflows as a case study
Kristina Hettne,Harish Dharuri,Jun Zhao,Katherine Wolstencroft,Khalid Belhajjame,Stian Soiland-Reyes,Eleni Mina,Mark Thompson,Don Cruickshank,Lourdes Verdes-Montenegro,Julián Garrido,David De Roure,Oscar Corcho,Graham Klyne,Reinout van Schouwen,Peter A C 't Hoen,Sean Bechhofer,Carole Goble,Marco Roos +19 more
TL;DR: Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data.
Proceedings Article
Four level provenance support to achieve portable reproducibility of scientific workflows
TL;DR: The necessary and satisfactory parameters of workflow reproducibility are investigated, and a mathematical formula to determine the rate of reproducibility is given. These measurements allow the scientist to decide on the next steps toward the creation of reproducible workflows.
Journal Article
RepeAT: a framework to assess empirical reproducibility in biomedical research
Leslie D. McIntosh,Anthony Juehne,Cynthia Hudson Vitale,Xiaoyan Liu,Rosalia Alcoser,J. Christian Lukas,Bradley A. Evanoff +6 more
TL;DR: The use of RepeAT may give the biomedical community a better understanding of current practices of research transparency and accessibility among principal investigators, and common adoption of RepeAT may improve reporting of research practices and the availability of research outputs.
References
Journal Article
Best Practices for Scientific Computing
Greg Wilson,D. A. Aruliah,C. Titus Brown,Neil Chue Hong,Matt Davis,Richard T. Guy,Steven H. D. Haddock,Kathryn D. Huff,Ian M. Mitchell,Mark D. Plumbley,Ben Waugh,Ethan P. White,Paul P. H. Wilson +12 more
TL;DR: Describes a set of best practices for scientific software development, based on research and experience, that will improve scientists' productivity and the reliability of their software.
Journal Article
Why linked data is not enough for scientists
Sean Bechhofer,Iain Buchan,David De Roure,Paolo Missier,John Ainsworth,Jiten Bhagat,Philip Couch,Don Cruickshank,Mark Delderfield,Ian Dunlop,Matthew Gamble,Danius T. Michaelides,Stuart Owen,David Newman,Shoaib Sufi,Carole Goble +15 more
TL;DR: This paper makes the case for a scientific data publication model on top of linked data and introduces the notion of Research Objects as first class citizens for sharing and publishing.
Proceedings Article
Why workflows break — Understanding and combating decay in Taverna workflows
Jun Zhao,Jose Manuel Gomez-Perez,Khalid Belhajjame,Graham Klyne,Esteban García-Cuesta,Aleix Garrido,Kristina Hettne,Marco Roos,David De Roure,Carole Goble +9 more
TL;DR: This paper proposes a minimal set of auxiliary resources to be preserved together with the workflows as an aggregation object, and provides a software tool for end-users to create such aggregations and to assess their completeness.
Proceedings Article
Workflow-centric research objects: First class citizens in scholarly discourse.
Khalid Belhajjame,Oscar Corcho,Daniel Garijo,Jun Zhao,Paolo Missier,David Newman,Raul Palma,Sean Bechhofer,Esteban García Cuesta,Jose Manuel Gomez-Perez,Graham Klyne,Kevin R. Page,Marco Roos,José Enrique Ruiz,Stian Soiland-Reyes,Lourdes Verdes-Montenegro,David De Roure,Carole Goble +17 more
TL;DR: A model to specify workflow-centric research objects is proposed, and the paper shows how the model can be grounded using semantic technologies and existing vocabularies, in particular the Object Reuse and Exchange model and the Annotation Ontology (AO).
Book Chapter
Seven bottlenecks to workflow reuse and repurposing
TL;DR: Based on a comparison of e-Science middleware projects, this paper identifies seven bottlenecks to scalable reuse and repurposing, and includes some thoughts on the applicability of using OWL for two bottlenecks: workflow fragment discovery and the ranking of fragments.