
Showing papers on "Workflow published in 2013"


Proceedings ArticleDOI
23 Feb 2013
TL;DR: This paper outlines a framework that will enable crowd work that is complex, collaborative, and sustainable, and lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.
Abstract: Paid crowd work offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale. But it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework. Can we foresee a future crowd workplace in which we would want our children to participate? This paper frames the major challenges that stand in the way of this goal. Drawing on theory from organizational behavior and distributed computing, as well as direct feedback from workers, we outline a framework that will enable crowd work that is complex, collaborative, and sustainable. The framework lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.

836 citations


Journal ArticleDOI
TL;DR: An update to the Taverna tool suite is provided, highlighting new features and developments in the workbench and the Taverna Server.
Abstract: The Taverna workflow tool suite (http://www.taverna.org.uk) is designed to combine distributed Web Services and/or local tools into complex analysis pipelines. These pipelines can be executed on local desktop machines or through larger infrastructure (such as supercomputers, Grids or cloud environments), using the Taverna Server. In bioinformatics, Taverna workflows are typically used in the areas of high-throughput omics analyses (for example, proteomics or transcriptomics), or for evidence gathering methods involving text mining or data mining. Through Taverna, scientists have access to several thousand different tools and resources that are freely available from a large range of life science institutions. Once constructed, the workflows are reusable, executable bioinformatics protocols that can be shared, reused and repurposed. A repository of public workflows is available at http://www.myexperiment.org. This article provides an update to the Taverna tool suite, highlighting new features and developments in the workbench and the Taverna Server.

724 citations


Journal ArticleDOI
TL;DR: A characterization of workflows from six diverse scientific applications, including astronomy, bioinformatics, earthquake science, and gravitational-wave physics is provided, based on novel workflow profiling tools that provide detailed information about the various computational tasks that are present in the workflow.

648 citations


Journal ArticleDOI
TL;DR: Two workflow scheduling algorithms are proposed which aim to minimize the workflow execution cost while meeting a deadline; they have a polynomial time complexity, which makes them suitable options for scheduling large workflows in IaaS Clouds.
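Purely as a hedged illustration of the general idea behind deadline-constrained, cost-minimizing scheduling (not the authors' algorithms), the sketch below distributes an overall deadline over a chain of tasks in proportion to their workload and picks the cheapest VM type that still meets each sub-deadline. The task sizes, VM speeds, and prices are invented.

```python
# Illustrative sketch only: greedy deadline-distribution scheduling on a task chain.
# VM types, prices, and task sizes are hypothetical, not taken from the paper.

TASKS = [("t1", 100.0), ("t2", 250.0), ("t3", 80.0)]        # (name, workload in MI)
VM_TYPES = [("small", 10.0, 0.05), ("medium", 25.0, 0.12),  # (name, speed MI/s, $/s)
            ("large", 50.0, 0.30)]
DEADLINE = 30.0  # seconds for the whole chain

def schedule(tasks, vm_types, deadline):
    # Distribute the deadline proportionally to each task's workload.
    total_work = sum(w for _, w in tasks)
    plan, cost = [], 0.0
    for name, work in tasks:
        sub_deadline = deadline * work / total_work
        # Cheapest VM type whose runtime fits the sub-deadline.
        feasible = [(price * work / speed, vm, work / speed)
                    for vm, speed, price in vm_types if work / speed <= sub_deadline]
        if not feasible:
            raise ValueError(f"no VM type meets the sub-deadline of task {name}")
        task_cost, vm, runtime = min(feasible)
        plan.append((name, vm, runtime))
        cost += task_cost
    return plan, cost

if __name__ == "__main__":
    plan, cost = schedule(TASKS, VM_TYPES, DEADLINE)
    for name, vm, runtime in plan:
        print(f"{name} -> {vm} ({runtime:.1f} s)")
    print(f"total cost: ${cost:.2f}")
```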

580 citations


Journal ArticleDOI
TL;DR: Reflex is a specific implementation of astronomical scientific workflows within the Kepler workflow engine, and the overall design choices and methods can also be applied to other environments for running automated science workflows.
Abstract: Data from complex modern astronomical instruments often consist of a large number of different science and calibration files, and their reduction requires a variety of software tools. The execution chain of the tools represents a complex workflow that needs to be tuned and supervised, often by individual researchers that are not necessarily experts for any specific instrument. The efficiency of data reduction can be improved by using automatic workflows to organise data and execute the sequence of data reduction steps. To realize such efficiency gains, we designed a system that allows intuitive representation, execution and modification of the data reduction workflow, and has facilities for inspection and interaction with the data. The European Southern Observatory (ESO) has developed Reflex, an environment to automate data reduction workflows. Reflex is implemented as a package of customized components for the Kepler workflow engine. Kepler provides the graphical user interface to create an executable flowchart-like representation of the data reduction process. Key features of Reflex are a rule-based data organiser, infrastructure to re-use results, thorough book-keeping, data progeny tracking, interactive user interfaces, and a novel concept to exploit information created during data organisation for the workflow execution. Reflex includes novel concepts to increase the efficiency of astronomical data processing. While Reflex is a specific implementation of astronomical scientific workflows within the Kepler workflow engine, the overall design choices and methods can also be applied to other environments for running automated science workflows.

574 citations


Journal ArticleDOI
TL;DR: Reflex as discussed by the authors is an environment to automate data reduction workflows for astronomical data processing, which includes a rule-based data organiser, infrastructure to re-use results, thorough book-keeping, data progeny tracking, interactive user interfaces, and a novel concept to exploit information created during data organisation for the workflow execution.
Abstract: Context. Data from complex modern astronomical instruments often consist of a large number of different science and calibration files, and their reduction requires a variety of software tools. The execution chain of the tools represents a complex workflow that needs to be tuned and supervised, often by individual researchers that are not necessarily experts for any specific instrument. Aims. The efficiency of data reduction can be improved by using automatic workflows to organise data and execute a sequence of data reduction steps. To realize such efficiency gains, we designed a system that allows intuitive representation, execution and modification of the data reduction workflow, and has facilities for inspection and interaction with the data. Methods. The European Southern Observatory (ESO) has developed Reflex, an environment to automate data reduction workflows. Reflex is implemented as a package of customized components for the Kepler workflow engine. Kepler provides the graphical user interface to create an executable flowchart-like representation of the data reduction process. Key features of Reflex are a rule-based data organiser, infrastructure to re-use results, thorough book-keeping, data progeny tracking, interactive user interfaces, and a novel concept to exploit information created during data organisation for the workflow execution. Results. Automated workflows can greatly increase the efficiency of astronomical data reduction. In Reflex, workflows can be run non-interactively as a first step. Subsequent optimization can then be carried out while transparently re-using all unchanged intermediate products. We found that such workflows enable the reduction of complex data by non-expert users and minimize mistakes due to book-keeping errors. Conclusions. Reflex includes novel concepts to increase the efficiency of astronomical data processing. While Reflex is a specific implementation of astronomical scientific workflows within the Kepler workflow engine, the overall design choices and methods can also be applied to other environments for running automated science workflows.
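Reflex itself is built on Kepler and ESO's pipeline recipes; the snippet below is only a schematic Python sketch of what a rule-based data organiser does, grouping raw frames by header keywords into categories a workflow can consume. The keyword names and rules are invented for illustration and are not Reflex's implementation.

```python
# Schematic sketch of a rule-based data organiser (not Reflex's implementation).
# Each "header" is a dict of keyword/value pairs; the rules and keywords are invented.

RULES = [
    ("BIAS",    lambda h: h.get("OBJECT") == "BIAS"),
    ("FLAT",    lambda h: h.get("OBJECT") == "FLAT" and h.get("LAMP") == "ON"),
    ("SCIENCE", lambda h: h.get("OBJECT") not in ("BIAS", "FLAT")),
]

def organise(headers):
    """Group frames into categories using the first matching rule."""
    groups = {name: [] for name, _ in RULES}
    for filename, header in headers.items():
        for name, rule in RULES:
            if rule(header):
                groups[name].append(filename)
                break
    return groups

if __name__ == "__main__":
    frames = {
        "f001.fits": {"OBJECT": "BIAS"},
        "f002.fits": {"OBJECT": "FLAT", "LAMP": "ON"},
        "f003.fits": {"OBJECT": "NGC1365", "EXPTIME": 300},
    }
    for category, files in organise(frames).items():
        print(category, files)
```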

569 citations


Journal ArticleDOI
16 Apr 2013
TL;DR: The aim of this paper is to show how MITK evolved into a software system that is able to cover all steps of a clinical workflow including data retrieval, image analysis, diagnosis, treatment planning, intervention support, and treatment control.
Abstract: The Medical Imaging Interaction Toolkit (MITK) has been available as open-source software for almost 10 years now. In this period the requirements of software systems in the medical image processing domain have become increasingly complex. The aim of this paper is to show how MITK evolved into a software system that is able to cover all steps of a clinical workflow including data retrieval, image analysis, diagnosis, treatment planning, intervention support, and treatment control. MITK provides modularization and extensibility on different levels. In addition to the original toolkit, a module system, micro services for small, system-wide features, a service-oriented architecture based on the Open Services Gateway initiative (OSGi) standard, and an extensible and configurable application framework allow MITK to be used, extended and deployed as needed. A refined software process was implemented to deliver high-quality software, ease the fulfillment of regulatory requirements, and enable teamwork in mixed-competence teams. MITK has been applied by a worldwide community and integrated into a variety of solutions, either at the toolkit level or as an application framework with custom extensions. The MITK Workbench has been released as a highly extensible and customizable end-user application. Optional support for tool tracking, image-guided therapy, diffusion imaging as well as various external packages (e.g. CTK, DCMTK, OpenCV, SOFA, Python) is available. MITK has also been used in several FDA/CE-certified applications, which demonstrates the high-quality software and rigorous development process. MITK provides a versatile platform with a high degree of modularization and interoperability and is well suited to meet the challenging tasks of today’s and tomorrow’s clinically motivated research.

359 citations


Journal ArticleDOI
TL;DR: The design and implementation of G-Hadoop, a MapReduce framework that aims to enable large-scale distributed computing across multiple clusters is presented.

319 citations


Journal ArticleDOI
TL;DR: It is outlined how microscopy images can be converted into a data representation suitable for machine learning, and various state-of-the-art machine-learning algorithms are introduced, highlighting recent applications in image-based screening.
Abstract: Recent advances in microscope automation provide new opportunities for high-throughput cell biology, such as image-based screening. Highly complex image analysis tasks often make the implementation of static and predefined processing rules a cumbersome effort. Machine-learning methods, instead, seek to use intrinsic data structure, as well as the expert annotations of biologists, to infer models that can be used to solve versatile data analysis tasks. Here, we explain how machine-learning methods work and what needs to be considered for their successful application in cell biology. We outline how microscopy images can be converted into a data representation suitable for machine learning, and then introduce various state-of-the-art machine-learning algorithms, highlighting recent applications in image-based screening. Our Commentary aims to provide the biologist with a guide to the application of machine learning to microscopy assays and we therefore include extensive discussion on how to optimize experimental workflow as well as the data analysis pipeline.
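As a minimal, hedged illustration of the pipeline the Commentary describes (images converted to feature vectors, then used to train a classifier), the sketch below extracts trivial intensity features from synthetic images with NumPy and fits a scikit-learn random forest. Real screens would use proper segmentation, richer morphological features, and curated annotations.

```python
# Minimal sketch: turn images into feature vectors and train a classifier.
# Synthetic data only; real image-based screens need segmentation and richer features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def features(image):
    # Toy representation: mean, standard deviation, and fraction of bright pixels.
    return [image.mean(), image.std(), (image > 0.8).mean()]

# Two fake phenotype classes with slightly different intensity statistics.
images = [rng.random((64, 64)) * (0.8 if label == 0 else 1.0)
          for label in range(2) for _ in range(100)]
labels = [0] * 100 + [1] * 100

X = np.array([features(img) for img in images])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```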

296 citations


Journal ArticleDOI
TL;DR: The hierarchical scheduling strategy is being implemented in the SwinDeW-C cloud workflow system and demonstrates satisfactory performance, and the experimental results show that the overall performance of the ACO-based scheduling algorithm is better than the others on three basic measurements: the optimisation rate on makespan, the optimisation rate on cost and the CPU time.
Abstract: A cloud workflow system is a type of platform service which facilitates the automation of distributed applications based on the novel cloud infrastructure. One of the most important aspects which differentiate a cloud workflow system from its other counterparts is the market-oriented business model. This is a significant innovation which brings many challenges to conventional workflow scheduling strategies. To investigate such an issue, this paper proposes a market-oriented hierarchical scheduling strategy in cloud workflow systems. Specifically, the service-level scheduling deals with the Task-to-Service assignment where tasks of individual workflow instances are mapped to cloud services in the global cloud markets based on their functional and non-functional QoS requirements; the task-level scheduling deals with the optimisation of the Task-to-VM (virtual machine) assignment in local cloud data centres, where the overall running cost of cloud workflow systems will be minimised given the satisfaction of QoS constraints for individual tasks. Based on our hierarchical scheduling strategy, a package-based random scheduling algorithm is presented as the candidate service-level scheduling algorithm and three representative metaheuristic-based scheduling algorithms including genetic algorithm (GA), ant colony optimisation (ACO), and particle swarm optimisation (PSO) are adapted, implemented and analysed as the candidate task-level scheduling algorithms. The hierarchical scheduling strategy is being implemented in our SwinDeW-C cloud workflow system and is demonstrating satisfactory performance. Meanwhile, the experimental results show that the overall performance of the ACO-based scheduling algorithm is better than the others on three basic measurements: the optimisation rate on makespan, the optimisation rate on cost and the CPU time.
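As a schematic, hedged sketch of the two-level structure described above (not the paper's GA/ACO/PSO implementations), the code below does a package-style random Task-to-Service assignment among QoS-feasible services, followed by a greedy Task-to-VM placement on the cheapest VM with spare capacity. Services, VMs, prices, and latency bounds are invented.

```python
# Schematic two-level scheduling sketch (not the paper's algorithms).
# Level 1: random choice of a QoS-feasible service per task.
# Level 2: greedy placement on the cheapest VM with spare capacity in that service's data centre.
import random

random.seed(1)

SERVICES = {                      # service -> (offered latency bound, data centre)
    "svcA": (5.0, "dc1"),
    "svcB": (2.0, "dc2"),
}
VMS = {                           # data centre -> list of (vm, $/task, capacity)
    "dc1": [("vm1", 0.10, 2), ("vm2", 0.20, 2)],
    "dc2": [("vm3", 0.15, 3)],
}
TASKS = [("t1", 6.0), ("t2", 3.0), ("t3", 5.5)]   # (task, required latency bound)

def service_level(tasks):
    """Task-to-Service: random choice among QoS-feasible services."""
    return {t: random.choice([s for s, (lat, _) in SERVICES.items() if lat <= bound])
            for t, bound in tasks}

def task_level(assignment):
    """Task-to-VM: cheapest VM with remaining capacity in the chosen service's data centre."""
    remaining = {vm: cap for vms in VMS.values() for vm, _, cap in vms}
    placement = {}
    for task, service in assignment.items():
        dc = SERVICES[service][1]
        vm, price, _ = min((v for v in VMS[dc] if remaining[v[0]] > 0), key=lambda v: v[1])
        remaining[vm] -= 1
        placement[task] = (service, vm, price)
    return placement

if __name__ == "__main__":
    for task, (service, vm, price) in task_level(service_level(TASKS)).items():
        print(task, "->", service, vm, f"${price:.2f}")
```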

277 citations


Proceedings ArticleDOI
27 Apr 2013
TL;DR: Cascade is an automated workflow that allows crowd workers to spend as little as 20 seconds each while collectively making a taxonomy, and it is shown that on three datasets its quality is 80-90% of that of experts.
Abstract: Taxonomies are a useful and ubiquitous way of organizing information. However, creating organizational hierarchies is difficult because the process requires a global understanding of the objects to be categorized. Usually one is created by an individual or a small group of people working together for hours or even days. Unfortunately, this centralized approach does not work well for the large, quickly changing datasets found on the web. Cascade is an automated workflow that allows crowd workers to spend as little as 20 seconds each while collectively making a taxonomy. We evaluate Cascade and show that on three datasets its quality is 80-90% of that of experts. Cascade has a competitive cost to expert information architects, despite taking six times more human labor. Fortunately, this labor can be parallelized such that Cascade will run in as fast as four minutes instead of hours or days.
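As a purely schematic sketch (the phase names, voting rule, and simulated workers below are assumptions, not a reproduction of Cascade), a crowd taxonomy workflow of this kind can be pictured as small task types chained together: workers suggest category labels, votes pick the best label, and items are placed under the winning labels.

```python
# Schematic crowd-taxonomy workflow; phase names and rules are illustrative assumptions.
from collections import Counter

def generate(item, workers):
    """Phase 1: each worker suggests a category label for an item."""
    return [w(item) for w in workers]

def select_best(suggestions):
    """Phase 2: majority vote picks the winning label."""
    return Counter(suggestions).most_common(1)[0][0]

def categorize(items, workers):
    """Phase 3: place every item under its winning label."""
    taxonomy = {}
    for item in items:
        label = select_best(generate(item, workers))
        taxonomy.setdefault(label, []).append(item)
    return taxonomy

if __name__ == "__main__":
    # Simulated workers: trivial keyword heuristics standing in for human judgement.
    workers = [
        lambda item: "animal" if item in {"cat", "dog", "sparrow"} else "other",
        lambda item: "animal" if item in {"cat", "dog"} else "other",
        lambda item: "other",
    ]
    print(categorize(["cat", "dog", "sparrow", "table"], workers))
```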

Journal ArticleDOI
TL;DR: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats, which supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics.
Abstract: Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl.
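The OWL and OBO releases above can be queried with standard ontology tooling; as one hedged example (the owlready2 library and the label query are my assumptions, not something the article prescribes), the snippet below loads the OWL release and searches concept labels.

```python
# Hedged sketch: loading and querying EDAM with owlready2 (library choice is an assumption).
from owlready2 import get_ontology

# Load the published OWL release directly from its URL.
edam = get_ontology("http://edamontology.org/EDAM.owl").load()

# Find concepts whose label mentions "alignment" and print a few of them.
hits = edam.search(label="*alignment*")
for concept in hits[:10]:
    print(concept.iri, concept.label)
```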

01 Jan 2013
TL;DR: This paper presents a framework for computing activity deadlines so that the overall process deadline is met and all external time constraints are satisfied.
Abstract: Time management is a critical component of workflow-based process management. Important aspects of time management include planning of workflow process execution in time, estimating workflow execution duration, avoiding deadline violations, and satisfying all external time constraints such as fixed-date constraints and upper and lower bounds for time intervals between activities. In this paper, we present a framework for computing activity deadlines so that the overall process deadline is met and all external time constraints are satisfied.
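The framework itself is not reproduced here; as a minimal sketch of the underlying idea (a pass that turns an overall process deadline into per-activity deadlines), consider a simple activity chain with estimated durations. The numbers and the proportional-slack rule are illustrative assumptions.

```python
# Minimal sketch: derive per-activity deadlines from an overall process deadline
# on a simple activity chain. Durations and the slack rule are illustrative.

ACTIVITIES = [("receive", 2.0), ("review", 5.0), ("approve", 1.0)]  # (name, expected hours)
PROCESS_DEADLINE = 12.0  # hours from process start

def activity_deadlines(activities, process_deadline):
    total = sum(d for _, d in activities)
    slack = process_deadline - total
    if slack < 0:
        raise ValueError("overall deadline cannot be met even without waiting times")
    deadlines, clock = [], 0.0
    for name, duration in activities:
        # Distribute the slack proportionally to each activity's duration.
        clock += duration + slack * duration / total
        deadlines.append((name, clock))
    return deadlines

if __name__ == "__main__":
    for name, deadline in activity_deadlines(ACTIVITIES, PROCESS_DEADLINE):
        print(f"{name}: finish by hour {deadline:.1f}")
```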

Journal ArticleDOI
TL;DR: The present companion articles were designed to provide a short, practically oriented introduction to the concepts of design-based stereology and recommendations for choosing the most appropriate methods to investigate a number of important disease models.
Abstract: The growing awareness of the importance of accurate morphometry in lung research has recently motivated the publication of guidelines set forth by a combined task force of the American Thoracic Society and the European Respiratory Society (20). This official ATS/ERS Research Policy Statement provides general recommendations on which stereological methods are to be used in quantitative microscopy of the lung. However, to integrate stereology into a particular experimental study design, investigators are left with the problem of how to implement this in practice. Specifically, different animal models of human lung disease require the use of different stereological techniques and may determine the mode of lung fixation, tissue processing, preparation of sections, and other things. Therefore, the present companion articles were designed to provide a short, practically oriented introduction to the concepts of design-based stereology (Part 1) and to provide recommendations for choosing the most appropriate methods to investigate a number of important disease models (Part 2). Worked examples with illustrative images will facilitate the practical performance of equivalent analyses. Study algorithms provide comprehensive surveys to ensure that no essential step gets lost during the multistage workflow. Thus, with this review, we hope to close the gap between theory and practice and enhance the use of stereological techniques in pulmonary research.

Journal ArticleDOI
TL;DR: The OpenMOLE DSL is presented through the example of a toy model exploration and through the automated calibration of a real-world complex-system model in the field of geography.

Journal ArticleDOI
TL;DR: This work advances the idea of service-oriented modeling by presenting a design for a modeling service that builds from the Open Geospatial Consortium Web Processing Service (WPS) protocol, and demonstrates how the WPS protocol can be used to create modeling services, and how these modeling services can be brought into workflow environments using generic client-side code.
Abstract: Environmental modeling often requires the use of multiple data sources, models, and analysis routines coupled into a workflow to answer a research question. Coupling these computational resources can be accomplished using various tools, each requiring the developer to follow a specific protocol to ensure that components are linkable. Despite these coupling tools, it is not always straightforward to create a modeling workflow due to platform dependencies, computer architecture requirements, and programming language incompatibilities. A service-oriented approach that enables individual models to operate and interact with others using web services is one method for overcoming these challenges. This work advances the idea of service-oriented modeling by presenting a design for a modeling service that builds from the Open Geospatial Consortium (OGC) Web Processing Service (WPS) protocol. We demonstrate how the WPS protocol can be used to create modeling services, and then demonstrate how these modeling services can be brought into workflow environments using generic client-side code. We implemented this approach within the HydroModeler environment, a model coupling tool built on the Open Modeling Interface standard (version 1.4), and show how a hydrology model can be hosted as a WPS web service and used within a client-side workflow. The primary advantage of this approach is that the server-side software follows an established standard that can be leveraged and reused within multiple workflow environments and decision support systems.
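To make the client side of such a service concrete, here is a hedged sketch using plain HTTP against the standard OGC WPS 1.0.0 key-value-pair interface; the server URL and process identifier are hypothetical placeholders, and this is not the paper's HydroModeler client code.

```python
# Hedged sketch of a WPS client interaction using plain HTTP (OGC WPS 1.0.0 KVP).
# The server URL and process identifier are hypothetical placeholders.
import requests
import xml.etree.ElementTree as ET

WPS_URL = "http://example.org/wps"          # hypothetical WPS endpoint
OWS = "{http://www.opengis.net/ows/1.1}"    # OWS namespace used in WPS responses

# 1. Ask the service which processes it offers.
caps = requests.get(WPS_URL, params={"service": "WPS", "request": "GetCapabilities"})
root = ET.fromstring(caps.content)
identifiers = [el.text for el in root.iter(OWS + "Identifier")]
print("processes offered:", identifiers)

# 2. Ask for the inputs/outputs of one process before building an Execute request.
desc = requests.get(WPS_URL, params={
    "service": "WPS",
    "request": "DescribeProcess",
    "version": "1.0.0",
    "identifier": "HydrologyModel",         # hypothetical process name
})
print(desc.status_code, len(desc.content), "bytes of process description")
```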

Journal ArticleDOI
TL;DR: This study proposes a new approach for multi-objective workflow scheduling in clouds, and presents the hybrid PSO algorithm to optimize the scheduling performance, based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption.
Abstract: We address the problem of scheduling workflow applications on heterogeneous computing systems like cloud computing infrastructures. In general, the cloud workflow scheduling is a complex optimization problem which requires considering different criteria so as to meet a large number of QoS (Quality of Service) requirements. Traditional research in workflow scheduling mainly focuses on the optimization constrained by time or cost without paying attention to energy consumption. The main contribution of this study is to propose a new approach for multi-objective workflow scheduling in clouds, and present the hybrid PSO algorithm to optimize the scheduling performance. Our method is based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption. This technique allows processors to operate in different voltage supply levels by sacrificing clock frequencies. This multiple voltage involves a compromise between the quality of schedules and energy. Simulation results on synthetic and real-world scientific applications highlight the robust performance of the proposed approach.
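As a rough worked illustration of the DVFS trade-off described above (not the paper's model or its hybrid PSO), dynamic power is commonly approximated as proportional to V²f, so a lower voltage/frequency level lengthens a task's runtime but reduces its energy. The voltage/frequency pairs and the workload below are invented.

```python
# Illustrative DVFS trade-off: energy ~ V^2 * f * time, with time ~ work / f.
# Voltage/frequency pairs and the workload are invented for illustration.

DVFS_LEVELS = [(1.2, 2.0), (1.0, 1.4), (0.8, 0.8)]   # (voltage V, frequency GHz)
WORK = 10.0                                           # task size in "GHz-seconds"

def runtime(work, freq):
    return work / freq

def energy(work, volt, freq, k=1.0):
    # Dynamic power ~ k * V^2 * f; energy = power * runtime.
    return k * volt ** 2 * freq * runtime(work, freq)

if __name__ == "__main__":
    for volt, freq in DVFS_LEVELS:
        print(f"V={volt} V, f={freq} GHz: "
              f"runtime {runtime(WORK, freq):.1f} s, energy {energy(WORK, volt, freq):.2f} J")
```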

Journal ArticleDOI
01 Dec 2013
TL;DR: A novel heuristic is proposed to find a feasible plan for the execution of a workflow, allowing providers to decide whether they can agree with the specific constraints set by the user; it is evaluated using simulation with four different real-world workflow applications.
Abstract: In this paper, we assume an environment with multiple, heterogeneous resources, which provide services of different capabilities and of a different cost. Users want to make use of these services to execute a workflow application, within a certain deadline and budget. The problem considered in this paper is to find a feasible plan for the execution of the workflow which would allow providers to decide whether they can agree with the specific constraints set by the user. If they agree to admit the workflow, providers can allocate services for its execution in a way that both deadline and budget constraints are met while account is also taken of the existing load in the provider's environment (confirmed reservations from other users whose requests have been accepted). A novel heuristic is proposed and evaluated using simulation with four different real-world workflow applications.

Journal ArticleDOI
TL;DR: Significant improvements in TMH have rapidly reduced obstacles for its use, and it is important to grow and disseminate data underscoring the promise and effectiveness of TMH, integrate videoconferencing capabilities into electronic medical record platforms, expand TMH reimbursement, and modify licensure standards.
Abstract: Many providers are hesitant to use telemental health technologies. When providers are queried, various barriers are presented, such as the clinician's skepticism about the effectiveness of telemental health (TMH), viewing telehealth technologies as inconvenient, or reporting difficulties with medical reimbursement. Provider support for TMH is critical to its diffusion because clinicians often serve as the initial gatekeepers to telehealth implementation and program success. In this article, we address provider concerns in three broad domains: (1) personal barriers, (2) clinical workflow and technology barriers, and (3) licensure, credentialing, and reimbursement barriers. We found evidence that, although many barriers have been discussed in the literature for years, advancements in TMH have rapidly reduced obstacles for its use. Improvements include extensive opportunities for training, a growing evidence base supporting positive TMH outcomes, and transformations in technologies that improve prov

Journal ArticleDOI
TL;DR: A dynamic critical‐path‐based adaptive workflow scheduling algorithm for grids, which determines efficient mapping of workflow tasks to grid resources dynamically by calculating the critical path in the workflow task graph at every step is proposed.
Abstract: Effective scheduling is a key concern for the execution of performance-driven grid applications such as workflows. In this paper, we first define the workflow scheduling problem and describe the existing heuristic-based and metaheuristic-based workflow scheduling strategies in grids. Then, we propose a dynamic critical-path-based adaptive workflow scheduling algorithm for grids, which determines efficient mapping of workflow tasks to grid resources dynamically by calculating the critical path in the workflow task graph at every step. Using simulation, we compared the performance of the proposed approach with that of the existing approaches discussed in this paper, for different types and sizes of workflows. The results demonstrate that the heuristic-based scheduling techniques can adapt to the dynamic nature of resources and avoid performance degradation in dynamically changing grid environments. Finally, we outline a hybrid heuristic combining the features of the proposed adaptive scheduling technique with metaheuristics for optimizing execution cost and time as well as meeting the users' requirements to efficiently manage the dynamism and heterogeneity of the hybrid cloud environment.
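To make the core step concrete, the sketch below computes the critical (longest) path of a small task DAG with illustrative runtimes; the full algorithm in the paper recomputes this as tasks complete and resource estimates change, which is not reproduced here.

```python
# Sketch: longest (critical) path in a workflow task DAG; runtimes are illustrative.
from functools import lru_cache

RUNTIME = {"a": 3, "b": 2, "c": 4, "d": 1}
EDGES = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}   # task -> successors

@lru_cache(maxsize=None)
def longest_from(task):
    """Length and route of the longest path starting at `task` (including its own runtime)."""
    best_len, best_path = 0, []
    for succ in EDGES[task]:
        length, path = longest_from(succ)
        if length > best_len:
            best_len, best_path = length, path
    return RUNTIME[task] + best_len, [task] + best_path

if __name__ == "__main__":
    length, path = max(longest_from(t) for t in RUNTIME)
    print("critical path:", " -> ".join(path), f"(length {length})")
```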

Proceedings ArticleDOI
18 May 2013
TL;DR: Seahawk is an Eclipse plugin that supports an integrated and largely automated approach to assist programmers using Stack Overflow, and formulates queries automatically from the active context in the IDE, presents a ranked and interactive list of results, and lets users import code samples in discussions through drag & drop.
Abstract: Services, such as Stack Overflow, offer a web platform to programmers for discussing technical issues, in the form of Questions and Answers (Q&A). Since Q&A services store the discussions, the generated “crowd knowledge” can be accessed and consumed by a large audience for a long time. Nevertheless, Q&A services are detached from the development environments used by programmers: Developers have to tap into this crowd knowledge through web browsers and cannot smoothly integrate it into their workflow. This situation hinders part of the benefits of Q&A services. To better leverage the crowd knowledge of Q&A services, we created Seahawk, an Eclipse plugin that supports an integrated and largely automated approach to assist programmers using Stack Overflow. Seahawk formulates queries automatically from the active context in the IDE, presents a ranked and interactive list of results, lets users import code samples in discussions through drag & drop and link Stack Overflow discussions and source code persistently as a support for teamwork. Video Demo URL: http://youtu.be/DkqhiU9FYPI.
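Seahawk itself is an Eclipse/Java plugin; purely to illustrate the kind of query it builds from the IDE context, the hedged Python sketch below hits the public Stack Exchange search API with terms extracted from a code snippet. The term-extraction heuristic is an invented stand-in for Seahawk's context analysis.

```python
# Hedged sketch: querying Stack Overflow's public API with terms taken from code context.
# The term-extraction heuristic is an assumption, not Seahawk's actual analysis.
import re
import requests

def context_terms(code, limit=5):
    """Crude heuristic: most frequent identifiers in the active editor's code."""
    words = re.findall(r"[A-Za-z_][A-Za-z_0-9]{3,}", code)
    ranked = sorted(set(words), key=words.count, reverse=True)
    return " ".join(ranked[:limit])

def search_stackoverflow(query):
    resp = requests.get(
        "https://api.stackexchange.com/2.3/search/advanced",
        params={"order": "desc", "sort": "relevance", "q": query, "site": "stackoverflow"},
    )
    return [item["title"] for item in resp.json().get("items", [])]

if __name__ == "__main__":
    snippet = "BufferedReader reader = new BufferedReader(new FileReader(path));"
    for title in search_stackoverflow(context_terms(snippet))[:5]:
        print(title)
```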

Journal ArticleDOI
TL;DR: KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform; it allows for efficient cross-vendor structural cheminformatics, and its ease of use and modularity enable researchers to automate routine tasks and data analysis.
Abstract: Background: Cheminformaticians have to routinely process and analyse libraries of small molecules. Among other things, that includes the standardization of molecules, calculation of various descriptors, visualisation of molecular structures, and downstream analysis. For this purpose, scientific workflow platforms such as the Konstanz Information Miner can be used if provided with the right plug-in. A workflow-based cheminformatics tool provides the advantage of ease-of-use and interoperability between complementary cheminformatics packages within the same framework, hence facilitating the analysis process. Results: KNIME-CDK comprises functions for molecule conversion to/from common formats, generation of signatures, fingerprints, and molecular properties. It is based on the Chemistry Development Toolkit and uses the Chemical Markup Language for persistence. A comparison with the cheminformatics plug-in RDKit shows that KNIME-CDK supports a similar range of chemical classes and adds new functionality to the framework. We describe the design and integration of the plug-in, and demonstrate the usage of the nodes on ChEBI, a library of small molecules of biological interest. Conclusions: KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform. KNIME-CDK is built on top of the open-source Chemistry Development Toolkit and allows for efficient cross-vendor structural cheminformatics. Its ease-of-use and modularity enable researchers to automate routine tasks and data analysis, bringing complementary cheminformatics functionality to the workflow environment.
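KNIME-CDK exposes this functionality as KNIME nodes backed by the Java CDK library. As a rough, hedged illustration of the same kind of per-molecule operations (format parsing, property calculation, fingerprints), here is an RDKit-based Python sketch; RDKit is mentioned in the article only as a comparison plug-in, and this is not the KNIME-CDK API.

```python
# Hedged sketch of typical per-molecule cheminformatics operations using RDKit
# (RDKit is only the comparison plug-in in the article; this is not the KNIME-CDK API).
from rdkit import Chem
from rdkit.Chem import Descriptors, AllChem

smiles = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]   # ethanol, benzene, aspirin

for smi in smiles:
    mol = Chem.MolFromSmiles(smi)
    if mol is None:                      # skip unparsable structures
        continue
    weight = Descriptors.MolWt(mol)      # a simple molecular property
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)
    print(f"{smi}: MW={weight:.1f}, bits set={fp.GetNumOnBits()}")
```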

Journal ArticleDOI
TL;DR: This work applies knowledge management (KM) to guarantee SLAs and low resource wastage in Clouds, designing and implementing two methods, Case-Based Reasoning (CBR) and a rule-based approach, which prove their feasibility as KM techniques and show major improvements towards CBR.

Posted Content
TL;DR: This research builds a scalable distributed platform and a high-performance geoprocessing workflow based on the Hadoop ecosystem to harvest crowd-sourced gazetteer entries and introduces a provenance-based trust model for quality assurance.
Abstract: Traditional gazetteers are built and maintained by authoritative mapping agencies. In the age of Big Data, it is possible to construct gazetteers in a data-driven approach by mining rich volunteered geographic information (VGI) from the Web. In this research, we build a scalable distributed platform and a high-performance geoprocessing workflow based on the Hadoop ecosystem to harvest crowd-sourced gazetteer entries. Using experiments based on geotagged datasets in Flickr, we find that the MapReduce-based workflow running on the spatially enabled Hadoop cluster can reduce the processing time compared with traditional desktop-based operations by an order of magnitude. We demonstrate how to use such a novel spatial-computing infrastructure to facilitate gazetteer research. In addition, we introduce a provenance-based trust model for quality assurance. This work offers new insights on enriching future gazetteers with the use of Hadoop clusters, and makes contributions in connecting GIS to the cloud computing environment for the next frontier of Big Geo-Data analytics.
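The platform described above runs on a spatially enabled Hadoop cluster; as a minimal, hedged illustration of the MapReduce style of geoprocessing (not the authors' workflow), the Hadoop Streaming-compatible mapper and reducer below count geotagged records per 1-degree grid cell. The tab-separated input layout (latitude, longitude, tag) is an assumption, not the authors' schema.

```python
# Minimal Hadoop Streaming-style sketch: count geotagged records per 1-degree grid cell.
# Input layout (lat<TAB>lon<TAB>tag) is an assumption, not the authors' schema.
import sys

def mapper(lines):
    """Emit 'cell<TAB>1' for each geotagged record."""
    for line in lines:
        try:
            lat, lon, _tag = line.rstrip("\n").split("\t")
            cell = f"{int(float(lat))}_{int(float(lon))}"
        except ValueError:
            continue                      # skip malformed records
        yield f"{cell}\t1"

def reducer(lines):
    """Sum counts per cell (input assumed sorted by key, as Hadoop guarantees)."""
    current, total = None, 0
    for line in lines:
        cell, count = line.rstrip("\n").split("\t")
        if cell != current and current is not None:
            yield f"{current}\t{total}"
            total = 0
        current = cell
        total += int(count)
    if current is not None:
        yield f"{current}\t{total}"

if __name__ == "__main__":
    stage = sys.argv[1] if len(sys.argv) > 1 else "map"
    stream = mapper(sys.stdin) if stage == "map" else reducer(sys.stdin)
    sys.stdout.writelines(line + "\n" for line in stream)
```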

01 Jan 2013
TL;DR: The Neuroscience Gateway hides or eliminates, from the point of view of the users, all the administrative and technical barriers, makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines, and handles the running of jobs and data management and retrieval.
Abstract: The last few decades have seen the emergence of computational neuroscience as a mature field in which researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyberinfrastructure to manage computational workflows and data. The neuronal simulation tools used in this research field are also implemented for parallel computers and are suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge due to issues with acquiring computer time on these machines located at national supercomputer centers, dealing with their complex user interfaces, and dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate all of these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines, and it handles the running of jobs and data management and retrieval. This paper describes the architecture it is based on, how it is implemented, and how users can use it for computational neuroscience research using high performance computing at the back end.

Book ChapterDOI
01 Jan 2013
TL;DR: The RFLP approach will be presented as the baseline for model-based design with Systems Engineering, which enables close interaction and collaboration between the different engineering disciplines, renders resources and processes more efficient, enhances quality, and ensures that the target system ultimately meets the requirements, while reducing design cycle time and engineering lead time.
Abstract: Today, coping with the different workflows, methods and tools of this inter-disciplinary approach to product development throughout a product’s life-cycle is the key challenge for a company. There is evidently a need for requirements engineering and management, as well as model-based design and engineering. More specifically, however, what is required is a unique and integrated methodology for requirements engineering and management, functional and logical design, as well as physical design in different domains for the multi-disciplinary development process based on a Systems Engineering approach early in the design process. In this paper, the RFLP approach (Requirements – Functional – Logical – Physical) will be presented as the baseline for model-based design with Systems Engineering, which enables close interaction and collaboration between the different engineering disciplines, renders resources and processes more efficient, enhances quality, and ensures that the target system ultimately meets the requirements, while reducing design cycle time and engineering lead time.
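As a small, hedged illustration only (the element names and link structure are hypothetical, not a data model prescribed by the chapter), the RFLP idea can be pictured as traceability links from requirements through functional and logical elements down to physical components.

```python
# Hypothetical sketch of RFLP traceability (Requirements - Functional - Logical - Physical).
from dataclasses import dataclass, field

@dataclass
class Element:
    layer: str                      # "R", "F", "L" or "P"
    name: str
    satisfies: list = field(default_factory=list)   # upstream elements this one realizes

# Requirement -> Function -> Logical component -> Physical component (names invented).
req = Element("R", "Vehicle shall stop within 40 m from 100 km/h")
fun = Element("F", "Decelerate vehicle", satisfies=[req])
log = Element("L", "Hydraulic brake circuit", satisfies=[fun])
phy = Element("P", "Brake caliper assembly", satisfies=[log])

def trace(element):
    """Walk the traceability chain back to the originating requirement."""
    chain = [element]
    while chain[-1].satisfies:
        chain.append(chain[-1].satisfies[0])
    return " <- ".join(f"{e.layer}: {e.name}" for e in reversed(chain))

print(trace(phy))
```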

Journal ArticleDOI
01 Dec 2013
TL;DR: GATE Teamware enables users to carry out complex corpus annotation projects, involving distributed annotator teams, and has been evaluated through the creation of several gold standard corpora and internal projects, as well as through external evaluation in commercial and EU text annotation projects.
Abstract: This paper presents GATE Teamware, an open-source, web-based, collaborative text annotation framework. It enables users to carry out complex corpus annotation projects, involving distributed annotator teams. Different user roles are provided (annotator, manager, administrator) with customisable user interface functionalities, in order to support the complex workflows and user interactions that occur in corpus annotation projects. Documents may be pre-processed automatically, so that human annotators can begin with text that has already been pre-annotated, which makes them more efficient. The user interface is simple to learn, aimed at non-experts, and runs in an ordinary web browser, without the need for additional software installation. GATE Teamware has been evaluated through the creation of several gold standard corpora and internal projects, as well as through external evaluation in commercial and EU text annotation projects. It is available as an on-demand service on GateCloud.net, as well as open source for self-installation.

Journal ArticleDOI
TL;DR: In order to introduce a methodology intended to process point cloud data in a BIM environment with high accuracy, this paper describes some experiences in the documentation of monumental sites, generated through a plug-in written for Autodesk Revit and codenamed GreenSpider after its capability to lay out points in space as if they were nodes of an ideal cobweb.
Abstract: . Since their introduction, modeling tools aimed to architectural design evolved in today’s "digital multi-purpose drawing boards" based on enhanced parametric elements able to originate whole buildings within virtual environments. Semantic splitting and elements topology are features that allow objects to be "intelligent" (i.e. self-aware of what kind of element they are and with whom they can interact), representing this way basics of Building Information Modeling (BIM), a coordinated, consistent and always up to date workflow improved in order to reach higher quality, reliability and cost reductions all over the design process. Even if BIM was originally intended for new architectures, its attitude to store semantic inter-related information can be successfully applied to existing buildings as well, especially if they deserve particular care such as Cultural Heritage sites. BIM engines can easily manage simple parametric geometries, collapsing them to standard primitives connected through hierarchical relationships: however, when components are generated by existing morphologies, for example acquiring point clouds by digital photogrammetry or laser scanning equipment, complex abstractions have to be introduced while remodeling elements by hand, since automatic feature extraction in available software is still not effective. In order to introduce a methodology destined to process point cloud data in a BIM environment with high accuracy, this paper describes some experiences on monumental sites documentation, generated through a plug-in written for Autodesk Revit and codenamed GreenSpider after its capability to layout points in space as if they were nodes of an ideal cobweb.

Journal ArticleDOI
TL;DR: PhyloGenerator is an open‐source, stand‐alone Python program that makes use of pre‐existing sequence data and taxonomic information to largely automate the estimation of phylogenies, and is a step towards an open, reproducible phylogenetic workflow.
Abstract: Summary 1. Ecologists increasingly wish to use phylogenies, but are hampered by the technical challenge of phylogeny estimation. 2. We present phyloGenerator, an open-source, stand-alone Python program, that makes use of pre-existing sequence data and taxonomic information to largely automate the estimation of phylogenies. 3. phyloGenerator allows nonspecialists to quickly and easily produce robust, repeatable, and defensible phylogenies without requiring an extensive knowledge of phylogenetics. Experienced phylogeneticists may also find it useful as a tool to conduct exploratory analyses. 4. phyloGenerator performs a number of ‘sanity checks’ on users’ output, but users should still check their outputs carefully; we give some advice on how to do so. 5. By linking a number of tools in a common framework, phyloGenerator is a step towards an open, reproducible phylogenetic workflow. 6. Bundled downloads for Windows and Mac OSX, along with the source code and an install script for Linux, can be found at http://willpearse.github.io/phyloGenerator (note the capital ‘G’).

Journal ArticleDOI
TL;DR: The NGS-based workflow developed meets the sensitivity and specificity requirements for the genetic diagnosis of HBOCS and improves on the cost-effectiveness of current approaches.
Abstract: Next-generation sequencing (NGS) is changing genetic diagnosis due to its huge sequencing capacity and cost-effectiveness. The aim of this study was to develop an NGS-based workflow for routine diagnostics for hereditary breast and ovarian cancer syndrome (HBOCS), to improve genetic testing for BRCA1 and BRCA2. An NGS-based workflow was designed using BRCA MASTR kit amplicon libraries followed by GS Junior pyrosequencing. Data analysis combined the freely available Variant Identification Pipeline software and ad hoc R scripts, including a cascade of filters to generate coverage and variant calling reports. A BRCA homopolymer assay was performed in parallel. A research scheme was designed in two parts. A Training Set of 28 DNA samples containing 23 unique pathogenic mutations and 213 other variants (33 unique) was used. The workflow was validated in a set of 14 samples from HBOCS families in parallel with the current diagnostic workflow (Validation Set). The NGS-based workflow developed permitted the identification of all pathogenic mutations and genetic variants, including those located in or close to homopolymers. The use of NGS for detecting copy-number alterations was also investigated. The workflow meets the sensitivity and specificity requirements for the genetic diagnosis of HBOCS and improves on the cost-effectiveness of current approaches.
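The study's own pipeline combines the Variant Identification Pipeline with ad hoc R scripts; as a generic, hedged illustration of what a "cascade of filters" over variant calls looks like (the thresholds and fields below are invented and this is not the authors' code), consider:

```python
# Generic sketch of a cascade of filters over variant calls; thresholds and fields
# are invented, not the study's Variant Identification Pipeline or R scripts.

VARIANTS = [   # toy variant calls: position, read coverage, fraction of reads supporting the variant
    {"pos": "chr17:41245466", "coverage": 480, "allele_freq": 0.49},
    {"pos": "chr13:32907420", "coverage": 35,  "allele_freq": 0.51},   # low coverage
    {"pos": "chr17:41246012", "coverage": 510, "allele_freq": 0.04},   # likely noise
]

FILTERS = [
    ("min_coverage",    lambda v: v["coverage"] >= 50),
    ("min_allele_freq", lambda v: v["allele_freq"] >= 0.20),
]

def apply_cascade(variants, filters):
    """Annotate each call with 'PASS' or the name of the first filter it fails."""
    results = []
    for v in variants:
        failed = next((name for name, check in filters if not check(v)), None)
        results.append((v["pos"], failed or "PASS"))
    return results

if __name__ == "__main__":
    for pos, status in apply_cascade(VARIANTS, FILTERS):
        print(pos, status)
```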