
Showing papers on "Workflow" published in 2016


Proceedings ArticleDOI
01 Jan 2016
TL;DR: Jupyter notebooks are presented: a document format for publishing code, results and explanations in a form that is both readable and executable.
Abstract: It is increasingly necessary for researchers in all fields to write computer code, and in order to reproduce research results, it is important that this code is published. We present Jupyter notebooks, a document format for publishing code, results and explanations in a form that is both readable and executable. We discuss various tools and use cases for notebook documents.
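As a minimal illustration of the notebook document format described above, the following Python sketch uses the nbformat library to assemble a two-cell notebook and save it as an .ipynb file; the cell contents and filename are arbitrary examples, not taken from the paper.

```python
# Minimal sketch: build a small Jupyter notebook programmatically with nbformat.
# The cell contents and output filename are arbitrary illustrative choices.
import nbformat
from nbformat.v4 import new_notebook, new_markdown_cell, new_code_cell

nb = new_notebook()
nb.cells = [
    new_markdown_cell("# Analysis\nA short explanation of the computation below."),
    new_code_cell("x = [1, 2, 3]\nprint(sum(x))"),
]

# Write the notebook as a .ipynb file (JSON on disk), readable and executable in Jupyter.
with open("example.ipynb", "w", encoding="utf-8") as f:
    nbformat.write(nb, f)
```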

2,145 citations



Journal ArticleDOI
TL;DR: An evolutionary multi-objective optimization (EMO)-based algorithm is proposed to solve this workflow scheduling problem on an infrastructure as a service (IaaS) platform and can achieve significantly better solutions than existing state-of-the-art QoS optimization scheduling algorithms in most cases.
Abstract: Cloud computing provides promising platforms for executing large applications, offering enormous computational resources on demand. In a Cloud model, users are charged based on their usage of resources and the required quality of service (QoS) specifications. Although there are many existing workflow scheduling algorithms in traditional distributed or heterogeneous computing environments, they are difficult to apply directly to Cloud environments, since the Cloud differs from traditional heterogeneous environments in its service-based resource management and pay-per-use pricing strategies. In this paper, we highlight such difficulties, and model the workflow scheduling problem, which optimizes both makespan and cost, as a Multi-objective Optimization Problem (MOP) for Cloud environments. We propose an evolutionary multi-objective optimization (EMO)-based algorithm to solve this workflow scheduling problem on an infrastructure as a service (IaaS) platform. Novel schemes for problem-specific encoding and population initialization, fitness evaluation and genetic operators are proposed in this algorithm. Extensive experiments on real-world workflows and randomly generated workflows show that the schedules produced by our evolutionary algorithm exhibit greater stability on most of the workflows with the instance-based IaaS computing and pricing models. The results also show that our algorithm can achieve significantly better solutions than existing state-of-the-art QoS optimization scheduling algorithms in most cases. The conducted experiments are based on the on-demand instance types of Amazon EC2; however, the proposed algorithm can easily be extended to the resources and pricing models of other IaaS services.
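To make the two objectives concrete, the sketch below evaluates a candidate task-to-instance assignment on makespan and monetary cost under a simple instance-based, pay-per-hour pricing model. It is an illustrative toy, not the paper's EMO algorithm: the instance catalogue and workloads are made up, and task dependencies are ignored for brevity.

```python
# Illustrative sketch (not the paper's algorithm): evaluate one candidate schedule
# on the two objectives optimized in the paper, makespan and monetary cost,
# under a simple instance-based, pay-per-hour pricing model.
import math

# Hypothetical instance catalogue: relative speed and price per billed hour.
INSTANCES = {"small": (1.0, 0.10), "large": (2.0, 0.40)}

def evaluate(schedule, task_work):
    """schedule maps task -> instance name; task_work maps task -> work units.
    Tasks assigned to the same instance run sequentially; instances run in parallel.
    Task dependencies are ignored for brevity. Returns (makespan_hours, cost)."""
    busy = {}  # instance name -> accumulated running time in hours
    for task, inst in schedule.items():
        speed, _ = INSTANCES[inst]
        busy[inst] = busy.get(inst, 0.0) + task_work[task] / speed
    makespan = max(busy.values())
    cost = sum(math.ceil(hours) * INSTANCES[inst][1] for inst, hours in busy.items())
    return makespan, cost

work = {"t1": 3.0, "t2": 1.5, "t3": 2.0}
print(evaluate({"t1": "large", "t2": "small", "t3": "small"}, work))
```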

321 citations


Journal ArticleDOI
TL;DR: This work introduces EAGER, a time-efficient pipeline, which greatly simplifies the analysis of large-scale genomic data sets and provides features to preprocess, map, authenticate, and assess the quality of ancient DNA samples.
Abstract: The automated reconstruction of genome sequences in ancient genome analysis is a multifaceted process. Here we introduce EAGER, a time-efficient pipeline, which greatly simplifies the analysis of large-scale genomic data sets. EAGER provides features to preprocess, map, authenticate, and assess the quality of ancient DNA samples. Additionally, EAGER comprises tools to genotype samples to discover, filter, and analyze variants. EAGER encompasses both state-of-the-art tools for each step as well as new complementary tools tailored for ancient DNA data within a single integrated solution in an easily accessible format.

298 citations


Journal ArticleDOI
TL;DR: A comprehensive survey and analysis of state-of-the-art workflow scheduling schemes for simple and scientific workflows in cloud computing, with a classification of the proposed schemes based on the type of scheduling algorithm applied in each scheme.

203 citations


Journal ArticleDOI
TL;DR: CcpNmr version-3, the latest software release from the Collaborative Computational Project for NMR, is presented, designed to be simple, functional and flexible, and aims to ensure that routine tasks can be performed in a straightforward manner.
Abstract: NMR spectroscopy is an indispensably powerful technique for the analysis of biomolecules under ambient conditions, for both structural and functional studies. However, in practice the complexity of the technique has often frustrated its application by non-specialists. In this paper, we present CcpNmr version-3, the latest software release from the Collaborative Computational Project for NMR, for all aspects of NMR data analysis, including liquid- and solid-state NMR data. This software has been designed to be simple, functional and flexible, and aims to ensure that routine tasks can be performed in a straightforward manner. We have designed the software according to modern software engineering principles and leveraged the capabilities of modern graphics libraries to simplify a variety of data analysis tasks. We describe the process of backbone assignment as an example of the flexibility and simplicity of implementing workflows, as well as the toolkit used to create the necessary graphics for this workflow. The package can be downloaded from www.ccpn.ac.uk/v3-software/downloads and is freely available to all non-profit organisations.

184 citations


DOI
08 Jul 2016
TL;DR: The Common Workflow Language (CWL) is designed to express workflows for data-intensive science, such as Bioinformatics, Medical Imaging, Chemistry, Physics, and Astronomy.
Abstract: The Common Workflow Language (CWL) is an informal, multi-vendor working group consisting of various organizations and individuals that have an interest in portability of data analysis workflows. Our goal is to create specifications that enable data scientists to describe analysis tools and workflows that are powerful, easy to use, portable, and support reproducibility. CWL builds on technologies such as JSON-LD and Avro for data modeling and Docker for portable runtime environments. CWL is designed to express workflows for data-intensive science, such as Bioinformatics, Medical Imaging, Chemistry, Physics, and Astronomy. This is v1.0 of the CWL tool and workflow specification, released on 2016-07-08.
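For orientation, the following Python sketch writes a minimal CWL v1.0 CommandLineTool description wrapping the Unix echo command; since CWL builds on JSON-LD, the document can be emitted as JSON rather than YAML. The tool and file names are illustrative, and the commented invocation assumes a CWL runner such as cwltool is installed.

```python
# Minimal sketch of a CWL v1.0 CommandLineTool description, emitted as JSON
# (CWL documents are JSON-LD based and may be written in YAML or JSON).
# This wraps the Unix `echo` command; the tool and file names are illustrative.
import json

tool = {
    "cwlVersion": "v1.0",
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "inputs": {
        "message": {"type": "string", "inputBinding": {"position": 1}}
    },
    "outputs": [],
}

with open("echo-tool.cwl", "w") as f:
    json.dump(tool, f, indent=2)

# A CWL runner such as cwltool could then execute it, e.g.:
#   cwltool echo-tool.cwl --message "hello"
```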

169 citations


01 Jan 2016
Design and Control of Workflow Processes: Business Process Management for the Service Industry

165 citations


Journal ArticleDOI
TL;DR: The R/Bioconductor package systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments by making efficient use of existing software resources without limiting the user to a set of predefined methods or environments.
Abstract: Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology and medicine. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology, as it requires complex multi-step processing of big data, demanding considerable computational expertise from users. While substantial effort has been invested in the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components within the widely used R/Bioconductor environment into automated workflows capable of running the analysis of most types of NGS applications from start to finish in a time-efficient and reproducible manner. To address this need, we have developed the R/Bioconductor package systemPipeR. It is an extensible environment for both building and running end-to-end analysis workflows with automated report generation for a wide range of NGS applications. Its unique features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software on local computers and computer clusters. A flexible sample annotation infrastructure efficiently handles complex sample sets and experimental designs. To simplify the analysis of widely used NGS applications, the package provides pre-configured workflows and reporting templates for RNA-Seq, ChIP-Seq, VAR-Seq and Ribo-Seq. Additional workflow templates will be provided in the future. systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments. By combining the capabilities of many R/Bioconductor and command-line tools, it makes efficient use of existing software resources without limiting the user to a set of predefined methods or environments. systemPipeR is freely available for all common operating systems from Bioconductor (http://bioconductor.org/packages/devel/systemPipeR).

159 citations


Journal ArticleDOI
TL;DR: The PPOF and simulation-based workflow help to make generative modeling, informed by powerful energy and lighting simulation engines, more accessible to designers working on regular projects and schedules to create high-performance buildings.

158 citations


Journal ArticleDOI
TL;DR: Current standardization efforts and quality management initiatives from companies, organizations and societies, in the form of published studies and ongoing projects, will exert a decisive influence on the traceability and reproducibility of sequence data.
Abstract: DNA sequencing continues to evolve quickly even after more than 30 years. Many new platforms have suddenly appeared and formerly established systems have vanished in almost the same manner. Since the establishment of next-generation sequencing devices, this progress has gained momentum due to the continually growing demand for higher throughput, lower costs and better quality of data. As a consequence of this rapid development, standardized procedures and data formats as well as comprehensive quality management considerations are still scarce. Here, we list and summarize current standardization efforts and quality management initiatives from companies, organizations and societies in the form of published studies and ongoing projects. These comprise, on the one hand, quality documentation issues such as technical notes, accreditation checklists and guidelines for the validation of sequencing workflows. On the other hand, general standard proposals and quality metrics are developed and applied to the sequencing workflow steps, with the main focus on upstream processes. Finally, certain standard developments for downstream pipeline data handling, processing and storage are discussed in brief. These standardization approaches represent a first basis for continuing work in order to prospectively implement next-generation sequencing in important areas such as clinical diagnostics, where reliable results and fast processing are crucial. Additionally, these efforts will exert a decisive influence on the traceability and reproducibility of sequence data.

Journal ArticleDOI
01 Apr 2016
TL;DR: An energy consumption model is presented for applications deployed across cloud computing platforms, and a corresponding energy-aware resource allocation algorithm is proposed for virtual machine scheduling to accomplish scientific workflow executions.
Abstract: Scientific workflows are often deployed across multiple cloud computing platforms due to their large-scale characteristic. This can be technically achieved by expanding a cloud platform. However, it is still a challenge to conduct scientific workflow executions in an energy-aware fashion across cloud platforms or even inside a cloud platform, since the cloud platform expansion will make the energy consumption a big concern. In this paper, we propose an Energy-aware Resource Allocation method, named EnReal, to address the above challenge. Basically, we leverage the dynamic deployment of virtual machines for scientific workflow executions. Specifically, an energy consumption model is presented for applications deployed across cloud computing platforms, and a corresponding energy-aware resource allocation algorithm is proposed for virtual machine scheduling to accomplish scientific workflow executions. Experimental evaluation demonstrates that the proposed method is both effective and efficient.
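The paper's EnReal model is not reproduced here; as a generic illustration of how such an energy estimate can be formed, the sketch below applies a linear utilization-based power model, E = (P_idle + (P_max - P_idle) * u) * t, summed over virtual machines. All power figures and workloads are made-up example values.

```python
# Generic illustration (not EnReal's actual model): estimate the energy consumed
# by a set of virtual machines using a linear utilization-based power model,
# E = sum over VMs of (P_idle + (P_max - P_idle) * utilization) * runtime.
# Power figures and workloads below are made-up example values.

def vm_energy_wh(p_idle_w, p_max_w, utilization, runtime_h):
    """Energy in watt-hours for one VM at a constant average utilization."""
    power_w = p_idle_w + (p_max_w - p_idle_w) * utilization
    return power_w * runtime_h

vms = [
    {"util": 0.8, "hours": 2.0},   # compute-heavy workflow task
    {"util": 0.3, "hours": 5.0},   # mostly idle staging VM
]
total = sum(vm_energy_wh(100.0, 250.0, vm["util"], vm["hours"]) for vm in vms)
print(f"estimated energy: {total:.1f} Wh")
```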

Journal ArticleDOI
TL;DR: The Geoscience Paper of the Future (GPF) as discussed by the authors is an approach to fully document, share, and cite all their research products including data, software, and computational provenance.
Abstract: Geoscientists now live in a world rich with digital data and methods, and their computational research cannot be fully captured in traditional publications. The Geoscience Paper of the Future (GPF) presents an approach to fully document, share, and cite all their research products including data, software, and computational provenance. This article proposes best practices for GPF authors to make data, software, and methods openly accessible, citable, and well documented. The publication of digital objects empowers scientists to manage their research products as valuable scientific assets in an open and transparent way that enables broader access by other scientists, students, decision makers, and the public. Improving documentation and dissemination of research will accelerate the pace of scientific discovery by improving the ability of others to build upon published work.

Journal ArticleDOI
TL;DR: This paper extensively surveys existing SWFS approaches in cloud and grid computing and provides a classification of cost optimization aspects and parameters of SWFS, categorized into monetary and temporal cost parameters based on various scheduling stages, to help researchers and practitioners choose the most appropriate cost optimization approach.

Journal ArticleDOI
TL;DR: Using imaging, genetic and healthcare data, examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols are provided.
Abstract: Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniform handling and analyzing of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy and their hallmark will be ‘team science’.

Proceedings ArticleDOI
25 Mar 2016
TL;DR: The experiments on OpenStack, a popular open-source cloud infrastructure, show that CloudSeer's efficiency and problem-detection capability are suitable for online monitoring.
Abstract: Cloud infrastructures provide a rich set of management tasks that operate computing, storage, and networking resources in the cloud. Monitoring the executions of these tasks is crucial for cloud providers to promptly find and understand problems that compromise cloud availability. However, such monitoring is challenging because there are multiple distributed service components involved in the executions. This paper presents CloudSeer, an approach that enables effective workflow monitoring. It takes a lightweight, non-intrusive approach that works purely on the interleaved logs that widely exist in cloud infrastructures. CloudSeer first builds an automaton for the workflow of each management task based on normal executions, and then it checks log messages against a set of automata for workflow divergences in a streaming manner. Divergences found during the checking process indicate potential execution problems, which may or may not be accompanied by error log messages. For each potential problem, CloudSeer outputs necessary context information, including the affected task automaton and related log messages hinting at where the problem occurs, to help further diagnosis. Our experiments on OpenStack, a popular open-source cloud infrastructure, show that CloudSeer's efficiency and problem-detection capability are suitable for online monitoring.
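As a much-simplified illustration of the idea (not CloudSeer's implementation), the sketch below checks an interleaved log stream against one hand-written task automaton, expressed as an ordered list of expected log-message patterns, and reports a divergence when the sequence is broken or an error appears. The patterns and log lines are hypothetical.

```python
# Highly simplified sketch of the idea behind CloudSeer (not its implementation):
# each management task has an automaton over expected log-message patterns; the
# interleaved log stream is checked in order and divergences are reported.
import re

# Hypothetical expected sequence of log patterns for one "boot instance" task.
BOOT_TASK = [r"scheduling instance", r"allocating network", r"instance .* booted"]

def check_stream(log_lines, expected=BOOT_TASK):
    state = 0
    for line in log_lines:
        if state < len(expected) and re.search(expected[state], line):
            state += 1
        elif "ERROR" in line:
            return f"divergence: error before step {state}: {line!r}"
    if state < len(expected):
        return f"divergence: task stopped after step {state} of {len(expected)}"
    return "task completed as expected"

logs = ["scheduling instance i-1", "allocating network for i-1", "ERROR timeout on i-1"]
print(check_stream(logs))
```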

Journal ArticleDOI
TL;DR: A real-time workflow fault-tolerant model that extends the traditional PB model by incorporating cloud characteristics is established, and a dynamic fault-tolerant scheduling algorithm, FASTER, is proposed for real-time workflows in the virtualized cloud.
Abstract: Clouds are becoming an important platform for scientific workflow applications. However, with many nodes being deployed in clouds, managing reliability of resources becomes a critical issue, especially for the real-time scientific workflow execution where deadlines should be satisfied. Therefore, fault tolerance in clouds is extremely essential. The PB (primary backup) based scheduling is a popular technique for fault tolerance and has effectively been used in the cluster and grid computing. However, applying this technique for real-time workflows in a virtualized cloud is much more complicated and has rarely been studied. In this paper, we address this problem. We first establish a real-time workflow fault-tolerant model that extends the traditional PB model by incorporating the cloud characteristics. Based on this model, we develop approaches for task allocation and message transmission to ensure faults can be tolerated during the workflow execution. Finally, we propose a dynamic fault-tolerant scheduling algorithm, FASTER, for real-time workflows in the virtualized cloud. FASTER has three key features: 1) it employs a backward shifting method to make full use of the idle resources and incorporates task overlapping and VM migration for high resource utilization, 2) it applies the vertical/horizontal scaling-up technique to quickly provision resources for a burst of workflows, and 3) it uses the vertical scaling-down scheme to avoid unnecessary and ineffective resource changes due to fluctuated workflow requests. We evaluate our FASTER algorithm with synthetic workflows and workflows collected from the real scientific and business applications and compare it with six baseline algorithms. The experimental results demonstrate that FASTER can effectively improve the resource utilization and schedulability even in the presence of node failures in virtualized clouds.

Journal ArticleDOI
TL;DR: The OECD QSAR Toolbox is a software application intended to be used by governments, the chemical industry and other stakeholders in filling gaps in (eco)toxicity data needed for assessing the hazards of chemicals.
Abstract: The OECD QSAR Toolbox is a software application intended to be used by governments, the chemical industry and other stakeholders in filling gaps in (eco)toxicity data needed for assessing the hazards of chemicals. The development and release of the Toolbox is a cornerstone in the computerization of hazard assessment, providing an 'all inclusive' tool for the application of category approaches, such as read-across and trend analysis, in a single software application, free of charge. The Toolbox incorporates theoretical knowledge, experimental data and computational tools from various sources into a logical workflow. The main steps of this workflow are substance identification, identification of relevant structural characteristics and potential toxic mechanisms of interaction (i.e. profiling), identification of other chemicals that have the same structural characteristics and/or mechanism (i.e. building a category), data collection for the chemicals in the category and use of the existing experimental data to fill the data gap(s). The description of the Toolbox workflow and its main functionalities is the scope of the present article.
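As a toy illustration of the read-across step in this workflow (not the Toolbox's actual profilers or databases), the sketch below fills a missing endpoint value for a target chemical with the average of analogues that share the same hypothetical structural profile.

```python
# Toy illustration of the read-across idea used in the Toolbox workflow
# (not its actual profilers or databases): a chemical with a missing endpoint
# value borrows the average of analogues that share the same structural profile.

category = [  # hypothetical analogues sharing one structural profile
    {"name": "analogue A", "profile": "alkyl ester", "ec50_mg_l": 12.0},
    {"name": "analogue B", "profile": "alkyl ester", "ec50_mg_l": 9.5},
    {"name": "analogue C", "profile": "alkyl ester", "ec50_mg_l": 14.2},
]
target = {"name": "target chemical", "profile": "alkyl ester", "ec50_mg_l": None}

def read_across(target, category):
    values = [c["ec50_mg_l"] for c in category
              if c["profile"] == target["profile"] and c["ec50_mg_l"] is not None]
    return sum(values) / len(values) if values else None

print(f"filled data gap: EC50 ~ {read_across(target, category):.1f} mg/L")
```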

Proceedings Article
16 Mar 2016
TL;DR: The design and implementation of Critical Feature Analytics is presented and it is demonstrated that CFA leads to significant improvements in video quality; e.g., 32% less buffering time and 12% higher bitrate than a random decision maker.
Abstract: Many prior efforts have suggested that Internet video Quality of Experience (QoE) could be dramatically improved by using data-driven prediction of video quality for different choices (e.g., CDN or bitrate) to make optimal decisions. However, building such a prediction system is challenging on two fronts. First, the relationships between video quality and observed session features can be quite complex. Second, video quality changes dynamically. Thus, we need a prediction model that is (a) expressive enough to capture these complex relationships and (b) capable of updating quality predictions in near real-time. Unfortunately, several seemingly natural solutions (e.g., simple machine learning approaches and simple network models) fail on one or more fronts. Thus, the potential benefits promised by these prior efforts remain unrealized. We address these challenges and present the design and implementation of Critical Feature Analytics (CFA). The design of CFA is driven by domain-specific insights that video quality is typically determined by a small subset of critical features whose criticality persists over several tens of minutes. This enables a scalable and accurate workflow where we automatically learn critical features for different sessions on coarse-grained timescales, while updating quality predictions in near real-time. Using a combination of a real-world pilot deployment and trace-driven analysis, we demonstrate that CFA leads to significant improvements in video quality; e.g., 32% less buffering time and 12% higher bitrate than a random decision maker.
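As a toy version of the prediction idea (not CFA itself), the sketch below estimates buffering for a new session from recent sessions that match it on a small set of assumed critical features, here CDN and bitrate; all session data are hypothetical.

```python
# Toy sketch of the prediction idea behind CFA (not the actual system): estimate
# quality for a new session from recent sessions that match it on a small set of
# "critical" features, here assumed to be CDN and bitrate.
from statistics import median

CRITICAL = ("cdn", "bitrate")  # assumed critical features for this illustration

history = [  # recent sessions: features plus observed buffering ratio
    {"cdn": "A", "bitrate": 1080, "isp": "X", "buffering": 0.02},
    {"cdn": "A", "bitrate": 1080, "isp": "Y", "buffering": 0.04},
    {"cdn": "B", "bitrate": 1080, "isp": "X", "buffering": 0.12},
]

def predict_buffering(session, history, critical=CRITICAL):
    matches = [h["buffering"] for h in history
               if all(h[f] == session[f] for f in critical)]
    return median(matches) if matches else None

print(predict_buffering({"cdn": "A", "bitrate": 1080, "isp": "Z"}, history))
```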

Journal ArticleDOI
TL;DR: This paper presents Deadline-Budget Constrained Scheduling (DBCS), a heuristic scheduling algorithm with quadratic time complexity that considers two important constraints for QoS-based workflow scheduling: time and cost.

Journal ArticleDOI
TL;DR: This paper proposes a security- and cost-aware scheduling (SCAS) algorithm based on the meta-heuristic optimization technique particle swarm optimization (PSO), with a coding strategy devised to minimize the total workflow execution cost while meeting the deadline and risk rate constraints.

Journal ArticleDOI
19 Mar 2016
TL;DR: A fully data-driven, real-time method is proposed for segmentation and recognition of surgical phases using a combination of video data and instrument usage signals, exploiting no prior knowledge.
Abstract: With the intention of extending the perception and action of surgical staff inside the operating room, the medical community has expressed a growing interest towards context-aware systems. Requiring an accurate identification of the surgical workflow, such systems make use of data from a diverse set of available sensors. In this paper, we propose a fully data-driven and real-time method for segmentation and recognition of surgical phases using a combination of video data and instrument usage signals, exploiting no prior knowledge. We also introduce new validation metrics for assessment of workflow detection. The segmentation and recognition are based on a four-stage process. Firstly, during the learning time, a Surgical Process Model is automatically constructed from data annotations to guide the following process. Secondly, data samples are described using a combination of low-level visual cues and instrument information. Then, in the third stage, these descriptions are employed to train a set of AdaBoost classifiers capable of distinguishing one surgical phase from others. Finally, AdaBoost responses are used as input to a Hidden semi-Markov Model in order to obtain a final decision. On the MICCAI EndoVis challenge laparoscopic dataset we achieved a precision and a recall of 91 % in classification of 7 phases. Compared to the analysis based on one data type only, a combination of visual features and instrument signals allows better segmentation, reduction of the detection delay and discovery of the correct phase order.
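The sketch below illustrates only the per-phase classification stage, using scikit-learn: one-vs-rest AdaBoost classifiers score each time sample, roughly corresponding to the paper's third stage. The hidden semi-Markov smoothing stage is omitted, and the features are random stand-ins for the visual cues and instrument signals.

```python
# Sketch of the per-phase classification stage only (the HSMM smoothing stage is
# omitted): one-vs-rest AdaBoost classifiers score each time sample. Features
# here are random stand-ins for visual cues and instrument-usage signals.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))            # 500 samples, 12 fused features
y = rng.integers(0, 7, size=500)          # 7 surgical phases

clf = OneVsRestClassifier(AdaBoostClassifier(n_estimators=50))
clf.fit(X[:400], y[:400])
scores = clf.decision_function(X[400:])   # per-phase responses, e.g. input to an HSMM
print(scores.shape)                       # (100, 7)
```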

Journal ArticleDOI
TL;DR: A stringent workflow of quality control steps during and after acquisition of T1-weighted images is proposed, which enables researchers dealing with populations that are typically affected by motion artifacts to enhance data quality and maximize sample sizes.
Abstract: In structural magnetic resonance imaging motion artifacts are common, especially when not scanning healthy young adults. It has been shown that motion affects the analysis with automated image-processing techniques (e.g. FreeSurfer). This can bias results. Several developmental and adult studies have found reduced volume and thickness of gray matter due to motion artifacts. Thus, quality control is necessary in order to ensure an acceptable level of quality and to define exclusion criteria of images (i.e. determine participants with most severe artifacts). However, information about the quality control workflow and image exclusion procedure is largely lacking in the current literature and the existing rating systems differ. Here we propose a stringent workflow of quality control steps during and after acquisition of T1-weighted images, which enables researchers dealing with populations that are typically affected by motion artifacts to enhance data quality and maximize sample sizes. As an underlying aim we established a thorough quality control rating system for T1-weighted images and applied it to the analysis of developmental clinical data using the automated processing pipeline FreeSurfer. This hands-on workflow and quality control rating system will aid researchers in minimizing motion artifacts in the final data set, and therefore enhance the quality of structural magnetic resonance imaging studies.

Patent
05 Jan 2016
TL;DR: A system is described that includes a plurality of data collection devices to receive and provide retail store inventory data, rate of sale data, and incoming product inventory data in real time, and a programmed computer coupled to receive the data from the plurality of devices and execute code to generate and prioritize restocking workflow activities as a function of the received data.
Abstract: A system includes a plurality of data collection devices to receive and provide retail store product inventory data, rate of sale data, and incoming product inventory data in real time; and a programmed computer coupled to receive the data from the plurality of data collection devices and execute code to generate and prioritize restocking workflow activities as a function of the received data. The programmed computer further provides the restocking workflow activities to at least one of the data collection devices to direct a worker to restock a product.

Patent
11 Jan 2016
TL;DR: A system and method are presented for accurately and fairly assessing a worker's travel performance by analyzing the worker's voice dialog.
Abstract: Logistical operations (e.g., warehouses) may use a voice-enabled workflow to facilitate the work tasks of a staff (i.e., population) of workers. Typically, it is necessary for a worker to travel from location-to-location to complete assigned work tasks. As such, a worker's time spent travelling often correlates with the worker's overall work performance. Understanding the worker's travel performance is highly desirable, but computing a fair and accurate travel-performance metric is difficult. One reason for this is that the distance a worker travels is often unknown. The present invention embraces a system and method for accurately and fairly assessing a worker's travel performance by analyzing the worker's voice dialog.

Journal ArticleDOI
TL;DR: This paper proposes a meta-heuristic cost effective genetic algorithm that minimizes the execution cost of the workflow while meeting the deadline in cloud computing environment, and develops novel schemes for encoding, population initialization, crossover, and mutations operators of genetic algorithm.
Abstract: Cloud computing is becoming an increasingly admired paradigm that delivers high-performance computing resources over the Internet to solve large-scale scientific problems, but it still has various challenges that need to be addressed to execute scientific workflows. The existing research has mainly focused on minimizing finishing time (makespan) or cost while meeting the quality of service requirements. However, most of it does not consider essential characteristics of the cloud and major issues, such as virtual machine (VM) performance variation and acquisition delay. In this paper, we propose a meta-heuristic cost-effective genetic algorithm that minimizes the execution cost of the workflow while meeting the deadline in a cloud computing environment. We develop novel schemes for the encoding, population initialization, crossover, and mutation operators of the genetic algorithm. Our proposal considers all the essential characteristics of the cloud as well as VM performance variation and acquisition delay. Performance evaluation on some well-known scientific workflows of different sizes, such as Montage, LIGO, CyberShake, and Epigenomics, shows that our proposed algorithm performs better than the current state-of-the-art algorithms.

Patent
06 Jan 2016
TL;DR: A system has a domain expert component library stored on a computer-readable storage device, containing connectable components that create a mobile workflow-based application; a domain expert user interface for assembling components into a workflow sequence; and a developer user interface that receives an intermediate representation of a workflow application based on the workflow sequence and provides a software programming language environment to perform data manipulation changes to the intermediate representation.
Abstract: A system has a domain expert component library stored on a computer readable storage device, the component library containing connectable components that create a mobile workflow based application; a domain expert user interface coupled to the domain expert component library to facilitate assembly of components in a workflow sequence; and a developer user interface coupled to: receive an intermediate representation of a workflow application based on the workflow sequence, and provide a software programming language environment to perform data manipulation changes to the intermediate representation to create the mobile workflow based application.

Patent
05 Jan 2016
TL;DR: In this article, the authors present techniques, software, apparatuses, and systems configured for application development for an application using multiple primary user interfaces (PUIs) in one or more embodiments.
Abstract: Generally discussed herein are techniques, software, apparatuses, and systems configured for application development for an application using multiple primary user interfaces. In one or more embodiments, a method can include receiving data indicating a plurality of workflow activities to be used in an application, each of the workflow activities including data corresponding to a configuration of a view model module and a list of views to be associated with the configuration, receiving data indicating a plurality of primary user interface views to associate with each of the workflow activities, receiving data indicating a connection between two of the workflow activities of the plurality of workflow activities, and producing an application model based on the received data indicating the plurality of workflow activities, the data indicating the connection between the two workflow activities, and the data indicating the plurality of primary user interface views.

Patent
05 Jan 2016
TL;DR: A method is presented that detects an event published to a workflow activity by a voice-based dialog view, where the event indicates a state of asset retrieval; navigates to a built-in asset retrieval work activity; retrieves an asset; and dismisses the workflow activity to revert to the workflow activity associated with the voice-based dialog view.
Abstract: A method includes detecting an event published to a workflow activity by a voice-based dialog view, wherein the event indicates a state of asset retrieval; navigating to a built-in asset retrieval work activity; retrieving an asset; and dismissing the workflow activity to revert to a workflow activity associated with the voice-based dialog view.

Proceedings ArticleDOI
07 May 2016
TL;DR: This work directly incorporates user behavior via clicks gathered automatically from telemetry data related to the actual product use in the field and uses mixed models, a statistical approach that incorporates these clustered workflows to create five representative personas.
Abstract: User Experience (UX) research teams following a user-centered design approach harness personas to better understand a user's workflow by examining that user's behavior, goals, needs, wants, and frustrations. To create target personas, these researchers rely on workflow data from surveys, self-reports, interviews, and user observation. However, this data is not directly related to user behavior, weakly reflects a user's actual workflow in the product, is costly to collect, is limited to a few hundred responses, and is outdated as soon as a persona's workflows evolve. To address these limitations we present a quantitative, bottom-up, data-driven approach to create personas. First, we directly incorporate user behavior via clicks gathered automatically from telemetry data related to the actual product use in the field; since the data collection is automatic, it is also cost effective. Next, we aggregate 35 million clicks from 2400 users into 39,000 clickstreams and then structure them into 10 workflows via hierarchical clustering; we thus base our personas on a large data sample. Finally, we use mixed models, a statistical approach that incorporates these clustered workflows, to create five representative personas; updating our mixed model ensures that these personas remain current. We also validated these personas with our product's user behavior experts to ensure that the workflows and the persona goals represent actual product use.
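As a toy version of the clustering step described above (not the study's pipeline or data), the sketch below groups hypothetical per-user click-frequency profiles into workflow clusters with hierarchical (agglomerative) clustering.

```python
# Toy version of the clustering step described above (not the study's pipeline):
# per-user click-frequency vectors over product commands are grouped into a
# small number of workflow clusters with hierarchical (agglomerative) clustering.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(1)
# Hypothetical data: 200 users x 30 UI commands, counts of clicks per command.
clicks = rng.poisson(lam=2.0, size=(200, 30)).astype(float)
profiles = clicks / clicks.sum(axis=1, keepdims=True)  # normalise to usage profiles

workflows = AgglomerativeClustering(n_clusters=10).fit_predict(profiles)
print(np.bincount(workflows))  # users per workflow cluster
```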