Showing papers on "Software portability published in 2015"

PDF

Open Access

Journal Article•DOI•

An introduction to Docker for reproducible research

[...]

20 Jan 2015-Operating Systems Review

TL;DR: How the popular emerging technology Docker combines several areas from systems research - such as operating system virtualization, cross-platform portability, modular re-usable elements, versioning, and a 'DevOps' philosophy, to address these challenges is examined.

...read moreread less

Abstract: As computational work becomes more and more integral to many aspects of scientific research, computational reproducibility has become an issue of increasing importance to computer systems researchers and domain scientists alike Though computational reproducibility seems more straight forward than replicating physical experiments, the complex and rapidly changing nature of computer environments makes being able to reproduce and extend such work a serious challenge In this paper, I explore common reasons that code developed for one research project cannot be successfully executed or extended by subsequent researchers I review current approaches to these issues, including virtual machines and workflow systems, and their limitations I then examine how the popular emerging technology Docker combines several areas from systems research - such as operating system virtualization, cross-platform portability, modular re-usable elements, versioning, and a 'DevOps' philosophy, to address these challenges I illustrate this with several examples of Docker use with a focus on the R statistical environment

...read moreread less

729 citations

Journal Article•DOI•

IBEX: an open infrastructure software platform to facilitate collaborative work in radiomics.

[...]

Lifei Zhang¹, David V. Fried¹, X Fave¹, L. Hunter¹, Jinzhong Yang¹, Laurence E. Court¹ - Show less +2 more•Institutions (1)

University of Texas MD Anderson Cancer Center¹

01 Mar 2015-Medical Physics

TL;DR: Two key elements in collaborative workflows, the consistency of data sharing and the reproducibility of calculation result, are embedded in the IBEX workflow: image data, feature algorithms, and model validation including newly developed ones from different users can be easily and consistently shared so that results can be more easily reproduced between institutions.

...read moreread less

Abstract: Purpose: Radiomics, which is the high-throughput extraction and analysis of quantitative image features, has been shown to have considerable potential to quantify the tumor phenotype. However, at present, a lack of software infrastructure has impeded the development of radiomics and its applications. Therefore, the authors developed the imaging biomarker explorer (ibex), an open infrastructure software platform that flexibly supports common radiomics workflow tasks such as multimodality image data import and review, development of feature extraction algorithms, model validation, and consistent data sharing among multiple institutions. Methods: The ibex software package was developed using the matlab and c/c++ programming languages. The software architecture deploys the modern model-view-controller, unit testing, and function handle programming concepts to isolate each quantitative imaging analysis task, to validate if their relevant data and algorithms are fit for use, and to plug in new modules. On one hand, ibex is self-contained and ready to use: it has implemented common data importers, common image filters, and common feature extraction algorithms. On the other hand, ibex provides an integrated development environment on top of matlab and c/c++, so users are not limited to its built-in functions. In the ibex developer studio, users can plug in, debug, and test new algorithms, extending ibex’s functionality. ibex also supports quality assurance for data and feature algorithms: image data, regions of interest, and feature algorithm-related data can be reviewed, validated, and/or modified. More importantly, two key elements in collaborative workflows, the consistency of data sharing and the reproducibility of calculation result, are embedded in the ibex workflow: image data, feature algorithms, and model validation including newly developed ones from different users can be easily and consistently shared so that results can be more easily reproduced between institutions. Results: Researchers with a variety of technical skill levels, including radiation oncologists, physicists, and computer scientists, have found the ibex software to be intuitive, powerful, and easy to use. ibex can be run at any computer with the windows operating system and 1GB RAM. The authors fully validated the implementation of all importers, preprocessing algorithms, and feature extraction algorithms. Windows version 1.0 beta of stand-alone ibex and ibex’s source code can be downloaded. Conclusions: The authors successfully implemented ibex, an open infrastructure software platform that streamlines common radiomics workflow tasks. Its transparency, flexibility, and portability can greatly accelerate the pace of radiomics research and pave the way toward successful clinical translation.

...read moreread less

264 citations

Proceedings Article•DOI•

Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code

[...]

Michel Steuwer¹, Christian Fensch², Sam Lindley¹, Christophe Dubach¹•Institutions (2)

University of Edinburgh¹, Heriot-Watt University²

29 Aug 2015

TL;DR: This work proposes a novel approach aiming to combine high-level programming, code portability, and high-performance by applying a simple set of rewrite rules to transform it into a low-level functional representation close to the OpenCL programming model, from which OpenCL code is generated.

...read moreread less

Abstract: Computers have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at the cost of increased programming effort resulting in a tension between performance and code portability. Typically, code is either tuned in a low-level imperative language using hardware-specific optimizations to achieve maximum performance or is written in a high-level, possibly functional, language to achieve portability at the expense of performance. We propose a novel approach aiming to combine high-level programming, code portability, and high-performance. Starting from a high-level functional expression we apply a simple set of rewrite rules to transform it into a low-level functional representation, close to the OpenCL programming model, from which OpenCL code is generated. Our rewrite rules define a space of possible implementations which we automatically explore to generate hardware-specific OpenCL implementations. We formalize our system with a core dependently-typed lambda-calculus along with a denotational semantics which we use to prove the correctness of the rewrite rules. We test our design in practice by implementing a compiler which generates high performance imperative OpenCL code. Our experiments show that we can automatically derive hardware-specific implementations from simple functional high-level algorithmic expressions offering performance on a par with highly tuned code for multicore CPUs and GPUs written by experts.

...read moreread less

123 citations

Journal Article•DOI•

What Counts as Scientific Data? A Relational Framework.

[...]

Sabina Leonelli¹•Institutions (1)

University of Exeter¹

01 Dec 2015-Philosophy of Science

TL;DR: This paper proposes an account of scientific data that makes sense of recent debates on data-driven and ‘big data’ research, while also building on the history of data production and use particularly within biology.

...read moreread less

Abstract: This paper proposes an account of scientific data that makes sense of recent debates on data-driven research, while also building on the history of data production and use particularly within biology. In this view, 'data' is a relational category applied to research outputs that are taken, at specific moments of inquiry, to provide evidence for knowledge claims of interest to the researchers involved. They do not have truth-value in and of themselves, nor can they be seen as straightforward representations of given phenomena. Rather, they are fungible objects defined by their portability and their prospective usefulness as evidence.

...read moreread less

122 citations

Posted Content•

The Stan Math Library: Reverse-Mode Automatic Differentiation in C++

[...]

Bob Carpenter, Matthew D. Hoffman, Marcus A. Brubaker, Daniel D. Lee, Peter Li, Michael Betancourt - Show less +2 more

23 Sep 2015-arXiv: Mathematical Software

TL;DR: The Stan Math Library is a C++, reverse-mode automatic differentiation library designed to be usable, extensive and extensible, efficient, scalable, stable, portable, and redistributable in order to facilitate the construction and utilization of such algorithms.

...read moreread less

Abstract: As computational challenges in optimization and statistical inference grow ever harder, algorithms that utilize derivatives are becoming increasingly more important. The implementation of the derivatives that make these algorithms so powerful, however, is a substantial user burden and the practicality of these algorithms depends critically on tools like automatic differentiation that remove the implementation burden entirely. The Stan Math Library is a C++, reverse-mode automatic differentiation library designed to be usable, extensive and extensible, efficient, scalable, stable, portable, and redistributable in order to facilitate the construction and utilization of such algorithms. Usability is achieved through a simple direct interface and a cleanly abstracted functional interface. The extensive built-in library includes functions for matrix operations, linear algebra, differential equation solving, and most common probability functions. Extensibility derives from a straightforward object-oriented framework for expressions, allowing users to easily create custom functions. Efficiency is achieved through a combination of custom memory management, subexpression caching, traits-based metaprogramming, and expression templates. Partial derivatives for compound functions are evaluated lazily for improved scalability. Stability is achieved by taking care with arithmetic precision in algebraic expressions and providing stable, compound functions where possible. For portability, the library is standards-compliant C++ (03) and has been tested for all major compilers for Windows, Mac OS X, and Linux.

...read moreread less

111 citations

Journal Article•DOI•

Root System Markup Language: Toward a Unified Root Architecture Description Language

[...]

Guillaume Lobet¹, Michael P. Pound², Julien Diener³, Christophe Pradal³, Xavier Draye, Christophe Godin³, Mathieu Javaux, Daniel Leitner, Félicien Meunier⁴, Philippe Nacry⁵, Tony P. Pridmore², Andrea Schnepf⁶ - Show less +8 more•Institutions (6)

University of Liège¹, University of Nottingham², French Institute for Research in Computer Science and Automation³, Université catholique de Louvain⁴, Institut national de la recherche agronomique⁵, Forschungszentrum Jülich⁶

01 Mar 2015-Plant Physiology

TL;DR: The Root System Markup Language (RSML), which has been designed to enable portability of root architecture data between different software tools in an easy and interoperable manner, is described, to provide a standard format upon which to base central repositories that will soon arise following the expanding worldwide root phenotyping effort.

...read moreread less

Abstract: The number of image analysis tools supporting the extraction of architectural features of root systems has increased over the last years. These tools offer a handy set of complementary facilities, yet it is widely accepted that none of these software tool is able to extract in an efficient way growing array of static and dynamic features for different types of images and species. . We describe the Root System Markup Language (RSML) that has been designed to overcome two major challenges: (i) to enable portability of root architecture data between different software tools in an easy and interoperable manner allowing seamless collaborative work, and (ii) to provide a standard format upon which to base central repositories which will soon arise following the expanding worldwide root phenotyping effort. RSML follows the XML standard to store 2D or 3D image metadata, plant and root properties and geometries, continuous functions along individual root paths and a suite of annotations at the image, plant or root scales, at one or several time points. Plant ontologies are used to describe botanical entities that are relevant at the scale of root system architecture. An xml-schema describes the features and constraints of RSML and open-source packages have been developed in several languages (R, Excel, Java, Python, C#) to enable researchers to integrate RSML files into popular research workflow.

...read moreread less

108 citations

Journal Article•DOI•

pocl: A Performance-Portable OpenCL Implementation

[...]

Pekka Jääskeläinen¹, Carlos S. de La Lama, Erik Schnetter², Kalle Raiskila³, Jarmo Takala¹, Heikki Berg³ - Show less +2 more•Institutions (3)

Tampere University of Technology¹, Perimeter Institute for Theoretical Physics², Nokia³

01 Oct 2015-International Journal of Parallel Programming

TL;DR: The proposed open source implementation of OpenCL is also platform portable, enabling OpenCL on a wide range of architectures, both already commercialized and on those that are still under research.

...read moreread less

Abstract: OpenCL is a standard for parallel programming of heterogeneous systems The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort While the standard brings the obvious benefits of platform portability, the performance portability aspects are largely left to the programmer The situation is made worse due to multiple proprietary vendor implementations with different characteristics, and, thus, required optimization strategies In this paper, we propose an OpenCL implementation that is both portable and performance portable At its core is a kernel compiler that can be used to exploit the data parallelism of OpenCL programs on multiple platforms with different parallel hardware styles The kernel compiler is modularized to perform target-independent parallel region formation separately from the target-specific parallel mapping of the regions to enable support for various styles of fine-grained parallel resources such as subword SIMD extensions, SIMD datapaths and static multi-issue Unlike previous similar techniques that work on the source level, the parallel region formation retains the information of the data parallelism using the LLVM IR and its metadata infrastructure This data can be exploited by the later generic compiler passes for efficient parallelization The proposed open source implementation of OpenCL is also platform portable, enabling OpenCL on a wide range of architectures, both already commercialized and on those that are still under research The paper describes how the portability of the implementation is achieved We test the two aspects to portability by utilizing the kernel compiler and the OpenCL implementation to run OpenCL applications in various platforms with different style of parallel resources The results show that most of the benchmarked applications when compiled using pocl were faster or close to as fast as the best proprietary OpenCL implementation for the platform at hand

...read moreread less

100 citations

Proceedings Article•DOI•

PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming

[...]

Riyadh Baghdadi¹, Ulysse Beaugnon¹, Albert Cohen¹, Tobias Grosser¹, Michael Kruse¹, Chandan Reddy¹, Sven Verdoolaege¹, Adam Betts², Alastair F. Donaldson², Jeroen Ketema², Javed Absar, Sven van Haastregt, Alexey Kravets, Anton Lokhmotov, Robert David, Elnar Hajiyev - Show less +12 more•Institutions (2)

French Institute for Research in Computer Science and Automation¹, Imperial College London²

18 Oct 2015

TL;DR: PENCIL, a rigorously-defined subset of GNU C99-enriched with additional language constructs-that enables compilers to exploit parallelism and produce highly optimized code when targeting accelerators, is presented.

...read moreread less

Abstract: Programming accelerators such as GPUs withlow-level APIs and languages such as OpenCL and CUDAis difficult, error-prone, and not performance-portable. Au-tomatic parallelization and domain specific languages (DSLs)have been proposed to hide complexity and regain performanceportability. We present P ENCIL, a rigorously-defined subset ofGNU C99 -- enriched with additional language constructs -- that enables compilers to exploit parallelism and produce highlyoptimized code when targeting accelerators. P ENCIL aims toserve both as a portable implementation language for libraries, and as a target language for DSL compilers. We implemented a P ENCIL-to-OpenCL backend using astate-of-the-art polyhedral compiler. The polyhedral compiler, extended to handle data-dependent control flow and non-affinearray accesses, generates optimized OpenCL code. To demon-strate the potential and performance portability of P ENCILand the P ENCIL-to-OpenCL compiler, we consider a numberof image processing kernels, a set of benchmarks from theRodinia and SHOC suites, and DSL embedding scenarios forlinear algebra (BLAS) and signal processing radar applications(SpearDE), and present experimental results for four GPUplatforms: AMD Radeon HD 5670 and R9 285, NVIDIAGTX 470, and ARM Mali-T604.

...read moreread less

98 citations

Proceedings Article•DOI•

Robot Web Tools: Efficient messaging for cloud robotics

[...]

Russell Toris¹, Julius Kammerl², David V. Lu³, Jihoon Lee, Odest Chadwicke Jenkins⁴, Sarah Osentoski⁵, Mitchell Wills¹, Sonia Chernova¹ - Show less +4 more•Institutions (5)

Worcester Polytechnic Institute¹, Willow Garage², Washington University in St. Louis³, Brown University⁴, Bosch⁵

17 Dec 2015

TL;DR: These efforts with Robot Web Tools are described to advance: 1) human-robot interaction through usable client and visualization libraries for more efficient development of front-end human- robot interfaces, and 2) cloud robotics through more efficient methods of transporting high-bandwidth topics.

...read moreread less

Abstract: Since its official introduction in 2012, the Robot Web Tools project has grown tremendously as an open-source community, enabling new levels of interoperability and portability across heterogeneous robot systems, devices, and front-end user interfaces. At the heart of Robot Web Tools is the rosbridge protocol as a general means for messaging ROS topics in a client-server paradigm suitable for wide area networks, and human-robot interaction at a global scale through modern web browsers. Building from rosbridge, this paper describes our efforts with Robot Web Tools to advance: 1) human-robot interaction through usable client and visualization libraries for more efficient development of front-end human-robot interfaces, and 2) cloud robotics through more efficient methods of transporting high-bandwidth topics (e.g., kinematic transforms, image streams, and point clouds). We further discuss the significant impact of Robot Web Tools through a diverse set of use cases that showcase the importance of a generic messaging protocol and front-end development systems for human-robot interaction.

...read moreread less

85 citations

Proceedings Article•DOI•

Enabling IOT services using WIFI — ZigBee gateway for a home automation system

[...]

G. V Vivek¹, M. P. Sunil¹•Institutions (1)

Jain University¹

01 Nov 2015

TL;DR: This experiment demonstrates that the proposed gateway work efficiently by sending and receiving instructions from different protocols, a Graphical User Interface (GUI) allows user to interact with the ambient environment settings.

...read moreread less

Abstract: Home automation system is a process of automating or adapting to basic household activities like control of Lighting, Heating, Ventilation and Air condition (HVAC) appliances on user command. A dedicated hand held device is ideal in providing a user interface in a home automation system, due to their portability and their wide range of capabilities. The hand held device runs on Linux OS which is an open source platform and can work on limited memory communication between the appliances a home automation network through low power communication protocols such as ZigBee. With the recent increase in the use of internet and cheaper components, large number of Home automation system with IOT capabilities are in demand. Home automation system prototype includes a gateway with user interactions capabilities, this experiment demonstrates that the proposed gateway work efficiently by sending and receiving instructions from different protocols, a Graphical User Interface (GUI) allows user to interact with the ambient environment settings.

...read moreread less

72 citations

Journal Article•DOI•

Implementation of a BIM Domain-specific Language for the Building Environment Rule and Analysis

[...]

Jin-Kook Lee¹, Charles M. Eastman², Yong-Cheol Lee²•Institutions (2)

Hanyang University¹, Georgia Institute of Technology²

01 Aug 2015-Journal of Intelligent and Robotic Systems

TL;DR: The implementation of the BERA Language is based on the use of Industry Foundation Classes as given building information models, Solibri Model Checker as an IFC engine, and the Java Virtual Machine as a compilation and execution environment.

...read moreread less

Abstract: This paper describes an implementation process for a domain-specific computer programming language: the Building Environment Rule and Analysis (BERA) Language. As the growing area of Building Information Modeling (BIM), there has been a need to develop highly customized domain-specific languages for handling issues in building models in the architecture, engineering and construction (AEC) industry sector. The BERA Language, one of the domain-specific languages, deals with building information models in an intuitive way in order to ensure the quality of design and assess the design programming requirements using user-defined rules in the early design phases. To accomplish these goals, the BERA Language provides the capabilities for an effectiveness and ease of use without precise knowledge of general-purpose languages that are conventionally used in BIM software development. Furthermore, the design and implementation of the BERA Language focuses on building objects and their associated information-rich properties and relationships. This paper represents the implementation issues of the BERA Language associated with the building information models, their mapping into the building data structure, and their instantiation and execution. In addition, Portability of the language, extensibility and platform-dependent issues are involved in the BERA Language implementation. The implementation described in this paper is based on the use of Industry Foundation Classes (IFC) as given building information models, Solibri Model Checker?${\circledR }$ (SMC) as an IFC engine, and the Java Virtual Machine (JVM) as a compilation and execution environment.

...read moreread less

Journal Article•DOI•

BatchJobs and BatchExperiments: Abstraction Mechanisms for Using R in Batch Environments

[...]

Bernd Bischl, Michel Lang, Olaf Mersmann, Jörg Rahnenführer, Claus Weihs - Show less +1 more

20 Mar 2015-Journal of Statistical Software

TL;DR: Two R packages which greatly simplify working in batch computing environments and use a clear and well-defined interface to the batch system which makes them applicable in most high-performance computing environments are presented.

...read moreread less

Abstract: Empirical analysis of statistical algorithms often demands time-consuming experiments. We present two R packages which greatly simplify working in batch computing environments. The package BatchJobs implements the basic objects and procedures to control any batch cluster from within R. It is structured around cluster versions of the well-known higher order functions Map, Reduce and Filter from functional programming. Computations are performed asynchronously and all job states are persistently stored in a database, which can be queried at any point in time. The second package, BatchExperiments, is tailored for the still very general scenario of analyzing arbitrary algorithms on problem instances. It extends package BatchJobs by letting the user define an array of jobs of the kind “apply algorithm A to problem instance P and store results”. It is possible to associate statistical designs with parameters of problems and algorithms and therefore to systematically study their influence on the results.The packages’ main features are: (a) Convenient usage: All relevant batch system operations are either handled internally or mapped to simple R functions. (b) Portability: Both packages use a clear and well-defined interface to the batch system which makes them applicable in most high-performance computing environments. (c) Reproducibility: Every computational part has an associated seed to ensure reproducibility even when the underlying batch system changes. (d) Abstraction and good software design: The code layers for algorithms, experiment definitions and execution are cleanly separated and enable the writing of readable and maintainable code.

...read moreread less

Journal Article•DOI•

Heterogeneous computing on mixed unstructured grids with PyFR

[...]

Freddie D. Witherden¹, Brian C. Vermeire¹, Peter E. Vincent¹•Institutions (1)

Imperial College London¹

05 Oct 2015-Computers & Fluids

TL;DR: PyFR as mentioned in this paper is an open-source high-order accurate computational fluid dynamics solver for unstructured grids, which has been extended to run on mixed element meshes, and a range of hardware platforms, including heterogeneous multi-node systems.

...read moreread less

Journal Article•DOI•

Design and Development of nEMoS, an All-in-One, Low-Cost, Web-Connected and 3D-Printed Device for Environmental Analysis

[...]

Francesco Salamone¹, Lorenzo Belussi¹, Ludovico Danza¹, Matteo Ghellere¹, Italo Meroni¹ - Show less +1 more•Institutions (1)

National Research Council¹

04 Jun 2015-Sensors

TL;DR: A low-cost and open-source hardware architecture able to detect the indoor variables necessary for the IEQ calculation as an alternative to the traditional hardware used for this purpose is described.

...read moreread less

Abstract: The Indoor Environmental Quality (IEQ) refers to the quality of the environment in relation to the health and well-being of the occupants. It is a holistic concept, which considers several categories, each related to a specific environmental parameter. This article describes a low-cost and open-source hardware architecture able to detect the indoor variables necessary for the IEQ calculation as an alternative to the traditional hardware used for this purpose. The system consists of some sensors and an Arduino board. One of the key strengths of Arduino is the possibility it affords of loading the script into the board’s memory and letting it run without interfacing with computers, thus granting complete independence, portability and accuracy. Recent works have demonstrated that the cost of scientific equipment can be reduced by applying open-source principles to their design using a combination of the Arduino platform and a 3D printer. The evolution of the 3D printer has provided a new means of open design capable of accelerating self-directed development. The proposed nano Environmental Monitoring System (nEMoS) instrument is shown to have good reliability and it provides the foundation for a more critical approach to the use of professional sensors as well as for conceiving new scenarios and potential applications.

...read moreread less

Proceedings Article•DOI•

Higher-level parallelization for local and distributed asynchronous task-based programming

[...]

Hartmut Kaiser¹, Thomas Heller, Daniel Bourgeois¹, Dietmar Fey•Institutions (1)

Louisiana State University¹

15 Nov 2015

TL;DR: Higher level facilities which are fully aligned with modern C++ programming concepts, are easily extensible, fully generic, and enable highly efficient parallelization on par with or better than existing equivalent applications based on OpenMP and/or MPI are presented.

...read moreread less

Abstract: One of the biggest challenges on the way to exascale computing is programmability in the context of performance portability. The efficient utilization of the prospective architectures of exascale supercomputers will be challenging in many ways, very much because of a massive increase of on-node parallelism, and an increase of complexity of memory hierarchies. Parallel programming models need to be able to formulate algorithms that allow exploiting these architectural peculiarities. The recent revival of interest in the industry and wider community for the C++ language has spurred a remarkable amount of standardization proposals and technical specifications. Among those efforts is the development of seamlessly integrating various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous execution flows, continuation style computation, and explicit fork-join control flow of independent and non-homogeneous code paths. Those proposals are the foundation of a powerful high-level abstraction that allows C++ codes to deal with an ever increasing architectural complexity in recent hardware developments.In this paper, we present the results of developing those higher level parallelization facilities in HPX, a general purpose C++ runtime system for applications of any scale. The developed higher-level parallelization APIs have been designed to overcome the limitations of today's prevalently used programming models in C++ codes. HPX exposes a uniform higher-level API which gives the application programmer syntactic and semantic equivalence of various types of on-node and off-node parallelism, all of which are well integrated into the C++ type system. We show that these higher level facilities which are fully aligned with modern C++ programming concepts, are easily extensible, fully generic, and enable highly efficient parallelization on par with or better than existing equivalent applications based on OpenMP and/or MPI.

...read moreread less

Proceedings Article•DOI•

SEANet: A Software-Defined Acoustic Networking Framework for Reconfigurable Underwater Networking

[...]

Emrecan Demirors¹, Bharatwaj G. Shankar¹, G. Enrico Santagati¹, Tommaso Melodia¹•Institutions (1)

Northeastern University¹

22 Oct 2015

TL;DR: SEANet (Software-dEfined Acoustic Networking), a modular, evolving software-defined framework for UAN devices that offers the necessary flexibility to adapt and satisfy different application and system requirements through a well-defined set of functionalities at the physical, data-link, network, and application layers of the networking protocol stack is presented.

...read moreread less

Abstract: As of today, Underwater Acoustic Networks (UANs) are heavily dependent on commercially available acoustic modems. While commercial modems are often able to support specific applications, they are typically not flexible enough to satisfy the requirements of next-generation UANs, which need to be able to adapt their communication and networking protocols in real-time based on the environmental and application conditions. To address these needs, we present SEANet (Software-dEfined Acoustic Networking), a modular, evolving software-defined framework for UAN devices that offers the necessary flexibility to adapt and satisfy different application and system requirements through a well-defined set of functionalities at the physical, data-link, network, and application layers of the networking protocol stack. SEANet is based on a structured modular architecture that enables real-time reconfiguration at different layers, provides a flexible platform for the deployment of new protocol designs and enhancements, and ensures software portability for platform independence. Moreover, we present a prototype of a low-cost, fully reconfigurable underwater sensing platform that implements the SEANet framework, and discuss performance evaluation results from water tank tests.

...read moreread less

Proceedings Article•DOI•

DeSC: decoupled supply-compute communication management for heterogeneous architectures

[...]

Tae Jun Ham¹, Juan L. Aragón², Margaret Martonosi¹•Institutions (2)

Princeton University¹, University of Murcia²

05 Dec 2015

TL;DR: This work proposes Decoupled Supply-Compute (DeSC) as a way to attack memory bottlenecks automatically, while maintaining good portability and low complexity, and updates and expands onDecoupled Access Execute approaches with increased specialization and automatic compiler support.

...read moreread less

Abstract: Today's computers employ significant heterogeneity to meet performance targets at manageable power. In adopting increased compute specialization, however, the relative amount of time spent on memory or communication latency has increased. System and software optimizations for memory and communication often come at the costs of increased complexity and reduced portability. We propose Decoupled Supply-Compute (DeSC) as a way to attack memory bottlenecks automatically, while maintaining good portability and low complexity. Drawing from Decoupled Access Execute (DAE) approaches, our work updates and expands on these techniques with increased specialization and automatic compiler support. Across the evaluated workloads, DeSC offers an average of 2.04× speedup over baseline (on homogeneous CMPs) and 1.56× speedup when a DeSC data supplier feeds data to a hardware accelerator. Achieving performance very close to what a perfect cache hierarchy would offer, DeSC offers the performance gains of specialized communication acceleration while maintaining useful generality across platforms.

...read moreread less

Proceedings Article•DOI•

Khronos SYCL for OpenCL: a tutorial

[...]

Ronan Keryell¹, Ruyman Reyes, Lee Howes²•Institutions (2)

Advanced Micro Devices¹, Qualcomm²

12 May 2015

TL;DR: This tutorial will introduce the concepts behind OpenCL SYCL, present an implementation of SYCL targeting OpenCL devices with SPIR based on Clang/LLVM and an open source CPU-only implementation based on C++1z, Boost and OpenMP.

...read moreread less

Abstract: SYCL ([sikə l] as in sickle) is a royalty-free, cross-platform C++ abstraction layer that builds on the underlying concepts, portability and efficiency of OpenCL, while adding the ease-of-use and flexibility of modern C++11. For example, SYCL enables single source development where C++ template functions can contain both host and device code to construct complex algorithms that use OpenCL acceleration, and then re-use them throughout their source code on different types of data.In this tutorial we will introduce the concepts behind OpenCL SYCL, present an implementation of SYCL targeting OpenCL devices with SPIR based on Clang/LLVM and an open source CPU-only implementation based on C++1z, Boost and OpenMP.Attendees of the last session are encouraged to install the open-source CPU-only implementation of SYCL and code along on laptop/tablet.

...read moreread less

Proceedings Article•DOI•

Machine Learning Based Auto-Tuning for Enhanced OpenCL Performance Portability

[...]

Thomas L. Falch¹, Anne C. Elster¹•Institutions (1)

Norwegian University of Science and Technology¹

25 May 2015

TL;DR: This paper uses machine learning-based auto-tuning to address poor performance portability in heterogeneous computing, and builds an artificial neural network based model that achieves a mean relative error as low as 6.1%, and is able to find configurations as little as 1.3% worse than the global minimum.

...read moreread less

Abstract: Heterogeneous computing, which combines devices with different architectures, is rising in popularity, and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programing such systems, and offers functional portability. It does, however, suffer from poor performance portability, code tuned for one device must be re-tuned to achieve good performance on another device. In this paper, we use machine learning-based auto-tuning to address this problem. Benchmarks are run on a random subset of the entire tuning parameter configuration space, and the results are used to build an artificial neural network based model. The model can then be used to find interesting parts of the parameter space for further search. We evaluate our method with different benchmarks, on several devices, including an Intel i7 3770 CPU, an Nvidia K40 GPU and an AMD Radeon HD 7970 GPU. Our model achieves a mean relative error as low as 6.1%, and is able to find configurations as little as 1.3% worse than the global minimum.

...read moreread less

Journal Article•DOI•

Design criteria for visualization of energy consumption: a systematic literature review

[...]

Latha Karthigaa Murugesan¹, Rashina Hoda¹, Zoran Salcic¹•Institutions (1)

University of Auckland¹

01 Nov 2015-Sustainable Cities and Society

TL;DR: Criteria for visualization, which include information displayed in the visualization, modes of visualization, and visualization techniques, and non-functional criteria, provide clear guidelines based on research evidence for software engineers and researchers designing visualizations of energy consumption for end-users.

...read moreread less

Proceedings Article•DOI•

ml.lib: Robust, Cross-platform, Open-source Machine Learning for Max and Pure Data

[...]

Jamie Bullock, Ali Momeni¹•Institutions (1)

Carnegie Mellon University¹

30 May 2015

TL;DR: The ml.lib project as discussed by the authors is a set of open-source tools designed for employing a wide range of machine learning techniques within two popular real-time programming environments, namely Max and Pure Data.

...read moreread less

Abstract: This paper documents the development of ml.lib: a set of open-source tools designed for employing a wide range of machine learning techniques within two popular real-time programming environments, namely Max and Pure Data. ml.lib is a cross-platform, lightweight wrapper around Nick Gillian's Gesture Recognition Toolkit, a C++ library that includes a wide range of data processing and machine learning techniques. ml.lib adapts these techniques for real-time use within popular data-flow IDEs, allowing instrument designers and performers to integrate robust learning, classification and mapping approaches within their existing workflows. ml.lib has been carefully de-signed to allow users to experiment with and incorporate ma-chine learning techniques within an interactive arts context with minimal prior knowledge. A simple, logical and consistent, scalable interface has been provided across over sixteen exter-nals in order to maximize learnability and discoverability. A focus on portability and maintainability has enabled ml.lib to support a range of computing architectures - including ARM - and operating systems such as Mac OS, GNU/Linux and Win-dows, making it the most comprehensive machine learning implementation available for Max and Pure Data.

...read moreread less

Journal Article•DOI•

BigDataScript: a scripting language for data pipelines

[...]

Pablo Cingolani¹, Robert Sladek¹, Mathieu Blanchette¹•Institutions (1)

McGill University¹

01 Jan 2015-Bioinformatics

TL;DR: By abstracting pipeline concepts at programming language level, BDS simplifies implementation, execution and management of complex bioinformatics pipelines, resulting in reduced development and debugging cycles as well as cleaner code.

...read moreread less

Abstract: Motivation: The analysis of large biological datasets often requires complex processing pipelines that run for a long time on large computational infrastructures We designed and implemented a simple script-like programming language with a clean and minimalist syntax to develop and manage pipeline execution and provide robustness to various types of software and hardware failures as well as portability Results: We introduce the BigDataScript (BDS) programming language for data processing pipelines, which improves abstraction from hardware resources and assists with robustness Hardware abstraction allows BDS pipelines to run without modification on a wide range of computer architectures, from a small laptop to multi-core servers, server farms, clusters and clouds BDS achieves robustness by incorporating the concepts of absolute serialization and lazy processing, thus allowing pipelines to recover from errors By abstracting pipeline concepts at programming language level, BDS simplifies implementation, execution and management of complex bioinformatics pipelines, resulting in reduced development and debugging cycles as well as cleaner code Availability and implementation: BigDataScript is available under open-source license at http://pcingolagithubio/BigDataScript Contact: mocliamg@inalogniceolbap

...read moreread less

Book•

Mobile Cloud Computing: Architectures, Algorithms and Applications

[...]

Debashis De

24 Dec 2015

TL;DR: The book explains how to integrate MCC with vehicular networks, compares economic models, and explores the application of MCC to mobile learning, vehicle monitoring, digital forensic analysis, health monitoring, and other areas.

...read moreread less

Abstract: Minimize Power Consumption and Enhance User Experience Essential for high-speed fifth-generation mobile networks, mobile cloud computing (MCC) integrates the power of cloud data centers with the portability of mobile computing devices. Mobile Cloud Computing: Architectures, Algorithms and Applications covers the latest technological and architectural advances in MCC. It also shows how MCC is used in health monitoring, gaming, learning, and commerce. The book examines computation within a mobile device; the evolution, architecture, and applications of cloud computing; the integration of mobile computing and cloud computing; offloading strategies that address constraints such as poor battery life; and green technologies to optimize mobile power consumption. It also presents various resource allocation schemes of MCC, the architecture and applications of sensor MCC, the new concept of mobile social cloud, security and privacy issues in MCC, and different types of trust in MCC. In addition, the book explains how to integrate MCC with vehicular networks, compares economic models, and explores the application of MCC to mobile learning, vehicle monitoring, digital forensic analysis, health monitoring, and other areas. The book concludes with a discussion of possible solutions to challenges such as energy efficiency, latency minimization, efficient resource management, billing, and security.

...read moreread less

Journal Article•DOI•

IT Service Platforms: Their Value Creation Model and the Impact of their Level of Openness on their Adoption☆

[...]

Selam Abrham Gebregiorgis, Jörn Altmann

01 Jan 2015-Procedia Computer Science

TL;DR: The results show that the level of openness plays a major role in the adoption of IT service platforms of emerging service providers, and this predicts that an IT service platform gets more attractive, if it opens up.

...read moreread less

Proceedings Article•DOI•

Automatic data placement into GPU on-chip memory resources

[...]

Chao Li¹, Yi Yang, Zhen Lin¹, Huiyang Zhou¹•Institutions (1)

North Carolina State University¹

07 Feb 2015

TL;DR: This paper focuses on programs that have already been reasonably optimized either manually by programmers or automatically by compiler tools and proposed compiler algorithms refine these programs by revising data placement across different types of GPU on-chip resources to achieve both performance enhancement and performance portability.

...read moreread less

Abstract: Although graphics processing units (GPUs) rely on thread-level parallelism to hide long off-chip memory access latency, judicious utilization of on-chip memory resources, including register files, shared memory, and data caches, is critical to application performance. However, explicitly managing GPU on-chip memory resources is a non-trivial task for application developers. More importantly, as on-chip memory resources vary among different GPU generations, performance portability has become a daunting challenge. In this paper, we tackle this problem with compiler-driven automatic data placement. We focus on programs that have already been reasonably optimized either manually by programmers or automatically by compiler tools. Our proposed compiler algorithms refine these programs by revising data placement across different types of GPU on-chip resources to achieve both performance enhancement and performance portability. Among 12 benchmarks in our study, our proposed compiler algorithm improves the performance by 1.76× on average on Nvidia GTX480, and by 1.61× on average on GTX680.

...read moreread less

Proceedings Article•DOI•

Design of semantic information broker for localized computing environments in the internet of things

[...]

Ivan V. Galov¹, Aleksandr A. Lomov¹, Dmitry Korzun¹•Institutions (1)

Petrozavodsk State University¹

20 Apr 2015

TL;DR: This paper proposes a renewed SIB design with increased extensibility, dependability, and portability, a step towards an efficient open interoperability platform for the smart space application development.

...read moreread less

Abstract: Emerging communication technologies of the Internet of Things (IoT) make all the devices of a spatial-limited physical computing environment locally interconnected as well as connected to the Internet. Software agents running on devices make the latter “smart objects” that are visible in our daily lives as real participating entities. Based on the M3 architecture for smart spaces, we consider the problem of creating a smart space deploying a Semantic Information Broker (SIB) in a localized IoTenvironment. SIB supports agent interaction in the smart space via sharing and self-generating information and its semantics. This paper proposes a renewed SIB design with increased extensibility, dependability, and portability. The research done is a step towards an efficient open interoperability platform for the smart space application development.

...read moreread less

Journal Article•DOI•

Accelerated application development

[...]

Wayne Joubert¹, Rick Archibald¹, Mark Berrill¹, W. Michael Brown¹, Markus Eisenbach¹, Ray Grout², Jeff Larkin³, John Levesque⁴, Bronson Messer¹, Matt Norman¹, Bobby Philip¹, Ramanan Sankaran¹, Arnold N. Tharrington¹, John A. Turner¹ - Show less +10 more•Institutions (4)

Oak Ridge National Laboratory¹, National Renewable Energy Laboratory², Nvidia³, Cray⁴

01 Aug 2015-Computers & Electrical Engineering

TL;DR: Experiences porting applications to the Titan system, the first multi-petaflop system based on accelerator hardware, and how users are currently making use of computational accelerators on Titan are discussed.

...read moreread less

Journal Article•DOI•

Facial emotion recognition system for autistic children: a feasible study based on FPGA implementation.

[...]

K. G. Smitha¹, A. P. Vinod¹•Institutions (1)

Nanyang Technological University¹

04 Aug 2015-Medical & Biological Engineering & Computing

TL;DR: A detailed study of the implementation of serial and parallel implementation of PCA is presented in order to identify the most feasible method for realization of a portable emotion detector for autistic children.

...read moreread less

Abstract: Children with autism spectrum disorder have difficulty in understanding the emotional and mental states from the facial expressions of the people they interact. The inability to understand other people's emotions will hinder their interpersonal communication. Though many facial emotion recognition algorithms have been proposed in the literature, they are mainly intended for processing by a personal computer, which limits their usability in on-the-move applications where portability is desired. The portability of the system will ensure ease of use and real-time emotion recognition and that will aid for immediate feedback while communicating with caretakers. Principal component analysis (PCA) has been identified as the least complex feature extraction algorithm to be implemented in hardware. In this paper, we present a detailed study of the implementation of serial and parallel implementation of PCA in order to identify the most feasible method for realization of a portable emotion detector for autistic children. The proposed emotion recognizer architectures are implemented on Virtex 7 XC7VX330T FFG1761-3 FPGA. We achieved 82.3% detection accuracy for a word length of 8 bits.

...read moreread less

Journal Article•DOI•

OpenCL performance evaluation on modern multicore CPUs

[...]

Joo Hwan Lee¹, Nimit Nigania¹, Hyesoon Kim¹, Kaushik Patel¹, Hyojong Kim¹ - Show less +1 more•Institutions (1)

Georgia Institute of Technology¹

01 Jan 2015-Scientific Programming

TL;DR: This paper evaluates the performance of OpenCL programs on out-of-order multicore CPUs from the architectural perspective, comparing OpenCL to conventional parallel programming models for CPUs.

...read moreread less

Abstract: Utilizing heterogeneous platforms for computation has become a general trend, making the portability issue important. OpenCL (Open Computing Language) serves this purpose by enabling portable execution on heterogeneous architectures. However, unpredictable performance variation on different platforms has become a burden for programmers who write OpenCL applications. This is especially true for conventional multicore CPUs, since the performance of general OpenCL applications on CPUs lags behind the performance of their counterparts written in the conventional parallel programming model for CPUs. In this paper, we evaluate the performance of OpenCL applications on out-of-order multicore CPUs from the architectural perspective. We evaluate OpenCL applications on various aspects, including API overhead, scheduling overhead, instruction-level parallelism, address space, data location, data locality, and vectorization, comparing OpenCL to conventional parallel programming models for CPUs. Our evaluation indicates unique performance characteristics of OpenCL applications and also provides insight into the optimization metrics for better performance on CPUs.

...read moreread less

Book•

Wireless Networked Music Performance

[...]

Leonardo Gabrielli¹, Stefano Squartini•Institutions (1)

Marche Polytechnic University¹

29 Dec 2015

TL;DR: A specific approach and framework has been developed by the authors which is reported in this chapter and compared to other meaningful approaches and technical achievements from other authors in wireless and wired NMP.

...read moreread less

Abstract: Wireless NMP has very few examples in the literature, if none (depending on the definition the reader adopts for NMP). This chapter reports advancements and developments in wireless NMP. The challenges posed by wireless NMP and the opportunities it offers are different from those seen in wired remote NMP. For this reason, a specific approach and framework has been developed by the authors which is reported in this chapter and compared to other meaningful approaches and technical achievements from other authors in wireless and wired NMP. Most of the contributions are by the authors and colleagues. The rationale and goals of the authors’ project, named WeMUST, are described and its technical achievements later reported. The project also targets portability and ease of use in wireless NMP. Embedded platforms are, thus, employed which are power-autonomous and provide some DSP capabilities. They adopt connection automation tools based on custom service discovery mechanisms based on existing networking technologies. The software used and related parameters are described and motivated. Finally, issues related to outdoor use are reported and technical choices to overcome these are described.

...read moreread less

Collapse