
Showing papers on "Software portability published in 2011"


Patent
26 Jul 2011
TL;DR: In this paper, the authors present a program directory database, compiled automatically from information reported by network nodes that watch and identify content traffic passing into (and/or out of) networked computers.
Abstract: The present technology concerns cell phones and other portable devices, and more particularly concerns use of such devices in connection with media content (electronic and physical) and with other systems (e.g., televisions, digital video recorders, and electronic program directories). Some aspects of the technology allow users to easily transfer displayed content from cell phone screens onto television screens for easier viewing, or vice versa for content portability. Others enable users to participate interactively in entertainment content, such as by submitting plot directions, audio input, character names, etc., yielding more engaging, immersive user experiences. Still other aspects of the technology involve a program directory database, compiled automatically from information reported by network nodes that watch and identify content traffic passing into (and/or out of) networked computers. By identifying content resident at a number of different repositories (e.g., web sites, TV networks, P2P systems, etc.), such a directory allows cell phone users to identify the diversity of sources from which desired content can be obtained—some available on a scheduled basis, others available on demand. Depending on the application, the directory information may be transparent to the user—serving to identify sources for desired content, from which application software can pick for content downloading, based, e.g., on context and stored profile data. A great number of other features and arrangements are also detailed.

557 citations


Proceedings ArticleDOI
13 Sep 2011
TL;DR: This paper presents a comprehensive performance comparison between CUDA and OpenCL, and concludes that OpenCL's portability does not fundamentally affect its performance, and that OpenCL can be a good alternative to CUDA.
Abstract: This paper presents a comprehensive performance comparison between CUDA and OpenCL. We have selected 16 benchmarks ranging from synthetic applications to real-world ones. We make an extensive analysis of the performance gaps, taking into account programming models, optimization strategies, architectural details, and underlying compilers. Our results show that, for most applications, CUDA performs at most 30% better than OpenCL. We also show that this difference is due to unfair comparisons: in fact, OpenCL can achieve performance similar to CUDA under a fair comparison. We therefore define a fair comparison of applications written in the two models and provide guidelines for further analyses. We also investigate OpenCL's portability by running the benchmarks on other prevailing platforms with minor modifications. Overall, we conclude that OpenCL's portability does not fundamentally affect its performance, and OpenCL can be a good alternative to CUDA.
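
As a concrete illustration of the source-level portability being measured here, the following minimal OpenCL host program (a sketch, assuming an OpenCL 1.1+ runtime and headers are installed) compiles the same vector-add kernel at run time for whatever device is present; per-device tuning is exactly what the paper's fair-comparison methodology controls for.

```cpp
// Minimal sketch, assuming an OpenCL 1.1+ runtime and headers are installed.
// The same kernel source is JIT-compiled for whatever device is found, which
// is the source-level portability the paper measures.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

static const char* kSrc =
    "__kernel void vadd(__global const float* a, __global const float* b,\n"
    "                   __global float* c) {\n"
    "  size_t i = get_global_id(0);\n"
    "  c[i] = a[i] + b[i];\n"
    "}\n";

int main() {
  size_t n = 1 << 20;
  std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n);

  cl_platform_id plat; clGetPlatformIDs(1, &plat, nullptr);
  cl_device_id dev;    clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, nullptr);
  cl_context ctx = clCreateContext(nullptr, 1, &dev, nullptr, nullptr, nullptr);
  cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, nullptr);

  cl_program prog = clCreateProgramWithSource(ctx, 1, &kSrc, nullptr, nullptr);
  clBuildProgram(prog, 1, &dev, "", nullptr, nullptr);   // device-specific JIT step
  cl_kernel k = clCreateKernel(prog, "vadd", nullptr);

  cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                             n * sizeof(float), a.data(), nullptr);
  cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                             n * sizeof(float), b.data(), nullptr);
  cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, n * sizeof(float), nullptr, nullptr);
  clSetKernelArg(k, 0, sizeof(da), &da);
  clSetKernelArg(k, 1, sizeof(db), &db);
  clSetKernelArg(k, 2, sizeof(dc), &dc);

  // Letting the runtime pick the work-group size keeps the host code portable;
  // tuning it per device is where CUDA-vs-OpenCL comparisons become "unfair".
  clEnqueueNDRangeKernel(q, k, 1, nullptr, &n, nullptr, 0, nullptr, nullptr);
  clEnqueueReadBuffer(q, dc, CL_TRUE, 0, n * sizeof(float), c.data(), 0, nullptr, nullptr);
  std::printf("c[0] = %.1f\n", c[0]);   // expect 3.0
}
```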

312 citations


Proceedings ArticleDOI
12 Nov 2011
TL;DR: Liszt is presented, a domain-specific language for constructing mesh-based PDE solvers; its language statements for interacting with an unstructured mesh and storing data at its elements enable the compiler to expose the parallelism, locality, and synchronization of Liszt programs.
Abstract: Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, but providing this portability for general programs is a hard problem. By restricting the class of programs considered, we can make this portability feasible. We present Liszt, a domain-specific language for constructing mesh-based PDE solvers. We introduce language statements for interacting with an unstructured mesh, and storing data at its elements. Program analysis of these statements enables our compiler to expose the parallelism, locality, and synchronization of Liszt programs. Using this analysis, we generate applications for multiple platforms: a cluster, an SMP, and a GPU. This approach allows Liszt applications to perform within 12% of hand-written C++, scale to large clusters, and experience order-of-magnitude speedups on GPUs.
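
Liszt itself is a Scala-based DSL, and the snippet below does not use its syntax; it is only a hedged C++ sketch of the kind of per-edge gather/scatter loop such a language abstracts, to show why analyzable mesh statements expose both the parallelism and the writes that need synchronization.

```cpp
// Illustrative sketch only, not Liszt syntax: the kind of per-edge gather/
// scatter loop a mesh DSL abstracts. Because every write goes through the
// edge's endpoints, a compiler that sees these statements can derive both
// the available parallelism and the synchronization the scatter needs.
#include <vector>

struct Edge { int v0, v1; };   // hypothetical unstructured-mesh connectivity

void diffuse(const std::vector<Edge>& edges,
             const std::vector<double>& phi,   // field stored at vertices
             std::vector<double>& flux) {
  for (const Edge& e : edges) {         // candidate parallel loop over elements
    double f = phi[e.v1] - phi[e.v0];   // gather from neighboring vertices
    flux[e.v0] += f;                    // scatter: conflicting writes are what
    flux[e.v1] -= f;                    // the DSL must detect and synchronize
  }
}
```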

236 citations


Proceedings ArticleDOI
06 Nov 2011
TL;DR: Experimental results and analysis show that the OpenCL version has different characteristics from the OpenMP version on multicore CPUs, and that its performance varies across OpenCL compute devices.
Abstract: Heterogeneous parallel computing platforms, which are composed of different processors (e.g., CPUs, GPUs, FPGAs, and DSPs), are widening their user base in all computing domains. With this trend, parallel programming models need to achieve portability across different processors as well as high performance with reasonable programming effort. OpenCL (Open Computing Language) is an open standard and emerging parallel programming model to write parallel applications for such heterogeneous platforms. In this paper, we characterize the performance of an OpenCL implementation of the NAS Parallel Benchmark suite (NPB) on a heterogeneous parallel platform that consists of general-purpose CPUs and a GPU. We believe that understanding the performance characteristics of conventional workloads, such as the NPB, with an emerging programming model (i.e., OpenCL) is important for developers and researchers to adopt the programming model. We also compare the performance of the NPB in OpenCL to that of the OpenMP version. We describe the process of implementing the NPB in OpenCL and optimizations applied in our implementation. Experimental results and analysis show that the OpenCL version has different characteristics from the OpenMP version on multicore CPUs and exhibits different performance characteristics depending on different OpenCL compute devices. The results also indicate that the application needs to be rewritten or re-optimized for better performance on a different compute device although OpenCL provides source-code portability.
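
To illustrate the "re-optimized per device" conclusion, here is a hedged C++ sketch (the policy is illustrative, not taken from the paper) of how an OpenCL host might re-choose one tuning knob, the work-group size, per device type.

```cpp
// Hedged sketch; the policy below is illustrative, not taken from the paper.
// The same kernel source runs everywhere, but a knob like the work-group size
// is typically re-chosen per OpenCL device.
#include <CL/cl.h>

size_t pick_local_size(cl_device_id dev) {
  cl_device_type type = 0;
  size_t max_wg = 1;
  clGetDeviceInfo(dev, CL_DEVICE_TYPE, sizeof(type), &type, nullptr);
  clGetDeviceInfo(dev, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(max_wg), &max_wg, nullptr);
  if (type & CL_DEVICE_TYPE_GPU)        // wide groups hide memory latency on GPUs
    return max_wg < 256 ? max_wg : 256;
  return 1;                             // on CPUs, one work-item per core often wins
}
```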

193 citations


Proceedings ArticleDOI
27 Feb 2011
TL;DR: A new FPGA memory architecture called Connected RAM (CoRAM) is proposed to serve as a portable bridge between distributed computation kernels and external memory interfaces, improving performance and efficiency while also improving an application's portability and scalability.
Abstract: FPGAs have been used in many applications to achieve orders-of-magnitude improvement in absolute performance and energy efficiency relative to conventional microprocessors. Despite their promise in both processing performance and efficiency, FPGAs have not yet gained widespread acceptance as mainstream computing devices. A fundamental obstacle to FPGA-based computing today is the FPGA's lack of a common, scalable memory architecture. When developing applications for FPGAs, designers are often directly responsible for crafting the application-specific infrastructure logic that manages and transports data to and from the processing kernels. This infrastructure not only increases design time and effort but will frequently lock a design to a particular FPGA product line, hindering scalability and portability. We propose a new FPGA memory architecture called Connected RAM (CoRAM) to serve as a portable bridge between the distributed computation kernels and the external memory interfaces. In addition to improving performance and efficiency, the CoRAM architecture provides a virtualized memory environment as seen by the hardware kernels to simplify development and to improve an application's portability and scalability.

159 citations


01 Jan 2011
TL;DR: A snapshot of new concepts and approaches to interoperability between clouds is provided, followed by a proposed classification, and a new approach to providing cloud portability is revealed.
Abstract: The greatest challenge beyond trust and security for the long-term adoption of cloud computing is the interoperability between clouds. In the context of worldwide activity against vendor lock-in and the lack of integration between cloud computing services, keeping track of the new concepts and approaches is itself a challenge. We therefore provide in this paper a snapshot of these concepts and approaches, followed by a proposed classification. A new approach to providing cloud portability is also revealed.

132 citations


Journal ArticleDOI
TL;DR: The study shows how the performance of machine learning taggers is degraded when they are ported across clinical documents from different sources, and shows that BioTagger-GM can be easily extended to detect clinical concept mentions with good performance.

125 citations


Book ChapterDOI
26 Oct 2011
TL;DR: In this article, a snapshot of the concepts and approaches to cloud interoperability is provided, followed by a proposed classification, and a new approach to providing cloud portability is revealed.
Abstract: The greatest challenge beyond trust and security for the long-term adoption of cloud computing is the interoperability between clouds. In the context of worldwide activity against vendor lock-in and the lack of integration between cloud computing services, keeping track of the new concepts and approaches is itself a challenge. We therefore provide in this paper a snapshot of these concepts and approaches, followed by a proposed classification. A new approach to providing cloud portability is also revealed.

115 citations


Proceedings Article
01 Jan 2011
TL;DR: This paper focuses on defining a reference model for cloud computing and presents a meta-model that shows the main cloud vocabulary and design elements, the set of configuration rules, and the semantic interpretation.
Abstract: Cloud Computing is a paradigm shift that involves dynamic provisioning of shared computing resources on demand. It is a pay-as-you-go model that offers computing resources as a service in an attempt to reduce IT capital and operating expenditures. The problem is that current software architectures lack elements needed to address elasticity, virtualization and billing. These elements are needed in the design of cloud applications. Moreover, there is no generic cloud software architecture for designing and building cloud applications. To further complicate the problem, each platform provider has different standards that influence the way applications are written. This ties cloud users to a particular provider. This paper focuses on defining a reference model for cloud computing; more particularly, it presents a meta-model that shows the main cloud vocabulary and design elements, the set of configuration rules, and the semantic interpretation. It is always important to understand the abstract architecture of a system first, and then tackle platform-specific issues. This separation of concerns allows for better maintainability and facilitates application portability.

97 citations


Proceedings ArticleDOI
Amir Hormati, Mehrzad Samadi, Mark Woh, Trevor Mudge, Scott Mahlke
05 Mar 2011
TL;DR: Sponge alleviates the problems associated with current GPU programming methods by providing portability across different generations of GPUs and CPUs, and a better abstraction of the hardware details, such as the memory hierarchy and threading model.
Abstract: Graphics processing units (GPUs) provide a low cost platform for accelerating high performance computations. The introduction of new programming languages, such as CUDA and OpenCL, makes GPU programming attractive to a wide variety of programmers. However, programming GPUs is still a cumbersome task for two primary reasons: tedious performance optimizations and lack of portability. First, optimizing an algorithm for a specific GPU is a time-consuming task that requires a thorough understanding of both the algorithm and the underlying hardware. Unoptimized CUDA programs typically only achieve a small fraction of the peak GPU performance. Second, GPU code lacks efficient portability as code written for one GPU can be inefficient when executed on another. Moving code from one GPU to another while maintaining the desired performance is a non-trivial task often requiring significant modifications to account for the hardware differences. In this work, we propose Sponge, a compilation framework for GPUs using synchronous data flow streaming languages. Sponge is capable of performing a wide variety of optimizations to generate efficient code for graphics engines. Sponge alleviates the problems associated with current GPU programming methods by providing portability across different generations of GPUs and CPUs, and a better abstraction of the hardware details, such as the memory hierarchy and threading model. Using streaming, we provide a write-once software paradigm and rely on the compiler to automatically create optimized CUDA code for a wide variety of GPU targets. Sponge's compiler optimizations improve the performance of the baseline CUDA implementations by an average of 3.2x.
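
Sponge consumes synchronous dataflow streaming programs rather than C++, so the following is only a hypothetical analogue: a filter with static pop/push rates, the property that lets such a compiler re-map firings onto different GPU or CPU targets without source changes.

```cpp
// Hypothetical C++ analogue, not Sponge's input language: a synchronous
// dataflow filter that pops a fixed number of items and pushes a fixed number
// per firing. These static rates are what allow a compiler like Sponge to
// re-map firings onto different GPU generations or CPUs without source edits.
#include <cstddef>
#include <vector>

struct LowPass {                        // each firing: pop 4 samples, push 1
  static constexpr int pop = 4, push = 1;
  void fire(const float* in, float* out) const {
    out[0] = 0.25f * (in[0] + in[1] + in[2] + in[3]);
  }
};

template <typename Filter>
void run(const std::vector<float>& in, std::vector<float>& out) {
  Filter f;
  std::size_t firings = in.size() / Filter::pop;  // schedule known statically
  out.resize(firings * Filter::push);
  for (std::size_t i = 0; i < firings; ++i)       // independent firings: each
    f.fire(&in[i * Filter::pop],                  // could become a GPU thread
           &out[i * Filter::push]);
}
```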

96 citations


Proceedings ArticleDOI
25 May 2011
TL;DR: A comprehensive and detailed overview and a comparison between the most recent and popular commercial and open-source robotic software for simulation and interfacing with real robots are presented.
Abstract: Simulators play an important role in robotics research as tools for testing the efficiency, safety, and robustness of new algorithms. This is of particular importance in scenarios that require robots to closely interact with humans, e.g., in medical robotics, and in assistive environments. Despite the increasing number of commercial and open-source robotic simulation tools, to the best of our knowledge, no comprehensive up-to-date survey paper has reviewed and compared their features. This survey paper presents a comprehensive and detailed overview and a comparison between the most recent and popular commercial and open-source robotic software for simulation and interfacing with real robots. A case study is presented, showing the versatility of porting the control code from a simulation to a real robot. Finally, a detailed step-by-step documentation on software installation and usage has been made available publicly on the Internet, together with downloadable code examples.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the price response of wireless carriers to the introduction of number portability in the U.S. and found that wireless prices decreased in response to number portability, but not uniformly across plans.
Abstract: This paper examines the price response of wireless carriers to the introduction of number portability in the U.S. I find that wireless prices decreased in response to number portability, but not uniformly across plans. Average prices for the plans with the fewest minutes decreased by only $0.19/month (0.97%), but average prices for medium and high-volume plans decreased by $3.64/month (4.84%) and $10.29/month (6.81%), respectively. The results suggest that higher-volume users in the wireless market benefited more from the policy-induced reduction in switching costs.

Journal ArticleDOI
Verdi March, Yan Gu, Erwin Leonardi, George Goh, Markus Kirchberg, Bu-Sung Lee
TL;DR: This paper shows that rich mobile applications can be achieved through the convergence of mobile and cloud computing, and proposes the μCloud framework, which models a rich mobile application as a graph of components distributed onto mobile devices and the cloud.

Journal ArticleDOI
TL;DR: A free, operating-system-independent graphical user interface (GUI) has been developed to drive the most common simulation packages for treating both molecules and solids; it combines the Jmol molecular graphics engine, written in portable Java, with a specialized GUI encoded in HTML and JavaScript.
Abstract: The growth in complexity of quantum mechanical software packages for modelling the physicochemical properties of crystalline materials may hinder their usability by the vast majority of non-specialized users. Consequently, a free operating-system-independent graphical user interface (GUI) has been developed to drive the most common simulation packages for treating both molecules and solids. In order to maintain maximum portability and graphical efficiency, the popular molecular graphics engine Jmol, written in the portable Java language, has been combined with a specialized GUI encoded in HTML and JavaScript. This framework, called J-ICE, allows users to visualize, build and manipulate complex input or output results (derived from modelling) entirely via a web server, i.e. without the burden of installing complex packages. This solution also dramatically speeds up both the development procedure and bug fixing. Among the range of software appropriate for modelling condensed matter, the focus of J-ICE is currently only on CRYSTAL09 and VASP.

Proceedings Article
01 Jan 2011
TL;DR: This work describes a new approach for a cross-platform API that encompasses all cloud service levels, and expects that the implementation of this approach will offer a higher degree of portability and vendor independence for cloud-based applications.
Abstract: Cross-platform APIs for cloud computing are emerging from application developers' need to combine the features exposed by different cloud providers and to port code from one provider's environment to another. Today, such APIs enable the federation of clouds at the infrastructure level, which still requires some knowledge of infrastructure programming. We describe a new approach for a cross-platform API that encompasses all cloud service levels. We expect that the implementation of this approach will offer a higher degree of portability and vendor independence for cloud-based applications.

Journal ArticleDOI
TL;DR: This article outlines the PEPPHER performance-aware component model, performance prediction means, runtime system, and other aspects of the project that address efficient utilization of hybrid systems consisting of multicore CPUs with GPU-type accelerators.
Abstract: PEPPHER, a three-year European FP7 project, addresses efficient utilization of hybrid (heterogeneous) computer systems consisting of multicore CPUs with GPU-type accelerators. This article outlines the PEPPHER performance-aware component model, performance prediction means, runtime system, and other aspects of the project. A larger example demonstrates performance portability with the PEPPHER approach across hybrid systems with one to four GPUs.
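
The names below are mine, not PEPPHER's actual API; this is a hedged C++ sketch of the performance-aware component idea the article describes: one interface, several implementation variants, and a selection made from a simple performance prediction.

```cpp
// Hypothetical sketch; the names are mine, not PEPPHER's API. One component
// interface, several implementation variants, and a runtime choice driven by
// a toy performance prediction.
#include <cstddef>

struct SortVariant {
  virtual void sort(float* data, std::size_t n) = 0;
  virtual ~SortVariant() = default;
};
struct CpuSort : SortVariant { void sort(float*, std::size_t) override { /* ... */ } };
struct GpuSort : SortVariant { void sort(float*, std::size_t) override { /* ... */ } };

// Toy cost model: below some size, transfer overhead outweighs any GPU gain.
SortVariant& select_variant(std::size_t n, bool gpu_present) {
  static CpuSort cpu;
  static GpuSort gpu;
  if (gpu_present && n > 1000000) return gpu;
  return cpu;
}
```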

Proceedings ArticleDOI
22 Dec 2011
TL;DR: To enhance the portability and interoperability of learning objects, not only should cloud computing API standards be advocated by the key cloud providers, but learning resource standards should also be defined by the Open Cloud Computing Education Federation proposed in this paper.
Abstract: Cloud Computing is evolving as a key technology for sharing resources. Grid computing, distributed computing, parallel computing and virtualization technologies define the shape of a new era. Traditional distance learning systems lack reusability, portability and interoperability. This paper sees the cloud computing ecosystem as a new opportunity for designing cloud computing educational platforms where learning actors can reuse learning resources handled by cloud educational operating systems. To enhance the portability and interoperability of learning objects, not only should cloud computing API standards be advocated by the key cloud providers, but learning resource standards should also be defined by the Open Cloud Computing Education Federation proposed in this paper.

Proceedings ArticleDOI
02 Apr 2011
TL;DR: This work presents a synergistic auto-vectorizing compilation scheme in which a lightweight, target-specific online stage leverages the optimized intermediate results of an aggressive, generic offline stage across disparate SIMD architectures from different vendors, with distinct characteristics ranging from different vector sizes, memory alignment and access constraints, to special computational idioms.
Abstract: Just-in-Time (JIT) compiler technology offers portability while facilitating target- and context-specific specialization. Single-Instruction-Multiple-Data (SIMD) hardware is ubiquitous and markedly diverse, but can be difficult for JIT compilers to efficiently target due to resource and budget constraints. We present our design for a synergistic auto-vectorizing compilation scheme. The scheme is composed of an aggressive, generic offline stage coupled with a lightweight, target-specific online stage. Our method leverages the optimized intermediate results provided by the first stage across disparate SIMD architectures from different vendors, having distinct characteristics ranging from different vector sizes, memory alignment and access constraints, to special computational idioms. We demonstrate the effectiveness of our design using a set of kernels that exercise innermost loop, outer loop, as well as straight-line code vectorization, all automatically extracted by the common offline compilation stage. This results in performance comparable to that provided by specialized monolithic offline compilers. Our framework is implemented using open-source tools and standards, thereby promoting interoperability and extendibility.

Proceedings ArticleDOI
25 Jul 2011
TL;DR: This system uses embedded systems, 3G, and ZigBee technologies to overcome the drawbacks of current smart home systems, such as discrete functions, poor portability, weak updating capability, and personal-computer dependence.
Abstract: With more and more applications of the Internet of Things in many domains, it is also entering smart homes. In this paper, we propose an Internet of Things-based smart home system for home comfort, leisure and safety. This system uses embedded systems, 3G, and ZigBee technologies to overcome the drawbacks of current smart home systems, such as discrete functions, poor portability, weak updating capability, and personal-computer dependence. Moreover, the system architecture is presented, and the design of its gateway is described in detail from hardware to software.

Journal ArticleDOI
TL;DR: In this paper, the authors describe how refactoring tools can improve programmer productivity, program performance, and program portability, and describe a toolset that supports several refactorings for making programs thread-safe, threading sequential programs for throughput, and improving scalability of parallel programs.
Abstract: In the multicore era, a major programming task will be to make programs more parallel. This is tedious because it requires changing many lines of code; it's also error-prone and nontrivial because programmers need to ensure noninterference of parallel operations. Fortunately, interactive refactoring tools can help reduce the analysis and transformation burden. The author describes how refactoring tools can improve programmer productivity, program performance, and program portability. The article also describes a toolset that supports several refactorings for making programs thread-safe, threading sequential programs for throughput, and improving scalability of parallel programs.
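
The article's toolset operates on real codebases, but the flavor of a "make it thread-safe" refactoring is easy to show; the before/after below is my own minimal C++ analogue, migrating a racy shared counter to std::atomic.

```cpp
// Minimal analogue (mine, not the article's toolset) of a thread-safety
// refactoring: a shared counter is migrated from a plain int to std::atomic
// so concurrent increments no longer race.
#include <atomic>
#include <cstdio>
#include <thread>

// Before refactoring: `int hits = 0;` with `++hits;` — a data race under threads.
std::atomic<int> hits{0};                 // After: atomic read-modify-write

void worker() {
  for (int i = 0; i < 100000; ++i)
    hits.fetch_add(1, std::memory_order_relaxed);  // safe concurrent increment
}

int main() {
  std::thread a(worker), b(worker);
  a.join(); b.join();
  std::printf("%d\n", hits.load());       // always 200000; the racy version isn't
}
```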


Proceedings ArticleDOI
08 Jun 2011
TL;DR: A virtual file system specifically optimized for virtual machine image storage is proposed; based on a lazy transfer scheme coupled with object versioning, it handles snapshotting transparently in a hypervisor-independent fashion, ensuring high portability across different configurations.
Abstract: Infrastructure as a Service (IaaS) cloud computing has revolutionized the way we think of acquiring resources by introducing a simple change: allowing users to lease computational resources from the cloud provider's datacenter for a short time by deploying virtual machines (VMs) on these resources. This new model raises new challenges in the design and development of IaaS middleware. One of those challenges is the need to deploy a large number (hundreds or even thousands) of VM instances simultaneously. Once the VM instances are deployed, another challenge is to simultaneously take a snapshot of many images and transfer them to persistent storage to support management tasks, such as suspend-resume and migration. With datacenters growing rapidly and configurations becoming heterogeneous, it is important to enable efficient concurrent deployment and snapshotting that are at the same time hypervisor independent and ensure a maximum compatibility with different configurations. This paper addresses these challenges by proposing a virtual file system specifically optimized for virtual machine image storage. It is based on a lazy transfer scheme coupled with object versioning that handles snapshotting transparently in a hypervisor-independent fashion, ensuring high portability for different configurations. Large-scale experiments on hundreds of nodes demonstrate excellent performance results: speedup for concurrent VM deployments ranges from a factor of 2 up to 25, with a reduction in bandwidth utilization of as much as 90%.
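
As a hedged sketch (the data structures are mine, not the paper's), the two ideas in the abstract can be combined in a few lines: fetch image chunks lazily on first access, and make snapshots cheap by recording only the chunks written since the previous version.

```cpp
// Hedged sketch; the data structures are mine, not the paper's. Chunks are
// fetched lazily on first read, writes are copy-on-write into the currently
// open version, and a snapshot only has to seal the chunks dirtied since the
// last one.
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

struct ImageStore {
  std::map<std::pair<uint32_t, uint64_t>, std::vector<uint8_t>> versions; // (version, chunk) -> data
  std::set<uint64_t> dirty;   // chunks written since the last snapshot
  uint32_t current = 1;       // version being mutated; version 0 is the base image

  const std::vector<uint8_t>& read(uint64_t chunk) {
    for (uint32_t v = current + 1; v-- > 0;) {     // newest version of the chunk wins
      auto it = versions.find({v, chunk});
      if (it != versions.end()) return it->second;
    }
    // First touch: lazy transfer. A real system pulls the chunk from the
    // backing repository here; this stub just materializes zeros.
    return versions[{0, chunk}] = std::vector<uint8_t>(4096, 0);
  }

  void write(uint64_t chunk, std::vector<uint8_t> data) {
    versions[{current, chunk}] = std::move(data);  // copy-on-write, base stays intact
    dirty.insert(chunk);
  }

  uint32_t snapshot() {       // cost grows with |dirty|, not with the image size
    dirty.clear();
    return current++;         // seal the open version, start a new one
  }
};
```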


Journal Article
TL;DR: In this paper, the authors present a position paper exposing the concepts behind a recent proposal for an open-source application programming interface and platform for dealing with multiple cloud computing offers, which can be a step forward in the adoption of cloud computing on a larger scale than the current one.
Abstract: The current diversity of Cloud computing services, beneficial for the fast development of a new IT market, hinders the easy development, portability and interoperability of Cloud-oriented applications. Developing an application-oriented view of Cloud services, instead of the current provider-oriented one, can be a step forward in the adoption of Cloud computing on a larger scale than the current one. In this context, we present a position paper exposing the concepts behind a recent proposal for an open-source application programming interface and platform for dealing with multiple Cloud computing offers.

Journal ArticleDOI
TL;DR: This paper presents the QALL-ME Framework, a reusable architecture for building multi- and cross-lingual Question Answering (QA) systems working on structured data modelled by an ontology, and gives a running example to clarify how the framework processes questions.

Book ChapterDOI
13 Jun 2011
TL;DR: This paper presents extensions to OpenMP that provide a high-level programming model that can provide accelerated performance comparable to that of hand-coded implementations in CUDA.
Abstract: OpenMP [14] is the dominant programming model for shared-memory parallelism in C, C++ and Fortran due to its easy-to-use directive-based style, portability and broad support by compiler vendors. Compute-intensive application regions are increasingly being accelerated using devices such as GPUs and DSPs, and a programming model with similar characteristics is needed here. This paper presents extensions to OpenMP that provide such a programming model. Our results demonstrate that a high-level programming model can provide accelerated performance comparable to that of hand-coded implementations in CUDA.
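
The paper's proposed directives fed into what was later standardized as OpenMP 4.0 target offload; the sketch below uses that later-standardized flavor (the paper's exact proposed syntax may differ) to offload a SAXPY loop, with data movement expressed by map clauses.

```cpp
// Sketch in the directive flavor later standardized as OpenMP 4.0 target
// offload; the paper's exact proposed syntax may differ. The loop body is
// unchanged whether it runs on the host or on an accelerator.
#include <cstdio>

int main() {
  const int n = 1 << 20;
  static float x[1 << 20], y[1 << 20];
  for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

  // map() clauses express the host<->device data movement explicitly.
  #pragma omp target teams distribute parallel for map(to: x) map(tofrom: y)
  for (int i = 0; i < n; ++i)
    y[i] += 2.0f * x[i];

  std::printf("y[0] = %.1f\n", y[0]);   // expect 4.0
}
```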

Proceedings ArticleDOI
29 Nov 2011
TL;DR: In this article, the authors establish directives that serve as guidelines for the design and implementation or identification of a suitable cloud computing framework to build or convert a high performance application to run in the cloud.
Abstract: Over the past decade, high performance applications have embraced parallel programming and computing models. While parallel computing offers advantages such as good utilization of dedicated hardware resources, it also has several drawbacks such as poor fault-tolerance, scalability, and ability to harness available resources during run-time. The advent of cloud computing presents a viable and promising alternative to parallel computing because of its advantages in offering a distributed computing model. In this work, we establish directives that serve as guidelines for the design and implementation or identification of a suitable cloud computing framework to build or convert a high performance application to run in the cloud. We show that following these directives leads to an elastic implementation that has better scalability, run-time resource adaptability, fault tolerance, and portability across cloud computing platforms, while requiring minimal effort and intervention from the user. We illustrate this by converting an MPI implementation of replica exchange, a parallel tempering molecular dynamics application, to an elastic cloud application using the Work Queue framework that adheres to these directives. We observe better scalability and resource adaptability of this elastic application on multiple platforms, including a homogeneous cluster environment (SGE) and heterogeneous cloud computing environments such as Microsoft Azure and Amazon EC2.
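
A master program in the Work Queue style might look like the hedged sketch below, written against the CCTools C API as I understand it (the call names and the ./md_step binary are assumptions; consult the CCTools documentation). Elasticity comes from workers attaching to or leaving the queue at run time, on any platform.

```cpp
// Hedged sketch against the CCTools Work Queue C API as I understand it; the
// call names and the ./md_step binary are assumptions, so consult the CCTools
// documentation before relying on them.
extern "C" {
#include "work_queue.h"   // from CCTools
}
#include <cstdio>

int main() {
  struct work_queue* q = work_queue_create(9123);  // workers connect to this port
  for (int r = 0; r < 64; ++r) {                   // one task per replica
    char cmd[256];
    std::snprintf(cmd, sizeof cmd, "./md_step replica_%d.conf", r);
    work_queue_submit(q, work_queue_task_create(cmd));
  }
  // Elasticity: workers on SGE, EC2, or Azure can join or leave while we wait.
  while (!work_queue_empty(q)) {
    struct work_queue_task* t = work_queue_wait(q, 5 /* seconds */);
    if (t) {
      std::printf("task %d finished\n", t->taskid);
      work_queue_task_delete(t);
    }
  }
  work_queue_delete(q);
}
```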

Proceedings ArticleDOI
24 Jan 2011
TL;DR: This paper develops an approach for predicting the optimal number of threads for a given data-parallel application in the presence of external workload, and proposes an alternative cooperative model that minimizes the impact on external workload while still giving an improved average speedup.
Abstract: Much compiler-oriented work in the area of mapping parallel programs to parallel architectures has ignored the issue of external workload. Given that the majority of platforms will not be dedicated to just one task at a time, the impact of other jobs needs to be addressed. As mapping is highly dependent on the underlying machine, a technique that is easily portable across platforms is also desirable. In this paper we develop an approach for predicting the optimal number of threads for a given data-parallel application in the presence of external workload. We achieve 93.7% of the maximum speedup available, which gives an average speedup of 1.66 on 4 cores, a factor of 1.24 times better than the OpenMP compiler's default policy. We also develop an alternative cooperative model that minimizes the impact on external workload while still giving an improved average speedup. Finally, we evaluate our approach on a separate 8-core machine, giving an average 1.33 times speedup over the default policy and showing the portability of our approach.
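
The paper predicts thread counts with a trained model; the sketch below is only a simplified stand-in for the cooperative idea, shrinking the OpenMP thread count as external load rises (getloadavg is a POSIX/BSD extension, an assumption about the target platform).

```cpp
// Simplified stand-in for the paper's approach (theirs uses a trained
// predictor): pick the OpenMP thread count cooperatively by subtracting the
// current external load from the core count. getloadavg is a POSIX/BSD
// extension, an assumption about the target platform.
#include <omp.h>
#include <algorithm>
#include <cstdlib>
#include <thread>

int cooperative_thread_count() {
  double load1 = 0.0;
  getloadavg(&load1, 1);   // 1-minute load average = proxy for external workload
  int cores = static_cast<int>(std::thread::hardware_concurrency());
  return std::max(1, cores - static_cast<int>(load1 + 0.5));
}

int main() {
  omp_set_num_threads(cooperative_thread_count());
  #pragma omp parallel for
  for (int i = 0; i < 1000000; ++i) {
    /* data-parallel work */
  }
}
```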

Proceedings ArticleDOI
01 Dec 2011
TL;DR: The PERCEPT system is introduced that supports a number of unique features such as: a) Low deployment and maintenance cost; b) Scalability; and c) Portability and ease-of-use.
Abstract: In order to enhance the perception of indoor and unfamiliar environments for the blind and visually-impaired, we introduce the PERCEPT system that supports a number of unique features such as: a) Low deployment and maintenance cost; b) Scalability, i.e. we can deploy the system in very large buildings; c) An on-demand system that does not overwhelm the user, as it offers small amounts of information on demand; and d) Portability and ease-of-use, i.e., the custom handheld device carried by the user is compact and instructions are received audibly.

Journal ArticleDOI
TL;DR: This paper demonstrates financial enterprise portability, which involves moving entire application services from desktops to clouds and between different clouds, transparently to users, who can work as if on their familiar systems.
Abstract: This paper demonstrates financial enterprise portability, which involves moving entire application services from desktops to clouds and between different clouds, transparently to users, who can work as if on their familiar systems. To demonstrate portability, reviews of several financial models are studied, from which Monte Carlo Methods (MCM) and the Black-Scholes Model (BSM) are chosen. A special technique in MCM, the Least Squares Method, is used to reduce errors while performing accurate calculations. Simulations for MCM are performed on different types of clouds. Benchmark and experimental results are presented for discussion. 3D Black-Scholes models are used to explain the impacts and added value for risk analysis. Implications for banking are also discussed, as well as ways to track risks in order to improve accuracy. A conceptual cloud platform is used to explain the contributions in Financial Software as a Service (FSaaS) and the IBM Fine-Grained Security Framework. This study demonstrates the portability, speed, accuracy, and reliability of applications in the clouds, while demonstrating portability for FSaaS and the Cloud Computing Business Framework (CCBF).
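
The two models named here are standard enough to show self-contained: the sketch below (parameters are illustrative, not the paper's) prices a European call both with the closed-form Black-Scholes formula and with a Monte Carlo estimate that converges to it.

```cpp
// Self-contained sketch; parameters are illustrative, not the paper's. A
// Monte Carlo estimate of a European call converges to the closed-form
// Black-Scholes price, the two models the abstract names.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <random>

double norm_cdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

int main() {
  const double S0 = 100, K = 100, r = 0.05, sigma = 0.2, T = 1.0;

  // Closed-form Black-Scholes price of a European call.
  const double d1 = (std::log(S0 / K) + (r + 0.5 * sigma * sigma) * T)
                    / (sigma * std::sqrt(T));
  const double d2 = d1 - sigma * std::sqrt(T);
  const double bs = S0 * norm_cdf(d1) - K * std::exp(-r * T) * norm_cdf(d2);

  // Monte Carlo: sample terminal prices under geometric Brownian motion.
  std::mt19937_64 rng(42);
  std::normal_distribution<double> z(0.0, 1.0);
  const int paths = 1000000;
  double sum = 0.0;
  for (int i = 0; i < paths; ++i) {
    double ST = S0 * std::exp((r - 0.5 * sigma * sigma) * T
                              + sigma * std::sqrt(T) * z(rng));
    sum += std::max(ST - K, 0.0);
  }
  const double mc = std::exp(-r * T) * sum / paths;

  std::printf("Black-Scholes %.4f vs Monte Carlo %.4f\n", bs, mc);  // both ~10.45
}
```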