Proceedings Article

Bohrium: A Virtual Machine Approach to Portable Parallelism

TLDR
Bohrium is a runtime system for mapping vector operations onto a number of different hardware platforms, from simple multi-core systems to clusters and GPU-enabled systems. In principle it can be used from any programming language, but for now the supported languages are limited to Python, C++, and the .NET framework.
Abstract
In this paper we introduce Bohrium, a runtime system for mapping vector operations onto a number of different hardware platforms, from simple multi-core systems to clusters and GPU-enabled systems. In order to make efficient choices, Bohrium is implemented as a virtual machine that makes runtime decisions, rather than as a statically compiled library, which is the more common approach. In principle, Bohrium can be used from any programming language, but for now the supported languages are limited to Python, C++, and the .NET framework, e.g. C# and F#. The primary success criteria are to maintain a complete abstraction from low-level details and to provide efficient code execution across different processors, both current and future. We evaluate the presented design through a setup that targets a multi-core CPU, an eight-node cluster, and a GPU, all preliminary prototypes. The evaluation includes three well-known benchmark applications, Black-Scholes, Shallow Water, and N-body, implemented in C++, Python, and C#, respectively.
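
The idea of a virtual machine that defers backend choices to runtime can be illustrated with a toy sketch. This is not Bohrium's actual bytecode, API, or scheduling policy: the names `ToyVM`, `record`, `flush`, the `ADD`/`MUL` opcodes, and the size threshold are all hypothetical, chosen only to show vector operations being recorded first and mapped to a backend later, once the operand sizes are known.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple
import operator

# Hypothetical vector "bytecode": (opcode, output name, input names).
Instruction = Tuple[str, str, Tuple[str, ...]]

# Element-wise operations supported by this toy instruction set.
OPS: Dict[str, Callable[[float, float], float]] = {
    "ADD": operator.add,
    "MUL": operator.mul,
}


def serial_backend(op, a, b):
    """Execute one element-wise instruction in a single pass."""
    return [op(x, y) for x, y in zip(a, b)]


def chunked_backend(op, a, b, chunks=4):
    """Stand-in for a parallel backend: process the vectors block by block."""
    step = max(1, -(-len(a) // chunks))  # ceiling division
    out = []
    for start in range(0, len(a), step):
        block_a = a[start:start + step]
        block_b = b[start:start + step]
        out.extend(op(x, y) for x, y in zip(block_a, block_b))
    return out


@dataclass
class ToyVM:
    threshold: int = 1000  # assumed cut-off for switching backends
    arrays: Dict[str, List[float]] = field(default_factory=dict)
    program: List[Instruction] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def record(self, opcode: str, out: str, lhs: str, rhs: str) -> None:
        """Queue an instruction without executing it."""
        self.program.append((opcode, out, (lhs, rhs)))

    def flush(self) -> None:
        """Execute queued instructions, picking a backend per instruction."""
        for opcode, out, (lhs, rhs) in self.program:
            a, b = self.arrays[lhs], self.arrays[rhs]
            # The runtime decision: operand size is only known at this point.
            backend = chunked_backend if len(a) >= self.threshold else serial_backend
            self.log.append(backend.__name__)
            self.arrays[out] = backend(OPS[opcode], a, b)
        self.program.clear()
```

A caller would fill `vm.arrays`, call `vm.record("ADD", "z", "x", "y")` for each operation, and then call `vm.flush()`. The front end never names a backend, which is the separation between language bindings and hardware targets that the abstract describes.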


Citations
Journal Article

A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors

TL;DR: This work proposes a framework for SpGEMM on GPUs and emerging CPU-GPU heterogeneous processors using the CSR format, proposes an efficient parallel insert method for long rows of the resulting matrix, and develops a heuristic-based load-balancing strategy.
Proceedings Article

Acyclic Partitioning of Large Directed Acyclic Graphs

TL;DR: This work adopts the multilevel approach with coarsening, initial partitioning, and refinement phases for acyclic partitioning of directed acyclic graphs and develops a direct k-way partitioning scheme.
Journal Article

Multilevel algorithms for acyclic partitioning of directed acyclic graphs

TL;DR: This work investigates the problem of partitioning the vertices of a directed acyclic graph into a given number of parts by minimizing the number or the total weight of the edges of the graph.
Proceedings Article

Optimizing data-intensive computations in existing libraries with split annotations

TL;DR: This work proposes split annotations (SAs), a technique that enables key data movement optimizations over unmodified library functions, and implements it in a system called Mozart. Mozart provides performance gains competitive with solutions that require rewriting libraries, and can sometimes outperform these systems by up to 2x by leveraging existing hand-optimized code.
Proceedings Article

Fusion of Parallel Array Operations

TL;DR: In this article, the problem of fusing array operations based on criteria such as shape compatibility and data reuse is formulated as a static weighted graph partitioning problem, known as the Weighted Loop Fusion problem.
References
Journal Article

The Pricing of Options and Corporate Liabilities

TL;DR: In this paper, a theoretical valuation formula for options is derived, based on the assumption that options are correctly priced in the market and it should not be possible to make sure profits by creating portfolios of long and short positions in options and their underlying stocks.
Journal Article

The NumPy Array: A Structure for Efficient Numerical Computation

TL;DR: This work shows that NumPy performance can be improved through three techniques: vectorizing calculations, avoiding copying data in memory, and minimizing operation counts.
Journal Article

A bridging model for parallel computation

TL;DR: The bulk-synchronous parallel (BSP) model is introduced as a candidate bridging model between software and hardware for parallel computation, and results are given quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.
Journal Article

Can programming be liberated from the von Neumann style?: a functional style and its algebra of programs

TL;DR: A new class of computing systems uses the functional programming style both in its programming language and in its state transition rules; these systems have semantics loosely coupled to states—only one state transition occurs per major computation.