Proceedings Article

Implicitly parallel programming models for thousand-core microprocessors

TLDR
It is argued that implicitly parallel programming models are critical for addressing the software development crises and software scalability challenges for many-core microprocessors.
Abstract
This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, express their parallel algorithms by asserting high-level properties on top of a traditional sequential programming language, and rely on parallelizing compilers and hardware support to perform parallel execution under the hood. In such a model, compilers and related tools require much more advanced program analysis capabilities and programmer assertions than are currently available, so that a comprehensive understanding of the input program's concurrency can be derived. This understanding then drives automatic or interactive parallel code generation tools for a diverse set of parallel hardware organizations. The chip-level architecture and hardware should maintain parallel execution state in such a way that a strictly sequential execution state can always be derived for the purpose of verifying and debugging the program. We argue that implicitly parallel programming models are critical for addressing the software development crises and software scalability challenges for many-core microprocessors.
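
As a rough illustration of the model the abstract describes, the sketch below keeps the program sequential and lets the programmer assert one high-level property: that loop iterations are independent. Everything named here is a hypothetical stand-in rather than anything from the paper. `independent_iterations` plays the role of a programmer assertion, and `run_loop` stands in for the parallelizing compiler and runtime, which is free to run iterations concurrently only when the assertion is present. The results are returned in input order, so they can be checked against a plain sequential run, loosely mirroring the requirement that a sequential execution state can always be derived.

```python
from concurrent.futures import ThreadPoolExecutor


def independent_iterations(body):
    """Hypothetical programmer assertion: calls to `body` on different
    items share no mutable state, so they may run in any order."""
    body.asserted_independent = True
    return body


def run_loop(body, items, workers=4):
    """Hypothetical stand-in for the compiler/runtime.

    Runs the loop in parallel only if the assertion is present;
    otherwise falls back to ordinary sequential execution.
    """
    if getattr(body, "asserted_independent", False):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Executor.map returns results in input order, so the output
            # matches what the sequential loop would have produced.
            return list(pool.map(body, items))
    return [body(x) for x in items]


@independent_iterations
def scale(x):
    # Pure function of its argument: the assertion above holds.
    return 2 * x + 1


def sequential_reference(body, items):
    """Plain sequential loop, used as the reference for checking."""
    return [body(x) for x in items]


data = list(range(8))
parallel_result = run_loop(scale, data)
# The parallel run must be indistinguishable from the sequential one.
assert parallel_result == sequential_reference(scale, data)
```

The source code reads as an ordinary sequential loop. Parallelism comes only from the asserted property, and a loop body without the assertion simply runs sequentially.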
