Book Chapter

Delivering Faster Results Through Parallelisation and GPU Acceleration

TL;DR: This paper presents methods for parallelising two pieces of scientific software, leveraging multiple GPUs to achieve speedups of up to thirty times.
Abstract
The rate of scientific discovery depends on the speed at which accurate results and analysis can be obtained. The use of parallel co-processors such as Graphics Processing Units (GPUs) is becoming ever more important in meeting this demand as improvements in serial data processing speed become increasingly difficult to sustain. However, parallel data processing requires more complex programming than serial processing. Here we present our methods for parallelising two pieces of scientific software, leveraging multiple GPUs to achieve a speedup of up to thirty times.
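
The abstract's point that parallel data processing demands more programming effort than serial processing can be illustrated with a minimal, hypothetical CUDA sketch (not the chapter's actual software): a one-line serial loop becomes a kernel plus explicit device memory management and a launch configuration. Spreading the work across multiple GPUs would add further partitioning and transfer logic on top of this.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical illustration only, not the software described in the chapter.
// Serial version: for (int i = 0; i < n; ++i) x[i] *= a;
// GPU version: each thread scales one element of the array.
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= a;
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *h = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i)
        h[i] = 1.0f;

    float *d;
    cudaMalloc((void **)&d, bytes);                     // allocate on the GPU
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);    // copy input to the GPU

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d, 2.0f, n);             // parallel launch
    cudaDeviceSynchronize();

    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);    // copy result back
    printf("h[0] = %.1f (expected 2.0)\n", h[0]);

    cudaFree(d);
    free(h);
    return 0;
}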


Citations
Proceedings Article

IMTeract Tool for Monitoring and Profiling HPC Systems and Applications

TL;DR: IMTeract was used for energy-usage profiling of HPC clusters running FLUENT and DL-POLY software and of a GPU cluster running different implementations of an FFT algorithm. The experimental results are encouraging and suggest that the IMTeract tool can measure the CPU, memory, disk I/O and network I/O of an application or process and report on the energy used.
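
As a rough, hypothetical illustration of the kind of per-process accounting such a profiler aggregates (this is not IMTeract's interface; its network and energy figures need other data sources such as OS counters or power meters), POSIX getrusage exposes CPU time, peak memory and block I/O counters for the current process:

#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical sketch: per-process CPU, memory and block-I/O accounting.
   Not the IMTeract tool or its API. */
int main(void)
{
    struct rusage ru;

    /* ... run the workload to be profiled here ... */

    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        perror("getrusage");
        return 1;
    }
    printf("user CPU   : %ld.%06ld s\n", (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
    printf("system CPU : %ld.%06ld s\n", (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
    printf("peak RSS   : %ld kB\n", ru.ru_maxrss);
    printf("block reads/writes: %ld / %ld\n", ru.ru_inblock, ru.ru_oublock);
    return 0;
}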
References
Journal Article

NVIDIA Tesla: A Unified Graphics and Computing Architecture

TL;DR: To enable flexible, programmable graphics and high-performance computing, NVIDIA has developed the Tesla scalable unified graphics and parallel computing architecture, which is massively multithreaded and programmable in C or via graphics APIs.
Journal Article

The GPU Computing Era

TL;DR: This paper describes the rapid evolution of GPU architectures from graphics processors to massively parallel many-core multiprocessors, recent developments in GPU computing architectures, and how the enthusiastic adoption of CPU+GPU co-processing is accelerating parallel applications.
Journal Article

Speedup versus efficiency in parallel systems

TL;DR: The tradeoff between speedup and efficiency that is inherent to a software system is investigated, along with the extent to which this tradeoff is determined by the average parallelism of the software system, as contrasted with other, more detailed characterizations; it is shown that for any software system and any number of processors, the sum of the average processor utilization and the attained fraction of the maximum possible speedup must exceed one.
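
The bound quoted in this TL;DR can be written compactly. A sketch in standard notation, assuming n processors, speedup S(n), efficiency E(n) = S(n)/n, and average parallelism A (the maximum possible speedup):

\[
E(n) + \frac{S(n)}{A} \ge 1
\]

For example, if a program with average parallelism A = 8 attains a speedup of S(16) = 6 on 16 processors, then E(16) = 6/16 = 0.375 and S(16)/A = 0.75, and their sum of 1.125 indeed exceeds one.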
Proceedings Article

Accelerator: using data parallelism to program GPUs for general-purpose uses

TL;DR: This work describes Accelerator, a system that uses data parallelism to program GPUs for general-purpose uses, and compares the performance of Accelerator versions of the benchmarks against hand-written pixel shaders.