Book Chapter DOI

Manual Parallelization Versus State-of-the-Art Parallelization Techniques: The SPEC CPU2006 as a Case Study

TLDR
In this article, the potential of automatic parallelization and vectorization of the sequential C++ applications from the SPEC CPU2006 suite is discussed, and the effects of parallelization are evaluated by profiling and execution on two representative parallel machines.
Abstract
As multiprocessors (both on-chip and off-chip), modern computer systems can automatically exploit the benefits of parallel programs, but their resources remain underutilized when executing still-prevailing sequential applications. An obvious solution is the parallelization of such applications. The first part of the chapter overviews the broad issues in parallelization. Various parallelization approaches and contemporary software and hardware tools for extracting parallelism from sequential applications are studied, and typical code patterns amenable to parallelization are identified. The second part is a case study in which the SPEC CPU2006 suite is considered as a representative collection of typical sequential applications. It discusses the possibilities and potential of automatic parallelization and vectorization of the sequential C++ applications from the CPU2006 suite. Since this potential is generally limited, the issues in manual parallelization of these applications are explored. After the previously identified patterns are applied through source-to-source code modifications, the effects of parallelization are evaluated by profiling and execution on two representative parallel machines. Finally, the presented results are carefully discussed.
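The abstract does not spell out the concrete source-to-source changes. As a hedged illustration of the kind of pattern-based modification involved, the sketch below parallelizes an independent loop and marks another loop for vectorization using OpenMP directives; function and variable names are invented for the example, and this is not presented as the chapter's actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel: each iteration writes a distinct y[i], so the loop is
// amenable to the kind of source-to-source parallelization the chapter
// describes (sketch only; OpenMP is one common way to express it).
void scale_and_add(std::vector<double>& y,
                   const std::vector<double>& x,
                   double a)
{
    const std::size_t n = y.size();

    // Thread-level parallelism across independent iterations.
    #pragma omp parallel for
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// Vectorization hint for a similar loop; compilers may auto-vectorize this
// even without the pragma, which is exactly the limitation the study probes.
void scale(std::vector<double>& y, double a)
{
    #pragma omp simd
    for (std::size_t i = 0; i < y.size(); ++i)
        y[i] *= a;
}
```

Compiled with OpenMP support (e.g., -fopenmp), the first loop is distributed across threads and the second is a candidate for SIMD execution.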


Citations
Journal Article DOI

Modelling flood events with a cumulant CO lattice Boltzmann shallow water model

TL;DR: In this article, a semiautomatic procedure based on the coupled use of a GIS subroutine and a two-dimensional hydraulic lattice Boltzmann model solving the shallow water equations is presented.
Journal Article DOI

Comparative Analysis between Selection Sort and Merge Sort Algorithms

TL;DR: It was concluded that implementing the algorithms on a machine with multiple cores in its Central Processing Unit (CPU) results in a significant performance improvement for both algorithms.
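The cited paper's implementations are not reproduced on this page. As a minimal sketch of why extra cores help merge sort in particular, the example below sorts the two halves concurrently with OpenMP tasks; the cutoff, names, and task-based approach are illustrative assumptions, not the paper's method.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Merge sort with the two recursive halves sorted concurrently as OpenMP
// tasks; selection sort has no comparable independent subproblems.
void merge_sort(std::vector<int>& a, std::size_t lo, std::size_t hi)
{
    if (hi - lo < 2048) {                       // small ranges: sort serially
        std::sort(a.begin() + lo, a.begin() + hi);
        return;
    }
    const std::size_t mid = lo + (hi - lo) / 2;

    #pragma omp task shared(a)                  // sort left half in a task
    merge_sort(a, lo, mid);
    merge_sort(a, mid, hi);                     // sort right half here
    #pragma omp taskwait                        // wait for the left half

    std::inplace_merge(a.begin() + lo, a.begin() + mid, a.begin() + hi);
}

void parallel_merge_sort(std::vector<int>& a)
{
    #pragma omp parallel                        // create the thread team once
    #pragma omp single                          // one thread spawns the tasks
    merge_sort(a, 0, a.size());
}
```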
Book Chapter DOI

Distributing and Parallelizing Non-canonical Loops

TL;DR: In this article, the authors leverage an original dependency analysis to parallelize loops regardless of their form in imperative programs, resulting in gains in execution time comparable to state-of-the-art automatic source-to-source code transformers.
Journal Article DOI

A Novel Loop Fission Technique Inspired by Implicit Computational Complexity

TL;DR: This work explores an unexpected application of Implicit Computational Complexity to parallelize loops in imperative programs by splitting a loop into multiple loops that can be run in parallel, resulting in gains in terms of execution time similar to state-of-the-art automatic parallelization tools when both are applicable.
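The paper's complexity-based splitting criterion is not detailed here; the sketch below only illustrates loop fission itself, under the assumption of statements touching disjoint arrays: one loop is split into two, which can then run in parallel (OpenMP sections are one possible way to express this; all names are invented).

```cpp
#include <cstddef>
#include <vector>

// Original fused loop: the two statements are independent of each other.
void fused(std::vector<double>& a, std::vector<double>& b)
{
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = a[i] * 2.0;   // reads and writes only a
        b[i] = b[i] + 1.0;   // reads and writes only b
    }
}

// After fission: the two resulting loops can execute concurrently.
void fissioned(std::vector<double>& a, std::vector<double>& b)
{
    #pragma omp parallel sections
    {
        #pragma omp section
        for (std::size_t i = 0; i < a.size(); ++i)
            a[i] = a[i] * 2.0;

        #pragma omp section
        for (std::size_t i = 0; i < b.size(); ++i)
            b[i] = b[i] + 1.0;
    }
}
```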
Proceedings Article DOI

Influence of loop transformations on performance and energy consumption of the multithreaded WZ factorization

TL;DR: It has been shown that for the WZ factorization, an example of an application in which loop transformations can be applied, optimization towards high performance can also be an effective strategy for improving energy efficiency.
References
Book

Introduction to Algorithms

TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.
Journal Article DOI

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets, which runs on large clusters of commodity machines and is highly scalable.
Journal Article DOI

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
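The distributed runtime described in the paper is not shown on this page. The toy, single-process sketch below only illustrates the user-facing map and reduce functions of the familiar word-count example; all names are invented, and the real system shards input across a cluster, shuffles intermediate pairs between machines, and handles failures.

```cpp
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KV = std::pair<std::string, int>;

// "Map": turn one input record (a line) into intermediate key/value pairs.
std::vector<KV> map_fn(const std::string& line)
{
    std::vector<KV> out;
    std::istringstream in(line);
    for (std::string w; in >> w; )
        out.emplace_back(w, 1);
    return out;
}

// "Reduce": combine all values that share a key.
int reduce_fn(const std::vector<int>& counts)
{
    int sum = 0;
    for (int c : counts) sum += c;
    return sum;
}

int main()
{
    const std::vector<std::string> input = {"to be or not to be"};

    // Group intermediate pairs by key (the "shuffle" step, done in-process here).
    std::map<std::string, std::vector<int>> groups;
    for (const auto& line : input)
        for (const auto& [k, v] : map_fn(line))
            groups[k].push_back(v);

    for (const auto& [word, counts] : groups)
        std::cout << word << " " << reduce_fn(counts) << "\n";
}
```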
Journal Article DOI

Scalable molecular dynamics with NAMD

TL;DR: NAMD, as discussed by the authors, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems that scales to hundreds of processors on high-end parallel platforms, as well as tens of processors in low-cost commodity clusters, and also runs on individual desktop and laptop computers.