Comparative study on Boruvka's implementation on hetrogenous platform with cache analysis

doi:10.1109/ICACCCT.2016.7831725

Citations

Proceedings Article

Hands-on Practical Hybrid Parallel Application Performance Engineering

Markus Geimer¹, Michael Gerndt², Ronny Tschüter³, Allen D. Malony⁴

Forschungszentrum Jülich¹, Technische Universität München², Dresden University of Technology³, University of Oregon⁴

01 Jan 2017

TL;DR: In this paper, the authors present state-of-the-art performance tools for leading-edge HPC systems founded on the Score-P community instrumentation and measurement infrastructure, demonstrating how they can be used for performance engineering of effective scientific applications based on standard MPI or OpenMP.

Abstract: This tutorial presents state-of-the-art performance tools for leading-edge HPC systems founded on the Score-P community instrumentation and measurement infrastructure, demonstrating how they can be used for performance engineering of effective scientific applications based on standard MPI or OpenMP and now common mixed-mode hybrid parallelizations. Parallel performance evaluation tools from the Virtual Institute - High Productivity Supercomputing (VI-HPS) are introduced and featured in hands-on exercises with Periscope, Scalasca, Vampir and TAU. We cover all aspects of performance engineering practice, including instrumentation, measurement (profiling and tracing, timing and hardware counters), data storage, analysis and visualization. Emphasis is placed on how tools are used in combination for identifying performance problems and investigating optimization alternatives, illustrated with a case study using a major application code.

References

Journal Article

On the History of the Minimum Spanning Tree Problem

Ron Graham, Pavol Hell

01 Jan 1985-IEEE Annals of the History of Computing

TL;DR: The minimum spanning tree problem has several apparently independent sources and algorithmic solutions, which appeared in Czechoslovakia, France, and Poland, going back to the beginning of this century; the paper explores and compares these works and their motivations.

Abstract: It is standard practice among authors discussing the minimum spanning tree problem to refer to the work of Kruskal (1956) and Prim (1957) as the sources of the problem and its first efficient solutions, despite the citation by both of Boruvka (1926) as a predecessor. In fact, there are several apparently independent sources and algorithmic solutions of the problem. They have appeared in Czechoslovakia, France, and Poland, going back to the beginning of this century. We shall explore and compare these works and their motivations, and relate them to the most recent advances on the minimum spanning tree problem.

788 citations

"Comparative study on Boruvka's impl..." refers background in this paper

...These system contain multiple processors with a powerful CPU and GPU core to accelerate the computation[1]....
[...]

Journal Article

Otakar Borůvka on minimum spanning tree problem: Translation of both the 1926 papers, comments, history

Jaroslav Nešetřil, Eva Milková, Helena Nešetřilová

28 Apr 2001-Discrete Mathematics

TL;DR: The first English translation of both of Borůvka's pioneering works, which are generally regarded as a cornerstone of Combinatorial Optimization, is presented.

322 citations

"Comparative study on Boruvka's impl..." refers background or methods in this paper

...Traditionally GPU operates in co-ordination with CPU where CPU offloads parallel parts of computation to GPU[4]....
[...]
...In GPU, cache is use to localize data during volume rendering [17], while in CPU, it is used to localized data during memory references....
[...]

GPU Programming Strategies and Trends in GPU Computing

André R. Brodtkorb¹, Trond Runar Hagen¹, Martin L. Sætra²

SINTEF¹, University of Oslo²

01 Jan 2012

TL;DR: The aim of this article is to simplify the process of getting started with GPU programming, by giving an overview of current GPU programming strategies, profile-driven development, and an outlook to future trends.

Abstract: Over the last decade, there has been a growing interest in the use of graphics processing units (GPUs) for nongraphics applications. From early academic proof-of-concept papers around the year 2000, the use of GPUs has now matured to a point where there are countless industrial applications. Together with the expanding use of GPUs, we have also seen a tremendous development in the programming languages and tools, and getting started programming GPUs has never been easier. However, whilst getting started with GPU programming can be simple, being able to fully utilize GPU hardware is an art that can take months and years to master. The aim of this article is to simplify this process, by giving an overview of current GPU programming strategies, profile driven development, and an outlook to future trends.

240 citations

"Comparative study on Boruvka's impl..." refers methods in this paper

...In the second method[12], we use CSR(Compressed Sparse Row) format to represent the graph and then find the MST. MST-solver algorithm is mostly implemented through CPU....
[...]

Proceedings Article

Fast minimum spanning tree for large graphs on the GPU

Vibhav Vineet¹, Pawan Harish¹, Suryakant Patidar¹, P. J. Narayanan¹

International Institute of Information Technology, Hyderabad¹

01 Aug 2009

TL;DR: This paper presents a minimum spanning tree algorithm on Nvidia GPUs under CUDA, as a recursive formulation of Borůvka's approach for undirected graphs, implemented using scalable primitives such as scan, segmented scan and split.

Abstract: Graphics Processor Units are used for many general purpose processing due to high compute power available on them. Regular, data-parallel algorithms map well to the SIMD architecture of current GPU. Irregular algorithms on discrete structures like graphs are harder to map to them. Efficient data-mapping primitives can play crucial role in mapping such algorithms onto the GPU. In this paper, we present a minimum spanning tree algorithm on Nvidia GPUs under CUDA, as a recursive formulation of Boruvka's approach for undirected graphs. We implement it using scalable primitives such as scan, segmented scan and split. The irregular steps of supervertex formation and recursive graph construction are mapped to primitives like split to categories involving vertex ids and edge weights. We obtain 30 to 50 times speedup over the CPU implementation on most graphs and 3 to 10 times speedup over our previous GPU implementation. We construct the minimum spanning tree on a 5 million node and 30 million edge graph in under 1 second on one quarter of the Tesla S1070 GPU.

126 citations

"Comparative study on Boruvka's impl..." refers background in this paper

...Traditionally GPU operates in co-ordination with CPU where CPU offloads parallel parts of computation to GPU[4]....
[...]

Journal Article

A new era in scientific computing: Domain decomposition methods in hybrid CPU–GPU architectures

Manolis Papadrakakis¹, George Stavroulakis¹, Alexander Karatarakis¹

National Technical University of Athens¹

01 Mar 2011-Computer Methods in Applied Mechanics and Engineering

TL;DR: This work demonstrates the implementation of the FETI method to a hybrid CPU–GPU computing environment and reveals the tremendous potential of this type of hybrid computing environment as a result of the full exploitation of multi-core CPU hardware resources and the intrinsic software and hardware features of the GPUs.

111 citations

"Comparative study on Boruvka's impl..." refers background in this paper

...Keywords—performance;borukva’s;cache; vtune; MST; oldest; time; architectures; gpu; csr;outperforms;benchmarks I. INTRODUCTION In the new era of modern computing, new types of heterogeneous computers have begun to emerge[15]....
[...]
