Proceedings Article

A Top-Down Parallel Semisort

TLDR
This work uses ideas from the parallel integer sorting algorithm of Rajasekaran and Reif, but instead of processing the bits of integers in a reduced range bottom-up, it processes the hashed values of keys directly top-down.
Abstract
Semisorting is the problem of reordering an input array of keys such that equal keys are contiguous but different keys are not necessarily in sorted order. Semisorting is important for collecting equal values and is widely used in practice. For example, it is the core of the MapReduce paradigm, is a key component of the database join operation, and has many other applications. We describe a (randomized) parallel algorithm for the problem that is theoretically efficient (linear work and logarithmic depth), but is designed to be more practically efficient than previous algorithms. We use ideas from the parallel integer sorting algorithm of Rajasekaran and Reif, but instead of processing the bits of integers in a reduced range in a bottom-up fashion, we process the hashed values of keys directly top-down. We implement the algorithm and experimentally show on a variety of input distributions that it outperforms a similarly-optimized radix sort on a modern 40-core machine with hyper-threading by a factor of about 1.7 to 1.9, and achieves a parallel speedup of up to 38x. We discuss the various optimizations used in our implementation and present an extensive experimental analysis of its performance.
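To make the problem statement concrete, the following is a minimal sequential sketch of a top-down, hash-based semisort. It is not the authors' parallel implementation and carries none of its work or depth guarantees; it only illustrates the idea of bucketing keys by the high-order bits of their hashed values and recursing into each bucket. The names `hash64`, `semisort`, `bits_per_level`, and `base_case` are hypothetical, and the sketch assumes keys are hashable and that equal keys have equal `repr` strings (for example, all strings or all integers).

```python
import hashlib
from collections import defaultdict


def hash64(key):
    """Deterministic 64-bit hash of repr(key) (an illustrative choice)."""
    digest = hashlib.blake2b(repr(key).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def _semisort_hashed(items, bits_left, bits_per_level, base_case):
    # Small bucket, or no hash bits left: group equal keys with a dictionary.
    # Grouping by key (not by hash) also separates colliding keys correctly.
    if len(items) <= base_case or bits_left <= 0:
        groups = defaultdict(list)
        for h, k in items:
            groups[k].append((h, k))
        return [item for group in groups.values() for item in group]

    # Top-down step: use the highest unused hash bits as the bucket index.
    shift = max(bits_left - bits_per_level, 0)
    width = bits_left - shift
    mask = (1 << width) - 1
    buckets = [[] for _ in range(1 << width)]
    for h, k in items:
        buckets[(h >> shift) & mask].append((h, k))

    # Equal keys share a hash, so they always land in the same bucket.
    out = []
    for bucket in buckets:
        out.extend(_semisort_hashed(bucket, shift, bits_per_level, base_case))
    return out


def semisort(keys, bits_per_level=4, base_case=16):
    """Return a reordering of keys in which equal keys are contiguous."""
    hashed = [(hash64(k), k) for k in keys]
    result = _semisort_hashed(hashed, 64, bits_per_level, base_case)
    return [k for _, k in result]
```

Calling `semisort(["b", "a", "c", "a", "b"])` returns the same five keys with the two `"a"` entries adjacent and the two `"b"` entries adjacent; the order of the groups themselves depends on the hash values and is not sorted.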


Citations
Proceedings Article

Julienne: A Framework for Parallel Graph Algorithms using Work-efficient Bucketing

TL;DR: This work develops the Julienne framework, which extends a recent shared-memory graph processing framework called Ligra with an interface for maintaining a collection of buckets under vertex insertions and bucket deletions, and presents the first work-efficient parallel algorithm for k-core in the literature with nontrivial parallelism.
Proceedings Article

Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

TL;DR: It is shown that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes.
Proceedings Article

The Cilk++ concurrency platform

Leiserson
TL;DR: This paper overviews the Cilk++ programming environment, which incorporates a compiler, a runtime system, and a race-detection tool, and provides a “hyperobject” library which allows races on nonlocal variables to be mitigated without lock contention or substantial code restructuring.
Proceedings Article

ParlayLib - A Toolkit for Parallel Algorithms on Shared-Memory Multicore Machines

TL;DR: ParlayLib is a C++ library for developing efficient parallel algorithms and software on shared-memory multicore machines that consists of a sequence data type, many parallel routines and algorithms, a work-stealing scheduler to support nested parallelism, and a scalable memory allocator.
Proceedings Article

Parallelism in Randomized Incremental Algorithms

TL;DR: This paper shows that most sequential randomized incremental algorithms are in fact parallel: the dependence structure is shallow for all of the algorithms studied, implying high parallelism. It identifies three types of dependences found in these algorithms and presents a framework for analyzing each type.
References
Book

Introduction to Algorithms

TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on a large cluster of commodity machines and is highly scalable.
Journal Article

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Journal Article

Introduction to algorithms: 4. Turtle graphics

TL;DR: This article introduces a language similar to Logo and develops programs that draw geometric pictures with it.