Open Access · Posted Content

Massively Parallel Dynamic Programming on Trees

TLDR
This paper attempts to parallelize dynamic programming in the Massively Parallel Computations (MPC) model, a popular abstraction of MapReduce-like paradigms, and introduces two classes of graph problems that admit dynamic programming solutions on trees.
Abstract
Dynamic programming is a powerful technique that is, unfortunately, often inherently sequential. That is, there exists no unified method to parallelize algorithms that use dynamic programming. In this paper, we attempt to address this issue in the Massively Parallel Computations (MPC) model, which is a popular abstraction of MapReduce-like paradigms. Our main result is an algorithmic framework to adapt a large family of dynamic programs defined over trees. We introduce two classes of graph problems that admit dynamic programming solutions on trees. We refer to them as "(polylog)-expressible" and "linear-expressible" problems. We show that both classes can be parallelized in $O(\log n)$ rounds using a sublinear number of machines and sublinear memory per machine. To achieve this result, we introduce a series of techniques that can be plugged together. To illustrate the generality of our framework, we implement in $O(\log n)$ rounds of MPC the dynamic programming solutions of graph problems such as minimum bisection, $k$-spanning tree, maximum independent set, and longest path, among others, when the input graph is a tree.
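For orientation, the kind of tree dynamic program the abstract refers to can be sketched sequentially. The snippet below is only an illustrative baseline for maximum independent set on a tree; it is not the paper's MPC framework, and the function name `max_independent_set_size` and its parameters are hypothetical choices made for this sketch. Note how each vertex's table depends on its children's tables, which is the sequential dependency the paper sets out to parallelize.

```python
from collections import defaultdict


def max_independent_set_size(n, edges, root=0):
    """Size of a maximum independent set in a tree on vertices 0..n-1.

    Sequential bottom-up DP (illustrative baseline, not the MPC algorithm).
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS: every vertex appears in `order` after its parent.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)

    take = [1] * n  # best value in u's subtree when u is in the set
    skip = [0] * n  # best value in u's subtree when u is not in the set

    # Reverse order visits children before parents, so each vertex's
    # values are final by the time they are pushed up to its parent.
    for u in reversed(order):
        p = parent[u]
        if p is not None:
            take[p] += skip[u]
            skip[p] += max(take[u], skip[u])

    return max(take[root], skip[root])
```

For example, calling `max_independent_set_size(4, [(0, 1), (1, 2), (2, 3)])` evaluates the DP on a path with four vertices. The running time is linear in the number of vertices, but the number of dependent steps grows with the depth of the tree.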


Citations
Proceedings Article

Parallel batch-dynamic graphs: algorithms and lower bounds

TL;DR: This paper gives an algorithm for dynamic graph connectivity in this setting with constant communication rounds and communication cost almost linear in terms of the batch size, and illustrates the power of dynamic algorithms in the MPC model by showing that the batched version of the adaptive connectivity problem is $\mathsf{P}$-complete in the centralized setting, but sub-linear sized batches can be handled in a constant number of rounds.
Posted Content

Improved MPC Algorithms for MIS, Matching, and Coloring on Trees and Beyond

TL;DR: This work presents round scalable Massively Parallel Computation algorithms for maximal independent set and maximal matching, in trees and more generally graphs of bounded arboricity, as well as for constant coloring trees.
Posted Content

Breaking the Linear-Memory Barrier in MPC: Fast MIS on Trees with $n^{\varepsilon}$ Memory per Machine

TL;DR: The paper demonstrates how to make use of the all-to-all communication in the MPC model to exponentially improve on the corresponding bound in the LOCAL and PRAM models by Lenzen and Wattenhofer [PODC'11].
Proceedings Article

Massively Parallel k-Means Clustering for Perturbation Resilient Instances

TL;DR: A fully scalable $(1+\varepsilon)$-approximate $k$-means clustering algorithm for $O(\alpha)$-perturbation resilient instances in the MPC model using $O(1)$ rounds and $O_{\varepsilon,d}(n^{1+1/\alpha^2+o(1)})$ total space.
Posted Content

On the Hardness of Massively Parallel Computation

TL;DR: In this article, it was shown that hard functions that are essentially not parallelizable in the MPC model require at least $\tilde{\Omega}(T)$ rounds to compute, even in the average case.