
A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS∗

GEORGE KARYPIS† AND VIPIN KUMAR†

SIAM J. SCI. COMPUT.
© 1998 Society for Industrial and Applied Mathematics
Vol. 20, No. 1, pp. 359–392
Abstract. Recently, a number of researchers have investigated a class of graph partitioning
algorithms that reduce the size of the graph by collapsing vertices and edges, partition the smaller
graph, and then uncoarsen it to construct a partition for the original graph [Bui and Jones, Proc.
of the 6th SIAM Conference on Parallel Processing for Scientific Computing, 1993, 445–452; Hen-
drickson and Leland, A Multilevel Algorithm for Partitioning Graphs, Tech. report SAND 93-1301,
Sandia National Laboratories, Albuquerque, NM, 1993]. From the early work it was clear that
multilevel techniques held great promise; however, it was not known if they can be made to con-
sistently produce high quality partitions for graphs arising in a wide range of application domains.
We investigate the effectiveness of many different choices for all three phases: coarsening, partition
of the coarsest graph, and refinement. In particular, we present a new coarsening heuristic (called
heavy-edge heuristic) for which the size of the partition of the coarse graph is within a small factor
of the size of the final partition obtained after multilevel refinement. We also present a much faster
variation of the Kernighan–Lin (KL) algorithm for refining during uncoarsening. We test our scheme
on a large number of graphs arising in various domains including finite element methods, linear pro-
gramming, VLSI, and transportation. Our experiments show that our scheme produces partitions
that are consistently better than those produced by spectral partitioning schemes in substantially
smaller time. Also, when our scheme is used to compute fill-reducing orderings for sparse matrices,
it produces orderings that have substantially smaller fill than the widely used multiple minimum
degree algorithm.
Key words. graph partitioning, parallel computations, fill-reducing orderings, finite element
computations
AMS subject classifications. 68B10, 05C85
PII. S1064827595287997
1. Introduction. Graph partitioning is an important problem that has exten-
sive applications in many areas, including scientific computing, VLSI design, and task
scheduling. The problem is to partition the vertices of a graph into p roughly equal
parts, such that the number of edges connecting vertices in different parts is mini-
mized. For example, the solution of a sparse system of linear equations Ax = b via
iterative methods on a parallel computer gives rise to a graph partitioning problem.
A key step in each iteration of these methods is the multiplication of a sparse matrix
and a (dense) vector. A good partition of the graph corresponding to matrix A can
significantly reduce the amount of communication in parallel sparse matrix-vector
multiplication [32]. If parallel direct methods are used to solve a sparse system of
equations, then a graph partitioning algorithm can be used to compute a fill-reducing
ordering that leads to a high degree of concurrency in the factorization phase [32, 12].
The multiple minimum degree ordering used almost exclusively in serial direct methods is not suitable for parallel direct methods, as it provides very little concurrency in the parallel factorization phase.

∗ Received by the editors June 19, 1995; accepted for publication (in revised form) January 28, 1997; published electronically August 4, 1998. This work was supported by Army Research Office contract DA/DAAH04-95-1-0538, NSF grant CCR-9423082, IBM Partnership Award, and by Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement DAAH04-95-2-0003/contract DAAH04-95-C-0008. Access to computing facilities was provided by AHPCRC, Minnesota Supercomputer Institute, Cray Research Inc., and by the Pittsburgh Supercomputing Center. http://www.siam.org/journals/sisc/20-1/28799.html
† Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 (karypis@cs.umn.edu, kumar@cs.umn.edu).
The graph partitioning problem is NP-complete. However, many algorithms have
been developed that find a reasonably good partition. Spectral partitioning meth-
ods are known to produce good partitions for a wide class of problems, and they are
used quite extensively [45, 47, 24]. However, these methods are very expensive since
they require the computation of the eigenvector corresponding to the second smallest
eigenvalue (Fiedler vector). Execution time of the spectral methods can be reduced
if computation of the Fiedler vector is done by using a multilevel algorithm [2]. This
multilevel spectral bisection (MSB) algorithm usually manages to speed up the spec-
tral partitioning methods by an order of magnitude without any loss in the quality of
the edge-cut. However, even MSB can take a large amount of time. In particular, in
parallel direct solvers, the time for computing ordering using MSB can be several or-
ders of magnitude higher than the time taken by the parallel factorization algorithm,
and thus ordering time can dominate the overall time to solve the problem [18].
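For readers unfamiliar with spectral bisection, the following sketch illustrates the basic (non-multilevel) step referred to above: form the graph Laplacian, compute the eigenvector associated with the second smallest eigenvalue (the Fiedler vector), and split the vertices at its median value. The dense NumPy eigensolver and the function name are illustrative choices, not part of the authors' MSB implementation.

# Minimal sketch of one spectral bisection step (illustrative; not the
# authors' MSB code): build the Laplacian L = D - A, take the eigenvector of
# the second smallest eigenvalue (the Fiedler vector), split at its median.
import numpy as np

def spectral_bisection(n, edges):
    """n: number of vertices; edges: iterable of (u, v) pairs, 0-indexed."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    L = np.diag(A.sum(axis=1)) - A              # graph Laplacian
    _, eigvecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                     # Fiedler vector
    return np.where(fiedler <= np.median(fiedler), 0, 1)

# Example: a 6-vertex path graph is split into its two halves.
print(spectral_bisection(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]))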
Another class of graph partitioning techniques uses the geometric information of
the graph to find a good partition. Geometric partitioning algorithms [23, 48, 37,
36, 38] tend to be fast but often yield partitions that are worse than those obtained
by spectral methods. Among the most prominent of these schemes is the algorithm
described in [37, 36]. This algorithm produces partitions that are provably within the
bounds that exist for some special classes of graphs (which include graphs arising
in finite element applications). However, due to the randomized nature of these
algorithms, multiple trials are often required (5 to 50) to obtain solutions that are
comparable in quality with spectral methods. Multiple trials do increase the time
[15], but the overall runtime is still substantially lower than the time required by
the spectral methods. Geometric graph partitioning algorithms are applicable only
if coordinates are available for the vertices of the graph. In many problem areas
(e.g., linear programming, VLSI), there is no geometry associated with the graph.
Recently, an algorithm has been proposed to compute coordinates for graph vertices
[6] by using spectral methods. But these methods are much more expensive and
dominate the overall time taken by the graph partitioning algorithm.
Another class of graph partitioning algorithms reduces the size of the graph (i.e.,
coarsen the graph) by collapsing vertices and edges, partitions the smaller graph, and
then uncoarsens it to construct a partition for the original graph. These are called
multilevel graph partitioning schemes [4, 7, 19, 20, 26, 10, 43]. Some researchers
investigated multilevel schemes primarily to decrease the partitioning time, at the cost
of somewhat worse partition quality [43]. Recently, a number of multilevel algorithms
have been proposed [4, 26, 7, 20, 10] that further refine the partition during the
uncoarsening phase. These schemes tend to give good partitions at a reasonable
cost. Bui and Jones [4] use random maximal matching to successively coarsen the
graph down to a few hundred vertices; they partition the smallest graph and then
uncoarsen the graph level by level, applying the KL algorithm to refine the partition.
Hendrickson and Leland [26] enhance this approach by using edge and vertex weights
to capture the collapsing of vertices and edges. In particular, this latter work
showed that multilevel schemes can provide better partitions than spectral methods
at lower cost for a variety of finite element problems.
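As an aside on the bookkeeping this weighting implies, the sketch below (an illustration, not code from [26] or from this paper) collapses one matched pair of vertices: the resulting multinode receives the sum of the two vertex weights, and parallel edges created by the contraction are merged by summing their edge weights. The dictionary-based graph representation is an assumption made for brevity.

# Illustrative sketch: collapse matched pair (u, v) into a single multinode u.
# Assumed representation: vwgt[x] is the weight of vertex x; adj[x] is a dict
# mapping each neighbor y of x to the weight of edge (x, y).
def collapse_pair(vwgt, adj, u, v):
    vwgt[u] += vwgt.pop(v)                      # multinode weight = sum of weights
    for y, w in adj.pop(v).items():
        if y == u:
            continue                            # the matched edge (u, v) disappears
        adj[y].pop(v)
        adj[u][y] = adj[u].get(y, 0) + w        # merge parallel edges by
        adj[y][u] = adj[y].get(u, 0) + w        # summing their edge weights
    adj[u].pop(v, None)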
In this paper we build on the work of Hendrickson and Leland. We experiment
with various parameters of multilevel algorithms and their effect on the quality of
partition and ordering. We investigate the effectiveness of many different choices

MULTILEVEL GRAPH PARTITIONING 361
for all three phases: coarsening, partition of the coarsest graph, and refinement. In
particular, we present a new coarsening heuristic (called heavy-edge heuristic) for
which the size of the partition of the coarse graph is within a small factor of the
size of the final partition obtained after multilevel refinement. We also present a new
variation of the KL algorithm for refining the partition during the uncoarsening phase
that is much faster than the KL refinement used in [26].
We test our scheme on a large number of graphs arising in various domains includ-
ing finite element methods, linear programming, VLSI, and transportation. Our ex-
periments show that our scheme consistently produces partitions that are better than
those produced by spectral partitioning schemes in substantially smaller times (10 to
35 times faster than multilevel spectral bisection).¹ Compared with the multilevel
scheme of [26], our scheme is about two to seven times faster, and it is consistently
better in terms of cut size. Much of the improvement in runtime comes from our
faster refinement heuristic, and the improvement in quality is due to the heavy-edge
heuristic used during coarsening.

¹ We used the MSB algorithm in the Chaco [25] graph partitioning package to obtain the timings for MSB.
We also used our graph partitioning scheme to compute fill-reducing orderings for
sparse matrices. Surprisingly, our scheme substantially outperforms the multiple min-
imum degree algorithm [35], which is the most commonly used method for computing
fill-reducing orderings of a sparse matrix.
Even though multilevel algorithms are quite fast compared with spectral methods,
they can still be the bottleneck if the sparse system of equations is being solved in
parallel [32, 18]. The coarsening phase of these methods is relatively easy to parallelize
[30], but the KL heuristic used in the refinement phase is very difficult to parallelize
[16]. Since both the coarsening phase and the refinement phase with the KL heuristic
take roughly the same amount of time, the overall runtime of the multilevel scheme
of [26] cannot be reduced significantly. Our new faster methods for refinement reduce
this bottleneck substantially. In fact our parallel implementation [30] of this multilevel
partitioning is able to get a speedup of as much as 56 on a 128-processor Cray T3D
for moderate size problems.
The remainder of the paper is organized as follows. Section 2 defines the graph
partitioning problem and describes the basic ideas of multilevel graph partitioning.
Sections 3, 4, and 5 describe different algorithms for the coarsening, initial partition-
ing, and the uncoarsening phase, respectively. Section 6 presents an experimental
evaluation of the various parameters of multilevel graph partitioning algorithms and
compares their performance with that of the multilevel spectral bisection algorithm. Sec-
tion 7 compares the quality of the orderings produced by multilevel nested dissection
to those produced by multiple minimum degree and spectral nested dissection. Sec-
tion 8 provides a summary of the various results. A short version of this paper appears
in [29].
2. Graph partitioning. The k-way graph partitioning problem is defined as follows: given a graph G = (V, E) with |V| = n, partition V into k subsets, V_1, V_2, ..., V_k, such that V_i ∩ V_j = ∅ for i ≠ j, |V_i| = n/k, and ⋃_i V_i = V, and the number of edges of E whose incident vertices belong to different subsets is minimized. The k-way graph
partitioning problem can be naturally extended to graphs that have weights associ-
ated with the vertices and the edges of the graph. In this case, the goal is to partition
the vertices into k disjoint subsets such that the sum of the vertex-weights in each subset is the same, and the sum of the edge-weights whose incident vertices belong to different subsets is minimized. A k-way partition of V is commonly represented by a partition vector P of length n, such that for every vertex v ∈ V, P[v] is an integer between 1 and k, indicating the partition at which vertex v belongs. Given a partition P, the number of edges whose incident vertices belong to different subsets is called the edge-cut of the partition.
The efficient implementation of many parallel algorithms usually requires the so-
lution to a graph partitioning problem, where vertices represent computational tasks,
and edges represent data exchanges. Depending on the amount of the computation
performed by each task, the vertices are assigned a proportional weight. Similarly,
the edges are assigned weights that reflect the amount of data that need to be ex-
changed. A k-way partitioning of this computation graph can be used to assign tasks
to k processors. Since the partitioning assigns to each processor tasks whose total
weight is the same, the work is balanced among k processors. Furthermore, since the
algorithm minimizes the edge-cut (subject to the balanced load requirements), the
communication overhead is also minimized.
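To make the preceding definitions concrete, the short sketch below computes the edge-cut and the per-part vertex weight of a partition vector; for convenience parts are numbered 0 to k−1 rather than 1 to k, and all names are illustrative.

# Edge-cut and part weights of a k-way partition, following the definitions above.
def edge_cut(edges, P):
    """edges: iterable of (u, v, weight); P[v]: part of vertex v (0..k-1)."""
    return sum(w for u, v, w in edges if P[u] != P[v])

def part_weights(vwgt, P, k):
    """Sum of the vertex weights assigned to each of the k parts."""
    totals = [0] * k
    for v, w in enumerate(vwgt):
        totals[P[v]] += w
    return totals

# Example: a 4-cycle with unit weights split into two pairs has edge-cut 2.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
P = [0, 0, 1, 1]
print(edge_cut(edges, P), part_weights([1, 1, 1, 1], P, 2))   # -> 2 [2, 2]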
One such example is the sparse matrix-vector multiplication y = Ax. The n × n matrix A and vector x are usually partitioned along rows, with each of the p processors receiving n/p rows of A and the corresponding n/p elements of x [32]. For matrix A an n-vertex graph G_A can be constructed such that each row of the matrix corresponds to a vertex, and if row i has a nonzero entry in column j (i ≠ j), then there is an edge between vertex i and vertex j. As discussed in [32], any edges connecting vertices from two different partitions lead to communication for retrieving the value of vector x that is not local but is needed to perform the dot-product. Thus, in order to minimize the communication overhead, we need to obtain a p-way partition of G_A and then to distribute the rows of A according to this partition.
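A minimal sketch of this construction, assuming the structurally symmetric matrix is given simply as a list of its nonzero coordinates (the representation and function name are illustrative):

# Build the graph G_A of an n x n structurally symmetric sparse matrix: every
# off-diagonal nonzero (i, j) contributes an edge between vertices i and j.
def matrix_graph(n, nonzeros):
    adj = [set() for _ in range(n)]
    for i, j in nonzeros:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)
    return adj

# Example: a tridiagonal 4 x 4 matrix yields the path graph 0-1-2-3.
nz = [(i, i) for i in range(4)] + [(i, i + 1) for i in range(3)] + [(i + 1, i) for i in range(3)]
print(matrix_graph(4, nz))                      # [{1}, {0, 2}, {1, 3}, {2}]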
Another important application of recursive bisection is to find a fill-reducing or-
dering for sparse matrix factorization [12, 32, 22]. These algorithms are generally
referred to as nested dissection ordering algorithms. Nested dissection recursively
splits a graph into almost equal halves by selecting a vertex separator until the de-
sired number of partitions is obtained. One way of obtaining a vertex separator is
to first obtain a bisection of the graph and then compute a vertex separator from
the edge separator. The vertices of the graph are numbered such that at each level
of recursion the separator vertices are numbered after the vertices in the partitions.
The effectiveness and the complexity of a nested dissection scheme depend on the
separator computing algorithm. In general, small separators result in low fill-in.
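The numbering discipline described above can be summarized by the schematic recursion below (an illustration rather than the paper's algorithm); find_separator stands in for "bisect the graph, then derive a vertex separator from the edge separator" and is left abstract.

# Schematic nested dissection: order each half recursively, separator last.
def nested_dissection(vertices, adj, find_separator, cutoff=8):
    if len(vertices) <= cutoff:
        return list(vertices)                   # small pieces: any order will do
    left, sep, right = find_separator(vertices, adj)
    order = nested_dissection(left, adj, find_separator, cutoff)
    order += nested_dissection(right, adj, find_separator, cutoff)
    order += list(sep)                          # separator vertices numbered last
    return order                                # fill-reducing elimination order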
The k-way partition problem is frequently solved by recursive bisection. That is,
we first obtain a 2-way partition of V , and then we further subdivide each part using
2-way partitions. After log k phases, graph G is partitioned into k parts. Thus, the
problem of performing a k-way partition can be solved by performing a sequence of
2-way partitions or bisections. Even though this scheme does not necessarily lead to
optimal partition, it is used extensively due to its simplicity [12, 22].
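The recursion just described is sketched below; bisect stands for any 2-way partitioner (for instance, the multilevel bisection of section 2.1), and k is assumed to be a power of two. The function is an illustration, not the authors' implementation.

# k-way partitioning by recursive bisection (k assumed to be a power of two).
def recursive_bisection(vertices, k, bisect, first_part=0):
    """Returns {vertex: part}; `bisect` splits a vertex set into two halves."""
    if k == 1:
        return {v: first_part for v in vertices}
    left, right = bisect(vertices)
    labels = recursive_bisection(left, k // 2, bisect, first_part)
    labels.update(recursive_bisection(right, k // 2, bisect, first_part + k // 2))
    return labels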
2.1. Multilevel graph bisection. The graph G can be bisected using a mul-
tilevel algorithm. The basic structure of a multilevel algorithm is very simple. The
graph G is first coarsened down to a few hundred vertices, a bisection of this much
smaller graph is computed, and then this partition is projected back toward the orig-
inal graph (finer graph). At each step of the graph uncoarsening, the partition is
further refined. Since the finer graph has more degrees of freedom, such refinements
usually decrease the edge-cut. This process is graphically illustrated in Figure 1.

Fig. 1. The various phases of the multilevel graph bisection. During the coarsening phase, the
size of the graph is successively decreased; during the initial partitioning phase, a bisection of the
smaller graph is computed; and during the uncoarsening phase, the bisection is successively refined as
it is projected to the larger graphs. During the uncoarsening phase the light lines indicate projected
partitions, and dark lines indicate partitions that were produced after refinement.
Formally, a multilevel graph bisection algorithm works as follows: consider a weighted graph G_0 = (V_0, E_0), with weights both on vertices and edges. A multilevel graph bisection algorithm consists of the following three phases.

Coarsening phase. The graph G_0 is transformed into a sequence of smaller graphs G_1, G_2, ..., G_m such that |V_0| > |V_1| > |V_2| > ··· > |V_m|.

Partitioning phase. A 2-way partition P_m of the graph G_m = (V_m, E_m) is computed that partitions V_m into two parts, each containing half the vertices of G_0.

Uncoarsening phase. The partition P_m of G_m is projected back to G_0 by going through intermediate partitions P_{m-1}, P_{m-2}, ..., P_1, P_0.
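The three phases can be combined into a short driver along the following lines (a sketch only; coarsen, initial_bisection, project, and refine are placeholders for the schemes discussed in sections 3, 4, and 5, and the graph interface is an assumption):

# Driver sketch of multilevel bisection; the four helpers are placeholders for
# the coarsening, initial-partitioning, projection, and refinement schemes.
def multilevel_bisection(G0, coarsen, initial_bisection, project, refine,
                         coarse_size=100):
    graphs = [G0]
    while graphs[-1].num_vertices() > coarse_size:     # coarsening phase
        graphs.append(coarsen(graphs[-1]))
    P = initial_bisection(graphs[-1])                  # initial partitioning phase
    for finer, coarser in zip(reversed(graphs[:-1]), reversed(graphs[1:])):
        P = project(P, coarser, finer)                 # uncoarsening phase:
        P = refine(P, finer)                           #   project, then refine
    return P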
3. Coarsening phase. During the coarsening phase, a sequence of smaller
graphs, each with fewer vertices, is constructed. Graph coarsening can be achieved in
various ways. Some possibilities are shown in Figure 2.
In most coarsening schemes, a set of vertices of G_i is combined to form a single vertex of the next level coarser graph G_{i+1}. Let V_i^v be the set of vertices of G_i combined to form vertex v of G_{i+1}. We will refer to vertex v as a multinode. In order for a bisection of a coarser graph to be good with respect to the original graph, the
