Efficient computation of distance labeling for decremental updates in large dynamic graphs

TLDR
This paper proposes maintenance algorithms based on distance labeling, which can handle decremental updates efficiently and can speed up index re-computation by up to an order of magnitude compared with the state-of-the-art method, Pruned Landmark Labeling (PLL).
Abstract
Since today's real-world graphs, such as social network graphs, are evolving all the time, it is of great importance to perform graph computations and analysis in these dynamic graphs. Due to the fact that many applications such as social network link analysis with the existence of inactive users need to handle failed links or nodes, decremental computation and maintenance for graphs is considered a challenging problem. Shortest path computation is one of the most fundamental operations for managing and analyzing large graphs. A number of indexing methods have been proposed to answer distance queries in static graphs. Unfortunately, there is little work on answering such queries for dynamic graphs. In this paper, we focus on the problem of computing the shortest path distance in dynamic graphs, particularly on decremental updates (i.e., edge deletions). We propose maintenance algorithms based on distance labeling, which can handle decremental updates efficiently. By exploiting properties of distance labeling in original graphs, we are able to efficiently maintain distance labeling for new graphs. We experimentally evaluate our algorithms using eleven real-world large graphs and confirm the effectiveness and efficiency of our approach. More specifically, our method can speed up index re-computation by up to an order of magnitude compared with the state-of-the-art method, Pruned Landmark Labeling (PLL).


University of Huddersfield Repository
Qin, Yongrui, Sheng, Quan Z., Falkner, Nickolas J. G., Yao, Lina and Parkinson, Simon
Efficient Computation of Distance Labeling for Decremental Updates in Large Dynamic Graphs
Original Citation
Qin, Yongrui, Sheng, Quan Z., Falkner, Nickolas J. G., Yao, Lina and Parkinson, Simon (2016)
Efficient Computation of Distance Labeling for Decremental Updates in Large Dynamic Graphs.
World Wide Web Journal. ISSN 1386-145X
This version is available at http://eprints.hud.ac.uk/id/eprint/29768/

Efficient Computation of Distance Labeling for
Decremental Updates in Large Dynamic Graphs
Yongrui Qin · Quan Z. Sheng · Nickolas
J.G. Falkner · Lina Yao · Simon
Parkinson
Keywords Shortest Path · Graph Computation · Distance Labeling ·
Dynamic Graph
Yongrui Qin and Simon Parkinson are with School of Computing and Engineering, Univer-
sity of Huddersfield, UK
Quan Z. Sheng and Nickolas J.G. Falkner are with School of Computer Science, The Uni-
versity of Adelaide, Australia
Lina Yao is with School of Computer Science and Engineering, The University of New South
Wales, Australia
Corresponding Author: Yongrui Qin, E-mail: y.qin2@hud.ac.uk

1 Introduction
Recent years have witnessed the fast emergence of massive graph data in many
application domains, such as the World Wide Web, linked data technology,
online social networks, and the Web of Things. In a graph, one of the most
fundamental problems is the computation of the shortest path or distance
between any given pair of vertices. For instance, distances or the numbers of
links between web pages in a large web graph can be considered a robust measure
of web page relevancy, especially in relevance feedback analysis in web search
[21]. In RDF graphs of linked data, the shortest path distance from one entity
to another is important for ranking entity relationships and keyword querying
[18,15]. For online social networks, the shortest path distance can be used to
measure the closeness centrality between users [22,23].
A large body of indexing techniques has recently been proposed to process
exact shortest path distance queries in graphs [9,24,8,7,2,26,16]. Among
them, a significant portion of indexes are based on 2-hop distance labeling,
which was originally proposed by Cohen et al. [11]. The 2-hop distance
labeling pre-computes a label for each vertex so that the shortest path
distance between any two vertices can be computed using only their labels.
These labeling indexes, such as [9,7,2,16], have proven efficient when
processing large graphs with up to hundreds of millions of edges.
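To make the label-only query concrete, the following is a minimal Python sketch, not the implementation of any of the cited indexes. It assumes that each label is stored as a dictionary mapping a hub vertex to the distance from the labeled vertex to that hub; the name `query_distance` and the toy labels for the path graph 0 - 1 - 2 are illustrative assumptions.

```python
from typing import Dict

# Assumed representation: hub vertex -> distance from the labeled vertex to the hub.
Label = Dict[int, int]


def query_distance(label_s: Label, label_t: Label) -> float:
    """Return the minimum of d(s, h) + d(h, t) over hubs h shared by both labels.

    Returns infinity when the two labels share no hub.
    """
    # Iterate over the smaller label and probe the larger one.
    if len(label_s) > len(label_t):
        label_s, label_t = label_t, label_s
    best = float("inf")
    for hub, d_sh in label_s.items():
        d_ht = label_t.get(hub)
        if d_ht is not None:
            best = min(best, d_sh + d_ht)
    return best


# Hand-built labels for the path graph 0 - 1 - 2, with vertex 1 acting as a hub.
labels = {
    0: {1: 1, 0: 0},
    1: {1: 0},
    2: {1: 1, 2: 0},
}

print(query_distance(labels[0], labels[2]))  # 2, via hub 1
```

The query touches only the two labels involved, which is why the number of label entries per vertex directly determines query time.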
Motivation. The above-mentioned approaches generally assume that graphs are
static. However, in reality, many graphs are subject to constant changes. For
example, it is reported that in the fourth quarter of 2012, Facebook reached
1.056 billion users, a 24.97% increase from the same period in 2011 [14].
Around April 2013, DBpedia, one of the most popular RDF graphs, released its
version 3.9. In this new release, the number of concepts in the English
edition increased from 3.7 million to 4.0 million things compared with its
last release in June 2012 (see footnote 1). Similarly, the emerging social Web
of Things also supports the need for dynamic graph data management because
smart things are normally moving and their connectivity could be intermittent,
leading to frequent and unpredictable changes in the corresponding graph
models [10,25].
We believe that it is imperative to design novel algorithms that can update
shortest path indexes efficiently for large dynamic graphs. Existing shortest
path indexing techniques based on 2-hop labeling may take up to hundreds
of seconds to pre-compute the whole shortest path index for a graph with
millions of edges. For larger graphs, it can take up to thousands of seconds
[2,16]. Applying indexing techniques designed for static graphs directly to
dynamic graphs may lead to inefficiency. This is because, if only a small
part of the graph is changed, i.e., only a deletion of an existing edge occurs, a
significant proportion of the shortest paths are likely to remain unchanged and
the index for the original graph may contain a large amount of correct distance
information. In such a case, simply recomputing the 2-hop distance index from
scratch would unnecessarily waste computing resources.
Footnote 1: http://wiki.dbpedia.org/
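The observation that a single edge deletion leaves most distances intact can be checked on a toy example. The Python sketch below is only an illustration: the names `bfs_distances` and `changed_pairs` and the five-vertex graph are assumptions, and it recomputes all distances by brute force, which is precisely the cost that a maintenance algorithm seeks to avoid.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted, undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def changed_pairs(adj, u, v):
    """Vertex pairs whose distance differs after deleting edge (u, v)."""
    before = {s: bfs_distances(adj, s) for s in adj}
    pruned = {x: set(ns) for x, ns in adj.items()}  # copy, then delete the edge
    pruned[u].discard(v)
    pruned[v].discard(u)
    after = {s: bfs_distances(pruned, s) for s in pruned}
    inf = float("inf")  # unreachable vertices are treated as infinitely far
    return [
        (s, t)
        for s, t in combinations(sorted(adj), 2)
        if before[s].get(t, inf) != after[s].get(t, inf)
    ]


# Assumed toy graph: the cycle 0-1-2-3-0 with a pendant vertex 4 attached to 3.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2, 4}, 4: {3}}

print(changed_pairs(adj, 0, 1))  # [(0, 1)]: one of ten pairs is affected
```

In this example only the endpoints of the deleted edge see a different distance; every other pair keeps a shortest path that avoids the deleted edge.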
An alternative is to maintain dynamic all-pairs shortest paths (APSP).
Many approaches have been proposed to maintain dynamic APSP data structures.
For example, in [12,13], a dynamic algorithm for general directed graphs
with non-negative edge weights was proposed with a computational complexity
of $O(n^2 \log^3 n)$, where $n$ is the number of vertices. However, this time
bound is comparable to recomputing all-pairs shortest paths from scratch, which
makes the algorithm inefficient for handling changes in graphs. Recently, an
algorithm for maintaining dynamic all-pairs $(1 + \epsilon)$ approximate
shortest paths for directed graphs with polynomial weights is proposed in [5].
The total update complexity is $\tilde{O}(mn/\epsilon)$, where $n$ is the
number of vertices and $m$ is the number of edges. Unfortunately it only
applies to dynamic approximate shortest path problems.
Incremental updates (i.e., edge insertions) of 2-hop labeling in large
dynamic graphs have been recently investigated in [3]. However, the problem of
supporting decremental updates (i.e., edge deletions) of 2-hop labeling still
remains unsolved and is considered a challenging problem [3]. Decremental
updates are very useful in the presence of many real-world problems such as
outdated web links in a web graph or obsolete user profiles in a social network.
Clearly, decremental maintenance is a fundamental and important operation
on graph data to support efficient web link analysis and social network
analysis.
Contributions. To address the deficiency of existing shortest path indexing
techniques, this paper proposes a generic framework to update shortest path
indexes efficiently for dynamic graphs where edge deletions are allowed. As an
initial attempt on this challenging issue, we focus on unweighted, undirected
graphs. Similar to other distance labeling based indexing methods [2,16], our
method can be extended to weighted and/or directed graphs. We highlight our
main contributions in the following:
- We present the concept of well-ordering 2-hop distance labeling and identify
  its important properties that can be utilized to design update algorithms
  for shortest path indexes in dynamic graphs.
- We analyze cases of shortest path index maintenance in dynamic graphs
  with decremental updates. We develop the corresponding theorems as well
  as novel algorithms to enable efficient updates without reconstruction of
  distance labeling for the entire graph.
- We conduct extensive experiments on eleven real-world large graphs to verify
  the efficiency and effectiveness of our method. Compared with the
  state-of-the-art technique [2], which is designed for static graphs, our
  method is on average an order of magnitude faster.
The rest of this paper is organized as follows. In Section 2, we review the
related work. In Section 3, we present some preliminaries on 2-hop distance
labeling. We then present the framework and the details of our approach in
Section 4. In Section 5, we report the results of an extensive experimental
study using eleven large real-world graphs. Finally, we present some concluding
remarks in Section 6.
2 Related Work
In this section, we review the major techniques that are most closely related
to our work.
Distance labeling has been an active research area in recent years. In [9],
Cheng and Yu exploit the strongly connected components property and graph
partitioning to pre-compute a 2-hop distance cover. However, the graph
partitioning process introduces high cost because it has to find vertex
separators recursively. Hierarchical hub labeling (HHL), proposed by Abraham
et al. [1], is based on a partial order of vertices. Smaller labelings can be
obtained by computing labelings for different partial orders of vertices. In
[17], Jin et al. propose highway-centric labeling (HCL), which uses a spanning
tree as a highway and, based on the highway, generates a 2-hop labeling for
fast distance computation.
Very recently, Pruned Landmark Labeling (PLL) [2] was proposed by
Akiba et al. to pre-compute 2-hop distance labels for vertices by performing
a breadth-first search from every vertex. The key is to prune vertices
that have obtained correct distance information during breadth-first searches,
which helps reduce the search space and sizes of labels. Further, query
performance is also improved as the number of label entries per vertex is
reduced. IS-Label (or ISL) was developed by Fu et al. in [16] to pre-compute
2-hop distance labels for large graphs in memory-constrained environments. ISL
is based on the idea of an independent set of vertices in a large graph. By
recursively removing an independent set of vertices from the original graph,
and by augmenting edges that preserve distance information after the removal
of vertices in the independent set, the remaining graph keeps the distance
information for all remaining vertices in the graph. Apart from the 2-hop
distance labeling technique, a multi-hop distance labeling approach [7] has
also been studied, which can reduce the overall size of labels at the cost of
increased distance querying time.
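The pruning idea behind PLL can be sketched in a few lines of Python. This is a simplified illustration for unweighted, undirected graphs, not the implementation of [2]: the names `query` and `pruned_landmark_labeling`, the dictionary-based label layout, the degree-based vertex ordering, and the toy graph are all assumptions made for the example, and optimizations such as bit-parallel labeling are omitted.

```python
from collections import deque


def query(labels, s, t):
    """Distance implied by the current labels: min over common hubs."""
    best = float("inf")
    for hub, d_sh in labels[s].items():
        d_ht = labels[t].get(hub)
        if d_ht is not None:
            best = min(best, d_sh + d_ht)
    return best


def pruned_landmark_labeling(adj, order):
    """Run one pruned BFS per root, in the given vertex order."""
    labels = {v: {} for v in adj}
    for root in order:
        dist = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            d = dist[u]
            # Prune: labels built from earlier roots already certify a
            # distance no larger than d, so u needs no entry for this root
            # and the search does not expand through u.
            if query(labels, root, u) <= d:
                continue
            labels[u][root] = d
            for w in adj[u]:
                if w not in dist:
                    dist[w] = d + 1
                    queue.append(w)
    return labels


# Assumed toy graph: the cycle 0-1-2-3-0 with a pendant vertex 4 attached to 3.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2, 4}, 4: {3}}
order = sorted(adj, key=lambda v: -len(adj[v]))  # highest degree first (assumed heuristic)
labels = pruned_landmark_labeling(adj, order)

print(query(labels, 1, 4))  # 3
```

Because every pruned vertex is already covered by an earlier hub, the labels stay small while queries remain exact; the vertex order determines which vertices end up as hubs.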
Tree decomposition approaches have been recently investigated [24,4] for
answering distance queries in graphs. Wei proposes TEDI [24], which first
decomposes a graph into a tree and forms a tree decomposition. A tree
decomposition of a graph is a tree with each vertex associated with a set of
vertices in the graph, which is also called a bag. The shortest paths among
vertices in the same bag are pre-computed and stored in bags. For any given
source and target vertices, a bottom-up operation along the tree can be
executed to find the shortest path. An improved TEDI index is further proposed
by Akiba et al. in [4] that exploits a core-fringe structure to improve index
performance. However, due to the large size of some bags in the decomposed
tree, the construction time for a large graph is costly and thus such indexing
approaches cannot scale well.

References
Proceedings ArticleDOI

Reachability and distance queries via 2-hop labels

TL;DR: In this paper, the authors propose a new data structure for representing all distances in a graph, which is distributed in the sense that it may be viewed as assigning labels to the vertices, such that a query involving vertices u and v may be answered using only the labels of u and V.
Journal ArticleDOI

A new approach to dynamic all pairs shortest paths

TL;DR: A fully dynamic algorithm for general directed graphs with non-negative real-valued edge weights that supports any sequence of operations in O(n2log3n) amortized time per update and unit worst-case time per distance query, where n is the number of vertices.
Posted Content

Fast Exact Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling

TL;DR: This work proposes a new exact method for shortest-path distance queries on large-scale networks that can handle social networks and web graphs with hundreds of millions of edges, which are two orders of magnitude larger than the limits of previous exact methods.
Proceedings ArticleDOI

Fast exact shortest-path distance queries on large networks by pruned landmark labeling

TL;DR: In this article, a new exact method for shortest-path distance queries on large-scale networks is proposed, where the key ingredient introduced here is pruning during breadth-first searches.
Book ChapterDOI

Hierarchical hub labelings for shortest paths

TL;DR: This work studies hierarchical hub labelings for computing shortest paths to lead to faster preprocessing algorithms, making the labeling approach practical for a wider class of graphs.