(Open Access) Distance Oracles for Unweighted Graphs: Breaking the Quadratic Barrier with Constant Additive Error (2008) | Surender Baswana


Distance oracles for unweighted graphs: breaking the quadratic barrier with constant additive error

Surender Baswana⋆, Akshay Gaur⋆⋆, Sandeep Sen⋆⋆, and Jayant Upadhyay⋆⋆

Abstract. Thorup and Zwick, in their seminal paper [Journal of the ACM, 52(1), 2005, pp. 1-24], showed that a weighted undirected graph on n vertices can be preprocessed in subcubic time to build a data structure which occupies only subquadratic space and yet, for any pair of vertices, can answer a distance query approximately in constant time. The data structure is termed an approximate distance oracle. Subsequently, the preprocessing time has been improved, and presently the best known algorithms [4, 3] achieve expected O(n^2) preprocessing time for these oracles. For a class of graphs, these algorithms indeed run in Θ(n^2) time. In this paper, we are able to break this quadratic barrier at the expense of introducing a (small) constant additive error for unweighted graphs. In achieving this goal, we have been able to preserve the optimal size-stretch trade-offs of the oracles. One of our algorithms can be extended to weighted graphs, where the additive error becomes 2 · w_max(u, v); here w_max(u, v) is the weight of the heaviest edge on the shortest path between vertices u and v.

⋆ Department of Comp. Sc. & Engg., Indian Institute of Technology Kanpur, Kanpur - 208016, India. Email: sbaswana@iitk.ac.in. The work was supported by a fellowship from Research I Foundation, CSE, IIT Kanpur.

⋆⋆ Department of Comp. Sc. & Engg., Indian Institute of Technology Delhi, New Delhi - 110016, India. Email: {manu,ssen,jayant}@cse.iitd.ernet.in

1 Introduction

Let G = (V, E) be a graph on |V| = n vertices and |E| = m edges, and let δ(u, v) denote the distance between any pair of vertices u, v ∈ V in graph G. The all-pairs shortest paths (APSP) problem requires preprocessing the given graph G so as to build a data structure from which we can efficiently retrieve the distance or the shortest path between any pair of vertices. APSP is undoubtedly one of the most fundamental algorithmic graph problems of computer science. Despite being a classical problem with widespread applications, there exists a huge gap between the lower bound Ω(n^2) and the worst case upper bound O(n^3 / log^2 n) (due to Chan [6]) on the time complexity of the APSP problem. Furthermore, the Θ(n^2) space requirement is a major bottleneck for graphs in many large scale applications. These two factors have motivated researchers to design efficient algorithms (or data structures) for reporting approximate distances. In the last fifteen years, many novel algorithms [1, 8, 7, 2, 11] have been designed which work for undirected graphs. However, among all these algorithms the approximate distance oracles designed by Thorup and Zwick [12] deserve special mention. They showed that any given weighted undirected graph on n vertices can be preprocessed in sub-cubic time for any integer t ≥ 3 to build a data structure of sub-quadratic size which for any pair of vertices u, v reports a t-approximate distance, that is, at least δ(u, v) and at most tδ(u, v). There are two very impressive features of their data structure. First, the trade-off between stretch t and the size of the data structure is essentially optimal assuming a 1963 girth lower bound conjecture of Erdős [9]. Second, in spite of its sub-quadratic size, their data structure can answer any distance query in constant time, hence the name “oracle”. More precisely, Thorup and Zwick achieved the following result.

Theorem 1. [12] For any integer k ≥ 1, an undirected weighted graph on n vertices and m edges can be preprocessed in expected O(kmn^{1/k}) time to build a data structure of size O(kn^{1+1/k}) that can answer any (2k − 1)-approximate distance query in O(k) time.
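To get a feel for the trade-off stated in Theorem 1, the following sketch evaluates the three bounds for concrete n, m and k. It assumes, purely for illustration, that every hidden constant equals 1; the function name `tz_bounds` is hypothetical and not taken from the paper.

```python
def tz_bounds(n: int, m: int, k: int) -> dict:
    """Evaluate the Theorem 1 bounds with all hidden constants taken as 1.

    This is an illustration of how the bounds scale, not a performance model.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    return {
        "stretch": 2 * k - 1,                    # multiplicative error 2k - 1
        "space": k * n ** (1 + 1 / k),           # O(k n^{1+1/k})
        "preprocessing": k * m * n ** (1 / k),   # O(k m n^{1/k})
        "query": k,                              # O(k)
    }


if __name__ == "__main__":
    # Larger k lowers space and preprocessing time but raises the stretch.
    for k in (1, 2, 3):
        print(k, tz_bounds(n=10_000, m=100_000, k=k))
```

For k = 1 the space bound is n^2, that is, the exact all-pairs table; the interesting regime is k ≥ 2.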

Having achieved optimal size-stretch trade-offs and essentially constant query time, it is only the preprocessing time of these oracles which may be improved. The preprocessing time has been improved to O(min(n^2, kmn^{1/k})) for unweighted graphs [4], and recently for weighted graphs as well [3]. Therefore, a natural question is whether it is possible to achieve O(m + n^{2−ε}), a subquadratic upper bound, for constructing approximate distance oracles. Note that any approximate all-pairs shortest path algorithm takes Ω(n^2) steps because of the output size. Therefore, a sub-quadratic time oracle construction provides a clear advantage over such algorithms when we are not interested in all the pair-wise distances. The main objective here is to achieve sub-quadratic preprocessing time for approximate distance oracles without violating the size-stretch trade-off. It may be noted that the quadratic upper bound of the existing preprocessing algorithms [12][4] for these oracles is indeed tight: there exists a family of graphs on which these algorithms would execute in Θ(n^2) time.

In this paper, we design approximate distance oracles which, at the expense of constant additive error, are constructible in sub-quadratic time and preserve the size-stretch trade-off optimally. More precisely, we show the following. For any k > 1, there is a data structure which occupies O(kn^{1+1/k}) space such that for any pair of vertices u, v ∈ V, it takes O(k) time to return an estimate δ̂(u, v) satisfying

δ(u, v) ≤ δ̂(u, v) ≤ (2k − 1)δ(u, v) + c_k

where c_k = 2 for k ≥ 3 and c_2 = 8.

As a natural extension of the (2k − 1)-approximate distance oracle of [12], we denote the above oracle by (2k − 1, c_k)-approximate distance oracle, where the first term (2k − 1) is the stretch (multiplicative error) and c_k is the surplus (additive error). The expected preprocessing time for the (2k − 1, c_k)-oracle is O(m + kn^{2−α}), where the constant α > 0 depends on k (see Table 1).
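The (2k − 1, c)-guarantee defined above is easy to state as a predicate. The sketch below is only a convenience for checking a reported estimate against a known exact distance; the name `satisfies_guarantee` and its parameters are hypothetical.

```python
def satisfies_guarantee(true_dist: int, estimate: int, k: int, c: int) -> bool:
    """Return True if  true_dist <= estimate <= (2k - 1) * true_dist + c.

    k fixes the stretch 2k - 1 and c is the surplus (additive error).
    """
    return true_dist <= estimate <= (2 * k - 1) * true_dist + c


if __name__ == "__main__":
    # A (3, 8)-oracle (k = 2, c = 8) may report up to 3 * 5 + 8 = 23 for a true distance of 5.
    print(satisfies_guarantee(true_dist=5, estimate=23, k=2, c=8))
    print(satisfies_guarantee(true_dist=5, estimate=24, k=2, c=8))
```

Note that the estimate is never allowed to fall below the true distance.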

In short, the small additive error has allowed us to break the quadratic barrier of preprocessing of approximate distance oracles. It would be very important to explore the limits to which the preprocessing time can be further improved. The result of this paper can be viewed as the first significant step in this direction.

(Stretch, surplus)    Space            Preprocessing time                 Reference
(2k − 1, 0)           O(kn^{1+1/k})    O(min(n^2, kmn^{1/k}))             [12, 3, 4]
(3, 8)                O(n^{3/2})       O(min(m + n^{2−α}, m√n))           this paper
(2k − 1, 2), k ≥ 3    O(kn^{1+1/k})    O(min(m + kn^{2−α}, kmn^{1/k}))    this paper

Table 1. Comparing the new algorithms with the existing algorithms for approximate distance oracles.

1.1 Overview of the new algorithms

The observation which forms the basis of our algorithms is the simple fact that the O(kmn^{1/k}) time complexity of the algorithm of Thorup and Zwick [12] is already sub-quadratic provided the graph is sparse enough. In order to utilize this observation, we use the idea of partitioning the graph into sparse and dense subgraphs. Previously this idea was used by the algorithms which compute all-pairs approximate distances with purely additive error only [1, 8]. Using a random sample S ⊆ V of vertices, we define a sparse subgraph with o(n^{2−1/k}) edges, and execute the Thorup and Zwick algorithm on this subgraph. This algorithm will execute in o(n^2) time and will easily take care of the case when the shortest path between a pair of vertices is fully preserved in the sparse graph. The novelty of our algorithms is in handling the other case. Our algorithms make use of a combination of old and new ideas which enables achieving sub-quadratic space without compromising the optimal size-stretch trade-off. In order to make these ideas work, our algorithms effectively use suitable emulators and spanners which are sufficiently sparse.
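The sub-quadratic running time on the sparse subgraph follows by substituting the edge bound into the running time of Theorem 1. Spelled out, treating k as a constant:

```latex
m = o\!\left(n^{2-1/k}\right)
\;\Longrightarrow\;
k\,m\,n^{1/k}
  \;=\; k \cdot o\!\left(n^{2-1/k}\right) \cdot n^{1/k}
  \;=\; o\!\left(k\,n^{2}\right)
  \;=\; o\!\left(n^{2}\right).
```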

Definition 1 An (α, β)-spanner of a graph G = (V, E) is a subgraph (V, E′), E′ ⊆ E, with the property that the distance between any two vertices u, v ∈ V in the spanner is at least δ(u, v) and at most αδ(u, v) + β.

Definition 2 An (α, β)-emulator of a graph G = (V, E) is a weighted graph (V, E∗) such that the distance δ∗(u, v) between any two vertices u, v ∈ V in the emulator is at least δ(u, v) and at most αδ(u, v) + β.
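The two definitions above can be checked by brute force on small graphs. The sketch below is a verification aid only, not a construction from the paper: it assumes vertices are numbered 0..n−1, uses Floyd-Warshall (cubic time), and the names `all_pairs`, `is_emulator` and `is_spanner` are hypothetical.

```python
from itertools import product

INF = float("inf")


def all_pairs(n, weighted_edges):
    """Floyd-Warshall distances for an undirected graph on vertices 0..n-1."""
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in weighted_edges:
        d[u][v] = d[v][u] = min(d[u][v], w)
    # product() varies its first coordinate slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]
    return d


def is_emulator(n, graph_edges, emulator_edges, alpha, beta):
    """Check delta(u,v) <= delta*(u,v) <= alpha*delta(u,v) + beta for all pairs.

    graph_edges are unweighted pairs (u, v); emulator_edges are triples (u, v, w).
    Assumes alpha >= 1.
    """
    d = all_pairs(n, [(u, v, 1) for u, v in graph_edges])
    d_star = all_pairs(n, emulator_edges)
    return all(
        d[u][v] <= d_star[u][v] <= alpha * d[u][v] + beta
        for u in range(n)
        for v in range(n)
    )


def is_spanner(n, graph_edges, spanner_edges, alpha, beta):
    """A spanner is an emulator whose edges are (unit-weight) edges of G itself."""
    edge_set = {frozenset(e) for e in graph_edges}
    if any(frozenset(e) not in edge_set for e in spanner_edges):
        return False
    unit = [(u, v, 1) for u, v in spanner_edges]
    return is_emulator(n, graph_edges, unit, alpha, beta)


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    path = [(0, 1), (1, 2), (2, 3)]  # the 4-cycle with edge (3, 0) removed
    print(is_spanner(4, cycle, path, alpha=3, beta=0))
    print(is_spanner(4, cycle, path, alpha=1, beta=1))
```

The only difference between the two notions is that an emulator may use weighted edges that are not present in G.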

In the following section we describe the notation and lemmas which will be used throughout this paper. In Section 3, we describe our (3, 8)-approximate distance oracle. In Section 4, we describe the (2k − 1, 2)-approximate distance oracle for k > 2.

2 Preliminaries

For a given graph G = (V, E), and any subset S ⊆ V, we shall use the following notation:

– N(v) : the set consisting of v and every neighbor of v in the graph G.
– N(S) : ∪_{v∈S} N(v).
– p_S(v) : the vertex from set S which is nearest to v (ties are broken arbitrarily in case there are multiple nearest vertices).
– δ(v, S) : the distance between v and p_S(v).
– E(v) : the set of edges in G which are incident on v.
– E_S(v) : the set E(v) if v is not adjacent to any vertex of set S, and ∅ otherwise.
– E_S : ∪_{v∈V} E_S(v).
– G_S : the subgraph (V, E_S).
– O_t : the t-approximate distance oracle of Thorup and Zwick [12] created on a subgraph G′ of G.
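The edge set E_S in the list above can be computed in one pass over the adjacency lists. The following is a minimal sketch under stated assumptions: the graph is given as a dictionary `adj` mapping each vertex to the set of its neighbors, edges are represented as `frozenset` pairs, and the function name `sparse_edge_set` is hypothetical. It follows the definition of E_S(v) literally, so a vertex of S that has no neighbor in S also contributes its edges.

```python
def sparse_edge_set(adj, S):
    """Return E_S: all edges incident on some vertex that has no neighbor in S.

    adj maps each vertex to the set of its neighbors (undirected, unweighted).
    """
    S = set(S)
    E_S = set()
    for v, nbrs in adj.items():
        if nbrs & S:
            continue  # v is adjacent to a vertex of S, so E_S(v) is empty
        for w in nbrs:
            E_S.add(frozenset((v, w)))  # E_S(v) = E(v)
    return E_S


if __name__ == "__main__":
    # K4 on {0, 1, 2, 3} plus a pendant vertex 4 attached to 3.
    adj = {
        0: {1, 2, 3},
        1: {0, 2, 3},
        2: {0, 1, 3},
        3: {0, 1, 2, 4},
        4: {3},
    }
    print(sorted(tuple(sorted(e)) for e in sparse_edge_set(adj, S={0})))
```

In the example, vertices 1, 2 and 3 are adjacent to the sampled vertex 0, so the edges among them are dropped and only the edges at vertices 0 and 4 remain.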

Our data structure will store information about p_S(v) and δ(v, S) for each vertex in the given graph G. To compute this information, the graph G can be processed in just O(m) time as follows: insert a dummy vertex o into the graph, connect it to all the vertices of set S, and perform a BFS traversal on the graph starting from o. We shall use T_S to denote the set of edges of this BFS tree excluding the edges incident on the dummy vertex.

Lemma 1. The edge set T_S preserves the shortest path between v and p_S(v) for all v ∈ V. The size of T_S is O(n).
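The dummy-vertex BFS described above is equivalent to a BFS whose queue is initialised with every vertex of S at distance 0. A minimal sketch, assuming the graph is a dictionary `adj` from vertices to neighbor sets; the function name `nearest_sample` and the returned triple are illustrative choices, not the paper's interface.

```python
from collections import deque


def nearest_sample(adj, S):
    """Multi-source BFS from S.

    Returns (p, dist, tree) where p[v] is a nearest vertex of S to v,
    dist[v] = delta(v, S), and tree is the set of BFS tree edges. Because the
    search starts from S directly, no dummy-vertex edges ever appear in tree.
    Vertices unreachable from S are absent from p and dist.
    """
    p, dist, tree = {}, {}, set()
    queue = deque()
    for s in S:
        p[s], dist[s] = s, 0
        queue.append(s)
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                p[y] = p[x]  # y inherits the nearest sampled vertex of its BFS parent
                tree.add(frozenset((x, y)))
                queue.append(y)
    return p, dist, tree


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    p, dist, tree = nearest_sample(path, [0, 4])
    print(dist)
    print(len(tree))
```

Each vertex outside S contributes exactly one tree edge, which is why the tree has fewer than n edges; ties between equally near sampled vertices are broken by queue order.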

Now we redefine an important concept (due to Thorup and Zwick [12]): the ball around a vertex.

Definition 3 [12] For a vertex u ∈ V and a set S ⊆ V in a graph G = (V, E), we define ball(u, V, S) as the subgraph induced by all those vertices v ∈ V which satisfy δ(u, v) < δ(u, S) (i.e., v is nearer to u than p_S(u)).
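The vertex set of ball(u, V, S) can be found by a level-by-level BFS from u that stops at the first level containing a vertex of S. The sketch below assumes the same dictionary-of-sets graph representation as before; `ball_vertices` is a hypothetical name, and only the vertex set is returned (the ball itself is the subgraph induced by it).

```python
def ball_vertices(adj, u, S):
    """Return all vertices v with delta(u, v) < delta(u, S).

    BFS proceeds one level at a time; the first level that contains a vertex
    of S lies at distance delta(u, S), so it and everything beyond is excluded.
    If no vertex of S is reachable, the whole component of u is returned.
    """
    S = set(S)
    seen = {u}
    level = [u]
    ball = set()
    while level:
        if any(x in S for x in level):
            break  # this level is at distance delta(u, S)
        ball.update(level)
        nxt = []
        for x in level:
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    nxt.append(y)
        level = nxt
    return ball


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    print(ball_vertices(path, u=1, S={4}))
```

When u itself belongs to S the ball is empty, since δ(u, S) = 0.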

We now state the following lemma about the number of vertices and edges in ball(u, V, S) when S is formed by random sampling.

Lemma 2. [12, 4] For a given graph G = (V, E), let S ⊆ V be a set formed by selecting each vertex from V independently with probability q > 0. Then the expected number of vertices and the expected number of edges in ball(u, V, S) are O(1/q) and O(1/q^2) respectively.

We shall now state a few important lemmas about the sparse subgraph (V, E_S).

Lemma 3. If set S ⊆ V is formed by selecting each vertex independently with probability q > 0, the expected size of the set E_S would be O(n/q).

Lemma 4. If on the shortest path between any two vertices u, v ∈ V in the graph G = (V, E) there are no two consecutive vertices in set N(S), then the shortest path between u and v is preserved exactly in the subgraph (V, E_S).

The above property of the edge set E_S will prove to be very useful in our construction. For the other case we observe the following.

Lemma 5. If the shortest path between u and v in the graph G contains at least 2 consecutive vertices from N(S), then δ(u, S) + δ(v, S) ≤ δ(u, v) + 1.

Proof. On the shortest path between u and v, let u′ be the vertex from the set N(S) nearest to u, and let v′ be the vertex from the set N(S) nearest to v. Since u′ ∈ N(S), either u′ belongs to S or some neighbor of u′ belongs to S. This implies that δ(u, u′) ≥ δ(u, S) − 1. Similarly, δ(v, v′) ≥ δ(v, S) − 1. Also note that δ(u, u′) + δ(u′, v′) + δ(v′, v) = δ(u, v). Furthermore, δ(u′, v′) ≥ 1 since there are at least two vertices from N(S) on the shortest path between u and v. Therefore, δ(u, u′) + δ(v′, v) + 1 ≤ δ(u, v). This, along with the lower bounds on δ(u, u′) and δ(v, v′) derived above, implies that δ(u, S) + δ(v, S) ≤ δ(u, v) + 1.
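For reference, the final step of the proof can be written as a single chain of inequalities, using only the three facts established above (the two lower bounds and δ(u′, v′) ≥ 1):

```latex
\begin{aligned}
\delta(u,S) + \delta(v,S)
  &\le \bigl(\delta(u,u') + 1\bigr) + \bigl(\delta(v,v') + 1\bigr) \\
  &=   \bigl(\delta(u,u') + \delta(v',v) + 1\bigr) + 1 \\
  &\le \delta(u,v) + 1 .
\end{aligned}
```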

For the construction of our (2k − 1, 2)-oracle for k > 2, we shall employ the following result on spanners.

Theorem 2. [10] For a given unweighted graph G = (V, E) and any integer k > 1, there exists an O(m) time algorithm for computing a (2k − 1)-spanner of size O(n^{1+1/k}).

3 A (3, c)-approximate distance oracle in expected O(n^{2−α}) time

Let G be the given undirected unweighted graph. Let S be a set formed by

selecting each vertex independently with probability n

−

. Our preprocessing

algorithm for the (3, c)-approximate distance oracle, where c = 8, will employ the sparse subgraph (V, E_S) and an emulator of the given graph G. We shall need a (3, 2)-emulator which also satisfies some additional properties which are very crucial (see Lemma 7). We first describe the construction and properties of this emulator in the following subsection.

3.1 The emulator (V, E∗): its construction and properties

In the construction of the emulator, we shall employ the (3, 2)-spanner designed by Baswana et al. [5].

Theorem 3. [5] For a given graph G = (V, E), let S′ be a set formed by selecting each vertex from V independently with probability p = n

−

. It takes expected

O(m) time to construct a (3, 2)-spanner of size O(n^{4/3}) that satisfies the following additional properties for each u ∈ V.

1. If u ∈ V has no neighbor from set S′ in G, then every edge incident onto u will be in the spanner.
2. If u has one or more neighbors from set S′ in G, then for some unique neighbor among them, denoted by c(u), the following assertions hold true.
   (a) the edge (u, c(u)) is present in the spanner also.
   (b) for each edge (u, v) ∈ E not present in the spanner, there is a path between c(u) and c(v) in the spanner with length at most 3.


References

On Observing Nondeterminism and Concurrency

Approximate distance oracles

Approximate distance oracles

Fast Estimation of Diameter and Shortest Paths (Without Matrix Multiplication)

Extremal problems in graph theory
