The degree sequence of a scale-free random graph process

doi:10.1002/RSA.1009

Béla Bollobás,¹,² Oliver Riordan,² Joel Spencer,³ Gábor Tusnády⁴

¹ Department of Mathematical Sciences, University of Memphis, Memphis, Tennessee 38152
² Trinity College, Cambridge CB2 1TQ, United Kingdom
³ Courant Institute of Mathematical Sciences, New York University, New York, New York 10003
⁴ Rényi Institute, Budapest, Hungary

Received 29 August 2000; accepted 23 January 2001

ABSTRACT: Recently, Barabási and Albert [2] suggested modeling complex real-world networks such as the worldwide web as follows: consider a random graph process in which vertices are added to the graph one at a time and joined to a fixed number of earlier vertices, selected with probabilities proportional to their degrees. In [2] and, with Jeong, in [3], Barabási and Albert suggested that after many steps the proportion $P(d)$ of vertices with degree $d$ should obey a power law $P(d) \propto d^{-\gamma}$. They obtained $\gamma = 2.9 \pm 0.1$ by experiment and gave a simple heuristic argument suggesting that $\gamma = 3$. Here we obtain $P(d)$ asymptotically for all $d \le n^{1/15}$, where $n$ is the number of vertices, proving as a consequence that $\gamma = 3$.

Correspondence to: Oliver Riordan; e-mail: omr10@dpmms.cam.ac.uk
Contract grant sponsor: NSF.
Contract grant number: DSM9971788.

1. INTRODUCTION

Recently there has been considerable interest in using random graphs to model complex real-world networks to gain an insight into their properties. One of the most basic properties of a graph or network is its degree sequence. For the standard random graph model $\mathcal{G}(n, m)$ of all graphs with $m$ edges on a fixed set of $n$ vertices, introduced by Erdős and Rényi in [8] and studied in detail in [9], there is a “characteristic” degree $2m/n$: the vertex degrees have approximately a Poisson or normal distribution with mean $2m/n$. The same applies to the closely related model $\mathcal{G}(n, p)$ introduced by Gilbert [10], where vertices are joined independently with probability $p$. In contrast, Barabási and Albert [2], as well as several other groups (see [4, 14] and the references therein), noticed that in many real-world examples the degree sequence has a “scale-free” power law distribution: the fraction $P(d)$ of vertices with degree $d$ is proportional over a large range to $d^{-\gamma}$, where $\gamma$ is a constant independent of the size of the network. To explain this phenomenon, Barabási and Albert [2] suggested the following random graph process as a model.

“... starting with a small number ($m_0$) of vertices, at every time step we add a new vertex with $m$ ($\le m_0$) edges that link the new vertex to $m$ different vertices already present in the system. To incorporate preferential attachment, we assume that the probability $\Pi$ that a new vertex will be connected to a vertex $i$ depends on the connectivity $k_i$ of that vertex, so that $\Pi(k_i) = k_i / \sum_j k_j$. After $t$ steps the model leads to a random network with $t + m_0$ vertices and $mt$ edges.”

This process is intended as a highly simplified model of the growth of the worldwide web, for example, the vertices representing sites or web pages, and the edges links from sites to earlier sites. The preferential attachment assumption is based on the idea that a new site is more likely to link to existing sites which are “popular” at the time the site is added. For $m = 1$ this process is very similar to the nonuniform random recursive tree process considered in [15, 17, 18]. An alternative model, replacing the preferential attachment assumption by a notion of “link copying”, is given in [12, 14]. We shall discuss these models briefly in the final section.

In [2, 3] it is stated that computer experiments for the process above suggest that $P(d) \propto d^{-\gamma}$ with $\gamma = 2.9 \pm 0.1$. In [3], the following heuristic argument is given to suggest that $\gamma = 3$: consider the degree $d_i$ of the $i$th new vertex $v_i$ at time $t$, i.e., when there are $t + m_0$ vertices and $mt$ edges. When a new vertex is added, the probability that it is joined to $v_i$ is $m d_i$ over the sum of the degrees, i.e., over $2mt$. This suggests the “mean-field theory”

$$\frac{\mathrm{d}\, d_i}{\mathrm{d}t} = \frac{d_i}{2t}.$$

With the initial condition that $d_i = m$ when $t = i$ this gives $d_i = m (t/i)^{1/2}$, which yields $\gamma = 3$.
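The last step of the heuristic can be spelled out: $d_i = m(t/i)^{1/2} \ge d$ exactly when $i \le t m^2/d^2$, so about a fraction $m^2/d^2$ of the vertices have degree at least $d$, and differencing in $d$ gives a density proportional to $d^{-3}$. The following is a minimal Python sketch of that count, assuming the $m_0$ initial vertices are ignored; the function name `meanfield_tail_fraction` is hypothetical and only illustrates the heuristic, not the rigorous argument of this paper.

```python
def meanfield_tail_fraction(t, m, d):
    """Fraction of vertices i in 1..t whose mean-field degree m*sqrt(t/i) is >= d."""
    # m*sqrt(t/i) >= d  is equivalent to  m^2 * t >= d^2 * i  (integer arithmetic)
    count = sum(1 for i in range(1, t + 1) if m * m * t >= d * d * i)
    return count / t

t, m = 200_000, 2
for d in (4, 8, 16, 32):
    # Compare the counted tail with the predicted m^2 / d^2.
    print(d, meanfield_tail_fraction(t, m, d), m * m / (d * d))
```

A tail of order $d^{-2}$ corresponds to a proportion of order $d^{-3}$ at degree exactly $d$, which is the exponent $\gamma = 3$.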

Here we show how one can calculate the exact distribution of $d_i$ at time $t$ and obtain an asymptotic formula for $P(d)$, $d \le t^{1/15}$, which gives $\gamma = 3$ as a simple consequence. The first step is to give an exact definition of a random graph process that fits the rather vague description given above.


2. THE MODEL

The description of the random graph process quoted above is rather imprecise. First, as the degrees are initially zero, it is not clear how the process is supposed to get started. More seriously, the expected number of edges linking a new vertex $v$ to earlier vertices is $\sum_i \Pi(k_i) = 1$, rather than $m$. Also, when choosing in one go a set $S$ of $m$ earlier vertices as the neighbors of $v$, the distribution of $S$ is not specified by giving the marginal probability that each vertex lies in $S$. For a trivial example, suppose that $m = 2$ and that the first four vertices form a four-cycle. Then for any $0 \le \alpha \le 1/4$ we could join the fifth vertex to each adjacent pair with probability $\alpha$ and to each nonadjacent pair with probability $1/2 - 2\alpha$. This suggests that for $m > 1$ we should choose the neighbors of $v$ one at a time. Once doing so, it is very natural to allow some of these neighbors to be the same, creating multiple edges in the graph. Here we shall consider the precise model introduced in [6], which turns out to be particularly pleasant to work with. This model fits the description above except that it allows multiple edges and also loops; in terms of the interpretation there is no reason to exclude these. Once the process gets started there will in any case not be many loops or multiple edges, so they should have little effect overall. The following definition is essentially as given in [6]; we write $d_G(v)$ for the total (in plus out) degree of the vertex $v$ in the graph $G$.
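The four-cycle example can be checked directly. The sketch below, with the hypothetical helper `four_cycle_marginals`, assigns probability $\alpha$ to each of the four adjacent pairs and $1/2 - 2\alpha$ to each of the two nonadjacent pairs, and confirms that the probabilities sum to 1 and that every vertex lies in $S$ with probability $1/2$ whatever the value of $\alpha$. The vertex labels 0 to 3 are an assumption made for the illustration.

```python
from fractions import Fraction
from itertools import combinations

def four_cycle_marginals(alpha):
    """Return (total probability, per-vertex marginals) on the 4-cycle 0-1-2-3-0."""
    cycle_edges = {frozenset((i, (i + 1) % 4)) for i in range(4)}
    marginals = [Fraction(0)] * 4
    total = Fraction(0)
    for pair in combinations(range(4), 2):
        # Adjacent pairs get probability alpha, the two diagonals 1/2 - 2*alpha.
        if frozenset(pair) in cycle_edges:
            p = alpha
        else:
            p = Fraction(1, 2) - 2 * alpha
        total += p
        for v in pair:
            marginals[v] += p
    return total, marginals

for alpha in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
    print(alpha, four_cycle_marginals(alpha))
```

Since the marginals do not depend on $\alpha$, they cannot determine the joint distribution of $S$, which is the ambiguity the text points out.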

We start with the case $m = 1$. Consider a fixed sequence of vertices $v_1, v_2, \ldots$. We shall inductively define a random graph process $(G_1^t)_{t \ge 0}$ so that $G_1^t$ is a directed graph on $\{v_i : 1 \le i \le t\}$, as follows. Start with $G_1^0$ the “graph” with no vertices, or with $G_1^1$ the graph with one vertex and one loop. Given $G_1^{t-1}$, form $G_1^t$ by adding the vertex $v_t$ together with a single edge directed from $v_t$ to $v_i$, where $i$ is chosen randomly with

$$\mathbb{P}(i = s) = \begin{cases} d_{G_1^{t-1}}(v_s)/(2t - 1), & 1 \le s \le t - 1, \\ 1/(2t - 1), & s = t. \end{cases} \tag{1}$$

In other words, we send an edge $e$ from $v_t$ to a random vertex $v_i$, where the probability that a vertex is chosen as $v_i$ is proportional to its (total) degree at the time, counting $e$ as already contributing one to the degree of $v_t$. For $m > 1$ we add $m$ edges from $v_t$ one at a time, counting the previous edges as well as the “outward half” of the edge being added as already contributing to the degrees. Equivalently, we define the process $(G_m^t)_{t \ge 0}$ by running the process $(G_1^t)$ on a sequence $v'_1, v'_2, \ldots$; the graph $G_m^t$ is formed from $G_1^{mt}$ by identifying the vertices $v'_1, v'_2, \ldots, v'_m$ to form $v_1$, identifying $v'_{m+1}, v'_{m+2}, \ldots, v'_{2m}$ to form $v_2$, and so on.
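The definition translates almost line by line into a simulation. The sketch below is one possible implementation, not taken from the paper: it keeps a list in which each vertex appears once per unit of total degree, so a uniform choice from that list realizes rule (1). The names `simulate_g1` and `simulate_gm` are hypothetical, and vertices are numbered from 1.

```python
import random

def simulate_g1(t, rng):
    """Return the edge list [(tail, head), ...] of G_1^t on vertices 1..t."""
    endpoints = []  # vertex v appears once for every unit of its total degree
    edges = []
    for v in range(1, t + 1):
        endpoints.append(v)             # outward half of the new edge counts for v
        target = rng.choice(endpoints)  # uniform over 2v - 1 entries, as in (1)
        endpoints.append(target)
        edges.append((v, target))
    return edges

def simulate_gm(t, m, rng):
    """Form G_m^t from G_1^{mt} by merging blocks of m consecutive vertices."""
    return [((a - 1) // m + 1, (b - 1) // m + 1)
            for a, b in simulate_g1(m * t, rng)]

rng = random.Random(0)
edges = simulate_gm(1000, 3, rng)
print(len(edges), edges[:5])
```

Every edge points from a vertex to itself or to an earlier vertex, and the first edge is always the loop at $v_1$, matching the starting graph $G_1^1$.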

We shall write $\mathcal{G}_m^n$ for the probability space of directed graphs on $n$ vertices $v_1, v_2, \ldots, v_n$ where a random $G_m^n \in \mathcal{G}_m^n$ has the distribution derived from the process above. As $G_m^n$ is defined in terms of $G_1^{mn}$, for most of the time we shall consider the case $m = 1$. As noted in [6], there is an alternative description of the distribution of $G_1^n$ in terms of pairings.

An $n$-pairing is a partition of the set $\{1, 2, \ldots, 2n\}$ into pairs, so there are $(2n - 1)!! = (2n)!/(n!\, 2^n)$ $n$-pairings. Thinking of the elements $1, 2, \ldots, 2n$ of the ground set as points on the $x$ axis, and the pairs as chords joining them, we shall speak of the left and right endpoint of each pair.
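The count of $n$-pairings follows from a one-line recursion: the smallest element can be paired with any of the other $2n - 1$ elements, leaving an $(n - 1)$-pairing of the rest. As a small sanity check, the sketch below (the function name `count_pairings` is hypothetical) compares this recursion with the closed form $(2n)!/(n!\, 2^n)$.

```python
from math import factorial

def count_pairings(n):
    """Count n-pairings: the smallest element picks one of 2n - 1 partners."""
    return 1 if n == 0 else (2 * n - 1) * count_pairings(n - 1)

for n in range(1, 8):
    closed_form = factorial(2 * n) // (factorial(n) * 2 ** n)
    print(n, count_pairings(n), closed_form)
```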


We form a directed graph $\phi(\mathcal{P})$ from an $n$-pairing $\mathcal{P}$ as follows: starting from the left, merge all endpoints up to and including the first right endpoint reached to form the vertex $v_1$. Then merge all further endpoints up to the next right endpoint to form $v_2$, and so on to $v_n$. For the edges, replace each pair by a directed edge from the vertex corresponding to its right endpoint to that corresponding to its left endpoint. As noted in [6], if $\mathcal{P}$ is chosen uniformly at random from all $(2n - 1)!!$ $n$-pairings, then $\phi(\mathcal{P})$ has the same distribution as a random $G_1^n \in \mathcal{G}_1^n$. This statement is easy to prove by induction on $n$: thinking in terms of pairings of distinct points on the $x$ axis, one can obtain a random $(n - 1)$-pairing from a random $n$-pairing by deleting the pair containing the rightmost point. The reverse process, starting from an $(n - 1)$-pairing $\mathcal{P}$, is to add a new pair with its right endpoint to the right of everything in $\mathcal{P}$ and its left endpoint in one of the $2n - 1$ possible places. Now a vertex of degree $d$ in $\phi(\mathcal{P})$ corresponds to $d$ intervals between endpoints in $\mathcal{P}$. The effect of adding a new pair to $\mathcal{P}$ as described is thus to add a new vertex to $\phi(\mathcal{P})$ together with a new edge to a vertex chosen according to (1), with $t = n$.

The advantage of this description from pairings is that it gives us a simple nonrecursive definition of the distribution of $G_m^n$, enabling us to calculate properties of $G_m^n$ directly. We now use this to study the degrees of $G_m^n$.

3. THE DEGREES OF $G_m^n$

In [2] it was suggested that the fraction of vertices of $G_m^n$ having degree $d$ should fall off as $d^{-3}$ as $d \to \infty$. We shall prove the following precise version of this statement, writing $\#_m^n(d)$ for the number of vertices of $G_m^n$ with indegree equal to $d$, i.e., with (total) degree $m + d$.

Theorem 1. Let $m \ge 1$ be fixed, and let $(G_m^n)_{n \ge 0}$ be the random graph process defined in Section 2. Let

$$\alpha_{m,d} = \frac{2m(m + 1)}{(d + m)(d + m + 1)(d + m + 2)},$$

and let $\epsilon > 0$ be fixed. Then with probability tending to 1 as $n \to \infty$ we have

$$(1 - \epsilon)\,\alpha_{m,d} \le \frac{\#_m^n(d)}{n} \le (1 + \epsilon)\,\alpha_{m,d}$$

for every $d$ in the range $0 \le d \le n^{1/15}$.
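Two properties of $\alpha_{m,d}$ are easy to confirm numerically: the values sum to 1 over $d \ge 0$, as proportions should, and $d^3 \alpha_{m,d} \to 2m(m + 1)$, which is the power law with exponent 3. The sketch below uses exact rational arithmetic; the function names `alpha` and `alpha_partial_sum` are hypothetical. The partial sums telescope, since $1/(k(k+1)(k+2)) = \tfrac12\bigl[1/(k(k+1)) - 1/((k+1)(k+2))\bigr]$.

```python
from fractions import Fraction

def alpha(m, d):
    """The constant alpha_{m,d} of Theorem 1 as an exact fraction."""
    return Fraction(2 * m * (m + 1), (d + m) * (d + m + 1) * (d + m + 2))

def alpha_partial_sum(m, d_max):
    """Sum of alpha(m, d) over 0 <= d <= d_max."""
    return sum(alpha(m, d) for d in range(d_max + 1))

m = 3
# Telescoping: the partial sum up to D equals 1 - m(m+1)/((D+m+1)(D+m+2)).
print(float(alpha_partial_sum(m, 1000)))
d = 10 ** 6
# d^3 * alpha(m, d) should be close to 2m(m+1).
print(float(d ** 3 * alpha(m, d)), 2 * m * (m + 1))
```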

It turns out that we only need to calculate the expectation of $\#_m^n(d)$; the concentration result is then given by applying the following standard inequality due to Azuma [1] and Hoeffding [11] (see also [5]).

Lemma 2 (Azuma–Hoeffding inequality). Let $(X_t)_{t=0}^n$ be a martingale with $|X_{t+1} - X_t| \le c$ for $t = 0, \ldots, n - 1$. Then

$$\mathbb{P}(X_n - X_0 \ge x) \le \exp\left(-\frac{x^2}{2c^2 n}\right).$$
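As a quick sanity check of the inequality, consider the simplest martingale with $c = 1$: a simple symmetric random walk with steps $\pm 1$. Its upper tail is an exact binomial sum and can be compared with the bound $\exp(-x^2/(2n))$. The sketch below is only an illustration on this one example; the function names `upper_tail` and `azuma_bound` are hypothetical.

```python
from math import comb, exp

def upper_tail(n, x):
    """Exact P(S_n >= x) for a simple symmetric +-1 random walk with n steps."""
    # S_n = 2*H - n, where H ~ Binomial(n, 1/2) counts the +1 steps.
    return sum(comb(n, h) for h in range(n + 1) if 2 * h - n >= x) / 2 ** n

def azuma_bound(n, x, c=1.0):
    """The bound exp(-x^2 / (2 c^2 n))."""
    return exp(-x * x / (2 * c * c * n))

n = 100
for x in (10, 20, 30, 40):
    print(x, upper_tail(n, x), azuma_bound(n, x))
```

Applying the same bound to the martingale $(-X_t)$ controls the lower tail, so deviations in both directions are covered at the cost of a factor 2.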

The strategy of the proof is as follows. First, as mentioned earlier, the results for general $m$ will follow from those for $m = 1$. We shall use the pairing model to find explicitly the distribution of $D_k$, the sum of the first $k$ degrees, in this case, and also the distribution of the next degree, $d_{G_1^n}(v_{k+1})$, given $D_k$. One could combine these formulae to give a rather unilluminating expression for the distribution of $d_{G_1^n}(v_{k+1})$; instead we show that $D_k$ is concentrated about a certain value and hence find approximately the probability that $d_{G_1^n}(v_{k+1}) = d$. Summing over $k$ gives us the expectation of $\#_1^n(d)$, and concentration follows from Lemma 2.

Before turning to the distributions of the (total) degrees for $m = 1$, we note that their expectations are easy to calculate exactly:

$$\mathbb{E}\bigl(d_{G_1^t}(v_t)\bigr) = 1 + \frac{1}{2t - 1}.$$

Also, for $s < t$,

$$\mathbb{E}\bigl(d_{G_1^t}(v_s) \mid d_{G_1^{t-1}}(v_s)\bigr) = d_{G_1^{t-1}}(v_s) + \frac{d_{G_1^{t-1}}(v_s)}{2t - 1},$$

which implies that

$$\mathbb{E}\bigl(d_{G_1^t}(v_s)\bigr) = \frac{2t}{2t - 1}\, \mathbb{E}\bigl(d_{G_1^{t-1}}(v_s)\bigr).$$

Thus, for $1 \le s \le n$,

$$\mathbb{E}\bigl(d_{G_1^n}(v_s)\bigr) = \prod_{i=s}^{n} \frac{2i}{2i - 1} = \frac{4^{n-s+1}\, n!^2\, (2s - 2)!}{(2n)!\, (s - 1)!^2} = \sqrt{n/s}\,\bigl(1 + O(1/s)\bigr),$$

using Stirling’s formula.
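The three expressions for the expected degree can be compared numerically. The sketch below evaluates the product exactly, checks it against the factorial form, and prints it next to $\sqrt{n/s}$; the function names `expected_degree` and `closed_form` are hypothetical. For small $s$ the ratio to $\sqrt{n/s}$ is a constant away from 1, consistent with the $O(1/s)$ error term.

```python
from fractions import Fraction
from math import factorial, sqrt

def expected_degree(n, s):
    """Exact expected degree of v_s in G_1^n: product of 2i/(2i-1) for i = s..n."""
    value = Fraction(1)
    for i in range(s, n + 1):
        value *= Fraction(2 * i, 2 * i - 1)
    return value

def closed_form(n, s):
    """The factorial expression 4^(n-s+1) n!^2 (2s-2)! / ((2n)! (s-1)!^2)."""
    numerator = 4 ** (n - s + 1) * factorial(n) ** 2 * factorial(2 * s - 2)
    denominator = factorial(2 * n) * factorial(s - 1) ** 2
    return Fraction(numerator, denominator)

n = 2000
for s in (1, 10, 100, 1000):
    exact = expected_degree(n, s)
    print(s, exact == closed_form(n, s), float(exact), sqrt(n / s))
```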

If every degree of $G_1^n$ were equal to its expectation this would give the proposed distribution, but in fact the degrees can be far from their expectations. Indeed we shall see that for almost all vertices the most likely degree is 1!
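A simulation makes the gap between expectations and typical degrees visible. For $m = 1$, Theorem 1 gives the limiting proportion $4/(d(d + 1)(d + 2))$ of vertices with total degree $d$ (indegree $d - 1$), so about two thirds of the vertices should have degree 1. The sketch below samples $G_1^n$ once with a fixed seed and tabulates the degree proportions; the function name `degree_counts_g1`, the seed, and the choice $n = 20000$ are assumptions for the illustration, and a single sample only approximates the limit.

```python
import random
from collections import Counter

def degree_counts_g1(n, rng):
    """Simulate G_1^n and return a Counter {total degree: number of vertices}."""
    endpoints = []  # vertex v appears once for every unit of its total degree
    for v in range(1, n + 1):
        endpoints.append(v)                      # outward half-edge of v
        endpoints.append(rng.choice(endpoints))  # head chosen as in (1)
    degree = Counter(endpoints)                  # total degree of each vertex
    return Counter(degree.values())

n = 20_000
counts = degree_counts_g1(n, random.Random(0))
for d in (1, 2, 3, 4):
    # Theorem 1 with m = 1 predicts 4 / (d (d+1) (d+2)) for total degree d.
    print(d, counts[d] / n, 4 / (d * (d + 1) * (d + 2)))
```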

Let us write $d_i$ for $d_{G_1^n}(v_i)$, i.e., for the (total) degree of the vertex $v_i$ in the graph $G_1^n$. Our aim is to describe the distributions of the individual $d_i$. To do this it turns out to be useful to consider their sums $D_k = \sum_{i=1}^{k} d_i$.

Consider first the event $\{D_k - 2k = s\}$, where $0 \le s \le n - k$. This is the event that the last $n - k$ vertices of $G_1^n$ send exactly $s$ edges to the first $k$ vertices. This event corresponds to pairings $\mathcal{P}$ in which the $k$th right endpoint is $2k + s$. Consider any pairing $\mathcal{P}$ with this property. We shall split $\mathcal{P}$ into two partial pairings, the left partial pairing $\mathcal{L}$ and the right partial pairing $\mathcal{R}$, each consisting of some number of pairs together with some unpaired elements. For $\mathcal{L}$ we take the partial pairing on $\{1, \ldots, 2k + s\}$ induced by $\mathcal{P}$, for $\mathcal{R}$ that on $\{2k + s + 1, \ldots, 2n\}$. From the restriction on $\mathcal{P}$, in $\mathcal{L}$ the element $2k + s$ must be paired with one of $1, \ldots, 2k + s - 1$, precisely $s$ of the remaining $2k + s - 2$ elements must be unpaired, and the other $2k - 2$ elements must be paired off somehow. Any of the

$$(2k + s - 1) \binom{2k + s - 2}{s} \frac{(2k - 2)!}{2^{k-1} (k - 1)!}$$
