The degree sequence of a scale-free random graph process

TLDR
Here the authors obtain $P(d)$ asymptotically for all $d \le n^{1/15}$, where $n$ is the number of vertices, proving as a consequence that $\gamma = 3$.
Abstract
Recently, Barabási and Albert [2] suggested modeling complex real-world networks such as the worldwide web as follows: consider a random graph process in which vertices are added to the graph one at a time and joined to a fixed number of earlier vertices, selected with probabilities proportional to their degrees. In [2] and, with Jeong, in [3], Barabási and Albert suggested that after many steps the proportion $P(d)$ of vertices with degree $d$ should obey a power law $P(d) \propto d^{-\gamma}$. They obtained $\gamma = 2.9 \pm 0.1$ by experiment and gave a simple heuristic argument suggesting that $\gamma = 3$. Here we obtain $P(d)$ asymptotically for all $d \le n^{1/15}$, where $n$ is the number of vertices, proving as a consequence that $\gamma = 3$. © 2001 John Wiley & Sons, Inc. Random Struct. Alg., 18, 279–290, 2001

The Degree Sequence of a Scale-Free
Random Graph Process
Béla Bollobás (1, 2), Oliver Riordan (2), Joel Spencer (3), Gábor Tusnády (4)
1 Department of Mathematical Sciences, University of Memphis, Memphis, Tennessee 38152
2 Trinity College, Cambridge CB2 1TQ, United Kingdom
3 Courant Institute of Mathematical Sciences, New York University, New York, New York, 10003
4 Rényi Institute, Budapest, Hungary
Received 29 August 2000; accepted 23 January 2001
ABSTRACT: Recently, Barabási and Albert [2] suggested modeling complex real-world networks
such as the worldwide web as follows: consider a random graph process in which vertices are
added to the graph one at a time and joined to a fixed number of earlier vertices, selected
with probabilities proportional to their degrees. In [2] and, with Jeong, in [3], Barabási and
Albert suggested that after many steps the proportion $P(d)$ of vertices with degree $d$ should
obey a power law $P(d) \propto d^{-\gamma}$. They obtained $\gamma = 2.9 \pm 0.1$ by experiment
and gave a simple heuristic argument suggesting that $\gamma = 3$. Here we obtain $P(d)$
asymptotically for all $d \le n^{1/15}$, where $n$ is the number of vertices, proving as a
consequence that $\gamma = 3$.
© 2001 John Wiley & Sons, Inc. Random Struct. Alg., 18, 279–290, 2001
1. INTRODUCTION
Correspondence to: Oliver Riordan; e-mail: omr10@dpmms.cam.ac.uk
Contract grant sponsor: NSF.
Contract grant number: DSM9971788.
Recently there has been considerable interest in using random graphs to model
complex real-world networks to gain an insight into their properties. One of the
most basic properties of a graph or network is its degree sequence. For the standard
random graph model $\mathcal{G}(n, m)$ of all graphs with $m$ edges on a fixed set of $n$
vertices, introduced by Erdős and Rényi in [8] and studied in detail in [9], there is
a “characteristic” degree $2m/n$: the vertex degrees have approximately a Poisson
or normal distribution with mean $2m/n$. The same applies to the closely related
model $\mathcal{G}(n, p)$ introduced by Gilbert [10], where vertices are joined independently
with probability $p$. In contrast, Barabási and Albert [2], as well as several other
groups (see [4, 14] and the references therein), noticed that in many real-world
examples the degree sequence has a “scale-free” power law distribution: the fraction
$P(d)$ of vertices with degree $d$ is proportional over a large range to $d^{-\gamma}$, where
$\gamma$ is a constant independent of the size of the network. To explain this phenomenon,
Barabási and Albert [2] suggested the following random graph process as a model.
  ... starting with a small number ($m_0$) of vertices, at every time step
  we add a new vertex with $m$ ($\le m_0$) edges that link the new vertex to $m$
  different vertices already present in the system. To incorporate preferential
  attachment, we assume that the probability $\Pi$ that a new vertex will
  be connected to a vertex $i$ depends on the connectivity $k_i$ of that vertex,
  so that $\Pi(k_i) = k_i / \sum_j k_j$. After $t$ steps the model leads to a random
  network with $t + m_0$ vertices and $mt$ edges.
This process is intended as a highly simplified model of the growth of the world-
wide web, for example, the vertices representing sites or web pages, and the edges
links from sites to earlier sites. The preferential attachment assumption is based on
the idea that a new site is more likely to link to existing sites which are “popular” at
the time the site is added. For m = 1 this process is very similar to the nonuniform
random recursive tree process considered in [15, 17, 18]. An alternative model,
replacing the preferential attachment assumption by a notion of “link copying” is
given in [12, 14]. We shall discuss these models briefly in the final section.
In [2, 3] it is stated that computer experiments for the process above suggest that
$P(d) \propto d^{-\gamma}$ with $\gamma = 2.9 \pm 0.1$. In [3], the following heuristic
argument is given to suggest that $\gamma = 3$: consider the degree $d_i$ of the $i$th new
vertex $v_i$ at time $t$, i.e., when there are $t + m_0$ vertices and $mt$ edges. When a new
vertex is added, the probability that it is joined to $v_i$ is $m d_i$ over the sum of the
degrees, i.e., over $2mt$. This suggests the “mean-field theory”
\[ \frac{\mathrm{d} d_i}{\mathrm{d} t} = \frac{d_i}{2t}. \]
With the initial condition that $d_i = m$ when $t = i$ this gives $d_i = m (t/i)^{1/2}$, which
yields $\gamma = 3$.
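The passage from the mean-field solution to the exponent is not spelled out. A sketch of the usual continuum argument follows; it assumes, as part of the heuristic and not as anything proved here, that the index $i$ of a vertex may be treated as uniformly distributed over the $t + m_0$ vertices present at time $t$.

```latex
\begin{align*}
\Pr\bigl(d_i < d\bigr)
  &= \Pr\bigl(m (t/i)^{1/2} < d\bigr)
   = \Pr\Bigl(i > \frac{m^2 t}{d^2}\Bigr)
   \approx 1 - \frac{m^2 t}{d^2 (t + m_0)}, \\
P(d)
  &\approx \frac{\partial}{\partial d} \Pr\bigl(d_i < d\bigr)
   = \frac{2 m^2 t}{(t + m_0)\, d^3}
   \longrightarrow \frac{2 m^2}{d^3} \qquad (t \to \infty).
\end{align*}
```

Under these assumptions the density of degrees falls off as $d^{-3}$, which is the heuristic value $\gamma = 3$.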
Here we show how one can calculate the exact distribution of $d_i$ at time $t$ and
obtain an asymptotic formula for $P(d)$, $d \le t^{1/15}$, which gives $\gamma = 3$ as a simple
consequence. The first step is to give an exact definition of a random graph process
that fits the rather vague description given above.

2. THE MODEL
The description of the random graph process quoted above is rather imprecise.
First, as the degrees are initially zero, it is not clear how the process is supposed to
get started. More seriously, the expected number of edges linking a new vertex $v$ to
earlier vertices is $\sum_i \Pi(k_i) = 1$, rather than $m$. Also, when choosing in one go a set
$S$ of $m$ earlier vertices as the neighbors of $v$, the distribution of $S$ is not specified
by giving the marginal probability that each vertex lies in $S$. For a trivial example,
suppose that $m = 2$ and that the first four vertices form a four-cycle. Then for any
$0 \le \alpha \le 1/4$ we could join the fifth vertex to each adjacent pair with probability
$\alpha$ and to each nonadjacent pair with probability $1/2 - 2\alpha$. This suggests that for
$m > 1$ we should choose the neighbors of $v$ one at a time. Once doing so, it is very
natural to allow some of these neighbors to be the same, creating multiple edges in
the graph. Here we shall consider the precise model introduced in [6], which turns
out to be particularly pleasant to work with. This model fits the description above
except that it allows multiple edges and also loops; in terms of the interpretation
there is no reason to exclude these. Once the process gets started there will in any
case not be many loops or multiple edges, so they should have little effect overall.
The following definition is essentially as given in [6]; we write $d_G(v)$ for the total
(in plus out) degree of the vertex $v$ in the graph $G$.
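The four-cycle example can be checked mechanically. The sketch below is illustrative only: the vertex labels 0..3 and the helper names `pair_distribution` and `marginals` are assumptions introduced here. It builds the distribution of the pair $S$ for several values of $\alpha$ and confirms that each vertex lies in $S$ with probability $1/2$ whatever $\alpha$ is, so the marginals alone do not determine the distribution of $S$.

```python
from fractions import Fraction
from itertools import combinations

def pair_distribution(alpha):
    """Distribution of the neighbour pair S of the fifth vertex (hypothetical helper)."""
    cycle_edges = {(0, 1), (1, 2), (2, 3), (0, 3)}  # four-cycle on vertices 0..3
    dist = {}
    for pair in combinations(range(4), 2):
        if pair in cycle_edges:
            dist[pair] = alpha                        # adjacent pair
        else:
            dist[pair] = Fraction(1, 2) - 2 * alpha   # nonadjacent pair
    return dist

def marginals(dist):
    """Probability that each of the four vertices lies in S."""
    return [sum(p for pair, p in dist.items() if v in pair) for v in range(4)]

for alpha in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
    dist = pair_distribution(alpha)
    # The total mass is 1 and every marginal is 1/2, independently of alpha.
    print(alpha, sum(dist.values()), marginals(dist))
```

Every vertex of the four-cycle has degree 2 out of a total degree of 8, so a marginal of $1/2$ is what degree-proportional choice of $m = 2$ neighbors would require.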
We start with the case $m = 1$. Consider a fixed sequence of vertices $v_1, v_2, \ldots$.
We shall inductively define a random graph process $(G_1^t)_{t \ge 0}$ so that $G_1^t$ is a
directed graph on $\{v_i : 1 \le i \le t\}$, as follows. Start with $G_1^0$ the “graph” with no
vertices, or with $G_1^1$ the graph with one vertex and one loop. Given $G_1^{t-1}$, form $G_1^t$
by adding the vertex $v_t$ together with a single edge directed from $v_t$ to $v_i$, where $i$ is
chosen randomly with
\[ \mathbb{P}(i = s) = \begin{cases} d_{G_1^{t-1}}(v_s)/(2t - 1), & 1 \le s \le t - 1, \\ 1/(2t - 1), & s = t. \end{cases} \tag{1} \]
In other words, we send an edge $e$ from $v_t$ to a random vertex $v_i$, where the
probability that a vertex is chosen as $v_i$ is proportional to its (total) degree at the time,
counting $e$ as already contributing one to the degree of $v_t$. For $m > 1$ we add $m$
edges from $v_t$ one at a time, counting the previous edges as well as the “outward
half” of the edge being added as already contributing to the degrees. Equivalently,
we define the process $(G_m^t)_{t \ge 0}$ by running the process $(G_1^t)$ on a sequence
$v'_1, v'_2, \ldots$; the graph $G_m^t$ is formed from $G_1^{mt}$ by identifying the vertices
$v'_1, v'_2, \ldots, v'_m$ to form $v_1$, identifying $v'_{m+1}, v'_{m+2}, \ldots, v'_{2m}$ to form
$v_2$, and so on.
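The definition translates directly into a sampler. The following is a minimal sketch, not the authors' code: the function names, the edge encoding, and the fixed seed are assumptions made for illustration. It keeps a list in which every edge records both of its endpoints, so that a uniform draw from the list selects a vertex with probability proportional to its total degree, as rule (1) requires, and it obtains $G_m^n$ by merging consecutive blocks of $m$ vertices of $G_1^{mn}$.

```python
import random
from collections import Counter

def sample_g1(t_max, rng):
    """Sample G_1^{t_max}; targets[t - 1] == i encodes the edge v_t -> v_i."""
    endpoints = []  # every edge lists both of its endpoints here
    targets = []
    for t in range(1, t_max + 1):
        endpoints.append(t)        # outward half of the new edge already counts for v_t
        i = rng.choice(endpoints)  # uniform over 2t - 1 entries, i.e. degree-proportional
        endpoints.append(i)        # inward half of the new edge
        targets.append(i)
    return targets

def sample_gm(n, m, rng):
    """Sample G_m^n by merging consecutive blocks of m vertices of G_1^{mn}."""
    def block(k):
        return (k - 1) // m + 1    # auxiliary vertex k belongs to merged vertex block(k)
    targets = sample_g1(m * n, rng)
    return [(block(t), block(i)) for t, i in enumerate(targets, start=1)]

def indegree_counts(edges, n):
    """Map each indegree d to the number of vertices v_1..v_n having it."""
    indeg = Counter(head for _, head in edges)
    return Counter(indeg[v] for v in range(1, n + 1))

rng = random.Random(0)  # fixed seed, chosen arbitrarily
n, m = 2000, 2
counts = indegree_counts(sample_gm(n, m, rng), n)
for d in range(5):
    print(d, counts[d] / n)  # empirical fraction of vertices with indegree d
```

Loops and multiple edges arise naturally in this representation: a loop at a vertex contributes one to its indegree and one to its outdegree.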
We shall write $\mathcal{G}_m^n$ for the probability space of directed graphs on $n$ vertices
$v_1, v_2, \ldots, v_n$ where a random $G_m^n \in \mathcal{G}_m^n$ has the distribution derived from
the process above. As $G_m^n$ is defined in terms of $G_1^{mn}$, for most of the time we shall
consider the case $m = 1$. As noted in [6], there is an alternative description of the
distribution of $G_1^n$ in terms of pairings.
An $n$-pairing is a partition of the set $\{1, 2, \ldots, 2n\}$ into pairs, so there are
$(2n - 1)!! = (2n)!/(n!\,2^n)$ $n$-pairings. Thinking of the elements $1, 2, \ldots, 2n$ of the
ground set as points on the $x$ axis, and the pairs as chords joining them, we shall
speak of the left and right endpoint of each pair.

We form a directed graph $\phi(P)$ from an $n$-pairing $P$ as follows: starting from
the left, merge all endpoints up to and including the first right endpoint reached to
form the vertex $v_1$. Then merge all further endpoints up to the next right endpoint
to form $v_2$, and so on to $v_n$. For the edges, replace each pair by a directed edge
from the vertex corresponding to its right endpoint to that corresponding to its left
endpoint. As noted in [6], if $P$ is chosen uniformly at random from all $(2n - 1)!!$
$n$-pairings, then $\phi(P)$ has the same distribution as a random $G_1^n \in \mathcal{G}_1^n$. This
statement is easy to prove by induction on $n$: thinking in terms of pairings of distinct
points on the $x$ axis, one can obtain a random $(n - 1)$-pairing from a random
$n$-pairing by deleting the pair containing the rightmost point. The reverse process,
starting from an $(n - 1)$-pairing $P$, is to add a new pair with its right endpoint to the
right of everything in $P$ and its left endpoint in one of the $2n - 1$ possible places.
Now a vertex of degree $d$ in $\phi(P)$ corresponds to $d$ intervals between endpoints
in $P$. The effect of adding a new pair to $P$ as described is thus to add a new vertex
to $\phi(P)$ together with a new edge to a vertex chosen according to (1), with $t = n$.
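For small $n$ the equivalence between pairings and the process can be verified by exhaustive enumeration. The sketch below is an illustration under stated assumptions: the names `pairings`, `phi` and `process_distribution`, and the encoding of a graph as a tuple of edge targets, are introduced here and do not come from [6]. It lists all $n$-pairings, maps each to a graph as described above, and compares the resulting distribution with the one obtained by applying rule (1) step by step, using exact rational arithmetic.

```python
from collections import Counter
from fractions import Fraction

def pairings(points):
    """Yield every partition of the sorted list `points` into pairs (left, right)."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for j, partner in enumerate(rest):
        for tail in pairings(rest[:j] + rest[j + 1:]):
            yield [(first, partner)] + tail

def phi(pairing):
    """Graph of a pairing as a tuple: entry t - 1 equal to i encodes the edge v_t -> v_i."""
    rights = sorted(r for _, r in pairing)
    def vertex_of(x):
        # an endpoint belongs to the vertex closed by the first right endpoint at or after it
        return next(j for j, r in enumerate(rights, start=1) if x <= r)
    return tuple(vertex_of(left) for left, _ in sorted(pairing, key=lambda p: p[1]))

def process_distribution(n):
    """Exact distribution of G_1^n under rule (1), as {targets tuple: probability}."""
    dist = {(): Fraction(1)}
    for t in range(1, n + 1):
        new = {}
        for targets, p in dist.items():
            for s in range(1, t + 1):
                # degree of v_s in G_1^{t-1} is one out-edge plus its in-edges;
                # for s == t only the outward half of the new edge counts
                deg = 1 + targets.count(s) if s < t else 1
                new[targets + (s,)] = p * Fraction(deg, 2 * t - 1)
        dist = new
    return dist

n = 4
pairing_counts = Counter(phi(p) for p in pairings(list(range(1, 2 * n + 1))))
total = sum(pairing_counts.values())  # number of n-pairings, (2n - 1)!!
from_pairings = {g: Fraction(c, total) for g, c in pairing_counts.items()}
print(total, from_pairings == process_distribution(n))
```

Enumeration is only feasible for small $n$, since the number of pairings grows as $(2n - 1)!!$; the point is to make the bijective bookkeeping concrete, not to replace the induction.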
The advantage of this description from pairings is that it gives us a simple
nonrecursive definition of the distribution of $G_m^n$, enabling us to calculate properties of
$G_m^n$ directly. We now use this to study the degrees of $G_m^n$.
3. THE DEGREES OF $G_m^n$
In [2] it was suggested that the fraction of vertices of $G_m^n$ having degree $d$ should fall
off as $d^{-3}$ as $d \to \infty$. We shall prove the following precise version of this statement,
writing $\#_m^n(d)$ for the number of vertices of $G_m^n$ with indegree equal to $d$, i.e., with
(total) degree $m + d$.
Theorem 1. Let $m \ge 1$ be fixed, and let $(G_m^n)_{n \ge 0}$ be the random graph process defined
in Section 2. Let
\[ \alpha_{m,d} = \frac{2m(m + 1)}{(d + m)(d + m + 1)(d + m + 2)}, \]
and let $\epsilon > 0$ be fixed. Then with probability tending to 1 as $n \to \infty$ we have
\[ (1 - \epsilon)\,\alpha_{m,d} \le \frac{\#_m^n(d)}{n} \le (1 + \epsilon)\,\alpha_{m,d} \]
for every $d$ in the range $0 \le d \le n^{1/15}$.
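Two elementary properties of $\alpha_{m,d}$ are easy to confirm numerically: the values sum to 1 over $d \ge 0$, as the limiting proportions must, and $d^3 \alpha_{m,d}$ tends to $2m(m + 1)$, which is the power law with exponent 3. The sketch below uses exact fractions; the function names are hypothetical and chosen for this illustration.

```python
from fractions import Fraction

def alpha(m, d):
    """alpha_{m,d} = 2m(m+1) / ((d+m)(d+m+1)(d+m+2)) as an exact fraction."""
    return Fraction(2 * m * (m + 1), (d + m) * (d + m + 1) * (d + m + 2))

def partial_sum(m, d_max):
    """Sum of alpha_{m,d} over 0 <= d <= d_max."""
    return sum(alpha(m, d) for d in range(d_max + 1))

for m in (1, 2, 3):
    # The partial sums approach 1 and d^3 * alpha_{m,d} approaches 2m(m+1).
    print(m, float(partial_sum(m, 10_000)), float(alpha(m, 1000) * 1000 ** 3))
```

The partial sums telescope, giving $1 - m(m + 1)/((D + m + 1)(D + m + 2))$ for the sum up to $d = D$. For $m = 1$ the first term is $\alpha_{1,0} = 2/3$, so in the limit two thirds of the vertices have indegree 0.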
It turns out that we only need to calculate the expectation of $\#_m^n(d)$; the
concentration result is then given by applying the following standard inequality due to
Azuma [1] and Hoeffding [11] (see also [5]).
Lemma 2 (Azuma–Hoeffding inequality). Let $(X_t)_{t=0}^n$ be a martingale with
$|X_{t+1} - X_t| \le c$ for $t = 0, \ldots, n - 1$. Then
\[ \mathbb{P}\bigl(|X_n - X_0| \ge x\bigr) \le \exp\left(-\frac{x^2}{2c^2 n}\right). \]

The strategy of the proof is as follows. First, as mentioned earlier, the results for
general $m$ will follow from those for $m = 1$. We shall use the pairing model to find
explicitly the distribution of $D_k$, the sum of the first $k$ degrees, in this case, and
also the distribution of the next degree, $d_{G_1^n}(v_{k+1})$, given $D_k$. One could combine
these formulae to give a rather unilluminating expression for the distribution of
$d_{G_1^n}(v_{k+1})$; instead we show that $D_k$ is concentrated about a certain value and hence
find approximately the probability that $d_{G_1^n}(v_{k+1}) = d$. Summing over $k$ gives us the
expectation of $\#_1^n(d)$, and concentration follows from Lemma 2.
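Before turning to the distributions of the (total) degrees for $m = 1$, we note that
their expectations are easy to calculate exactly:
\[ \mathbb{E}\bigl(d_{G_1^t}(v_t)\bigr) = 1 + \frac{1}{2t - 1}. \]
Also, for $s < t$,
\[ \mathbb{E}\bigl(d_{G_1^t}(v_s) \mid d_{G_1^{t-1}}(v_s)\bigr) = d_{G_1^{t-1}}(v_s) + \frac{d_{G_1^{t-1}}(v_s)}{2t - 1}, \]
which implies that
\[ \mathbb{E}\bigl(d_{G_1^t}(v_s)\bigr) = \frac{2t}{2t - 1}\, \mathbb{E}\bigl(d_{G_1^{t-1}}(v_s)\bigr). \]
Thus, for $1 \le s \le n$,
\[ \mathbb{E}\bigl(d_{G_1^n}(v_s)\bigr) = \prod_{i=s}^{n} \frac{2i}{2i - 1} = \frac{4^{n-s+1}\, n!^2\, (2s - 2)!}{(2n)!\, (s - 1)!^2} = \sqrt{n/s}\,\bigl(1 + O(1/s)\bigr), \]
using Stirling’s formula.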
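The three expressions for the expected degree can be compared directly. The sketch below is illustrative, with function names invented here. It evaluates the product and the factorial closed form exactly and prints their ratio to $\sqrt{n/s}$, which should be close to 1 when $s$ is large and noticeably different from 1 when $s$ is small.

```python
from fractions import Fraction
from math import factorial, sqrt

def expected_degree_product(n, s):
    """Expected degree of v_s in G_1^n as the product of 2i/(2i-1) over s <= i <= n."""
    result = Fraction(1)
    for i in range(s, n + 1):
        result *= Fraction(2 * i, 2 * i - 1)
    return result

def expected_degree_closed(n, s):
    """The same expectation via the factorial closed form."""
    return Fraction(4 ** (n - s + 1) * factorial(n) ** 2 * factorial(2 * s - 2),
                    factorial(2 * n) * factorial(s - 1) ** 2)

n = 1000
for s in (1, 10, 100, 1000):
    exact = expected_degree_product(n, s)
    assert exact == expected_degree_closed(n, s)  # the two exact forms agree
    print(s, float(exact), float(exact) / sqrt(n / s))
```

As a further consistency check, the expected degrees sum to $2n$ over $1 \le s \le n$, since $G_1^n$ always has exactly $n$ edges.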
If every degree of $G_1^n$ were equal to its expectation this would give the proposed
distribution, but in fact the degrees can be far from their expectations. Indeed we
shall see that for almost all vertices the most likely degree is 1!
Let us write $d_i$ for $d_{G_1^n}(v_i)$, i.e., for the (total) degree of the vertex $v_i$ in the
graph $G_1^n$. Our aim is to describe the distributions of the individual $d_i$. To do this it
turns out to be useful to consider their sums $D_k = \sum_{i=1}^k d_i$.
Consider first the event $\{D_k - 2k = s\}$, where $0 \le s \le n - k$. This is the event
that the last $n - k$ vertices of $G_1^n$ send exactly $s$ edges to the first $k$ vertices. This
event corresponds to pairings $P$ in which the $k$th right endpoint is $2k + s$. Consider
any pairing $P$ with this property. We shall split $P$ into two partial pairings, the left
partial pairing $L$ and the right partial pairing $R$, each consisting of some number
of pairs together with some unpaired elements. For $L$ we take the partial pairing
on $\{1, \ldots, 2k + s\}$ induced by $P$, for $R$ that on $\{2k + s + 1, \ldots, 2n\}$. From the
restriction on $P$, in $L$ the element $2k + s$ must be paired with one of
$\{1, \ldots, 2k + s - 1\}$, precisely $s$ of the remaining $2k + s - 2$ elements must be unpaired,
and the other $2k - 2$ elements must be paired off somehow. Any of the
\[ (2k + s - 1) \binom{2k + s - 2}{s} \frac{(2k - 2)!}{2^{k-1}\,(k - 1)!} \]
References
Emergence of Scaling in Random Networks. Journal article.
Probability Inequalities for Sums of Bounded Random Variables. Book chapter.
Random Graphs. Book.
Mean-field theory for scale-free random networks. Journal article.