Maximizing the Spread of Influence through a Social
Network
David Kempe
Dept. of Computer Science
Cornell University, Ithaca NY
kempe@cs.cornell.edu
Jon Kleinberg
Dept. of Computer Science
Cornell University, Ithaca NY
kleinber@cs.cornell.edu
Éva Tardos
Dept. of Computer Science
Cornell University, Ithaca NY
eva@cs.cornell.edu
ABSTRACT
Models for the processes by which ideas and influence propagate
through a social network have been studied in a number of do-
mains, including the diffusion of medical and technological innova-
tions, the sudden and widespread adoption of various strategies in
game-theoretic settings, and the effects of “word of mouth” in the
promotion of new products. Recently, motivated by the design of
viral marketing strategies, Domingos and Richardson posed a fun-
damental algorithmic problem for such social network processes:
if we can try to convince a subset of individuals to adopt a new
product or innovation, and the goal is to trigger a large cascade of
further adoptions, which set of individuals should we target?
We consider this problem in several of the most widely studied
models in social network analysis. The optimization problem of
selecting the most influential nodes is NP-hard here, and we pro-
vide the first provable approximation guarantees for efficient algo-
rithms. Using an analysis framework based on submodular func-
tions, we show that a natural greedy strategy obtains a solution that
is provably within 63% of optimal for several classes of models;
our framework suggests a general approach for reasoning about the
performance guarantees of algorithms for these types of influence
problems in social networks.
We also provide computational experiments on large collabora-
tion networks, showing that in addition to their provable guaran-
tees, our approximation algorithms significantly out-perform node-
selection heuristics based on the well-studied notions of degree
centrality and distance centrality from the field of social networks.
Categories and Subject Descriptors
F.2.2 [Analysis of Algorithms and Problem Complexity]: Non-
numerical Algorithms and Problems
Supported by an Intel Graduate Fellowship and an NSF Graduate
Research Fellowship.
Supported in part by a David and Lucile Packard Foundation Fel-
lowship and NSF ITR/IM Grant IIS-0081334.
Supported in part by NSF ITR grant CCR-011337, and ONR grant
N00014-98-1-0589.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
SIGKDD ’03 Washington, DC, USA
Copyright 2003 ACM 1-58113-737-0/03/0008 ...$5.00.
Keywords
approximation algorithms, social networks, viral marketing,
diffusion of innovations
1. INTRODUCTION
A social network, the graph of relationships and interactions
within a group of individuals, plays a fundamental role as a
medium for the spread of information, ideas, and influence among
its members. An idea or innovation will appear (for example, the
use of cell phones among college students, the adoption of a new
drug within the medical profession, or the rise of a political
movement in an unstable society), and it can either die out quickly
or make significant inroads into the population. If we want to un-
derstand the extent to which such ideas are adopted, it can be im-
portant to understand how the dynamics of adoption are likely to
unfold within the underlying social network: the extent to which
people are likely to be affected by decisions of their friends and
colleagues, or the extent to which “word-of-mouth” effects will
take hold. Such network diffusion processes have a long history
of study in the social sciences. Some of the earliest systematic
investigations focused on data pertaining to the adoption of medi-
cal and agricultural innovations in both developed and developing
parts of the world [8, 27, 29]; in other contexts, research has inves-
tigated diffusion processes for “word-of-mouth” and “viral market-
ing” effects in the success of new products [4, 7, 10, 13, 14, 20, 26],
the sudden and widespread adoption of various strategies in game-
theoretic settings [6, 12, 21, 32, 33], and the problem of cascading
failures in power systems [2, 3].
In recent work, motivated by applications to marketing, Domin-
gos and Richardson posed a fundamental algorithmic problem for
such systems [10, 26]. Suppose that we have data on a social
network, with estimates for the extent to which individuals influ-
ence one another, and we would like to market a new product that
we hope will be adopted by a large fraction of the network. The
premise of viral marketing is that by initially targeting a few “influ-
ential” members of the network (say, by giving them free samples
of the product), we can trigger a cascade of influence by which
friends will recommend the product to other friends, and many in-
dividuals will ultimately try it. But how should we choose the few
key individuals to use for seeding this process? In [10, 26], this
question was considered in a probabilistic model of interaction;
heuristics were given for choosing customers with a large overall
effect on the network, and methods were also developed to infer the
influence data necessary for posing these types of problems.
In this paper, we consider the issue of choosing influential sets of
individuals as a problem in discrete optimization. The optimization
problem is NP-hard for most models that have been studied, including
the model of [10]. The framework proposed in [26], on the other

hand, is based on a simple linear model where the solution to the
optimization problem can be obtained by solving a system of linear
equations. Here we focus on a collection of related, NP-hard mod-
els that have been extensively studied in the social networks com-
munity, and obtain the first provable approximation guarantees for
efficient algorithms in a number of general cases. The generality
of the models we consider lies between that of the polynomial-time
solvable model of [26] and the very general model of [10], where
the optimization problem cannot even be approximated to within a
non-trivial factor.
We begin by departing somewhat from the Domingos-Richardson
framework in the following sense: where their models are essen-
tially descriptive, specifying a joint distribution over all nodes’ be-
havior in a global sense, we focus on more operational models
from mathematical sociology [15, 28] and interacting particle sys-
tems [11, 17] that explicitly represent the step-by-step dynamics
of adoption. We show that approximation algorithms for maximiz-
ing the spread of influence in these models can be developed in
a general framework based on submodular functions [9, 23]. We
also provide computational experiments on large collaboration net-
works, showing that in addition to their provable guarantees, our al-
gorithms significantly out-perform node-selection heuristics based
on the well-studied notions of degree centrality and distance cen-
trality [30] from the field of social network analysis.
Two Basic Diffusion Models. In considering operational models
for the spread of an idea or innovation through a social network
G, represented by a directed graph, we will speak of each indi-
vidual node as being either active (an adopter of the innovation)
or inactive. We will focus on settings, guided by the motivation
discussed above, in which each node’s tendency to become active
increases monotonically as more of its neighbors become active.
Also, we will focus for now on the progressive case in which nodes
can switch from being inactive to being active, but do not switch in
the other direction; it turns out that this assumption can easily be
lifted later. Thus, the process will look roughly as follows from the
perspective of an initially inactive node v: as time unfolds, more
and more of v’s neighbors become active; at some point, this may
cause v to become active, and v’s decision may in turn trigger fur-
ther decisions by nodes to which v is connected.
Granovetter and Schelling were among the first to propose mod-
els that capture such a process; their approach was based on the use
of node-specific thresholds [15, 28]. Many models of this flavor
have since been investigated (see e.g. [5, 15, 18, 19, 21, 25, 28, 29,
31, 32, 33]), but the following Linear Threshold Model lies at the
core of most subsequent generalizations. In this model, a node v is
influenced by each neighbor w according to a weight b_{v,w} such
that Σ_{w neighbor of v} b_{v,w} ≤ 1. The dynamics of the process
then proceed as follows. Each node v chooses a threshold θ_v
uniformly at random from the interval [0, 1]; this represents the
weighted fraction of v’s neighbors that must become active in order
for v to become active. Given a random choice of thresholds, and an
initial set of active nodes A_0 (with all other nodes inactive), the
diffusion process unfolds deterministically in discrete steps: in
step t, all nodes that were active in step t − 1 remain active, and
we activate any node v for which the total weight of its active
neighbors is at least θ_v:

    Σ_{w active neighbor of v} b_{v,w} ≥ θ_v.
Thus, the thresholds θ_v intuitively represent the different latent
tendencies of nodes to adopt the innovation when their neighbors do;
the fact that these are randomly selected is intended to model our
lack of knowledge of their values: we are in effect averaging
over possible threshold values for all the nodes. (Another class of
approaches hard-wires all thresholds at a known value like 1/2; see
for example work by Berger [5], Morris [21], and Peleg [25].)
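The threshold dynamics just described can be sketched in a few lines of code. The following is a minimal illustration, not code from the paper; the dictionary encoding of the weights b_{v,w} and the function name are our own choices.

```python
import random

def run_linear_threshold(weights, seed_set, rng=random.Random(0)):
    """One run of the Linear Threshold process.

    weights[v] maps each neighbor w of v to b_{v,w}, with
    sum(weights[v].values()) <= 1.  Each threshold theta_v is drawn
    uniformly from [0, 1]; an inactive node activates once the total
    weight of its active neighbors reaches theta_v.  Activation is
    progressive: nodes never switch back to inactive.
    """
    theta = {v: rng.random() for v in weights}
    active = set(seed_set)
    changed = True
    while changed:
        changed = False
        for v in weights:
            if v in active:
                continue
            total = sum(b for w, b in weights[v].items() if w in active)
            if total >= theta[v]:
                active.add(v)
                changed = True
    return active

# Toy 3-node chain: b listens only to a, c listens only to b.
# With weight 1.0 edges, activation propagates for any thresholds.
weights = {"a": {}, "b": {"a": 1.0}, "c": {"b": 1.0}}
print(run_linear_threshold(weights, {"a"}))
```

Because the toy weights are 1.0 and thresholds lie in [0, 1), the whole chain activates regardless of the random thresholds; with fractional weights the outcome would vary run to run.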
Based on work in interacting particle systems [11, 17] from prob-
ability theory, we can also consider dynamic cascade models for
diffusion processes. The conceptually simplest model of this type
is what one could call the Independent Cascade Model, investi-
gated recently in the context of marketing by Goldenberg, Libai,
and Muller [13, 14]. We again start with an initial set of active
nodes A_0, and the process unfolds in discrete steps according to
the following randomized rule. When node v first becomes active
in step t, it is given a single chance to activate each currently inac-
tive neighbor w; it succeeds with a probability p_{v,w} (a parameter
of the system), independently of the history thus far. (If w has
multiple newly activated neighbors, their attempts are sequenced in
an arbitrary order.) If v succeeds, then w will become active in step
t + 1; but whether or not v succeeds, it cannot make any further at-
tempts to activate w in subsequent rounds. Again, the process runs
until no more activations are possible.
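The cascade rule can likewise be sketched in code. This is an illustrative simulation, not code from the paper; the edge-probability encoding and names are our own.

```python
import random

def run_independent_cascade(prob, seed_set, rng=random.Random(1)):
    """One run of the Independent Cascade process.

    prob[(v, w)] is the activation probability p_{v,w} on the directed
    edge (v, w).  Each node, on first becoming active, gets a single
    chance to activate each currently inactive out-neighbor.
    """
    active = set(seed_set)
    frontier = list(seed_set)          # nodes activated in the last step
    while frontier:
        next_frontier = []
        for v in frontier:
            for (src, w), p in prob.items():
                if src == v and w not in active and rng.random() < p:
                    active.add(w)      # w becomes active in step t + 1
                    next_frontier.append(w)
        frontier = next_frontier
    return active

# Deterministic toy instance: the first two edges always fire,
# the last never does, so the cascade from {a} reaches exactly a, b, c.
prob = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 0.0}
print(run_independent_cascade(prob, {"a"}))
```

Scanning all edges per frontier node is quadratic but keeps the sketch short; an adjacency list would be the natural optimization.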
The Linear Threshold and Independent Cascade Models are two
of the most basic and widely-studied diffusion models, but of course
many extensions can be considered. We will turn to this issue later
in the paper, proposing a general framework that simultaneously
includes both of these models as special cases. For the sake of con-
creteness in the introduction, we will discuss our results in terms of
these two models in particular.
Approximation Algorithms for Influence Maximization. We are
now in a position to formally express the Domingos-Richardson
style of optimization problem (choosing a good initial set of
nodes to target) in the context of the above models. Both the
Linear Threshold and Independent Cascade Models (as well as the
generalizations to follow) involve an initial set of active nodes A_0
that start the diffusion process. We define the influence of a set
of nodes A, denoted σ(A), to be the expected number of active
nodes at the end of the process, given that A is this initial active
set A_0. The influence maximization problem asks, for a parameter
k, to find a k-node set of maximum influence. (When dealing with
algorithms for this problem, we will say that the chosen set A of
k initial active nodes has been targeted for activation by the algo-
rithm.) For the models we consider, it is NP-hard to determine the
optimum for influence maximization, as we will show later.
Our first main result is that the optimal solution for influence
maximization can be efficiently approximated to within a factor
of (1 − 1/e − ε), in both the Linear Threshold and Independent
Cascade models; here e is the base of the natural logarithm and
ε is any positive real number. (Thus, this is a performance guar-
antee slightly better than 63%.) The algorithm that achieves this
performance guarantee is a natural greedy hill-climbing strategy
related to the approach considered in [10], and so the main con-
tent of this result is the analysis framework needed for obtaining a
provable performance guarantee, and the fairly surprising fact that
hill-climbing is always within a factor of at least 63% of optimal
for this problem. We prove this result in Section 2 using techniques
from the theory of submodular functions [9, 23], which we describe
in detail below, and which turn out to provide a natural context for
reasoning about both models and algorithms for influence maxi-
mization.
In fact, this analysis framework allows us to design and prove
guarantees for approximation algorithms in much richer and more
realistic models of the processes by which we market to nodes. The

deterministic activation of individual nodes is a highly simplified
model; an issue also considered in [10, 26] is that we may in reality
have a large number of different marketing actions available, each
of which may influence nodes in different ways. The available bud-
get can be divided arbitrarily between these actions. We show how
to extend the analysis to this substantially more general framework.
Our main result here is that a generalization of the hill-climbing al-
gorithm still provides approximation guarantees arbitrarily close to
(1 − 1/e).
It is worth briefly considering the general issue of performance
guarantees for algorithms in these settings. For both the Linear
Threshold and the Independent Cascade models, the influence max-
imization problem is NP-complete, but it can be approximated well.
In the linear model of Richardson and Domingos [26], on the other
hand, both the propagation of influence as well as the effect of the
initial targeting are linear. Initial marketing decisions here are thus
limited in their effect on node activations; each node’s probability
of activation is obtained as a linear combination of the effect of tar-
geting and the effect of the neighbors. In this fully linear model,
the influence can be maximized by solving a system of linear equa-
tions. In contrast, we can show that general models like that of
Domingos and Richardson [10], and even simple models that build
in a fixed threshold (like 1/2) at all nodes [5, 21, 25], lead to influ-
ence maximization problems that cannot be approximated to within
any non-trivial factor, assuming P ≠ NP. Our analysis of approx-
imability thus suggests a way of tracing out a more delicate bound-
ary of tractability through the set of possible models, by helping to
distinguish among those for which simple heuristics provide strong
performance guarantees and those for which they can be arbitrarily
far from optimal. This in turn can suggest the development of both
more powerful algorithms, and the design of accurate models that
simultaneously allow for tractable optimization.
Following the approximation and NP-hardness results, we de-
scribe in Section 3 the results of computational experiments with
both the Linear Threshold and Independent Cascade Models, show-
ing that the hill-climbing algorithm significantly out-performs strate-
gies based on targeting high-degree or “central” nodes [30]. In Sec-
tion 4 we then develop a general model of diffusion processes in
social networks that simultaneously generalizes the Linear Thresh-
old and Independent Cascade Models, as well as a number of other
natural cases, and we show how to obtain approximation guaran-
tees for a large sub-class of these models. In Sections 5 and 6, we
also consider extensions of our approximation algorithms to mod-
els with more realistic scenarios in mind: more complex market-
ing actions as discussed above, and non-progressive processes, in
which active nodes may become inactive in subsequent steps.
2. APPROXIMATION GUARANTEES IN THE
INDEPENDENT CASCADE AND LINEAR
THRESHOLD MODELS
The overall approach. We begin by describing our strategy for
proving approximation guarantees. Consider an arbitrary function
f(·) that maps subsets of a finite ground set U to non-negative real
numbers.¹ We say that f is submodular if it satisfies a natural “di-
minishing returns” property: the marginal gain from adding an ele-
ment to a set S is at least as high as the marginal gain from adding
the same element to a superset of S. Formally, a submodular func-
tion satisfies

    f(S ∪ {v}) − f(S) ≥ f(T ∪ {v}) − f(T),

for all elements v and all pairs of sets S ⊆ T.

¹Note that the influence function σ(·) defined above has this form;
it maps each subset A of the nodes of the social network to a real
number denoting the expected size of the activated set if A is tar-
geted for initial activation.
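The diminishing-returns inequality can be checked exhaustively on a small example. Coverage functions, which count the size of the union of a family of sets, are a standard monotone submodular family; the family FAMILY below is a made-up illustration, not data from the paper.

```python
from itertools import combinations

# f(S) = |union of the sets indexed by S| is monotone submodular.
FAMILY = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}}

def cover(S):
    covered = set()
    for name in S:
        covered |= FAMILY[name]
    return len(covered)

ground = set(FAMILY)
subsets = [set(c) for r in range(len(ground) + 1)
           for c in combinations(sorted(ground), r)]

# Check f(S + v) - f(S) >= f(T + v) - f(T) for every S subset of T
# and every element v outside T.
for S in subsets:
    for T in subsets:
        if S <= T:
            for v in ground - T:
                gain_S = cover(S | {v}) - cover(S)
                gain_T = cover(T | {v}) - cover(T)
                assert gain_S >= gain_T
print("diminishing returns verified on this coverage function")
```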
Submodular functions have a number of very nice tractability
properties; the one that is relevant to us here is the following. Sup-
pose we have a function f that is submodular, takes only non-
negative values, and is monotone in the sense that adding an ele-
ment to a set cannot cause f to decrease: f(S ∪ {v}) ≥ f(S)
for all elements v and sets S. We wish to find a k-element set S
for which f(S) is maximized. This is an NP-hard optimization
problem (it can be shown to contain the Hitting Set problem as a
simple special case), but a result of Nemhauser, Wolsey, and Fisher
[9, 23] shows that the following greedy hill-climbing algorithm ap-
proximates the optimum to within a factor of (1 − 1/e) (where e
is the base of the natural logarithm): start with the empty set, and
repeatedly add an element that gives the maximum marginal gain.
THEOREM 2.1. [9, 23] For a non-negative, monotone submod-
ular function f, let S be a set of size k obtained by selecting ele-
ments one at a time, each time choosing an element that provides
the largest marginal increase in the function value. Let S* be a
set that maximizes the value of f over all k-element sets. Then
f(S) ≥ (1 − 1/e) · f(S*); in other words, S provides a (1 − 1/e)-
approximation.
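The greedy rule in Theorem 2.1 is only a few lines of code. The sketch below applies it to a hypothetical coverage objective (coverage functions are monotone submodular, so the guarantee applies); the objective and all names are our own illustration.

```python
def greedy_max(ground, f, k):
    """Greedy hill-climbing: start from the empty set and repeatedly
    add the element with the largest marginal gain in f."""
    S = set()
    for _ in range(k):
        # sorted() only to make tie-breaking deterministic
        best = max(sorted(ground - S), key=lambda v: f(S | {v}) - f(S))
        S.add(best)
    return S

# Hypothetical coverage objective over four candidate sets.
SETS = {"s1": {1, 2, 3}, "s2": {2, 3}, "s3": {4, 5}, "s4": {5, 6}}
f = lambda S: len(set().union(*[SETS[v] for v in S]))

chosen = greedy_max(set(SETS), f, 2)
print(chosen, f(chosen))
```

Each of the k rounds evaluates f once per remaining element; for influence maximization those evaluations are themselves only estimated, which is exactly the subtlety the text addresses next.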
Due to its generality, this result has found applications in a num-
ber of areas of discrete optimization (see e.g. [22]); the only direct
use of it that we are aware of in the databases and data mining lit-
erature is in a context very different from ours, for the problem of
selecting database views to materialize [16].
Our strategy will be to show that for the models we are consid-
ering, the resulting influence function σ(·) is submodular. A subtle
difficulty lies in the fact that the result of Nemhauser et al. assumes
that the greedy algorithm can evaluate the underlying function ex-
actly, which may not be the case for the influence function σ(A).
However, by simulating the diffusion process and sampling the re-
sulting active sets, we are able to obtain arbitrarily close approxi-
mations to σ(A), with high probability. Furthermore, one can ex-
tend the result of Nemhauser et al. to show that for any ε > 0, there
is a γ > 0 such that by using (1 + γ)-approximate values for the
function to be optimized, we obtain a (1 − 1/e − ε)-approximation.
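The sampling idea can be sketched as follows: estimate σ(A) by averaging the final active-set size over repeated simulations of the cascade. The instance, run count, and seed below are illustrative assumptions, not values from the paper.

```python
import random

def estimate_sigma(prob, A, runs=2000, rng=random.Random(42)):
    """Monte Carlo estimate of sigma(A) for the Independent Cascade
    Model: average the number of activated nodes over many runs."""
    def one_run():
        active = set(A)
        frontier = list(A)
        while frontier:
            nxt = []
            for v in frontier:
                for (src, w), p in prob.items():
                    if src == v and w not in active and rng.random() < p:
                        active.add(w)
                        nxt.append(w)
            frontier = nxt
        return len(active)
    return sum(one_run() for _ in range(runs)) / runs

# Toy instance: single edge a -> b with p = 0.5, so the true value is
# sigma({a}) = 1 + 0.5 = 1.5; the estimate converges to it.
est = estimate_sigma({("a", "b"): 0.5}, {"a"})
print(est)
```

The number of runs controls the (1 + γ) accuracy in the extended Nemhauser et al. bound: more samples tighten the estimate with high probability.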
As mentioned in the introduction, we can extend this analysis
to a general model with more complex marketing actions that can
have a probabilistic effect on the initial activation of nodes. We
show in Section 6 how, with a more careful hill-climbing algorithm
and a generalization of Theorem 2.1, we can obtain comparable
approximation guarantees in this setting.
A further extension is to assume that each node v has an asso-
ciated non-negative weight w_v, capturing how important it is that
v be activated in the final outcome. (For instance, if we are mar-
keting textbooks to college teachers, then the weight could be the
number of students in the teacher’s class, resulting in a larger or
smaller number of sales.) If we let B denote the (random) set ac-
tivated by the process with initial activation A, then we can define
the weighted influence function σ_w(A) to be the expected value
over outcomes B of the quantity Σ_{v∈B} w_v. The influence func-
tion studied above is the special case obtained by setting w_v = 1
for all nodes v. The objective function with weights is submodular
whenever the unweighted version is, so we can still use the greedy
algorithm for obtaining a (1 − 1/e − ε)-approximation. Note, how-
ever, that a sampling algorithm to approximately choose the next
element may need time that depends on the sizes of the weights.

Independent Cascade
In view of the above discussion, an approximation guarantee for
influence maximization in the Independent Cascade Model will be
a consequence of the following.

THEOREM 2.2. For an arbitrary instance of the Independent
Cascade Model, the resulting influence function σ(·) is submodu-
lar.
In order to establish this result, we need to look, implicitly or
explicitly, at the expression σ(A ∪ {v}) − σ(A), for arbitrary sets
A and elements v. In other words, what increase do we get in the
expected number of overall activations when we add v to the set
A? This increase is very difficult to analyze directly, because it is
hard to work with quantities of the form σ(A). For example, the
Independent Cascade process is underspecified, since we have not
prescribed the order in which newly activated nodes in a given step
t will attempt to activate their neighbors. Thus, it is not initially
obvious that the process is even well-defined, in the sense that it
yields the same distribution over outcomes regardless of how we
schedule the attempted activations.
Our proof deals with these difficulties by formulating an equiv-
alent view of the process, which makes it easier to see that there
is an order-independent outcome, and which provides an alternate
way to reason about the submodularity property.
Consider a point in the cascade process when node v has just be-
come active, and it attempts to activate its neighbor w, succeeding
with probability p_{v,w}. We can view the outcome of this random
event as being determined by flipping a coin of bias p_{v,w}. From
the point of view of the process, it clearly does not matter whether
the coin was flipped at the moment that v became active, or whether
it was flipped at the very beginning of the whole process and is only
being revealed now. Continuing this reasoning, we can in fact as-
sume that for each pair of neighbors (v, w) in the graph, a coin of
bias p_{v,w} is flipped at the very beginning of the process (indepen-
dently of the coins for all other pairs of neighbors), and the result is
stored so that it can be later checked in the event that v is activated
while w is still inactive.
With all the coins flipped in advance, the process can be viewed
as follows. The edges in G for which the coin flip indicated an
activation will be successful are declared to be live; the remaining
edges are declared to be blocked. If we fix the outcomes of the coin
flips and then initially activate a set A, it is clear how to determine
the full set of active nodes at the end of the cascade process:
CLAIM 2.3. A node x ends up active if and only if there is a
path from some node in A to x consisting entirely of live edges.
(We will call such a path a live-edge path.)
Consider the probability space in which each sample point spec-
ifies one possible set of outcomes for all the coin flips on the edges.
Let X denote one sample point in this space, and define σ_X(A) to
be the total number of nodes activated by the process when A is
the set initially targeted, and X is the set of outcomes of all coin
flips on edges. Because we have fixed a choice for X, σ_X(A) is in
fact a deterministic quantity, and there is a natural way to express
its value, as follows. Let R(v, X) denote the set of all nodes that
can be reached from v on a path consisting entirely of live edges.
By Claim 2.3, σ_X(A) is the number of nodes that can be reached
on live-edge paths from any node in A, and so it is equal to the
cardinality of the union ∪_{v∈A} R(v, X).
Proof of Theorem 2.2. First, we claim that for each fixed out-
come X, the function σ_X(·) is submodular. To see this, let S and
T be two sets of nodes such that S ⊆ T, and consider the quantity
σ_X(S ∪ {v}) − σ_X(S). This is the number of elements in R(v, X)
that are not already in the union ∪_{u∈S} R(u, X); it is at least as
large as the number of elements in R(v, X) that are not in the
(bigger) union ∪_{u∈T} R(u, X). It follows that

    σ_X(S ∪ {v}) − σ_X(S) ≥ σ_X(T ∪ {v}) − σ_X(T),

which is the defining inequality for submodularity. Finally, we have

    σ(A) = Σ_{outcomes X} Prob[X] · σ_X(A),

since the expected number of nodes activated is just the weighted
average over all outcomes. But a non-negative linear combination
of submodular functions is also submodular, and hence σ(·) is sub-
modular, which concludes the proof.
Next we show the hardness of influence maximization.
THEOREM 2.4. The influence maximization problem is NP-hard
for the Independent Cascade Model.
Proof. Consider an instance of the NP-complete Set Cover prob-
lem, defined by a collection of subsets S_1, S_2, ..., S_m of a ground
set U = {u_1, u_2, ..., u_n}; we wish to know whether there exist
k of the subsets whose union is equal to U. (We can assume that
k < n < m.) We show that this can be viewed as a special case of
the influence maximization problem.
Given an arbitrary instance of the Set Cover problem, we define
a corresponding directed bipartite graph with n + m nodes: there
is a node i corresponding to each set S_i, a node j corresponding
to each element u_j, and a directed edge (i, j) with activation prob-
ability p_{i,j} = 1 whenever u_j ∈ S_i. The Set Cover problem is
equivalent to deciding if there is a set A of k nodes in this graph
with σ(A) ≥ n + k. Note that for the instance we have defined,
activation is a deterministic process, as all probabilities are 0 or
1. Initially activating the k nodes corresponding to sets in a Set
Cover solution results in activating all n nodes corresponding to
the ground set U, and if any set A of k nodes has σ(A) ≥ n + k,
then the Set Cover problem must be solvable.
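The reduction can be written out directly. This sketch uses our own node-labeling scheme; since every probability is 0 or 1, the influence of a seed set is just deterministic reachability.

```python
def set_cover_to_ic(sets):
    """Build the reduction's bipartite instance: a node per set, a
    node per element, and an edge (set node, element node) with
    p = 1 whenever the element belongs to the set."""
    prob = {}
    for name, members in sets.items():
        for u in members:
            prob[(("set", name), ("elt", u))] = 1.0
    return prob

def sigma_deterministic(prob, A):
    # With all probabilities 0 or 1, influence is plain reachability.
    active = set(A)
    changed = True
    while changed:
        changed = False
        for (v, w), p in prob.items():
            if p == 1.0 and v in active and w not in active:
                active.add(w)
                changed = True
    return len(active)

sets = {"S1": {1, 2}, "S2": {2, 3}, "S3": {3, 4}}
prob = set_cover_to_ic(sets)
# {S1, S3} covers the universe {1, 2, 3, 4}, so targeting those k = 2
# set nodes activates all n = 4 element nodes: sigma = n + k = 6.
A = {("set", "S1"), ("set", "S3")}
print(sigma_deterministic(prob, A))
```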
Linear Thresholds
We now prove an analogous result for the Linear Threshold Model.
THEOREM 2.5. For an arbitrary instance of the Linear Thresh-
old Model, the resulting influence function σ(·) is submodular.
Proof. The analysis is a bit more intricate than in the proof of The-
orem 2.2, but the overall argument has a similar structure. In the
proof of Theorem 2.2, we constructed an equivalent process by ini-
tially resolving the outcomes of some random choices, considering
each outcome in isolation, and then averaging over all outcomes.
For the Linear Threshold Model, the simplest analogue would be to
consider the behavior of the process after all node thresholds have
been chosen. Unfortunately, for a fixed choice of thresholds, the
number of activated nodes is not in general a submodular function
of the targeted set; this fact necessitates a more subtle analysis.
Recall that each node v has an influence weight b_{v,w} ≥ 0 from
each of its neighbors w, subject to the constraint that Σ_w b_{v,w} ≤ 1.
(We can extend the notation by writing b_{v,w} = 0 when w is not a
neighbor of v.) Suppose that v picks at most one of its incoming
edges at random, selecting the edge from w with probability b_{v,w}
and selecting no edge with probability 1 − Σ_w b_{v,w}. The selected
edge is declared to be “live,” and all other edges are declared to
be “blocked.” (Note the contrast with the proof of Theorem 2.2:
there, we determined whether an edge was live independently of
there, we determined whether an edge was live independently of

the decision for each other edge; here, we negatively correlate the
decisions so that at most one live edge enters each node.)
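This correlated edge-selection step can be sketched as follows; the names and graph encoding are our own. The code asserts the property the construction guarantees: at most one live edge enters each node.

```python
import random

def pick_live_edges(weights, rng=random.Random(7)):
    """Each node v selects at most one incoming edge: the edge from
    neighbor w with probability b_{v,w}, and no edge with probability
    1 - sum_w b_{v,w}.  (Contrast with the independent per-edge coin
    flips used for the Independent Cascade Model.)"""
    live = set()
    for v, incoming in weights.items():
        r = rng.random()
        cumulative = 0.0
        for w, b in incoming.items():
            cumulative += b
            if r < cumulative:
                live.add((w, v))   # the live edge enters v from w
                break              # at most one incoming edge chosen
    return live

weights = {"a": {}, "b": {"a": 0.4, "c": 0.3}, "c": {"a": 0.8}}
live = pick_live_edges(weights)

# Each node is the target of at most one live edge.
targets = [v for (_, v) in live]
assert len(targets) == len(set(targets))
print(live)
```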
The crux of the proof lies in establishing Claim 2.6 below, which
asserts that the Linear Threshold Model is equivalent to reachabil-
ity via live-edge paths as defined above. Once that equivalence is
established, submodularity follows exactly as in the proof of The-
orem 2.2. We can define R(v, X) as before to be the set of all
nodes reachable from v on live-edge paths, subject to a choice X
of live/blocked designations for all edges; it follows that σ_X(A) is
the cardinality of the union ∪_{v∈A} R(v, X), and hence a submodu-
lar function of A; finally, the function σ(·) is a non-negative linear
combination of the functions σ_X(·) and hence also submodular.
CLAIM 2.6. For a given targeted set A, the following two dis-
tributions over sets of nodes are the same:
(i) The distribution over active sets obtained by running the Lin-
ear Threshold process to completion starting from A; and
(ii) The distribution over sets reachable from A via live-edge paths,
under the random selection of live edges defined above.
Proof. We need to prove that reachability under our random choice
of live and blocked edges defines a process equivalent to that of
the Linear Threshold Model. To obtain intuition about this equiv-
alence, it is useful to first analyze the special case in which the
underlying graph G is directed and acyclic. In this case, we can
fix a topological ordering of the nodes v_1, v_2, ..., v_n (so that all
edges go from earlier nodes to later nodes in the order), and build
up the distribution of active sets by following this order. For each
node v_i, suppose we already have determined the distribution over
active subsets of its neighbors. Then under the Linear Threshold
process, the probability that v_i will become active, given that a sub-
set S_i of its neighbors is active, is Σ_{w∈S_i} b_{v_i,w}. This is pre-
cisely the probability that the live incoming edge selected by v_i lies
in S_i, and so inductively we see that the two processes define the
same distribution over active sets.
To prove the claim generally, consider a graph G that is not
acyclic. It becomes trickier to show the equivalence, because there
is no natural ordering of the nodes over which to perform induction.
Instead, we argue by induction over the iterations of the Linear
Threshold process. We define A_t to be the set of active nodes
at the end of iteration t, for t = 0, 1, 2, … (note that A_0 is the set
initially targeted). If node v has not become active by the end of
iteration t, then the probability that it becomes active in iteration
t + 1 is equal to the chance that the influence weights in A_t \ A_{t−1}
push it over its threshold, given that its threshold was not exceeded
already; this probability is

    Σ_{u∈A_t\A_{t−1}} b_{v,u} / (1 − Σ_{u∈A_{t−1}} b_{v,u}).
On the other hand, we can run the live-edge process by revealing
the identities of the live edges gradually as follows. We start with
the targeted set A. For each node v with at least one edge from the
set A, we determine whether v's live edge comes from A. If so,
then v is reachable; but if not, we keep the source of v's live edge
unknown, subject to the condition that it comes from outside A.
Having now exposed a new set of reachable nodes A_1 in the first
stage, we proceed to identify further reachable nodes by performing
the same process on edges from A_1, and in this way produce
sets A_2, A_3, …. If node v has not been determined to be reachable
by the end of stage t, then the probability that it is determined to
be reachable in stage t + 1 is equal to the chance that its live edge
comes from A_t \ A_{t−1}, given that its live edge has not come from
any of the earlier sets. But this is

    Σ_{u∈A_t\A_{t−1}} b_{v,u} / (1 − Σ_{u∈A_{t−1}} b_{v,u}),

which is the same as in the Linear Threshold process of the previous
paragraph. Thus, by induction over these stages, we see that the
live-edge process produces the same distribution over active sets
as the Linear Threshold process.
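The equivalence in Claim 2.6 can be checked empirically on a small instance. The sketch below, under assumed toy data (a three-node path with edge weights 0.5), runs both processes by Monte Carlo and compares the resulting expected spread; both should approach 1 + 0.5 + 0.25 = 1.75:

```python
import random

def run_lt(targets, incoming, rng):
    # One run of the Linear Threshold process: each node draws a threshold
    # uniformly at random, then activates once the total weight of its
    # active neighbors reaches that threshold.
    nodes = set(incoming)
    for ws in incoming.values():
        nodes.update(ws)
    theta = {v: rng.random() for v in sorted(nodes)}
    active, changed = set(targets), True
    while changed:
        changed = False
        for v in sorted(nodes - active):
            if sum(b for u, b in incoming.get(v, {}).items() if u in active) >= theta[v]:
                active.add(v)
                changed = True
    return active

def run_live_edge(targets, incoming, rng):
    # Equivalent live-edge run: each node keeps at most one incoming edge,
    # chosen with probability b_vu (no edge with the residual probability);
    # the active set is everything reachable from the targets on live edges.
    live_in = {}
    for v in sorted(incoming):
        r, total = rng.random(), 0.0
        for u, b in incoming[v].items():
            total += b
            if r < total:
                live_in[v] = u
                break
    active, changed = set(targets), True
    while changed:
        changed = False
        for v, u in live_in.items():
            if u in active and v not in active:
                active.add(v)
                changed = True
    return active

# Hypothetical toy instance: edges a -> b and b -> c, each with weight 0.5.
incoming = {'b': {'a': 0.5}, 'c': {'b': 0.5}}
rng = random.Random(0)
trials = 20000
est_lt = sum(len(run_lt({'a'}, incoming, rng)) for _ in range(trials)) / trials
est_le = sum(len(run_live_edge({'a'}, incoming, rng)) for _ in range(trials)) / trials
# Both estimates should be close to the exact expected spread of 1.75.
```

This is only a sanity check on one instance, not a substitute for the inductive proof above, but it makes the distributional equivalence concrete.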
Influence maximization is hard in this model as well.
THEOREM 2.7. The influence maximization problem is NP-hard
for the Linear Threshold model.
Proof. Consider an instance of the NP-complete Vertex Cover problem
defined by an undirected n-node graph G = (V, E) and an integer
k; we want to know if there is a set S of k nodes in G so that
every edge has at least one endpoint in S. We show that this can be
viewed as a special case of the influence maximization problem.
Given an instance of the Vertex Cover problem involving a graph
G, we define a corresponding instance of the influence maximization
problem by directing all edges of G in both directions. If there
is a vertex cover S of size k in G, then one can deterministically
make σ(A) = n by targeting the nodes in the set A = S; conversely,
this is the only way to get a set A with σ(A) = n.
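The excerpt does not spell out the edge weights used in the reduction. One assignment consistent with the argument (an assumption here, not quoted from the paper) is b_{v,u} = 1/deg(v): then a node's incoming weights sum to exactly 1, so it is certain to activate precisely when all of its neighbors are active, and since every node outside a vertex cover S has all its neighbors in S, targeting S activates everything.

```python
from collections import defaultdict

def vertex_cover_to_lt_instance(edges):
    """Map a Vertex Cover instance to a Linear Threshold instance.

    edges: iterable of undirected pairs (u, v). Each edge is directed both
    ways, and the weight b_vu = 1/deg(v) is an assumed choice: with it, a
    node is guaranteed to activate exactly when all its neighbors are active.
    Returns incoming weights as {v: {u: b_vu}}.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {v: {u: 1.0 / len(ns) for u in ns} for v, ns in neighbors.items()}
```

On a triangle, for instance, every node gets two incoming edges of weight 1/2, and targeting any cover of size 2 activates the remaining node with certainty.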
In the proofs of both the approximation theorems in this section,
we established submodularity by considering an equivalent process
in which each node “hard-wired” certain of its incident edges as
transmitting influence from neighbors. This turns out to be a proof
technique that can be formulated in general terms, and directly ap-
plied to give approximability results for other models as well. We
discuss this further in the context of the general framework pre-
sented in Section 4.
3. EXPERIMENTS
In addition to obtaining worst-case guarantees on the perfor-
mance of our approximation algorithm, we are interested in under-
standing its behavior in practice, and comparing its performance
to other heuristics for identifying influential individuals. We find
that our greedy algorithm achieves significant performance gains
over several widely-used structural measures of influence in social
networks [30].
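The greedy strategy evaluated here can be sketched in a few lines. In the sketch below, `spread_estimate` stands in for a Monte Carlo estimate of σ(·); the `influence` dictionary and `cover` function are illustrative stand-ins, not the paper's actual data or code:

```python
def greedy_max_influence(nodes, spread_estimate, k):
    # Greedy hill-climbing: repeatedly add the node with the largest
    # marginal gain in (estimated) spread, until k nodes are chosen.
    chosen = []
    for _ in range(k):
        base = spread_estimate(chosen)
        best_node, best_gain = None, float('-inf')
        for v in nodes:
            if v in chosen:
                continue
            gain = spread_estimate(chosen + [v]) - base
            if gain > best_gain:
                best_node, best_gain = v, gain
        chosen.append(best_node)
    return chosen

# Toy stand-in for sigma: coverage of hypothetical "influence sets".
influence = {'a': {1, 2}, 'b': {2, 3}, 'c': {4}}
cover = lambda A: len(set().union(*(influence[v] for v in A)) if A else set())
```

For monotone submodular functions such as σ(·), this procedure carries the (1 − 1/e) guarantee discussed earlier; with noisy Monte Carlo estimates the guarantee degrades gracefully, as noted in the paper's extension of the Nemhauser et al. result.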
The Network Data. For evaluation, it is desirable to use a network
dataset that exhibits many of the structural features of large-scale
social networks. At the same time, we do not address the issue
of inferring actual influence parameters from network observations
(see e.g. [10, 26]). Thus, for our testbed, we employ a collabo-
ration graph obtained from co-authorships in physics publications,
with simple settings of the influence parameters. It has been argued
extensively that co-authorship networks capture many of the key
features of social networks more generally [24]. The co-authorship
data was compiled from the complete list of papers in the high-
energy physics theory section of the e-print arXiv (www.arxiv.org).²
The collaboration graph contains a node for each researcher who
has at least one paper with co-author(s) in the arXiv database. For
each paper with two or more authors, we inserted an edge for each
pair of authors (single-author papers were ignored). Notice that
this results in parallel edges when two researchers have co-authored
multiple papers; we kept these parallel edges, as they can be
interpreted to indicate stronger social ties between the researchers
involved. The resulting graph has 10748 nodes, and edges between
about 53000 pairs of nodes.
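The construction just described (one edge per co-authored paper, parallel edges preserved, single-author papers dropped) can be sketched as a multigraph builder; the input format, a list of author lists, is an assumption for illustration:

```python
from collections import Counter
from itertools import combinations

def build_coauthor_multigraph(papers):
    """Build the collaboration multigraph described in the text.

    papers: iterable of author lists (a hypothetical input format).
    Returns a Counter mapping each unordered author pair to the number of
    parallel edges between them, i.e. the number of co-authored papers.
    Single-author papers contribute nothing.
    """
    edge_count = Counter()
    for authors in papers:
        distinct = sorted(set(authors))
        if len(distinct) < 2:
            continue  # single-author papers are ignored
        for u, v in combinations(distinct, 2):
            edge_count[(u, v)] += 1
    return edge_count
```

The multiplicity stored for each pair is what lets parallel edges stand in for tie strength in the influence models applied to this graph.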
² We also ran experiments on the co-authorship graphs induced by
theoretical computer science papers. We do not report on the results
here, as they are very similar to the ones for high-energy physics.
