An analysis framework based on submodular functions shows that a natural greedy strategy obtains a solution that is provably within 63% of optimal for several classes of models, and suggests a general approach for reasoning about the performance guarantees of algorithms for these types of influence problems in social networks.
Abstract:
Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of "word of mouth" in the promotion of new products. Recently, motivated by the design of viral marketing strategies, Domingos and Richardson posed a fundamental algorithmic problem for such social network processes: if we can try to convince a subset of individuals to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target? We consider this problem in several of the most widely studied models in social network analysis. The optimization problem of selecting the most influential nodes is NP-hard here, and we provide the first provable approximation guarantees for efficient algorithms. Using an analysis framework based on submodular functions, we show that a natural greedy strategy obtains a solution that is provably within 63% of optimal for several classes of models; our framework suggests a general approach for reasoning about the performance guarantees of algorithms for these types of influence problems in social networks. We also provide computational experiments on large collaboration networks, showing that in addition to their provable guarantees, our approximation algorithms significantly out-perform node-selection heuristics based on the well-studied notions of degree centrality and distance centrality from the field of social networks.
TL;DR: In this paper, the authors have crawled the entire Twittersphere and found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks.
TL;DR: A coherent and comprehensive review of the vast research activity concerning epidemic processes is presented, detailing the successful theoretical approaches as well as making their limits and assumptions clear.
TL;DR: This work exploits submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm and achieving speedups and savings in storage of several orders of magnitude.
TL;DR: While on average recommendations are not very effective at inducing purchases and do not spread very far, this work presents a model that successfully identifies communities, product, and pricing categories for which viral marketing seems to be very effective.
TL;DR: Based on the results, it is believed that fine-tuned heuristics may provide truly scalable solutions to the influence maximization problem with satisfying influence spread and blazingly fast running time.
TL;DR: A history of diffusion research can be found in this paper, where the authors present a glossary of developments in the field of diffusion research and discuss the consequences of these developments.
TL;DR: Upon returning to the U.S., author Singhal’s Google search revealed the following: in January 2001, the impeachment trial against President Estrada was halted by senators who supported him and the government fell without a shot being fired.
TL;DR: It is found that scale-free networks, which include the World-Wide Web, the Internet, social networks and cells, display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected even by unrealistically high failure rates.
TL;DR: In this article, the development of social network analysis, tracing its origins in classical sociology and its more recent formulation in social scientific and mathematical work, is described and discussed. But it is argued that the analysis of social networks is not a purely static process.
Q1. What contributions have the authors mentioned in the paper "Maximizing the spread of influence through a social network"?
Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of “word of mouth” in the promotion of new products. Recently, motivated by the design of viral marketing strategies, Domingos and Richardson posed a fundamental algorithmic problem for such social network processes: if we can try to convince a subset of individuals to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target? The authors consider this problem in several of the most widely studied models in social network analysis. The optimization problem of selecting the most influential nodes is NP-hard here, and the authors provide the first provable approximation guarantees for efficient algorithms. Using an analysis framework based on submodular functions, the authors show that a natural greedy strategy obtains a solution that is provably within 63% of optimal for several classes of models; their framework suggests a general approach for reasoning about the performance guarantees of algorithms for these types of influence problems in social networks. The authors also provide computational experiments on large collaboration networks, showing that in addition to their provable guarantees, their approximation algorithms significantly out-perform node-selection heuristics based on the well-studied notions of degree centrality and distance centrality from the field of social networks.
Q2. What is the heuristic used in the sociology literature?
The degree and centrality-based heuristics are commonly used in the sociology literature as estimates of a node’s influence [30].
Q3. How can the authors obtain a close approximation to σ(A)?
By simulating the diffusion process and sampling the resulting active sets, the authors are able to obtain arbitrarily close approximations to σ(A), with high probability.
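The sampling idea in this answer can be illustrated with a minimal sketch. It assumes an independent cascade process with one uniform activation probability `p` on every edge, which is a simplification chosen for illustration. The adjacency-dict graph format and the function names are hypothetical and do not come from the paper.

```python
import random


def simulate_independent_cascade(graph, seeds, p, rng):
    """Run one cascade and return the final set of active nodes.

    graph: dict mapping each node to a list of its out-neighbors
           (hypothetical format chosen for this sketch).
    p:     activation probability, assumed identical for every edge.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                # A newly active node gets one attempt per inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_sigma(graph, seeds, p, runs=1000, seed=0):
    """Estimate sigma(A) as the mean final active-set size over many runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += len(simulate_independent_cascade(graph, seeds, p, rng))
    return total / runs
```

Increasing `runs` tightens the estimate at the cost of more simulation time. The fixed `seed` only makes the sketch reproducible.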
Q4. What is the generality of the models the authors consider?
The generality of the models the authors consider lies between that of the polynomial-time solvable model of [26] and the very general model of [10], where the optimization problem cannot even be approximated to within a non-trivial factor.
Q5. How many neighbors would be successful in a weighted cascade model?
The weighted cascade model resembles the linear threshold model in that the expected number of neighbors who would succeed in activating a node v is 1 in both models.
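A small arithmetic check makes the "expected number is 1" claim concrete. It assumes the weighted cascade setup in which each of the d_v neighbors of v succeeds with probability 1/d_v; the function name is hypothetical.

```python
from fractions import Fraction


def expected_successful_neighbors(in_degree):
    """Expected number of neighbors that would activate v.

    Assumes each of the in_degree neighbors succeeds independently
    with probability 1 / in_degree (weighted cascade assumption).
    """
    if in_degree <= 0:
        raise ValueError("in_degree must be positive")
    # Linearity of expectation: add up the per-neighbor probabilities.
    return sum(Fraction(1, in_degree) for _ in range(in_degree))
```

By linearity of expectation the sum is d_v · (1/d_v) = 1 regardless of the degree, matching the linear threshold case where the incoming weights of a node sum to at most 1.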
Q6. What is the effect of activating the nodes corresponding to the k sets?
If there are at most k sets that cover all elements, then activating the nodes corresponding to these k sets will activate all of the nodes $u_i$, and thus also all of the $x_j$.
Q7. What is the probability that v becomes active in iteration t+1?
If node v has not become active by the end of iteration t, then the probability that it becomes active in iteration t+1 is equal to the chance that the influence weights in $A_t \setminus A_{t-1}$ push it over its threshold, given that its threshold was not already exceeded; this probability is $\frac{\sum_{u \in A_t \setminus A_{t-1}} b_{v,u}}{1 - \sum_{u \in A_{t-1}} b_{v,u}}$.
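The conditional probability in this answer is the weight newly arriving from the nodes activated in iteration t, divided by one minus the weight already contributed by the nodes active before that. A minimal sketch follows; the dict-based representation of the weights b_{v,u} and the function name are assumptions made for illustration.

```python
def activation_probability(weights, newly_active, previously_active):
    """Probability that v activates in iteration t+1 (linear threshold).

    weights:           dict u -> b_{v,u} for the neighbors u of v.
    newly_active:      neighbors in A_t but not in A_{t-1}.
    previously_active: neighbors in A_{t-1}.
    """
    new_weight = sum(weights.get(u, 0) for u in newly_active)
    old_weight = sum(weights.get(u, 0) for u in previously_active)
    if old_weight >= 1:
        # With total prior weight 1 the threshold must already be exceeded.
        raise ValueError("v cannot still be inactive at this point")
    # Condition on the threshold lying above the previously present weight.
    return new_weight / (1 - old_weight)
```

For example, with weights 0.25, 0.25 and 0.5 on neighbors a, b and c, if a was already active and b has just become active, the probability is 0.25 / 0.75 = 1/3.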
Q8. How can the authors extend the result of Nemhauser et al. to a?
One can extend the result of Nemhauser et al. to show that for any ε > 0, there is a γ > 0 such that by using (1 + γ)-approximate values for the function to be optimized, the authors obtain a (1−1/e−ε)-approximation.
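The greedy strategy that this guarantee applies to can be sketched as a hill-climbing loop: repeatedly add the node with the largest marginal gain in the objective. The sketch below takes the objective as a caller-supplied function `sigma`, which is a hypothetical stand-in for an exact or simulation-estimated influence function; it is an illustration, not the authors' implementation.

```python
def greedy_seed_selection(nodes, sigma, k):
    """Pick up to k nodes, each time adding the largest marginal gain.

    nodes: iterable of candidate nodes.
    sigma: callable taking a list of nodes and returning a number
           (assumed monotone and submodular for the guarantee to apply).
    """
    candidates = list(nodes)
    selected = []
    for _ in range(k):
        base = sigma(selected)
        best_node, best_gain = None, float("-inf")
        for v in candidates:
            if v in selected:
                continue
            gain = sigma(selected + [v]) - base
            # Strict comparison keeps the first node seen on ties.
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:
            break  # fewer than k candidates available
        selected.append(best_node)
    return selected
```

With an exact monotone submodular `sigma`, the classical bound of 1 − 1/e (about 63%) applies. With estimated values, the bound degrades by ε as described in the answer above.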
Q9. How is the weighted influence function σw(A) defined?
If the authors let B denote the (random) set activated by the process with initial activation A, then the authors can define the weighted influence function $\sigma_w(A)$ to be the expected value over outcomes B of the quantity $\sum_{v \in B} w_v$.
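Given sampled outcomes B of the process, the weighted influence function can be approximated by averaging the total node weight of each outcome. The sketch below assumes the samples are already available as a list of sets; the function name and input format are hypothetical.

```python
def estimate_weighted_influence(outcomes, weights):
    """Average, over sampled active sets B, of the sum of w_v for v in B.

    outcomes: non-empty list of sets, each one sampled final active set B.
    weights:  dict mapping every node v to its weight w_v.
    """
    if not outcomes:
        raise ValueError("need at least one sampled outcome")
    total = 0.0
    for active_set in outcomes:
        total += sum(weights[v] for v in active_set)
    return total / len(outcomes)
```

Setting every weight to 1 recovers the unweighted estimate of σ(A), the expected number of active nodes.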
Q10. What is the probability of a node being active at time t?
The authors are concerned with the sum over all time steps t ≤ τ of the expected number of active nodes at time t, for a given time limit τ, while [10, 26] study the limit of this process: the expected number of nodes active at time t as t goes to infinity.