Journal ArticleDOI

Centrality Measures, Upper Bound, and Influence Maximization in Large Scale Directed Social Networks

01 Jul 2014-Fundamenta Informaticae (IOS Press)-Vol. 130, Iss: 3, pp 317-342
TL;DR: Two new centrality measures are proposed: Diffusion Degree, for the independent cascade model of information diffusion, and Maximum Influence Degree, which provides the maximum theoretically possible influence (upper bound) for a node.
Abstract: The paper addresses the problem of finding the top k influential nodes in large-scale directed social networks. We propose two new centrality measures: Diffusion Degree, for the independent cascade model of information diffusion, and Maximum Influence Degree. Unlike other existing centrality measures, Diffusion Degree considers neighbors' contributions in addition to the degree of a node. The measure also handles non-uniform propagation probability distributions. Maximum Influence Degree, on the other hand, provides the maximum theoretically possible influence (upper bound) for a node. Extensive experiments are performed on five different real-life large-scale directed social networks. With the independent cascade model, we perform experiments for both uniform and non-uniform propagation probabilities. We use the Diffusion Degree Heuristic (DiDH) and the Maximum Influence Degree Heuristic (MIDH) to find the top k influential individuals. The k seeds obtained through these heuristics show, in both setups, superior influence compared to the seeds obtained by high degree heuristics, degree discount heuristics, different variants of set covering greedy algorithms, and the Prefix excluding Maximum Influence Arborescence (PMIA) algorithm. The superiority of the proposed method is also found to be statistically significant according to a t-test.
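
The independent cascade model and the idea of a degree score that also counts neighbors' contributions can be illustrated with a minimal Python sketch. The function names, the `prob(u, v)` callback, and in particular the scoring rule in `diffusion_degree_sketch` are assumptions made for illustration; the abstract does not give the exact definition of Diffusion Degree, so this should not be read as the authors' formula.

```python
import random
from collections import deque


def independent_cascade(graph, seeds, prob, rng):
    """Run one independent-cascade trial and return the set of activated nodes.

    graph: dict mapping each node to a list of its out-neighbors
    prob:  function (u, v) -> probability that u activates v (may be non-uniform)
    """
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, []):
            # A newly activated node gets a single chance per outgoing edge.
            if v not in active and rng.random() < prob(u, v):
                active.add(v)
                frontier.append(v)
    return active


def estimate_spread(graph, seeds, prob, trials=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(graph, seeds, prob, rng)) for _ in range(trials))
    return total / trials


def diffusion_degree_sketch(graph, prob):
    """Hypothetical score: a node's probability-weighted out-degree
    plus the probability-weighted out-degrees of its out-neighbors."""
    own = {u: sum(prob(u, v) for v in graph.get(u, [])) for u in graph}
    return {u: own[u] + sum(own.get(v, 0.0) for v in graph.get(u, [])) for u in graph}
```

Ranking nodes by such a score and taking the top k is one way a degree-style heuristic could pick seeds; `estimate_spread` can then be used to compare seed sets chosen by different heuristics.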


Citations
Journal ArticleDOI
TL;DR: An organizing scheme for single‐topic user groups is proposed for facilitating user sharing and communicating under common interests and contains 3 features: topic impact evaluation, interest degree measurement, and trust chain‐based organizing.
Abstract: Social network sites (SNS) presently face the task of grouping users into small subsets within themselves. In this study, an organizing scheme for single-topic user groups is proposed for facilitating user sharing and communicating under common interests. The main rationales of the proposed scheme are (1) only an influential single topic is selected through its impact evaluation to attract users; (2) only the users having a high degree of interest, explicit or implicit, in the topic should be grouped; and (3) trustworthy relationships among users are taken into consideration to enlarge the scale of the user group. The proposed organizing scheme comprises 3 features: topic impact evaluation, interest degree measurement, and trust chain-based organizing. The paper is structured as follows: (1) an overview of the proposed scheme and its formal related definitions; (2) a topic impact evaluation method, i.e., an importance evaluation and a popularity calculation; (3) a user interest degree measurement method, i.e., explicit and implicit interest evaluation with dynamic factors included; (4) a trust chain calculation method based on the topology features of the trust chain; (5) an organizing algorithm for single-topic user groups; and finally, some experimental results and discussions to illustrate the effectiveness and feasibility of the scheme.

70 citations

Journal ArticleDOI
TL;DR: An effective discrete shuffled frog-leaping algorithm (DSFLA) is proposed to solve the influence maximization problem more efficiently and is shown to be superior to several state-of-the-art alternatives.
Abstract: The influence maximization problem aims to select a subset of the k most influential nodes from a given network such that the spread of influence triggered by the seed set is maximized. Greedy-based algorithms are time-consuming when approximating the expected influence spread of a given node set accurately, and they do not scale well to large networks, especially when the propagation probability is large. Conventional heuristics based on network topology or confined diffusion paths tend to suffer from low solution accuracy or huge memory cost. In this paper an effective discrete shuffled frog-leaping algorithm (DSFLA) is proposed to solve the influence maximization problem more efficiently. A novel encoding mechanism and discrete evolutionary rules are conceived based on the network topology structure for the virtual frog population. To facilitate the global exploratory solution, a novel local exploitation mechanism combining deterministic and random walk strategies is put forward to improve the suboptimal meme of each memeplex in the frog population. The experimental results of influence spread in six real-world networks and statistical tests show that DSFLA performs effectively in selecting targeted influential seed nodes for influence maximization and is superior to several state-of-the-art alternatives.

69 citations

Journal ArticleDOI
TL;DR: The possibilities of the linear threshold model for the definition of centrality measures to be used on weighted and labeled social networks are explored, and two measures are proposed: a new centrality measure to rank the users of the network, the Linear Threshold Rank (LTR), and a centralization measure to determine to what extent the entire network has a centralized structure.
Abstract: Centrality and influence spread are two of the most studied concepts in social network analysis. In recent years, centrality measures have attracted the attention of many researchers, generating a large and varied number of new studies about social network analysis and its applications. However, as far as we know, traditional models of influence spread have not yet been exhaustively used to define centrality measures according to the influence criteria. Most of the considered work in this topic is based on the independent cascade model. In this paper we explore the possibilities of the linear threshold model for the definition of centrality measures to be used on weighted and labeled social networks. We propose a new centrality measure to rank the users of the network, the Linear Threshold Rank (LTR), and a centralization measure to determine to what extent the entire network has a centralized structure, the Linear Threshold Centralization (LTC). We appraise the viability of the approach through several case studies. We consider four different social networks to compare our new measures with two centrality measures based on relevance criteria and another centrality measure based on the independent cascade model. Our results show that our measures are useful for ranking actors and networks in a distinguishable way.

50 citations
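
As a rough illustration of the linear threshold model that the abstract above builds on, the following Python sketch computes the set of nodes activated from a seed set. The data layout (`in_weights`, `thresholds`) and the function name are hypothetical, the thresholds are fixed by the caller rather than drawn at random, and the sketch does not implement the LTR or LTC measures themselves.

```python
def linear_threshold_spread(in_weights, thresholds, seeds):
    """Return the nodes activated under a deterministic linear threshold process.

    in_weights: dict mapping node v -> {u: weight of the edge u -> v}
    thresholds: dict mapping node v -> activation threshold of v
    seeds:      iterable of initially active nodes
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, incoming in in_weights.items():
            if v in active:
                continue
            # Total weight arriving from neighbors that are already active.
            influence = sum(w for u, w in incoming.items() if u in active)
            if influence >= thresholds[v]:
                active.add(v)
                changed = True
    return active
```

Because activation is monotone (an active node never deactivates), repeating the sweep until nothing changes reaches the same final set regardless of the order in which nodes are visited.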

Journal ArticleDOI
TL;DR: Experimental results on benchmark data show the superiority of the proposed community detection algorithm compared to other well known methods, particularly when the network contains overlapping communities.

47 citations

Journal ArticleDOI
TL;DR: A two-level approach based on the Suspected-Infected (SI) epidemic model is designed for maximizing influence spread, and a multithreaded implementation of the algorithm for the proposed SI model further improves its performance in terms of influence spread per second.

42 citations

References
Proceedings ArticleDOI
26 Aug 2001
TL;DR: It is proposed to model also the customer's network value: the expected profit from sales to other customers she may influence to buy, the customers those may influence, and so on recursively, taking advantage of the availability of large relevant databases.
Abstract: One of the major applications of data mining is in helping companies determine which potential customers to market to. If the expected profit from a customer is greater than the cost of marketing to her, the marketing action for that customer is executed. So far, work in this area has considered only the intrinsic value of the customer (i.e., the expected profit from sales to her). We propose to model also the customer's network value: the expected profit from sales to other customers she may influence to buy, the customers those may influence, and so on recursively. Instead of viewing a market as a set of independent entities, we view it as a social network and model it as a Markov random field. We show the advantages of this approach using a social network mined from a collaborative filtering database. Marketing that exploits the network value of customers, also known as viral marketing, can be extremely effective, but is still a black art. Our work can be viewed as a step towards providing a more solid foundation for it, taking advantage of the availability of large relevant databases.

2,886 citations


Additional excerpts

  • ...Received February 2013; revised April 2013....

    [...]

Proceedings ArticleDOI
12 Aug 2007
TL;DR: This work exploits submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm and achieving speedups and savings in storage of several orders of magnitude.
Abstract: Given a water distribution network, where should we place sensors to quickly detect contaminants? Or, which blogs should we read to avoid missing important stories? These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a virus or information as quickly as possible. We present a general methodology for near optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of "submodularity". We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs. We evaluate our approach on several large real-world problems, including a model of a water distribution network from the EPA, and real blog data. The obtained sensor placements are provably near optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity and generalization questions.

2,413 citations
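
One common way to exploit submodularity, as the abstract above describes, is lazy evaluation of marginal gains: because gains can only shrink as the selected set grows, a stale gain is a valid upper bound and most candidates never need re-evaluation. The Python sketch below shows this idea for unit costs only; the function name, the `objective` callback, and the heap layout are assumptions for illustration, not the paper's implementation.

```python
import heapq


def lazy_greedy(candidates, k, objective):
    """Pick up to k candidates greedily, re-evaluating marginal gains lazily.

    objective: function (list of selected candidates) -> value, assumed
               monotone and submodular; correctness relies on that assumption.
    """
    selected = []
    current = objective(selected)
    # Heap entries: (-gain, tie-break index, candidate, round the gain was computed in).
    heap = [(-(objective([c]) - current), i, c, 0) for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        neg_gain, i, c, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # Gain is up to date and still the largest: select this candidate.
            selected.append(c)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current selection and reinsert.
            gain = objective(selected + [c]) - current
            heapq.heappush(heap, (-gain, i, c, len(selected)))
    return selected, current
```

A simple usage example is set coverage, where `objective` returns the number of distinct elements covered by the chosen sets.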



Journal ArticleDOI
TL;DR: While on average recommendations are not very effective at inducing purchases and do not spread very far, this work presents a model that successfully identifies communities, product, and pricing categories for which viral marketing seems to be very effective.
Abstract: We present an analysis of a person-to-person recommendation network, consisting of 4 million people who made 16 million recommendations on half a million products. We observe the propagation of recommendations and the cascade sizes, which we explain by a simple stochastic model. We analyze how user behavior varies within user communities defined by a recommendation network. Product purchases follow a ‘long tail’ where a significant share of purchases belongs to rarely sold items. We establish how the recommendation network grows over time and how effective it is from the viewpoint of the sender and receiver of the recommendations. While on average recommendations are not very effective at inducing purchases and do not spread very far, we present a model that successfully identifies communities, product, and pricing categories for which viral marketing seems to be very effective.

2,361 citations

Proceedings ArticleDOI
28 Jun 2009
TL;DR: Based on the results, it is believed that fine-tuned heuristics may provide truly scalable solutions to the influence maximization problem with satisfying influence spread and blazingly fast running time.
Abstract: Influence maximization is the problem of finding a small subset of nodes (seed nodes) in a social network that could maximize the spread of influence. In this paper, we study efficient influence maximization from two complementary directions. One is to improve the original greedy algorithm of [5] and its improvement [7] to further reduce its running time, and the second is to propose new degree discount heuristics that improve influence spread. We evaluate our algorithms by experiments on two large academic collaboration graphs obtained from the online archival database arXiv.org. Our experimental results show that (a) our improved greedy algorithm achieves better running time compared with the improvement of [7] with matching influence spread, (b) our degree discount heuristics achieve much better influence spread than classic degree and centrality-based heuristics, and when tuned for a specific influence cascade model, they achieve almost matching influence spread with the greedy algorithm, and more importantly (c) the degree discount heuristics run only in milliseconds while even the improved greedy algorithms run in hours in our experiment graphs with a few tens of thousands of nodes. Based on our results, we believe that fine-tuned heuristics may provide truly scalable solutions to the influence maximization problem with satisfying influence spread and blazingly fast running time. Therefore, contrary to what is implied by the conclusion of [5] that traditional heuristics are outperformed by the greedy approximation algorithm, our results shed new light on the research of heuristic algorithms.

2,073 citations
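
The degree discount idea from the abstract above can be sketched in Python as follows. The sketch assumes an undirected graph given as a symmetric adjacency dict, a uniform propagation probability `p`, and the commonly cited discount rule dd_v = d_v - 2 t_v - (d_v - t_v) t_v p, where t_v counts the already selected neighbors of v; the abstract itself does not state the rule, so treat it as an assumption.

```python
def degree_discount_ic(adj, k, p):
    """Select up to k seeds by repeatedly taking the node with the highest discounted degree.

    adj: dict mapping node -> list of neighbors (assumed symmetric, every neighbor is a key)
    p:   uniform propagation probability of the independent cascade model
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    discounted = dict(degree)            # dd_v, starts at the plain degree
    seed_nbrs = {v: 0 for v in adj}      # t_v, number of neighbors already chosen
    seeds, chosen = [], set()
    for _ in range(min(k, len(adj))):
        u = max((v for v in adj if v not in chosen), key=discounted.get)
        seeds.append(u)
        chosen.add(u)
        for v in adj[u]:
            if v in chosen:
                continue
            # Discount v's degree because one more of its neighbors is now a seed.
            seed_nbrs[v] += 1
            t, d = seed_nbrs[v], degree[v]
            discounted[v] = d - 2 * t - (d - t) * t * p
    return seeds
```

Compared with picking the k highest-degree nodes, the discount steers later picks away from nodes whose neighborhoods overlap with seeds that were already chosen.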



Journal ArticleDOI
TL;DR: The results clearly indicate that information dissemination is dominated by both weak and strong w-o-m, rather than by advertising, which means that strong and weak ties become the main forces propelling growth.
Abstract: Though word-of-mouth (w-o-m) communication is a pervasive and intriguing phenomenon, little is known about its underlying process of personal communication. Moreover, as marketers are getting more interested in harnessing the power of w-o-m for e-business and other net-related activities, the effects of the different communication types on macro-level marketing are becoming critical. In particular we are interested in the breakdown of personal communication between closer and stronger communications that are within an individual's own personal group (strong ties) and weaker and less personal communications that an individual makes with a wide set of other acquaintances and colleagues (weak ties). We use a technique borrowed from Complex Systems Analysis called stochastic cellular automata in order to generate data and analyze the results so that answers to our main research issues could be ascertained. The following summarizes the impact of strong and weak ties on the speed of acceptance of a new product:

  • The influence of weak ties is at least as strong as the influence of strong ties. Despite the relative inferiority of the weak tie parameter in the model's assumptions, their effect approximates or exceeds that of strong ties, in all stages of the product life cycle.

  • External marketing efforts (e.g., advertising) are effective. However, beyond a relatively early stage of the growth cycle of the new product, their efficacy quickly diminishes and strong and weak ties become the main forces propelling growth. The results clearly indicate that information dissemination is dominated by both weak and strong w-o-m, rather than by advertising.

  • The effect of strong ties diminishes as personal network size decreases. Market attributes were also found to mediate the effects of weak and strong ties. When personal networks are small, weak ties were found to have a stronger impact on information dissemination than strong ties.

2,044 citations


"Centrality Measures, Upper Bound, a..." refers background in this paper

  • ...Information diffusion in the social network is well studied in sociology....

    [...]
