Journal Article

Centrality Measures, Upper Bound, and Influence Maximization in Large Scale Directed Social Networks

01 Jul 2014-Fundamenta Informaticae (IOS Press)-Vol. 130, Iss: 3, pp 317-342
TL;DR: Two new centrality measures are proposed: Diffusion Degree, for the independent cascade model of information diffusion, and Maximum Influence Degree, which provides the maximum theoretically possible influence (an upper bound) for a node.
Abstract: The paper addresses the problem of finding the top k influential nodes in large-scale directed social networks. We propose two new centrality measures: Diffusion Degree, for the independent cascade model of information diffusion, and Maximum Influence Degree. Unlike other existing centrality measures, Diffusion Degree considers neighbors' contributions in addition to the degree of a node. The measure also works flawlessly with non-uniform propagation probability distributions. Maximum Influence Degree, on the other hand, provides the maximum theoretically possible influence (an upper bound) for a node. Extensive experiments are performed on five real-life large-scale directed social networks. Under the independent cascade model, we perform experiments for both uniform and non-uniform propagation probabilities. We use the Diffusion Degree Heuristic (DiDH) and the Maximum Influence Degree Heuristic (MIDH) to find the top k influential individuals. In both setups, the k seeds obtained through these heuristics show superior influence compared to the seeds obtained by high-degree heuristics, degree discount heuristics, different variants of set-covering greedy algorithms, and the Prefix excluding Maximum Influence Arborescence (PMIA) algorithm. The superiority of the proposed method is also found to be statistically significant according to a t-test.
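
The independent cascade model and the high-degree baseline named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's DiDH or MIDH, whose formulas are not given on this page. The graph representation (a dict mapping each node to a list of `(out-neighbor, propagation probability)` pairs) and the names `simulate_ic`, `expected_spread`, and `high_degree_seeds` are assumptions made for illustration only.

```python
import random
from typing import Dict, Hashable, Iterable, List, Set, Tuple

# Assumed representation: node -> list of (out-neighbor, propagation probability).
Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def simulate_ic(graph: Graph, seeds: Iterable[Hashable], rng: random.Random) -> Set[Hashable]:
    """Run one independent cascade from `seeds` and return all activated nodes."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly activated node gets exactly one chance to activate
                # each still-inactive out-neighbor, succeeding with probability p.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(graph: Graph, seeds: Iterable[Hashable], runs: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    seed_list = list(seeds)
    return sum(len(simulate_ic(graph, seed_list, rng)) for _ in range(runs)) / runs


def high_degree_seeds(graph: Graph, k: int) -> List[Hashable]:
    """Baseline heuristic: pick the k nodes with the largest out-degree."""
    return sorted(graph, key=lambda u: len(graph[u]), reverse=True)[:k]


if __name__ == "__main__":
    # Toy directed graph with non-uniform propagation probabilities (made-up values).
    toy = {
        "a": [("b", 0.5), ("c", 0.2)],
        "b": [("c", 0.4)],
        "c": [("d", 0.1)],
        "d": [],
    }
    seeds = high_degree_seeds(toy, k=1)
    print(seeds, expected_spread(toy, seeds, runs=2000))
```

Because each edge carries its own probability, the same sketch covers both the uniform and the non-uniform setups; a seed-selection heuristic would replace `high_degree_seeds`, and its seeds would be compared through `expected_spread`.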


Citations
Journal Article
TL;DR: An organizing scheme for single-topic user groups is proposed to help users share and communicate around common interests; it comprises three features: topic impact evaluation, interest degree measurement, and trust chain-based organizing.
Abstract: Social network sites (SNS) presently face the task of grouping users into small subsets within themselves. In this study, an organizing scheme for single-topic user groups is proposed for facilitating user sharing and communicating under common interests. The main rationales of the proposed scheme are (1) only an influential single topic is selected through its impact evaluation to attract users; (2) only the users having a high degree of interest, explicit or implicit, in the topic should be grouped; and (3) trustworthy relationships among users are taken into consideration to enlarge the scale of the user group. The proposed organizing scheme comprises 3 features: topic impact evaluation, interest degree measurement, and trust chain-based organizing. The paper presents (1) an overview of the proposed scheme and its formal related definitions; (2) a topic impact evaluation method, i.e., an importance evaluation and a popularity calculation; (3) a user interest degree measurement method, i.e., explicit and implicit interest evaluation with dynamic factors included; (4) a trust chain calculation method based on the topology features of the trust chain; (5) an organizing algorithm for single-topic user groups; and finally, some experimental results and discussions to illustrate the effectiveness and feasibility of the scheme.

70 citations

Journal Article
TL;DR: An effective discrete shuffled frog-leaping algorithm (DSFLA) is proposed to solve the influence maximization problem more efficiently, and it is shown to be superior to several state-of-the-art alternatives.
Abstract: The influence maximization problem aims to select a subset of the k most influential nodes from a given network such that the spread of influence triggered by the seed set is maximized. Greedy-based algorithms are time-consuming when approximating the expected influence spread of a given node set accurately, and they do not scale well to large networks, especially when the propagation probability is large. Conventional heuristics based on network topology or confined diffusion paths tend to suffer from low solution accuracy or huge memory cost. In this paper, an effective discrete shuffled frog-leaping algorithm (DSFLA) is proposed to solve the influence maximization problem more efficiently. A novel encoding mechanism and discrete evolutionary rules are conceived based on network topology structure for the virtual frog population. To facilitate global exploration, a novel local exploitation mechanism combining deterministic and random walk strategies is put forward to improve the suboptimal meme of each memeplex in the frog population. The experimental results on influence spread in six real-world networks, together with statistical tests, show that DSFLA performs effectively in selecting targeted influential seed nodes for influence maximization and is superior to several state-of-the-art alternatives.

69 citations

Journal Article
TL;DR: The possibilities of the linear threshold model for the definition of centrality measures to be used on weighted and labeled social networks are explored, and two measures are proposed: a new centrality measure to rank the users of the network, the Linear Threshold Rank (LTR), and a centralization measure to determine to what extent the entire network has a centralized structure.
Abstract: Centrality and influence spread are two of the most studied concepts in social network analysis. In recent years, centrality measures have attracted the attention of many researchers, generating a large and varied number of new studies about social network analysis and its applications. However, as far as we know, traditional models of influence spread have not yet been exhaustively used to define centrality measures according to the influence criteria. Most of the considered work in this topic is based on the independent cascade model. In this paper we explore the possibilities of the linear threshold model for the definition of centrality measures to be used on weighted and labeled social networks. We propose a new centrality measure to rank the users of the network, the Linear Threshold Rank (LTR), and a centralization measure to determine to what extent the entire network has a centralized structure, the Linear Threshold Centralization (LTC). We appraise the viability of the approach through several case studies. We consider four different social networks to compare our new measures with two centrality measures based on relevance criteria and another centrality measure based on the independent cascade model. Our results show that our measures are useful for ranking actors and networks in a distinguishable way.

50 citations

Journal Article
TL;DR: Experimental results on benchmark data show the superiority of the proposed community detection algorithm compared to other well known methods, particularly when the network contains overlapping communities.

47 citations

Journal Article
TL;DR: A two-level approach based on the Suspected-Infected (SI) epidemic model is designed for maximizing influence spread, and a multithreaded implementation of the algorithm further improves its performance in terms of influence spread per second.

42 citations

References
Proceedings Article
23 Jul 2002
TL;DR: This research optimizes the amount of marketing funds spent on each customer, rather than just making a binary decision on whether to market to him, and takes into account the fact that knowledge of the network is partial and that gathering that knowledge can itself have a cost.
Abstract: Viral marketing takes advantage of networks of influence among customers to inexpensively achieve large changes in behavior. Our research seeks to put it on a firmer footing by mining these networks from data, building probabilistic models of them, and using these models to choose the best viral marketing plan. Knowledge-sharing sites, where customers review products and advise each other, are a fertile source for this type of data mining. In this paper we extend our previous techniques, achieving a large reduction in computational cost, and apply them to data from a knowledge-sharing site. We optimize the amount of marketing funds spent on each customer, rather than just making a binary decision on whether to market to him. We take into account the fact that knowledge of the network is partial, and that gathering that knowledge can itself have a cost. Our results show the robustness and utility of our approach.

1,759 citations


Additional excerpts

  • ...Received February 2013; revised April 2013....


Proceedings Article
25 Jul 2010
TL;DR: The results from extensive simulations demonstrate that the proposed algorithm is currently the best scalable solution to the influence maximization problem and significantly outperforms all other scalable heuristics, with as much as a 100%-260% increase in influence spread.
Abstract: Influence maximization, defined by Kempe, Kleinberg, and Tardos (2003), is the problem of finding a small set of seed nodes in a social network that maximizes the spread of influence under certain influence cascade models. The scalability of influence maximization is a key factor for enabling prevalent viral marketing in large-scale online social networks. Prior solutions, such as the greedy algorithm of Kempe et al. (2003) and its improvements are slow and not scalable, while other heuristic algorithms do not provide consistently good performance on influence spreads. In this paper, we design a new heuristic algorithm that is easily scalable to millions of nodes and edges in our experiments. Our algorithm has a simple tunable parameter for users to control the balance between the running time and the influence spread of the algorithm. Our results from extensive simulations on several real-world and synthetic networks demonstrate that our algorithm is currently the best scalable solution to the influence maximization problem: (a) our algorithm scales beyond million-sized graphs where the greedy algorithm becomes infeasible, and (b) in all size ranges, our algorithm performs consistently well in influence spread --- it is always among the best algorithms, and in most cases it significantly outperforms all other scalable heuristics to as much as 100%--260% increase in influence spread.

1,709 citations



Journal Article
TL;DR: This paper employs approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities, and defines the network community profile plot, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales.
Abstract: A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as "real" communities, we employ approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be i...

1,660 citations

Posted Content
TL;DR: In this article, the authors employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities.
Abstract: A large body of work has been devoted to defining and identifying clusters or communities in social and information networks. We explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. We employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the "best" possible community--according to the conductance measure--over a wide range of size scales. We study over 100 large real-world social and information networks. Our results suggest a significantly more refined picture of community structure in large networks than has been appreciated previously. In particular, we observe tight communities that are barely connected to the rest of the network at very small size scales; and communities of larger size scales gradually "blend into" the expander-like core of the network and thus become less "community-like." This behavior is not explained, even at a qualitative level, by any of the commonly-used network generation models. Moreover, it is exactly the opposite of what one would expect based on intuition from expander graphs, low-dimensional or manifold-like graphs, and from small social networks that have served as testbeds of community detection algorithms. We have found that a generative graph model, in which new edges are added via an iterative "forest fire" burning process, is able to produce graphs exhibiting a network community profile plot similar to what we observe in our network datasets.

1,555 citations

Journal Article
TL;DR: The requirements that context modelling and reasoning techniques should meet are discussed, including the modelling of a variety of context information types and their relationships, of situations as abstractions of context information facts, of histories of context information, and of uncertainty of context information.

1,201 citations