Showing papers in "Social Network Analysis and Mining in 2016"

Journal Article•DOI•

Dynamic community detection in evolving networks using locality modularity optimization

Mário Cordeiro, Rui Portocarrero Sarmento, João Gama

24 Mar 2016-Social Network Analysis and Mining

TL;DR: This work proposes a novel technique that keeps the community structure up-to-date following the addition or removal of nodes and edges, and performs a local modularity optimization that maximizes the modularity gain function only for those communities where nodes and edges were edited, keeping the rest of the network unchanged.

Abstract: The amount and variety of data generated by today’s online social and telecommunication network services are changing the way researchers analyze social networks. Coping with fast-evolving networks with millions of nodes and edges is, among other factors, the main challenge. Community detection algorithms must also be updated or improved under these conditions. Previous state-of-the-art algorithms based on modularity optimization (such as the Louvain algorithm) provide fast, efficient and robust community detection on large static networks. Nonetheless, due to the high computational complexity of these algorithms, using batch techniques in dynamic networks requires running community detection on the whole network at each evolution step. This proves to be computationally expensive and unstable in terms of community tracking. Our contribution is a novel technique that keeps the community structure up-to-date following the addition or removal of nodes and edges. The proposed algorithm performs a local modularity optimization that maximizes the modularity gain function only for those communities where nodes and edges were edited, keeping the rest of the network unchanged. The effectiveness of our algorithm is demonstrated by comparison with other state-of-the-art community detection algorithms with respect to Newman’s Modularity, Modularity with Split Penalty, Modularity Density, number of detected communities and running time.

64 citations

Journal Article•DOI•

From classification to quantification in tweet sentiment analysis

Wei Gao¹, Fabrizio Sebastiani¹

Qatar Computing Research Institute¹

12 Apr 2016-Social Network Analysis and Mining

TL;DR: It is argued that researchers interested in tweet sentiment prevalence should switch to quantification-specific learning algorithms and evaluation measures, which produce substantially better class frequency estimates than a state-of-the-art classification-oriented algorithm routinely used in TSC.

Abstract: Sentiment classification has become a ubiquitous enabling technology in the Twittersphere, since classifying tweets according to the sentiment they convey towards a given entity (be it a product, a person, a political party, or a policy) has many applications in political science, social science, market research, and many others. In this paper, we contend that most previous studies dealing with tweet sentiment classification (TSC) use a suboptimal approach. The reason is that the final goal of most such studies is not estimating the class label (e.g., Positive, Negative, or Neutral) of individual tweets, but estimating the relative frequency (a.k.a. “prevalence”) of the different classes in the dataset. The latter task is called quantification, and recent research has convincingly shown that it should be tackled as a task of its own, using learning algorithms and evaluation measures different from those used for classification. In this paper, we show (by carrying out experiments using two learners, seven quantification-specific algorithms, and 11 TSC datasets) that using quantification-specific algorithms produces substantially better class frequency estimates than a state-of-the-art classification-oriented algorithm routinely used in TSC. We thus argue that researchers interested in tweet sentiment prevalence should switch to quantification-specific (instead of classification-specific) learning algorithms and evaluation measures.

63 citations

Journal Article•DOI•

A survey of event detection techniques in online social networks

Anuradha Goswami¹, Ajey Kumar¹

Symbiosis International University¹

17 Nov 2016-Social Network Analysis and Mining

TL;DR: A survey is done for event detection techniques in OSN based on social text streams—newswire, web forums, emails, blogs and microblogs, for natural disasters, trending or emerging topics and public opinion-based events.

Abstract: Online social networks (OSNs) have become an important platform for detecting real-world events in recent years. These real-world events are detected by analyzing the huge volume of social-stream data available on different OSN platforms. Event detection has become significant because the data contain substantial information describing different scenarios during events or crises. This information further helps to enable contextual decision making regarding the event location, content and temporal specifications. Several studies exist that offer a plethora of frameworks and tools for detecting and analyzing events, used for applications such as crisis management and for monitoring and predicting events on different OSN platforms. In this paper, we survey event detection techniques in OSNs based on social text streams (newswire, web forums, emails, blogs and microblogs) for natural disasters, trending or emerging topics and public opinion-based events. The work done and the open problems are explicitly mentioned for each social stream. Further, this paper lists the event detection tools available to researchers.

53 citations

Journal Article•DOI•

Reply trees in Twitter: data analysis and branching process models

Ryosuke Nishi¹,²; Taro Takaguchi¹,²; Keigo Oka¹,³; Takanori Maehara¹,²; Masashi Toyoda³; Ken-ichi Kawarabayashi¹,²; Naoki Masuda⁴

Hitotsubashi University¹, National Institute of Informatics², University of Tokyo³, University of Bristol⁴

10 May 2016-Social Network Analysis and Mining

TL;DR: It is suggested that the in-degree of the tweet that initiates a reply tree may play an important role in forming the global shape of the reply tree.

Abstract: Structure of networks constructed from mentioning relationships between posts in online media may be valuable for understanding how information and opinions spread in these media. We crawled Twitter to collect tweets and replies to construct a large number of so-called reply trees, each of which was rooted at a tweet and joined by replies. Consistent with the previous literature, we found that the empirical trees were characterized by some long path-like reply trees, large star-like trees, and long irregular trees, although their frequencies were not high. We tested several branching process models to explain the empirical frequency of these types of reply trees as well as more basic quantities such as the distributions of the size and depth of the reply tree. Based on our modeling results, we suggest that the in-degree of the tweet that initiates a reply tree (i.e., the number of times that the tweet is directly mentioned by other reply posts) may play an important role in forming the global shape of the reply tree.

50 citations

Journal Article•DOI•

Detecting experts on Quora: by their activity, quality of answers, linguistic characteristics and temporal behaviors

Sumanth Patil¹, Kyumin Lee¹

Utah State University¹

01 Dec 2016-Social Network Analysis and Mining

TL;DR: This manuscript proposes user activity features, quality of answer features, linguistic features and temporal features to identify distinguishing patterns between experts and non-experts, and develops statistical models based on the features to automatically detect experts.

Abstract: Quora is a fast growing social QA (2) propose user activity features, quality of answer features, linguistic features and temporal features to identify distinguishing patterns between experts and non-experts; and (3) develop statistical models based on the features to automatically detect experts. Our experimental results show that our classifiers effectively identify experts in general topics and a specific topic, achieving up to 97 % accuracy and 0.987 AUC.

44 citations

Journal Article•DOI•

Recommendation information diffusion in social networks considering user influence and semantics

Dionisis Margaris¹, Costas Vassilakis², Panagiotis Georgiadis¹

National and Kapodistrian University of Athens¹, University of Peloponnese²

25 Nov 2016-Social Network Analysis and Mining

TL;DR: This paper enhances recommendation algorithms used in social networks by taking into account qualitative aspects of the recommended items, such as price and reliability, the influencing factors between social network users, the social network user behavior regarding their purchases in different item categories and the semantic categorization of the products to be recommended.

Abstract: One of the major problems in the domain of social networks is the handling and diffusion of the vast, dynamic and disparate information created by its users. In this context, the information contributed by users can be exploited to generate recommendations for other users. Relevant recommender systems take into account static data from users’ profiles, such as location, age or gender, complemented with dynamic aspects stemming from the user behavior and/or social network state such as user preferences, items’ general acceptance and influence from social friends. In this paper, we enhance recommendation algorithms used in social networks by taking into account qualitative aspects of the recommended items, such as price and reliability, the influencing factors between social network users, the social network user behavior regarding their purchases in different item categories and the semantic categorization of the products to be recommended. The inclusion of these aspects leads to more accurate recommendations and diffusion of better user-targeted information. This allows for better exploitation of the limited recommendation space, and therefore, online advertisement efficiency is raised.

42 citations

Journal Article•DOI•

Complex contagions and the diffusion of popular Twitter hashtags in Nigeria

Clay Fink¹, Aurora Schmidt¹, Vladimir Barash, Christopher J. Cameron², Michael W. Macy²

Johns Hopkins University Applied Physics Laboratory¹, Cornell University²

01 Dec 2016-Social Network Analysis and Mining

TL;DR: It is found that hashtags related to Nigerian sociopolitical issues, including the #bringbackourgirls hashtag, are more likely to be adopted among densely connected users with multiple network neighbors who have also adopted the hashtag, compared to mainstream news hashtags.

Abstract: Social media sites such as Facebook and Twitter provide highly granular time-stamped data about the interactions and communications between people and provide us unprecedented opportunities for empirically testing theory about information flow in social networks. Using publicly available data from Twitter’s free API (Application Program Interface), we track the adoption of popular hashtags in Nigeria during 2014. These hashtags reference online marketing campaigns, major news stories, and events and issues specific to Nigeria, including reactions to the kidnapping of 276 schoolgirls in Northeastern Nigeria by the Islamic extremist group Boko Haram. We find that hashtags related to Nigerian sociopolitical issues, including the #bringbackourgirls hashtag, which was associated with protests against the Nigerian government’s response to the kidnapping, are more likely to be adopted among densely connected users with multiple network neighbors who have also adopted the hashtag, compared to mainstream news hashtags. This association between adoption threshold and local network structure is consistent with theory about the spread of complex contagions, a type of social contagion which requires social reinforcement from multiple adopting neighbors. Theory also predicts the need for a critical mass of adopters before the contagion can become viral. We illustrate this with the #bringbackourgirls hashtag by identifying the point at which the local social movement transforms into a more widespread phenomenon. We also show that these results are robust across both the follow and reply/mention/retweet networks on Twitter. Our analysis involves data mining records of hashtag adoption and of the social connections between adopters.

41 citations

Journal Article•DOI•

Behavioral analysis and classification of spammers distributing pornographic content in social media

Monika Singh¹, Divya Bansal¹, Sanjeev Sofat¹

PEC University of Technology¹

24 Jun 2016-Social Network Analysis and Mining

TL;DR: It is found that spammers contributing to pornographic content follow legitimate Twitter users and send URLs that link users to pornographic sites, in what is the first attempt to analyze and categorize the behavior of pornographic users in Twitter as spammers.

Abstract: Social spam is a huge and complicated problem plaguing social networking sites in several ways. This includes posts, reviews or blogs containing product promotions and contests, adult content and general spam. It has been found that social media websites such as Twitter are also acting as distributors of pornographic content, although this is against their own stated policies. In this paper, we have reviewed the case of Twitter and found that spammers contributing to pornographic content follow legitimate Twitter users and send URLs that link users to pornographic sites. Behavioral analysis of this type of spammer has been conducted using graph-based as well as content-based information fetched using simple text operators to study their characteristics. In the present study, about 74,000 tweets containing pornographic adult content posted by around 18,000 users have been collected and analyzed. The analysis shows that the users posting pornographic content fulfill the characteristics of spammers as stated by the rules and guidelines of Twitter. It has been observed that the illegitimate use of social media for spreading social spam has been growing at a fast pace, with the network companies turning a blind eye toward this growing problem. Clearly, there is an immense requirement to build an effective solution to remove objectionable and slanderous content as stated above from social networking websites to promote and protect public decency and the welfare of children and adults. It is also essential in order to enhance the experience of genuine users of social media and protect them from harm to their public identity on the World Wide Web. Further in this paper, classification of pornographic spammers and genuine users has also been performed using machine learning techniques. Experimental results show that a Random Forest classifier is able to predict pornographic spammers with a reasonably high accuracy of 91.96 %. To the best of our knowledge, this is the first attempt to analyze and categorize the behavior of pornographic users in Twitter as spammers. Prior work has addressed identifying spammers in general but has not specifically targeted pornographic spammers.

40 citations

Journal Article•DOI•

A review of features for the discrimination of twitter users: application to the prediction of offline influence

Jean-Valère Cossu¹, Vincent Labatut¹, Nicolas Dugué²

University of Avignon¹, University of Orléans²

09 May 2016-Social Network Analysis and Mining

TL;DR: This article reviews a wide range of features used to characterize Twitter users, shows that most features deemed efficient for predicting online influence, such as the numbers of retweets and followers, are not relevant to the offline influence detection problem, and proposes several content-based approaches for that task.

Abstract: Many works related to Twitter aim at characterizing its users in some way: role on the service (spammers, bots, organizations, etc.), nature of the user (socio-professional category, age, etc.), topics of interest, and others. However, for a given user classification problem, it is very difficult to select a set of appropriate features, because the many features described in the literature are very heterogeneous, with name overlaps and collisions, and numerous very close variants. In this article, we review a wide range of such features. In order to present a clear state-of-the-art description, we unify their names, definitions and relationships, and we propose a new, neutral, typology. We then illustrate the interest of our review by applying a selection of these features to the offline influence detection problem. This task consists in identifying users who are influential in real life, based on their Twitter account and related data. We show that most features deemed efficient to predict online influence, such as the numbers of retweets and followers, are not relevant to this problem. However, we propose several content-based approaches to label Twitter users as influencers or not. We also rank them according to a predicted influence level. Our proposals are evaluated over the CLEF RepLab 2014 dataset, and outmatch state-of-the-art methods.

35 citations

Journal Article•DOI•

A survey on game theoretic models for community detection in social networks

Annapurna Jonnalagadda¹, Lakshmanan Kuppusamy¹

VIT University¹

20 Sep 2016-Social Network Analysis and Mining

TL;DR: This survey provides a taxonomy of game models, their characteristics and their performance, discusses interesting applications of game theory for social networks, and outlines further research directions and some open challenges.

Abstract: Community detection in social networks has received much attention from researchers in multiple disciplines due to its impactful applications such as recommendation systems, link prediction, and anomaly detection. The focus of community detection is to determine the denser subgraphs of the network, which are called communities. The nodes of a community are expected to have similar features and interests. Treating the nodes as selfish agents, the evolution of communities can be effectively modelled as a community formation game. Game theory provides a systematic framework to model the competition and coordination among the players. In the past decade, there have been several contributions from the domain of game theory to address the problem of community detection in social networks. In this paper, we make a comprehensive survey that studies and provides an insight into available game theory-based community detection algorithms. The current study provides the taxonomy of game models and their characteristics along with their performance. We discuss the interesting applications of game theory for social networks and also provide further research directions as well as some open challenges.

34 citations

Journal Article•DOI•

A supervised learning approach to link prediction in Twitter

Cherry Ahmed¹, Abeer El-Korany¹, Reem Bahgat¹

Cairo University¹

02 May 2016-Social Network Analysis and Mining

TL;DR: This work has been extended by moving to a Machine Learning Approach which treats the prediction process as a classification problem, and shows that using both classical and ensemble classifiers outperforms baseline algorithms when applied individually.

Abstract: The growth of social networks has lately attracted both academic and industrial researchers to study the ties between people, and how the social networks evolve with time. Social networks like Facebook, Twitter and Flickr require efficient and accurate methods to recommend friends to their users in the network. Several algorithms have been developed to recommend friends or predict likelihood of future links. Two main approaches are used to utilize those features; Score-based Approaches and Machine Learning Approaches. In a previous work, a score-based method was used based on topological, node and social features to calculate similarity between users and determine the likelihood of forming future links. This work has been extended by moving to a Machine Learning Approach which treats the prediction process as a classification problem. The classifier predicts the class of each edge whether it exists or doesn’t exist. Machine Learning Approaches have the benefit of adding all similarity indices needed as the feature set fed to the classifier. While in Score-based Approach when we used multiple features with associated weights, the performance was sensitive to the values of such weights. When machine learning is applied, the learning process is performed by the classifier which is fed by eight similarity indices representing connectivity, community, interaction and trust in social network. When indices are combined, a much higher accuracy than the previous Score-based Approach is obtained and hence enhancing the prediction accuracy. In order to evaluate the correctness of the proposed model, it has been applied on a real dataset of 2.974k users on the Twitter social network. Experiments show that using both classical and ensemble classifiers outperforms baseline algorithms when applied individually.

Journal Article•DOI•

Focal structures analysis: identifying influential sets of individuals in a social network

Fatih Sen¹, Rolf T. Wigand², Nitin Agarwal², Serpil Tokdemir², Rafał Kasprzyk³

Boston Children's Hospital¹, University of Arkansas at Little Rock², Military University of Technology in Warsaw³

08 Apr 2016-Social Network Analysis and Mining

TL;DR: The Focal Structures Analysis (FSA) methodology is developed to extract key sets of individuals, called focal structures, in a social network, and goes beyond the traditional unit of analysis, which is an individual or a set of influential individuals, and places focal structures between the individuals and communities/clusters as the unit of analysis.

Abstract: Identifying influential individuals is a well-known approach in extracting actionable knowledge in a network. Existing studies suggest measures to identify influential individuals, i.e., they focus on the question “which individuals are best connected to others or have the most influence?”. Such individuals, however, may not represent the context (relationships, interactions, etc.) entirely in a social network. For example, it is nearly an impossible task for a single individual to organize a mass protest of the scale of the Saudi Arabian women’s 2013 Oct26Driving campaign, the 2012 Occupy Wall Street and the 2011 Arab Spring. Similarly, other events such as mobilizing the 2013 Taksim square-Gezi Park protesters, coordinating crisis response for natural disasters (e.g., the 2010 Haiti earthquake), or even organizing flash mobs would require a key set of individuals rather than a single or the most influential individual in a social network. An alternate line of research dealing with community or cluster identification approaches extract subnetworks of individuals. However, these structures may not represent the key sets of individuals that could coordinate the social processes mentioned above. Therefore, we develop the Focal Structures Analysis (FSA) methodology to extract such key sets of individuals, called focal structures, in a social network. This research goes beyond the traditional unit of analysis, which is an individual or a set of influential individuals, and places focal structures between the individuals and communities/clusters as the unit of analysis. To the best of our knowledge, this type of work is the first effort in identifying influential sets of individuals and would open up new directions for researchers to develop new methods in social network analysis.

Journal Article•DOI•

User characterization for online social networks

Tayfun Tuna¹, Esra Akbas², Ahmet Aksoy³, Muhammed Abdullah Canbaz³, Umit Karabiyik⁴, Bilal Gonen⁵, Ramazan S. Aygun⁶

University of Houston¹, Florida State University², University of Nevada, Reno³, Sam Houston State University⁴, University of Cincinnati⁵, University of Alabama in Huntsville⁶

04 Nov 2016-Social Network Analysis and Mining

TL;DR: In this paper, the authors survey research studies that are helpful for user characterization, as online users may not always reveal their true identity or attributes, focusing on user attribute determination such as gender and age, user behavior analysis such as motives for deception, mental models that are indicators of user behavior, user categorization such as bots versus humans, and entity matching on different social networks.

Abstract: Online social network analysis has attracted great attention with a vast number of users sharing information and the availability of APIs that help to crawl online social network data. In this paper, we survey research studies that are helpful for user characterization, as online users may not always reveal their true identity or attributes. We focus especially on user attribute determination such as gender and age; user behavior analysis such as motives for deception; mental models that are indicators of user behavior; user categorization such as bots versus humans; and entity matching on different social networks. We believe our summary of analysis of user characterization will provide important insights to researchers and better services to online users.

Journal Article•DOI•

Exploring characteristics of suspended users and network stability on Twitter

Wei Wei¹, Kenneth Joseph¹, Huan Liu², Kathleen M. Carley¹

Carnegie Mellon University¹, Arizona State University²

21 Jul 2016-Social Network Analysis and Mining

TL;DR: There is significant evidence that suspended users exist on the periphery of social networks on Twitter and consequently that removing them has little impact on network structure, and prior attempts to distinguish among different types of suspended users are improved by using a much larger dataset.

Abstract: Social media is rapidly becoming a medium of choice for understanding the cultural pulse of a region; e.g. for identifying what the population is concerned with and what kind of help is needed in a crisis. To assess this cultural pulse, it is critical to have an accurate assessment of who is saying what. Unfortunately, social media is also the home of users who engage in disruptive, disingenuous, and potentially illegal activity. A range of users, both human and non-human, carry out such social cyber-attacks. We ask, to what extent does the presence or absence of such users influence our ability to assess the cultural pulse of a region? Our prior research on this topic showed that Twitter-based network structures and content are unstable and can be highly impacted by the removal of suspended users. Because of this, statistical techniques can be established to differentiate potential types of suspended and non-suspended users. In this extended paper, we develop additional experiments to explore the spatial patterns of suspended users, and we further consider how these users affect structural and content concentrations via the development of new metrics and new analyses. We find significant evidence that suspended users exist on the periphery of social networks on Twitter and consequently that removing them has little impact on network structure. We also improve prior attempts to distinguish among different types of suspended users by using a much larger dataset. Finally, we conduct a temporal sentiment analysis to illustrate differences between suspended users and non-suspended users on this dimension.

Journal Article•DOI•

Structure-preserving sparsification methods for social networks

Michael Hamann¹, Gerd Lindner¹, Henning Meyerhenke¹, Christian L. Staudt¹, Dorothea Wagner¹

Karlsruhe Institute of Technology¹

29 Apr 2016-Social Network Analysis and Mining

TL;DR: In this article, the first systematic conceptual and experimental comparison of edge sparsification methods on a diverse set of network properties is presented; the methods can be understood as rating edges by importance and then filtering globally or locally by these scores.

Abstract: Sparsification reduces the size of networks while preserving structural and statistical properties of interest. Various sparsifying algorithms have been proposed in different contexts. We contribute the first systematic conceptual and experimental comparison of edge sparsification methods on a diverse set of network properties. It is shown that they can be understood as methods for rating edges by importance and then filtering globally or locally by these scores. We show that applying a local filtering technique improves the preservation of all kinds of properties. In addition, we propose a new sparsification method (Local Degree) which preserves edges leading to local hub nodes. All methods are evaluated on a set of social networks from Facebook, Google+, Twitter and LiveJournal with respect to network properties including diameter, connected components, community structure, multiple node centrality measures and the behavior of epidemic simulations. To assess the preservation of the community structure, we also include experiments on synthetically generated networks with ground truth communities. Experiments with our implementations of the sparsification methods (included in the open-source network analysis tool suite NetworKit) show that many network properties can be preserved down to about 20 % of the original set of edges for sparse graphs with a reasonable density. The experimental results allow us to differentiate the behavior of different methods and show which method is suitable with respect to which property. While our Local Degree method is best for preserving connectivity and short distances, other newly introduced local variants are best for preserving the community structure.

Journal Article•DOI•

A mathematical model of news propagation on online social network and a control strategy for rumor spreading

Joydip Dhar¹, Ankur Jain², Vijay Kumar Gupta²

Indian Institute of Information Technology and Management, Gwalior¹, Rajiv Gandhi Proudyogiki Vishwavidyalaya²

08 Aug 2016-Social Network Analysis and Mining

TL;DR: A mathematical model of news spreading from some posts displayed in an online social network using the epidemiological modeling technique is proposed and criteria of rumor detection and verification for the model are proposed.

Abstract: People in the modern world use social network Web sites to communicate with others, known or unknown, to get others’ opinions and to share their own. Posts and weblogs affect the human mind, at least for some time. These posts play an important role in people’s decisions. But the content of a post may be either genuine information or misinformation, i.e., just a rumor. People find it hard to distinguish whether a post is correct information or misinformation. It is important to decide whether a post is information or just a rumor, because a rumor may lead the majority to support a wrong decision. In this paper, a mathematical framework is presented related to these matters. Firstly, we propose a mathematical model of news spreading from some posts displayed in an online social network. The development of the mathematical models of news propagation uses the epidemiological modeling technique. Then, we propose criteria of rumor detection and verification for the model. In the case of a rumor, a revised model is proposed with media awareness as a control strategy for reducing rumor spreading.

Journal Article•DOI•

An algebraic approach to temporal network analysis based on temporal quantities

Vladimir Batagelj¹, Selena Praprotnik¹•Institutions (1)

University of Ljubljana¹

21 May 2016-Social Network Analysis and Mining

TL;DR: In this paper, the authors define the addition and multiplication of temporal quantities in a way that can be used to define the addition and multiplication of temporal networks, and develop fast algorithms for the proposed operations.

Abstract: In a temporal network, the presence and activity of nodes and links can change through time. To describe temporal networks we introduce the notion of temporal quantities. We define the addition and multiplication of temporal quantities in a way that can be used for the definition of addition and multiplication of temporal networks. The corresponding algebraic structures are semirings. The usual approach to (data) analysis of temporal networks is to transform the network into a sequence of time slices—static networks corresponding to selected time intervals and analyze each of them using standard methods to produce a sequence of results. The approach proposed in this paper enables us to compute these results directly. We developed fast algorithms for the proposed operations. They are available as an open source Python library TQ (Temporal Quantities) and a program Ianus. The proposed approach enables us to treat as temporal quantities also other network characteristics such as degrees, connectivity components, centrality measures, Pathfinder skeleton, etc. To illustrate the developed tools we present some results from the analysis of Franzosi’s violence network and Corman’s Reuters terror news network.
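
To make the idea of adding and multiplying temporal quantities concrete, here is a small sketch. It is not the API of the TQ library. It assumes a temporal quantity is a list of non-overlapping `(start, end, value)` triples over half-open intervals, that addition sums values wherever at least one operand is defined, and that multiplication is defined only where both operands are. The names `tq_add`, `tq_mul` and `_combine` are hypothetical, and ordinary `+` and `*` stand in for the semiring operations.

```python
def _value_at(tq, t):
    """Return the value of temporal quantity tq at time t, or None if undefined."""
    for start, end, value in tq:
        if start <= t < end:
            return value
    return None


def _combine(a, b, op, need_both):
    """Apply op piecewise over the breakpoints of a and b."""
    points = sorted({p for start, end, _ in a + b for p in (start, end)})
    out = []
    for start, end in zip(points, points[1:]):
        va, vb = _value_at(a, start), _value_at(b, start)
        if va is None and vb is None:
            continue
        if va is None or vb is None:
            if need_both:
                continue
            value = va if vb is None else vb
        else:
            value = op(va, vb)
        # Merge with the previous piece when it is adjacent and has equal value.
        if out and out[-1][1] == start and out[-1][2] == value:
            out[-1] = (out[-1][0], end, value)
        else:
            out.append((start, end, value))
    return out


def tq_add(a, b):
    return _combine(a, b, lambda x, y: x + y, need_both=False)


def tq_mul(a, b):
    return _combine(a, b, lambda x, y: x * y, need_both=True)


a = [(1, 5, 2), (6, 8, 1)]
b = [(3, 7, 3)]
total = tq_add(a, b)
product = tq_mul(a, b)
```

Working directly on interval lists is what lets results be computed without first cutting the network into time slices: each operation touches only the breakpoints where something changes.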

Journal Article•DOI•

Identifying community structures in dynamic networks

Hamidreza Alvari¹, Alireza Hajibagheri¹, Gita Sukthankar¹, Kiran Lakkaraju²•Institutions (2)

University of Central Florida¹, Sandia National Laboratories²

12 Sep 2016-Social Network Analysis and Mining

TL;DR: This article proposes D-GT (Dynamic Game-Theoretic community detection), a method for detecting communities in dynamic networks in which each node acts as a rational agent that joins or leaves the communities of its neighbors to maximize its local utility.

Abstract: Most real-world social networks are inherently dynamic, composed of communities that are constantly changing in membership. To track these evolving communities, we need dynamic community detection techniques. This article evaluates the performance of a set of game-theoretic approaches for identifying communities in dynamic networks. Our method, D-GT (Dynamic Game-Theoretic community detection), models each network node as a rational agent who periodically plays a community membership game with its neighbors. During game play, nodes seek to maximize their local utility by joining or leaving the communities of network neighbors. The community structure emerges after the game reaches a Nash equilibrium. Compared to the benchmark community detection methods, D-GT more accurately predicts the number of communities and finds community assignments with a higher normalized mutual information, while retaining a good modularity.

Journal Article•DOI•

Discover millions of fake followers in Weibo

Yi Zhang¹, Jianguo Lu¹•Institutions (1)

University of Windsor¹

31 Mar 2016-Social Network Analysis and Mining

TL;DR: This paper investigates the top Weibo accounts whose follower lists duplicate or nearly duplicate each other (hereafter called near-duplicates), and proposes a novel fake account detection method based on the very purpose of the existence of these accounts.

Abstract: Weibo is the Chinese counterpart of Twitter, which has attracted hundreds of millions of users. Just like other Online Social Networks (hereafter OSNs), Weibo has a large number of fake accounts. They are created to sell their following links to customers, who want to boost their follower counts. These bogus accounts are difficult to identify individually, especially when they are created by sophisticated programs or controlled by human beings directly. This paper proposes a novel fake account detection method that is based on the very purpose of the existence of these accounts: they are created to follow their targets en masse, resulting in high-overlapping between the follower lists of their customers. This paper investigates the top Weibo accounts whose follower lists duplicate or nearly duplicate each other (hereafter called near-duplicates). Discovering near-duplicates is a challenging task. The network is large; the data in its entirety are not available; the pair-wise comparison is very expensive. We developed a sampling-based approach to discover all the near-duplicates of the top accounts, who have at least 50,000 followers. In the experiment, we found 395 near-duplicates, which leads us to 11.90 million fake accounts (4.56 % of total users) who send 741.10 million links (9.50 % of the entire edges). Furthermore, we characterize four typical structures of the spammers, cluster these spammers into 34 groups, and analyze the properties of each group.
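
The notion of follower lists that "duplicate or nearly duplicate each other" can be illustrated with a set-overlap measure. The sketch below assumes Jaccard similarity and an arbitrary threshold, neither of which is stated in the abstract, and it compares full follower sets pairwise. It does not reproduce the sampling-based approach of the paper, which exists precisely because such exhaustive comparison is too expensive at Weibo's scale. The names `jaccard` and `near_duplicates` and the toy data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections of follower ids."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def near_duplicates(followers, threshold=0.8):
    """Return account pairs whose follower sets overlap at or above threshold.

    followers: dict mapping account name -> iterable of follower ids.
    threshold: hypothetical cut-off chosen for illustration only.
    """
    pairs = []
    for (u, fu), (v, fv) in combinations(sorted(followers.items()), 2):
        if jaccard(fu, fv) >= threshold:
            pairs.append((u, v))
    return pairs


# Made-up data: A and B share four of their six distinct followers.
followers = {
    "A": {1, 2, 3, 4, 5},
    "B": {1, 2, 3, 4, 6},
    "C": {7, 8},
}
pairs = near_duplicates(followers, threshold=0.6)
```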

Journal Article•DOI•

The social role of social media: the case of Chennai rains-2015

Mayank Yadav¹, Zillur Rahman¹•Institutions (1)

Indian Institutes of Technology¹

25 Oct 2016-Social Network Analysis and Mining

TL;DR: This article focuses on the role that social media can play during natural disasters, using the recent case of the Chennai floods in India.

Abstract: Social media has altered the way individuals communicate today. Individuals feel more connected on Facebook and Twitter, with greater freedom to chat and to share pictures and videos. Hence, social media is widely employed by various companies to promote their products and services and establish better customer relationships. Owing to the increasing popularity of these social media platforms, their usage is also expanding significantly. Various studies have discussed the importance of social media in the corporate world for effective marketing communication, customer relationships, and firm performance, but no studies have focused on the social role of social media, i.e., in disaster resilience in India. Various academicians and practitioners have advocated the importance and use of social media in disaster resilience. This article focuses on the role that social media can play during natural disasters, with the help of the recent case of the Chennai floods in India. This study provides a better understanding of the role social media can play in natural disaster resilience in the Indian context.

Journal Article•DOI•

Centrality in the global network of corporate control

Frank W. Takes¹,², Eelke M. Heemskerk¹•Institutions (2)

University of Amsterdam¹, Leiden University²

15 Oct 2016-Social Network Analysis and Mining

TL;DR: In this paper, the authors investigate the global board interlock network, covering 400,000 firms linked through 1,700,000 edges representing shared directors, and use centrality to study the embeddedness of firms from a particular country within the global network.

Abstract: Corporations across the world are highly interconnected in a large global network of corporate control. This paper investigates the global board interlock network, covering 400,000 firms linked through 1,700,000 edges representing shared directors between these firms. The main focus is on the concept of centrality, which is used to investigate the embeddedness of firms from a particular country within the global network. The study results in three contributions. First, to the best of our knowledge, this is the first time that the topology as well as the concept of centrality in corporate networks can be investigated at a global scale, allowing for the largest cross-country comparison ever done in interlocking directorates literature. We demonstrate, among other things, extremely similar network topologies, yet large differences between countries when it comes to the relation between economic prominence indicators and firm centrality. Second, we introduce two new metrics that are specifically suitable for comparing the centrality ranking of a partition to that of the full network. Using the notion of centrality persistence we propose to measure the persistence of a partition’s centrality ranking in the full network. In the board interlock network, it allows us to assess the extent to which the footprint of a national network is still present within the global network. Next, the measure of centrality ranking dominance tells us whether a partition (country) is more dominant at the top or the bottom of the centrality ranking of the full (global) network. Finally, comparing these two new measures of persistence and dominance between different countries allows us to classify these countries based on their embeddedness, measured using the relation between the centrality of a country’s firms on the national and the global scale of the board interlock network.

Journal Article•DOI•

Analysis and detection of labeled cyberbullying instances in Vine, a video-based social network

Rahat Ibn Rafiq¹, Homa Hosseinmardi¹, Sabrina Arredondo Mattson¹, Richard Han¹, Qin Lv¹, Shivakant Mishra¹•Institutions (1)

University of Colorado Boulder¹

29 Sep 2016-Social Network Analysis and Mining

TL;DR: This research paper performs a thorough investigation of cyberbullying instances in Vine, a video-based online social network, and trains different classifiers on the labeled media sessions to detect instances of cyberbullying.

Abstract: The last decade has experienced an exponential growth in the popularity of online social networks. This growth in popularity has also paved the way for the threat of cyberbullying to grow to an extent that was never seen before. Online social network users are now constantly under the threat of cyberbullying from predators and stalkers. In our research paper, we perform a thorough investigation of cyberbullying instances in Vine, a video-based online social network. We collect a set of media sessions (shared videos with their associated meta-data) and then label them for cyberaggression and cyberbullying using CrowdFlower, a crowd-sourced website. We also perform a second survey that labels the videos’ contents and the emotions exhibited. After the labeling of the media sessions, we provide a detailed analysis of the media sessions to investigate the cyberbullying and cyberaggression behavior in Vine. After the analysis, we train different classifiers based upon the labeled media sessions. We then investigate, evaluate and compare the classifiers’ performance in detecting instances of cyberbullying.

Journal Article•DOI•

Multi-source models for civil unrest forecasting

Gizem Korkmaz¹, Jose Cadena¹, Chris J. Kuhlman¹, Achla Marathe¹, Anil Vullikanti¹, Naren Ramakrishnan²•Institutions (2)

Virginia Bioinformatics Institute¹, Virginia Tech²

15 Jul 2016-Social Network Analysis and Mining

TL;DR: Social media and news are found to be more informative than other data sources, including the political event databases, and to enhance prediction performance; however, social media increases the variation in the performance metrics.

Abstract: Civil unrest events (protests, strikes, and "occupy" events) range from small, nonviolent protests that address specific issues to events that turn into large-scale riots. Detecting and forecasting these events is of key interest to social scientists and policy makers because they can lead to significant societal and cultural changes. We forecast civil unrest events in six countries in Latin America on a daily basis, from November 2012 through August 2014, using multiple data sources that capture social, political and economic contexts within which civil unrest occurs. The models contain predictors extracted from social media sites (Twitter and blogs) and news sources, in addition to volume of requests to Tor, a widely used anonymity network. Two political event databases and country-specific exchange rates are also used. Our forecasting models are evaluated using a Gold Standard Report (GSR), which is compiled by an independent group of social scientists and subject matter experts. We use logistic regression models with Lasso to select a sparse feature set from our diverse datasets. The experimental results, measured by F1-scores, are in the range 0.68 to 0.95, and demonstrate the efficacy of using a multi-source approach for predicting civil unrest. Case studies illustrate the insights into unrest events that are obtained with our method. The ablation study demonstrates the relative value of data sources for prediction. We find that social media and news are more informative than other data sources, including the political event databases, and enhance the prediction performance. However, social media increases the variation in the performance metrics.

Journal Article•DOI•

Where has this tweet come from? Fast and fine-grained geolocalization of non-geotagged tweets

Pavlos Paraskevopoulos¹, Themis Palpanas²•Institutions (2)

University of Trento¹, Paris Descartes University²

30 Sep 2016-Social Network Analysis and Mining

TL;DR: This work proposes a framework for geolocating tweets that are not geotagged, aiming at accurate fine-grained (i.e., within a city) location estimates by exploiting the similarities in content between a post and a set of geotagged tweets.

Abstract: The rise in the use of social networks in recent years has resulted in an abundance of information on different aspects of everyday social activities that is available online, with the most prominent and timely source of such information being Twitter. This has resulted in a proliferation of tools and applications that can help end users and large-scale event organizers to better plan and manage their activities. In this process of analysis of the information originating from social networks, an important aspect is that of the geographic coordinates, i.e., geolocalization, of the relevant information, which is necessary for several applications (e.g., on trending venues, traffic jams). Unfortunately, only a very small percentage of Twitter posts are geotagged, which significantly restricts the applicability and utility of such applications. In this work, we address this problem by proposing a framework for geolocating tweets that are not geotagged. Our solution is general and estimates the location from which a post was generated by exploiting the similarities in the content between this post and a set of geotagged tweets, as well as their time-evolution characteristics. Contrary to previous approaches, our framework aims at providing accurate geolocation estimates at fine grain (i.e., within a city). The experimental evaluation with real data demonstrates the efficiency and effectiveness of our approach.

Journal Article•DOI•

Predicting charitable donations using social media

Rostyslav Korolov¹, Justin Peabody², Allen Lavoie², Sanmay Das², Malik Magdon-Ismail¹, William A. Wallace¹•Institutions (2)

Rensselaer Polytechnic Institute¹, Washington University in St. Louis²

06 Jun 2016-Social Network Analysis and Mining

TL;DR: A contagion model is used to predict the near-quadratic scaling for the disaster response case, suggesting that diffusion is present in the emergency response case, while regular charity does not spread via social networks.

Abstract: We study the relationship between chatter on social media and observed actions concerning charitable donation. One hypothesis is that a fraction of those who act will also tweet about it, implying a linear relation. However, if contagion is present, we expect a superlinear scaling. We consider two scenarios: donations in response to a natural disaster, and regular donations. We empirically validate the model using two location-paired sets of social media and donation data, corresponding to the two scenarios. Results show a quadratic relation between chatter and action in the emergency response case. In the case of regular donations, we observe a near-linear relation. Additionally, regular donations can be explained by demographic factors, while for a disaster response social media is a much better predictor of action. A contagion model is used to predict the near-quadratic scaling for the disaster response case. This suggests that diffusion is present in the emergency response case, while regular charity does not spread via social networks. Understanding the scaling behavior that relates social media chatter to physical actions is an important step in estimating the extent of a response and for determining social media strategies to affect the response.
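
The distinction between linear and quadratic scaling in this abstract can be checked on data by estimating the exponent of a power law. The sketch below assumes a relation of the form action ≈ c · chatter^k and estimates k as the least-squares slope in log-log space, which is a standard technique and not necessarily the estimation procedure used in the paper. The function name `scaling_exponent` and the numbers are made up for illustration.

```python
import math


def scaling_exponent(chatter, action):
    """Least-squares slope of log(action) against log(chatter).

    A slope near 1 indicates linear scaling and a slope near 2 quadratic
    scaling. All inputs must be strictly positive.
    """
    xs = [math.log(c) for c in chatter]
    ys = [math.log(a) for a in action]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free data constructed to follow each hypothesis exactly.
chatter = [1, 2, 4, 8, 16]
linear_action = [3 * c for c in chatter]
quadratic_action = [3 * c ** 2 for c in chatter]

k_linear = scaling_exponent(chatter, linear_action)
k_quadratic = scaling_exponent(chatter, quadratic_action)
```

On real data the estimate would carry noise, so a confidence interval around the slope would be needed before calling a relation linear or superlinear.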

Journal Article•DOI•

Sentiment/subjectivity analysis survey for languages other than English

Mohammed Korayem, Khalifeh AlJadda, David J. Crandall¹•Institutions (1)

Indiana University¹

09 Sep 2016-Social Network Analysis and Mining

TL;DR: This paper surveys different ways used for building systems for subjective and sentiment analysis for languages other than English and presents a separate section devoted to Arabic sentiment analysis.

Abstract: Subjective and sentiment analysis have gained considerable attention recently. Most of the resources and systems built so far are for English. The need for designing systems for other languages is increasing. This paper surveys different ways used for building systems for subjective and sentiment analysis for languages other than English. Three different types of approaches are used for building these systems. The first (and the best) is language-specific systems. The second involves reusing or transferring sentiment resources from English to the target language. The third is based on language-independent methods. The paper presents a separate section devoted to Arabic sentiment analysis.

Journal Article•DOI•

Design of reciprocal recommendation systems for online dating

Peng Xia¹, Shuangfei Zhai², Benyuan Liu¹, Yizhou Sun³, Cindy X. Chen¹•Institutions (3)

University of Massachusetts Lowell¹, Binghamton University², Northeastern University³

10 Jun 2016-Social Network Analysis and Mining

TL;DR: This work introduces similarity measures that capture the unique features and characteristics of the online dating network, for example, the interest similarity between two users if they send messages to the same users, and attractiveness similarity if they receive messages from the same users.

Abstract: Online dating sites have become popular platforms for people to look for potential romantic partners. Different from traditional user-item recommendations where the goal is to match items (e.g., books, videos) with a user’s interests, a recommendation system for online dating aims to match people who are mutually interested in and likely to communicate with each other. We introduce similarity measures that capture the unique features and characteristics of the online dating network, for example, the interest similarity between two users if they send messages to the same users, and attractiveness similarity if they receive messages from the same users. A reciprocal score that measures the compatibility between a user and each potential dating candidate is computed, and the recommendation list is generated to include users with top scores. The performance of our proposed recommendation system is evaluated on a real-world dataset from a major online dating site in China. The results show that our recommendation algorithms significantly outperform previously proposed approaches, and the collaborative filtering-based algorithms achieve much better performance than content-based algorithms in both precision and recall. Our results also reveal interesting behavioral differences between male and female users when it comes to looking for potential dates. In particular, males tend to be focused on their own interest and oblivious toward their attractiveness to potential dates, while females are more conscientious about their own attractiveness to the other side of the line.

Journal Article•DOI•

A supervised approach for intra-/inter-community interaction prediction in dynamic social networks

Giulio Rossetti¹, Riccardo Guidotti², Ioanna Miliou², Dino Pedreschi², Fosca Giannotti¹•Institutions (2)

Istituto di Scienza e Tecnologie dell'Informazione¹, University of Pisa²

27 Sep 2016-Social Network Analysis and Mining

TL;DR: This paper proposes a supervised learning approach which exploits features computed by time-aware forecasts of topological measures calculated between node pairs, and instantiates the interaction prediction problem in two disjoint applicative scenarios: intra-community and inter-community link prediction.

Abstract: Due to the growing availability of Internet services in the last decade, interactions between people have become easier and easier to establish. For example, we can have an intercontinental job interview, or we can send real-time multimedia content to any of our friends just by owning a smartphone. All these kinds of human activities generate digital footprints that describe complex, rapidly evolving network structures. In such a dynamic scenario, one of the most challenging tasks involves the prediction of future interactions between pairs of actors (e.g., users in online social networks, researchers in collaboration networks). In this paper, we approach this problem by leveraging network dynamics: to this end, we propose a supervised learning approach which exploits features computed by time-aware forecasts of topological measures calculated between node pairs. Moreover, since real social networks are generally composed of weakly connected modules, we instantiate the interaction prediction problem in two disjoint applicative scenarios: intra-community and inter-community link prediction. Experimental results on real time-stamped networks show that our approach is able to reach high accuracy. Furthermore, we analyze the performance of our methodology when varying the types of features, community discovery algorithms and forecast methods.

Journal Article•DOI•

A computational approach for the experimental study of EU case law: analysis and implementation

Nicola Lettieri, Antonio Altamura¹, Armando Faggiano¹, Delfina Malandrino¹•Institutions (1)

University of Salerno¹

02 Aug 2016-Social Network Analysis and Mining

TL;DR: This paper presents an ongoing research project aiming to explore how approaches and techniques at the boundaries between Network analysis, Legal informatics and Visualization can help shed new light on legal matters.

Abstract: In recent years, the encounter between network analysis (NA) and Law has raised new challenges at both the scientific and the application level. On the one hand, it is fostering new computationally inspired approaches to visualize, retrieve, manipulate and analyze legal information; on the other hand, it is inspiring the creation of innovative tools allowing legal scholars without technical skills to start dealing with NA and visual analytics on their own. This paper presents an ongoing research project aiming to explore how approaches and techniques at the boundaries between Network analysis, Legal informatics and Visualization can help shed new light on legal matters. The attention is focused on EuCaseNet, an online toolkit allowing legal scholars to apply NA and visual analytics techniques to the entire corpus of EU case law.

Journal Article•DOI•

Fast rumor source identification via random walks

Alankar Jain¹, Vivek S. Borkar², Dinesh Garg³•Institutions (3)

IBM¹, Indian Institute of Technology Bombay², Indian Institute of Technology Gandhinagar³

22 Aug 2016-Social Network Analysis and Mining

TL;DR: This work proposes a heuristic based on the hitting time statistics of a surrogate random walk process that can be used to approximate the maximum likelihood estimator of the rumor source.

Abstract: We consider the problem of inferring the source of a rumor in a given large network. We assume that the rumor propagates in the network through a discrete time susceptible-infected model. Input to our problem includes information regarding the entire network, an infected subgraph of the network observed at some known time instant, and the probability of one-hop rumor propagation. We propose a heuristic based on the hitting time statistics of a surrogate random walk process that can be used to approximate the maximum likelihood estimator of the rumor source. We test the performance of our heuristic on some standard synthetic and real-world network datasets and show that it outperforms many centrality-based heuristics that have traditionally been used in rumor source inference literature. Through time complexity analysis and extensive experimental evaluation, we demonstrate that our heuristic is computationally efficient for large, undirected and dense non-tree networks.
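
The propagation model assumed in this abstract, a discrete-time susceptible-infected process with a one-hop propagation probability, can be simulated in a few lines. The sketch below covers only the forward spreading process that produces the observed infected subgraph. It does not implement the random-walk heuristic for source inference. It assumes that in each time step every infected node independently infects each susceptible neighbor with probability `p`; the function name `simulate_si` and its parameters are hypothetical.

```python
import random


def simulate_si(adj, source, p, steps, seed=0):
    """Simulate a discrete-time susceptible-infected (SI) rumor spread.

    adj: dict mapping each node to an iterable of its neighbors.
    source: node where the rumor starts.
    p: probability that an infected node infects a given susceptible
       neighbor in one time step.
    steps: number of time steps to simulate.
    Returns the set of infected nodes after the last step.
    """
    rng = random.Random(seed)
    infected = {source}
    for _ in range(steps):
        newly = set()
        for u in infected:
            for v in adj[u]:
                if v not in infected and rng.random() < p:
                    newly.add(v)
        # Nodes infected during this step start spreading in the next one.
        infected |= newly
    return infected


# Path graph 0 - 1 - 2 - 3 with the rumor starting at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
certain = simulate_si(path, source=0, p=1.0, steps=2)
blocked = simulate_si(path, source=0, p=0.0, steps=5)
```

With `p=1.0` the rumor advances exactly one hop per step, which gives a deterministic check of the update rule. For values in between, the outcome depends on the random seed.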
