Proceedings Article

Detection of fake followers using feature ratio in self-organizing maps

TL;DR: A measure based on a computed feature ratio value to effectively isolate fake follower accounts from genuine users is proposed and shown to be an efficient measure to cluster fake followers.
Abstract: Detection of fake followers has been a challenging task for the social media research community. Fake followers adopt varied methods to accomplish their goal. Identifying and modeling the behavior of fake followers is an active research field. In this paper we propose a measure based on a computed feature ratio value to effectively isolate fake follower accounts from genuine users. A Twitter dataset comprising fake follower information has been used for the analysis presented in this paper. The artificial neural network known as the Self-Organizing Map has been used for training and analysis of the Twitter fake follower dataset. The analysis demonstrates that the proposed metric is an efficient measure to cluster fake followers.
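To make the idea concrete, the following is a minimal sketch of clustering accounts with a self-organizing map on a single ratio feature. The abstract does not define the feature ratio, so the friends-per-follower ratio used here, the helper names (feature_ratio, to_features, train_som, best_matching_unit), the grid size, the decay schedules, and the toy data are all assumptions for illustration, not the authors' method.

```python
import math
import random

def feature_ratio(friends, followers):
    # Hypothetical ratio (an assumption, not the paper's definition):
    # accounts followed per follower; +1 avoids division by zero.
    return friends / (followers + 1.0)

def to_features(accounts):
    """Map (friends, followers) pairs to 1-D vectors scaled into [0, 1]."""
    logs = [math.log1p(feature_ratio(fr, fo)) for fr, fo in accounts]
    top = max(logs) or 1.0
    return [[v / top] for v in logs]

def best_matching_unit(weights, x):
    """Return (row, col) of the node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(data, rows=3, cols=3, epochs=40, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM with linearly decaying rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.01     # neighborhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])  # pull node toward sample
            step += 1
    return weights

# Toy (friends, followers) pairs, invented for illustration only.
genuine = [(200, 500), (150, 300), (300, 1000)]
fake = [(2000, 10), (1500, 5), (1800, 20)]
features = to_features(genuine + fake)
som = train_som(features)
genuine_nodes = {best_matching_unit(som, x) for x in features[:3]}
fake_nodes = {best_matching_unit(som, x) for x in features[3:]}
```

On data this well separated, the two groups would be expected to land on different map nodes; a real dataset would need more features, a larger grid, and inspection of the trained map before drawing any conclusion.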
Citations
Proceedings Article
23 Jan 2023
TL;DR: A novel algorithm called SVM-NN is proposed to detect phony Instagram accounts; it accurately classifies about 89% of the users in the classification dataset.
Abstract: Online social networks (OSNs) are more prevalent than ever and have become deeply ingrained in people’s social lives. Through online social networks, people chat with one another, share data, organize events, and even run their own online companies. Because of their explosive growth and the vast amount of personal data they collect from their users, OSNs have attracted attackers and imposters who steal personal information, carry out destructive activities, and publish fake information. In response, researchers have started to look into reliable strategies for identifying fake accounts and questionable activities using account attributes. However, several of the account variables employed have no impact at all or have a negative effect on the results. Furthermore, employing independent classification algorithms does not necessarily yield positive outcomes. To detect phony Instagram accounts effectively, a novel algorithm called SVM-NN is proposed in this research. Four feature evaluation and data reduction procedures were used. The support vector machine, the neural network, and the newer SVM-NN technique were used to determine whether the chosen accounts were real or spam. SVM-NN outperforms SVM and NN, using fewer features while still accurately classifying about 89% of the users in our classification dataset.
References
Proceedings Article
13 Aug 2016
TL;DR: node2vec learns a mapping of nodes to a low-dimensional feature space that maximizes the likelihood of preserving network neighborhoods of nodes, exploring those neighborhoods with a biased random walk procedure.
Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
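The biased random walk mentioned in the abstract can be sketched as follows. The abstract does not spell out the bias; the return parameter p and in-out parameter q below follow the second-order walk described in the node2vec paper itself, while the function name biased_walk, the adjacency-dict representation, and the toy graph are assumptions made for this sketch.

```python
import random

def biased_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Second-order random walk over an undirected graph.

    graph maps each node to the set of its neighbors. From the current node,
    a candidate next node x is weighted relative to the previous node t:
      1/p if x == t (step back), 1 if x is also a neighbor of t,
      1/q otherwise (step outward).
    """
    if rng is None:
        rng = random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(graph[cur])          # sorted for reproducibility
        if not nbrs:
            break                          # dead end: stop early
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is unbiased
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)
            elif x in graph[prev]:
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

# Toy graph, invented for illustration.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
toy_walk = biased_walk(toy_graph, "a", 10, p=0.5, q=2.0, rng=random.Random(42))
```

With q greater than 1 the walk tends to stay near its starting region, and with q less than 1 it tends to move outward; the walks would then feed a skip-gram style embedding step, which this sketch omits.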

7,072 citations

Journal Article
01 Dec 2015
TL;DR: A novel Class A classifier general enough to thwart overfitting, lightweight thanks to the usage of the less costly features, and still able to correctly classify more than 95% of the accounts of the original training set.
Abstract: Fake followers are those Twitter accounts specifically created to inflate the number of followers of a target account. Fake followers are dangerous for the social platform and beyond, since they may alter concepts like popularity and influence in the Twittersphere, hence impacting the economy, politics, and society. In this paper, we contribute along different dimensions. First, we review some of the most relevant existing features and rules (proposed by Academia and Media) for anomalous Twitter accounts detection. Second, we create a baseline dataset of verified human and fake follower accounts. Such baseline dataset is publicly available to the scientific community. Then, we exploit the baseline dataset to train a set of machine-learning classifiers built over the reviewed rules and features. Our results show that most of the rules proposed by Media provide unsatisfactory performance in revealing fake followers, while features proposed in the past by Academia for spam detection provide good results. Building on the most promising features, we revise the classifiers both in terms of reduction of overfitting and cost for gathering the data needed to compute the features. The final result is a novel Class A classifier, general enough to thwart overfitting, lightweight thanks to the usage of the less costly features, and still able to correctly classify more than 95% of the accounts of the original training set. We ultimately perform an information fusion-based sensitivity analysis, to assess the global sensitivity of each of the features employed by the classifier. The findings reported in this paper, other than being supported by a thorough experimental methodology and interesting on their own, also pave the way for further investigation on the novel issue of fake Twitter followers.

340 citations


"Detection of fake followers using f..." refers background in this paper

  • ...The Research work put forth by the MIB team also presented interesting research [4] and survey [5] outcomes....


Proceedings Article
13 Aug 2016
TL;DR: FRAUDAR is proposed, an algorithm that is camouflage-resistant, provides upper bounds on the effectiveness of fraudsters, and is effective in real-world data.
Abstract: Given a bipartite graph of users and the products that they review, or followers and followees, how can we detect fake reviews or follows? Existing fraud detection methods (spectral, etc.) try to identify dense subgraphs of nodes that are sparsely connected to the remaining graph. Fraudsters can evade these methods using camouflage, by adding reviews or follows with honest targets so that they look "normal". Even worse, some fraudsters use hijacked accounts from honest users, and then the camouflage is indeed organic. Our focus is to spot fraudsters in the presence of camouflage or hijacked accounts. We propose FRAUDAR, an algorithm that (a) is camouflage-resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. Experimental results under various attacks show that FRAUDAR outperforms the top competitor in accuracy of detecting both camouflaged and non-camouflaged fraud. Additionally, in real-world experiments with a Twitter follower-followee graph of 1.47 billion edges, FRAUDAR successfully detected a subgraph of more than 4000 detected accounts, of which a majority had tweets showing that they used follower-buying services.
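As background for the dense-subgraph idea the abstract refers to, here is a simplified sketch of greedy peeling: repeatedly remove the lowest-degree node and remember the densest intermediate subgraph. This is not the FRAUDAR algorithm, whose camouflage-resistant metric is not described in the abstract; the function name densest_subgraph_peel, the plain edges-per-node density, and the toy follower graph are assumptions for illustration.

```python
def densest_subgraph_peel(edges):
    """Greedy peeling for a dense subgraph.

    Repeatedly removes the minimum-degree node and returns the node set with
    the highest edges-per-node density seen along the way.
    Assumes an undirected graph without self-loops.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = set(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_density, best_nodes = 0.0, set(nodes)
    while nodes:
        density = n_edges / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)
        # Lowest degree first; ties broken by name for reproducibility.
        victim = min(nodes, key=lambda n: (len(adj[n]), str(n)))
        for nb in adj[victim]:
            adj[nb].discard(victim)
        n_edges -= len(adj[victim])
        nodes.remove(victim)
        del adj[victim]
    return best_nodes, best_density

# Toy follower graph, invented for illustration: four suspicious followers
# (f1..f4) all follow the same three targets (t1..t3); honest users h1..h3
# follow sparsely, and f1 adds one "camouflage" follow of the honest target a.
block = [(f, t) for f in ("f1", "f2", "f3", "f4") for t in ("t1", "t2", "t3")]
sparse = [("h1", "a"), ("h2", "b"), ("h3", "a"), ("f1", "a")]
dense_nodes, dense_score = densest_subgraph_peel(block + sparse)
```

In this toy graph the peeling strips the sparse honest nodes first and keeps the follower block. Plain average degree is easy to game with camouflage edges, which is the weakness the abstract says FRAUDAR is designed to resist.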

273 citations


"Detection of fake followers using f..." refers background in this paper

  • ...The first paper we would like to highlight here is FRAUDAR [1], which was presented by Christos Faloutsos and his team of research students at KDD 2016 and was awarded the Best Paper....


Proceedings Article
24 Aug 2014
TL;DR: This work proposes CatchSync, a fast and effective method that exploits two of the tell-tale signs left in graphs by fraudsters. It introduces novel measures to quantify both concepts ("synchronicity" and "normality") and a parameter-free algorithm that works on the resulting synchronicity-normality plots.
Abstract: Given a directed graph of millions of nodes, how can we automatically spot anomalous, suspicious nodes, judging only from their connectivity patterns? Suspicious graph patterns show up in many applications, from Twitter users who buy fake followers, manipulating the social network, to botnet members performing distributed denial of service attacks, disturbing the network traffic graph. We propose a fast and effective method, CatchSync, which exploits two of the tell-tale signs left in graphs by fraudsters: (a) synchronized behavior: suspicious nodes have extremely similar behavior patterns, because they are often required to perform some task together (such as follow the same user); and (b) rare behavior: their connectivity patterns are very different from the majority. We introduce novel measures to quantify both concepts ("synchronicity" and "normality") and we propose a parameter-free algorithm that works on the resulting synchronicity-normality plots. Thanks to careful design, CatchSync has the following desirable properties: (a) it is scalable to large datasets, being linear on the graph size; (b) it is parameter free; and (c) it is side-information-oblivious: it can operate using only the topology, without needing labeled data or timing information, while still being capable of using side information, if available. We applied CatchSync on two large, real datasets (a 1-billion-edge Twitter social graph and a 3-billion-edge Tencent Weibo social graph) and on several synthetic ones; CatchSync consistently outperforms existing competitors, both in detection accuracy (by 36% on Twitter and 20% on Tencent Weibo) and in speed.

169 citations


"Detection of fake followers using f..." refers background in this paper

  • ...Recent Academic publications [6,7,8,9] have also presented interesting research outcomes that motivate further research in the field....


  • ...In [7] the behavior of suspicious followers is modeled and categorized as synchronized and abnormal....


Proceedings Article
27 Jul 2015
TL;DR: In this study, analysis of 62 million publicly available Twitter user profiles was conducted and a strategy to retroactively identify automatically generated fake profiles was established using a pattern-matching algorithm on screen-names with an analysis of tweet update times to allow for time-efficient detection of fake profiles in OSNs.
Abstract: In Online Social Networks (OSNs), the audience size commanded by an organization or an individual is a critical measure of that entity's popularity. This measure has important economic and/or political implications. Organizations can use information about their audience, such as age and location, to tailor their products or their message appropriately. But such tailoring can be biased by the presence of fake profiles on these networks. In this study, analysis of 62 million publicly available Twitter user profiles was conducted and a strategy to retroactively identify automatically generated fake profiles was established. Using a pattern-matching algorithm on screen-names with an analysis of tweet update times, a highly reliable subset of fake user accounts was identified. Analysis of profile creation times and URLs of these fake accounts revealed distinct behavior of the fake users relative to a ground truth data set. The combination of this scheme with established social graph analysis will allow for time-efficient detection of fake profiles in OSNs.

65 citations


"Detection of fake followers using f..." refers background in this paper

  • ...Recent Academic publications [6,7,8,9] have also presented interesting research outcomes that motivate further research in the field....
