Author

Leman Akoglu

Other affiliations: Stony Brook University, IBM, University of California, Davis

Bio: Leman Akoglu is an academic researcher at Carnegie Mellon University. The author has contributed to research on topics including anomaly detection and computer science. The author has an h-index of 39 and has co-authored 139 publications receiving 6755 citations. Previous affiliations of Leman Akoglu include Stony Brook University and IBM.

[Chart: papers published on a yearly basis, 2008 to 2023]

Papers

Journal Article

Graph based anomaly detection and description: a survey

Leman Akoglu¹, Hanghang Tong², Danai Koutra³

Stony Brook University¹, City University of New York², Carnegie Mellon University³

01 May 2015-Data Mining and Knowledge Discovery

TL;DR: This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs, and gives a general framework for the algorithms categorized under various settings.

Abstract: Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised versus (semi-)supervised approaches, for static versus dynamic graphs, for attributed versus plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the `why', of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.

998 citations

Posted Content

Graph-based Anomaly Detection and Description: A Survey

Leman Akoglu¹, Hanghang Tong², Danai Koutra³

Stony Brook University¹, City University of New York², Carnegie Mellon University³

18 Apr 2014-arXiv: Social and Information Networks

TL;DR: A comprehensive survey of the state-of-the-art methods for anomaly detection in data represented as graphs can be found in this article, where the authors highlight the effectiveness, scalability, generality, and robustness aspects of the methods.

Abstract: Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we provide a comprehensive exploration of both data mining and machine learning algorithms for these detection tasks. We give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the `why', of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.

703 citations

Book Chapter

OddBall: spotting anomalies in weighted graphs

Leman Akoglu¹, Mary McGlohon¹, Christos Faloutsos¹

Carnegie Mellon University¹

21 Jun 2010

TL;DR: This work discovers several new rules (power laws) in density, weights, ranks and eigenvalues that seem to govern the so-called “neighborhood sub-graphs” and shows how to use these rules for anomaly detection.

Abstract: Given a large, weighted graph, how can we find anomalies? Which rules should be violated, before we label a node as an anomaly? We propose the OddBall algorithm to find such nodes. The contributions are the following: (a) we discover several new rules (power laws) in density, weights, ranks and eigenvalues that seem to govern the so-called “neighborhood sub-graphs” and we show how to use these rules for anomaly detection; (b) we carefully choose features, and design OddBall, so that it is scalable and it can work unsupervised (no user-defined constants) and (c) we report experiments on many real graphs with up to 1.6 million nodes, where OddBall indeed spots unusual nodes that agree with intuition.
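
The idea of checking "neighborhood sub-graphs" against a power law can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a neighborhood sub-graph is a node's 1-step neighborhood (the node, its neighbors, and the edges among them), that the density rule relates the number of edges to the number of nodes in that sub-graph, and that a plain absolute log-residual is an acceptable stand-in for an anomaly score. The function names are hypothetical.

```python
import math
from collections import defaultdict


def egonet_features(edges):
    """Return {node: (n_nodes, n_edges)} for each node's 1-step neighborhood.

    Assumption: the graph is undirected and unweighted; self-loops are ignored.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    feats = {}
    for node, nbrs in adj.items():
        members = nbrs | {node}
        # Every edge inside the neighborhood is seen from both endpoints,
        # so halve the total to count it once.
        n_edges = sum(len(adj[m] & members) for m in members) // 2
        feats[node] = (len(members), n_edges)
    return feats


def fit_power_law(feats):
    """Least-squares fit of log(E) = a * log(N) + b over all neighborhoods."""
    xs = [math.log(n) for n, _ in feats.values()]
    ys = [math.log(e) for _, e in feats.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x


def deviation_scores(feats, a, b):
    """Hypothetical score: distance of each node from the fitted line in log space."""
    return {
        node: abs(math.log(e) - (a * math.log(n) + b))
        for node, (n, e) in feats.items()
    }
```

Nodes with the largest scores are those whose neighborhoods are unusually sparse (star-like) or unusually dense (clique-like) relative to the fitted trend. The scoring function used in the paper may differ from this simplified residual.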

555 citations

Proceedings Article

Collective Opinion Spam Detection: Bridging Review Networks and Metadata

Shebuti Rayana¹, Leman Akoglu¹

Stony Brook University¹

10 Aug 2015

TL;DR: This work proposes a new holistic approach called SPEAGLE that utilizes clues from all metadata (text, timestamp, rating) as well as relational data (network), and harnesses them collectively under a unified framework to spot suspicious users and reviews, as well as products targeted by spam.

Abstract: Online reviews capture the testimonials of "real" people and help shape the decisions of other consumers. Due to the financial gains associated with positive reviews, however, opinion spam has become a widespread problem, with spam reviewers, often paid, writing fake reviews to unjustly promote or demote certain products or businesses. Existing approaches to opinion spam have successfully but separately utilized linguistic clues of deception, behavioral footprints, or relational ties between agents in a review system. In this work, we propose a new holistic approach called SPEAGLE that utilizes clues from all metadata (text, timestamp, rating) as well as relational data (network), and harnesses them collectively under a unified framework to spot suspicious users and reviews, as well as products targeted by spam. Moreover, our method can efficiently and seamlessly integrate semi-supervision, i.e., a (small) set of labels if available, without requiring any training or changes in its underlying algorithm. We demonstrate the effectiveness and scalability of SPEAGLE on three real-world review datasets from Yelp.com with filtered (spam) and recommended (non-spam) reviews, where it significantly outperforms several baselines and state-of-the-art methods. To the best of our knowledge, this is the largest scale quantitative evaluation performed to date for the opinion spam problem.

451 citations

Proceedings Article

RolX: structural role extraction & mining in large graphs

Keith Henderson¹, Brian Gallagher¹, Tina Eliassi-Rad², Hanghang Tong³, Sugato Basu⁴, Leman Akoglu⁵, Danai Koutra⁵, Christos Faloutsos⁵, Lei Li⁵

Lawrence Livermore National Laboratory¹, Rutgers University², IBM³, Google⁴, Carnegie Mellon University⁵

12 Aug 2012

TL;DR: This paper proposes RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting structural roles from general network data, and compares network role discovery with network community discovery.

Abstract: Given a network, intuitively two nodes belong to the same role if they have similar structural behavior. Roles should be automatically determined from the data, and could be, for example, "clique-members," "periphery-nodes," etc. Roles enable numerous novel and useful network-mining tasks, such as sense-making, searching for similar nodes, and node classification. This paper addresses the question: Given a graph, how can we automatically discover roles for nodes? We propose RolX (Role eXtraction), a scalable (linear in the number of edges), unsupervised learning approach for automatically extracting structural roles from general network data. We demonstrate the effectiveness of RolX on several network-mining tasks: from exploratory data analysis to network transfer learning. Moreover, we compare network role discovery with network community discovery. We highlight fundamental differences between the two (e.g., roles generalize across disconnected networks, communities do not); and show that the two approaches are complementary in nature.

447 citations

Cited by

Proceedings Article

DeepWalk: online learning of social representations

Bryan Perozzi¹, Rami Al-Rfou¹, Steven Skiena¹

Stony Brook University¹

24 Aug 2014

TL;DR: DeepWalk uses local information obtained from truncated random walks to learn latent representations of vertices by treating walks as the equivalent of sentences. The representations encode social relations in a continuous vector space that is easily exploited by statistical models.

Abstract: We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real world applications such as network classification, and anomaly detection.
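
The "walks as sentences" idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DeepWalk implementation: the graph is an adjacency dictionary mapping each node to a list of neighbors, and the function name and parameters (`walks_per_node`, `walk_length`, `seed`) are hypothetical. Only the walk-generation step is shown; training a language model on the resulting sequences is omitted.

```python
import random


def truncated_random_walks(adj, walks_per_node=2, walk_length=5, seed=0):
    """Generate fixed-length (truncated) uniform random walks from every node.

    Each walk is a list of node ids and plays the role of a "sentence"
    that a language model could later be trained on.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks
```

Because each walk depends only on the local neighborhood of its current node, walks from different start nodes can be generated independently of one another.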

8,117 citations

Proceedings Article

node2vec: Scalable Feature Learning for Networks

Aditya Grover¹, Jure Leskovec¹

Stanford University¹

13 Aug 2016

TL;DR: node2vec learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes, using a biased random walk procedure to explore diverse neighborhoods.

Abstract: Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
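
A biased random walk of this kind can be sketched as follows. The abstract does not specify how the bias is defined, so this sketch makes an assumption: it uses the second-order scheme commonly described for node2vec, with a return parameter `p` and an in-out parameter `q`. Stepping back to the previous node gets weight 1/p, stepping to a node that is also a neighbor of the previous node gets weight 1, and stepping further away gets weight 1/q. The function names are hypothetical and the graph is an adjacency dictionary mapping each node to a set of neighbors.

```python
import random


def transition_weights(adj, prev, cur, p, q):
    """Unnormalized weights for moving from `cur`, given the walk came from `prev`."""
    weights = {}
    for nxt in adj[cur]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # return to the previous node
        elif nxt in adj[prev]:
            weights[nxt] = 1.0      # stay close: shared neighbor of prev
        else:
            weights[nxt] = 1.0 / q  # move outward, away from prev
    return weights


def biased_walk(adj, start, walk_length, p=1.0, q=1.0, rng=None):
    """Sample one walk of at most `walk_length` nodes using the weights above."""
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        if not adj[cur]:
            break
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(sorted(adj[cur])))
            continue
        weights = transition_weights(adj, walk[-2], cur, p, q)
        candidates = list(weights)
        walk.append(rng.choices(candidates, weights=[weights[c] for c in candidates], k=1)[0])
    return walk
```

With `p = q = 1` every neighbor has the same weight and the walk is uniform. Under this weighting, a small `q` favors moving outward and a large `q` keeps the walk near its starting region.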

7,072 citations

Journal Article

Social Media and Fake News in the 2016 Election

Hunt Allcott¹, Matthew Gentzkow²

New York University¹, Stanford University²

19 Jan 2017-Journal of Economic Perspectives

TL;DR: The authors found that people are much more likely to believe stories that favor their preferred candidate, especially if they have ideologically segregated social media networks, and that the average American adult saw on the order of one or perhaps several fake news stories in the months around the 2016 U.S. presidential election, with just over half of those who recalled seeing them believing them.

Abstract: Following the 2016 U.S. presidential election, many have expressed concern about the effects of false stories (“fake news”), circulated largely through social media. We discuss the economics of fake news and present new data on its consumption prior to the election. Drawing on web browsing data, archives of fact-checking websites, and results from a new online survey, we find: (i) social media was an important but not dominant source of election news, with 14 percent of Americans calling social media their “most important” source; (ii) of the known false news stories that appeared in the three months before the election, those favoring Trump were shared a total of 30 million times on Facebook, while those favoring Clinton were shared 8 million times; (iii) the average American adult saw on the order of one or perhaps several fake news stories in the months around the election, with just over half of those who recalled seeing them believing them; and (iv) people are much more likely to believe stories that favor their preferred candidate, especially if they have ideologically segregated social media networks.

3,959 citations

Social Network Analysis

Tom A. B. Snijders

01 Jan 2012

3,692 citations

On robust estimation of the location parameter

Frederick R. Forst

01 Jan 1980

3,652 citations