Journal

arXiv: Data Structures and Algorithms

About: arXiv: Data Structures and Algorithms is an academic journal. It publishes mainly in the areas of time complexity and approximation algorithms. Over its lifetime, it has published 11,914 publications, which have received 96,051 citations.

Topics: Time complexity, Approximation algorithm, Upper and lower bounds, Vertex (geometry), Parameterized complexity ...

Papers

Posted Content

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

Jure Leskovec¹, Kevin J. Lang, Anirban Dasgupta, Michael W. Mahoney¹

Stanford University¹

08 Oct 2008-arXiv: Data Structures and Algorithms

TL;DR: In this article, the authors employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities.

Abstract: A large body of work has been devoted to defining and identifying clusters or communities in social and information networks. We explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. We employ approximation algorithms for the graph partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the network community profile plot, which characterizes the "best" possible community--according to the conductance measure--over a wide range of size scales. We study over 100 large real-world social and information networks. Our results suggest a significantly more refined picture of community structure in large networks than has been appreciated previously. In particular, we observe tight communities that are barely connected to the rest of the network at very small size scales; and communities of larger size scales gradually "blend into" the expander-like core of the network and thus become less "community-like." This behavior is not explained, even at a qualitative level, by any of the commonly-used network generation models. Moreover, it is exactly the opposite of what one would expect based on intuition from expander graphs, low-dimensional or manifold-like graphs, and from small social networks that have served as testbeds of community detection algorithms. We have found that a generative graph model, in which new edges are added via an iterative "forest fire" burning process, is able to produce graphs exhibiting a network community profile plot similar to what we observe in our network datasets.
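The abstract above scores candidate communities by conductance but does not spell the measure out. As a minimal sketch, assuming the standard definition (edges leaving the set divided by the smaller of the two sides' total degree) and an unweighted, undirected graph, it could be computed as follows. The helper names `build_adjacency` and `conductance` and the toy graph are illustrative assumptions, not taken from the paper.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, subset):
    """Conductance of `subset`: cut(S, V \\ S) / min(vol(S), vol(V \\ S)).

    Assumes the standard definition, where vol() is the sum of node degrees.
    Lower values mean the set is more weakly connected to the rest of the graph.
    """
    s = set(subset)
    # Each edge crossing the boundary is counted once, from its endpoint inside S.
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)
    vol_rest = sum(len(adj[u]) for u in adj if u not in s)
    denom = min(vol_s, vol_rest)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return cut / denom


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_adj = build_adjacency(toy_edges)

# One crossing edge, volume 7 on each side, so the value is 1/7.
print(conductance(toy_adj, {0, 1, 2}))
```

A community profile in the sense of the abstract would then record, for each size k, the lowest conductance found among sets of k nodes; finding those sets is the hard part and is not attempted in this sketch.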

1,555 citations

Posted Content

A Tutorial on Spectral Clustering

Ulrike von Luxburg¹

Max Planck Society¹

01 Nov 2007-arXiv: Data Structures and Algorithms

TL;DR: This tutorial describes different graph Laplacians and their basic properties, presents the most common spectral clustering algorithms, and derives those algorithms from scratch by several different approaches.

Abstract: In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance spectral clustering appears slightly mysterious, and it is not obvious why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
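To make the idea of clustering with a graph Laplacian concrete, here is a minimal sketch of the simplest case: a two-way split by the sign of the eigenvector belonging to the second-smallest eigenvalue of the unnormalized Laplacian L = D - W (the Fiedler vector). This is a simplification and not necessarily one of the algorithms the tutorial presents; general spectral clustering typically uses several eigenvectors followed by k-means. To stay dependency-free, the sketch approximates the eigenvector with shifted power iteration, whereas in practice a linear algebra library's eigensolver would be used. The function name `fiedler_bisection` and the toy graph are illustrative assumptions.

```python
def fiedler_bisection(adj, iters=500):
    """Split a graph in two using the sign of an approximate Fiedler vector.

    adj: dict mapping node -> set of neighbours. Assumes an undirected,
    unweighted, connected graph with at least two nodes.
    """
    nodes = sorted(adj)
    n = len(nodes)
    index = {u: i for i, u in enumerate(nodes)}
    deg = [len(adj[u]) for u in nodes]
    # 2 * max degree is an upper bound on the largest eigenvalue of L = D - W,
    # so shift * I - L has non-negative eigenvalues and reverses their order.
    shift = 2 * max(deg)

    # Deterministic start vector; a random start would also be reasonable.
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        # Project out the constant vector (the eigenvector of L for eigenvalue 0).
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        # y = (shift * I - L) x, using (L x)_i = deg_i * x_i - sum of x_j over neighbours j.
        y = [
            (shift - deg[i]) * x[i] + sum(x[index[v]] for v in adj[u])
            for i, u in enumerate(nodes)
        ]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]

    # Nodes with the same sign in the Fiedler vector land in the same cluster.
    return {u: int(x[index[u]] >= 0) for u in nodes}


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
toy_adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(fiedler_bisection(toy_adj))
```

Which cluster receives label 0 or 1 depends on the sign of the computed eigenvector, so only the grouping is meaningful. Power iteration can converge slowly when the second and third Laplacian eigenvalues are close, which is one reason a library eigensolver is usually preferable.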

1,160 citations

Posted Content

Empirical Comparison of Algorithms for Network Community Detection

Jure Leskovec¹, Kevin J. Lang², Michael W. Mahoney¹

Stanford University¹, Yahoo!²

20 Apr 2010-arXiv: Data Structures and Algorithms

TL;DR: In this paper, the authors explore a range of network community detection methods in order to compare them and to understand their relative performance and the systematic biases in the clusters they identify. They also examine several classes of approximation algorithms that aim to optimize common community objective functions.

Abstract: Detecting clusters or communities in large real-world graphs such as large social or information networks is a problem of considerable interest. In practice, one typically chooses an objective function that captures the intuition of a network cluster as a set of nodes with better internal connectivity than external connectivity, and then one applies approximation algorithms or heuristics to extract sets of nodes that are related to the objective function and that "look like" good communities for the application of interest. In this paper, we explore a range of network community detection methods in order to compare them and to understand their relative performance and the systematic biases in the clusters they identify. We evaluate several common objective functions that are used to formalize the notion of a network community, and we examine several different classes of approximation algorithms that aim to optimize such objective functions. In addition, rather than simply fixing an objective and asking for an approximation to the best cluster of any size, we consider a size-resolved version of the optimization problem. Considering community quality as a function of its size provides a much finer lens with which to examine community detection algorithms, since objective functions and approximation algorithms often have non-obvious size-dependent behavior.

950 citations

Proceedings Article

Powers of Tensors and Fast Matrix Multiplication

François Le Gall¹

University of Tokyo¹

30 Jan 2014-arXiv: Data Structures and Algorithms

TL;DR: In this paper, the authors present a method to analyze the powers of a given trilinear form (a special kind of algebraic construction also called a tensor) and obtain upper bounds on the asymptotic complexity of matrix multiplication.

Abstract: This paper presents a method to analyze the powers of a given trilinear form (a special kind of algebraic construction also called a tensor) and obtain upper bounds on the asymptotic complexity of matrix multiplication. Compared with existing approaches, this method is based on convex optimization, and thus has polynomial-time complexity. As an application, we use this method to study powers of the construction given by Coppersmith and Winograd [Journal of Symbolic Computation, 1990] and obtain the upper bound $\omega<2.3728639$ on the exponent of square matrix multiplication, which slightly improves the best known upper bound.

940 citations

Posted Content

Inferring Networks of Diffusion and Influence

Manuel Gomez-Rodriguez¹, Jure Leskovec², Andreas Krause³

Max Planck Society¹, Stanford University², California Institute of Technology³

01 Jun 2010-arXiv: Data Structures and Algorithms

TL;DR: This work develops a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate, using an efficient approximation algorithm that scales to large datasets and finds provably near-optimal networks.

Abstract: Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected with a virus or adopt the information, observing individual transmissions (i.e., who infects whom, or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and finds provably near-optimal networks. We demonstrate the effectiveness of our approach by tracing information diffusion in a set of 170 million blogs and news articles over a one-year period to infer how information flows through the online media space. We find that the diffusion network of news for the top 1,000 media sites and blogs tends to have a core-periphery structure, with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence, with more general news media sites acting as connectors between them.

915 citations

Network Information

Related Journals (5)

Journal	Papers	Citations	Relatedness
SIAM Journal on Computing	3.5K	327.5K	93%
Information Processing Letters	7.7K	189.7K	90%
Theoretical Computer Science	12.4K	368.9K	90%
Journal of Computer and System Sciences	2.7K	161K	88%
Discrete Applied Mathematics	9.1K	178.6K	86%

Performance

Metrics

Papers: 11,914
Citations: 116,543

No. of papers from the Journal in previous years
Year	Papers
2021	1,096
2020	1,469
2019	1,343
2018	1,202
2017	1,051
2016	1,013