Author

Alessandro Panconesi

Bio: Alessandro Panconesi is an academic researcher from Sapienza University of Rome. The author has contributed to research in topics: Distributed algorithm & Approximation algorithm. The author has an h-index of 40 and has co-authored 133 publications receiving 6,270 citations. Previous affiliations of Alessandro Panconesi include Cornell University & Centrum Wiskunde & Informatica.


Papers
Book
19 Oct 2009
TL;DR: In this book, the authors present a coherent and unified treatment of probabilistic techniques for obtaining high-probability estimates on the performance of randomized algorithms, covering the basic tool kit from the Chernoff-Hoeffding (CH) bounds to more sophisticated techniques like martingales and isoperimetric inequalities, as well as some recent developments like Talagrand's inequality, transportation cost inequalities, and log-Sobolev inequalities.
Abstract: Randomized algorithms have become a central part of the algorithms curriculum based on their increasingly widespread use in modern applications. This book presents a coherent and unified treatment of probabilistic techniques for obtaining high-probability estimates on the performance of randomized algorithms. It covers the basic tool kit from the Chernoff-Hoeffding (CH) bounds to more sophisticated techniques like martingales and isoperimetric inequalities, as well as some recent developments like Talagrand's inequality, transportation cost inequalities, and log-Sobolev inequalities. Along the way, variations on the basic theme are examined, such as CH bounds in dependent settings. The authors emphasize comparative study of the different methods, highlighting respective strengths and weaknesses in concrete example applications. The exposition is tailored to discrete settings sufficient for the analysis of algorithms, avoiding unnecessary measure-theoretic details, thus making the book accessible to computer scientists as well as probabilists and discrete mathematicians.
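For reference, one standard multiplicative form of the Chernoff-Hoeffding bound treated in such texts is the following (a common textbook statement, not necessarily the book's exact formulation): if $X_1, \dots, X_n$ are independent random variables with values in $[0,1]$ and $\mu = \mathbb{E}[\sum_i X_i]$, then for any $0 < \delta \le 1$,

$$\Pr\Big[\sum_{i=1}^{n} X_i \ge (1+\delta)\mu\Big] \le \exp\big(-\delta^2 \mu / 3\big).$$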

1,028 citations

Proceedings ArticleDOI
28 Jun 2009
TL;DR: This work proposes simple combinatorial formulations that encapsulate efficient compressibility of graphs and shows that some of the problems are NP-hard yet admit effective heuristics, some of which can exploit properties of social networks such as link reciprocity.
Abstract: Motivated by structural properties of the Web graph that support efficient data structures for in-memory adjacency queries, we study the extent to which a large network can be compressed. Boldi and Vigna (WWW 2004) showed that Web graphs can be compressed down to three bits of storage per edge; we study the compressibility of social networks, where again adjacency queries are a fundamental primitive. To this end, we propose simple combinatorial formulations that encapsulate efficient compressibility of graphs. We show that some of the problems are NP-hard yet admit effective heuristics, some of which can exploit properties of social networks such as link reciprocity. Our extensive experiments show that social networks and the Web graph exhibit vastly different compressibility characteristics.
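To make the compression objective concrete, the following minimal Python sketch shows one classical ingredient of such encoders: gap encoding of sorted adjacency lists with a variable-length byte code. It illustrates the kind of redundancy being exploited, not the authors' formulations or the Boldi-Vigna encoder.

```python
def vbyte(n):
    """Variable-length byte code: 7 data bits per byte; the high bit marks the final byte."""
    out = bytearray()
    while n >= 128:
        out.append(n & 0x7F)
        n >>= 7
    out.append(n | 0x80)
    return bytes(out)

def encode_adjacency(neighbors):
    """Gap-encode a sorted adjacency list. When the ids of similar nodes are
    close together (typical after a good node ordering), gaps are tiny and
    fit in a single byte each."""
    out = bytearray(vbyte(len(neighbors)))
    prev = 0
    for v in sorted(neighbors):
        out += vbyte(v - prev)
        prev = v
    return bytes(out)

# A clustered neighbourhood of 5 edges encodes into 7 bytes here.
print(len(encode_adjacency([1000, 1001, 1002, 1003, 1005])))
```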

391 citations

Journal ArticleDOI
TL;DR: Fast and simple randomized algorithms for edge coloring a graph in the synchronous distributed point-to-point model of computation and new techniques for proving upper bounds on the tail probabilities of certain random variables which are not stochastically independent are introduced.
Abstract: Certain types of routing, scheduling, and resource-allocation problems in a distributed setting can be modeled as edge-coloring problems. We present fast and simple randomized algorithms for edge coloring a graph in the synchronous distributed point-to-point model of computation. Our algorithms compute an edge coloring of a graph $G$ with $n$ nodes and maximum degree $\Delta$ with at most $1.6\Delta + O(\log^{1+\delta} n)$ colors with high probability (arbitrarily close to 1) for any fixed $\delta > 0$; they run in polylogarithmic time. The upper bound on the number of colors improves upon the $(2\Delta - 1)$-coloring achievable by a simple reduction to vertex coloring. To analyze the performance of our algorithms, we introduce new techniques for proving upper bounds on the tail probabilities of certain random variables. The Chernoff--Hoeffding bounds are fundamental tools that are used very frequently in estimating tail probabilities. However, they assume stochastic independence among certain random variables, which may not always hold. Our results extend the Chernoff--Hoeffding bounds to certain types of random variables which are not stochastically independent. We believe that these results are of independent interest and merit further study.
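As an illustration of the synchronous model (and only that; this is not the paper's algorithm), here is a naive randomized edge-coloring round in Python: each uncolored edge proposes a color avoiding its colored neighbors and keeps it only if no adjacent edge proposed the same color.

```python
import random

def edge_coloring_round(edges, palette, color):
    """One synchronous round of a naive randomized edge colouring: every
    uncoloured edge proposes a colour unused by adjacent coloured edges,
    then keeps it only if no adjacent edge proposed the same colour."""
    proposals = {}
    for e in edges:
        if color.get(e) is None:
            u, v = e
            used = {color[f] for f in edges
                    if color.get(f) is not None and (u in f or v in f)}
            free = [c for c in palette if c not in used]
            if free:
                proposals[e] = random.choice(free)
    for e, c in proposals.items():
        u, v = e
        if not any(f != e and cf == c and (u in f or v in f)
                   for f, cf in proposals.items()):
            color[e] = c

# Example: colour a path a-b-c-d with 3 colours in a few rounds.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
color = {}
while any(color.get(e) is None for e in edges):
    edge_coloring_round(edges, range(3), color)
print(color)
```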

340 citations

Journal ArticleDOI
TL;DR: The bounds for computing a network decomposition distributively and deterministically are improved, and it is shown that the class of graphs $G$ whose maximum degree is $n^{O(\delta(n))}$, where $\delta(n) = 1/\log\log n$, is complete for the task of computing a near-optimal network decomposition in polylogarithmic time.
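For context, a $(d, c)$-decomposition of a graph (the object computed here) is standardly defined as follows; this paraphrases the usual definition rather than quoting the paper:

$$V = C_1 \cup \cdots \cup C_k, \qquad \operatorname{diam}(C_i) \le d \ \text{for all } i, \qquad \chi\big(G / \{C_1, \dots, C_k\}\big) \le c,$$

where $G / \{C_i\}$ denotes the cluster graph obtained by contracting each cluster to a single node. Clusters of the same colour can be processed in parallel, which is what makes decompositions useful for distributed symmetry breaking.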

225 citations

Journal ArticleDOI
TL;DR: It is shown that edge colourings with at most $2\Delta-1$ colours and maximal matchings can be computed within $O(\log^* n + \Delta)$ deterministic rounds, where $\Delta$ is the maximum degree of the network.
Abstract: We give simple, deterministic, distributed algorithms for computing maximal matchings, maximal independent sets and colourings. We show that edge colourings with at most 2Δ-1 colours and maximal matchings can be computed within O(log* n + Δ) deterministic rounds, where Δ is the maximum degree of the network. We also show how to find maximal independent sets and (Δ+1)-vertex colourings within O(log* n + Δ²) deterministic rounds. All hidden constants are very small and the algorithms are very simple.
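The 2Δ-1 figure is what greedy colouring yields on the line graph, where each edge is adjacent to at most 2Δ-2 others. A minimal sequential Python sketch of that reduction (not the paper's distributed algorithms):

```python
def greedy_line_graph_coloring(edges):
    """Colour edges greedily: each edge avoids the colours of edges sharing
    an endpoint. An edge meets at most 2*Delta - 2 others, so 2*Delta - 1
    colours always suffice."""
    color = {}
    for e in edges:
        u, v = e
        used = {color[f] for f in color if u in f or v in f}
        c = 0
        while c in used:
            c += 1
        color[e] = c
    return color

# Star plus one extra edge: Delta = 3, so at most 5 colours are used.
print(greedy_line_graph_coloring([("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]))
```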

205 citations


Cited by
Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, covering algorithmic and structural questions and touching on newer models, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal ArticleDOI
TL;DR: In this paper, the authors studied the asymptotic behavior of the cost of the solution returned by stochastic sampling-based path planning algorithms as the number of samples increases.
Abstract: During the last decade, sampling-based path planning algorithms, such as probabilistic roadmaps (PRM) and rapidly exploring random trees (RRT), have been shown to work well in practice and possess theoretical guarantees such as probabilistic completeness. However, little effort has been devoted to the formal analysis of the quality of the solution returned by such algorithms, e.g. as a function of the number of samples. The purpose of this paper is to fill this gap, by rigorously analyzing the asymptotic behavior of the cost of the solution returned by stochastic sampling-based algorithms as the number of samples increases. A number of negative results are provided, characterizing existing algorithms, e.g. showing that, under mild technical conditions, the cost of the solution returned by broadly used sampling-based algorithms converges almost surely to a non-optimal value. The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e. such that the cost of the returned solution converges almost surely to the optimum. Moreover, it is shown that the computational complexity of the new algorithms is within a constant factor of that of their probabilistically complete (but not asymptotically optimal) counterparts. The analysis in this paper hinges on novel connections between stochastic sampling-based path planning algorithms and the theory of random geometric graphs.
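The connection to random geometric graphs shows up in the radius that PRM* uses to connect each new sample to nearby ones; stated schematically from the published result, with $d$ the dimension of the configuration space and $\gamma$ a constant that must exceed an environment-dependent threshold:

$$r(n) = \gamma \left( \frac{\log n}{n} \right)^{1/d}.$$

Above the threshold, the cost of the returned solution converges to the optimum almost surely; with too small a $\gamma$, the guarantee is lost.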

3,438 citations

Journal ArticleDOI
16 Dec 2011 - Science
TL;DR: A measure of dependence for two-variable relationships: the maximal information coefficient (MIC), which captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination of the data relative to the regression function.
Abstract: Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R2) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.
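A rough intuition for the statistic: MIC maximizes a normalized mutual information over many grid resolutions and placements. The Python sketch below computes that normalized score for one fixed grid only, as a simplified stand-in for illustration, not the MINE implementation.

```python
import numpy as np

def grid_mi(x, y, bins=8):
    """Mutual information over a single fixed bins-by-bins grid,
    normalised by log(bins)."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0
    mi = float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
    return mi / np.log(bins)

rng = np.random.default_rng(0)
x = rng.random(1000)
print(grid_mi(x, x ** 2))            # functional relationship: score close to 1
print(grid_mi(x, rng.random(1000)))  # independent variables: score close to 0
```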

2,414 citations

Posted Content
TL;DR: The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e. such that the cost of the returned solution converges almost surely to the optimum.
Abstract: During the last decade, sampling-based path planning algorithms, such as Probabilistic RoadMaps (PRM) and Rapidly-exploring Random Trees (RRT), have been shown to work well in practice and possess theoretical guarantees such as probabilistic completeness. However, little effort has been devoted to the formal analysis of the quality of the solution returned by such algorithms, e.g., as a function of the number of samples. The purpose of this paper is to fill this gap, by rigorously analyzing the asymptotic behavior of the cost of the solution returned by stochastic sampling-based algorithms as the number of samples increases. A number of negative results are provided, characterizing existing algorithms, e.g., showing that, under mild technical conditions, the cost of the solution returned by broadly used sampling-based algorithms converges almost surely to a non-optimal value. The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e., such that the cost of the returned solution converges almost surely to the optimum. Moreover, it is shown that the computational complexity of the new algorithms is within a constant factor of that of their probabilistically complete (but not asymptotically optimal) counterparts. The analysis in this paper hinges on novel connections between stochastic sampling-based path planning algorithms and the theory of random geometric graphs.

2,210 citations

Journal Article
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance of the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality.
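The concentration effect is easy to reproduce numerically; the sketch below (uniform synthetic data, Euclidean distance, illustrative parameters) shows the farthest-to-nearest distance ratio collapsing toward 1 as the dimension grows:

```python
import numpy as np

def distance_contrast(dim, n_points=1000, seed=0):
    """Farthest-to-nearest distance ratio from the cube centre to i.i.d.
    uniform points; it collapses toward 1 as the dimension grows."""
    pts = np.random.default_rng(seed).random((n_points, dim))
    d = np.linalg.norm(pts - 0.5, axis=1)
    return d.max() / d.min()

for dim in (2, 10, 15, 100):
    print(dim, round(distance_contrast(dim), 2))
```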

1,992 citations