# Shiva Chaudhuri

Other affiliations: University of Waterloo

Bio: Shiva Chaudhuri is an academic researcher affiliated with the Max Planck Society. The author has contributed to research on topics including upper and lower bounds and treewidth. The author has an h-index of 13 and has co-authored 33 publications receiving 582 citations. Previous affiliations of Shiva Chaudhuri include University of Waterloo.

##### Papers

••

TL;DR: This work considers the problem of preprocessing an n-vertex digraph with real edge weights so that subsequent queries for the shortest path or distance between any two vertices can be answered efficiently, and gives algorithms that depend on the treewidth of the input graph.

Abstract: We consider the problem of preprocessing an n-vertex digraph with real edge weights so that subsequent queries for the shortest path or distance between any two vertices can be efficiently answered. We give algorithms that depend on the treewidth of the input graph. When the treewidth is a constant, our algorithms can answer distance queries in O(α(n)) time after O(n) preprocessing. This improves upon previously known results for the same problem. We also give a dynamic algorithm which, after a change in an edge weight, updates the data structure in time O(n^β), for any constant 0 < β < 1. Furthermore, an algorithm of independent interest is given: computing a shortest path tree, or finding a negative cycle in linear time.
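To make the last task in the abstract concrete, here is a minimal sketch of the textbook Bellman-Ford method, which either returns a shortest path tree or reports a negative cycle. It is a baseline for illustration only: it runs in O(nm) time on general digraphs and is not the linear-time, treewidth-based algorithm of the paper. The function name `shortest_path_tree` and the edge-list representation are assumptions made for this example.

```python
def shortest_path_tree(n, edges, source):
    """Bellman-Ford baseline (not the paper's linear-time algorithm).

    n      -- number of vertices, labelled 0..n-1
    edges  -- list of (u, v, weight) triples for directed edges u -> v
    Returns (dist, parent), or None if a negative cycle is reachable from source.
    """
    inf = float("inf")
    dist = [inf] * n
    parent = [None] * n
    dist[source] = 0.0

    # Relax every edge up to n-1 times; stop early once nothing changes.
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                parent[v] = u
                changed = True
        if not changed:
            break

    # Any edge that can still be relaxed lies on or after a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist, parent


# A small digraph with one negative edge but no negative cycle.
tree_edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, -2.0), (1, 3, 3.0)]
result = shortest_path_tree(4, tree_edges, 0)

# A triangle whose total weight is negative.
cycle_edges = [(0, 1, 1.0), (1, 2, -3.0), (2, 0, 1.0)]
cycle_result = shortest_path_tree(3, cycle_edges, 0)
```

The `parent` array encodes the shortest path tree: following parents from any reachable vertex leads back to the source.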

99 citations

••

TL;DR: A best possible approximation algorithm is derived for a generalization of the k-center problem with triangle inequality where, given a number p, one wishes to place k centers so as to minimize the maximum distance of any non-center node to its pth closest center.

Abstract: The k-center problem with triangle inequality is that of placing k center nodes in a weighted undirected graph in which the edge weights obey the triangle inequality, so that the maximum distance of any node to its nearest center is minimized. In this paper, we consider a generalization of this problem where, given a number p, we wish to place k centers so as to minimize the maximum distance of any non-center node to its pth closest center. We derive a best possible approximation algorithm for this problem.
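The objective described in this abstract can be sketched in a few lines. The code below only evaluates the objective for a given set of centers and, for tiny inputs, finds an optimal placement by exhaustive search. It is not the approximation algorithm from the paper, and the names `pth_closest_cost` and `brute_force_placement` are hypothetical. It assumes p ≤ k and a symmetric distance matrix.

```python
from itertools import combinations


def pth_closest_cost(dist, centers, p):
    """Maximum, over non-center nodes, of the distance to the p-th closest center."""
    centers = set(centers)
    worst = 0
    for v in range(len(dist)):
        if v in centers:
            continue  # centers themselves are excluded from the objective
        ranked = sorted(dist[v][c] for c in centers)
        worst = max(worst, ranked[p - 1])
    return worst


def brute_force_placement(dist, k, p):
    """Try every k-subset of nodes (exponential time; for illustration only)."""
    nodes = range(len(dist))
    return min(combinations(nodes, k),
               key=lambda cs: pth_closest_cost(dist, cs, p))


# Four points on a line at positions 0..3; |a - b| obeys the triangle inequality.
positions = [0, 1, 2, 3]
dist = [[abs(a - b) for b in positions] for a in positions]

best = brute_force_placement(dist, k=2, p=2)
best_cost = pth_closest_cost(dist, best, p=2)
```

With p = 1 the objective reduces to the usual k-center cost restricted to non-center nodes.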

62 citations

••

01 Jul 1996

TL;DR: It is shown that a circuit of depth d with S gates can be made to output a constant by setting O(S^{1−ε(d)}) of its input values, which implies a superlinear size lower bound for a large class of functions.

Abstract: We study the complexity of computing Boolean functions using AND, OR and NOT gates. We show that a circuit of depth d with S gates can be made to output a constant by setting O(S^{1−ε(d)}) (where ε(d) = 4^{−d}) of its input values. This implies a superlinear size lower bound for a large class of functions. Using this, we obtain a function computable by a uniform family of constant depth polynomial size circuits that cannot be computed by constant depth circuits of linear size. We give circuit constructions that show that the bound O(S^{1−ε(d)}) is near optimal. We also study the complexity of computing threshold functions. The function T_r has the value 1 iff at least r of its inputs have the value 1. We show that a circuit computing T_r has at least Ω(r (log n)/log r) gates, for r ≤ n, improving previous bounds. We also show a trade-off between the number of gates and the number of wires in a threshold circuit, namely, a circuit with G (< n/2) gates and W wires computing T_r satisfies W ≥ Ω(nr (log n)/(log(G/log n))), showing that it is not possible to simultaneously optimize the number of gates and wires in a threshold circuit. Our bounds for threshold functions are based on a combinatorial lemma of independent interest.

56 citations

01 Jan 1996

TL;DR: In this paper, the complexity of computing Boolean functions using AND, OR and NOT gates was studied, and it was shown that a circuit of depth d with S gates can be made to output a constant by setting O(S^{1−ε(d)}) of its input values, a bound that is near optimal.

Abstract: We study the complexity of computing Boolean functions using AND, OR and NOT gates. We show that a circuit of depth d with S gates can be made to output a constant by setting O(S^{1−ε(d)}) (where ε(d) = 4^{−d}) of its input values. This implies a superlinear size lower bound for a large class of functions. Using this, we obtain a function computable by a uniform family of constant depth polynomial size circuits that cannot be computed by constant depth circuits of linear size. We give circuit constructions that show that the bound O(S^{1−ε(d)}) is near optimal. We also study the complexity of computing threshold functions. The function T_r has the value 1 iff at least r of its inputs have the value 1. We show that a circuit computing T_r has at least Ω(r (log n)/log r) gates, for r ≤ n, improving previous bounds. We also show a trade-off between the number of gates and the number of wires in a threshold circuit, namely, a circuit with G (< n/2) gates and W wires computing T_r satisfies W ≥ Ω(nr (log n)/(log(G/log n))), showing that it is not possible to simultaneously optimize the number of gates and wires in a threshold circuit. Our bounds for threshold functions are based on a combinatorial lemma of independent interest.

55 citations

••

30 Aug 1993

TL;DR: For all t ≥ (log log n)^4, approximate selection problems of size n can be solved in O(t) time with optimal speedup with relative accuracy \(2^{-t/(\log\log n)^4}\), and exact selection can be solved in O(log n/log log n) time with a number of processors that is optimal for the given running time.

Abstract: The selection problem of size n is, given a set of n elements drawn from an ordered universe and an integer r with 1 ≤ r ≤ n, to identify the rth smallest element in the set. We study approximate and exact selection on deterministic concurrent-read concurrent-write parallel RAMs, where approximate selection with relative accuracy λ > 0 asks for any element whose true rank differs from r by at most λn. Our main results are: (1) For all t ≥ (log log n)^4, approximate selection problems of size n can be solved in O(t) time with optimal speedup with relative accuracy \(2^{-t/(\log\log n)^4}\); no deterministic PRAM algorithm for approximate selection with a running time below Ο(log n/log log n) was previously known. (2) Exact selection problems of size n can be solved in O(log n/log log n) time with O(n log log n/log n) processors. This running time is the best possible (using only a polynomial number of processors), and the number of processors is optimal for the given running time (optimal speedup); the best previous algorithm achieves optimal speedup with a running time of O(log n log* n/log log n).

38 citations

##### Cited by

•

19 Oct 2009

TL;DR: In this paper, the authors present a coherent and unified treatment of probabilistic techniques for obtaining high probability estimates on the performance of randomized algorithms, from the basic tool kit from the Chernoff-Hoeffding (CH) bounds to more sophisticated techniques like Martingales and isoperimetric inequalities, as well as some recent developments like Talagrand's inequality, transportation cost inequalities, and log-Sobolev inequalities.

Abstract: Randomized algorithms have become a central part of the algorithms curriculum based on their increasingly widespread use in modern applications. This book presents a coherent and unified treatment of probabilistic techniques for obtaining high-probability estimates on the performance of randomized algorithms. It covers the basic tool kit from the Chernoff-Hoeffding (CH) bounds to more sophisticated techniques like Martingales and isoperimetric inequalities, as well as some recent developments like Talagrand's inequality, transportation cost inequalities, and log-Sobolev inequalities. Along the way, variations on the basic theme are examined, such as CH bounds in dependent settings. The authors emphasize comparative study of the different methods, highlighting respective strengths and weaknesses in concrete example applications. The exposition is tailored to discrete settings sufficient for the analysis of algorithms, avoiding unnecessary measure-theoretic details, thus making the book accessible to computer scientists as well as probabilists and discrete mathematicians.

1,028 citations

••

TL;DR: The most impressive feature of the data structure is its constant query time, hence the name "oracle", and it provides faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.

Abstract: Let G = (V,E) be an undirected weighted graph with |V| = n and |E| = m. Let k ≥ 1 be an integer. We show that G = (V,E) can be preprocessed in O(kmn^{1/k}) expected time, constructing a data structure of size O(kn^{1+1/k}), such that any subsequent distance query can be answered, approximately, in O(k) time. The approximate distance returned is of stretch at most 2k−1, that is, the quotient obtained by dividing the estimated distance by the actual distance lies between 1 and 2k−1. A 1963 girth conjecture of Erdős implies that Ω(n^{1+1/k}) space is needed in the worst case for any real stretch strictly smaller than 2k+1. The space requirement of our algorithm is, therefore, essentially optimal. The most impressive feature of our data structure is its constant query time, hence the name "oracle". Previously, data structures that used only O(n^{1+1/k}) space had a query time of Ω(n^{1/k}). Our algorithms are extremely simple and easy to implement efficiently. They also provide faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.
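The stretch guarantee and the space bound quoted in this abstract are easy to evaluate numerically. The sketch below only does that arithmetic; it does not implement the oracle itself. The helper names `stretch`, `satisfies_guarantee`, and `space_bound` are hypothetical, and the space figure ignores the constant factor hidden by the O-notation.

```python
def stretch(estimated, actual):
    """Quotient of the estimated distance by the actual distance."""
    return estimated / actual


def satisfies_guarantee(estimated, actual, k):
    """True if the estimate has stretch between 1 and 2k - 1, as the abstract requires."""
    return 1 <= stretch(estimated, actual) <= 2 * k - 1


def space_bound(n, k):
    """k * n^(1 + 1/k): the data-structure size, up to a constant factor."""
    return k * n ** (1 + 1 / k)


# With k = 2 the oracle may overestimate by a factor of at most 3.
ok = satisfies_guarantee(estimated=5.0, actual=2.0, k=2)        # stretch 2.5
too_large = satisfies_guarantee(estimated=7.0, actual=2.0, k=2)  # stretch 3.5

# Trade-off for n = 100 vertices: larger k means less space but larger stretch.
sizes = {k: space_bound(100, k) for k in (1, 2, 3)}
```

For k = 1 the bound is n^2, which matches storing the full distance matrix with exact answers (stretch 1).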

618 citations

••

06 Jul 2001

TL;DR: The most impressive feature of the data structure is its constant query time, hence the name "oracle", which provides faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.

Abstract: Let G=(V,E) be an undirected weighted graph with |V|=n and |E|=m. Let k ≥ 1 be an integer. We show that G=(V,E) can be preprocessed in O(kmn^{1/k}) expected time, constructing a data structure of size O(kn^{1+1/k}), such that any subsequent distance query can be answered, approximately, in O(k) time. The approximate distance returned is of stretch at most 2k-1, i.e., the quotient obtained by dividing the estimated distance by the actual distance lies between 1 and 2k-1. We show that a 1963 girth conjecture of Erdős implies that Ω(n^{1+1/k}) space is needed in the worst case for any real stretch strictly smaller than 2k+1. The space requirement of our algorithm is, therefore, essentially optimal. The most impressive feature of our data structure is its constant query time, hence the name oracle. Previously, data structures that used only O(n^{1+1/k}) space had a query time of Ω(n^{1/k}) and a slightly larger, non-optimal, stretch. Our algorithms are extremely simple and easy to implement efficiently. They also provide faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.

563 citations

••

TL;DR: In this paper, the authors investigate negative dependence among random variables and advocate its use as a simple and unifying paradigm for the analysis of random structures and algorithms, and show that negative dependence can be used for many applications.

Abstract: This paper investigates the notion of negative dependence amongst random variables and attempts to advocate its use as a simple and unifying paradigm for the analysis of random structures and algorithms. The assumption of independence between random variables is often very convenient for several reasons. Firstly, it makes analyses and calculations much simpler. Secondly, one has at hand a whole array of powerful mathematical concepts and tools from classical probability theory for the analysis, such as laws of large numbers, central limit theorems and large deviation bounds which are usually derived under the assumption of independence. Unfortunately, the analysis of most randomized algorithms involves random variables that are not independent. In this case, classical tools from standard probability theory like large deviation theorems, that are valid under the assumption of independence between the random variables involved, cannot be used as such. It is then necessary to determine under what conditions of dependence one can still use the classical tools. It has been observed before [32, 33, 38, 8], that in some situations, even though the variables involved are not independent, one can still apply some of the standard tools that are valid for independent variables (directly or in suitably modified form), provided that the variables are dependent in specific ways. Unfortunately, it appears that in most cases somewhat ad hoc stratagems have been devised, tailored to the specific situation at hand, and that a unifying underlying theory that delves deeper into the nature of dependence amongst the variables involved is lacking. A frequently occurring scenario underlying the analysis of many randomised algorithms and processes involves random variables that are, intuitively, dependent in the following negative way: if one subset of the variables is "high" then a disjoint subset of the variables is "low".
In this paper, we bring to the forefront and systematize some precise notions of negative dependence in the literature, analyse their properties, compare them relative to each other, and illustrate them with several applications. One specific paradigm involving negative dependence is the classical "balls and bins" experiment. Suppose we throw m balls into n bins independently at random. For i in [n], let \(B_i\) be the random variable denoting the number of balls in the ith bin. We will often refer to these variables as occupancy numbers. This is a classical probabilistic paradigm [16, 22, 26] (see also [31, sec. 3.1]) that underlies the analysis of many probabilistic algorithms and processes. In the case when the balls are identical, this gives rise to the well-known multinomial distribution [16, sec VI.9]: there are m repeated independent trials (balls) where each trial (ball) can result in one of the outcomes \(E_1, \ldots, E_n\) (bins). The probability of the realisation of event \(E_i\) is \(p_i\) for i in [n] for each trial. (Of course the probabilities are subject to the condition \(\sum_i p_i = 1\).) Under the multinomial distribution, for any integers \(m_1, \ldots, m_n\) such that \(\sum_i m_i = m\) the probability that for each i in [n], event \(E_i\) occurs \(m_i\) times is \(\frac{m!}{m_1! \cdots m_n!}\, p_1^{m_1} \cdots p_n^{m_n}\). The balls and bins experiment is a generalisation of the multinomial distribution: in the general case, one can have an arbitrary set of probabilities for each ball: the probability that ball k goes into bin i is \(p_{i,k}\), subject only to the natural restriction that for each ball k, \(\sum_i p_{i,k} = 1\). The joint distribution function correspondingly has a more complicated form. A fundamental natural question of interest is: how are these \(B_i\) related? Note that even though the balls are thrown independently of each other, the \(B_i\) variables are not independent; in particular, their sum is fixed to m.
Intuitively, the \(B_i\)'s are negatively dependent on each other in the manner described above: if one set of variables is "high", a disjoint set is "low". However, establishing such assertions precisely by a direct calculation from the joint distribution function, though possible in principle, appears to be quite a formidable task, even in the case where the balls are assumed to be identical. One of the major contributions of this paper is establishing that the \(B_i\) are negatively dependent in a very strong sense. In particular, we show that the \(B_i\) variables satisfy negative association and negative regression, two strong notions of negative dependence that we define precisely below. All the intuitively obvious assertions of negative dependence in the balls and bins experiment follow as easy corollaries. We illustrate the usefulness of these results by showing how to streamline and simplify many existing probabilistic analyses in the literature.
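The negative dependence of occupancy numbers can be observed empirically. The following is a minimal simulation sketch, assuming identical balls thrown uniformly at random; it only estimates a pairwise covariance and does not check the stronger notions (negative association, negative regression) established in the paper. For the multinomial distribution the exact covariance of two distinct occupancy numbers is -m·p_i·p_j, which is -1.25 for the hypothetical parameters chosen here.

```python
import random


def occupancy(m, n, rng):
    """Throw m balls uniformly at random into n bins; return the bin counts."""
    counts = [0] * n
    for _ in range(m):
        counts[rng.randrange(n)] += 1
    return counts


def empirical_covariance(xs, ys):
    """Plain (biased) sample covariance of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)           # fixed seed so the run is repeatable
m, n, trials = 20, 4, 20000      # illustrative parameters, not from the paper

samples = [occupancy(m, n, rng) for _ in range(trials)]
first_bin = [s[0] for s in samples]
second_bin = [s[1] for s in samples]

cov = empirical_covariance(first_bin, second_bin)
exact = -m * (1 / n) * (1 / n)   # multinomial covariance: -m * p_i * p_j
print(f"empirical covariance {cov:.3f}, exact {exact:.3f}")
```

Because every sample sums to m, a large count in one bin leaves fewer balls for the others, which is what the negative covariance reflects.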

378 citations

••

TL;DR: POCO is presented, a framework for Pareto-based Optimal COntroller placement that provides operators with Pareto optimal placements with respect to different performance metrics and can be extended to solve similar virtual functions placement problems which appear in the context of Network Functions Virtualization (NFV).

Abstract: Software Defined Networking (SDN) marks a paradigm shift towards an externalized and logically centralized network control plane. A particularly important task in SDN architectures is that of controller placement, i.e., the positioning of a limited number of resources within a network to meet various requirements. These requirements range from latency constraints to failure tolerance and load balancing. In most scenarios, at least some of these objectives are competing, thus no single best placement is available and decision makers need to find a balanced trade-off. This work presents POCO, a framework for Pareto-based Optimal COntroller placement that provides operators with Pareto optimal placements with respect to different performance metrics. In its default configuration, POCO performs an exhaustive evaluation of all possible placements. While this is practically feasible for small and medium sized networks, realistic time and resource constraints call for an alternative in the context of large scale networks or dynamic networks whose properties change over time. For these scenarios, the POCO toolset is extended by a heuristic approach that is less accurate, but yields faster computation times. An evaluation of this heuristic is performed on a collection of real world network topologies from the Internet Topology Zoo. Utilizing a measure for quantifying the error introduced by the heuristic approach allows an analysis of the resulting trade-off between time and accuracy. Additionally, the proposed methods can be extended to solve similar virtual functions placement problems which appear in the context of Network Functions Virtualization (NFV).
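The exhaustive evaluation described in this abstract can be illustrated on a toy network. The sketch below is not the POCO toolset; it is a hedged illustration that enumerates every placement of k controllers, scores each with two competing metrics chosen for this example (worst node-to-controller latency and worst controller-to-controller latency), and keeps the non-dominated placements. All function names and the five-node path topology are assumptions.

```python
from itertools import combinations


def metrics(dist, placement):
    """Score a placement with two illustrative latency metrics (lower is better)."""
    nodes = range(len(dist))
    node_latency = max(min(dist[v][c] for c in placement) for v in nodes)
    inter_latency = max(dist[a][b] for a in placement for b in placement)
    return (node_latency, inter_latency)


def dominates(a, b):
    """a dominates b if it is no worse in every metric and not identical."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_placements(dist, k):
    """Evaluate all k-subsets exhaustively and keep the Pareto-optimal ones."""
    scored = {p: metrics(dist, p) for p in combinations(range(len(dist)), k)}
    return {p: s for p, s in scored.items()
            if not any(dominates(t, s) for t in scored.values())}


# Five nodes on a path; latency between nodes is their hop distance.
positions = [0, 1, 2, 3, 4]
dist = [[abs(a - b) for b in positions] for a in positions]

front = pareto_placements(dist, k=2)
for placement, score in sorted(front.items()):
    print(placement, score)
```

Spreading controllers out lowers node latency but raises inter-controller latency, so no single placement wins on both metrics. The number of placements grows combinatorially with network size, which is why the abstract turns to a heuristic for large networks.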

357 citations