
Showing papers in "Random Structures and Algorithms in 1998"


Journal ArticleDOI
TL;DR: In this paper, the authors investigate negative dependence among random variables and advocate its use as a simple and unifying paradigm for the analysis of random structures and algorithms, illustrating it with several applications.
Abstract: This paper investigates the notion of negative dependence amongst random variables and attempts to advocate its use as a simple and unifying paradigm for the analysis of random structures and algorithms. The assumption of independence between random variables is often very convenient for several reasons. Firstly, it makes analyses and calculations much simpler. Secondly, one has at hand a whole array of powerful mathematical concepts and tools from classical probability theory for the analysis, such as laws of large numbers, central limit theorems and large deviation bounds, which are usually derived under the assumption of independence. Unfortunately, the analysis of most randomized algorithms involves random variables that are not independent. In this case, classical tools from standard probability theory, like large deviation theorems, that are valid under the assumption of independence between the random variables involved, cannot be used as such. It is then necessary to determine under what conditions of dependence one can still use the classical tools. It has been observed before [32, 33, 38, 8] that in some situations, even though the variables involved are not independent, one can still apply some of the standard tools that are valid for independent variables (directly or in suitably modified form), provided that the variables are dependent in specific ways. Unfortunately, it appears that in most cases somewhat ad hoc stratagems have been devised, tailored to the specific situation at hand, and that a unifying underlying theory that delves deeper into the nature of dependence amongst the variables involved is lacking. A frequently occurring scenario underlying the analysis of many randomised algorithms and processes involves random variables that are, intuitively, dependent in the following negative way: if one subset of the variables is "high" then a disjoint subset of the variables is "low". In this paper, we bring to the forefront and systematize some precise notions of negative dependence in the literature, analyse their properties, compare them relative to each other, and illustrate them with several applications. One specific paradigm involving negative dependence is the classical "balls and bins" experiment. Suppose we throw m balls into n bins independently at random. For i in [n], let B_i be the random variable denoting the number of balls in the ith bin. We will often refer to these variables as occupancy numbers. This is a classical probabilistic paradigm [16, 22, 26] (see also [31, sec. 3.1]) that underlies the analysis of many probabilistic algorithms and processes. In the case when the balls are identical, this gives rise to the well-known multinomial distribution [16, sec. VI.9]: there are m repeated independent trials (balls) where each trial (ball) can result in one of the outcomes E_1, ..., E_n (bins). The probability of the realisation of event E_i is p_i for i in [n] for each trial. (Of course the probabilities are subject to the condition $\sum_i p_i = 1$.) Under the multinomial distribution, for any integers m_1, ..., m_n such that $\sum_i m_i = m$, the probability that for each i in [n], event E_i occurs m_i times is $\frac{m!}{m_1! \cdots m_n!}\, p_1^{m_1} \cdots p_n^{m_n}$. The balls and bins experiment is a generalisation of the multinomial distribution: in the general case, one can have an arbitrary set of probabilities for each ball: the probability that ball k goes into bin i is $p_{i,k}$, subject only to the natural restriction that for each ball k, $\sum_i p_{i,k} = 1$.
The joint distribution function correspondingly has a more complicated form. A fundamental natural question of interest is: how are these B_i related? Note that even though the balls are thrown independently of each other, the B_i variables are not independent; in particular, their sum is fixed to m. Intuitively, the B_i's are negatively dependent on each other in the manner described above: if one set of variables is "high", a disjoint set is "low". However, establishing such assertions precisely by a direct calculation from the joint distribution function, though possible in principle, appears to be quite a formidable task, even in the case where the balls are assumed to be identical. One of the major contributions of this paper is establishing that the B_i are negatively dependent in a very strong sense. In particular, we show that the B_i variables satisfy negative association and negative regression, two strong notions of negative dependence that we define precisely below. All the intuitively obvious assertions of negative dependence in the balls and bins experiment follow as easy corollaries. We illustrate the usefulness of these results by showing how to streamline and simplify many existing probabilistic analyses in the literature.
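
As a quick illustration of the negative dependence of occupancy numbers, here is a minimal simulation sketch (ours, not from the paper; the function names are our own). For identical balls the exact covariance is Cov(B_i, B_j) = -m/n^2 for i ≠ j, so the Monte Carlo estimate should come out negative and close to that value.

```python
import random

def occupancy_numbers(m, n, rng=random):
    """Throw m balls into n bins independently and uniformly at random;
    return the occupancy numbers (B_1, ..., B_n)."""
    bins = [0] * n
    for _ in range(m):
        bins[rng.randrange(n)] += 1
    return bins

def estimate_covariance(m, n, trials=100_000):
    """Monte Carlo estimate of Cov(B_1, B_2); negative dependence predicts
    a negative value (exactly -m/n^2 for identical balls)."""
    s1 = s2 = s12 = 0.0
    for _ in range(trials):
        b = occupancy_numbers(m, n)
        s1 += b[0]; s2 += b[1]; s12 += b[0] * b[1]
    return s12 / trials - (s1 / trials) * (s2 / trials)

print(estimate_covariance(m=20, n=10))  # should be close to -20/100 = -0.2
```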

378 citations


Journal ArticleDOI
TL;DR: This paper presents an efficient algorithm for finding the hidden clique for all k > cn^{1/2}, for any fixed c > 0, thus improving the trivial case k > cn^{1/2}(log n)^{1/2}; the algorithm is based on the spectral properties of the graph.
Abstract: We consider the following probabilistic model of a graph on n labeled vertices. First choose a random graph G(n, 1/2), and then choose randomly a subset Q of vertices of size k and force it to be a clique by joining every pair of vertices of Q by an edge. The problem is to give a polynomial time algorithm for finding this hidden clique almost surely for various values of k. This question was posed independently, in various variants, by Jerrum and by Kucera. In this paper we present an efficient algorithm for all k > cn^{1/2}, for any fixed c > 0, thus improving the trivial case k > cn^{1/2}(log n)^{1/2}. The algorithm is based on the spectral properties of the graph. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13: 457–466, 1998
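
For context, here is a minimal sketch of a spectral approach of this kind (an illustrative variant under our own assumptions, not necessarily the paper's exact procedure): take the eigenvector of the second-largest eigenvalue of the adjacency matrix, keep the k coordinates of largest magnitude, and clean up with a degree threshold into that set. The threshold 3k/4 is an illustrative choice.

```python
import numpy as np

def spectral_clique_candidate(A, k):
    """A: n x n symmetric 0/1 adjacency matrix of G(n, 1/2) with a planted
    k-clique. Returns a candidate vertex set for the hidden clique."""
    n = A.shape[0]
    _, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    v = vecs[:, -2]                   # eigenvector of the 2nd-largest eigenvalue
    W = set(np.argsort(-np.abs(v))[:k].tolist())  # k largest |coordinates|
    # Clean-up: keep vertices adjacent to at least 3/4 of W.
    return {u for u in range(n)
            if sum(int(A[u, w]) for w in W if w != u) >= 3 * k / 4}
```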

285 citations


Journal ArticleDOI
TL;DR: Kirousis et al. consider the problem of computing the least real number k such that if the ratio of the number of clauses to the number of variables of a random 3-SAT formula strictly exceeds k, then the formula is almost certainly unsatisfiable.
Abstract: Let f be a random Boolean formula that is an instance of 3-SAT. We consider the problem of computing the least real number k such that if the ratio of the number of clauses over the number of variables of f strictly exceeds k, then f is almost certainly unsatisfiable. By a well-known and more or less straightforward argument, it can be shown that k ≤ 5.191. This upper bound was improved by Kamath et al. to 4.758 by first providing new improved bounds for the occupancy problem. There is strong experimental evidence that the value of k is around 4.2. In this work, we define, in terms of the random formula f, a decreasing sequence of random variables such that, if the expected value of any one of them converges to zero, then f is almost certainly unsatisfiable. By letting the expected value of the first term of the sequence converge to zero, we obtain, by simple and elementary computations, an upper bound for k equal to 4.667. From the expected value of the second term of the sequence, we get the value 4.601. In general, letting the expected value of later terms of the sequence converge to zero yields increasingly sharper upper bounds.
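
The 5.191 bound cited above is the standard first moment computation; spelled out for completeness (a routine derivation, not taken verbatim from the paper): a fixed assignment satisfies a random 3-clause with probability 7/8 independently over the m clauses, so

```latex
\[
  \mathbb{E}[\#\text{satisfying assignments}] \;=\; 2^{n}\,(7/8)^{m},
\]
which tends to $0$ (and hence, by Markov's inequality, $f$ is almost certainly
unsatisfiable) whenever
\[
  \frac{m}{n} \;>\; \frac{1}{\log_{2}(8/7)} \;=\; \frac{\ln 2}{\ln(8/7)} \;\approx\; 5.191.
\]
```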

181 citations


Journal ArticleDOI
TL;DR: This work studies the average performance of a simple greedy algorithm for finding a matching in a sparse random graph G_{n, c/n}, where c > 0 is constant, and gives significantly improved estimates of the errors made by the algorithm.
Abstract: We study the average performance of a simple greedy algorithm for finding a matching in a sparse random graph G_{n, c/n}, where c>0 is constant. The algorithm was first proposed by Karp and Sipser [Proceedings of the Twenty-Second Annual IEEE Symposium on Foundations of Computing, 1981, pp. 364–375]. We give significantly improved estimates of the errors made by the algorithm. For the subcritical case where c < e, the algorithm almost surely finds a matching of asymptotically maximum size; if c > e, then with high probability the algorithm produces a matching which is within n^{1/5+o(1)} of maximum size. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 12, 111–177, 1998
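
A minimal sketch of the Karp-Sipser heuristic itself (our own rendering; for simplicity the "random edge" step below picks a random vertex and then a random neighbor, rather than a uniformly random edge):

```python
import random

def karp_sipser(adj, rng=random):
    """Greedy matching heuristic in the spirit of Karp and Sipser.
    adj: dict vertex -> set of neighbors; consumed destructively."""
    matching = []
    while any(adj.values()):
        # Prefer a pendant (degree-1) vertex: matching its edge is always safe.
        u = next((v for v, nb in adj.items() if len(nb) == 1), None)
        if u is None:
            u = rng.choice([v for v, nb in adj.items() if nb])
        v = rng.choice(sorted(adj[u]))
        matching.append((u, v))
        for w in (u, v):                 # delete both matched endpoints
            for x in adj[w]:
                adj[x].discard(w)
            adj[w] = set()
    return matching

# Example: a path on 4 vertices; the pendant rule finds a perfect matching.
print(karp_sipser({0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}))
```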

120 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that there is a numerical constant c>0 with the following property: if K is an arbitrary convex body in R^n with γ_n(K) ≥ 1/2, then to each sequence u_1, …, u_m ∈ R^n with ‖u_1‖, …, ‖u_m‖ ≤ c there correspond signs e_1, …, e_m = ±1 such that $\sum_{i=1}^{m} e_i u_i \in K$.
Abstract: Let ‖·‖ be the Euclidean norm on R^n and γ_n the (standard) Gaussian measure on R^n with density $(2\pi)^{-n/2} e^{-\|x\|^{2}/2}$. It is proved that there is a numerical constant c>0 with the following property: if K is an arbitrary convex body in R^n with γ_n(K) ≥ 1/2, then to each sequence u_1, …, u_m ∈ R^n with ‖u_1‖, …, ‖u_m‖ ≤ c there correspond signs e_1, …, e_m = ±1 such that $\sum_{i=1}^{m} e_i u_i \in K$. This improves the well-known result obtained by Spencer [Trans. Amer. Math. Soc. 289, 679–705 (1985)] for the n-dimensional cube. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 12: 351–360, 1998

114 citations



Journal ArticleDOI
TL;DR: A parallel threshold strategy based on rethrowing balls placed in heavily loaded bins achieves loads within a constant factor of the lower bound for a constant number of rounds, and it achieves a final load of at most O(log log n) given Ω(log log n) rounds of communication.
Abstract: It is well known that after placing n balls independently and uniformly at random into n bins, the fullest bin holds Θ(log n/log log n) balls with high probability. More recently, Azar et al. analyzed the following process: randomly choose d bins for each ball, and then place the balls, one by one, into the least full bin from its d choices. They show that after all n balls have been placed, the fullest bin contains only log log n/log d + Θ(1) balls with high probability. We explore extensions of this result to parallel and distributed settings. Our results focus on the tradeoff between the amount of communication and the final load. Given r rounds of communication, we provide lower bounds on the maximum load of $\Omega\bigl((\log n/\log\log n)^{1/r}\bigr)$ for a wide class of strategies. Our results extend to the case where the number of rounds is allowed to grow with n. We then demonstrate parallelizations of the sequential strategy presented in Azar et al. that achieve loads within a constant factor of the lower bound for two communication rounds and almost match the sequential strategy given log log n/log d + O(d) rounds of communication. We also examine a parallel threshold strategy based on rethrowing balls placed in heavily loaded bins. This strategy achieves loads within a constant factor of the lower bound for a constant number of rounds, and it achieves a final load of at most O(log log n) given Ω(log log n) rounds of communication. The algorithm also works well in asynchronous environments. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13, 159–188 (1998)
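
The sequential d-choice process being parallelized here is easy to simulate; a small sketch (ours, not the paper's code) that exhibits the gap between Θ(log n/log log n) for one choice and log log n/log d + Θ(1) for d ≥ 2:

```python
import random

def max_load(n_balls, n_bins, d, rng=random):
    """Place n_balls sequentially; each goes to the least-full of d
    uniformly random candidate bins (d = 1 is plain random placement)."""
    load = [0] * n_bins
    for _ in range(n_balls):
        best = min((rng.randrange(n_bins) for _ in range(d)),
                   key=lambda b: load[b])
        load[best] += 1
    return max(load)

n = 100_000
print(max_load(n, n, 1))  # roughly log n / log log n
print(max_load(n, n, 2))  # roughly log log n / log 2, dramatically smaller
```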

95 citations



Journal ArticleDOI
TL;DR: This paper provides an algorithmic version of the blow-up lemma, which helps to find bounded degree spanning subgraphs in ε-regular graphs and can be parallelized and implemented in NC^5.
Abstract: Recently we developed a new method in graph theory based on the regularity lemma. The method is applied to find certain spanning subgraphs in dense graphs. The other main general tool of the method, besides the regularity lemma, is the so-called blow-up lemma (Komlos, Sarkozy, and Szemeredi [Combinatorica, 17, 109–123 (1997)]). This lemma helps to find bounded degree spanning subgraphs in ε-regular graphs. Our original proof of the lemma is not algorithmic; it applies probabilistic methods. In this paper we provide an algorithmic version of the blow-up lemma. The desired subgraph, for an n-vertex graph, can be found in time O(nM(n)), where M(n) = O(n^{2.376}) is the time needed to multiply two n by n matrices with 0, 1 entries over the integers. We show that the algorithm can be parallelized and implemented in NC^5. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 12, 297–312, 1998

86 citations


Journal ArticleDOI
TL;DR: In this article, a new Markov chain was defined for k-colorings of graphs, and its convergence properties were analyzed with respect to the maximum degree Δ of the graph.
Abstract: We define a new Markov chain on (proper) k-colorings of graphs, and relate its convergence properties to the maximum degree Δ of the graph. The chain is shown to have bounds on convergence time appreciably better than those for the well-known Jerrum/Salas–Sokal chain in most circumstances. For the case k=2Δ, we provide a dramatic decrease in running time. We also show improvements whenever the graph is regular, or fewer than 3Δ colors are used. The results are established using the method of path coupling. We indicate that our analysis is tight by showing that the couplings used are optimal in a sense which we define. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13: 285–317, 1998
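
For reference, the baseline Jerrum / Salas-Sokal chain mentioned above is single-site Glauber dynamics; here is a minimal sketch of one step (our rendering of the standard baseline chain; the paper's new chain itself is not reproduced here):

```python
import random

def glauber_step(neighbors, coloring, k, rng=random):
    """One step of the single-site chain on proper k-colorings: pick a
    uniform vertex and a uniform color, and recolor only if the coloring
    stays proper. neighbors: list of adjacency lists; coloring: mutable list."""
    v = rng.randrange(len(neighbors))
    c = rng.randrange(k)
    if all(coloring[u] != c for u in neighbors[v]):
        coloring[v] = c
```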

65 citations




Journal ArticleDOI
TL;DR: In this paper, the authors describe efficient constructions of small probability spaces that approximate the joint distribution of general random variables, whereas previous work concentrated on the special case of identical, uniformly distributed random variables.
Abstract: We describe efficient constructions of small probability spaces that approximate the joint distribution of general random variables. Previous work on efficient constructions concentrates on approximations of the joint distribution for the special case of identical, uniformly distributed random variables. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13: 1–16, 1998


Journal ArticleDOI
TL;DR: This paper finds, for all T, the values of ϵ for which it is possible to reconstruct the value of the root of T_n with probability bounded away from ½ for all n, and compares the ϵ values for recursive and nonrecursive algorithms.
Abstract: A periodic tree T_n consists of full n-level copies of a finite tree T. The tree T_n is labeled by random bits. The root label is chosen randomly, and the probability that two adjacent vertices have the same label is 1−ϵ. This model simulates noisy propagation of a bit from the root, and has significance both in communication theory and in biology. Our aim is to find an algorithm which decides, for every set of values of the boundary bits of T, if the root is more probable to be 0 or 1. We want to use this algorithm recursively to reconstruct the value of the root of T_n with a probability bounded away from ½ for all n. In this paper we find, for all T, the values of ϵ for which such a reconstruction is possible. We then compare the ϵ values for recursive and nonrecursive algorithms. Finally, we discuss some problems concerning generalizations of this model. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13, 81–97, 1998
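
A small simulation sketch of the noisy-broadcast model (ours; specialized to a complete b-ary tree rather than a general periodic tree): propagate the root bit with per-edge flip probability ε, then reconstruct by recursive majority.

```python
import random

def broadcast(depth, b, eps, root_bit, rng=random):
    """Propagate root_bit down a complete b-ary tree of the given depth;
    each edge flips the bit independently with probability eps.
    Returns the b**depth leaf bits."""
    bits = [root_bit]
    for _ in range(depth):
        bits = [bit ^ (rng.random() < eps) for bit in bits for _ in range(b)]
    return bits

def recursive_majority(bits, b):
    """Reconstruct the root bottom-up, taking the majority of b siblings."""
    while len(bits) > 1:
        bits = [int(2 * sum(bits[i:i + b]) > b)
                for i in range(0, len(bits), b)]
    return bits[0]

trials = 1000
hits = sum(recursive_majority(broadcast(7, 3, 0.1, 1), 3) for _ in range(trials))
print(hits / trials)  # stays bounded away from 1/2 when eps is small enough
```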

Journal ArticleDOI
TL;DR: This paper studies the behavior of the subset sum and partition problems when the x_i are i.i.d. random variables.
Abstract: In this paper we are interested in the behavior of the subset sum and partition problems when the x_i are i.i.d. random variables. Under fairly general conditions, the median of the solution for the subset sum problem has been shown to be exponentially small when t is near $E[\sum_{i=1}^{n} x_i]$ [Luek82]; this result has found application in the probabilistic analysis of approximation algorithms for the 0-1 Knapsack problem [Luek82, GMS84]. The median solution to the partition problem is known to be exponentially small [KKLO86] under fairly general conditions; that paper commented that "a significant question which our results leave open is the expected value of the difference for the best partition" [KKLO86, p. 643].
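
To make the quantity concrete, here is a brute-force sketch (ours) of the best partition difference min_S |∑_{i∈S} x_i − ∑_{i∉S} x_i|, whose median is the exponentially small quantity discussed above:

```python
import random

def best_partition_difference(xs):
    """Exhaustive search over all 2^n subsets; exponential time,
    so only for small illustrative instances."""
    total = sum(xs)
    n = len(xs)
    best = abs(total)
    for mask in range(1 << n):
        s = sum(x for i, x in enumerate(xs) if (mask >> i) & 1)
        best = min(best, abs(total - 2 * s))
    return best

xs = [random.random() for _ in range(16)]
print(best_partition_difference(xs))  # typically exponentially small in n
```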



Journal ArticleDOI
TL;DR: This work makes a start by proving a weak result, but the main purpose is to draw this topic to the attention of random graph theorists.
Abstract: Component sizes in the usual random graph process are a special case of the Marcus-Lushnikov process discussed in the scientific literature, so it is natural to ask how theory surrounding emergence of the giant component generalizes to the Marcus-Lushnikov process. Essentially no rigorous results are known; we make a start by proving a weak result, but our main purpose is to draw this topic to the attention of random graph theorists. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 12, 179–196, 1998.

Journal ArticleDOI
TL;DR: In this article, it was shown that for any constant α > 1, almost all Boolean functions with formula complexity at most n^α cannot be computed by any circuit constructed from literals and fewer than α^{-1}n^α two-input ∧, ∨ gates.
Abstract: Estimates are given of the number B(n, L) of distinct functions computed by propositional formulas of size L in n variables, constructed using only literals and ∧, ∨ connectives. (L is the number of occurrences of variables. L−1 is the number of binary ∧s and ∨s. B(n, L) is also the number of functions computed by two-terminal series-parallel networks with L switches.) En route the read-once functions, which are closely related to Schroder numbers, are enumerated. Writing B(n, L) = b(n, L)^L, we find that if L and β(n) go to infinity with increasing n and L ≤ 2^n/n^{β(n)}, then b(n, L) ∼ cn, where c = 2/(ln 4 − 1). Making a comparison with polynomial size Boolean circuits, this implies the following. For any constant α > 1, almost all Boolean functions with formula complexity at most n^α cannot be computed by any circuit constructed from literals and fewer than α^{-1}n^α two-input ∧, ∨ gates. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13: 349–382, 1998
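
The circuit comparison rests on a counting argument; here is a rough back-of-the-envelope version (our sketch, not the paper's proof): circuits with G two-input gates over n inputs compute at most about $((G+n)^{2}\cdot 4)^{G}$ functions (each gate chooses two predecessors and a gate type), while formulas of size $L = n^{\alpha}$ compute $B(n, L) \approx (cn)^{L}$ of them, so covering the latter requires

```latex
\[
  2G\log_2(G+n) + 2G \;\gtrsim\; \log_2 B(n, n^{\alpha})
  \;\approx\; n^{\alpha}\log_2(cn),
  \qquad\text{hence}\qquad
  G \;\gtrsim\; \frac{n^{\alpha}}{2\alpha}
\]
```

up to lower-order terms, in line with the α^{-1}n^α gate bound stated above.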

Journal ArticleDOI
TL;DR: In this paper, the asymptotic normality of the number of upper records in a sequence of geometric random variables is established and large deviations and local limit theorems are derived.
Abstract: We establish the asymptotic normality of the number of upper records in a sequence of iid geometric random variables. Large deviations and local limit theorems as well as approximation theorems for the number of lower records are also derived.
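
A quick simulation sketch (ours) of the quantity studied: generate iid geometric variables by inversion and count strict upper records.

```python
import math
import random

def geometric(p, rng=random):
    """Geometric(p): number of Bernoulli(p) trials up to the first success,
    sampled by inversion (1 - rng.random() is uniform on (0, 1])."""
    return math.floor(math.log(1.0 - rng.random()) / math.log(1.0 - p)) + 1

def upper_records(seq):
    """Count strict upper records; the first element is always a record."""
    count, best = 0, float("-inf")
    for x in seq:
        if x > best:
            count, best = count + 1, x
    return count

n, p, trials = 10_000, 0.5, 200
mean = sum(upper_records([geometric(p) for _ in range(n)])
           for _ in range(trials)) / trials
print(mean)  # grows logarithmically in n; fluctuations are asymptotically normal
```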




Journal ArticleDOI
TL;DR: A precise average-case analysis of Ben-Or's algorithm for testing the irreducibility of polynomials over finite fields is given, and the expectation and variance of the smallest degree among the irreducible factors of a random polynomial are computed.
Abstract: We give a precise average-case analysis of Ben-Or's algorithm for testing the irreducibility of polynomials over finite fields. First, we study the probability that a random polynomial of degree n over F_q contains no irreducible factors of degree less than m, 1≤m≤n. The results are given in terms of the Buchstab function. Then, we compute the expectation and variance of the smallest degree among the irreducible factors of a random polynomial. The analysis of Ben-Or's algorithm readily follows from this expectation. We also compute the probability of a polynomial being irreducible when it has no irreducible factors of degree less than m, 1≤m≤n. This probability is useful in the analysis of some algorithms for factoring polynomials over finite fields. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13: 439–456, 1998
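
A compact sketch of Ben-Or's test specialized to q = 2 (our rendering, with polynomials over GF(2) encoded as Python integer bitmasks): a polynomial f of degree n is irreducible iff gcd(f, x^{2^i} − x) = 1 for all i ≤ n/2, i.e. iff f has no irreducible factor of degree at most n/2.

```python
def pmulmod(a, b, f):
    """Product of GF(2)[x] polynomials a*b reduced mod f (ints as bitmasks,
    bit i = coefficient of x^i; a is assumed already reduced mod f)."""
    deg = f.bit_length() - 1
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:
            a ^= f
    return r

def pgcd(a, b):
    """Euclidean gcd in GF(2)[x]."""
    while b:
        while a and a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        a, b = b, a
    return a

def is_irreducible(f):
    """Ben-Or's irreducibility test over GF(2)."""
    n = f.bit_length() - 1
    h = 2                               # the polynomial x
    for _ in range(n // 2):
        h = pmulmod(h, h, f)            # h = x^(2^i) mod f by repeated squaring
        if pgcd(f, h ^ 2) != 1:         # h - x over GF(2) is h XOR x
            return False
    return True

print(is_irreducible(0b111))   # x^2 + x + 1 -> True
print(is_irreducible(0x11B))   # x^8 + x^4 + x^3 + x + 1 (AES modulus) -> True
print(is_irreducible(0b110))   # x^2 + x = x(x + 1) -> False
```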

Journal ArticleDOI
TL;DR: In this paper, the authors describe a new method for proving the polynomial-time convergence of an algorithm for sampling (almost) uniformly at random from a convex body in high dimension.
Abstract: In this paper we describe a new method for proving the polynomial-time convergence of an algorithm for sampling (almost) uniformly at random from a convex body in high dimension. Previous approaches have been based on estimating conductance via isoperimetric inequalities. We show that a more elementary coupling argument can be used to give a similar result. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 12, 213–235, 1998
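
For orientation, here is a generic ball-walk sampler of the kind such analyses concern (a sketch under our own parameter choices, not the paper's specific walk or its coupling proof):

```python
import math
import random

def ball_walk(inside, x0, radius, steps, rng=random):
    """Ball walk for approximately uniform sampling from a convex body:
    from x, propose x + (uniform random point in a ball of the given
    radius); accept only if the proposal stays inside the body."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]   # uniform direction
        norm = math.sqrt(sum(v * v for v in g))
        r = radius * rng.random() ** (1.0 / n)        # uniform ball radius
        y = [xi + r * gi / norm for xi, gi in zip(x, g)]
        if inside(y):
            x = y
    return x

# Example: the unit cube [0, 1]^10 as the convex body.
cube = lambda p: all(0.0 <= v <= 1.0 for v in p)
print(ball_walk(cube, [0.5] * 10, radius=0.3, steps=5000))
```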

Journal ArticleDOI
TL;DR: In this article, it was shown that for every r and d ≥ 2 there is a C such that, for most choices of d permutations p_1, p_2, …, p_d of S_n, the following holds for any two r-tuples of distinct …
Abstract: We prove that for every r and d ≥ 2 there is a C such that for most choices of d permutations p_1, p_2, …, p_d of S_n, the following holds: for any two r-tuples of distinct …


Journal ArticleDOI
TL;DR: It is shown that if a graph G has minimum degree at least |V|/2 + 10h^4√(|V| log |V|) and h−1 divides |E|, then the edges of G can be decomposed into copies of a given tree H on h vertices; this result is asymptotically best possible for all trees with at least three vertices.
Abstract: Let H be a tree on h≥2 vertices. It is shown that if G=(V, E) is a graph with \delta (G)\ge (|V|/2)+10h^4\sqrt{|V|\log|V|}, and h−1 divides |E|, then there is a decomposition of the edges of G into copies of H. This result is asymptotically the best possible for all trees with at least three vertices. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 12, 237–251, 1998