
Showing papers by "Eric Blais published in 2019"


Posted Content · DOI
TL;DR: An optimal superlinear direct-sum-type theorem for randomized query complexity is obtained: there exists a function $f$ for which $\mathrm{R}(f^k) = \Theta(k \log k \cdot \mathrm{R}(f))$, answering an open question of Drucker (2012).
Abstract: We establish two results regarding the query complexity of bounded-error randomized algorithms. * Bounded-error separation theorem. There exists a total function $f : \{0,1\}^n \to \{0,1\}$ whose $\epsilon$-error randomized query complexity satisfies $\overline{\mathrm{R}}_\epsilon(f) = \Omega( \mathrm{R}(f) \cdot \log\frac1\epsilon)$. * Strong direct sum theorem. For every function $f$ and every $k \ge 2$, the randomized query complexity of computing $k$ instances of $f$ simultaneously satisfies $\overline{\mathrm{R}}_\epsilon(f^k) = \Theta(k \cdot \overline{\mathrm{R}}_{\frac\epsilon k}(f))$. As a consequence of our two main results, we obtain an optimal superlinear direct-sum-type theorem for randomized query complexity: there exists a function $f$ for which $\mathrm{R}(f^k) = \Theta( k \log k \cdot \mathrm{R}(f))$. This answers an open question of Drucker (2012). Combining this result with the query-to-communication complexity lifting theorem of Goos, Pitassi, and Watson (2017), this also shows that there is a total function whose public-coin randomized communication complexity satisfies $\mathrm{R}^{\mathrm{cc}} (f^k) = \Theta( k \log k \cdot \mathrm{R}^{\mathrm{cc}}(f))$, answering a question of Feder, Kushilevitz, Naor, and Nisan (1995).
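
The superlinear bound follows by chaining the two theorems; a sketch of the calculation (under either common reading of $\overline{\mathrm{R}}$, worst-case or expected query count, it agrees with $\mathrm{R}$ up to constant factors at constant error):

$$\mathrm{R}(f^k) \;=\; \Theta\big(\overline{\mathrm{R}}_{1/3}(f^k)\big) \;=\; \Theta\big(k \cdot \overline{\mathrm{R}}_{1/(3k)}(f)\big) \;=\; \Theta\big(k \log k \cdot \mathrm{R}(f)\big),$$

where the middle step is the strong direct sum theorem with $\epsilon = 1/3$, and the last step combines the separation theorem (for the lower bound) with the standard amplification bound $\overline{\mathrm{R}}_{\delta}(f) = O(\mathrm{R}(f) \cdot \log\frac1\delta)$ (for the upper bound), applied to the separating function $f$.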

17 citations


Journal Article · DOI
TL;DR: It is proved that the sample complexity of testing identity to a known distribution is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre's K-functional; this result stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in the reduction.
Abstract: We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [15], we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds. Our main result is concerned with testing identity to a specific distribution, p, given as a parameter. In a recent and influential work, Valiant and Valiant [55] showed that the sample complexity of the aforementioned problem is closely related to the $\ell_{2/3}$-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre’s K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.
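
As a small illustration of the "effective support" measure described above, the sketch below computes one natural formalization of it: the fewest elements whose total mass is at least $1-\epsilon$. The name `effective_support_size` and this exact definition are illustrative assumptions; the paper's precise quantity is the one given by Peetre's K-functional.

```python
import numpy as np

def effective_support_size(p, eps=0.1):
    """Smallest number of elements of p whose total mass is at least 1 - eps.

    One natural reading of the 'effective support' from the abstract (the
    elements carrying the vast majority of p's mass); the paper's precise
    definition, via Peetre's K-functional, may differ in the details.
    """
    p = np.sort(np.asarray(p, dtype=float))[::-1]   # heaviest elements first
    cumulative = np.cumsum(p)
    return int(np.searchsorted(cumulative, 1.0 - eps) + 1)

# Example: a heavy 10-element head carrying 90% of the mass, plus a long
# light tail of 1000 elements sharing the remaining 10%.
head = np.full(10, 0.09)
tail = np.full(1000, 0.0001)
p = np.concatenate([head, tail])
print(effective_support_size(p, eps=0.15))   # -> 10: the tail is negligible
```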

13 citations


Journal Article · DOI
TL;DR: An algorithm is designed that solves the problem of tolerant testing of k-juntas via a new polynomial-time approximation algorithm for submodular function minimization (SFM) under large cardinality constraints, which holds even when given only approximate oracle access to the function.
Abstract: A function $f\colon \{-1,1\}^n \to \{-1,1\}$ is a k-junta if it depends on at most k of its variables. We consider the problem of tolerant testing of k-juntas, where the testing algorithm must accept any function that is $\epsilon$-close to some k-junta and reject any function that is $\epsilon'$-far from every k'-junta for some $\epsilon' = O(\epsilon)$ and $k' = O(k)$. Our first result is an algorithm that solves this problem with query complexity polynomial in k and $1/\epsilon$. This result is obtained via a new polynomial-time approximation algorithm for submodular function minimization (SFM) under large cardinality constraints, which holds even when given only approximate oracle access to the function. Our second result considers the case where $k' = k$. We show how to obtain a smooth tradeoff between the amount of tolerance and the query complexity in this setting. Specifically, we design an algorithm that, given $\rho \in (0,1)$, accepts any function that is $\epsilon\rho/16$-close to some k-junta and rejects any function that is $\epsilon$-far from every k-junta. The query complexity of the algorithm is $O\!\left(\frac{k \log k}{\epsilon \rho (1-\rho)^k}\right)$. Finally, we show how to apply the second result to the problem of tolerant isomorphism testing between two unknown Boolean functions f and g. We give an algorithm for this problem whose query complexity only depends on the (unknown) smallest k such that either f or g is close to being a k-junta.
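
A common primitive behind junta testers is estimating the influence of a set of coordinates: $f$ is a k-junta exactly when some set of $n-k$ coordinates has influence zero, and tolerant testers work with noisy estimates of such influences. The Monte Carlo sketch below shows only this primitive, not the paper's algorithm (which additionally relies on approximate submodular function minimization); the function name is illustrative.

```python
import random

def estimate_set_influence(f, n, S, trials=10_000):
    """Monte Carlo estimate of Inf_f(S) = Pr[f(x) != f(y)], where x is uniform
    over {-1,1}^n and y agrees with x outside S but has the coordinates in S
    re-randomized."""
    disagreements = 0
    for _ in range(trials):
        x = [random.choice((-1, 1)) for _ in range(n)]
        y = list(x)
        for i in S:
            y[i] = random.choice((-1, 1))   # re-randomize only coordinates in S
        if f(x) != f(y):
            disagreements += 1
    return disagreements / trials

# Example: a 3-junta on 10 variables (depends only on coordinates 0, 1, 2).
f = lambda x: x[0] * x[1] * x[2]
print(estimate_set_influence(f, 10, S=range(3, 10)))  # ~0.0: no dependence on S
print(estimate_set_influence(f, 10, S=range(0, 3)))   # ~0.5: f depends on S
```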

12 citations


Posted Content
TL;DR: This work studies how to generate box covers that contain small-size certificates to guarantee efficient runtimes for the beyond worst-case optimal join algorithms Minesweeper and Tetris, and combines ADORA and GAMB with Tetris to form a new algorithm, TetrisReordered, which provides several new beyond worst-case bounds.
Abstract: Recent beyond worst-case optimal join algorithms Minesweeper and its generalization Tetris have brought the theory of indexing and join processing together by developing a geometric framework for joins. These algorithms take as input an index $\mathcal{B}$, referred to as a box cover, that stores output gaps that can be inferred from traditional indexes, such as B+ trees or tries, on the input relations. The performance of these algorithms depends heavily on the certificate of $\mathcal{B}$, which is the smallest subset of gaps in $\mathcal{B}$ whose union covers all of the gaps in the output space of a query $Q$. We study how to generate box covers that contain small-size certificates to guarantee efficient runtimes for these algorithms. First, given a query $Q$ over a set of relations of size $N$ and a fixed set of domain orderings for the attributes, we give a $\tilde{O}(N)$-time algorithm called GAMB which generates a box cover for $Q$ that is guaranteed to contain the smallest size certificate across any box cover for $Q$. Second, we show that finding a domain ordering to minimize the box cover size and certificate is NP-hard through a reduction from the 2-consecutive block minimization problem on Boolean matrices. Our third contribution is a $\tilde{O}(N)$-time approximation algorithm called ADORA to compute domain orderings, under which one can compute a box cover of size $\tilde{O}(K^r)$, where $K$ is the minimum box cover size for $Q$ under any domain ordering and $r$ is the maximum arity of any relation. This guarantees certificates of size $\tilde{O}(K^r)$. We combine ADORA and GAMB with Tetris to form a new algorithm we call TetrisReordered, which provides several new beyond worst-case bounds. On infinite families of queries, TetrisReordered's runtimes are unboundedly better than the bounds stated in prior work.
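
To make the notions of gaps, box covers, and certificates concrete, here is a deliberately tiny single-attribute sketch: the maximal empty intervals of each sorted relation act as one-dimensional "boxes", and together they cover every domain value outside the join output. The function name and the one-dimensional setting are illustrative simplifications of the multi-attribute geometric framework in the paper.

```python
def gap_intervals(relation, domain_min, domain_max):
    """Maximal intervals of the integer domain that contain no tuple of `relation`.

    These are one-dimensional analogues of the gap boxes that a sorted index
    (e.g., a B+ tree) exposes for free."""
    vals = sorted(set(relation))
    gaps, prev = [], domain_min - 1
    for v in vals:
        if v - prev > 1:
            gaps.append((prev + 1, v - 1))
        prev = v
    if domain_max > prev:
        gaps.append((prev + 1, domain_max))
    return gaps

# Toy single-attribute join R(A) joined with S(A) over the domain {0, ..., 15}.
R = [2, 3, 7, 11]
S = [3, 8, 11, 14]
box_cover = gap_intervals(R, 0, 15) + gap_intervals(S, 0, 15)

# Every value outside the join output lies in some gap box, so a geometric join
# algorithm can rule it out without probing the data; a certificate is a smallest
# subset of these boxes that still covers everything outside the output.
output = sorted(set(R) & set(S))                                   # [3, 11]
covered = {v for lo, hi in box_cover for v in range(lo, hi + 1)}
assert covered == set(range(0, 16)) - set(output)
print(output, box_cover)
```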

6 citations


DOI
17 Jul 2019
TL;DR: In this paper, it was shown that for every function f and every k ≥ 2, the randomized query complexity of computing k instances of f simultaneously satisfies $\overline{\mathrm{R}}_\epsilon(f^k) = \Theta(k \cdot \overline{\mathrm{R}}_{\epsilon/k}(f))$.
Abstract: We establish two results regarding the query complexity of bounded-error randomized algorithms. Bounded-error separation theorem. There exists a total function $f : \{0,1\}^n \to \{0,1\}$ whose $\epsilon$-error randomized query complexity satisfies $\overline{\mathrm{R}}_\epsilon(f) = \Omega(\mathrm{R}(f) \cdot \log\frac1\epsilon)$. Strong direct sum theorem. For every function $f$ and every $k \ge 2$, the randomized query complexity of computing $k$ instances of $f$ simultaneously satisfies $\overline{\mathrm{R}}_\epsilon(f^k) = \Theta(k \cdot \overline{\mathrm{R}}_{\frac\epsilon k}(f))$. As a consequence of our two main results, we obtain an optimal superlinear direct-sum-type theorem for randomized query complexity: there exists a function $f$ for which $\mathrm{R}(f^k) = \Theta(k \log k \cdot \mathrm{R}(f))$. This answers an open question of Drucker (2012). Combining this result with the query-to-communication complexity lifting theorem of Goos, Pitassi, and Watson (2017), this also shows that there is a total function whose public-coin randomized communication complexity satisfies $\mathrm{R}^{\mathrm{cc}}(f^k) = \Theta(k \log k \cdot \mathrm{R}^{\mathrm{cc}}(f))$, answering a question of Feder, Kushilevitz, Naor, and Nisan (1995).

4 citations


Journal Article · DOI
TL;DR: It is shown that a property of Boolean-valued functions on a finite domain $\mathcal{X}$ is testable with a constant number of samples if and only if it is (essentially) a $k$-part symmetric property for some constant $k$.
Abstract: We characterize the set of properties of Boolean-valued functions on a finite domain $\mathcal{X}$ that are testable with a constant number of samples. Specifically, we show that a property $\mathcal{P}$ is testable with a constant number of samples if and only if it is (essentially) a $k$-part symmetric property for some constant $k$, where a property is {\em $k$-part symmetric} if there is a partition $S_1,\ldots,S_k$ of $\mathcal{X}$ such that whether $f:\mathcal{X} \to \{0,1\}$ satisfies the property is determined solely by the densities of $f$ on $S_1,\ldots,S_k$. We use this characterization to obtain a number of corollaries, namely: (i) A graph property $\mathcal{P}$ is testable with a constant number of samples if and only if whether a graph $G$ satisfies $\mathcal{P}$ is (essentially) determined by the edge density of $G$. (ii) An affine-invariant property $\mathcal{P}$ of functions $f:\mathbb{F}_p^n \to \{0,1\}$ is testable with a constant number of samples if and only if whether $f$ satisfies $\mathcal{P}$ is (essentially) determined by the density of $f$. (iii) For every constant $d \geq 1$, monotonicity of functions $f : [n]^d \to \{0, 1\}$ on the $d$-dimensional hypergrid is testable with a constant number of samples.
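
The characterization says that, for a $k$-part symmetric property, everything a sample-based tester needs is the density of $f$ on each part $S_1,\ldots,S_k$, and constantly many samples suffice to estimate these. The sketch below shows only this estimation step; the acceptance rule is property-specific, and the names `sample_oracle` and `partition_of` are illustrative assumptions.

```python
import random

def estimate_part_densities(sample_oracle, partition_of, k, num_samples=1000):
    """Estimate the density of f on each part S_1, ..., S_k from labeled samples.

    `sample_oracle()` returns a pair (x, f(x)) with x uniform over the domain,
    and `partition_of(x)` returns the index in {0, ..., k-1} of the part
    containing x.  For a k-part symmetric property, these k densities
    (essentially) determine whether f satisfies the property."""
    ones = [0] * k
    totals = [0] * k
    for _ in range(num_samples):
        x, fx = sample_oracle()
        i = partition_of(x)
        totals[i] += 1
        ones[i] += fx
    return [ones[i] / totals[i] if totals[i] else 0.0 for i in range(k)]

# Toy example: domain {0, ..., 99} split into evens (part 0) and odds (part 1),
# with f the indicator of "x < 50".
domain = list(range(100))
f = lambda x: int(x < 50)

def sample_oracle():
    x = random.choice(domain)
    return x, f(x)

print(estimate_part_densities(sample_oracle, lambda x: x % 2, k=2))  # ~[0.5, 0.5]
```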

3 citations


Posted Content
TL;DR: This work provides a simplified version of the non-adaptive convexity tester on the line, and establishes new upper and lower bounds on the number of queries required to test convexity of functions over various discrete domains.
Abstract: We establish new upper and lower bounds on the number of queries required to test convexity of functions over various discrete domains. 1. We provide a simplified version of the non-adaptive convexity tester on the line. We re-prove the upper bound $O(\frac{\log(\epsilon n)}{\epsilon})$ in the usual uniform model, and prove an $O(\frac{\log n}{\epsilon})$ upper bound in the distribution-free setting. 2. We show a tight lower bound of $\Omega(\frac{\log(\epsilon n)}{\epsilon})$ queries for testing convexity of functions $f: [n] \rightarrow \mathbb{R}$ on the line. This lower bound applies to both adaptive and non-adaptive algorithms, and matches the upper bound from item 1, showing that adaptivity does not help in this setting. 3. Moving to higher dimensions, we consider the case of a stripe $[3] \times [n]$. We construct an \emph{adaptive} tester for convexity of functions $f\colon [3] \times [n] \to \mathbb R$ with query complexity $O(\log^2 n)$. We also show that any \emph{non-adaptive} tester must use $\Omega(\sqrt{n})$ queries in this setting. Thus, adaptivity yields an exponential improvement for this problem. 4. For functions $f\colon [n]^d \to \mathbb R$ over domains of dimension $d \geq 2$, we show a non-adaptive query lower bound $\Omega((\frac{n}{d})^{\frac{d}{2}})$.
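
For intuition about what a convexity tester looks for on the line: $f\colon [n] \to \mathbb{R}$ is convex exactly when its successive differences $f(i+1) - f(i)$ are non-decreasing, equivalently when for every triple $i < j < k$ the slope over $[i, j]$ is at most the slope over $[j, k]$, so any triple with a decreasing slope is a witness of non-convexity. The naive triple-sampling sketch below only illustrates this violation structure; it does not achieve the paper's query bounds, and its name and sample sizes are illustrative.

```python
import random

def naive_convexity_tester(f, n, num_triples=200):
    """Reject if some sampled triple i < j < k violates discrete convexity,
    i.e. the slope from i to j exceeds the slope from j to k."""
    for _ in range(num_triples):
        i, j, k = sorted(random.sample(range(n), 3))
        # Cross-multiplied slope comparison (denominators are positive).
        if (f(j) - f(i)) * (k - j) > (f(k) - f(j)) * (j - i):
            return "reject"   # concrete witness that f is not convex
    return "accept"

# Convex example vs. a clearly non-convex (concave) example on [100].
print(naive_convexity_tester(lambda x: (x - 50) ** 2, 100))   # accept
print(naive_convexity_tester(lambda x: -abs(x - 50), 100))    # almost surely reject
```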

2 citations