
Showing papers by "Eric Blais published in 2019"


Posted Content · DOI
TL;DR: An optimal superlinear direct-sum-type theorem for randomized query complexity is obtained: there exists a function $f$ for which $\mathrm{R}(f^k) = \Theta(k \log k \cdot \mathrm{R}(f))$, answering an open question of Drucker (2012).
Abstract: We establish two results regarding the query complexity of bounded-error randomized algorithms. * Bounded-error separation theorem. There exists a total function $f : \{0,1\}^n \to \{0,1\}$ whose $\epsilon$-error randomized query complexity satisfies $\overline{\mathrm{R}}_\epsilon(f) = \Omega( \mathrm{R}(f) \cdot \log\frac1\epsilon)$. * Strong direct sum theorem. For every function $f$ and every $k \ge 2$, the randomized query complexity of computing $k$ instances of $f$ simultaneously satisfies $\overline{\mathrm{R}}_\epsilon(f^k) = \Theta(k \cdot \overline{\mathrm{R}}_{\frac\epsilon k}(f))$. As a consequence of our two main results, we obtain an optimal superlinear direct-sum-type theorem for randomized query complexity: there exists a function $f$ for which $\mathrm{R}(f^k) = \Theta( k \log k \cdot \mathrm{R}(f))$. This answers an open question of Drucker (2012). Combining this result with the query-to-communication complexity lifting theorem of Goos, Pitassi, and Watson (2017), this also shows that there is a total function whose public-coin randomized communication complexity satisfies $\mathrm{R}^{\mathrm{cc}} (f^k) = \Theta( k \log k \cdot \mathrm{R}^{\mathrm{cc}}(f))$, answering a question of Feder, Kushilevitz, Naor, and Nisan (1995).
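
The superlinear bound follows by chaining the two theorems; a sketch of the calculation (under either common reading of $\overline{\mathrm{R}}$, worst-case or expected query count, it agrees with $\mathrm{R}$ up to constant factors at constant error):

$$\mathrm{R}(f^k) \;=\; \Theta\big(\overline{\mathrm{R}}_{1/3}(f^k)\big) \;=\; \Theta\big(k \cdot \overline{\mathrm{R}}_{1/(3k)}(f)\big) \;=\; \Theta\big(k \log k \cdot \mathrm{R}(f)\big),$$

where the middle step is the strong direct sum theorem with $\epsilon = 1/3$, and the last step combines the separation theorem (for the lower bound) with the standard amplification bound $\overline{\mathrm{R}}_{\delta}(f) = O(\mathrm{R}(f) \cdot \log\frac1\delta)$ (for the upper bound), applied to the separating function $f$.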

17 citations


Journal Article · DOI
TL;DR: It is proved that the sample complexity of testing identity to a known distribution is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre's K-functional; this result stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in the reduction.
Abstract: We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [15], we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds. Our main result is concerned with testing identity to a specific distribution, p, given as a parameter. In a recent and influential work, Valiant and Valiant [55] showed that the sample complexity of the aforementioned problem is closely related to the $\ell_{2/3}$-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre’s K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.
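
As a small illustration of the "effective support" measure described above, the sketch below computes one natural formalization of it: the fewest elements whose total mass is at least $1-\epsilon$. The name `effective_support_size` and this exact definition are illustrative assumptions; the paper's precise quantity is the one given by Peetre's K-functional.

```python
import numpy as np

def effective_support_size(p, eps=0.1):
    """Smallest number of elements of p whose total mass is at least 1 - eps.

    One natural reading of the 'effective support' from the abstract (the
    elements carrying the vast majority of p's mass); the paper's precise
    definition, via Peetre's K-functional, may differ in the details.
    """
    p = np.sort(np.asarray(p, dtype=float))[::-1]   # heaviest elements first
    cumulative = np.cumsum(p)
    return int(np.searchsorted(cumulative, 1.0 - eps) + 1)

# Example: a heavy 10-element head carrying 90% of the mass, plus a long
# light tail of 1000 elements sharing the remaining 10%.
head = np.full(10, 0.09)
tail = np.full(1000, 0.0001)
p = np.concatenate([head, tail])
print(effective_support_size(p, eps=0.15))   # -> 10: the tail is negligible
```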

13 citations


Journal Article · DOI
TL;DR: An algorithm is designed that solves the problem of tolerant testing of k-juntas via a new polynomial-time approximation algorithm for submodular function minimization (SFM) under large cardinality constraints, which holds even when given only approximate oracle access to the function.
Abstract: A function $f\colon \{-1,1\}^n \to \{-1,1\}$ is a k-junta if it depends on at most k of its variables. We consider the problem of tolerant testing of k-juntas, where the testing algorithm must accept any function that is $\epsilon$-close to some k-junta and reject any function that is $\epsilon'$-far from every k'-junta for some $\epsilon' = O(\epsilon)$ and $k' = O(k)$. Our first result is an algorithm that solves this problem with query complexity polynomial in k and $1/\epsilon$. This result is obtained via a new polynomial-time approximation algorithm for submodular function minimization (SFM) under large cardinality constraints, which holds even when given only approximate oracle access to the function. Our second result considers the case where $k' = k$. We show how to obtain a smooth tradeoff between the amount of tolerance and the query complexity in this setting. Specifically, we design an algorithm that, given $\rho \in (0,1)$, accepts any function that is $\epsilon\rho/16$-close to some k-junta and rejects any function that is $\epsilon$-far from every k-junta. The query complexity of the algorithm is $O\!\left(\frac{k \log k}{\epsilon \rho (1-\rho)^k}\right)$. Finally, we show how to apply the second result to the problem of tolerant isomorphism testing between two unknown Boolean functions f and g. We give an algorithm for this problem whose query complexity only depends on the (unknown) smallest k such that either f or g is close to being a k-junta.
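
A common primitive behind junta testers is estimating the influence of a set of coordinates: $f$ is a k-junta exactly when some set of $n-k$ coordinates has influence zero, and tolerant testers work with noisy estimates of such influences. The Monte Carlo sketch below shows only this primitive, not the paper's algorithm (which additionally relies on approximate submodular function minimization); the function name is illustrative.

```python
import random

def estimate_set_influence(f, n, S, trials=10_000):
    """Monte Carlo estimate of Inf_f(S) = Pr[f(x) != f(y)], where x is uniform
    over {-1,1}^n and y agrees with x outside S but has the coordinates in S
    re-randomized."""
    disagreements = 0
    for _ in range(trials):
        x = [random.choice((-1, 1)) for _ in range(n)]
        y = list(x)
        for i in S:
            y[i] = random.choice((-1, 1))   # re-randomize only coordinates in S
        if f(x) != f(y):
            disagreements += 1
    return disagreements / trials

# Example: a 3-junta on 10 variables (depends only on coordinates 0, 1, 2).
f = lambda x: x[0] * x[1] * x[2]
print(estimate_set_influence(f, 10, S=range(3, 10)))  # ~0.0: no dependence on S
print(estimate_set_influence(f, 10, S=range(0, 3)))   # ~0.5: f depends on S
```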

12 citations


Posted Content
TL;DR: This work studies how to generate box covers that contain small-size certificates to guarantee efficient runtimes for the beyond worst-case optimal join algorithms Minesweeper and Tetris, and combines ADORA and GAMB with Tetris to form a new algorithm, TetrisReordered, which provides several new beyond worst-case bounds.
Abstract: Recent beyond worst-case optimal join algorithms Minesweeper and its generalization Tetris have brought the theory of indexing and join processing together by developing a geometric framework for joins. These algorithms take as input an index $\mathcal{B}$, referred to as a box cover, that stores output gaps that can be inferred from traditional indexes, such as B+ trees or tries, on the input relations. The performance of these algorithms depends heavily on the certificate of $\mathcal{B}$, which is the smallest subset of gaps in $\mathcal{B}$ whose union covers all of the gaps in the output space of a query $Q$. We study how to generate box covers that contain small-size certificates to guarantee efficient runtimes for these algorithms. First, given a query $Q$ over a set of relations of size $N$ and a fixed set of domain orderings for the attributes, we give a $\tilde{O}(N)$-time algorithm called GAMB which generates a box cover for $Q$ that is guaranteed to contain the smallest size certificate across any box cover for $Q$. Second, we show that finding a domain ordering to minimize the box cover size and certificate is NP-hard through a reduction from the 2-consecutive block minimization problem on Boolean matrices. Our third contribution is a $\tilde{O}(N)$-time approximation algorithm called ADORA to compute domain orderings, under which one can compute a box cover of size $\tilde{O}(K^r)$, where $K$ is the minimum box cover size for $Q$ under any domain ordering and $r$ is the maximum arity of any relation. This guarantees certificates of size $\tilde{O}(K^r)$. We combine ADORA and GAMB with Tetris to form a new algorithm we call TetrisReordered, which provides several new beyond worst-case bounds. On infinite families of queries, TetrisReordered's runtimes are unboundedly better than the bounds stated in prior work.
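
To make the notions of gaps, box covers, and certificates concrete, here is a deliberately tiny single-attribute sketch: the maximal empty intervals of each sorted relation act as one-dimensional "boxes", and together they cover every domain value outside the join output. The function name and the one-dimensional setting are illustrative simplifications of the multi-attribute geometric framework in the paper.

```python
def gap_intervals(relation, domain_min, domain_max):
    """Maximal intervals of the integer domain that contain no tuple of `relation`.

    These are one-dimensional analogues of the gap boxes that a sorted index
    (e.g., a B+ tree) exposes for free."""
    vals = sorted(set(relation))
    gaps, prev = [], domain_min - 1
    for v in vals:
        if v - prev > 1:
            gaps.append((prev + 1, v - 1))
        prev = v
    if domain_max > prev:
        gaps.append((prev + 1, domain_max))
    return gaps

# Toy single-attribute join R(A) joined with S(A) over the domain {0, ..., 15}.
R = [2, 3, 7, 11]
S = [3, 8, 11, 14]
box_cover = gap_intervals(R, 0, 15) + gap_intervals(S, 0, 15)

# Every value outside the join output lies in some gap box, so a geometric join
# algorithm can rule it out without probing the data; a certificate is a smallest
# subset of these boxes that still covers everything outside the output.
output = sorted(set(R) & set(S))                                   # [3, 11]
covered = {v for lo, hi in box_cover for v in range(lo, hi + 1)}
assert covered == set(range(0, 16)) - set(output)
print(output, box_cover)
```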

6 citations


DOI
17 Jul 2019
TL;DR: In this paper, it was shown that for every function f and every k ≥ 2, the randomized query complexity of computing k instances of f simultaneously satisfies $\overline{\mathrm{R}}_\epsilon(f^k) = \Theta(k \cdot \overline{\mathrm{R}}_{\epsilon/k}(f))$.
Abstract: We establish two results regarding the query complexity of bounded-error randomized algorithms. Bounded-error separation theorem. There exists a total function $f : \{0,1\}^n \to \{0,1\}$ whose $\epsilon$-error randomized query complexity satisfies $\overline{\mathrm{R}}_\epsilon(f) = \Omega(\mathrm{R}(f) \cdot \log\frac1\epsilon)$. Strong direct sum theorem. For every function $f$ and every $k \ge 2$, the randomized query complexity of computing $k$ instances of $f$ simultaneously satisfies $\overline{\mathrm{R}}_\epsilon(f^k) = \Theta(k \cdot \overline{\mathrm{R}}_{\frac\epsilon k}(f))$. As a consequence of our two main results, we obtain an optimal superlinear direct-sum-type theorem for randomized query complexity: there exists a function $f$ for which $\mathrm{R}(f^k) = \Theta(k \log k \cdot \mathrm{R}(f))$. This answers an open question of Drucker (2012). Combining this result with the query-to-communication complexity lifting theorem of Goos, Pitassi, and Watson (2017), this also shows that there is a total function whose public-coin randomized communication complexity satisfies $\mathrm{R}^{\mathrm{cc}}(f^k) = \Theta(k \log k \cdot \mathrm{R}^{\mathrm{cc}}(f))$, answering a question of Feder, Kushilevitz, Naor, and Nisan (1995).

4 citations


Journal Article · DOI
TL;DR: It is shown that a property of Boolean-valued functions on a finite domain $\mathcal{X}$ is testable with a constant number of samples if and only if it is (essentially) a $k$-part symmetric property for some constant $k$.
Abstract: We characterize the set of properties of Boolean-valued functions on a finite domain $\mathcal{X}$ that are testable with a constant number of samples. Specifically, we show that a property $\mathcal{P}$ is testable with a constant number of samples if and only if it is (essentially) a $k$-part symmetric property for some constant $k$, where a property is {\em $k$-part symmetric} if there is a partition $S_1,\ldots,S_k$ of $\mathcal{X}$ such that whether $f:\mathcal{X} \to \{0,1\}$ satisfies the property is determined solely by the densities of $f$ on $S_1,\ldots,S_k$. We use this characterization to obtain a number of corollaries, namely: (i) A graph property $\mathcal{P}$ is testable with a constant number of samples if and only if whether a graph $G$ satisfies $\mathcal{P}$ is (essentially) determined by the edge density of $G$. (ii) An affine-invariant property $\mathcal{P}$ of functions $f:\mathbb{F}_p^n \to \{0,1\}$ is testable with a constant number of samples if and only if whether $f$ satisfies $\mathcal{P}$ is (essentially) determined by the density of $f$. (iii) For every constant $d \geq 1$, monotonicity of functions $f : [n]^d \to \{0, 1\}$ on the $d$-dimensional hypergrid is testable with a constant number of samples.
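
The characterization says that, for a $k$-part symmetric property, everything a sample-based tester needs is the density of $f$ on each part $S_1,\ldots,S_k$, and constantly many samples suffice to estimate these. The sketch below shows only this estimation step; the acceptance rule is property-specific, and the names `sample_oracle` and `partition_of` are illustrative assumptions.

```python
import random

def estimate_part_densities(sample_oracle, partition_of, k, num_samples=1000):
    """Estimate the density of f on each part S_1, ..., S_k from labeled samples.

    `sample_oracle()` returns a pair (x, f(x)) with x uniform over the domain,
    and `partition_of(x)` returns the index in {0, ..., k-1} of the part
    containing x.  For a k-part symmetric property, these k densities
    (essentially) determine whether f satisfies the property."""
    ones = [0] * k
    totals = [0] * k
    for _ in range(num_samples):
        x, fx = sample_oracle()
        i = partition_of(x)
        totals[i] += 1
        ones[i] += fx
    return [ones[i] / totals[i] if totals[i] else 0.0 for i in range(k)]

# Toy example: domain {0, ..., 99} split into evens (part 0) and odds (part 1),
# with f the indicator of "x < 50".
domain = list(range(100))
f = lambda x: int(x < 50)

def sample_oracle():
    x = random.choice(domain)
    return x, f(x)

print(estimate_part_densities(sample_oracle, lambda x: x % 2, k=2))  # ~[0.5, 0.5]
```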

3 citations


Posted Content
TL;DR: This work provides a simplified version of the non-adaptive convexity tester on the line, and establishes new upper and lower bounds on the number of queries required to test convexity of functions over various discrete domains.
Abstract: We establish new upper and lower bounds on the number of queries required to test convexity of functions over various discrete domains. 1. We provide a simplified version of the non-adaptive convexity tester on the line. We re-prove the upper bound $O(\frac{\log(\epsilon n)}{\epsilon})$ in the usual uniform model, and prove an $O(\frac{\log n}{\epsilon})$ upper bound in the distribution-free setting. 2. We show a tight lower bound of $\Omega(\frac{\log(\epsilon n)}{\epsilon})$ queries for testing convexity of functions $f: [n] \rightarrow \mathbb{R}$ on the line. This lower bound applies to both adaptive and non-adaptive algorithms, and matches the upper bound from item 1, showing that adaptivity does not help in this setting. 3. Moving to higher dimensions, we consider the case of a stripe $[3] \times [n]$. We construct an \emph{adaptive} tester for convexity of functions $f\colon [3] \times [n] \to \mathbb R$ with query complexity $O(\log^2 n)$. We also show that any \emph{non-adaptive} tester must use $\Omega(\sqrt{n})$ queries in this setting. Thus, adaptivity yields an exponential improvement for this problem. 4. For functions $f\colon [n]^d \to \mathbb R$ over domains of dimension $d \geq 2$, we show a non-adaptive query lower bound $\Omega((\frac{n}{d})^{\frac{d}{2}})$.
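
For intuition about what a convexity tester looks for on the line: $f\colon [n] \to \mathbb{R}$ is convex exactly when its successive differences $f(i+1) - f(i)$ are non-decreasing, equivalently when for every triple $i < j < k$ the slope over $[i, j]$ is at most the slope over $[j, k]$, so any triple with a decreasing slope is a witness of non-convexity. The naive triple-sampling sketch below only illustrates this violation structure; it does not achieve the paper's query bounds, and its name and sample sizes are illustrative.

```python
import random

def naive_convexity_tester(f, n, num_triples=200):
    """Reject if some sampled triple i < j < k violates discrete convexity,
    i.e. the slope from i to j exceeds the slope from j to k."""
    for _ in range(num_triples):
        i, j, k = sorted(random.sample(range(n), 3))
        # Cross-multiplied slope comparison (denominators are positive).
        if (f(j) - f(i)) * (k - j) > (f(k) - f(j)) * (j - i):
            return "reject"   # concrete witness that f is not convex
    return "accept"

# Convex example vs. a clearly non-convex (concave) example on [100].
print(naive_convexity_tester(lambda x: (x - 50) ** 2, 100))   # accept
print(naive_convexity_tester(lambda x: -abs(x - 50), 100))    # almost surely reject
```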

2 citations