
Showing papers by "Eric Blais published in 2016"


Proceedings ArticleDOI
19 Jun 2016
TL;DR: In this paper, it was shown that every (possibly adaptive) algorithm for testing monotonicity has query complexity Ω(n^{1/4}), and that there is an exponential gap between the query complexity of adaptive and non-adaptive algorithms for testing regular linear threshold functions (LTFs) for monotonicity.
Abstract: We show that every algorithm for testing n-variate Boolean functions for monotonicity has query complexity Ω(n^{1/4}). All previous lower bounds for this problem were designed for non-adaptive algorithms and, as a result, the best previous lower bound for general (possibly adaptive) monotonicity testers was only Ω(log n). Combined with the query complexity of the non-adaptive monotonicity tester of Khot, Minzer, and Safra (FOCS 2015), our lower bound shows that adaptivity can result in at most a quadratic reduction in the query complexity for testing monotonicity. By contrast, we show that there is an exponential gap between the query complexity of adaptive and non-adaptive algorithms for testing regular linear threshold functions (LTFs) for monotonicity. Chen, De, Servedio, and Tan (STOC 2015) recently showed that non-adaptive algorithms require almost Ω(n^{1/2}) queries for this task. We introduce a new adaptive monotonicity testing algorithm which has query complexity O(log n) when the input is a regular LTF.
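
For intuition about the problem setting, here is a minimal sketch of the classic non-adaptive "edge tester" for monotonicity on the hypercube. This is an illustration of the technique the lower bounds apply to, not the algorithm from the paper; the function interface and query count are our own assumptions.

```python
import random

def edge_tester(f, n, num_queries=1000):
    """Non-adaptive edge tester for monotonicity of f: {0,1}^n -> {0,1}.

    Samples random hypercube edges (x with bit i set to 0 vs. 1) and
    rejects if any sampled edge violates monotonicity. Always accepts
    monotone functions; the rejection probability for far-from-monotone
    functions depends on num_queries (illustrative, not tuned).
    """
    for _ in range(num_queries):
        x = [random.randint(0, 1) for _ in range(n)]
        i = random.randrange(n)
        lo, hi = list(x), list(x)
        lo[i], hi[i] = 0, 1
        if f(tuple(lo)) > f(tuple(hi)):  # f decreases along coordinate i
            return False  # reject: witnessed a violated edge
    return True  # accept
```

For example, `edge_tester(lambda x: int(sum(x) >= n // 2), n)` always accepts, since majority is monotone.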

63 citations


Proceedings ArticleDOI
16 Sep 2016
TL;DR: A novel mathematical model for performance-relevant (or, more generally, quantitative) feature interactions is proposed, based on the theory of Boolean functions, along with two algorithms for detecting all such interactions with little measurement effort and potentially guaranteed accuracy and confidence levels.
Abstract: Modern software systems have grown significantly in size and complexity, so understanding how a software system behaves when it has many configuration options, also called features, is no longer a trivial task. This is primarily due to the potentially complex interactions among the features. In this paper, we propose a novel mathematical model for performance-relevant (or, more generally, quantitative) feature interactions, based on the theory of Boolean functions. Moreover, we provide two algorithms for detecting all such interactions with little measurement effort and potentially guaranteed accuracy and confidence level. Empirical results on real-world configurable systems demonstrate the feasibility and effectiveness of our approach.
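
To make the Boolean-function view concrete, here is a minimal sketch (our own illustration; the paper's detection algorithms are not reproduced here) of how an interaction among a set of features shows up as a Fourier coefficient of the performance function. The `perf` interface, the ±1 encoding, and the sample count are assumptions.

```python
import random

def estimate_fourier_coefficient(perf, n, S, num_samples=2000):
    """Monte Carlo estimate of the Fourier coefficient of `perf` on the
    feature set S, where perf maps a configuration in {-1,+1}^n
    (feature off/on) to a measured performance value.

    A coefficient bounded away from zero on a set S with |S| >= 2
    witnesses a (quantitative) interaction among the features in S.
    """
    total = 0.0
    for _ in range(num_samples):
        x = [random.choice((-1, 1)) for _ in range(n)]
        chi = 1
        for i in S:
            chi *= x[i]  # character chi_S(x) = prod_{i in S} x_i
        total += perf(x) * chi
    return total / num_samples
```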

14 citations


Proceedings Article
06 Jun 2016
TL;DR: It is shown that it is possible to learn k-junta distributions with respect to the uniform distribution over the Boolean hypercube {0,1}^n in time poly(n, 1/ε).
Abstract: We consider the problem of learning distributions in the presence of irrelevant features. This problem is formalized by introducing a new notion of k-junta distributions. Informally, a distribution D over the domain X is a k-junta distribution with respect to another distribution U over the same domain if there is a set J ⊆ [n] of size |J| ≤ k that captures the difference between D and U. We show that it is possible to learn k-junta distributions with respect to the uniform distribution over the Boolean hypercube {0,1}^n in time poly(n, 1/ε). This result is obtained via a new Fourier-based learning algorithm inspired by the Low-Degree Algorithm of Linial, Mansour, and Nisan (1993). We also consider the problem of testing whether an unknown distribution is a k-junta distribution with respect to the uniform distribution. We give a nearly-optimal algorithm for this task. Both the analysis of the algorithm and the lower bound showing its optimality are obtained by establishing connections between the problem of testing junta distributions and testing uniformity of weighted collections of distributions.
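
For intuition about the Fourier-based approach, here is a minimal sketch of the coefficient-estimation step in the spirit of the Low-Degree Algorithm: an illustration of the general technique, not the paper's algorithm. The interface and sample handling are our own.

```python
from itertools import combinations

def estimate_density_coefficients(samples, n, k):
    """Estimate the Fourier coefficients of the density of an unknown
    distribution D over {-1,+1}^n on all sets of size <= k, using the
    identity hat{D}(S) = E_{x ~ D}[chi_S(x)], estimated by sample means.

    `samples` is a list of tuples in {-1,+1}^n drawn from D.
    """
    coeffs = {}
    m = len(samples)
    for size in range(k + 1):
        for S in combinations(range(n), size):
            total = 0
            for x in samples:
                chi = 1
                for i in S:
                    chi *= x[i]  # chi_S(x) = prod_{i in S} x_i
                total += chi
            coeffs[S] = total / m
    return coeffs
```

For a k-junta distribution, the coefficients supported outside the relevant set J are zero, so large estimated coefficients point to the relevant coordinates.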

14 citations


Posted Content
TL;DR: In this article, the authors characterize the set of properties of Boolean-valued functions on a finite domain that are testable with a constant number of samples, and obtain a number of corollaries.
Abstract: We characterize the set of properties of Boolean-valued functions on a finite domain $\mathcal{X}$ that are testable with a constant number of samples. Specifically, we show that a property $\mathcal{P}$ is testable with a constant number of samples if and only if it is (essentially) a $k$-part symmetric property for some constant $k$, where a property is {\em $k$-part symmetric} if there is a partition $S_1,\ldots,S_k$ of $\mathcal{X}$ such that whether $f:\mathcal{X} \to \{0,1\}$ satisfies the property is determined solely by the densities of $f$ on $S_1,\ldots,S_k$. We use this characterization to obtain a number of corollaries, namely: (i) A graph property $\mathcal{P}$ is testable with a constant number of samples if and only if whether a graph $G$ satisfies $\mathcal{P}$ is (essentially) determined by the edge density of $G$. (ii) An affine-invariant property $\mathcal{P}$ of functions $f:\mathbb{F}_p^n \to \{0,1\}$ is testable with a constant number of samples if and only if whether $f$ satisfies $\mathcal{P}$ is (essentially) determined by the density of $f$. (iii) For every constant $d \geq 1$, monotonicity of functions $f : [n]^d \to \{0, 1\}$ on the $d$-dimensional hypergrid is testable with a constant number of samples.
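
Since a $k$-part symmetric property is determined by the densities of $f$ on the parts, a sample-based tester only needs to estimate those $k$ densities. A minimal sketch of that estimation step, assuming labeled samples and a known partition (both our own illustration):

```python
def estimate_part_densities(samples, parts):
    """Estimate the density of f on each part S_1,...,S_k from labeled
    samples (x, f(x)) drawn uniformly from the domain.

    `parts` maps each domain point x to its part index in 0..k-1.
    Returns, per part, the fraction of sampled points in that part
    with f(x) = 1.
    """
    counts = {}  # part -> number of samples landing in the part
    ones = {}    # part -> number of those samples with f(x) = 1
    for x, fx in samples:
        j = parts(x)
        counts[j] = counts.get(j, 0) + 1
        ones[j] = ones.get(j, 0) + fx
    return {j: ones[j] / counts[j] for j in counts}
```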

9 citations


Journal Article
TL;DR: It is proved that the sample complexity of testing identity to a fixed distribution is essentially determined by Peetre's K-functional, a fundamental operator in the theory of interpolation of Banach spaces; the result stems from an unexpected connection to functional analysis and refined concentration-of-measure inequalities that arise naturally in the reduction.
Abstract: We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [BBM12], we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds. Our main result is concerned with testing identity to a specific distribution p, given as a parameter. In a recent and influential work, Valiant and Valiant [VV14] showed that the sample complexity of the aforementioned problem is closely related to the ℓ_{2/3}-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre's K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.
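
Loosely following the description above (the paper's precise definition may differ), here is a minimal sketch of computing the size of the effective support of a known distribution p: the fewest elements whose combined mass covers all but a delta fraction.

```python
def effective_support_size(p, delta):
    """Size of the smallest set of elements of the distribution p
    (given as a list of probabilities) whose total mass is >= 1 - delta.

    Greedily taking elements in decreasing order of mass is optimal
    for this objective.
    """
    mass = 0.0
    for count, prob in enumerate(sorted(p, reverse=True), start=1):
        mass += prob
        if mass >= 1 - delta:
            return count
    return len(p)
```

For example, `effective_support_size([0.5, 0.3, 0.1, 0.1], 0.2)` returns 2, since the two heaviest elements already carry 0.8 of the mass.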

7 citations


Book ChapterDOI
18 Feb 2016
TL;DR: This chapter focuses on a formalization of approximate solutions that has been widely studied in an area of theoretical computer science known as property testing, and on the close connection between property testing and general parameter estimation.
Abstract: What computational problems can we solve when we only have time to look at a tiny fraction of the data? This general question has been studied from many different angles in statistics. More recently, with the proliferation of massive datasets, it has also become a central question in computer science. Essentially all of the research on this question starts with a simple observation: except for a very small number of special cases, the only problems that can be solved in this very restrictive setting are those that admit approximate solutions. In this chapter, we focus on a formalization of approximate solutions that has been widely studied in an area of theoretical computer science known as property testing. Let X denote the underlying dataset, and consider the setting where this dataset represents a combinatorial object. Let P be any property of this type of combinatorial object. We say that X is ε-close to having property P if we can modify at most an ε fraction of X to obtain the description X′ of an object that does have property P; otherwise we say that X is ε-far from having the property. A randomized algorithm A is an ε-tester for P if it can distinguish with large constant probability between datasets that represent objects with the property P and those that are ε-far from having the same property. (The algorithm A is free to output anything on inputs that don't have the property P but are also not ε-far from having this property; it is this leeway that enables property testers to be so efficient.) There is a close connection between property testing and general parameter estimation. Let X be a dataset and θ = θ(X) be any parameter of this dataset. For every threshold t, we can define the property of having θ ≤ t. If we have an efficient algorithm for testing this property, we can also use it to efficiently obtain an estimate θ̂ that is close to θ in the sense that θ̂ ≤ θ and the underlying object X is ε-close to another dataset X′ with θ(X′) = θ̂. Note that this notion of closeness is very different from the notions usually considered in parameter estimation; instead of measuring the quality of the estimate as a function L(θ, θ̂) of the true and estimated values of the parameter, here it is a function of the underlying dataset.
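
A minimal sketch of the testing-to-estimation reduction described above; the tester interface and the finite grid of candidate thresholds are our own assumptions.

```python
def estimate_parameter(tester, X, thresholds):
    """Turn a property tester into a parameter estimator.

    `tester(X, t)` is assumed to accept (return True) whenever
    theta(X) <= t and to reject whenever X is eps-far from every
    dataset X' with theta(X') <= t. Scanning candidate thresholds in
    increasing order and returning the first accepted one yields an
    estimate theta_hat that is close to theta in the dataset-distance
    sense described above.
    """
    for t in sorted(thresholds):
        if tester(X, t):
            return t
    return None  # no candidate threshold was accepted
```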

6 citations



Posted Content
TL;DR: It is shown that for any constant $\epsilon > 0$ and $p \ge 1$, it is possible to distinguish functions that are submodular from those that are $\epsilon$-far from every submodular function in $\ell_p$ distance with a constant number of queries.
Abstract: We show that for any constant $\epsilon > 0$ and $p \ge 1$, it is possible to distinguish functions $f : \{0,1\}^n \to [0,1]$ that are submodular from those that are $\epsilon$-far from every submodular function in $\ell_p$ distance with a constant number of queries. More generally, we extend the testing-by-implicit-learning framework of Diakonikolas et al. (2007) to show that every property of real-valued functions that is well-approximated in $\ell_2$ distance by a class of $k$-juntas for some $k = O(1)$ can be tested in the $\ell_p$-testing model with a constant number of queries. This result, combined with a recent junta theorem of Feldman and Vondrak (2016), yields the constant-query testability of submodularity. It also yields constant-query testing algorithms for a variety of other natural properties of valuation functions, including fractionally additive (XOS) functions, OXS functions, unit demand functions, coverage functions, and self-bounding functions.
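
For intuition about the $\ell_p$-testing model, here is a generic sampling sketch (not the paper's tester) that estimates the normalized $\ell_p$ distance between two query-accessible functions; the interface and sample count are our own assumptions.

```python
import random

def estimate_lp_distance(f, g, n, p=2, num_samples=5000):
    """Monte Carlo estimate of the normalized l_p distance
    (E_x |f(x) - g(x)|^p)^{1/p} between f, g : {0,1}^n -> [0,1],
    where x is uniform over the Boolean hypercube.
    """
    total = 0.0
    for _ in range(num_samples):
        x = tuple(random.randint(0, 1) for _ in range(n))
        total += abs(f(x) - g(x)) ** p
    return (total / num_samples) ** (1.0 / p)
```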

2 citations


Posted Content
TL;DR: An algorithm is designed for tolerant isomorphism testing of two unknown Boolean functions $f$ and $g$ whose query complexity only depends on the (unknown) smallest $k$ such that either $f$ or $g$ is close to being a $k$-junta.
Abstract: A function $f\colon \{-1,1\}^n \to \{-1,1\}$ is a $k$-junta if it depends on at most $k$ of its variables. We consider the problem of tolerant testing of $k$-juntas, where the testing algorithm must accept any function that is $\epsilon$-close to some $k$-junta and reject any function that is $\epsilon'$-far from every $k'$-junta for some $\epsilon' = O(\epsilon)$ and $k' = O(k)$. Our first result is an algorithm that solves this problem with query complexity polynomial in $k$ and $1/\epsilon$. This result is obtained via a new polynomial-time approximation algorithm for submodular function minimization (SFM) under large cardinality constraints, which holds even when only given approximate oracle access to the function. Our second result considers the case where $k' = k$. We show how to obtain a smooth tradeoff between the amount of tolerance and the query complexity in this setting. Specifically, we design an algorithm that, given $\rho\in(0,1/2)$, accepts any function that is $\frac{\epsilon\rho}{16}$-close to some $k$-junta and rejects any function that is $\epsilon$-far from every $k$-junta. The query complexity of the algorithm is $O\big( \frac{k\log k}{\epsilon\rho(1-\rho)^k} \big)$. Finally, we show how to apply the second result to the problem of tolerant isomorphism testing between two unknown Boolean functions $f$ and $g$. We give an algorithm for this problem whose query complexity only depends on the (unknown) smallest $k$ such that either $f$ or $g$ is close to being a $k$-junta.
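
A standard primitive behind junta testers is estimating how much $f$ depends on the coordinates outside a candidate set $J$ by re-randomizing those coordinates. Here is a minimal sketch of that primitive (an illustration of the general technique, not the tolerant tester of this paper):

```python
import random

def estimate_dependence_outside(f, n, J, num_samples=2000):
    """Estimate Pr[f(x) != f(y)], where x is uniform over {-1,+1}^n and
    y agrees with x on the coordinates in J but re-randomizes the rest.

    A small estimate suggests f is close to a junta on J; this is a
    standard primitive in junta testing, not the paper's algorithm.
    """
    J = set(J)
    disagreements = 0
    for _ in range(num_samples):
        x = [random.choice((-1, 1)) for _ in range(n)]
        y = [x[i] if i in J else random.choice((-1, 1)) for i in range(n)]
        if f(tuple(x)) != f(tuple(y)):
            disagreements += 1
    return disagreements / num_samples
```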