Author

Eric Blais

Bio: Eric Blais is an academic researcher from the University of Waterloo. The author has contributed to research in topics: Boolean function & Property testing. The author has an h-index of 18 and has co-authored 77 publications receiving 1515 citations. Previous affiliations of Eric Blais include McGill University & Autodesk.


Papers
Proceedings ArticleDOI
31 May 2009
TL;DR: It is shown that if a function f is "far" from being a k-junta, then f is "far" from being determined by k parts in a random partition of the variables. The structural lemma is proved using the Efron-Stein decomposition method.
Abstract: A function on n variables is called a k-junta if it depends on at most k of its variables. In this article, we show that it is possible to test whether a function is a k-junta or is "far" from being a k-junta with O(k/ε + k log k) queries, where ε is the approximation parameter. This result improves on the previous best upper bound of Õ(k^{3/2})/ε queries and is asymptotically optimal, up to a logarithmic factor. We obtain the improved upper bound by introducing a new algorithm with one-sided error for testing juntas. Notably, the algorithm is a valid junta tester under very general conditions: it holds for functions with arbitrary finite domains and ranges, and it holds under any product distribution over the domain. A key component of the analysis of the new algorithm is a new structural result on juntas: roughly, we show that if a function f is "far" from being a k-junta, then f is "far" from being determined by k parts in a random partition of the variables. The structural lemma is proved using the Efron-Stein decomposition method.

153 citations
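
The k-junta definition in the abstract above can be made concrete with a small brute-force check. This is only a sketch of the definition, not the query-efficient tester from the paper: it evaluates the function on all 2^n inputs, and the names `relevant_variables`, `is_k_junta`, and `example` are hypothetical helpers introduced here for illustration.

```python
from itertools import product


def relevant_variables(f, n):
    """Return the indices i for which flipping bit i changes f on some input."""
    relevant = set()
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if i in relevant:
                continue
            y = list(x)
            y[i] ^= 1  # flip the i-th bit
            if f(x) != f(tuple(y)):
                relevant.add(i)
    return relevant


def is_k_junta(f, n, k):
    """A function is a k-junta if it depends on at most k of its n variables."""
    return len(relevant_variables(f, n)) <= k


def example(x):
    """Toy function on 5 bits that depends only on bits 0, 2 and 4."""
    return x[0] ^ (x[2] & x[4])
```

Here `example` is a 3-junta but not a 2-junta. The exhaustive loop costs time exponential in n, which is exactly what a property tester with a query bound independent of n avoids.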

Journal ArticleDOI
08 Jun 2011
TL;DR: In this article, a technique for proving lower bounds in property testing is developed by showing a strong connection between testing and communication complexity. The technique is general and implies a number of new testing bounds, as well as simpler proofs of several known bounds.
Abstract: We develop a new technique for proving lower bounds in property testing, by showing a strong connection between testing and communication complexity. We give a simple scheme for reducing communication problems to testing problems, thus allowing us to use known lower bounds in communication complexity to prove lower bounds in testing. This scheme is general and implies a number of new testing bounds, as well as simpler proofs of several known bounds. For the problem of testing whether a boolean function is k-linear (a parity function on k variables), we achieve a lower bound of Omega(k) queries, even for adaptive algorithms with two-sided error, thus confirming a conjecture of Goldreich (2010). The same argument behind this lower bound also implies a new proof of known lower bounds for testing related classes such as k-juntas. For some classes, such as the class of monotone functions and the class of s-sparse GF(2) polynomials, we significantly strengthen the best known bounds.

150 citations
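
To make the k-linear definition in the abstract above concrete: a k-linear function is the parity of exactly k input bits. The sketch below builds parity functions and checks an elementary identity, namely that the XOR of the parities on sets A and B is the parity on their symmetric difference, so its size is |A| + |B| - 2|A ∩ B|. The helper names are hypothetical, and this is an illustration of the definition only; it is not presented as the reduction scheme used in the paper.

```python
def parity(subset):
    """Return the parity function on the given set of variable indices."""
    indices = frozenset(subset)

    def f(x):
        return sum(x[i] for i in indices) % 2

    return f


def xor_of_parities(a, b):
    """The XOR of two parities is the parity on the symmetric difference."""
    return parity(frozenset(a) ^ frozenset(b))


def combined_size(a, b):
    """Number of variables in the XOR of the parities on a and b."""
    return len(frozenset(a) ^ frozenset(b))
```

For two sets of size 2, the combined function is 4-linear exactly when the sets are disjoint; any overlap lowers the number of variables it depends on.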

Journal ArticleDOI
01 Jan 2015
TL;DR: In this article, the authors focus on the problem of rapidly generating approximate visualizations while preserving crucial visual properties of interest to analysts. The primary focus is on sampling algorithms that preserve the visual property of ordering; the techniques also apply to some other visual properties.
Abstract: Visualizations are frequently used as a means to understand trends and gather insights from datasets, but often take a long time to generate. In this paper, we focus on the problem of rapidly generating approximate visualizations while preserving crucial visual properties of interest to analysts. Our primary focus will be on sampling algorithms that preserve the visual property of ordering; our techniques will also apply to some other visual properties. For instance, our algorithms can be used to generate an approximate visualization of a bar chart very rapidly, where the comparisons between any two bars are correct. We formally show that our sampling algorithms are generally applicable and provably optimal in theory, in that they do not take more samples than necessary to generate the visualizations with ordering guarantees. They also work well in practice, correctly ordering output groups while taking orders of magnitude fewer samples and much less time than conventional sampling schemes.

110 citations
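
The idea of sampling until a bar chart's ordering is settled can be sketched as follows. The abstract above does not describe the algorithm, so everything here is an assumption made for illustration: values are taken to lie in [0, 1], intervals come from Hoeffding's inequality with a fixed per-check `delta` (no correction for repeated checks), and `half_width` and `order_groups` are hypothetical names. It is a simplified stand-in, not the authors' method.

```python
import math


def half_width(m, delta):
    """Hoeffding half-width for the mean of m samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * m))


def order_groups(samplers, delta=0.05, max_rounds=5000):
    """Sample groups until no two confidence intervals overlap.

    samplers: list of zero-argument callables returning values in [0, 1].
    Returns (group indices sorted by estimated mean, samples per group).
    """
    k = len(samplers)
    sums = [0.0] * k
    counts = [0] * k
    active = set(range(k))  # groups whose position is still ambiguous
    for _ in range(max_rounds):
        for g in active:
            sums[g] += samplers[g]()
            counts[g] += 1
        means = [sums[g] / counts[g] for g in range(k)]
        widths = [half_width(counts[g], delta) for g in range(k)]
        # A group stays active while its interval overlaps another one.
        active = {
            g for g in range(k)
            if any(h != g and abs(means[g] - means[h]) <= widths[g] + widths[h]
                   for h in range(k))
        }
        if not active:
            break
    means = [sums[g] / counts[g] for g in range(k)]
    return sorted(range(k), key=lambda g: means[g]), counts
```

Groups whose intervals are already separated from all others stop consuming samples, which is where the savings over uniform sampling would come from in this sketch.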

Journal Article
TL;DR: A new technique for proving lower bounds in property testing is developed by showing a strong connection between testing and communication complexity; for some classes of functions, it significantly strengthens the best known bounds.
Abstract: We develop a new technique for proving lower bounds in property testing, by showing a strong connection between testing and communication complexity. We give a simple scheme for reducing communication problems to testing problems, thus allowing us to use known lower bounds in communication complexity to prove lower bounds in testing. This scheme is general and implies a number of new testing bounds, as well as simpler proofs of several known bounds. For the problem of testing whether a boolean function is k-linear (a parity function on k variables), we achieve a lower bound of Omega(k) queries, even for adaptive algorithms with two-sided error, thus confirming a conjecture of Goldreich (2010). The same argument behind this lower bound also implies a new proof of known lower bounds for testing related classes such as k-juntas. For some classes, such as the class of monotone functions and the class of s-sparse GF(2) polynomials, we significantly strengthen the best known bounds.

107 citations

Proceedings ArticleDOI
09 Nov 2015
TL;DR: A novel algorithm based on the Fourier transform is proposed that can make predictions for any configurable software system with theoretical guarantees of accuracy and confidence level specified by the user, while using a minimum number of samples up to a constant factor.
Abstract: Understanding how performance varies across a large number of variants of a configurable software system is important for helping stakeholders to choose a desirable variant. Given a software system with n optional features, measuring all its 2^n possible configurations to determine their performances is usually infeasible. Thus, various techniques have been proposed to predict software performances based on a small sample of measured configurations. We propose a novel algorithm based on the Fourier transform that is able to make predictions of any configurable software system with theoretical guarantees of accuracy and confidence level specified by the user, while using a minimum number of samples up to a constant factor. Empirical results on the case studies constructed from real-world configurable systems demonstrate the effectiveness of our algorithm.

75 citations
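
As a rough illustration of the Fourier view of a performance function over n optional features, the sketch below computes every Fourier (Walsh-Hadamard) coefficient of a toy function by brute force and reconstructs the function from them. The toy function and the names `character`, `fourier_coefficients`, and `predict` are assumptions introduced here. The paper estimates coefficients from a small sample, whereas this sketch enumerates all 2^n configurations, so it shows the representation rather than the proposed algorithm.

```python
from itertools import combinations, product


def character(subset, x):
    """Parity character chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return -1 if sum(x[i] for i in subset) % 2 else 1


def fourier_coefficients(f, n):
    """Exact coefficients: the average of f(x) * chi_S(x) over all inputs x."""
    inputs = list(product((0, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            total = sum(f(x) * character(subset, x) for x in inputs)
            coeffs[subset] = total / len(inputs)
    return coeffs


def predict(coeffs, x):
    """Evaluate the Fourier expansion at configuration x."""
    return sum(c * character(s, x) for s, c in coeffs.items())


def toy_performance(x):
    """Hypothetical cost: base 10, feature 0 adds 5, features 1 and 2 interact."""
    return 10 + 5 * x[0] + 2 * x[1] * x[2]
```

The coefficient on the empty set is the mean performance over all configurations, and a function with few interacting features has few nonzero coefficients, which is what makes sample-based estimation plausible.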


Cited by
Journal Article
TL;DR: In this paper, the authors consider the question of determining whether a function f has property P or is ε-far from any function with property P. In some cases, the testing algorithm is also allowed to query f on instances of its choice.
Abstract: In this paper, we consider the question of determining whether a function f has property P or is ε-far from any function with property P. A property testing algorithm is given a sample of the value of f on instances drawn according to some distribution. In some cases, it is also allowed to query f on instances of its choice. We study this question for different properties and establish some connections to problems in learning theory and approximation. In particular, we focus our attention on testing graph properties. Given access to a graph G in the form of being able to query whether an edge exists or not between a pair of vertices, we devise algorithms to test whether the underlying graph has properties such as being bipartite, k-colorable, or having a ρ-clique (clique of density ρ with respect to the vertex set). Our graph property testing algorithms are probabilistic and make assertions that are correct with high probability, while making a number of queries that is independent of the size of the graph. Moreover, the property testing algorithms can be used to efficiently (i.e., in time linear in the number of vertices) construct partitions of the graph that correspond to the property being tested, if it holds for the input graph.

870 citations

Book
05 Jun 2014
TL;DR: This text gives a thorough overview of Boolean functions, beginning with the most basic definitions and proceeding to advanced topics such as hypercontractivity and isoperimetry. Each chapter includes a "highlight application" such as Arrow's theorem from economics.
Abstract: Boolean functions are perhaps the most basic objects of study in theoretical computer science. They also arise in other areas of mathematics, including combinatorics, statistical physics, and mathematical social choice. The field of analysis of Boolean functions seeks to understand them via their Fourier transform and other analytic methods. This text gives a thorough overview of the field, beginning with the most basic definitions and proceeding to advanced topics such as hypercontractivity and isoperimetry. Each chapter includes a "highlight application" such as Arrow's theorem from economics, the Goldreich-Levin algorithm from cryptography/learning theory, Håstad's NP-hardness of approximation results, and "sharp threshold" theorems for random graph properties. The book includes roughly 450 exercises and can be used as the basis of a one-semester graduate course. It should appeal to advanced undergraduates, graduate students, and researchers in computer science theory and related mathematical fields.

867 citations

MonographDOI
01 Jan 2014

575 citations

Book
01 Nov 2005
TL;DR: In this article, the authors present an efficient reduction from constrained to unconstrained maximum agreement subtree for the maximum quartet consistency problem, which can be solved by using semi-definite programming.
Abstract: Expression.- Spectral Clustering Gene Ontology Terms to Group Genes by Function.- Dynamic De-Novo Prediction of microRNAs Associated with Cell Conditions: A Search Pruned by Expression.- Clustering Gene Expression Series with Prior Knowledge.- A Linear Time Biclustering Algorithm for Time Series Gene Expression Data.- Time-Window Analysis of Developmental Gene Expression Data with Multiple Genetic Backgrounds.- Phylogeny.- A Lookahead Branch-and-Bound Algorithm for the Maximum Quartet Consistency Problem.- Computing the Quartet Distance Between Trees of Arbitrary Degree.- Using Semi-definite Programming to Enhance Supertree Resolvability.- An Efficient Reduction from Constrained to Unconstrained Maximum Agreement Subtree.- Pattern Identification in Biogeography.- On the Complexity of Several Haplotyping Problems.- A Hidden Markov Technique for Haplotype Reconstruction.- Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event.- Networks.- A Faster Algorithm for Detecting Network Motifs.- Reaction Motifs in Metabolic Networks.- Reconstructing Metabolic Networks Using Interval Analysis.- Genome Rearrangements.- A 1.375-Approximation Algorithm for Sorting by Transpositions.- A New Tight Upper Bound on the Transposition Distance.- Perfect Sorting by Reversals Is Not Always Difficult.- Minimum Recombination Histories by Branch and Bound.- Sequences.- A Unifying Framework for Seed Sensitivity and Its Application to Subset Seeds.- Generalized Planted (l,d)-Motif Problem with Negative Set.- Alignment of Tandem Repeats with Excision, Duplication, Substitution and Indels (EDSI).- The Peres-Shields Order Estimator for Fixed and Variable Length Markov Models with Applications to DNA Sequence Similarity.- Multiple Structural RNA Alignment with Lagrangian Relaxation.- Faster Algorithms for Optimal Multiple Sequence Alignment Based on Pairwise Comparisons.- Ortholog Clustering on a Multipartite Graph.- Linear Time Algorithm for Parsing RNA Secondary Structure.- A Compressed Format for Collections of Phylogenetic Trees and Improved Consensus Performance.- Structure.- Optimal Protein Threading by Cost-Splitting.- Efficient Parameterized Algorithm for Biopolymer Structure-Sequence Alignment.- Rotamer-Pair Energy Calculations Using a Trie Data Structure.- Improved Maintenance of Molecular Surfaces Using Dynamic Graph Connectivity.- The Main Structural Regularities of the Sandwich Proteins.- Discovery of Protein Substructures in EM Maps.

492 citations

Journal ArticleDOI
TL;DR: In this article, the convergence of distributions of likelihood ratios is discussed, and the authors propose a method to construct a set of limit laws for likelihood ratios.
Abstract: 1 Introduction.- 2 Experiments, Deficiencies, Distances.- 2.1 Comparing Risk Functions.- 2.2 Deficiency and Distance between Experiments.- 2.3 Likelihood Ratios and Blackwell's Representation.- 2.4 Further Remarks on the Convergence of Distributions of Likelihood Ratios.- 2.5 Historical Remarks.- 3 Contiguity - Hellinger Transforms.- 3.1 Contiguity.- 3.2 Hellinger Distances, Hellinger Transforms.- 3.3 Historical Remarks.- 4 Gaussian Shift and Poisson Experiments.- 4.1 Introduction.- 4.2 Gaussian Experiments.- 4.3 Poisson Experiments.- 4.4 Historical Remarks.- 5 Limit Laws for Likelihood Ratios.- 5.1 Introduction.- 5.2 Auxiliary Results.- 5.2.1 Lindeberg's Procedure.- 5.2.2 Levy Splittings.- 5.2.3 Paul Levy's Symmetrization Inequalities.- 5.2.4 Conditions for Shift-Compactness.- 5.2.5 A Central Limit Theorem for Infinitesimal Arrays.- 5.2.6 The Special Case of Gaussian Limits.- 5.2.7 Peano Differentiable Functions.- 5.3 Limits for Binary Experiments.- 5.4 Gaussian Limits.- 5.5 Historical Remarks.- 6 Local Asymptotic Normality.- 6.1 Introduction.- 6.2 Locally Asymptotically Quadratic Families.- 6.3 A Method of Construction of Estimates.- 6.4 Some Local Bayes Properties.- 6.5 Invariance and Regularity.- 6.6 The LAMN and LAN Conditions.- 6.7 Additional Remarks on the LAN Conditions.- 6.8 Wald's Tests and Confidence Ellipsoids.- 6.9 Possible Extensions.- 6.10 Historical Remarks.- 7 Independent, Identically Distributed Observations.- 7.1 Introduction.- 7.2 The Standard i.i.d. Case: Differentiability in Quadratic Mean.- 7.3 Some Examples.- 7.4 Some Nonparametric Considerations.- 7.5 Bounds on the Risk of Estimates.- 7.6 Some Cases Where the Number of Observations Is Random.- 7.7 Historical Remarks.- 8 On Bayes Procedures.- 8.1 Introduction.- 8.2 Bayes Procedures Behave Nicely.- 8.3 The Bernstein-von Mises Phenomenon.- 8.4 A Bernstein-von Mises Result for the i.i.d. Case.- 8.5 Bayes Procedures Behave Miserably.- 8.6 Historical Remarks.- Author Index.

483 citations