Journal ArticleDOI

Optimal weighted least-squares methods

01 Oct 2017 · Vol. 3, pp. 181–203
TL;DR: In this paper, the authors consider the problem of reconstructing an unknown bounded function u defined on a domain X ⊂ R^d from noiseless or noisy samples of u at n points (x_i)_{i=1,…,n}.
Abstract: We consider the problem of reconstructing an unknown bounded function u defined on a domain X ⊂ R^d from noiseless or noisy samples of u at n points (x_i)_{i=1,…,n}. We measure the reconstruction error in a norm L^2(X, dρ) for some given probability measure dρ. Given a linear space V_m with dim(V_m) = m ≤ n, we study in general terms the weighted least-squares approximations from the spaces V_m based on independent random samples. It is well known that least-squares approximations can be inaccurate and unstable when m is too close to n, even in the noiseless case. Recent results from [4, 5] have shown the interest of using weighted least squares for reducing the number n of samples that is needed to achieve an accuracy comparable to that of best approximation in V_m, compared to standard least squares as studied in [3]. The contribution of the present paper is twofold. From the theoretical perspective, we establish results in expectation and in probability for weighted least squares in general approximation spaces V_m. These results show that for an optimal choice of sampling measure dµ and weight w, which depends on the space V_m and on the measure dρ, stability and optimal accuracy are achieved under the mild condition that n scales linearly with m up to an additional logarithmic factor. In contrast to [3], the present analysis covers cases where the function u and its approximants from V_m are unbounded, which might occur for instance in the relevant case where X = R^d and dρ is the Gaussian measure. From the numerical perspective, we propose a sampling method which allows one to generate independent and identically distributed samples from the optimal measure dµ. This method becomes of interest in the multivariate setting where dµ is generally not of tensor product type. We illustrate this for particular examples of approximation spaces V_m of polynomial type, where the domain X is allowed to be unbounded and high or even infinite dimensional, motivated by certain applications to parametric and stochastic PDEs.
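
To make the recipe concrete: with (φ_j)_{j=1,…,m} an orthonormal basis of V_m in L^2(X, dρ), the optimal sampling measure is dµ = (k_m/m) dρ with k_m(x) = Σ_j |φ_j(x)|², and the optimal weight is w(x) = m/k_m(x). Below is a minimal, self-contained sketch in one dimension, assuming X = [-1, 1], dρ the uniform measure, and V_m spanned by normalized Legendre polynomials; the plain rejection sampler and all function names are illustrative choices, not the paper's own implementation.

```python
import numpy as np
from numpy.polynomial import legendre

def basis(x, m):
    # First m Legendre polynomials, orthonormalized w.r.t. drho = dx/2
    # on [-1, 1]: phi_j = sqrt(2j+1) * P_j.
    return np.stack([np.sqrt(2 * j + 1) * legendre.legval(x, [0] * j + [1])
                     for j in range(m)], axis=1)

def sample_optimal(n, m, rng):
    # Draw n i.i.d. samples from dmu = (k_m / m) drho by rejection against
    # the uniform proposal, using the crude bound k_m(x) <= m^2 on [-1, 1].
    out = np.empty(0)
    while out.size < n:
        x = rng.uniform(-1.0, 1.0, size=2 * n)
        u = rng.uniform(0.0, 1.0, size=2 * n)
        k = (basis(x, m) ** 2).sum(axis=1)
        out = np.concatenate([out, x[u <= k / m**2]])
    return out[:n]

def weighted_least_squares(u, n, m, seed=0):
    # Fit u on span{phi_0, ..., phi_{m-1}} from n optimally drawn samples,
    # with the optimal weights w = m / k_m.
    rng = np.random.default_rng(seed)
    x = sample_optimal(n, m, rng)
    V = basis(x, m)
    w = m / (V ** 2).sum(axis=1)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * V, sw * u(x), rcond=None)
    return coef

# Example: n ~ m log m samples suffice for stable recovery of a smooth function.
m = 10
n = int(3 * m * np.log(m))
coef = weighted_least_squares(np.exp, n, m)
```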


Citations
Journal ArticleDOI
TL;DR: This survey describes probabilistic algorithms for linear algebraic computations, such as factorizing matrices and solving linear systems, focusing on techniques with a proven track record on real-world problems; it treats both the theoretical foundations of the subject and practical computational issues.
Abstract: This survey describes probabilistic algorithms for linear algebraic computations, such as factorizing matrices and solving linear systems. It focuses on techniques that have a proven track record for real-world problems. The paper treats both the theoretical foundations of the subject and practical computational issues. Topics include norm estimation, matrix approximation by sampling, structured and unstructured random embeddings, linear regression problems, low-rank approximation, subspace iteration and Krylov methods, error estimation and adaptivity, interpolatory and CUR factorizations, Nyström approximation of positive semidefinite matrices, single-view (‘streaming’) algorithms, full rank-revealing factorizations, solvers for linear systems, and approximation of kernel matrices that arise in machine learning and in scientific computing.

158 citations
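
One representative technique from this survey is the randomized range finder underlying low-rank approximation: sketch the matrix with a random Gaussian embedding, then orthonormalize. A minimal sketch, assuming the standard Halko–Martinsson–Tropp construction with a small oversampling parameter p:

```python
import numpy as np

def randomized_range_finder(A, k, p=10, rng=None):
    # A Gaussian sketch Y = A @ G captures the dominant column space of A;
    # orthonormalizing Y yields Q with A ~= Q @ (Q.T @ A).
    rng = np.random.default_rng() if rng is None else rng
    G = rng.standard_normal((A.shape[1], k + p))   # random embedding
    Q, _ = np.linalg.qr(A @ G)                     # orthonormal basis for range(A @ G)
    return Q

# Rank-k approximation of a synthetic low-rank matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))
Q = randomized_range_finder(A, k=40, rng=rng)
A_k = Q @ (Q.T @ A)
print(np.linalg.norm(A - A_k) / np.linalg.norm(A))  # near machine precision
```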

Journal ArticleDOI
TL;DR: A comparison of the empirical performance of the selected sampling methods on three numerical examples, covering high-order PCEs, high-dimensional problems, and low oversampling ratios, is presented to provide a road map for practitioners seeking the most suitable sampling technique for the problem at hand.

113 citations

Posted Content
TL;DR: The random feature model is viewed as a non-intrusive data-driven emulator, a mathematical framework for its interpretation is provided, and its ability to efficiently and accurately approximate the nonlinear parameter-to-solution maps of two prototypical PDEs arising in physical science and engineering applications is demonstrated.
Abstract: Well known to the machine learning community, the random feature model, originally introduced by Rahimi and Recht in 2008, is a parametric approximation to kernel interpolation or regression methods. It is typically used to approximate functions mapping a finite-dimensional input space to the real line. In this paper, we instead propose a methodology for use of the random feature model as a data-driven surrogate for operators that map an input Banach space to an output Banach space. Although the methodology is quite general, we consider operators defined by partial differential equations (PDEs); here, the inputs and outputs are themselves functions, with the input parameters being functions required to specify the problem, such as initial data or coefficients, and the outputs being solutions of the problem. Upon discretization, the model inherits several desirable attributes from this infinite-dimensional, function space viewpoint, including mesh-invariant approximation error with respect to the true PDE solution map and the capability to be trained at one mesh resolution and then deployed at different mesh resolutions. We view the random feature model as a non-intrusive data-driven emulator, provide a mathematical framework for its interpretation, and demonstrate its ability to efficiently and accurately approximate the nonlinear parameter-to-solution maps of two prototypical PDEs arising in physical science and engineering applications: viscous Burgers' equation and a variable coefficient elliptic equation.

82 citations
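
The random feature model underlying this work admits a compact scalar-output illustration. Below is a minimal sketch of Rahimi–Recht random Fourier features with ridge regression, approximating the Gaussian kernel; this is the finite-dimensional version, not the operator-valued construction of the paper, and all names are illustrative:

```python
import numpy as np

def random_features(X, D, sigma=1.0, rng=None):
    # Random Fourier features: z(x) = sqrt(2/D) * cos(W^T x + b) with
    # W ~ N(0, sigma^-2 I) and b ~ Uniform[0, 2*pi] approximate the
    # Gaussian kernel exp(-|x - y|^2 / (2 sigma^2)).
    rng = np.random.default_rng() if rng is None else rng
    W = rng.standard_normal((X.shape[1], D)) / sigma
    b = rng.uniform(0.0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Ridge regression in feature space: a D x D linear solve replaces
# inverting the full n x n kernel matrix.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])
Z = random_features(X, D=300, sigma=0.5, rng=rng)
theta = np.linalg.solve(Z.T @ Z + 1e-6 * np.eye(Z.shape[1]), Z.T @ y)
y_hat = Z @ theta
```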

Journal ArticleDOI
TL;DR: In this article, the authors propose a greedy algorithm for sparse polynomial chaos (PC) approximation that draws on the theory of optimal design of experiments (DoE) to select sample points for estimating the PC coefficients.

77 citations
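
A generic greedy D-optimal selection, in the spirit of (but not identical to) the cited algorithm, picks candidate rows that maximize the determinant of the information matrix via the matrix determinant lemma. A sketch under those assumptions; `greedy_d_optimal` is an illustrative name:

```python
import numpy as np

def greedy_d_optimal(Phi, n):
    # Greedily select n rows of the candidate matrix Phi (N x m) so as to
    # maximize det(Phi_S.T @ Phi_S).  By the matrix determinant lemma,
    # det(M + p p^T) = det(M) * (1 + p^T M^{-1} p), so each step adds the
    # row with the largest quadratic-form gain.
    N, m = Phi.shape
    chosen = []
    M = 1e-8 * np.eye(m)                 # small ridge until M is full rank
    for _ in range(n):
        Minv = np.linalg.inv(M)
        gains = np.einsum("ij,jk,ik->i", Phi, Minv, Phi)  # p^T M^{-1} p per row
        gains[chosen] = -np.inf          # never pick the same row twice
        i = int(np.argmax(gains))
        chosen.append(i)
        M = M + np.outer(Phi[i], Phi[i])
    return chosen
```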

Book ChapterDOI
TL;DR: It is shown that smooth, multivariate functions possess expansions in orthogonal polynomial bases that are not only approximately sparse but also exhibit a particular type of structured sparsity defined by so-called lower sets, so that the curse of dimensionality – the bane of high-dimensional approximation – is mitigated to a significant extent.
Abstract: In recent years, the use of sparse recovery techniques in the approximation of high-dimensional functions has garnered increasing interest. In this work we present a survey of recent progress in this emerging topic. Our main focus is on the computation of polynomial approximations of high-dimensional functions on d-dimensional hypercubes. We show that smooth, multivariate functions possess expansions in orthogonal polynomial bases that are not only approximately sparse but exhibit a particular type of structured sparsity defined by so-called lower sets. This structure can be exploited via the use of weighted ℓ¹ minimization techniques, and, as we demonstrate, doing so leads to sample complexity estimates that are at most logarithmically dependent on the dimension d. Hence the curse of dimensionality – the bane of high-dimensional approximation – is mitigated to a significant extent. We also discuss several practical issues, including unknown noise (due to truncation or numerical error), and highlight a number of open problems and challenges.

74 citations
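
The weighted ℓ¹ programs discussed in this survey can be solved with any proximal-gradient method. A minimal sketch using iterative soft thresholding (ISTA); the solver is generic, and the choice of weights (e.g. growing with the index, per the survey's theory) is left to the caller:

```python
import numpy as np

def weighted_l1_ista(A, y, weights, lam=1e-3, iters=2000):
    # Minimize 0.5 * ||A z - y||^2 + lam * sum_j weights[j] * |z_j|
    # by iterative soft thresholding with step 1/L, L = ||A||_2^2.
    L = np.linalg.norm(A, 2) ** 2
    t = 1.0 / L
    z = np.zeros(A.shape[1])
    for _ in range(iters):
        v = z - t * (A.T @ (A @ z - y))          # gradient step on the data fit
        thr = t * lam * weights
        z = np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)  # weighted soft threshold
    return z
```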

References
Journal ArticleDOI
TL;DR: This chapter reviews the main methods for generating random variables, vectors, and processes in non-uniform random variate generation. It provides information on the expected time complexity of various algorithms before addressing modern topics such as indirectly specified distributions, random processes, and Markov chain methods.

3,304 citations

Book
16 Apr 1986
TL;DR: This book surveys the main methods in non-uniform random variate generation, providing information on the expected time complexity of various algorithms before addressing modern topics such as indirectly specified distributions, random processes, and Markov chain methods.
Abstract: This is a survey of the main methods in non-uniform random variate generation, and highlights recent research on the subject. Classical paradigms such as inversion, rejection, guide tables, and transformations are reviewed. We provide information on the expected time complexity of various algorithms, before addressing modern topics such as indirectly specified distributions, random processes, and Markov chain methods.

The main paradigms. The purpose of this chapter is to review the main methods for generating random variables, vectors and processes. Classical workhorses such as the inversion method, the rejection method and table methods are reviewed in section 1. In section 2, we discuss the expected time complexity of various algorithms, and give a few examples of the design of generators that are uniformly fast over entire families of distributions. In section 3, we develop a few universal generators, such as generators for all log-concave distributions on the real line. Section 4 deals with random variate generation when distributions are indirectly specified, e.g. via Fourier coefficients, characteristic functions, the moments, the moment generating function, distributional identities, infinite series or Kolmogorov measures. Random processes are briefly touched upon in section 5. Finally, the latest developments in Markov chain methods are discussed in section 6. Some of this work grew from Devroye (1986a), and we carefully document work done since 1986. More recent references can be found in the book by Hörmann, Leydold and Derflinger (2004).

Non-uniform random variate generation is concerned with the generation of random variables with certain distributions. Such random variables are often discrete, taking values in a countable set, or absolutely continuous, and thus described by a density. The methods used for generating them depend upon the computational model one is working with, and upon the demands on the part of the output. For example, in a RAM (random access memory) model, one accepts that real numbers can be stored and operated upon (compared, added, multiplied, and so forth) in one time unit. Furthermore, this model assumes that a source capable of producing an i.i.d. (independent identically distributed) sequence of uniform [0, 1] random variables is available. This model is of course unrealistic, but designing random variate generators based on it has several advantages: first of all, it allows one to disconnect the theory of non-uniform random variate generation from that of uniform random variate generation, and secondly, it permits one to plan for the future, as more powerful computers will be developed that permit ever better approximations of the model. Algorithms designed under finite approximation limitations will have to be redesigned when the next generation of computers arrives.

For the generation of discrete or integer-valued random variables, which includes the vast area of the generation of random combinatorial structures, one can adhere to a clean model, the pure bit model, in which each bit operation takes one time unit, and storage can be reported in terms of bits. Typically, one now assumes that an i.i.d. sequence of independent perfect bits is available. In this model, an elegant information-theoretic theory can be derived. For example, Knuth and Yao (1976) showed that to generate a random integer X described by the probability distribution P{X = n} = p_n, n ≥ 1, any method must use an expected number of bits at least as large as the binary entropy of the distribution, ∑_n p_n log_2(1/p_n).

3,217 citations
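
The two classical workhorses named above, inversion and rejection, fit in a few lines each. A minimal sketch, with an exponential generator by inversion and a Beta(2, 2) generator by rejection from a uniform proposal; all names are illustrative:

```python
import numpy as np

def exponential_by_inversion(n, rate, rng):
    # Inversion: if U ~ Uniform(0, 1), then F^{-1}(U) has c.d.f. F.
    # For Exp(rate), F^{-1}(u) = -log(1 - u) / rate.
    u = rng.uniform(0.0, 1.0, size=n)
    return -np.log1p(-u) / rate

def rejection(n, target_pdf, proposal_sample, proposal_pdf, M, rng):
    # Rejection: accept x ~ proposal with probability
    # target_pdf(x) / (M * proposal_pdf(x)), valid when target <= M * proposal.
    out = []
    while len(out) < n:
        x = proposal_sample(n, rng)
        u = rng.uniform(0.0, 1.0, size=n)
        out.extend(x[u * M * proposal_pdf(x) <= target_pdf(x)])
    return np.array(out[:n])

# Beta(2, 2) has density 6 x (1 - x) on [0, 1], bounded by M = 1.5.
rng = np.random.default_rng(0)
samples = rejection(10_000, lambda x: 6.0 * x * (1.0 - x),
                    lambda k, r: r.uniform(0.0, 1.0, size=k),
                    lambda x: np.ones_like(x), M=1.5, rng=rng)
```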

Journal ArticleDOI
TL;DR: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices and provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid.
Abstract: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices. These results place simple and easily verifiable hypotheses on the summands, and they deliver strong conclusions about the large-deviation behavior of the maximum eigenvalue of the sum. Tail bounds for the norm of a sum of random rectangular matrices follow as an immediate corollary. The proof techniques also yield some information about matrix-valued martingales. In other words, this paper provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid. The matrix inequalities promise the same diversity of application, ease of use, and strength of conclusion that have made the scalar inequalities so valuable.

1,675 citations
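
For reference, the matrix Chernoff bound from this line of work, in one common formulation (stated here as a reading aid; conditions follow the usual statement for sums of independent positive-semidefinite matrices):

```latex
% For independent random positive-semidefinite d x d matrices X_k with
% \lambda_{\max}(X_k) \le R almost surely, and
% \mu_{\min} = \lambda_{\min}\bigl(\sum_k \mathbb{E} X_k\bigr):
\[
  \mathbb{P}\Bigl\{\lambda_{\min}\Bigl(\sum_k X_k\Bigr) \le (1-\delta)\,\mu_{\min}\Bigr\}
  \;\le\; d \cdot \Bigl[\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\Bigr]^{\mu_{\min}/R},
  \qquad \delta \in [0,1).
\]
```

Bounds of this type underlie the n ≳ m log m stability condition for the weighted least-squares Gramian in the main paper.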

Book
01 Jan 1997
TL;DR: In this paper, the authors consider the effects of an external field (or weight) on the minimum energy problem and provide a unified approach to seemingly different problems in constructive analysis, such as the asymptotic analysis of orthogonal polynomials, the limiting behavior of weighted Fekete points, the existence and construction of fast decreasing polynomials, the numerical conformal mapping of simply and doubly connected domains, generalization of the Weierstrass approximation theorem to varying weights, and the determination of convergence rates for best approximating rational functions.
Abstract: This treatment of potential theory emphasizes the effects of an external field (or weight) on the minimum energy problem. Several important aspects of the external field problem (and its extension to signed measures) justify its special attention. The most striking is that it provides a unified approach to seemingly different problems in constructive analysis. These include the asymptotic analysis of orthogonal polynomials; the limiting behavior of weighted Fekete points; the existence and construction of fast decreasing polynomials; the numerical conformal mapping of simply and doubly connected domains; generalization of the Weierstrass approximation theorem to varying weights; and the determination of convergence rates for best approximating rational functions.

1,560 citations

Journal ArticleDOI
Paul Nevai
TL;DR: In this paper, the author shows that the convergence and absolute convergence of orthogonal polynomials on infinite intervals and on the unit circle can be explained by the convergence of Christoffel functions.

372 citations
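
This reference connects to the main paper through the Christoffel function: the optimal sampling density is proportional to its inverse. As a reading aid (notation follows the main abstract above):

```latex
% Christoffel function of V_m = span{phi_1, ..., phi_m}, with (phi_j)_{j=1}^m
% an orthonormal basis of V_m in L^2(X, d\rho):
\[
  \lambda_m(x) \;=\; \Bigl(\sum_{j=1}^{m} |\phi_j(x)|^2\Bigr)^{-1},
  \qquad
  d\mu(x) \;=\; \frac{1}{m\,\lambda_m(x)}\, d\rho(x).
\]
```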