Open Access · Posted Content
Support points
Simon Mak, V. Roshan Joseph
TL;DR: A new way to compact a continuous probability distribution into a set of representative points, called support points, obtained by minimizing the energy distance; this minimization can be formulated as a difference-of-convex program and solved with two algorithms that efficiently generate representative point sets.

Abstract:
This paper introduces a new way to compact a continuous probability distribution $F$ into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo (2004) for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to $F$ and enjoy an improved error rate over Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance over both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations, and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.
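The energy-distance objective described in the abstract is concrete enough to sketch. The snippet below is a minimal illustrative implementation, not the paper's difference-of-convex algorithms: it estimates the sample energy distance between a candidate point set and a sample from $F$, and reduces it by plain gradient descent. The function names, step size, and iteration count are our own choices for illustration.

```python
import numpy as np

def energy_distance(x, y):
    """Sample energy distance between point sets x (n, d) and y (m, d):
    2*E||X - Y|| - E||X - X'|| - E||Y - Y'|| under the empirical measures."""
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1).mean()
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1).mean()
    dyy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1).mean()
    return 2.0 * dxy - dxx - dyy

def support_points(y, x0, steps=200, lr=0.05):
    """Move the points x0 to reduce their energy distance to the sample y
    by gradient descent (an illustrative stand-in for the paper's algorithms)."""
    x = x0.astype(float).copy()
    n, m = len(x), len(y)
    eps = 1e-12  # guards the i == i' terms, whose numerators are zero anyway
    for _ in range(steps):
        dxy = x[:, None, :] - y[None, :, :]                   # (n, m, d)
        nxy = np.linalg.norm(dxy, axis=-1, keepdims=True) + eps
        dxx = x[:, None, :] - x[None, :, :]                   # (n, n, d)
        nxx = np.linalg.norm(dxx, axis=-1, keepdims=True) + eps
        # Gradient of the energy distance with respect to each x_i:
        # attraction toward the sample, repulsion between support points.
        grad = (2.0 / (n * m)) * (dxy / nxy).sum(axis=1)
        grad -= (2.0 / n**2) * (dxx / nxx).sum(axis=1)
        x -= lr * grad
    return x

# Usage: compact 500 draws from a bivariate normal into 10 support points.
rng = np.random.default_rng(0)
y = rng.standard_normal((500, 2))
x0 = y[:10]
x = support_points(y, x0)
```

After optimization, `energy_distance(x, y)` should be smaller than `energy_distance(x0, y)`; the repulsion term spreads the points apart, which is what produces the space-filling behaviour of support points.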
Citations
Journal Article
Riemann manifold Langevin and Hamiltonian Monte Carlo methods
Mark Girolami, Ben Calderhead
TL;DR: The methodology proposed automatically adapts to the local structure when simulating paths across this manifold, providing highly efficient convergence and exploration of the target density, and substantial improvements in the time-normalized effective sample size are reported when compared with alternative sampling approaches.
Journal Article (DOI)
Optimal ratio for data splitting
TL;DR: This article shows that the optimal training/testing splitting ratio is √p : 1, where p is the number of parameters in a linear regression model that explains the data well.
Posted Content
An efficient surrogate model for emulation and physics extraction of large eddy simulations
Simon Mak, Chih-Li Sung, Xingjian Wang, Shiang-Ting Yeh, Yu-Hung Chang, V. Roshan Joseph, Vigor Yang, C. F. Jeff Wu
TL;DR: In this article, the authors proposed a new surrogate model that provides efficient prediction and uncertainty quantification of turbulent flows in swirl injectors with varying geometries, devices commonly used in many engineering applications.
Journal Article (DOI)
Estimating mechanical properties from spherical indentation using Bayesian approaches
TL;DR: In this paper, a Gaussian process (or kriging) surrogate model is built from finite element simulations of spherical indentation, and the inverse problem is solved within a Bayesian framework using Markov chain Monte Carlo sampling.
Posted Content
Optimal Thinning of MCMC Output
Marina Riabiz, Wilson Ye Chen, Jon Cockayne, Pawel Swietach, Steven A. Niederer, Lester Mackey, Chris J. Oates
TL;DR: A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations.
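The greedy strategy described in this blurb can be sketched generically. The snippet below greedily selects a subset of MCMC draws, at each step adding the point that most reduces a discrepancy between the selected set and the full chain. To keep the sketch self-contained we use the sample energy distance from the main paper as the discrepancy, rather than the kernel Stein discrepancy (which additionally requires the score function of the target), so this illustrates the greedy selection idea, not the cited method itself; all names here are our own.

```python
import numpy as np

def greedy_thin(chain, k):
    """Greedily pick k indices of chain (N, d) so that the selected points
    stay close to the whole chain in energy distance. The chain-only term
    E||Y - Y'|| is constant over subsets, so it is dropped from the objective."""
    n = len(chain)
    # Pairwise distances between all chain points, computed once.
    d = np.linalg.norm(chain[:, None, :] - chain[None, :, :], axis=-1)  # (N, N)
    d_to_chain = d.mean(axis=1)  # mean distance from each point to the chain
    selected = []
    for _ in range(k):
        best_i, best_val = -1, np.inf
        for i in range(n):
            if i in selected:
                continue
            cand = selected + [i]
            # 2 * E||X - Y||  -  E||X - X'|| under the candidate subset.
            val = 2.0 * d_to_chain[cand].mean() - d[np.ix_(cand, cand)].mean()
            if val < best_val:
                best_i, best_val = i, val
        selected.append(best_i)
    return selected

# Usage: thin a toy 1000-draw "chain" down to 20 representative draws.
rng = np.random.default_rng(0)
chain = rng.standard_normal((1000, 2))
idx = greedy_thin(chain, 20)
```

Precomputing the pairwise distance matrix makes each greedy step cheap; the first selected index is simply the most central point of the chain, and later picks trade centrality against repulsion from points already chosen.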
References
Journal Article
R: A language and environment for statistical computing.
TL;DR: R is a freely available language and environment for statistical computing and graphics, distributed by the R Foundation for Statistical Computing.
Book
Applied Regression Analysis
Norman R. Draper, Harry Smith
TL;DR: In this book, the straight-line case is used to introduce fitting by least squares, and the Durbin-Watson test is applied to check the adequacy of the straight-line fit.
Journal Article (DOI)
Regularization Paths for Generalized Linear Models via Coordinate Descent
TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.
Journal Article (DOI)
Least squares quantization in PCM
TL;DR: In this article, necessary conditions are derived that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy in order to minimize average quantization noise power.
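The two necessary conditions of this classic result (each quantum sits at the centroid of its interval, and interval boundaries fall midway between neighbouring quanta) suggest the familiar alternating scheme, sketched below for a scalar source. This is our illustrative rendering of Lloyd's iteration, not code from the paper; initialization and iteration counts are our own choices.

```python
import numpy as np

def lloyd_quantizer(samples, k, iters=50):
    """Alternate between the two necessary conditions for an optimal k-level
    quantizer of a scalar source: assign each sample to its nearest quantum,
    then move each quantum to the mean (centroid) of its cell."""
    # Initialize the quanta at evenly spaced sample quantiles.
    levels = np.quantile(samples, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        cells = np.abs(samples[:, None] - levels[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(cells == j):
                levels[j] = samples[cells == j].mean()
    return np.sort(levels)

def distortion(samples, levels):
    """Mean squared quantization error (average quantization noise power)."""
    return (np.abs(samples[:, None] - levels[None, :]).min(axis=1) ** 2).mean()

# Usage: an 8-level quantizer for a Gaussian source.
rng = np.random.default_rng(0)
s = rng.standard_normal(5000)
q = lloyd_quantizer(s, 8)
```

The resulting levels cluster where the source density is high, so the learned quantizer attains lower distortion than a uniformly spaced one over the same range.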