Open Access · Posted Content
Support points
Simon Mak, V. Roshan Joseph
TL;DR: A new way to compact a continuous probability distribution into a set of representative points, called support points, obtained by minimizing the energy distance; this minimization can be formulated as a difference-of-convex program and solved with two algorithms that efficiently generate representative point sets.

Abstract:
This paper introduces a new way to compact a continuous probability distribution $F$ into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo (2004) for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to $F$ and enjoy an improved error rate over Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance over both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations, and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.
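The energy-distance objective described in the abstract is concrete enough to sketch. The snippet below is a minimal illustrative implementation, not the paper's difference-of-convex algorithms: it estimates the sample energy distance between a candidate point set and a sample from $F$, and reduces it by plain gradient descent. The function names, step size, and iteration count are our own choices for illustration.

```python
import numpy as np

def energy_distance(x, y):
    """Sample energy distance between point sets x (n, d) and y (m, d):
    2*E||X - Y|| - E||X - X'|| - E||Y - Y'|| under the empirical measures."""
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1).mean()
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1).mean()
    dyy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1).mean()
    return 2.0 * dxy - dxx - dyy

def support_points(y, x0, steps=200, lr=0.05):
    """Move the points x0 to reduce their energy distance to the sample y
    by gradient descent (an illustrative stand-in for the paper's algorithms)."""
    x = x0.astype(float).copy()
    n, m = len(x), len(y)
    eps = 1e-12  # guards the i == i' terms, whose numerators are zero anyway
    for _ in range(steps):
        dxy = x[:, None, :] - y[None, :, :]                   # (n, m, d)
        nxy = np.linalg.norm(dxy, axis=-1, keepdims=True) + eps
        dxx = x[:, None, :] - x[None, :, :]                   # (n, n, d)
        nxx = np.linalg.norm(dxx, axis=-1, keepdims=True) + eps
        # Gradient of the energy distance with respect to each x_i:
        # attraction toward the sample, repulsion between support points.
        grad = (2.0 / (n * m)) * (dxy / nxy).sum(axis=1)
        grad -= (2.0 / n**2) * (dxx / nxx).sum(axis=1)
        x -= lr * grad
    return x

# Usage: compact 500 draws from a bivariate normal into 10 support points.
rng = np.random.default_rng(0)
y = rng.standard_normal((500, 2))
x0 = y[:10]
x = support_points(y, x0)
```

After optimization, `energy_distance(x, y)` should be smaller than `energy_distance(x0, y)`; the repulsion term spreads the points apart, which is what produces the space-filling behaviour of support points.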
Citations
Journal Article
Riemann manifold Langevin and Hamiltonian Monte Carlo methods
Mark Girolami, Ben Calderhead
TL;DR: The methodology proposed automatically adapts to the local structure when simulating paths across this manifold, providing highly efficient convergence and exploration of the target density, and substantial improvements in the time-normalized effective sample size are reported when compared with alternative sampling approaches.
Journal Article (DOI)
Optimal ratio for data splitting
TL;DR: This article shows that the optimal training/testing splitting ratio is √p : 1, where p is the number of parameters in a linear regression model that explains the data well.
Posted Content
An efficient surrogate model for emulation and physics extraction of large eddy simulations
Simon Mak, Chih-Li Sung, Xingjian Wang, Shiang-Ting Yeh, Yu-Hung Chang, V. Roshan Joseph, Vigor Yang, C. F. Jeff Wu
TL;DR: In this article, the authors proposed a new surrogate model that provides efficient prediction and uncertainty quantification of turbulent flows in swirl injectors with varying geometries, devices commonly used in many engineering applications.
Journal Article (DOI)
Estimating mechanical properties from spherical indentation using Bayesian approaches
TL;DR: In this paper, a Gaussian process (or kriging) surrogate model is built from finite element simulations of spherical indentation, and the inverse problem is solved within a Bayesian framework using Markov chain Monte Carlo sampling.
Posted Content
Optimal Thinning of MCMC Output
Marina Riabiz, Wilson Ye Chen, Jon Cockayne, Pawel Swietach, Steven A. Niederer, Lester Mackey, Chris J. Oates
TL;DR: A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations.
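The greedy strategy described in this blurb can be sketched generically. The snippet below greedily selects a subset of MCMC draws, at each step adding the point that most reduces a discrepancy between the selected set and the full chain. To keep the sketch self-contained we use the sample energy distance from the main paper as the discrepancy, rather than the kernel Stein discrepancy (which additionally requires the score function of the target), so this illustrates the greedy selection idea, not the cited method itself; all names here are our own.

```python
import numpy as np

def greedy_thin(chain, k):
    """Greedily pick k indices of chain (N, d) so that the selected points
    stay close to the whole chain in energy distance. The chain-only term
    E||Y - Y'|| is constant over subsets, so it is dropped from the objective."""
    n = len(chain)
    # Pairwise distances between all chain points, computed once.
    d = np.linalg.norm(chain[:, None, :] - chain[None, :, :], axis=-1)  # (N, N)
    d_to_chain = d.mean(axis=1)  # mean distance from each point to the chain
    selected = []
    for _ in range(k):
        best_i, best_val = -1, np.inf
        for i in range(n):
            if i in selected:
                continue
            cand = selected + [i]
            # 2 * E||X - Y||  -  E||X - X'|| under the candidate subset.
            val = 2.0 * d_to_chain[cand].mean() - d[np.ix_(cand, cand)].mean()
            if val < best_val:
                best_i, best_val = i, val
        selected.append(best_i)
    return selected

# Usage: thin a toy 1000-draw "chain" down to 20 representative draws.
rng = np.random.default_rng(0)
chain = rng.standard_normal((1000, 2))
idx = greedy_thin(chain, 20)
```

Precomputing the pairwise distance matrix makes each greedy step cheap; the first selected index is simply the most central point of the chain, and later picks trade centrality against repulsion from points already chosen.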
References
Journal Article
R: A language and environment for statistical computing.
TL;DR: R is a freely available language and environment for statistical computing and graphics, distributed by the R Foundation for Statistical Computing.
Book
Applied Regression Analysis
Norman R. Draper, Harry Smith
TL;DR: In this book, the straight-line case is used to introduce fitting by least squares, and the Durbin-Watson test is applied to check the adequacy of the straight-line fit.
Journal Article (DOI)
Regularization Paths for Generalized Linear Models via Coordinate Descent
TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.
Journal Article (DOI)
Least squares quantization in PCM
TL;DR: In this article, necessary conditions are derived that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy in order to minimize average quantization noise power.
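The two necessary conditions of this classic result (each quantum sits at the centroid of its interval, and interval boundaries fall midway between neighbouring quanta) suggest the familiar alternating scheme, sketched below for a scalar source. This is our illustrative rendering of Lloyd's iteration, not code from the paper; initialization and iteration counts are our own choices.

```python
import numpy as np

def lloyd_quantizer(samples, k, iters=50):
    """Alternate between the two necessary conditions for an optimal k-level
    quantizer of a scalar source: assign each sample to its nearest quantum,
    then move each quantum to the mean (centroid) of its cell."""
    # Initialize the quanta at evenly spaced sample quantiles.
    levels = np.quantile(samples, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        cells = np.abs(samples[:, None] - levels[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(cells == j):
                levels[j] = samples[cells == j].mean()
    return np.sort(levels)

def distortion(samples, levels):
    """Mean squared quantization error (average quantization noise power)."""
    return (np.abs(samples[:, None] - levels[None, :]).min(axis=1) ** 2).mean()

# Usage: an 8-level quantizer for a Gaussian source.
rng = np.random.default_rng(0)
s = rng.standard_normal(5000)
q = lloyd_quantizer(s, 8)
```

The resulting levels cluster where the source density is high, so the learned quantizer attains lower distortion than a uniformly spaced one over the same range.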