
Showing papers on "Probability distribution published in 2018"


Journal ArticleDOI
TL;DR: In this paper, the authors consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset and use the Wasserstein metric to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples.
Abstract: We consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset. Using the Wasserstein metric, we construct a ball in the space of (multivariate and non-discrete) probability distributions centered at the uniform distribution on the training samples, and we seek decisions that perform best in view of the worst-case distribution within this Wasserstein ball. The state-of-the-art methods for solving the resulting distributionally robust optimization problems rely on global optimization techniques, which quickly become computationally excruciating. In this paper we demonstrate that, under mild assumptions, the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs—in many interesting cases even as tractable linear programs. Leveraging recent measure concentration results, we also show that their solutions enjoy powerful finite-sample performance guarantees. Our theoretical results are exemplified in mean-risk portfolio optimization as well as uncertainty quantification.

913 citations
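
As a rough illustration of the worst-case reformulation described in this abstract (not the paper's general convex program), the following Python sketch uses a known special case: for a type-1 Wasserstein ball with unrestricted support and a loss that is Lipschitz in the uncertain parameter, the worst-case expected loss equals the empirical expected loss plus the radius times the Lipschitz constant. The newsvendor-style loss, radius, and data below are illustrative assumptions.

```python
import numpy as np

# Newsvendor loss l(x, xi) = max(h*(x - xi), b*(xi - x)); Lipschitz in xi with constant max(h, b).
h, b = 1.0, 3.0
lip = max(h, b)

rng = np.random.default_rng(0)
xi_train = rng.lognormal(mean=2.0, sigma=0.5, size=50)   # finite training dataset
eps = 0.5                                                # Wasserstein radius

def worst_case_cost(x):
    """Empirical cost + eps * Lipschitz constant = sup over the W1 ball (Lipschitz-loss special case)."""
    empirical = np.mean(np.maximum(h * (x - xi_train), b * (xi_train - x)))
    return empirical + eps * lip

orders = np.linspace(0.0, 30.0, 301)
x_dro = orders[np.argmin([worst_case_cost(x) for x in orders])]
print(x_dro, worst_case_cost(x_dro))
```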


Journal ArticleDOI
TL;DR: Log-binomial and robust (modified) Poisson regression models are popular approaches to estimate risk ratios for binary response variables but their performance under model misspecification is poorly understood.
Abstract: Log-binomial and robust (modified) Poisson regression models are popular approaches to estimate risk ratios for binary response variables. Previous studies have shown that comparatively they produce similar point estimates and standard errors. However, their performance under model misspecification is poorly understood. In this simulation study, the statistical performance of the two models was compared when the log link function was misspecified or the response depended on predictors through a non-linear relationship (i.e. truncated response). Point estimates from log-binomial models were biased when the link function was misspecified or when the probability distribution of the response variable was truncated at the right tail. The percentage of truncated observations was positively associated with the presence of bias, and the bias was larger if the observations came from a population with a lower response rate given that the other parameters being examined were fixed. In contrast, point estimates from the robust Poisson models were unbiased. Under model misspecification, the robust Poisson model was generally preferable because it provided unbiased estimates of risk ratios.

260 citations
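
A minimal sketch of the "modified Poisson" idea the abstract refers to: fit a Poisson GLM with a log link to a binary outcome and pair it with robust (sandwich) standard errors. The simulated data and the statsmodels call are illustrative, not taken from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
x = rng.binomial(1, 0.5, size=n)                  # a binary exposure
risk = np.exp(-1.6 + np.log(1.5) * x)             # true risk ratio of 1.5 under a log link
y = rng.binomial(1, risk)

X = sm.add_constant(x)
# The Poisson likelihood is "wrong" for binary y, so robust (HC0) covariance is used for inference.
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
print(np.exp(fit.params))        # exponentiated coefficients approximate risk ratios
print(fit.bse)                   # robust standard errors
```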


Proceedings Article
31 Mar 2018
TL;DR: In this paper, the authors propose a method to train large scale generative models using an optimal transport loss, which is based on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations.
Abstract: The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, (iii) the difficulty of robustly estimating these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing one to find a sweet spot leveraging the geometry of OT and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates. The resulting computational architecture nicely complements standard deep network generative models with a stack of extra layers implementing the loss function.

245 citations
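
As a toy illustration of the entropic smoothing step (idea (a) above), here is a minimal NumPy sketch of Sinkhorn fixed-point iterations between two discrete histograms. The paper additionally differentiates through these iterations automatically and works with samples from generative models, which this sketch does not attempt; the grid, costs, and regularization strength are arbitrary choices.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps=0.1, n_iter=500):
    """Entropy-regularized OT between histograms a and b (cost matrix C) via Sinkhorn iterations."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]        # approximate optimal transport plan
    return np.sum(P * C)

# two histograms on a 1D grid
x = np.linspace(0.0, 1.0, 50)
a = np.exp(-((x - 0.3) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.02); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2         # squared-distance ground cost
print(sinkhorn_cost(a, b, C, eps=0.01))
```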


Journal ArticleDOI
TL;DR: The usefulness and reliability of RAVE are demonstrated by applying it to model potentials of increasing complexity, including computation of the binding free energy profile for a hydrophobic ligand-substrate system in explicit water with a dissociation time of more than 3 min, using at least twenty times less computer time than umbrella sampling or metadynamics.
Abstract: Here we propose the reweighted autoencoded variational Bayes for enhanced sampling (RAVE) method, a new iterative scheme that uses the deep learning framework of variational autoencoders to enhance sampling in molecular simulations. RAVE involves iterations between molecular simulations and deep learning in order to produce an increasingly accurate probability distribution along a low-dimensional latent space that captures the key features of the molecular simulation trajectory. Using the Kullback-Leibler divergence between this latent space distribution and the distribution of various trial reaction coordinates sampled from the molecular simulation, RAVE determines an optimum, yet nonetheless physically interpretable, reaction coordinate and optimum probability distribution. Both then directly serve as the biasing protocol for a new biased simulation, which is once again fed into the deep learning module with appropriate weights accounting for the bias, the procedure continuing until estimates of desirable thermodynamic observables are converged. Unlike recent methods using deep learning for enhanced sampling purposes, RAVE stands out in that (a) it naturally produces a physically interpretable reaction coordinate, (b) it is independent of existing enhanced sampling protocols to enhance the fluctuations along the latent space identified via deep learning, and (c) it provides the ability to easily filter out spurious solutions learned by the deep learning procedure. The usefulness and reliability of RAVE are demonstrated by applying it to model potentials of increasing complexity, including computation of the binding free energy profile for a hydrophobic ligand-substrate system in explicit water with a dissociation time of more than 3 min, in computer time at least twenty times less than that needed for umbrella sampling or metadynamics.

225 citations


Journal ArticleDOI
TL;DR: SMC provides a more widely applicable and scalable alternative to numerical and symbolic analysis of properties of stochastic systems; this survey of SMC algorithms, techniques, and tools emphasizes current limitations and tradeoffs between precision and scalability.
Abstract: Interactive, distributed, and embedded systems often behave stochastically, for example, when inputs, message delays, or failures conform to a probability distribution. However, reasoning analytically about the behavior of complex stochastic systems is generally infeasible. While simulations of systems are commonly used in engineering practice, they have not traditionally been used to reason about formal specifications. Statistical model checking (SMC) addresses this weakness by using a simulation-based approach to reason about precise properties specified in a stochastic temporal logic. For example, a specification for a communication system may state that, within some time bound, the probability that the number of messages in a queue will be greater than 5 must be less than 0.01. Using SMC, executions of a stochastic system are first sampled, after which statistical techniques are applied to determine whether such a property holds. While the output of sample-based methods is not always correct, statistical inference can quantify the confidence in the result produced. In effect, SMC provides a more widely applicable and scalable alternative to analysis of properties of stochastic systems using numerical and symbolic methods. SMC techniques have been successfully applied to analyze systems with large state spaces in areas such as computer networking, security, and systems biology. In this article, we survey SMC algorithms, techniques, and tools, while emphasizing current limitations and tradeoffs between precision and scalability.

211 citations
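
The core loop of statistical model checking can be sketched as follows: sample executions, check the property on each, and use a concentration bound to choose the number of samples for a desired confidence. This toy sketch uses a Hoeffding-style sample size and a trivial stand-in "simulation"; real SMC tools connect this loop to a stochastic system model and a temporal-logic monitor.

```python
import math
import random

def smc_probability(simulate_and_check, half_width=0.01, delta=1e-3):
    """Estimate p = P(property holds) to within +/- half_width with confidence 1 - delta,
    using Hoeffding's inequality to choose the number of simulation runs."""
    n = math.ceil(math.log(2.0 / delta) / (2.0 * half_width ** 2))
    successes = sum(1 for _ in range(n) if simulate_and_check())
    return successes / n, n

# stand-in for "simulate the queue and check that it never exceeds 5 messages within the time bound"
def toy_run():
    return random.random() > 0.008     # property holds in roughly 99.2% of runs

p_hat, runs = smc_probability(toy_run)
print(f"estimated probability {p_hat:.4f} from {runs} runs")
```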


Journal ArticleDOI
TL;DR: This work presents a quantum algorithm for the Monte Carlo pricing of financial derivatives and shows how the amplitude estimation algorithm can be applied to achieve a quadratic quantum speedup in the number of steps required to obtain an estimate for the price with high confidence.
Abstract: This work presents a quantum algorithm for the Monte Carlo pricing of financial derivatives. We show how the relevant probability distributions can be prepared in quantum superposition, the payoff functions can be implemented via quantum circuits, and the price of financial derivatives can be extracted via quantum measurements. We show how the amplitude estimation algorithm can be applied to achieve a quadratic quantum speedup in the number of steps required to obtain an estimate for the price with high confidence. This work provides a starting point for further research at the interface of quantum computing and finance.

208 citations
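
For context on the claimed quadratic speedup, here is the classical Monte Carlo baseline the quantum approach is compared against: pricing a European call under geometric Brownian motion, where the estimation error shrinks like 1/sqrt(n) in the number of samples, versus roughly 1/n for amplitude estimation. The parameters are illustrative; this is not the quantum algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)
S0, K, r, sigma, T = 100.0, 105.0, 0.01, 0.2, 1.0
n = 10**6                      # classical MC error ~ 1/sqrt(n); amplitude estimation ~ 1/n oracle calls

Z = rng.standard_normal(n)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)   # terminal prices under GBM
payoff = np.maximum(ST - K, 0.0)
price = np.exp(-r * T) * payoff.mean()
stderr = np.exp(-r * T) * payoff.std(ddof=1) / np.sqrt(n)
print(f"call price ~ {price:.3f} +/- {stderr:.3f}")
```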


Proceedings Article
01 Jan 2018
TL;DR: This work shows that the nonlinearity of the generator implies that the latent space gives a distorted view of the input space, that this distortion can be characterized by a stochastic Riemannian metric, and that distances and interpolants are significantly improved under this metric.
Abstract: Deep generative models provide a systematic way to learn nonlinear data distributions, through a set of latent variables and a nonlinear "generator" function that maps latent points into the input space. The nonlinearity of the generator implies that the latent space gives a distorted view of the input space. Under mild conditions, we show that this distortion can be characterized by a stochastic Riemannian metric, and demonstrate that distances and interpolants are significantly improved under this metric. This in turn improves probability distributions, sampling algorithms and clustering in the latent space. Our geometric analysis further reveals that current generators provide poor variance estimates and we propose a new generator architecture with vastly improved variance estimates. Results are demonstrated on convolutional and fully connected variational autoencoders, but the formalism easily generalizes to other deep generative models.

200 citations


Journal ArticleDOI
TL;DR: Wasserstein distances as discussed by the authors measure the minimal effort required to reconfigure the probability mass of one distribution in order to recover the other distribution, and have a long history that has seen them catalyse core developments in analysis, optimization, and probability.
Abstract: Wasserstein distances are metrics on probability distributions inspired by the problem of optimal mass transportation. Roughly speaking, they measure the minimal effort required to reconfigure the probability mass of one distribution in order to recover the other distribution. They are ubiquitous in mathematics, with a long history that has seen them catalyse core developments in analysis, optimization, and probability. Beyond their intrinsic mathematical richness, they possess attractive features that make them a versatile tool for the statistician: they can be used to derive weak convergence and convergence of moments, and can be easily bounded; they are well-adapted to quantify a natural notion of perturbation of a probability distribution; and they seamlessly incorporate the geometry of the domain of the distributions in question, thus being useful for contrasting complex objects. Consequently, they frequently appear in the development of statistical theory and inferential methodology, and have recently become an object of inference in themselves. In this review, we provide a snapshot of the main concepts involved in Wasserstein distances and optimal transportation, and a succinct overview of some of their many statistical aspects.

186 citations
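
A small illustration of the "minimal effort to reconfigure probability mass" interpretation: for one-dimensional samples the 1-Wasserstein distance has a closed form based on quantile functions, exposed in SciPy as scipy.stats.wasserstein_distance. The two sample sets below are arbitrary examples.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=5000)
y = rng.normal(loc=0.5, scale=1.0, size=5000)     # same shape, shifted by 0.5

# For a pure location shift, W1 is approximately the size of the shift,
# directly reflecting the geometry of the domain in a way that e.g. a KL divergence does not.
print(wasserstein_distance(x, y))                  # ~0.5
```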


Journal ArticleDOI
TL;DR: In this paper, the size distribution of a time-evolving operator in the SYK model is discussed and the authors evaluate the distribution numerically for N = 30, and show how to compute it in the large-N theory using the dressed fermion propagator.
Abstract: We discuss the probability distribution for the “size” of a time-evolving operator in the SYK model. Scrambling is related to the fact that as time passes, the distribution shifts towards larger operators. Initially, the rate is exponential and determined by the infinite-temperature chaos exponent. We evaluate the size distribution numerically for N = 30, and show how to compute it in the large-N theory using the dressed fermion propagator. We then evaluate the distribution explicitly at leading nontrivial order in the large-q expansion.

169 citations


Journal ArticleDOI
TL;DR: In this paper, a distributionally robust chance constrained approximate ac-OPF is proposed to manage variable renewable energy (VRE) uncertainties; the ambiguity set is constructed from historical data without any presumption on the type of the probability distribution, and more data leads to a smaller ambiguity set and a less conservative strategy.
Abstract: Chance constrained optimal power flow (OPF) has been recognized as a promising framework to manage the risk from variable renewable energy (VRE). In the presence of VRE uncertainties, this paper discusses a distributionally robust chance constrained approximate ac-OPF. The power flow model employed in the proposed OPF formulation combines an exact ac power flow model at the nominal operation point and an approximate linear power flow model to reflect the system response under uncertainties. The ambiguity set employed in the distributionally robust formulation is the Wasserstein ball centered at the empirical distribution. The proposed OPF model minimizes the expectation of the quadratic cost function w.r.t. the worst-case probability distribution and guarantees the chance constraints satisfied for any distribution in the ambiguity set. The whole method is data-driven in the sense that the ambiguity set is constructed from historical data without any presumption on the type of the probability distribution, and more data leads to smaller ambiguity set and less conservative strategy. Moreover, special problem structures of the proposed problem formulation are exploited to develop an efficient and scalable solution approach. Case studies are carried out on the IEEE 14 and 118 bus systems to show the accuracy and necessity of the approximate ac model and the attractive features of the distributionally robust optimization approach compared with other methods to deal with uncertainties.

156 citations


Journal ArticleDOI
01 Jan 2018
TL;DR: Kernel density estimation is a technique for estimating a probability density function that enables the user to analyse the studied probability distribution better than with a traditional histogram.
Abstract: Kernel density estimation is a technique for estimating a probability density function that enables the user to analyse the studied probability distribution better than with a traditional histogram. Unlike the histogram, the kernel technique produces a smooth estimate of the pdf, uses the locations of all sample points and suggests multimodality more convincingly. In two-dimensional applications, kernel estimation has a further advantage, since a 2D histogram additionally requires defining the orientation of the 2D bins. Two concepts play a fundamental role in kernel estimation: the kernel function shape and the coefficient of smoothness, of which the latter is crucial to the method. Several real-life examples, both for univariate and bivariate applications, are shown.
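
A minimal sketch of the two ingredients highlighted above, the kernel shape and the coefficient of smoothness (bandwidth), using scipy.stats.gaussian_kde on an arbitrary bimodal sample; the data and the bandwidth rule are illustrative choices.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(1.0, 1.0, 700)])  # bimodal data

kde = gaussian_kde(sample, bw_method="silverman")   # Gaussian kernel; Silverman's rule sets the smoothness
grid = np.linspace(-5, 5, 400)
density = kde(grid)                                 # smooth pdf estimate; both modes are visible
print(grid[np.argmax(density)])                     # location of the highest mode
```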

Proceedings ArticleDOI
Fan Bai, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou
18 Jun 2018
TL;DR: In this paper, the authors propose a novel method called edit probability (EP) for scene text recognition, which estimates the probability of generating a string from the output sequence of probability distributions conditioned on the input image, while considering the possible occurrences of missing/superfluous characters.
Abstract: We consider the scene text recognition problem under the attention-based encoder-decoder framework, which is the state of the art. The existing methods usually employ a frame-wise maximal likelihood loss to optimize the models. When we train the model, the misalignment between the ground truth strings and the attention's output sequences of probability distribution, which is caused by missing or superfluous characters, will confuse and mislead the training process, and consequently make the training costly and degrade the recognition accuracy. To handle this problem, we propose a novel method called edit probability (EP) for scene text recognition. EP tries to effectively estimate the probability of generating a string from the output sequence of probability distribution conditioned on the input image, while considering the possible occurrences of missing/superfluous characters. The advantage lies in that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome. We conduct extensive experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets. Experimental results show that the EP can substantially boost scene text recognition performance.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a data driven distributionally robust chance constrained optimal power flow model (DRCC-OPF), which ensures that the worst-case probability of violating both the upper and lower limit of a line/bus capacity under a wide family of distributions is small.
Abstract: The uncertainty associated with renewable energy sources introduces significant challenges in optimal power flow (OPF) analysis. A variety of new approaches have been proposed that use chance constraints to limit line or bus overload risk in OPF models. Most existing formulations assume that the probability distributions associated with the uncertainty are known a priori or can be estimated accurately from empirical data, and/or use separate chance constraints for upper and lower line/bus limits. In this paper, we propose a data driven distributionally robust chance constrained optimal power flow model (DRCC-OPF), which ensures that the worst-case probability of violating both the upper and lower limit of a line/bus capacity under a wide family of distributions is small. Assuming that we can estimate the first and second moments of the underlying distributions based on empirical data, we propose an exact reformulation of DRCC-OPF as a tractable convex program. The key theoretical result behind this reformulation is a second-order cone programming (SOCP) reformulation of a general two-sided distributionally robust chance constrained set by lifting the set to a higher dimensional space. Our numerical study shows that the proposed SOCP formulation can be solved efficiently and that the results of our model are quite robust.

Journal ArticleDOI
TL;DR: This paper proposes a stochastic model predictive control approach to optimize the fuel consumption in a vehicle following context using a conditional linear Gauss model to estimate the probability distribution of the future velocity of the preceding vehicle.
Abstract: This paper proposes a stochastic model predictive control (MPC) approach to optimize the fuel consumption in a vehicle following context. The practical solution of that problem requires solving a constrained moving horizon optimal control problem using a short-term prediction of the preceding vehicle’s velocity. In a deterministic framework, the prediction errors lead to constraint violations and to harsh control reactions. Instead, the suggested method considers errors, and limits the probability of a constraint violation. A conditional linear Gauss model is developed and trained with real measurements to estimate the probability distribution of the future velocity of the preceding vehicle. The prediction model is used to evaluate two different stochastic MPC approaches. On the one hand, an MPC with individual chance constraints is applied. On the other hand, samples are drawn from the conditional Gaussian model and used for a scenario-based optimization approach. Finally, both developed control strategies are evaluated and compared against a standard deterministic MPC. The evaluation of the controllers shows a significant reduction of the fuel consumption compared with standard adaptive cruise control algorithms.

Journal ArticleDOI
TL;DR: This paper proposes a distributionally robust optimization approach for the contingency-constrained unit commitment problem, and derives an equivalent reformulation and study a Benders’ decomposition algorithm for solving the model.
Abstract: This paper proposes a distributionally robust optimization approach for the contingency-constrained unit commitment problem. In our approach, we consider a case where the true probability distribution of contingencies is ambiguous, i.e., difficult to accurately estimate. Instead of assigning a (fixed) probability estimate for each contingency scenario, we consider a set of contingency probability distributions (termed the ambiguity set) based on the $N-k$ security criterion and moment information. Our approach considers all possible distributions in the ambiguity set, and is hence distributionally robust. Meanwhile, as this approach utilizes moment information, it can benefit from available data and become less conservative than the robust optimization approaches. We derive an equivalent reformulation and study a Benders’ decomposition algorithm for solving the model. Furthermore, we extend the model to incorporate wind power uncertainty. The case studies on a 6-Bus system and the IEEE 118-Bus system demonstrate that the proposed approach provides less conservative unit commitment decisions as compared with the robust optimization approach.

Journal ArticleDOI
TL;DR: This letter proposes a simple yet effective approach to assess the Gaussian phasor measurement unit measurement error assumption by using the stability property of a probability distribution and the concept of redundant measurement.
Abstract: Gaussian phasor measurement unit (PMU) measurement error has been assumed for many power system applications, such as state estimation, oscillatory modes monitoring, voltage stability analysis, to cite a few. This letter proposes a simple yet effective approach to assess this assumption by using the stability property of a probability distribution and the concept of redundant measurement. Extensive results using field PMU data from WECC system reveal that the Gaussian assumption is questionable.

Journal ArticleDOI
TL;DR: Modeling the probability distribution of complex data using insights from quantum physics is a fresh approach to generative modeling in machine learning and shows great potential compared to conventional neural network approaches.
Abstract: Modeling the probability distribution of complex data using insights from quantum physics is a fresh approach to generative modeling in machine learning, and shows great potential compared to conventional neural network approaches.

Posted Content
TL;DR: In this paper, a new saliency map model is proposed which formulates a map as a generalized Bernoulli distribution and then trains a deep architecture to predict such maps using novel loss functions which pair the softmax activation function with measures designed to compute distances between probability distributions.
Abstract: Most saliency estimation methods aim to explicitly model low-level conspicuity cues such as edges or blobs and may additionally incorporate top-down cues using face or text detection. Data-driven methods for training saliency models using eye-fixation data are increasingly popular, particularly with the introduction of large-scale datasets and deep architectures. However, current methods in this latter paradigm use loss functions designed for classification or regression tasks whereas saliency estimation is evaluated on topographical maps. In this work, we introduce a new saliency map model which formulates a map as a generalized Bernoulli distribution. We then train a deep architecture to predict such maps using novel loss functions which pair the softmax activation function with measures designed to compute distances between probability distributions. We show in extensive experiments the effectiveness of such loss functions over standard ones on four public benchmark datasets, and demonstrate improved performance over state-of-the-art saliency methods.
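
A rough sketch of the loss construction described above: treat the predicted saliency map as a probability distribution over pixels via a softmax, normalize the ground-truth fixation map, and compare the two with a distance between distributions (KL divergence here). The array shapes and the specific distance are illustrative choices, not the paper's exact losses.

```python
import numpy as np

def saliency_kl_loss(pred_logits, fixation_map, eps=1e-8):
    """KL(ground truth || prediction) between two per-pixel probability distributions."""
    z = pred_logits.reshape(-1)
    p = np.exp(z - z.max())
    p /= p.sum()                                    # softmax over all pixels
    q = fixation_map.reshape(-1).astype(float)
    q = q / (q.sum() + eps)                         # normalize fixations to a distribution
    return float(np.sum(q * (np.log(q + eps) - np.log(p + eps))))

# toy 8x8 example
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 8))
fixations = np.zeros((8, 8)); fixations[2, 3] = fixations[5, 5] = 1.0
print(saliency_kl_loss(logits, fixations))
```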

Proceedings ArticleDOI
20 Jun 2018
TL;DR: In this paper, the authors use the Sum of Squares method to develop efficient algorithms for learning well-separated mixtures of Gaussians and for robust mean estimation in high dimensions, including the first polynomial-time algorithm whose guarantees approach the information-theoretic limit for non-Gaussian distributions.
Abstract: We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Our contributions are: Mixture models with separated means: We study mixtures of poly(k)-many k-dimensional distributions where the means of every pair of distributions are separated by at least k^ε. In the special case of spherical Gaussian mixtures, we give a k^{O(1/ε)}-time algorithm that learns the means assuming separation at least k^ε, for any ε > 0. This is the first algorithm to improve on greedy (“single-linkage”) and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k^{1/4}. Robust estimation: When an unknown (1−ε)-fraction of X_1,…,X_n are chosen from a sub-Gaussian distribution with mean µ but the remaining points are chosen adversarially, we give an algorithm recovering µ to error ε^{1−1/t} in time k^{O(t)}, so long as sub-Gaussian-ness up to O(t) moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than ε^{1/2}. As a corollary, we achieve similar results for robust covariance estimation. Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given X_1,…,X_n ∈ ℝ^k for large k and n = poly(k) with the promise that a subset of X_1,…,X_n were sampled from a probability distribution with bounded moments, recover some information about that distribution.

Journal ArticleDOI
TL;DR: In this article, the exact probability distribution of a run-and-tumble particle with and without diffusion on the infinite line, as well as in a finite interval, was investigated.
Abstract: We investigate the motion of a run-and-tumble particle (RTP) in one dimension. We find the exact probability distribution of the particle with and without diffusion on the infinite line, as well as in a finite interval. In the infinite domain, this probability distribution approaches a Gaussian form in the long-time limit, as in the case of a regular Brownian particle. At intermediate times, this distribution exhibits unexpected multi-modal forms. In a finite domain, the probability distribution reaches a steady state form with peaks at the boundaries, in contrast to a Brownian particle. We also study the relaxation to the steady state analytically. Finally we compute the survival probability of the RTP in a semi-infinite domain. In the finite interval, we compute the exit probability and the associated exit times. We provide numerical verifications of our analytical results.
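
A small simulation sketch of the run-and-tumble dynamics studied above (velocity ±v with tumbles at rate γ, plus optional translational diffusion D); histogramming the positions at intermediate and long times qualitatively reproduces the multi-modal shapes and the Gaussian long-time limit. All parameters are arbitrary choices.

```python
import numpy as np

def rtp_positions(n=50_000, v=1.0, gamma=1.0, D=0.0, t=2.0, dt=5e-3, seed=0):
    """Positions of n independent 1D run-and-tumble particles at time t."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    sigma = rng.choice([-1.0, 1.0], size=n)          # current run direction
    for _ in range(int(t / dt)):
        x += v * sigma * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal(n)
        flips = rng.random(n) < gamma * dt           # tumble with probability gamma*dt
        sigma[flips] *= -1.0
    return x

x_short = rtp_positions(t=0.5)    # strongly non-Gaussian, with weight piling up near +/- v*t
x_long = rtp_positions(t=20.0)    # close to Gaussian with variance ~ v**2 * t / gamma (for D = 0)
print(x_short.std(), x_long.std())
```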

Journal ArticleDOI
TL;DR: A GAN-based workflow of training and image generation is applied to an oolitic Ketton limestone micro-CT unsegmented gray-level dataset, and results show that GANs allow a fast and accurate reconstruction of the evaluated image dataset.
Abstract: Stochastic image reconstruction is a key part of modern digital rock physics and material analysis that aims to create representative samples of microstructures for upsampling, upscaling and uncertainty quantification. We present new results of a method of three-dimensional stochastic image reconstruction based on generative adversarial neural networks (GANs). GANs are a family of unsupervised learning methods that require no a priori inference of the probability distribution associated with the training data. Thanks to the use of two convolutional neural networks, the discriminator and the generator, in the training phase, and only the generator in the simulation phase, GANs allow the sampling of large and realistic volumetric images. We apply a GAN-based workflow of training and image generation to an oolitic Ketton limestone micro-CT unsegmented gray-level dataset. Minkowski functionals calculated as a function of the segmentation threshold are compared between simulated and acquired images. Flow simulations are run on the segmented images, and effective permeability and velocity distributions of simulated flow are also compared. Results show that GANs allow a fast and accurate reconstruction of the evaluated image dataset. We discuss the performance of GANs in relation to other simulation techniques and stress the benefits resulting from the use of convolutional neural networks. We address a number of challenges involved in GANs, in particular the representation of the probability distribution associated with the training data.

Journal ArticleDOI
TL;DR: In this paper, a new framework for efficient sampling from complex probability distributions, using a combination of transport maps and the Metropolis-Hastings rule, is introduced, and the core idea is to use determin...
Abstract: We introduce a new framework for efficient sampling from complex probability distributions, using a combination of transport maps and the Metropolis--Hastings rule. The core idea is to use determin...

Journal ArticleDOI
TL;DR: This paper shows complexity theoretic evidence of hardness that is on par with the strongest theoretical proposals for supremacy, and shows that RCS satisfies an average-case hardness condition - computing output probabilities of typical quantum circuits is as hard as computing them in the worst-case, and therefore #P-hard.
Abstract: A critical milestone on the path to useful quantum computers is quantum supremacy - a demonstration of a quantum computation that is prohibitively hard for classical computers. A leading near-term candidate, put forth by the Google/UCSB team, is sampling from the probability distributions of randomly chosen quantum circuits, which we call Random Circuit Sampling (RCS). In this paper we study both the hardness and verification of RCS. While RCS was defined with experimental realization in mind, we show complexity theoretic evidence of hardness that is on par with the strongest theoretical proposals for supremacy. Specifically, we show that RCS satisfies an average-case hardness condition - computing output probabilities of typical quantum circuits is as hard as computing them in the worst-case, and therefore #P-hard. Our reduction exploits the polynomial structure in the output amplitudes of random quantum circuits, enabled by the Feynman path integral. In addition, it follows from known results that RCS satisfies an anti-concentration property, making it the first supremacy proposal with both average-case hardness and anti-concentration.

Proceedings ArticleDOI
27 Jan 2018
TL;DR: A generative framework for zero-shot action recognition where some of the possible action classes do not occur in the training data, based on modeling each action class using a probability distribution whose parameters are functions of the attribute vector representing that action class.
Abstract: We present a generative framework for zero-shot action recognition where some of the possible action classes do not occur in the training data. Our approach is based on modeling each action class using a probability distribution whose parameters are functions of the attribute vector representing that action class. In particular, we assume that the distribution parameters for any action class in the visual space can be expressed as a linear combination of a set of basis vectors where the combination weights are given by the attributes of the action class. These basis vectors can be learned solely using labeled data from the known (i.e., previously seen) action classes, and can then be used to predict the parameters of the probability distributions of unseen action classes. We consider two settings: (1) Inductive setting, where we use only the labeled examples of the seen action classes to predict the unseen action class parameters; and (2) Transductive setting which further leverages unlabeled data from the unseen action classes. Our framework also naturally extends to few-shot action recognition where a few labelled examples from unseen classes are available. Our experiments on benchmark datasets (UCF101, HMDB51 and Olympic) show significant performance improvements as compared to various baselines, in both standard zero-shot (disjoint seen and unseen classes) and generalized zero-shot learning settings.
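
The core modeling assumption above, that an unseen class's distribution parameters are attribute-weighted combinations of learned basis vectors, can be sketched in a few lines; the dimensions, the least-squares fit for the basis, and the Gaussian class model are illustrative simplifications rather than the paper's exact training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d_feat, n_attr, n_seen = 128, 16, 40

A_seen = rng.random((n_seen, n_attr))            # attribute vectors of the seen action classes
M_seen = rng.normal(size=(n_seen, d_feat))       # their estimated class means in visual-feature space

# Learn basis vectors B (n_attr x d_feat) so that M_seen ~ A_seen @ B  (least squares).
B, *_ = np.linalg.lstsq(A_seen, M_seen, rcond=None)

a_unseen = rng.random(n_attr)                    # attribute vector of an unseen action class
mu_unseen = a_unseen @ B                         # predicted mean of its (e.g. Gaussian) class model
print(mu_unseen.shape)                           # (128,)
```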

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of distributed hypothesis testing over a network and characterize the exponential rate of learning in terms of the nodes' influence of the network and the divergences between the observations' distributions.
Abstract: This paper considers a problem of distributed hypothesis testing over a network. Individual nodes in a network receive noisy local (private) observations whose distribution is parameterized by a discrete parameter (hypothesis). The marginals of the joint observation distribution conditioned on each hypothesis are known locally at the nodes, but the true parameter/hypothesis is not known. An update rule is analyzed in which nodes first perform a Bayesian update of their belief (distribution estimate) of each hypothesis based on their local observations, communicate these updates to their neighbors, and then perform a “non-Bayesian” linear consensus using the log-beliefs of their neighbors. Under mild assumptions, we show that the belief of any node on a wrong hypothesis converges to zero exponentially fast. We characterize the exponential rate of learning, which we call the network divergence, in terms of the nodes’ influence of the network and the divergences between the observations’ distributions. For a broad class of observation statistics which includes distributions with unbounded support such as Gaussian mixtures, we show that rate of rejection of wrong hypothesis satisfies a large deviation principle, i.e., the probability of sample paths on which the rate of rejection of wrong hypothesis deviates from the mean rate vanishes exponentially fast and we characterize the rate function in terms of the nodes’ influence of the network and the local observation models.
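
A minimal sketch of the update rule analyzed above: each node performs a local Bayesian update of its belief vector, then averages log-beliefs with its neighbors using a row-stochastic weight matrix. The network, observation model, and weights below are toy placeholders.

```python
import numpy as np

def nonbayesian_round(beliefs, W, log_lik):
    """One round of the 'non-Bayesian' social learning update.
    beliefs: (n_nodes, n_hyp) rows summing to 1; W: (n_nodes, n_nodes) row-stochastic weights;
    log_lik: (n_nodes, n_hyp) log-likelihood of each node's private observation under each hypothesis."""
    logb = np.log(beliefs) + log_lik                 # local Bayesian update (unnormalized, in logs)
    logb = W @ logb                                  # linear consensus on log-beliefs
    logb -= logb.max(axis=1, keepdims=True)
    b = np.exp(logb)
    return b / b.sum(axis=1, keepdims=True)

# toy example: 3 nodes, 2 hypotheses, private observations ~ N(theta, 1) with theta in {0, 1}
rng = np.random.default_rng(0)
W = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
beliefs = np.full((3, 2), 0.5)
true_theta = 1.0
for _ in range(200):
    obs = rng.normal(true_theta, 1.0, size=3)
    log_lik = -0.5 * (obs[:, None] - np.array([0.0, 1.0])[None, :]) ** 2
    beliefs = nonbayesian_round(beliefs, W, log_lik)
print(beliefs)   # belief on the wrong hypothesis decays exponentially fast
```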

Journal ArticleDOI
TL;DR: In this article, the authors present a review of the state-of-the-art in probability-interval hybrid uncertainty analysis and provide an outlook for future research in this area.
Abstract: Traditional structural uncertainty analysis is mainly based on probability models and requires the establishment of accurate parametric probability distribution functions using large numbers of experimental samples. In many actual engineering problems, the probability distributions of some parameters can be established when sufficient samples are available, whereas for some parameters, due to the lack or poor quality of samples, only their variation intervals can be obtained, or their probability distribution types can be determined based on the existing data while some of the distribution parameters, such as the mean and standard deviation, can only be given interval estimates. This constitutes an important type of probability-interval hybrid uncertain problem, in which both aleatory and epistemic uncertainties exist. Probability-interval hybrid uncertainty analysis provides an important means for reliability analysis and design of many complex structures, and has become one of the research focuses in the field of structural uncertainty analysis over the past decades. This paper reviews the four main research directions in this area, i.e., uncertainty modeling, uncertainty propagation analysis, structural reliability analysis, and reliability-based design optimization. It summarizes the main scientific problems, technical difficulties, and current research status of each direction. Based on the review, this paper also provides an outlook for future research in probability-interval hybrid uncertainty analysis.

Journal ArticleDOI
TL;DR: In this paper, probability multi-valued neutrosophic sets (PMVNSs), based on multi-valued neutrosophic sets and probability distributions, are introduced to depict uncertain, incomplete, inconsistent and hesitant decision-making information and to reflect the distribution characteristics of all provided evaluation values.
Abstract: This paper introduces probability multi-valued neutrosophic sets (PMVNSs) based on multi-valued neutrosophic sets and probability distribution. PMVNS can serve as a reliable tool to depict uncertain, incomplete, inconsistent and hesitant decision-making information and reflect the distribution characteristics of all provided evaluation values. This paper focuses on developing an innovative method to address multi-criteria group decision-making (MCGDM) problems in which the weight information is completely unknown and the evaluation values taking the form of probability multi-valued neutrosophic numbers (PMVNNs). First, the definition of PMVNSs is described. Second, an extended convex combination operation of PMVNNs is defined, and the probability multi-valued neutrosophic number weighted average operator is proposed. Moreover, two cross-entropy measures for PMVNNs are presented, and a novel qualitative flexible multiple criteria method (QUALIFLEX) is developed. Subsequently, an innovative MCGDM approach is established by incorporating the proposed aggregation operator and the developed QUALIFLEX method. Finally, an illustrative example concerning logistics outsourcing is provided to demonstrate the proposed method, and its feasibility and validity are further verified by comparison with other existing methods.


Posted Content
Weijun Xie
TL;DR: It is shown that a DRCCP can be reformulated as a conditional value-at-risk constrained optimization problem, and thus admits tight inner and outer approximations and a big-M free formulation.
Abstract: This paper studies a distributionally robust chance constrained program (DRCCP) with Wasserstein ambiguity set, where the uncertain constraints should be satisfied with a probability at least a given threshold for all the probability distributions of the uncertain parameters within a chosen Wasserstein distance from an empirical distribution. In this work, we investigate equivalent reformulations and approximations of such problems. We first show that a DRCCP can be reformulated as a conditional value-at-risk constrained optimization problem, and thus admits tight inner and outer approximations. We also show that a DRCCP of bounded feasible region is mixed integer representable by introducing big-M coefficients and additional binary variables. For a DRCCP with pure binary decision variables, by exploring the submodular structure, we show that it admits a big-M free formulation, which can be solved by a branch and cut algorithm. Finally, we present a numerical study to illustrate the effectiveness of the proposed formulations.

Journal ArticleDOI
31 Jan 2018-Climate
TL;DR: In this article, the best-fit probability distributions for maximum monthly rainfall are determined using 30 years of data (1984-2013) from 35 locations in Bangladesh, applying different statistical analyses and distribution types.
Abstract: The study of frequency analysis is important to find the most suitable model that could anticipate extreme events of certain natural phenomena e.g., rainfall, floods, etc. The goal of this study is to determine the best-fit probability distributions in the case of maximum monthly rainfall using 30 years of data (1984–2013) from 35 locations in Bangladesh by using different statistical analysis and distribution types. Commonly used frequency distributions were applied. Parameters of these distributions were estimated by the method of moments and L-moments estimators. Three goodness-of-fit test statistics were applied. The best-fit result of each station was taken as the distribution with the lowest sum of the rank scores from each of the three test statistics. Generalized Extreme Value, Pearson type 3 and Log-Pearson type 3 distributions showed the largest number of best-fit results. Among the best score results, Generalized Extreme Value yielded the best-fit for 36% of the stations and Pearson type 3 and Log-Pearson type 3 each yielded the best-fit for 26% of the stations. The more practical result of this paper was that the 10-year, 25-year, 50-year and 100-year return periods of maximum monthly rainfall were calculated for all locations. The result of this study can be used to develop more accurate models of flooding risk and damage.