
Showing papers on "Probability distribution published in 2018"


Journal ArticleDOI
TL;DR: In this paper, the authors consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset and use the Wasserstein metric to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples.
Abstract: We consider stochastic programs where the distribution of the uncertain parameters is only observable through a finite training dataset. Using the Wasserstein metric, we construct a ball in the space of (multivariate and non-discrete) probability distributions centered at the uniform distribution on the training samples, and we seek decisions that perform best in view of the worst-case distribution within this Wasserstein ball. The state-of-the-art methods for solving the resulting distributionally robust optimization problems rely on global optimization techniques, which quickly become computationally excruciating. In this paper we demonstrate that, under mild assumptions, the distributionally robust optimization problems over Wasserstein balls can in fact be reformulated as finite convex programs—in many interesting cases even as tractable linear programs. Leveraging recent measure concentration results, we also show that their solutions enjoy powerful finite-sample performance guarantees. Our theoretical results are exemplified in mean-risk portfolio optimization as well as uncertainty quantification.

913 citations
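
As a rough illustration of the worst-case reformulation described in this abstract (not the paper's general convex program), the following Python sketch uses a known special case: for a type-1 Wasserstein ball with unrestricted support and a loss that is Lipschitz in the uncertain parameter, the worst-case expected loss equals the empirical expected loss plus the radius times the Lipschitz constant. The newsvendor-style loss, radius, and data below are illustrative assumptions.

```python
import numpy as np

# Newsvendor loss l(x, xi) = max(h*(x - xi), b*(xi - x)); Lipschitz in xi with constant max(h, b).
h, b = 1.0, 3.0
lip = max(h, b)

rng = np.random.default_rng(0)
xi_train = rng.lognormal(mean=2.0, sigma=0.5, size=50)   # finite training dataset
eps = 0.5                                                # Wasserstein radius

def worst_case_cost(x):
    """Empirical cost + eps * Lipschitz constant = sup over the W1 ball (Lipschitz-loss special case)."""
    empirical = np.mean(np.maximum(h * (x - xi_train), b * (xi_train - x)))
    return empirical + eps * lip

orders = np.linspace(0.0, 30.0, 301)
x_dro = orders[np.argmin([worst_case_cost(x) for x in orders])]
print(x_dro, worst_case_cost(x_dro))
```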


Journal ArticleDOI
TL;DR: Log-binomial and robust (modified) Poisson regression models are popular approaches to estimate risk ratios for binary response variables but their performance under model misspecification is poorly understood.
Abstract: Log-binomial and robust (modified) Poisson regression models are popular approaches to estimate risk ratios for binary response variables. Previous studies have shown that comparatively they produce similar point estimates and standard errors. However, their performance under model misspecification is poorly understood. In this simulation study, the statistical performance of the two models was compared when the log link function was misspecified or the response depended on predictors through a non-linear relationship (i.e. truncated response). Point estimates from log-binomial models were biased when the link function was misspecified or when the probability distribution of the response variable was truncated at the right tail. The percentage of truncated observations was positively associated with the presence of bias, and the bias was larger if the observations came from a population with a lower response rate given that the other parameters being examined were fixed. In contrast, point estimates from the robust Poisson models were unbiased. Under model misspecification, the robust Poisson model was generally preferable because it provided unbiased estimates of risk ratios.

260 citations
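
A minimal sketch of the "modified Poisson" idea the abstract refers to: fit a Poisson GLM with a log link to a binary outcome and pair it with robust (sandwich) standard errors. The simulated data and the statsmodels call are illustrative, not taken from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
x = rng.binomial(1, 0.5, size=n)                  # a binary exposure
risk = np.exp(-1.6 + np.log(1.5) * x)             # true risk ratio of 1.5 under a log link
y = rng.binomial(1, risk)

X = sm.add_constant(x)
# The Poisson likelihood is "wrong" for binary y, so robust (HC0) covariance is used for inference.
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
print(np.exp(fit.params))        # exponentiated coefficients approximate risk ratios
print(fit.bse)                   # robust standard errors
```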


Proceedings Article
31 Mar 2018
TL;DR: In this paper, the authors propose a method to train large scale generative models using an optimal transport loss, which is based on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations.
Abstract: The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, (iii) the difficulty of robustly estimating these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing one to find a sweet spot leveraging the geometry of OT and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates. The resulting computational architecture nicely complements standard deep network generative models with a stack of extra layers implementing the loss function.

245 citations
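
As a toy illustration of the entropic smoothing step (idea (a) above), here is a minimal NumPy sketch of Sinkhorn fixed-point iterations between two discrete histograms. The paper additionally differentiates through these iterations automatically and works with samples from generative models, which this sketch does not attempt; the grid, costs, and regularization strength are arbitrary choices.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps=0.1, n_iter=500):
    """Entropy-regularized OT between histograms a and b (cost matrix C) via Sinkhorn iterations."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]        # approximate optimal transport plan
    return np.sum(P * C)

# two histograms on a 1D grid
x = np.linspace(0.0, 1.0, 50)
a = np.exp(-((x - 0.3) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.02); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2         # squared-distance ground cost
print(sinkhorn_cost(a, b, C, eps=0.01))
```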


Journal ArticleDOI
TL;DR: The usefulness and reliability of RAVE are demonstrated by applying it to model potentials of increasing complexity, including computation of the binding free energy profile for a hydrophobic ligand-substrate system in explicit water with a dissociation time of more than 3 min, using at least twenty times less computer time than umbrella sampling or metadynamics.
Abstract: Here we propose the reweighted autoencoded variational Bayes for enhanced sampling (RAVE) method, a new iterative scheme that uses the deep learning framework of variational autoencoders to enhance sampling in molecular simulations. RAVE involves iterations between molecular simulations and deep learning in order to produce an increasingly accurate probability distribution along a low-dimensional latent space that captures the key features of the molecular simulation trajectory. Using the Kullback-Leibler divergence between this latent space distribution and the distribution of various trial reaction coordinates sampled from the molecular simulation, RAVE determines an optimum, yet nonetheless physically interpretable, reaction coordinate and optimum probability distribution. Both then directly serve as the biasing protocol for a new biased simulation, which is once again fed into the deep learning module with appropriate weights accounting for the bias, the procedure continuing until estimates of desirable thermodynamic observables are converged. Unlike recent methods using deep learning for enhanced sampling purposes, RAVE stands out in that (a) it naturally produces a physically interpretable reaction coordinate, (b) it is independent of existing enhanced sampling protocols to enhance the fluctuations along the latent space identified via deep learning, and (c) it provides the ability to easily filter out spurious solutions learned by the deep learning procedure. The usefulness and reliability of RAVE are demonstrated by applying it to model potentials of increasing complexity, including computation of the binding free energy profile for a hydrophobic ligand-substrate system in explicit water with a dissociation time of more than 3 min, in computer time at least twenty times less than that needed for umbrella sampling or metadynamics.

225 citations


Journal ArticleDOI
TL;DR: SMC provides a more widely applicable and scalable alternative to numerical and symbolic analysis of properties of stochastic systems; this survey of SMC algorithms, techniques, and tools emphasizes current limitations and tradeoffs between precision and scalability.
Abstract: Interactive, distributed, and embedded systems often behave stochastically, for example, when inputs, message delays, or failures conform to a probability distribution. However, reasoning analytically about the behavior of complex stochastic systems is generally infeasible. While simulations of systems are commonly used in engineering practice, they have not traditionally been used to reason about formal specifications. Statistical model checking (SMC) addresses this weakness by using a simulation-based approach to reason about precise properties specified in a stochastic temporal logic. For example, a specification for a communication system may state that, within some time bound, the probability that the number of messages in a queue will be greater than 5 must be less than 0.01. Using SMC, executions of a stochastic system are first sampled, after which statistical techniques are applied to determine whether such a property holds. While the output of sample-based methods is not always correct, statistical inference can quantify the confidence in the result produced. In effect, SMC provides a more widely applicable and scalable alternative to analysis of properties of stochastic systems using numerical and symbolic methods. SMC techniques have been successfully applied to analyze systems with large state spaces in areas such as computer networking, security, and systems biology. In this article, we survey SMC algorithms, techniques, and tools, while emphasizing current limitations and tradeoffs between precision and scalability.

211 citations
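
The core loop of statistical model checking can be sketched as follows: sample executions, check the property on each, and use a concentration bound to choose the number of samples for a desired confidence. This toy sketch uses a Hoeffding-style sample size and a trivial stand-in "simulation"; real SMC tools connect this loop to a stochastic system model and a temporal-logic monitor.

```python
import math
import random

def smc_probability(simulate_and_check, half_width=0.01, delta=1e-3):
    """Estimate p = P(property holds) to within +/- half_width with confidence 1 - delta,
    using Hoeffding's inequality to choose the number of simulation runs."""
    n = math.ceil(math.log(2.0 / delta) / (2.0 * half_width ** 2))
    successes = sum(1 for _ in range(n) if simulate_and_check())
    return successes / n, n

# stand-in for "simulate the queue and check that it never exceeds 5 messages within the time bound"
def toy_run():
    return random.random() > 0.008     # property holds in roughly 99.2% of runs

p_hat, runs = smc_probability(toy_run)
print(f"estimated probability {p_hat:.4f} from {runs} runs")
```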


Journal ArticleDOI
TL;DR: This work presents a quantum algorithm for the Monte Carlo pricing of financial derivatives and shows how the amplitude estimation algorithm can be applied to achieve a quadratic quantum speedup in the number of steps required to obtain an estimate for the price with high confidence.
Abstract: This work presents a quantum algorithm for the Monte Carlo pricing of financial derivatives. We show how the relevant probability distributions can be prepared in quantum superposition, the payoff functions can be implemented via quantum circuits, and the price of financial derivatives can be extracted via quantum measurements. We show how the amplitude estimation algorithm can be applied to achieve a quadratic quantum speedup in the number of steps required to obtain an estimate for the price with high confidence. This work provides a starting point for further research at the interface of quantum computing and finance.

208 citations
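
For context on the claimed quadratic speedup, here is the classical Monte Carlo baseline the quantum approach is compared against: pricing a European call under geometric Brownian motion, where the estimation error shrinks like 1/sqrt(n) in the number of samples, versus roughly 1/n for amplitude estimation. The parameters are illustrative; this is not the quantum algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)
S0, K, r, sigma, T = 100.0, 105.0, 0.01, 0.2, 1.0
n = 10**6                      # classical MC error ~ 1/sqrt(n); amplitude estimation ~ 1/n oracle calls

Z = rng.standard_normal(n)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)   # terminal prices under GBM
payoff = np.maximum(ST - K, 0.0)
price = np.exp(-r * T) * payoff.mean()
stderr = np.exp(-r * T) * payoff.std(ddof=1) / np.sqrt(n)
print(f"call price ~ {price:.3f} +/- {stderr:.3f}")
```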


Proceedings Article
01 Jan 2018
TL;DR: This work shows that the nonlinearity of the generator implies that the latent space gives a distorted view of the input space, that this distortion can be characterized by a stochastic Riemannian metric, and that distances and interpolants are significantly improved under this metric.
Abstract: Deep generative models provide a systematic way to learn nonlinear data distributions, through a set of latent variables and a nonlinear "generator" function that maps latent points into the input space. The nonlinearity of the generator implies that the latent space gives a distorted view of the input space. Under mild conditions, we show that this distortion can be characterized by a stochastic Riemannian metric, and demonstrate that distances and interpolants are significantly improved under this metric. This in turn improves probability distributions, sampling algorithms and clustering in the latent space. Our geometric analysis further reveals that current generators provide poor variance estimates and we propose a new generator architecture with vastly improved variance estimates. Results are demonstrated on convolutional and fully connected variational autoencoders, but the formalism easily generalizes to other deep generative models.

200 citations


Journal ArticleDOI
TL;DR: Wasserstein distances as discussed by the authors measure the minimal effort required to reconfigure the probability mass of one distribution in order to recover the other distribution, and have a long history that has seen them catalyse core developments in analysis, optimization, and probability.
Abstract: Wasserstein distances are metrics on probability distributions inspired by the problem of optimal mass transportation. Roughly speaking, they measure the minimal effort required to reconfigure the probability mass of one distribution in order to recover the other distribution. They are ubiquitous in mathematics, with a long history that has seen them catalyse core developments in analysis, optimization, and probability. Beyond their intrinsic mathematical richness, they possess attractive features that make them a versatile tool for the statistician: they can be used to derive weak convergence and convergence of moments, and can be easily bounded; they are well-adapted to quantify a natural notion of perturbation of a probability distribution; and they seamlessly incorporate the geometry of the domain of the distributions in question, thus being useful for contrasting complex objects. Consequently, they frequently appear in the development of statistical theory and inferential methodology, and have recently become an object of inference in themselves. In this review, we provide a snapshot of the main concepts involved in Wasserstein distances and optimal transportation, and a succinct overview of some of their many statistical aspects.

186 citations
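
A small illustration of the "minimal effort to reconfigure probability mass" interpretation: for one-dimensional samples the 1-Wasserstein distance has a closed form based on quantile functions, exposed in SciPy as scipy.stats.wasserstein_distance. The two sample sets below are arbitrary examples.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=5000)
y = rng.normal(loc=0.5, scale=1.0, size=5000)     # same shape, shifted by 0.5

# For a pure location shift, W1 is approximately the size of the shift,
# directly reflecting the geometry of the domain in a way that e.g. a KL divergence does not.
print(wasserstein_distance(x, y))                  # ~0.5
```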


Journal ArticleDOI
TL;DR: In this paper, the size distribution of a time-evolving operator in the SYK model is discussed and the authors evaluate the distribution numerically for N = 30, and show how to compute it in the large-N theory using the dressed fermion propagator.
Abstract: We discuss the probability distribution for the “size” of a time-evolving operator in the SYK model. Scrambling is related to the fact that as time passes, the distribution shifts towards larger operators. Initially, the rate is exponential and determined by the infinite-temperature chaos exponent. We evaluate the size distribution numerically for N = 30, and show how to compute it in the large-N theory using the dressed fermion propagator. We then evaluate the distribution explicitly at leading nontrivial order in the large-q expansion.

169 citations


Journal ArticleDOI
TL;DR: In this paper, a distributionally robust chance constrained approximate ac-OPF is proposed to manage variable renewable energy (VRE) uncertainties; the ambiguity set is constructed from historical data without any presumption on the type of the probability distribution, and more data leads to a smaller ambiguity set and a less conservative strategy.
Abstract: Chance constrained optimal power flow (OPF) has been recognized as a promising framework to manage the risk from variable renewable energy (VRE). In the presence of VRE uncertainties, this paper discusses a distributionally robust chance constrained approximate ac-OPF. The power flow model employed in the proposed OPF formulation combines an exact ac power flow model at the nominal operation point and an approximate linear power flow model to reflect the system response under uncertainties. The ambiguity set employed in the distributionally robust formulation is the Wasserstein ball centered at the empirical distribution. The proposed OPF model minimizes the expectation of the quadratic cost function w.r.t. the worst-case probability distribution and guarantees the chance constraints satisfied for any distribution in the ambiguity set. The whole method is data-driven in the sense that the ambiguity set is constructed from historical data without any presumption on the type of the probability distribution, and more data leads to smaller ambiguity set and less conservative strategy. Moreover, special problem structures of the proposed problem formulation are exploited to develop an efficient and scalable solution approach. Case studies are carried out on the IEEE 14 and 118 bus systems to show the accuracy and necessity of the approximate ac model and the attractive features of the distributionally robust optimization approach compared with other methods to deal with uncertainties.

156 citations


Journal ArticleDOI
01 Jan 2018
TL;DR: Kernel density estimation is a technique for estimating a probability density function that enables the user to analyse the studied probability distribution better than with a traditional histogram.
Abstract: Kernel density estimation is a technique for estimating a probability density function that enables the user to analyse the studied probability distribution better than with a traditional histogram. Unlike the histogram, the kernel technique produces a smooth estimate of the pdf, uses the locations of all sample points and suggests multimodality more convincingly. In two-dimensional applications, kernel estimation has a further advantage, since a 2D histogram additionally requires defining the orientation of the 2D bins. Two concepts play a fundamental role in kernel estimation: the kernel function shape and the coefficient of smoothness, of which the latter is crucial to the method. Several real-life examples, both for univariate and bivariate applications, are shown.
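
A minimal sketch of the two ingredients highlighted above, the kernel shape and the coefficient of smoothness (bandwidth), using scipy.stats.gaussian_kde on an arbitrary bimodal sample; the data and the bandwidth rule are illustrative choices.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(1.0, 1.0, 700)])  # bimodal data

kde = gaussian_kde(sample, bw_method="silverman")   # Gaussian kernel; Silverman's rule sets the smoothness
grid = np.linspace(-5, 5, 400)
density = kde(grid)                                 # smooth pdf estimate; both modes are visible
print(grid[np.argmax(density)])                     # location of the highest mode
```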

Proceedings ArticleDOI
Fan Bai, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou
18 Jun 2018
TL;DR: In this paper, the authors propose a novel method called edit probability (EP) for scene text recognition, which estimates the probability of generating a string from the output sequence of probability distributions conditioned on the input image, while considering the possible occurrences of missing/superfluous characters.
Abstract: We consider the scene text recognition problem under the attention-based encoder-decoder framework, which is the state of the art. The existing methods usually employ a frame-wise maximal likelihood loss to optimize the models. When we train the model, the misalignment between the ground truth strings and the attention's output sequences of probability distribution, which is caused by missing or superfluous characters, will confuse and mislead the training process, and consequently make the training costly and degrade the recognition accuracy. To handle this problem, we propose a novel method called edit probability (EP) for scene text recognition. EP tries to effectively estimate the probability of generating a string from the output sequence of probability distribution conditioned on the input image, while considering the possible occurrences of missing/superfluous characters. The advantage lies in that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome. We conduct extensive experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets. Experimental results show that the EP can substantially boost scene text recognition performance.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a data driven distributionally robust chance constrained optimal power flow model (DRCC-OPF), which ensures that the worst-case probability of violating both the upper and lower limit of a line/bus capacity under a wide family of distributions is small.
Abstract: The uncertainty associated with renewable energy sources introduces significant challenges in optimal power flow (OPF) analysis. A variety of new approaches have been proposed that use chance constraints to limit line or bus overload risk in OPF models. Most existing formulations assume that the probability distributions associated with the uncertainty are known a priori or can be estimated accurately from empirical data, and/or use separate chance constraints for upper and lower line/bus limits. In this paper, we propose a data driven distributionally robust chance constrained optimal power flow model (DRCC-OPF), which ensures that the worst-case probability of violating both the upper and lower limit of a line/bus capacity under a wide family of distributions is small. Assuming that we can estimate the first and second moments of the underlying distributions based on empirical data, we propose an exact reformulation of DRCC-OPF as a tractable convex program. The key theoretical result behind this reformulation is a second-order cone programming (SOCP) reformulation of a general two-sided distributionally robust chance constrained set by lifting the set to a higher dimensional space. Our numerical study shows that the proposed SOCP formulation can be solved efficiently and that the results of our model are quite robust.

Journal ArticleDOI
TL;DR: This paper proposes a stochastic model predictive control approach to optimize the fuel consumption in a vehicle following context using a conditional linear Gauss model to estimate the probability distribution of the future velocity of the preceding vehicle.
Abstract: This paper proposes a stochastic model predictive control (MPC) approach to optimize the fuel consumption in a vehicle following context. The practical solution of that problem requires solving a constrained moving horizon optimal control problem using a short-term prediction of the preceding vehicle’s velocity. In a deterministic framework, the prediction errors lead to constraint violations and to harsh control reactions. Instead, the suggested method considers errors, and limits the probability of a constraint violation. A conditional linear Gauss model is developed and trained with real measurements to estimate the probability distribution of the future velocity of the preceding vehicle. The prediction model is used to evaluate two different stochastic MPC approaches. On the one hand, an MPC with individual chance constraints is applied. On the other hand, samples are drawn from the conditional Gaussian model and used for a scenario-based optimization approach. Finally, both developed control strategies are evaluated and compared against a standard deterministic MPC. The evaluation of the controllers shows a significant reduction of the fuel consumption compared with standard adaptive cruise control algorithms.

Journal ArticleDOI
TL;DR: This paper proposes a distributionally robust optimization approach for the contingency-constrained unit commitment problem, and derives an equivalent reformulation and study a Benders’ decomposition algorithm for solving the model.
Abstract: This paper proposes a distributionally robust optimization approach for the contingency-constrained unit commitment problem. In our approach, we consider a case where the true probability distribution of contingencies is ambiguous, i.e., difficult to accurately estimate. Instead of assigning a (fixed) probability estimate for each contingency scenario, we consider a set of contingency probability distributions (termed the ambiguity set) based on the $N-k$ security criterion and moment information. Our approach considers all possible distributions in the ambiguity set, and is hence distributionally robust. Meanwhile, as this approach utilizes moment information, it can benefit from available data and become less conservative than the robust optimization approaches. We derive an equivalent reformulation and study a Benders’ decomposition algorithm for solving the model. Furthermore, we extend the model to incorporate wind power uncertainty. The case studies on a 6-Bus system and the IEEE 118-Bus system demonstrate that the proposed approach provides less conservative unit commitment decisions as compared with the robust optimization approach.

Journal ArticleDOI
TL;DR: This letter proposes a simple yet effective approach to assess the Gaussian phasor measurement unit measurement error assumption by using the stability property of a probability distribution and the concept of redundant measurement.
Abstract: Gaussian phasor measurement unit (PMU) measurement error has been assumed for many power system applications, such as state estimation, oscillatory modes monitoring, voltage stability analysis, to cite a few. This letter proposes a simple yet effective approach to assess this assumption by using the stability property of a probability distribution and the concept of redundant measurement. Extensive results using field PMU data from WECC system reveal that the Gaussian assumption is questionable.

Journal ArticleDOI
TL;DR: Modeling the probability distribution of complex data using insights from quantum physics is a fresh approach to generative modeling in machine learning and shows great potential compared to conventional neural network approaches.
Abstract: Modeling the probability distribution of complex data using insights from quantum physics is a fresh approach to generative modeling in machine learning, and shows great potential compared to conventional neural network approaches.

Posted Content
TL;DR: In this paper, a new saliency map model is proposed which formulates a map as a generalized Bernoulli distribution and then trains a deep architecture to predict such maps using novel loss functions which pair the softmax activation function with measures designed to compute distances between probability distributions.
Abstract: Most saliency estimation methods aim to explicitly model low-level conspicuity cues such as edges or blobs and may additionally incorporate top-down cues using face or text detection. Data-driven methods for training saliency models using eye-fixation data are increasingly popular, particularly with the introduction of large-scale datasets and deep architectures. However, current methods in this latter paradigm use loss functions designed for classification or regression tasks whereas saliency estimation is evaluated on topographical maps. In this work, we introduce a new saliency map model which formulates a map as a generalized Bernoulli distribution. We then train a deep architecture to predict such maps using novel loss functions which pair the softmax activation function with measures designed to compute distances between probability distributions. We show in extensive experiments the effectiveness of such loss functions over standard ones on four public benchmark datasets, and demonstrate improved performance over state-of-the-art saliency methods.
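
A rough sketch of the loss construction described above: treat the predicted saliency map as a probability distribution over pixels via a softmax, normalize the ground-truth fixation map, and compare the two with a distance between distributions (KL divergence here). The array shapes and the specific distance are illustrative choices, not the paper's exact losses.

```python
import numpy as np

def saliency_kl_loss(pred_logits, fixation_map, eps=1e-8):
    """KL(ground truth || prediction) between two per-pixel probability distributions."""
    z = pred_logits.reshape(-1)
    p = np.exp(z - z.max())
    p /= p.sum()                                    # softmax over all pixels
    q = fixation_map.reshape(-1).astype(float)
    q = q / (q.sum() + eps)                         # normalize fixations to a distribution
    return float(np.sum(q * (np.log(q + eps) - np.log(p + eps))))

# toy 8x8 example
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 8))
fixations = np.zeros((8, 8)); fixations[2, 3] = fixations[5, 5] = 1.0
print(saliency_kl_loss(logits, fixations))
```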

Proceedings ArticleDOI
20 Jun 2018
TL;DR: In this paper, the authors use the Sum of Squares method to develop efficient algorithms for learning well-separated mixtures of Gaussians and for robust mean estimation in high dimensions, including the first polynomial-time algorithm whose guarantees approach the information-theoretic limit for non-Gaussian distributions.
Abstract: We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Our contributions are: Mixture models with separated means: We study mixtures of poly(k)-many k-dimensional distributions where the means of every pair of distributions are separated by at least k^ε. In the special case of spherical Gaussian mixtures, we give a k^{O(1/ε)}-time algorithm that learns the means assuming separation at least k^ε, for any ε > 0. This is the first algorithm to improve on greedy (“single-linkage”) and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k^{1/4}. Robust estimation: When an unknown (1−ε)-fraction of X_1,…,X_n are chosen from a sub-Gaussian distribution with mean µ but the remaining points are chosen adversarially, we give an algorithm recovering µ to error ε^{1−1/t} in time k^{O(t)}, so long as sub-Gaussian-ness up to O(t) moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than ε^{1/2}. As a corollary, we achieve similar results for robust covariance estimation. Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given X_1,…,X_n ∈ ℝ^k for large k and n = poly(k) with the promise that a subset of X_1,…,X_n were sampled from a probability distribution with bounded moments, recover some information about that distribution.

Journal ArticleDOI
TL;DR: In this article, the exact probability distribution of a run-and-tumble particle with and without diffusion on the infinite line, as well as in a finite interval, was investigated.
Abstract: We investigate the motion of a run-and-tumble particle (RTP) in one dimension. We find the exact probability distribution of the particle with and without diffusion on the infinite line, as well as in a finite interval. In the infinite domain, this probability distribution approaches a Gaussian form in the long-time limit, as in the case of a regular Brownian particle. At intermediate times, this distribution exhibits unexpected multi-modal forms. In a finite domain, the probability distribution reaches a steady state form with peaks at the boundaries, in contrast to a Brownian particle. We also study the relaxation to the steady state analytically. Finally we compute the survival probability of the RTP in a semi-infinite domain. In the finite interval, we compute the exit probability and the associated exit times. We provide numerical verifications of our analytical results.
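
A small simulation sketch of the run-and-tumble dynamics studied above (velocity ±v with tumbles at rate γ, plus optional translational diffusion D); histogramming the positions at intermediate and long times qualitatively reproduces the multi-modal shapes and the Gaussian long-time limit. All parameters are arbitrary choices.

```python
import numpy as np

def rtp_positions(n=50_000, v=1.0, gamma=1.0, D=0.0, t=2.0, dt=5e-3, seed=0):
    """Positions of n independent 1D run-and-tumble particles at time t."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    sigma = rng.choice([-1.0, 1.0], size=n)          # current run direction
    for _ in range(int(t / dt)):
        x += v * sigma * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal(n)
        flips = rng.random(n) < gamma * dt           # tumble with probability gamma*dt
        sigma[flips] *= -1.0
    return x

x_short = rtp_positions(t=0.5)    # strongly non-Gaussian, with weight piling up near +/- v*t
x_long = rtp_positions(t=20.0)    # close to Gaussian with variance ~ v**2 * t / gamma (for D = 0)
print(x_short.std(), x_long.std())
```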

Journal ArticleDOI
TL;DR: A GAN-based workflow of training and image generation is applied to an oolitic Ketton limestone micro-CT unsegmented gray-level dataset, and results show that GANs allow a fast and accurate reconstruction of the evaluated image dataset.
Abstract: Stochastic image reconstruction is a key part of modern digital rock physics and material analysis that aims to create representative samples of microstructures for upsampling, upscaling and uncertainty quantification. We present new results of a method of three-dimensional stochastic image reconstruction based on generative adversarial neural networks (GANs). GANs are a family of unsupervised learning methods that require no a priori inference of the probability distribution associated with the training data. Thanks to the use of two convolutional neural networks, the discriminator and the generator, in the training phase, and only the generator in the simulation phase, GANs allow the sampling of large and realistic volumetric images. We apply a GAN-based workflow of training and image generation to an oolitic Ketton limestone micro-CT unsegmented gray-level dataset. Minkowski functionals calculated as a function of the segmentation threshold are compared between simulated and acquired images. Flow simulations are run on the segmented images, and effective permeability and velocity distributions of simulated flow are also compared. Results show that GANs allow a fast and accurate reconstruction of the evaluated image dataset. We discuss the performance of GANs in relation to other simulation techniques and stress the benefits resulting from the use of convolutional neural networks. We address a number of challenges involved in GANs, in particular the representation of the probability distribution associated with the training data.

Journal ArticleDOI
TL;DR: In this paper, a new framework for efficient sampling from complex probability distributions, using a combination of transport maps and the Metropolis-Hastings rule, is introduced, and the core idea is to use determin...
Abstract: We introduce a new framework for efficient sampling from complex probability distributions, using a combination of transport maps and the Metropolis--Hastings rule. The core idea is to use determin...

Journal ArticleDOI
TL;DR: This paper shows complexity theoretic evidence of hardness that is on par with the strongest theoretical proposals for supremacy, and shows that RCS satisfies an average-case hardness condition - computing output probabilities of typical quantum circuits is as hard as computing them in the worst-case, and therefore #P-hard.
Abstract: A critical milestone on the path to useful quantum computers is quantum supremacy - a demonstration of a quantum computation that is prohibitively hard for classical computers. A leading near-term candidate, put forth by the Google/UCSB team, is sampling from the probability distributions of randomly chosen quantum circuits, which we call Random Circuit Sampling (RCS). In this paper we study both the hardness and verification of RCS. While RCS was defined with experimental realization in mind, we show complexity theoretic evidence of hardness that is on par with the strongest theoretical proposals for supremacy. Specifically, we show that RCS satisfies an average-case hardness condition - computing output probabilities of typical quantum circuits is as hard as computing them in the worst-case, and therefore #P-hard. Our reduction exploits the polynomial structure in the output amplitudes of random quantum circuits, enabled by the Feynman path integral. In addition, it follows from known results that RCS satisfies an anti-concentration property, making it the first supremacy proposal with both average-case hardness and anti-concentration.

Proceedings ArticleDOI
27 Jan 2018
TL;DR: A generative framework for zero-shot action recognition where some of the possible action classes do not occur in the training data, based on modeling each action class using a probability distribution whose parameters are functions of the attribute vector representing that action class.
Abstract: We present a generative framework for zero-shot action recognition where some of the possible action classes do not occur in the training data. Our approach is based on modeling each action class using a probability distribution whose parameters are functions of the attribute vector representing that action class. In particular, we assume that the distribution parameters for any action class in the visual space can be expressed as a linear combination of a set of basis vectors where the combination weights are given by the attributes of the action class. These basis vectors can be learned solely using labeled data from the known (i.e., previously seen) action classes, and can then be used to predict the parameters of the probability distributions of unseen action classes. We consider two settings: (1) Inductive setting, where we use only the labeled examples of the seen action classes to predict the unseen action class parameters; and (2) Transductive setting which further leverages unlabeled data from the unseen action classes. Our framework also naturally extends to few-shot action recognition where a few labelled examples from unseen classes are available. Our experiments on benchmark datasets (UCF101, HMDB51 and Olympic) show significant performance improvements as compared to various baselines, in both standard zero-shot (disjoint seen and unseen classes) and generalized zero-shot learning settings.
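
The core modeling assumption above, that an unseen class's distribution parameters are attribute-weighted combinations of learned basis vectors, can be sketched in a few lines; the dimensions, the least-squares fit for the basis, and the Gaussian class model are illustrative simplifications rather than the paper's exact training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d_feat, n_attr, n_seen = 128, 16, 40

A_seen = rng.random((n_seen, n_attr))            # attribute vectors of the seen action classes
M_seen = rng.normal(size=(n_seen, d_feat))       # their estimated class means in visual-feature space

# Learn basis vectors B (n_attr x d_feat) so that M_seen ~ A_seen @ B  (least squares).
B, *_ = np.linalg.lstsq(A_seen, M_seen, rcond=None)

a_unseen = rng.random(n_attr)                    # attribute vector of an unseen action class
mu_unseen = a_unseen @ B                         # predicted mean of its (e.g. Gaussian) class model
print(mu_unseen.shape)                           # (128,)
```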

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of distributed hypothesis testing over a network and characterize the exponential rate of learning in terms of the nodes' influence of the network and the divergences between the observations' distributions.
Abstract: This paper considers a problem of distributed hypothesis testing over a network. Individual nodes in a network receive noisy local (private) observations whose distribution is parameterized by a discrete parameter (hypothesis). The marginals of the joint observation distribution conditioned on each hypothesis are known locally at the nodes, but the true parameter/hypothesis is not known. An update rule is analyzed in which nodes first perform a Bayesian update of their belief (distribution estimate) of each hypothesis based on their local observations, communicate these updates to their neighbors, and then perform a “non-Bayesian” linear consensus using the log-beliefs of their neighbors. Under mild assumptions, we show that the belief of any node on a wrong hypothesis converges to zero exponentially fast. We characterize the exponential rate of learning, which we call the network divergence, in terms of the nodes’ influence of the network and the divergences between the observations’ distributions. For a broad class of observation statistics which includes distributions with unbounded support such as Gaussian mixtures, we show that rate of rejection of wrong hypothesis satisfies a large deviation principle, i.e., the probability of sample paths on which the rate of rejection of wrong hypothesis deviates from the mean rate vanishes exponentially fast and we characterize the rate function in terms of the nodes’ influence of the network and the local observation models.
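
A minimal sketch of the update rule analyzed above: each node performs a local Bayesian update of its belief vector, then averages log-beliefs with its neighbors using a row-stochastic weight matrix. The network, observation model, and weights below are toy placeholders.

```python
import numpy as np

def nonbayesian_round(beliefs, W, log_lik):
    """One round of the 'non-Bayesian' social learning update.
    beliefs: (n_nodes, n_hyp) rows summing to 1; W: (n_nodes, n_nodes) row-stochastic weights;
    log_lik: (n_nodes, n_hyp) log-likelihood of each node's private observation under each hypothesis."""
    logb = np.log(beliefs) + log_lik                 # local Bayesian update (unnormalized, in logs)
    logb = W @ logb                                  # linear consensus on log-beliefs
    logb -= logb.max(axis=1, keepdims=True)
    b = np.exp(logb)
    return b / b.sum(axis=1, keepdims=True)

# toy example: 3 nodes, 2 hypotheses, private observations ~ N(theta, 1) with theta in {0, 1}
rng = np.random.default_rng(0)
W = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
beliefs = np.full((3, 2), 0.5)
true_theta = 1.0
for _ in range(200):
    obs = rng.normal(true_theta, 1.0, size=3)
    log_lik = -0.5 * (obs[:, None] - np.array([0.0, 1.0])[None, :]) ** 2
    beliefs = nonbayesian_round(beliefs, W, log_lik)
print(beliefs)   # belief on the wrong hypothesis decays exponentially fast
```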

Journal ArticleDOI
TL;DR: In this article, the authors present a review of the state-of-the-art in probability-interval hybrid uncertainty analysis and provide an outlook for future research in this area.
Abstract: Traditional structural uncertainty analysis is mainly based on probability models and requires the establishment of accurate parametric probability distribution functions using large numbers of experimental samples. In many actual engineering problems, the probability distributions of some parameters can be established when sufficient samples are available, whereas for some parameters, due to the lack or poor quality of samples, only their variation intervals can be obtained, or their probability distribution types can be determined based on the existing data while some of the distribution parameters, such as the mean and standard deviation, can only be given interval estimates. This constitutes an important type of probability-interval hybrid uncertain problem, in which both aleatory and epistemic uncertainties exist. Probability-interval hybrid uncertainty analysis provides an important means for reliability analysis and design of many complex structures, and has become one of the research focuses in the field of structural uncertainty analysis over the past decades. This paper reviews the four main research directions in this area, i.e., uncertainty modeling, uncertainty propagation analysis, structural reliability analysis, and reliability-based design optimization. It summarizes the main scientific problems, technical difficulties, and current research status of each direction. Based on the review, this paper also provides an outlook for future research in probability-interval hybrid uncertainty analysis.

Journal ArticleDOI
TL;DR: In this paper, probability multi-valued neutrosophic sets (PMVNSs), based on multi-valued neutrosophic sets and probability distributions, are introduced to depict uncertain, incomplete, inconsistent and hesitant decision-making information and to reflect the distribution characteristics of all provided evaluation values.
Abstract: This paper introduces probability multi-valued neutrosophic sets (PMVNSs) based on multi-valued neutrosophic sets and probability distribution. PMVNS can serve as a reliable tool to depict uncertain, incomplete, inconsistent and hesitant decision-making information and reflect the distribution characteristics of all provided evaluation values. This paper focuses on developing an innovative method to address multi-criteria group decision-making (MCGDM) problems in which the weight information is completely unknown and the evaluation values taking the form of probability multi-valued neutrosophic numbers (PMVNNs). First, the definition of PMVNSs is described. Second, an extended convex combination operation of PMVNNs is defined, and the probability multi-valued neutrosophic number weighted average operator is proposed. Moreover, two cross-entropy measures for PMVNNs are presented, and a novel qualitative flexible multiple criteria method (QUALIFLEX) is developed. Subsequently, an innovative MCGDM approach is established by incorporating the proposed aggregation operator and the developed QUALIFLEX method. Finally, an illustrative example concerning logistics outsourcing is provided to demonstrate the proposed method, and its feasibility and validity are further verified by comparison with other existing methods.


Posted Content
Weijun Xie
TL;DR: It is shown that a DRCCP can be reformulated as a conditional value-at-risk constrained optimization problem, and thus admits tight inner and outer approximations and a big-M free formulation.
Abstract: This paper studies a distributionally robust chance constrained program (DRCCP) with Wasserstein ambiguity set, where the uncertain constraints should be satisfied with a probability at least a given threshold for all the probability distributions of the uncertain parameters within a chosen Wasserstein distance from an empirical distribution. In this work, we investigate equivalent reformulations and approximations of such problems. We first show that a DRCCP can be reformulated as a conditional value-at-risk constrained optimization problem, and thus admits tight inner and outer approximations. We also show that a DRCCP of bounded feasible region is mixed integer representable by introducing big-M coefficients and additional binary variables. For a DRCCP with pure binary decision variables, by exploring the submodular structure, we show that it admits a big-M free formulation, which can be solved by a branch and cut algorithm. Finally, we present a numerical study to illustrate the effectiveness of the proposed formulations.

Journal ArticleDOI
31 Jan 2018-Climate
TL;DR: In this article, the best-fit probability distributions for maximum monthly rainfall are determined using 30 years of data (1984-2013) from 35 locations in Bangladesh, applying different statistical analyses and distribution types.
Abstract: The study of frequency analysis is important to find the most suitable model that could anticipate extreme events of certain natural phenomena e.g., rainfall, floods, etc. The goal of this study is to determine the best-fit probability distributions in the case of maximum monthly rainfall using 30 years of data (1984–2013) from 35 locations in Bangladesh by using different statistical analysis and distribution types. Commonly used frequency distributions were applied. Parameters of these distributions were estimated by the method of moments and L-moments estimators. Three goodness-of-fit test statistics were applied. The best-fit result of each station was taken as the distribution with the lowest sum of the rank scores from each of the three test statistics. Generalized Extreme Value, Pearson type 3 and Log-Pearson type 3 distributions showed the largest number of best-fit results. Among the best score results, Generalized Extreme Value yielded the best-fit for 36% of the stations and Pearson type 3 and Log-Pearson type 3 each yielded the best-fit for 26% of the stations. The more practical result of this paper was that the 10-year, 25-year, 50-year and 100-year return periods of maximum monthly rainfall were calculated for all locations. The result of this study can be used to develop more accurate models of flooding risk and damage.