
Showing papers on "Probability distribution published in 2013"


Posted Content
TL;DR: This paper abandons the normality assumption and instead uses statistical methods for nonparametric kernel density estimation; the experiments suggest that kernel estimation is a useful tool for learning Bayesian models.
Abstract: When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models.
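As a rough illustration of the comparison made in this paper (a sketch with synthetic data, not the authors' code or datasets), the snippet below fits one bimodal class-conditional feature two ways: with a single Gaussian and with a Gaussian kernel density estimate (scipy's default Scott's-rule bandwidth), and compares held-out log-likelihoods, the quantity a naive Bayes classifier multiplies across features.

```python
# Sketch (not the paper's code): compare a single-Gaussian fit with a kernel
# density estimate on a bimodal class-conditional, as used inside naive Bayes.
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(0)
# Bimodal "class-conditional" feature: a single Gaussian is a poor fit.
train = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])
test  = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])

# (a) parametric: single Gaussian with fitted mean and standard deviation
mu, sigma = train.mean(), train.std(ddof=1)
loglik_gauss = norm.logpdf(test, mu, sigma).sum()

# (b) nonparametric: Gaussian kernel density estimate (Scott's rule bandwidth)
kde = gaussian_kde(train)
loglik_kde = np.log(kde(test)).sum()

print(f"held-out log-likelihood  Gaussian: {loglik_gauss:.1f}  KDE: {loglik_kde:.1f}")
# The KDE assigns a much higher likelihood to the held-out data, which is the
# effect the paper exploits inside a naive Bayesian classifier.
```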

3,071 citations


Book
09 Jun 2013
TL;DR: This book discusses the application of the binomial distribution, network modelling and evaluation of simple systems, and system reliability evaluation using probability distributions.
Abstract: Introduction. Basic Probability Theory. Application of the Binomial Distribution. Network Modelling and Evaluation of Simple Systems. Network Modelling and Evaluation of Complex Systems. Probability Distributions in Reliability Evaluation. System Reliability Evaluation Using Probability Distributions. Monte Carlo Simulation. Epilogue.

1,062 citations


Journal ArticleDOI
TL;DR: The authors' Wigner-like surmises are shown to be very accurate when compared to numerics and exact calculations in the large matrix size limit, and quantitative improvements are found through a polynomial expansion.
Abstract: We derive expressions for the probability distribution of the ratio of two consecutive level spacings for the classical ensembles of random matrices. This ratio distribution was recently introduced to study spectral properties of many-body problems, as, contrary to the standard level spacing distributions, it does not depend on the local density of states. Our Wigner-like surmises are shown to be very accurate when compared to numerics and exact calculations in the large matrix size limit. Quantitative improvements are found through a polynomial expansion. Examples from a quantum many-body lattice model and from zeros of the Riemann zeta function are presented.
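As a quick numerical illustration of the ratio statistic (a sketch, not the authors' code), the snippet below draws a GOE matrix, forms the ratios r̃_n = min(s_n, s_{n+1}) / max(s_n, s_{n+1}) of consecutive level spacings, and compares their mean to the value ≈ 0.53 characteristic of GOE and to 2 ln 2 − 1 ≈ 0.386 for uncorrelated (Poisson) levels. No unfolding of the spectrum is needed, which is exactly the advantage of this statistic noted in the abstract.

```python
# Numerical sketch (not from the paper): consecutive level-spacing ratios
# for a GOE random matrix vs. uncorrelated (Poisson) levels.
import numpy as np

rng = np.random.default_rng(1)
N = 1000

# GOE matrix: symmetrize a Gaussian matrix, keep the bulk of the spectrum.
A = rng.normal(size=(N, N))
H = (A + A.T) / 2.0
levels = np.sort(np.linalg.eigvalsh(H))[N // 4 : 3 * N // 4]

def mean_ratio(e):
    s = np.diff(e)                                        # consecutive spacings
    r = np.minimum(s[1:], s[:-1]) / np.maximum(s[1:], s[:-1])
    return r.mean()

poisson_levels = np.sort(rng.uniform(0, 1, N))            # uncorrelated levels
print(f"GOE      <r~> = {mean_ratio(levels):.3f}   (approx. 0.53 expected)")
print(f"Poisson  <r~> = {mean_ratio(poisson_levels):.3f}   (2 ln 2 - 1 = 0.386)")
```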

705 citations


Journal ArticleDOI
TL;DR: In this paper, it is shown that, under two plausible conjectures, even an approximate or noisy classical simulation of the linear-optical model would already imply a collapse of the polynomial hierarchy.
Abstract: We give new evidence that quantum computers -- moreover, rudimentary quantum computers built entirely out of linear-optical elements -- cannot be efficiently simulated by classical computers. In particular, we define a model of computation in which identical photons are generated, sent through a linear-optical network, then nonadaptively measured to count the number of photons in each mode. This model is not known or believed to be universal for quantum computation, and indeed, we discuss the prospects for realizing the model using current technology. On the other hand, we prove that the model is able to solve sampling problems and search problems that are classically intractable under plausible assumptions. Our first result says that, if there exists a polynomial-time classical algorithm that samples from the same probability distribution as a linear-optical network, then P^#P=BPP^NP, and hence the polynomial hierarchy collapses to the third level. Unfortunately, this result assumes an extremely accurate simulation. Our main result suggests that even an approximate or noisy classical simulation would already imply a collapse of the polynomial hierarchy. For this, we need two unproven conjectures: the "Permanent-of-Gaussians Conjecture", which says that it is #P-hard to approximate the permanent of a matrix A of independent N(0,1) Gaussian entries, with high probability over A; and the "Permanent Anti-Concentration Conjecture", which says that |Per(A)|>=sqrt(n!)/poly(n) with high probability over A. We present evidence for these conjectures, both of which seem interesting even apart from our application. This paper does not assume knowledge of quantum optics. Indeed, part of its goal is to develop the beautiful theory of noninteracting bosons underlying our model, and its connection to the permanent function, in a self-contained way accessible to theoretical computer scientists.
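The connection to the permanent can be made concrete: for a collision-free outcome, the probability of seeing one photon in each of n chosen output modes (with one photon fed into each of n chosen input modes) is |Per(A)|², where A is the n×n submatrix of the network's m×m unitary picked out by those rows and columns. The sketch below (an illustration, not code from the paper) evaluates the permanent with Ryser's formula, essentially the best known exact method and part of the reason classical simulation is believed to be hard.

```python
# Sketch (not the paper's code): the permanent via Ryser's formula, and the
# boson-sampling probability |Per(A)|^2 for a collision-free outcome.
import itertools
import numpy as np

def permanent(A):
    """Ryser's formula (naive O(2^n * n^2) implementation) for an n x n matrix."""
    n = A.shape[0]
    total = 0.0
    for r in range(1, n + 1):
        for cols in itertools.combinations(range(n), r):
            total += (-1) ** r * np.prod(A[:, list(cols)].sum(axis=1))
    return (-1) ** n * total

def haar_unitary(m, rng):
    """Random m x m unitary via QR of a complex Gaussian matrix (phases fixed)."""
    Z = (rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

rng = np.random.default_rng(2)
m, n = 8, 3                               # m modes, n photons (toy sizes)
U = haar_unitary(m, rng)
inputs, outputs = [0, 1, 2], [4, 5, 7]    # an arbitrary collision-free pattern
A = U[np.ix_(outputs, inputs)]            # n x n submatrix of the unitary
print("outcome probability |Per(A)|^2 =", abs(permanent(A)) ** 2)
```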

619 citations


Journal ArticleDOI
TL;DR: An approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement.
Abstract: Many problems arising in applications result in the need to probe a probability distribution for functions. Examples include Bayesian nonparametric statistics and conditioned diffusion processes. Standard MCMC algorithms typically become arbitrarily slow under the mesh refinement dictated by nonparametric description of the unknown function. We describe an approach to modifying a whole range of MCMC methods, applicable whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure, which ensures that their speed of convergence is robust under mesh refinement. Gaussian processes or random fields are fields whose marginal distributions, when evaluated at any finite set of N points, are ℝ^N-valued Gaussians. The algorithmic approach that we describe is applicable not only when the desired probability measure has density with respect to a Gaussian process or Gaussian random field reference measure, but also to some useful non-Gaussian reference measures constructed through random truncation. In the applications of interest the data is often sparse and the prior specification is an essential part of the overall modelling strategy. These Gaussian-based reference measures are a very flexible modelling tool, finding wide-ranging application. Examples are shown in density estimation, data assimilation in fluid mechanics, subsurface geophysics and image registration. The key design principle is to formulate the MCMC method so that it is, in principle, applicable for functions; this may be achieved by use of proposals based on carefully chosen time-discretizations of stochastic dynamical systems which exactly preserve the Gaussian reference measure. Taking this approach leads to many new algorithms which can be implemented via minor modification of existing algorithms, yet which show enormous speed-up on a wide range of applied problems.
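The key construction is easiest to see in finite dimensions. Below is a minimal sketch (toy Gaussian reference measure and likelihood of our own choosing, not the paper's examples) of the preconditioned Crank-Nicolson (pCN) proposal, one of the best-known examples of the Gaussian-preserving discretizations described in the abstract: v = sqrt(1 − β²) u + β ξ with ξ ~ N(0, C), accepted with probability min{1, exp(Φ(u) − Φ(v))}. Because the proposal preserves N(0, C) exactly, the acceptance rule involves only the likelihood term Φ, which is what keeps the acceptance rate from collapsing as the discretization dimension d is refined.

```python
# Minimal pCN (preconditioned Crank-Nicolson) MCMC sketch in R^d, assuming a
# Gaussian reference measure N(0, C) and a user-supplied negative log-likelihood Phi.
# Toy choices: C from a squared-exponential covariance, Phi from a few noisy point data.
import numpy as np

rng = np.random.default_rng(3)
d = 50
x = np.linspace(0, 1, d)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1**2) + 1e-8 * np.eye(d)
L = np.linalg.cholesky(C)                     # used to draw xi ~ N(0, C)

obs_idx = np.array([10, 25, 40])              # a few noisy observations (toy data)
obs = np.array([0.5, -0.3, 0.8])
noise = 0.1

def Phi(u):
    """Negative log-likelihood of the observations given the function values u."""
    return 0.5 * np.sum((u[obs_idx] - obs) ** 2) / noise**2

def pcn(n_iter=20000, beta=0.2):
    u, phi_u = np.zeros(d), Phi(np.zeros(d))
    accepted = 0
    for _ in range(n_iter):
        xi = L @ rng.normal(size=d)
        v = np.sqrt(1 - beta**2) * u + beta * xi      # pCN proposal
        phi_v = Phi(v)
        if np.log(rng.uniform()) < phi_u - phi_v:     # accept prob min{1, e^(Phi(u)-Phi(v))}
            u, phi_u, accepted = v, phi_v, accepted + 1
    print(f"acceptance rate: {accepted / n_iter:.2f}")
    return u

sample = pcn()
```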

553 citations


Journal ArticleDOI
TL;DR: In this paper, a unifying framework is presented that links two classes of statistics used in two-sample and independence testing: the energy distances and distance covariances from the statistics literature, and the maximum mean discrepancies (MMD), that is, distances between embeddings of distributions into reproducing kernel Hilbert spaces.
Abstract: We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, we can interpret the MMD as energy distance with respect to some negative-type semimetric. This equivalence readily extends to distance covariance using kernels on the product space. We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.
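A quick numerical check of the stated equivalence (a sketch, not the authors' code): with the Euclidean distance as the negative-type semimetric and the induced "distance kernel" k(x, y) = ½(‖x‖ + ‖y‖ − ‖x − y‖) centered at the origin, the sample energy distance should equal twice the squared MMD computed with that kernel, when both are estimated with biased all-pairs V-statistics.

```python
# Numerical check (sketch, not the authors' code): energy distance equals
# twice the squared MMD computed with the induced "distance kernel"
#   k(x, y) = 0.5 * (||x|| + ||y|| - ||x - y||)   (Euclidean semimetric, centered at 0),
# using biased (all-pairs) V-statistic estimates on both sides.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(4)
X = rng.normal(0.0, 1.0, size=(200, 3))
Y = rng.normal(0.5, 1.5, size=(250, 3))

def energy_distance(X, Y):
    return 2 * cdist(X, Y).mean() - cdist(X, X).mean() - cdist(Y, Y).mean()

def distance_kernel(A, B):
    na = np.linalg.norm(A, axis=1)[:, None]
    nb = np.linalg.norm(B, axis=1)[None, :]
    return 0.5 * (na + nb - cdist(A, B))

def mmd2(X, Y, kernel):
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2 * kernel(X, Y).mean()

print("energy distance :", energy_distance(X, Y))
print("2 * MMD^2       :", 2 * mmd2(X, Y, distance_kernel))   # the two should agree
```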

342 citations


Journal ArticleDOI
TL;DR: The Connectivity Modeling System is described, a probabilistic, multi-scale model that provides Lagrangian descriptions of oceanic phenomena and can be used in a broad range of oceanographic applications, from the fate of pollutants to the pathways of water masses in the global ocean.
Abstract: Pelagic organisms' movement and motion of buoyant particles are driven by processes operating across multiple, spatial and temporal scales. We developed a probabilistic, multi-scale model, the Connectivity Modeling System (CMS), to gain a mechanistic understanding of dispersion and migration processes in the ocean. The model couples offline a new nested-grid technique to a stochastic Lagrangian framework where individual variability is introduced by drawing particles' attributes at random from specified probability distributions of traits. This allows 1) to track seamlessly a large number of both actively swimming and inertial particles over multiple, independent ocean model domains and 2) to generate ensemble forecasts or hindcasts of the particles' three dimensional trajectories, dispersal kernels, and transition probability matrices used for connectivity estimates. In addition, CMS provides Lagrangian descriptions of oceanic phenomena (advection, dispersion, retention) and can be used in a broad range of oceanographic applications, from the fate of pollutants to the pathways of water masses in the global ocean. Here we describe the CMS modular system where particle behavior can be augmented with specific features, and a parallel module implementation simplifies data management and CPU intensive computations associated with solving for the tracking of millions of active particles. Some novel features include on-the-fly data access of operational hydrodynamic models, individual particle variability and inertial motion, and multi-nesting capabilities to optimize resolution. We demonstrate the performance of the interpolation algorithm by testing accuracy in tracing the flow stream lines in both time and space and the efficacy of probabilistic modeling in evaluating the bio-physical coupling against empirical data. Finally, following recommended practices for the development of community models, we provide an open source code with a series of coupled standalone, optional modules detailed in a user's guide.
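As a conceptual illustration of the stochastic Lagrangian framework described here (a toy sketch, not CMS code), the snippet below advects an ensemble of particles in a prescribed 2-D velocity field, adds a random-walk term standing in for unresolved turbulent diffusion, and draws an individual behavioural trait for each particle from a specified probability distribution; the velocity field, diffusivity, and trait distribution are all illustrative assumptions.

```python
# Toy Lagrangian particle-tracking sketch (not CMS code): advection in a
# prescribed 2-D velocity field plus a random-walk diffusion term, with an
# individual trait drawn at random for each particle.
import numpy as np

rng = np.random.default_rng(5)
n_particles, n_steps, dt = 1000, 500, 600.0     # dt in seconds
K = 10.0                                        # horizontal diffusivity (m^2/s), assumed

def velocity(x, y, t):
    """Placeholder double-gyre-like velocity field (m/s)."""
    u = 0.1 * np.sin(np.pi * x / 1e5) * np.cos(np.pi * y / 1e5)
    v = -0.1 * np.cos(np.pi * x / 1e5) * np.sin(np.pi * y / 1e5)
    return u, v

# Release all particles near one point; draw an individual (eastward) swimming
# speed in m/s for each from a lognormal distribution of traits (illustrative).
x = np.full(n_particles, 2.0e4) + rng.normal(0, 500, n_particles)
y = np.full(n_particles, 2.0e4) + rng.normal(0, 500, n_particles)
swim = rng.lognormal(mean=-4.0, sigma=0.5, size=n_particles)

for step in range(n_steps):
    u, v = velocity(x, y, step * dt)
    x += (u + swim) * dt + np.sqrt(2 * K * dt) * rng.normal(size=n_particles)
    y += v * dt + np.sqrt(2 * K * dt) * rng.normal(size=n_particles)

print("mean displacement (km):", np.hypot(x - 2.0e4, y - 2.0e4).mean() / 1e3)
```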

281 citations


Book ChapterDOI
01 Jan 2013
TL;DR: The Gamma function, as discussed by the authors, is essentially a generalized factorial; beyond that, it has many further applications, e.g., as part of probability distributions.
Abstract: In what follows, we introduce the classical Gamma function in Sect. 2.1. It is essentially understood to be a generalized factorial. However, there are many further applications, e.g., as part of probability distributions (see, e.g., Evans et al. 2000). The main properties of the Gamma function are explained in this chapter (for a more detailed discussion the reader is referred to, e.g., Artin (1964), Lebedev (1973), Muller (1998), Nielsen (1906), and Whittaker and Watson (1948) and the references therein).
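For reference (standard facts, not new results from the chapter), the "generalized factorial" property and a typical appearance in a probability density are:

```latex
\Gamma(z) = \int_0^{\infty} t^{\,z-1} e^{-t}\, dt \quad (\operatorname{Re} z > 0), \qquad
\Gamma(z+1) = z\,\Gamma(z), \qquad \Gamma(n+1) = n! \quad (n = 0, 1, 2, \dots);
% e.g., it normalizes the Gamma(alpha, beta) probability density:
f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \qquad x > 0.
```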

267 citations


Journal ArticleDOI
TL;DR: In this paper, the performance analysis of a dual-hop relay transmission system composed of asymmetric radio-frequency (RF)/free-space optical (FSO) links with pointing errors is presented.
Abstract: In this work, the performance analysis of a dual-hop relay transmission system composed of asymmetric radio-frequency (RF)/free-space optical (FSO) links with pointing errors is presented. More specifically, we build on the system model presented in earlier work to derive new exact closed-form expressions for the cumulative distribution function, probability density function, moment generating function, and moments of the end-to-end signal-to-noise ratio in terms of the Meijer G-function. We then capitalize on these results to offer new exact closed-form expressions for the higher-order amount of fading, average error rate for binary and M-ary modulation schemes, and the ergodic capacity, all in terms of Meijer G-functions. Our new analytical results are also verified via computer-based Monte-Carlo simulations.

253 citations


Journal ArticleDOI
TL;DR: The main advantage of the proposed probabilistic load flow method is that highly accurate solutions can be obtained with little computation, and it places almost no constraints on the probability distributions of the input random variables.
Abstract: This paper proposes a probabilistic load flow method that can handle correlated power sources and loads. The method is based on the Nataf transformation and Latin Hypercube Sampling. Its main advantage is that highly accurate solutions can be obtained with little computation, and it places almost no constraints on the probability distributions of the input random variables. Considering the uncertainties of correlated wind power, solar energy and loads, the effectiveness and accuracy of the proposed method are verified by comparative tests on a modified IEEE 14-bus system and a modified IEEE 118-bus system.
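To make the two building blocks concrete, the sketch below (an illustration under assumed marginals and correlations, not the paper's code or test systems) combines Latin Hypercube Sampling with a Nataf-style, Gaussian-copula transformation: stratified uniforms are mapped to correlated standard normals and then through the inverse CDFs of the target marginals. A full Nataf transformation would additionally adjust the Gaussian correlation matrix to reproduce the target correlations of the non-Gaussian marginals; that correction is omitted here for brevity.

```python
# Sketch (not the paper's code): correlated input sampling for probabilistic
# load flow via Latin Hypercube Sampling plus a Nataf-style (Gaussian copula)
# transformation. The marginals and correlation matrix below are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 1000                                   # number of LHS samples

# Target marginals: wind (Weibull), solar (Beta), load (Normal) -- illustrative.
marginals = [
    stats.weibull_min(c=2.0, scale=8.0),   # wind speed (m/s)
    stats.beta(a=2.0, b=2.0),              # normalized solar output
    stats.norm(loc=100.0, scale=10.0),     # load (MW)
]
R = np.array([[1.0, 0.3, 0.2],             # correlation in standard normal space
              [0.3, 1.0, 0.4],             # (a full Nataf step would correct R
              [0.2, 0.4, 1.0]])            #  for the non-Gaussian marginals)

# Latin Hypercube Sampling: one stratified uniform sample per dimension.
d = len(marginals)
u = np.empty((d, n))
for i in range(d):
    u[i] = (rng.permutation(n) + rng.uniform(size=n)) / n

# Impose correlation in standard normal space, then map to the target marginals.
z = stats.norm.ppf(u)                      # independent, stratified standard normals
z_corr = np.linalg.cholesky(R) @ z         # correlated standard normals
x = np.array([m.ppf(stats.norm.cdf(z_corr[i])) for i, m in enumerate(marginals)])

print("sample correlation:\n", np.corrcoef(x).round(2))
```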

Journal ArticleDOI
TL;DR: This work proposes a variant of the EM algorithm that iteratively maximizes a generalized likelihood criterion, which can be interpreted as a degree of agreement between the statistical model and the uncertain observations.
Abstract: We consider the problem of parameter estimation in statistical models in the case where data are uncertain and represented as belief functions. The proposed method is based on the maximization of a generalized likelihood criterion, which can be interpreted as a degree of agreement between the statistical model and the uncertain observations. We propose a variant of the EM algorithm that iteratively maximizes this criterion. As an illustration, the method is applied to uncertain data clustering using finite mixture models, in the cases of categorical and continuous attributes.

Book
01 Jan 2013
TL;DR: This revision of QUANTITATIVE METHODS for Business provides students with a conceptual understanding of the role that quantitative methods play in the decision-making process and motivates students by using examples that illustrate situations in which quantitative methods are useful in decision making.
Abstract: Preface. 1. Introduction. 2. Introduction to Probability. 3. Probability Distributions. 4. Decision Analysis. 5. Utility and Game Theory. 6. Time Series Analysis and Forecasting. 7. Introduction to Linear Programming. 8. Linear Programming: Sensitivity Analysis and Interpretation of Solution. 9. Linear Programming Applications in Marketing, Finance, and Operations Management. 10. Distribution and Network Models. 11. Integer Linear Programming. 12. Advanced Optimization Applications. 13. Project Scheduling: PERT/CPM. 14. Inventory Models. 15. Waiting Line Models. 16. Simulation. 17. Markov Processes. Appendix A: Building Spreadsheet Models. Appendix B: Binomial Probabilities. Appendix C: Poisson Probabilities. Appendix D: Areas for the Standard Normal Distribution. Appendix E: Values of e^(-λ). Appendix F: References and Bibliography. Appendix G: Self-Test Solutions and Answers to Even-Numbered Problems.

Journal ArticleDOI
TL;DR: An approximate message passing (AMP) algorithm is used and a rigorous proof is given that this approach is successful as soon as the undersampling rate δ exceeds the (upper) Rényi information dimension of the signal, d̅(pX).
Abstract: We study the compressed sensing reconstruction problem for a broad class of random, band-diagonal sensing matrices. This construction is inspired by the idea of spatial coupling in coding theory. As demonstrated heuristically and numerically by Krzakala [30], message passing algorithms can effectively solve the reconstruction problem for spatially coupled measurements with undersampling rates close to the fraction of nonzero coordinates. We use an approximate message passing (AMP) algorithm and analyze it through the state evolution method. We give a rigorous proof that this approach is successful as soon as the undersampling rate δ exceeds the (upper) Rényi information dimension of the signal, d̄(p_X). More precisely, for a sequence of signals of diverging dimension n whose empirical distribution converges to p_X, reconstruction is with high probability successful from d̄(p_X) n + o(n) measurements taken according to a band diagonal matrix. For sparse signals, i.e., sequences of dimension n and k(n) nonzero entries, this implies reconstruction from k(n) + o(n) measurements. For “discrete” signals, i.e., signals whose coordinates take a fixed finite set of values, this implies reconstruction from o(n) measurements. The result is robust with respect to noise, does not apply uniquely to random signals, but requires the knowledge of the empirical distribution of the signal p_X.
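For orientation, the sketch below shows a plain (non-spatially-coupled) AMP iteration with soft thresholding for y = Ax + w; the band-diagonal, spatially coupled construction and the state-evolution analysis in the paper build on this basic loop. The matrix sizes, threshold schedule, and signal model here are illustrative assumptions, not the paper's setup.

```python
# Sketch (illustrative, not the paper's spatially coupled construction): basic
# AMP with soft thresholding for y = A x + w, with A having i.i.d. N(0, 1/m) entries.
import numpy as np

rng = np.random.default_rng(7)
n, m, k = 2000, 800, 100                 # signal length, measurements, sparsity
A = rng.normal(0, 1 / np.sqrt(m), size=(m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(0, 1, k)
y = A @ x_true + 0.01 * rng.normal(size=m)

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x, z = np.zeros(n), y.copy()
for _ in range(50):
    theta = 2.0 * np.linalg.norm(z) / np.sqrt(m)       # threshold ~ residual RMS
    x_new = soft(x + A.T @ z, theta)
    # Onsager correction: fraction of active coordinates times the old residual
    z = y - A @ x_new + (np.count_nonzero(x_new) / m) * z
    x = x_new

print("relative reconstruction error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```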

Posted Content
TL;DR: In this article, a new method is developed to represent probabilistic relations on multiple random events, where a probability distribution over the relations is directly represented by a Bayesian network.
Abstract: A new method is developed to represent probabilistic relations on multiple random events. Where previously knowledge bases containing probabilistic rules were used for this purpose, here a probability distribution over the relations is directly represented by a Bayesian network. By using a powerful way of specifying conditional probability distributions in these networks, the resulting formalism is more expressive than the previous ones. Particularly, it provides for constraints on equalities of events, and it allows to define complex, nested combination functions.

Journal ArticleDOI
TL;DR: This paper considers logic-based argumentation with uncertain arguments by considering models of the language, which can be used to induce a probability distribution over arguments constructed using classical logic, and shows how this formalization of uncertainty for logical arguments relates to uncertainty of abstract arguments.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a new method to derive the probability distribution of a function of random variables representing the structural response, based on the maximum entropy principle with constraints specified in terms of fractional moments, in place of the commonly used integer moments.
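In outline (this is the generic form of such a construction, stated here as an assumption about the setup rather than a quotation from the paper), maximizing the entropy of the response density subject to a small number of fractional-moment constraints yields an exponential-family density:

```latex
\max_{f}\; -\!\int f(x)\,\ln f(x)\,dx
\quad \text{s.t.}\quad \int x^{\alpha_k} f(x)\,dx = M_{\alpha_k},\; k = 1,\dots,K, \qquad \int f(x)\,dx = 1,
% which gives the maximum-entropy density
f(x) = \exp\!\Big(-\lambda_0 - \sum_{k=1}^{K} \lambda_k\, x^{\alpha_k}\Big),
```

where both the fractional exponents α_k and the Lagrange multipliers λ_k are fitted to (typically simulated) fractional moments of the structural response.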

Journal ArticleDOI
TL;DR: In this paper, a probability distribution model named "versatile distribution" is formulated and developed along with its properties and applications; it can represent forecast errors well for all forecast timescales and magnitudes.
Abstract: The existence of wind power forecast errors is one of the most challenging issues for wind power system operation. It is difficult to find a reasonable method for the representation of forecast errors and apply it in scheduling. In this paper, a probability distribution model named “versatile distribution” is formulated and developed along with its properties and applications. The model can well represent forecast errors for all forecast timescales and magnitudes. The incorporation of the model in economic dispatch (ED) problems can simplify the wind-induced uncertainties via a few analytical terms in the problem formulation. The ED problem with wind power could hence be solved by the classical optimization methods, such as sequential linear programming which has been widely accepted by industry for solving ED problems. Discussions are also extended on the incorporation of the proposed versatile distribution into unit commitment problems. The results show that the new distribution is more effective than other commonly used distributions (i.e., Gaussian and Beta) with more accurate representation of forecast errors and better formulation and solution of ED problems.

Journal ArticleDOI
TL;DR: To solve the complicated nonlinear, non-smooth, and non-differentiable SDEED problem, an enhanced particle swarm optimization (PSO) algorithm is applied to obtain the best solution for the corresponding scenarios and to improve the quality of the solutions attained by standard PSO.

Journal ArticleDOI
TL;DR: This analysis indicates that the heavy-tailed degree distribution is causally determined by the similarly skewed distribution of human activity; this relation cannot be explained by interactive models, such as preferential attachment, since the observed actions are not likely to be caused by interactions with other people.
Abstract: The probability distribution of number of ties of an individual in a social network follows a scale-free power-law. However, how this distribution arises has not been conclusively demonstrated in direct analyses of people's actions in social networks. Here, we perform a causal inference analysis and find an underlying cause for this phenomenon. Our analysis indicates that heavy-tailed degree distribution is causally determined by similarly skewed distribution of human activity. Specifically, the degree of an individual is entirely random - following a “maximum entropy attachment” model - except for its mean value which depends deterministically on the volume of the users' activity. This relation cannot be explained by interactive models, like preferential attachment, since the observed actions are not likely to be caused by interactions with other people.

Journal ArticleDOI
TL;DR: In this article, the authors highlight the importance of decision-making tools designed for situations where generally agreed-upon probability distributions are not available and stakeholders show different degrees of risk tolerance.
Abstract: Climate change studies rarely yield consensus on the probability distribution of exposure, vulnerability, or possible outcomes, and therefore the evaluation of alternative policy strategies is difficult. This Perspective highlights the importance of decision-making tools designed for situations where generally agreed-upon probability distributions are not available and stakeholders show different degrees of risk tolerance.

Journal ArticleDOI
TL;DR: In this article, an effective estimation of distribution algorithm (EDA) is proposed to solve the distributed permutation flow shop scheduling problem (DPFSP), where the earliest completion factory rule is employed for the permutation based encoding to generate feasible schedules and calculate the schedule objective value.

Journal ArticleDOI
TL;DR: In this article, the authors present an algorithm which uses sublinear in n, specifically O(n^(2/3) ε^(-8/3) log n), independent samples from each distribution, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the cases when the distance between the distributions is small (less than max{ε^(4/3) n^(-1/3)/32, ε n^(-1/2)/4}) or large (more than ε) in ℓ1 distance.
Abstract: Given samples from two distributions over an n-element set, we wish to test whether these distributions are statistically close. We present an algorithm which uses sublinear in n, specifically O(n^(2/3) ε^(-8/3) log n), independent samples from each distribution, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the cases when the distance between the distributions is small (less than max{ε^(4/3) n^(-1/3)/32, ε n^(-1/2)/4}) or large (more than ε) in ℓ1 distance. This result can be compared to the lower bound of Ω(n^(2/3) ε^(-2/3)) for this problem given by Valiant [2008]. Our algorithm has applications to the problem of testing whether a given Markov process is rapidly mixing. We present sublinear algorithms for several variants of this problem as well.
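The sketch below is not the paper's full tester (which combines an ℓ2 test on the "light" elements with direct estimation of the "heavy" ones), but it shows the collision-based estimator of the squared ℓ2 distance that sits at its core: self-collision counts estimate ‖p‖₂² and ‖q‖₂², and cross-collision counts estimate the inner product p·q.

```python
# Sketch (not the paper's full tester): the collision-based estimate of the
# squared L2 distance ||p - q||_2^2 that underlies closeness testing.
import numpy as np
from collections import Counter

rng = np.random.default_rng(8)
n, s = 1000, 20000                    # domain size, samples per distribution

p = rng.dirichlet(np.ones(n))         # two arbitrary distributions on [n]
q = 0.7 * p + 0.3 * rng.dirichlet(np.ones(n))

xs = rng.choice(n, size=s, p=p)
ys = rng.choice(n, size=s, p=q)

def self_collisions(sample):
    counts = np.array(list(Counter(sample).values()))
    return (counts * (counts - 1) // 2).sum()       # unordered same-sample pairs

cx, cy = Counter(xs), Counter(ys)
cross = sum(cx[k] * cy[k] for k in cx)              # pairs (i, j) with x_i == y_j

l2_sq_est = (self_collisions(xs) / (s * (s - 1) / 2)
             + self_collisions(ys) / (s * (s - 1) / 2)
             - 2 * cross / s**2)
l2_sq_true = np.sum((p - q) ** 2)
print(f"estimated ||p-q||_2^2 = {l2_sq_est:.2e}   true = {l2_sq_true:.2e}")
```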

Journal ArticleDOI
TL;DR: In this article, a non-probabilistic reliability model is given for structures with convex model uncertainty, in which reliability is defined as the ratio of the multidimensional volume falling into the reliability domain to the volume of the whole convex model.

Journal ArticleDOI
TL;DR: The addressed filtering problem aims to design an unbiased recursive filter for systems with random parameter matrices, stochastic nonlinearities, multiple fading measurements, and correlated noises.

Proceedings Article
13 Jun 2013
TL;DR: This work designs differentially private algorithms for statistical model selection and gives sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and shows that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.
Abstract: We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in the following sense: Suppose that there is a nonprivate model selection procedure f which is the reference to which we compare our performance. Our differentially private algorithms output the correct value f(D) whenever f is stable on the input data set D. We work with two notions, perturbation stability and subsampling stability. We give two classes of results: generic ones, that apply to any function with discrete output set; and specific algorithms for the problem of sparse linear regression. The algorithms we describe are efficient and in some cases match the optimal nonprivate asymptotic sample complexity. Our algorithms for sparse linear regression require analyzing the stability properties of the popular LASSO estimator. We give sufficient conditions for the LASSO estimator to be robust to small changes in the data set, and show that these conditions hold with high probability under essentially the same stochastic assumptions that are used in the literature to analyze convergence of the LASSO.

Journal ArticleDOI
TL;DR: This work uses the well-known Kullback-Leibler divergence to measure similarity between uncertain objects in both the continuous and discrete cases, and integrates it into partitioning and density-based clustering methods to cluster uncertain objects.
Abstract: Clustering on uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges on both modeling similarity between uncertain objects and developing efficient computational methods. The previous methods extend traditional partitioning clustering methods like k-means and density-based clustering methods like DBSCAN to uncertain data, thus rely on geometric distances between objects. Such methods cannot handle uncertain objects that are geometrically indistinguishable, such as products with the same mean but very different variances in customer ratings. Surprisingly, probability distributions, which are essential characteristics of uncertain objects, have not been considered in measuring similarity between uncertain objects. In this paper, we systematically model uncertain objects in both continuous and discrete domains, where an uncertain object is modeled as a continuous and discrete random variable, respectively. We use the well-known Kullback-Leibler divergence to measure similarity between uncertain objects in both the continuous and discrete cases, and integrate it into partitioning and density-based clustering methods to cluster uncertain objects. Nevertheless, a naive implementation is very costly. Particularly, computing exact KL divergence in the continuous case is very costly or even infeasible. To tackle the problem, we estimate KL divergence in the continuous case by kernel density estimation and employ the fast Gauss transform technique to further speed up the computation. Our extensive experiment results verify the effectiveness, efficiency, and scalability of our approaches.
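In the continuous case the paper estimates KL divergence from kernel density estimates; the snippet below is a minimal Monte-Carlo version of that idea on toy data, using plain gaussian_kde rather than the fast Gauss transform acceleration described in the abstract.

```python
# Sketch: KL divergence between two "uncertain objects" given as samples,
# via kernel density estimates and Monte-Carlo averaging (toy data only).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(9)
# Two uncertain objects with the same mean but different spread (ratings, say).
obj_a = rng.normal(3.0, 0.3, size=500)
obj_b = rng.normal(3.0, 1.2, size=500)

def kl_divergence(samples_p, samples_q, n_mc=5000):
    """Estimate KL(P || Q) = E_P[log p(X) - log q(X)] with KDE plug-ins."""
    p, q = gaussian_kde(samples_p), gaussian_kde(samples_q)
    x = p.resample(n_mc)[0]                    # Monte-Carlo points drawn from P
    return np.mean(np.log(p(x)) - np.log(q(x)))

print("KL(A || B) ~", kl_divergence(obj_a, obj_b))
print("KL(B || A) ~", kl_divergence(obj_b, obj_a))   # asymmetric, as expected
```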

Journal ArticleDOI
TL;DR: One-dimensional free fermions are studied with emphasis on propagating fronts emerging from a step initial condition and it is found that the full counting statistics coincide with the eigenvalue statistics of the edge spectrum of matrices from the Gaussian unitary ensemble.
Abstract: One-dimensional free fermions are studied with emphasis on propagating fronts emerging from a step initial condition. The probability distribution of the number of particles at the edge of the front is determined exactly. It is found that the full counting statistics coincide with the eigenvalue statistics of the edge spectrum of matrices from the Gaussian unitary ensemble. The correspondence established between the random matrix eigenvalues and the particle positions yields the order statistics of the rightmost particles in the front and, furthermore, it implies their subdiffusive spreading.

Book
03 Jan 2013
TL;DR: In this paper, a structural classification of probability distances and probability metrics is presented, including primary, simple and compound probability distances, and minimal and maximal distances and norms, and a structural class of probability metrics.
Abstract: Main directions in the theory of probability metrics- Probability distances and probability metrics: Definitions- Primary, simple and compound probability distances, and minimal and maximal distances and norms- A structural classification of probability distances- Monge-Kantorovich mass transference problem, minimal distances and minimal norms- Quantitative relationships between minimal distances and minimal norms- K-Minimal metrics- Relations between minimal and maximal distances- Moment problems related to the theory of probability metrics: Relations between compound and primary distances- Moment distances- Uniformity in weak and vague convergence- Glivenko-Cantelli theorem and Bernstein-Kantorovich invariance principle- Stability of queueing systems- Optimal quality usage- Ideal metrics with respect to summation scheme for iid random variables- Ideal metrics and rate of convergence in the CLT for random motions- Applications of ideal metrics for sums of iid random variables to the problems of stability and approximation in risk theory- How close are the individual and collective models in risk theory?- Ideal metric with respect to maxima scheme of iid random elements- Ideal metrics and stability of characterizations of probability distributions- Positive and negative definite kernels and their properties- Negative definite kernels and metrics: Recovering measures from potential- Statistical estimates obtained by the minimal distances method- Some statistical tests based on N-distances- Distances defined by zonoids- N-distance tests of uniformity on the hypersphere-

Journal ArticleDOI
TL;DR: The statistical characteristics of cost overruns experienced from contract award in 276 Australian construction and engineering projects were analyzed in this article, where the skewness and kurtosis values of the cost overrun are computed to determine if the empirical distribution of the data follows a normal distribution.
Abstract: The statistical characteristics of cost overruns experienced from contract award in 276 Australian construction and engineering projects were analyzed. The skewness and kurtosis values of the cost overruns are computed to determine if the empirical distribution of the data follows a normal distribution. The Kolmogorov-Smirnov, Anderson-Darling, and chi-squared nonparametric tests are used to determine the goodness of fit of the selected probability distributions. A three-parameter Fréchet probability function is found to describe the behavior of cost overruns and provide the best overall distribution fit. The Fréchet distribution is then used to calculate the probability of a cost overrun being experienced. The statistical characteristics of contract size and cost overruns were also analyzed. The Cauchy (