Probability distribution

About: Probability distribution is a research topic. Over its lifetime, 40,928 publications have been published within this topic, receiving 1,105,809 citations. The topic is also known as: distribution.

01 Jan 1967
Abstract: The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give partitions which are reasonably efficient in the sense of within-class variance. That is, if \(p\) is the probability mass function for the population, \(S = \{S_1, S_2, \ldots, S_k\}\) is a partition of \(E_N\), and \(u_i\), \(i = 1, 2, \ldots, k\), is the conditional mean of \(p\) over the set \(S_i\), then \(W^2(S) = \sum_{i=1}^{k} \int_{S_i} |z - u_i|^2 \, dp(z)\) tends to be low for the partitions \(S\) generated by the method. We say 'tends to be low,' primarily because of intuitive considerations, corroborated to some extent by mathematical analysis and practical computational experience. Also, the k-means procedure is easily programmed and is computationally economical, so that it is feasible to process very large samples on a digital computer. Possible applications include methods for similarity grouping, nonlinear prediction, approximating multivariate distributions, and nonparametric tests for independence among several variables. In addition to suggesting practical classification methods, the study of k-means has proved to be theoretically interesting. The k-means concept represents a generalization of the ordinary sample mean, and one is naturally led to study the pertinent asymptotic behavior, the object being to establish some sort of law of large numbers for the k-means. This problem is sufficiently interesting, in fact, for us to devote a good portion of this paper to it. The k-means are defined in section 2.1, and the main results which have been obtained on the asymptotic behavior are given there. The rest of section 2 is devoted to the proofs of these results. Section 3 describes several specific possible applications, and reports some preliminary results from computer experiments conducted to explore the possibilities inherent in the k-means idea. The extension to general metric spaces is indicated briefly in section 4.
The original point of departure for the work described here was a series of problems in optimal classification (MacQueen [9]) which represented special

22,533 citations
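
The k-means abstract above defines the within-class variance W^2(S) but does not spell out the procedure itself. The following is a minimal sketch under stated assumptions: a one-pass sequential update (seed the means with the first k points, then fold each later point into its nearest mean), and an empirical W^2(S) in which p places mass 1/n on each sample point. All function names are hypothetical and chosen for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(means, z):
    """Index of the mean closest to point z."""
    return min(range(len(means)), key=lambda i: sq_dist(means[i], z))


def sequential_kmeans(points, k):
    """One pass over the sample (assumed update rule, not taken from the abstract).

    The first k points seed the means; every later point is assigned to its
    nearest mean, which is then updated as a running average.
    """
    means = [list(p) for p in points[:k]]
    counts = [1] * k
    for z in points[k:]:
        i = nearest(means, z)
        counts[i] += 1
        means[i] = [m + (x - m) / counts[i] for m, x in zip(means[i], z)]
    return means


def within_class_variance(points, means):
    """Empirical W^2(S) for the nearest-mean partition of the sample.

    S_i is the set of points closest to means[i]; u_i is recomputed as the
    conditional (group) mean over S_i, and p puts mass 1/n on each point.
    """
    groups = [[] for _ in means]
    for z in points:
        groups[nearest(means, z)].append(z)
    total = 0.0
    for g in groups:
        if g:
            u = [sum(coord) / len(g) for coord in zip(*g)]
            total += sum(sq_dist(z, u) for z in g)
    return total / len(points)


sample = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
means = sequential_kmeans(sample, k=2)
print(means, within_class_variance(sample, means))
```

Because the total variance splits into within-class and between-class parts, any partition evaluated this way gives a value no larger than the k = 1 case, which is a quick sanity check on the output.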

Journal Article
Abstract: We discuss the following problem: given a random sample \(X = (X_1, X_2, \ldots, X_n)\) from an unknown probability distribution \(F\), estimate the sampling distribution of some prespecified random variable \(R(X, F)\), on the basis of the observed data \(x\). (Standard jackknife theory gives an approximate mean and variance in the case \(R(X, F) = \theta(\hat{F}) - \theta(F)\), \(\theta\) some parameter of interest.) A general method, called the “bootstrap”, is introduced, and shown to work satisfactorily on a variety of estimation problems. The jackknife is shown to be a linear approximation method for the bootstrap. The exposition proceeds by a series of examples: variance of the sample median, error rates in a linear discriminant analysis, ratio estimation, estimating regression parameters, etc.

13,648 citations
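
As a rough illustration of the bootstrap idea in the abstract above, the sketch below resamples the observed data with replacement and uses the spread of the recomputed statistic as an estimate of its sampling variance. The sample median is used because the abstract lists it as an example. The function names, the number of resamples, and the data values are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_replicates(data, statistic, n_boot=2000, seed=0):
    """Recompute `statistic` on n_boot resamples drawn with replacement.

    Each resample has the same size as the observed data, i.e. it is a
    sample from the empirical distribution of the data.
    """
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_boot)]


def bootstrap_variance(data, statistic, n_boot=2000, seed=0):
    """Bootstrap estimate of the sampling variance of `statistic`."""
    return statistics.variance(bootstrap_replicates(data, statistic, n_boot, seed))


# Illustrative data only; not taken from the paper.
observed = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 3.3]
print(bootstrap_variance(observed, statistics.median))
```

Fixing the seed makes the estimate reproducible; increasing `n_boot` reduces the Monte Carlo noise in the estimate but does not change what is being estimated.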

01 Jun 1969
Abstract: Uncertainties in measurements; probability distributions; error analysis; estimates of means and errors; Monte Carlo techniques; dependent and independent variables; least-squares fit to a polynomial; least-squares fit to an arbitrary function; fitting composite peaks; direct application of the maximum likelihood. Appendices: numerical methods; matrices; graphs and tables; histograms and graphs; computer routines in Pascal.

12,721 citations

Journal Article
Abstract: Given a sequence of independent identically distributed random variables with a common probability density function, the problems of estimating a probability density function and of determining the mode of a probability function are discussed. Only estimates which are consistent and asymptotically normal are constructed. (Author)

9,261 citations
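
The abstract above does not say which estimator is constructed. One common way to estimate a density and its mode from an i.i.d. sample is a kernel-type estimate, sketched below under the assumption of a Gaussian kernel and a hand-picked bandwidth `h`. The grid search for the mode, the function names, and the data are likewise illustrative choices rather than anything stated in the abstract.

```python
import math


def kernel_density(sample, h):
    """Return f_hat(x) = (1 / (n h)) * sum_i K((x - x_i) / h), K standard normal."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return f_hat


def estimate_mode(sample, h, grid_size=1001):
    """Approximate the mode as the grid point that maximizes the estimate."""
    f_hat = kernel_density(sample, h)
    lo, hi = min(sample), max(sample)
    grid = [lo + (hi - lo) * j / (grid_size - 1) for j in range(grid_size)]
    return max(grid, key=f_hat)


# Illustrative data only.
observations = [1.2, 0.8, 1.1, 0.9, 1.0, 3.5, 1.3, 0.7]
print(estimate_mode(observations, h=0.4))
```

The bandwidth controls the trade-off between smoothness and detail: a larger `h` blurs nearby peaks together, while a smaller `h` lets individual observations show up as separate bumps.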

Journal Article
Abstract: The theory of possibility described in this paper is related to the theory of fuzzy sets by defining the concept of a possibility distribution as a fuzzy restriction which acts as an elastic constraint on the values that may be assigned to a variable. More specifically, if \(F\) is a fuzzy subset of a universe of discourse \(U = \{u\}\) which is characterized by its membership function \(\mu_F\), then a proposition of the form “X is F,” where \(X\) is a variable taking values in \(U\), induces a possibility distribution \(\Pi_X\) which equates the possibility of \(X\) taking the value \(u\) to \(\mu_F(u)\), the compatibility of \(u\) with \(F\). In this way, \(X\) becomes a fuzzy variable which is associated with the possibility distribution \(\Pi_X\) in much the same way as a random variable is associated with a probability distribution. In general, a variable may be associated both with a possibility distribution and a probability distribution, with the weak connection between the two expressed as the possibility/probability consistency principle. A thesis advanced in this paper is that the imprecision that is intrinsic in natural languages is, in the main, possibilistic rather than probabilistic in nature. Thus, by employing the concept of a possibility distribution, a proposition, p, in a natural language may be translated into a procedure which computes the probability distribution of a set of attributes which are implied by p. Several types of conditional translation rules are discussed and, in particular, a translation rule for propositions of the form “X is F is α-possible,” where α is a number in the interval [0, 1], is formulated and illustrated by examples.

8,549 citations
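
To make the construction in the abstract above concrete, the sketch below builds a possibility distribution from a membership function by setting the possibility of each value u equal to its membership grade. The membership grades for "small integer", the function names, and the probabilities are illustrative assumptions. Two further pieces are not stated in the abstract and should also be read as assumptions: the possibility of a set is taken as the largest possibility among its elements, and the degree of consistency between a possibility and a probability distribution is taken as the sum of possibility times probability.

```python
# Assumed membership grades for the fuzzy set F = "small integer".
MU_SMALL = {1: 1.0, 2: 1.0, 3: 0.8, 4: 0.6, 5: 0.4, 6: 0.2}


def possibility_distribution(mu):
    """'X is F' induces pi_X(u) = mu_F(u) for every u in the universe."""
    return dict(mu)


def possibility_of(pi, subset):
    """Possibility that X lies in `subset`: the largest pi(u) over its elements."""
    return max((pi.get(u, 0.0) for u in subset), default=0.0)


def consistency(pi, p):
    """Assumed degree of consistency: sum over u of pi(u) * p(u)."""
    return sum(pi.get(u, 0.0) * prob for u, prob in p.items())


pi_x = possibility_distribution(MU_SMALL)
uniform = {u: 1.0 / 6.0 for u in range(1, 7)}  # illustrative probabilities
print(possibility_of(pi_x, {3, 4}), consistency(pi_x, uniform))
```

Note that the possibility values do not need to sum to one, which is one way the two kinds of distribution differ.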

Network Information
Related Topics (5)

97.3K papers, 2.6M citations

86% related
Cluster analysis

146.5K papers, 2.9M citations

86% related
Nonlinear system

208.1K papers, 4M citations

86% related
Monte Carlo method

95.9K papers, 2.1M citations

85% related
Matrix (mathematics)

105.5K papers, 1.9M citations

85% related