
Showing papers on "Probability distribution published in 2010"


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a penalized linear unbiased selection (PLUS) algorithm, which computes multiple exact local minimizers of a possibly nonconvex penalized loss function in a certain main branch of the graph of critical points of the loss.
Abstract: We propose MC+, a fast, continuous, nearly unbiased and accurate method of penalized variable selection in high-dimensional linear regression. The LASSO is fast and continuous, but biased. The bias of the LASSO may prevent consistent variable selection. Subset selection is unbiased but computationally costly. The MC+ has two elements: a minimax concave penalty (MCP) and a penalized linear unbiased selection (PLUS) algorithm. The MCP provides the convexity of the penalized loss in sparse regions to the greatest extent given certain thresholds for variable selection and unbiasedness. The PLUS computes multiple exact local minimizers of a possibly nonconvex penalized loss function in a certain main branch of the graph of critical points of the penalized loss. Its output is a continuous piecewise linear path extending from the origin for infinite penalty to a least squares solution for zero penalty. We prove that at a universal penalty level, the MC+ has high probability of matching the signs of the unknowns, and thus correct selection, without assuming the strong irrepresentable condition required by the LASSO. This selection consistency applies to the case of p≫n, and is proved to hold for exactly the MC+ solution among possibly many local minimizers. We prove that the MC+ attains certain minimax convergence rates in probability for the estimation of regression coefficients in ℓ_r balls. We use the SURE method to derive degrees of freedom and Cp-type risk estimates for general penalized LSE, including the LASSO and MC+ estimators, and prove their unbiasedness. Based on the estimated degrees of freedom, we propose an estimator of the noise level for proper choice of the penalty level. For full rank designs and general sub-quadratic penalties, we provide necessary and sufficient conditions for the continuity of the penalized LSE. Simulation results overwhelmingly support our claim of superior variable selection properties and demonstrate the computational efficiency of the proposed method.

2,382 citations
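
The MCP itself has a simple closed form, so a small illustration may help. The sketch below (NumPy, not the authors' PLUS algorithm) evaluates the penalty and the univariate MCP estimator for an orthonormal design, which leaves large coefficients unshrunk, unlike the LASSO; the parameter values are illustrative only.

```python
import numpy as np

def mcp_penalty(t, lam, gamma):
    """MCP: rho(t; lam, gamma) = lam*|t| - t^2/(2*gamma) for |t| <= gamma*lam,
    and the constant gamma*lam^2/2 beyond that point."""
    t = np.abs(t)
    return np.where(t <= gamma * lam, lam * t - t ** 2 / (2.0 * gamma), 0.5 * gamma * lam ** 2)

def mcp_threshold(z, lam, gamma):
    """Univariate MCP estimator for an orthonormal design (gamma > 1): rescaled
    soft thresholding below gamma*lam, identity (hence no bias) above it."""
    z = np.asarray(z, dtype=float)
    soft = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0) / (1.0 - 1.0 / gamma)
    return np.where(np.abs(z) <= gamma * lam, soft, z)

z = np.linspace(-4, 4, 9)
print("penalty:  ", mcp_penalty(z, lam=1.0, gamma=2.5))
print("threshold:", mcp_threshold(z, lam=1.0, gamma=2.5))  # large inputs pass through unshrunk
```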


Journal ArticleDOI
TL;DR: This paper proposes a model that describes uncertainty in both the distribution form (discrete, Gaussian, exponential, etc.) and moments (mean and covariance matrix) and demonstrates that for a wide range of cost functions the associated distributionally robust stochastic program can be solved efficiently.
Abstract: Stochastic programming can effectively describe many decision-making problems in uncertain environments. Unfortunately, such programs are often computationally demanding to solve. In addition, their solution can be misleading when there is ambiguity in the choice of a distribution for the random parameters. In this paper, we propose a model that describes uncertainty in both the distribution form (discrete, Gaussian, exponential, etc.) and moments (mean and covariance matrix). We demonstrate that for a wide range of cost functions the associated distributionally robust (or min-max) stochastic program can be solved efficiently. Furthermore, by deriving a new confidence region for the mean and the covariance matrix of a random vector, we provide probabilistic arguments for using our model in problems that rely heavily on historical data. These arguments are confirmed in a practical example of portfolio selection, where our framework leads to better-performing policies on the “true” distribution underlying the daily returns of financial assets.

1,569 citations


Journal ArticleDOI
TL;DR: This paper analyzes the statistical properties, bias and variance, of the k-fold cross-validation classification error estimator (k-cv) and proposes a novel theoretical decomposition of the variance considering its sources of variance: sensitivity to changes in the training set and sensitivity to changed folds.
Abstract: In the machine learning field, the performance of a classifier is usually measured in terms of prediction error. In most real-world problems, the error cannot be exactly calculated and it must be estimated. Therefore, it is important to choose an appropriate estimator of the error. This paper analyzes the statistical properties, bias and variance, of the k-fold cross-validation classification error estimator (k-cv). Our main contribution is a novel theoretical decomposition of the variance of the k-cv considering its sources of variance: sensitivity to changes in the training set and sensitivity to changes in the folds. The paper also compares the bias and variance of the estimator for different values of k. The experimental study has been performed in artificial domains because they allow the exact computation of the implied quantities and we can rigorously specify the conditions of experimentation. The experimentation has been performed for two classifiers (naive Bayes and nearest neighbor), different numbers of folds, sample sizes, and training sets coming from assorted probability distributions. We conclude by including some practical recommendation on the use of k-fold cross validation.

1,270 citations
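
As a rough illustration of the two variance sources that the decomposition separates, the sketch below (scikit-learn, synthetic data; not the paper's experimental protocol) repeats 10-fold cross-validation of a 1-nearest-neighbour classifier while varying either the fold assignment or the training sample, and compares the spread of the resulting error estimates.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def kcv_error(X, y, k, seed):
    """k-fold CV error estimate for a 1-NN classifier under a given fold split."""
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=cv)
    return 1.0 - acc.mean()

# Sensitivity to the choice of folds (training set held fixed).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
fold_estimates = [kcv_error(X, y, k=10, seed=s) for s in range(30)]

# Sensitivity to the training sample (a new data set each repetition).
sample_estimates = []
for s in range(30):
    Xs, ys = make_classification(n_samples=200, n_features=10, random_state=s)
    sample_estimates.append(kcv_error(Xs, ys, k=10, seed=0))

print("std due to fold choice:   %.4f" % np.std(fold_estimates))
print("std due to training set:  %.4f" % np.std(sample_estimates))
```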


Journal ArticleDOI
TL;DR: This work develops and analyzes M-estimation methods for divergence functionals and the likelihood ratios of two probability distributions based on a nonasymptotic variational characterization of f-divergences, which allows the problem of estimating divergences to be tackled via convex empirical risk optimization.
Abstract: We develop and analyze M-estimation methods for divergence functionals and the likelihood ratios of two probability distributions. Our method is based on a nonasymptotic variational characterization of f-divergences, which allows the problem of estimating divergences to be tackled via convex empirical risk optimization. The resulting estimators are simple to implement, requiring only the solution of standard convex programs. We present an analysis of consistency and convergence for these estimators. Given conditions only on the ratios of densities, we show that our estimators can achieve optimal minimax rates for the likelihood ratio and the divergence functionals in certain regimes. We derive an efficient optimization algorithm for computing our estimates, and illustrate their convergence behavior and practical viability by simulations.

729 citations


Journal ArticleDOI
TL;DR: The nested Chinese restaurant process (nCRP) as discussed by the authors is a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees, and it can be used as a prior distribution in a Bayesian nonparametric model of document collections.
Abstract: We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning—the use of Bayesian nonparametric methods to infer distributions on flexible data structures.

613 citations
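
A generative sketch may make the prior concrete: the code below draws finite-depth paths from the nCRP by running a Chinese restaurant process at every node, so that popular branches attract further documents (preferential attachment). The concentration parameter gamma and the truncation depth are illustrative choices, and the paper's posterior inference algorithm is not shown.

```python
import random
from collections import defaultdict

def ncrp_paths(n_docs, depth, gamma=1.0, seed=0):
    """Draw a depth-limited nCRP path for each document. At every node, an
    existing child c is chosen with probability count(c) / (n + gamma) and a
    brand-new child with probability gamma / (n + gamma), where n is the number
    of documents that have already passed through the node."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: defaultdict(int))   # node -> child -> visit count
    next_id = defaultdict(int)                       # node -> next fresh child label
    paths = []
    for _ in range(n_docs):
        node, path = (), []
        for _ in range(depth):
            n = sum(counts[node].values())
            if n == 0 or rng.random() < gamma / (n + gamma):
                child = next_id[node]                # open a new branch
                next_id[node] += 1
            else:                                    # follow an existing branch w.p. count/n
                r, child = rng.uniform(0, n), None
                for c, cnt in counts[node].items():
                    r -= cnt
                    if r <= 0:
                        child = c
                        break
            counts[node][child] += 1
            path.append(child)
            node = node + (child,)
        paths.append(tuple(path))
    return paths

print(ncrp_paths(n_docs=10, depth=3))
```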


Journal Article
TL;DR: In this article, it was shown that it is #P-hard to approximate the permanent of a matrix A of independent N(0, 1) Gaussian entries, with high probability over A.
Abstract: We give new evidence that quantum computers -- moreover, rudimentary quantum computers built entirely out of linear-optical elements -- cannot be efficiently simulated by classical computers. In particular, we define a model of computation in which identical photons are generated, sent through a linear-optical network, then nonadaptively measured to count the number of photons in each mode. This model is not known or believed to be universal for quantum computation, and indeed, we discuss the prospects for realizing the model using current technology. On the other hand, we prove that the model is able to solve sampling problems and search problems that are classically intractable under plausible assumptions. Our first result says that, if there exists a polynomial-time classical algorithm that samples from the same probability distribution as a linear-optical network, then P^#P=BPP^NP, and hence the polynomial hierarchy collapses to the third level. Unfortunately, this result assumes an extremely accurate simulation. Our main result suggests that even an approximate or noisy classical simulation would already imply a collapse of the polynomial hierarchy. For this, we need two unproven conjectures: the "Permanent-of-Gaussians Conjecture", which says that it is #P-hard to approximate the permanent of a matrix A of independent N(0,1) Gaussian entries, with high probability over A; and the "Permanent Anti-Concentration Conjecture", which says that |Per(A)|>=sqrt(n!)/poly(n) with high probability over A. We present evidence for these conjectures, both of which seem interesting even apart from our application. This paper does not assume knowledge of quantum optics. Indeed, part of its goal is to develop the beautiful theory of noninteracting bosons underlying our model, and its connection to the permanent function, in a self-contained way accessible to theoretical computer scientists.

521 citations
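
The conjectures concern permanents of Gaussian matrices, which can be evaluated exactly for small n with Ryser's inclusion-exclusion formula. The sketch below (NumPy) computes Per(A) for an 8 x 8 matrix of i.i.d. N(0,1) entries and prints the sqrt(n!)/poly(n) scale appearing in the anti-concentration conjecture for comparison; it only illustrates the quantity involved, not the sampling model or the hardness reduction.

```python
import itertools
import math
import numpy as np

def permanent_ryser(A):
    """Permanent of a square matrix via Ryser's inclusion-exclusion formula, O(2^n * n^2)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    total = 0.0
    for r in range(1, n + 1):
        for cols in itertools.combinations(range(n), r):
            total += (-1) ** r * np.prod(A[:, list(cols)].sum(axis=1))
    return (-1) ** n * total

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))                 # i.i.d. N(0,1) Gaussian entries
print("Per(A)                :", permanent_ryser(A))
print("sqrt(n!)/n (reference):", math.sqrt(math.factorial(n)) / n)  # anti-concentration scale
```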


Journal ArticleDOI
TL;DR: This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data‐fit of more complex model classes that extract more information from the data.
Abstract: Probability logic with Bayesian updating provides a rigorous framework to quantify modeling uncertainty and perform system identification. It uses probability as a multi-valued propositional logic for plausible reasoning where the probability of a model is a measure of its relative plausibility within a set of models. System identification is thus viewed as inference about plausible system models and not as a quixotic quest for the true model. Instead of using system data to estimate the model parameters, Bayes' Theorem is used to update the relative plausibility of each model in a model class, which is a set of input–output probability models for the system and a probability distribution over this set that expresses the initial plausibility of each model. Robust predictive analyses informed by the system data use the entire model class with the probabilistic predictions of each model being weighed by its posterior probability. Additional robustness to modeling uncertainty comes from combining the robust predictions of each model class in a set of candidates for the system, where each contribution is weighed by the posterior probability of the model class. This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data-fit of more complex model classes that extract more information from the data. Robust analyses involve integrals over parameter spaces that usually must be evaluated numerically by Laplace's method of asymptotic approximation or by Markov Chain Monte Carlo methods. An illustrative application is given using synthetic data corresponding to a structural health monitoring benchmark structure.

497 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the crossover behavior between the symmetric and asymmetric exclusion processes and obtain explicit formulas for the one-dimensional marginal distributions, which interpolate between a standard Gaussian distribution and the GUE Tracy-Widom distribution.
Abstract: We consider the solution of the stochastic heat equation \partial_T \mathcal{Z} = 1/2 \partial_X^2 \mathcal{Z} - \mathcal{Z} \dot{\mathscr{W}} with delta function initial condition \mathcal{Z} (T=0)= \delta_0 whose logarithm, with appropriate normalizations, is the free energy of the continuum directed polymer, or the solution of the Kardar-Parisi-Zhang equation with narrow wedge initial conditions. We obtain explicit formulas for the one-dimensional marginal distributions -- the {\it crossover distributions} -- which interpolate between a standard Gaussian distribution (small time) and the GUE Tracy-Widom distribution (large time). The proof is via a rigorous steepest descent analysis of the Tracy-Widom formula for the asymmetric simple exclusion with anti-shock initial data, which is shown to converge to the continuum equations in an appropriate weakly asymmetric limit. The limit also describes the crossover behaviour between the symmetric and asymmetric exclusion processes.

486 citations


Posted Content
TL;DR: It is proved that if the probability distribution F obeys a simple incoherence property and an isotropy property, one can faithfully recover approximately sparse signals from a minimal number of noisy measurements.
Abstract: This paper introduces a simple and very general theory of compressive sensing. In this theory, the sensing mechanism simply selects sensing vectors independently at random from a probability distribution F; it includes all models - e.g. Gaussian, frequency measurements - discussed in the literature, but also provides a framework for new measurement strategies as well. We prove that if the probability distribution F obeys a simple incoherence property and an isotropy property, one can faithfully recover approximately sparse signals from a minimal number of noisy measurements. The novelty is that our recovery results do not require the restricted isometry property (RIP) - they make use of a much weaker notion - or a random model for the signal. As an example, the paper shows that a signal with s nonzero entries can be faithfully recovered from about s log n Fourier coefficients that are contaminated with noise.

483 citations
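
A minimal recovery experiment in the spirit of this setting (not the paper's proof technique): draw random Gaussian sensing vectors, take noisy measurements of an s-sparse signal, and solve the l1-regularized least-squares problem by iterative soft thresholding (ISTA). The dimensions and regularization level below are illustrative choices.

```python
import numpy as np

def ista(A, y, lam, n_iter=500):
    """Iterative soft thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L              # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return x

rng = np.random.default_rng(1)
n, m, s = 400, 120, 8                              # ambient dim, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)

A = rng.standard_normal((m, n)) / np.sqrt(m)       # random Gaussian sensing vectors
y = A @ x_true + 0.01 * rng.standard_normal(m)     # noisy measurements

x_hat = ista(A, y, lam=0.02)
print("relative recovery error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```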


Journal ArticleDOI
TL;DR: This work presents novel quantum-behaved PSO (QPSO) approaches using mutation operator with Gaussian probability distribution employed in well-studied continuous optimization problems of engineering design and indicates that Gaussian QPSO approaches handle such problems efficiently in terms of precision and convergence.
Abstract: Particle swarm optimization (PSO) is a population-based swarm intelligence algorithm that shares many similarities with evolutionary computation techniques. However, the PSO is driven by the simulation of a social psychological metaphor motivated by collective behaviors of bird and other social organisms instead of the survival of the fittest individual. Inspired by the classical PSO method and quantum mechanics theories, this work presents novel quantum-behaved PSO (QPSO) approaches using mutation operator with Gaussian probability distribution. The application of Gaussian mutation operator instead of random sequences in QPSO is a powerful strategy to improve the QPSO performance in preventing premature convergence to local optima. In this paper, new combinations of QPSO and Gaussian probability distribution are employed in well-studied continuous optimization problems of engineering design. Two case studies are described and evaluated in this work. Our results indicate that Gaussian QPSO approaches handle such problems efficiently in terms of precision and convergence and, in most cases, they outperform the results presented in the literature.

405 citations
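
For orientation, a bare-bones sketch of a quantum-behaved PSO update with an occasional Gaussian mutation of the local attractor is given below; the placement of the Gaussian operator, the objective function and all parameter values are assumptions for illustration and do not reproduce the authors' G-QPSO variants or their engineering benchmarks.

```python
import numpy as np

def sphere(x):                                     # toy objective, assumed for illustration
    return np.sum(x ** 2)

def qpso_gaussian(f, dim=10, n_particles=30, n_iter=200, beta=0.75,
                  bounds=(-10.0, 10.0), mut_prob=0.1, seed=0):
    """Sketch of quantum-behaved PSO with a Gaussian mutation on the attractor point."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in pbest])
    gbest = pbest[np.argmin(pbest_val)].copy()

    for _ in range(n_iter):
        mbest = pbest.mean(axis=0)                 # mean of the personal best positions
        for i in range(n_particles):
            phi = rng.uniform(0, 1, dim)
            p = phi * pbest[i] + (1 - phi) * gbest # local attractor
            if rng.uniform() < mut_prob:           # Gaussian mutation (assumed form)
                p = p + rng.standard_normal(dim)
            u = rng.uniform(1e-12, 1, dim)
            sign = np.where(rng.uniform(0, 1, dim) < 0.5, -1.0, 1.0)
            x[i] = np.clip(p + sign * beta * np.abs(mbest - x[i]) * np.log(1.0 / u), lo, hi)
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i].copy(), val
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, f(gbest)

best_x, best_val = qpso_gaussian(sphere)
print("best objective found:", best_val)
```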


Journal ArticleDOI
TL;DR: In this paper, the adaptive group Lasso was used to select nonzero components in a nonparametric additive model of a conditional mean function, where the additive components are approximated by truncated series expansions with B-spline bases, and the problem of component selection becomes that of selecting the groups of coefficients in the expansion.
Abstract: We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is "small" relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method.

Journal ArticleDOI
TL;DR: A generalised two-filter smoothing formula is proposed which only requires approximating probability distributions and applies to any state–space model, removing the need to make restrictive assumptions used in previous approaches to this problem.
Abstract: Two-filter smoothing is a principled approach for performing optimal smoothing in non-linear non-Gaussian state-space models where the smoothing distributions are computed through the combination of 'forward' and 'backward' time filters. The 'forward' filter is the standard Bayesian filter but the 'backward' filter, generally referred to as the backward information filter, is not a probability measure on the space of the hidden Markov process. In cases where the backward information filter can be computed in closed form, this technical point is not important. However, for general state-space models where there is no closed form expression, this prohibits the use of flexible numerical techniques such as Sequential Monte Carlo (SMC) to approximate the two-filter smoothing formula. We propose here a generalised two-filter smoothing formula which only requires approximating probability distributions and applies to any state-space model, removing the need to make restrictive assumptions used in previous approaches to this problem. SMC algorithms are developed to implement this generalised recursion and we illustrate their performance on various problems.

Journal ArticleDOI
TL;DR: This paper develops a set of methods enabling an information-theoretic distributed control architecture to facilitate search by a mobile sensor network that captures effects in more general scenarios that are not possible with linearized methods.
Abstract: This paper develops a set of methods enabling an information-theoretic distributed control architecture to facilitate search by a mobile sensor network. Given a particular configuration of sensors, this technique exploits the structure of the probability distributions of the target state and of the sensor measurements to control the mobile sensors such that future observations minimize the expected future uncertainty of the target state. The mutual information between the sensors and the target state is computed using a particle filter representation of the posterior probability distribution, making it possible to directly use nonlinear and non-Gaussian target state and sensor models. To make the approach scalable to increasing network sizes, single-node and pairwise-node approximations to the mutual information are derived for general probability density models, with analytically bounded error. The pairwise-node approximation is proven to be a more accurate objective function than the single-node approximation. The mobile sensors are cooperatively controlled using a distributed optimization, yielding coordinated motion of the network. These methods are explored for various sensing modalities, including bearings-only sensing, range-only sensing, and magnetic field sensing, all with potential for search and rescue applications. For each sensing modality, the behavior of this non-parametric method is compared and contrasted with the results of linearized methods, and simulations are performed of a target search using the dynamics of actual vehicles. Monte Carlo results demonstrate that as network size increases, the sensors more quickly localize the target, and the pairwise-node approximation provides superior performance to the single-node approximation. The proposed methods are shown to produce similar results to linearized methods in particular scenarios, yet they capture effects in more general scenarios that are not possible with linearized methods.

Proceedings Article
11 Jul 2010
TL;DR: This work shows that the problem of probabilistic plan recognition can be solved efficiently using classical planners provided that the probability of a partially observed execution given a goal is defined in terms of the cost difference of achieving the goal under two conditions: complying with the observations, and not complying with them.
Abstract: Plan recognition is the problem of inferring the goals and plans of an agent after observing its behavior. Recently, it has been shown that this problem can be solved efficiently, without the need of a plan library, using slightly modified planning algorithms. In this work, we extend this approach to the more general problem of probabilistic plan recognition where a probability distribution over the set of goals is sought under the assumptions that actions have deterministic effects and both agent and observer have complete information about the initial state. We show that this problem can be solved efficiently using classical planners provided that the probability of a partially observed execution given a goal is defined in terms of the cost difference of achieving the goal under two conditions: complying with the observations, and not complying with them. This cost, and hence the posterior goal probabilities, are computed by means of two calls to a classical planner that no longer has to be modified in any way. A number of examples is considered to illustrate the quality, flexibility, and scalability of the approach.
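
The posterior computation itself is short once the planner calls have produced the two costs per goal. The sketch below uses a sigmoid-of-cost-difference likelihood and hypothetical costs; in the actual approach, each pair of costs comes from two calls to a classical planner, and the exact likelihood parameterization here is an assumption for illustration.

```python
import math

def goal_posterior(costs, prior=None, beta=1.0):
    """Posterior P(G | O) from cost differences.
    costs[g] = (c_with, c_against): optimal plan cost for goal g when the plan
    must comply with the observations O, and when it must not.
    Likelihood (sigmoid of the cost difference):
        P(O | G) = 1 / (1 + exp(beta * (c_with - c_against)))."""
    goals = list(costs)
    prior = prior or {g: 1.0 / len(goals) for g in goals}
    like = {g: 1.0 / (1.0 + math.exp(beta * (costs[g][0] - costs[g][1]))) for g in goals}
    z = sum(like[g] * prior[g] for g in goals)
    return {g: like[g] * prior[g] / z for g in goals}

# Hypothetical costs, standing in for two classical-planner calls per goal.
costs = {"goal_A": (10, 14),   # complying with O is cheaper: O supports goal_A
         "goal_B": (12, 12),   # observations are neutral for goal_B
         "goal_C": (15, 9)}    # complying with O is costly: O argues against goal_C
print(goal_posterior(costs))
```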

Journal ArticleDOI
TL;DR: In this article, a probabilistic approach for statistical modeling of the loads in distribution networks is presented, where the probability density functions (pdfs) of loads at different buses show a number of variations and cannot be represented by any specific distribution.
Abstract: This paper presents a probabilistic approach for statistical modeling of the loads in distribution networks. In a distribution network, the probability density functions (pdfs) of loads at different buses show a number of variations and cannot be represented by any specific distribution. The approach presented in this paper represents all the load pdfs through Gaussian mixture model (GMM). The expectation maximization (EM) algorithm is used to obtain the parameters of the mixture components. The performance of the method is demonstrated on a 95-bus generic distribution network model.
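
In practice this amounts to fitting a GMM by EM to each bus-load sample. A minimal sketch with scikit-learn on made-up load data (the paper uses a 95-bus generic network model) is shown below, with the number of mixture components chosen by a simple BIC scan rather than any criterion taken from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic bus-load sample (kW): two operating regimes, purely illustrative.
load = np.concatenate([rng.normal(40, 5, 700), rng.normal(75, 10, 300)]).reshape(-1, 1)

# Fit Gaussian mixture models by expectation maximization and keep the best BIC.
candidates = [GaussianMixture(n_components=k, random_state=0).fit(load) for k in range(1, 6)]
gmm = min(candidates, key=lambda m: m.bic(load))

print("components:", gmm.n_components)
print("weights:   ", gmm.weights_.round(3))
print("means:     ", gmm.means_.ravel().round(2))
print("std devs:  ", np.sqrt(gmm.covariances_).ravel().round(2))
```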

Journal ArticleDOI
TL;DR: This paper presented a parsimonious characterization of risk taking behavior by estimating a finite mixture model for three different experimental data sets, two Swiss and one Chinese, over a large number of real gains and losses.
Abstract: It has long been recognized that there is considerable heterogeneity in individual risk taking behavior, but little is known about the distribution of risk taking types. We present a parsimonious characterization of risk taking behavior by estimating a finite mixture model for three different experimental data sets, two Swiss and one Chinese, over a large number of real gains and losses. We find two major types of individuals: In all three data sets, the choices of roughly 80% of the subjects exhibit significant deviations from linear probability weighting of varying strength, consistent with prospect theory. Twenty percent of the subjects weight probabilities near linearly and behave essentially as expected value maximizers. Moreover, individuals are cleanly assigned to one type with probabilities close to unity. The reliability and robustness of our classification suggest using a mix of preference theories in applied economic modeling.

Journal ArticleDOI
TL;DR: This work shows that the category ID of D-posets of fuzzy sets and sequentially continuous D-homomorphisms allows one to characterize the passage from classical to fuzzy events as the minimal generalization having nontrivial quantum character.
Abstract: First, we discuss basic probability notions from the viewpoint of category theory. Our approach is based on the following four “sine quibus non” conditions: 1. (elementary) category theory is efficient (and suffices); 2. random variables, observables, probability measures, and states are morphisms; 3. classical probability theory and fuzzy probability theory in the sense of S. Gudder and S. Bugajski are special cases of a more general model; 4. a good model allows natural modifications. Second, we show that the category ID of D-posets of fuzzy sets and sequentially continuous D-homomorphisms allows one to characterize the passage from classical to fuzzy events as the minimal generalization having nontrivial quantum character: a degenerated state can be transported to a nondegenerated one. Third, we describe a general model of probability theory based on the category ID so that the classical and fuzzy probability theories become special cases and the model allows natural modifications. Finally, we present a modification in which the closed unit interval [0,1] as the domain of traditional states is replaced by a suitable simplex.

Journal ArticleDOI
TL;DR: This paper aims to design a linear full-order filter such that the estimation error converges to zero exponentially in the mean square while the disturbance rejection attenuation is constrained to a given level by means of the H∞ performance index.
Abstract: In this paper, the robust H∞ filtering problem is studied for a class of uncertain nonlinear networked systems with both multiple stochastic time-varying communication delays and multiple packet dropouts. A sequence of random variables, all of which are mutually independent but obey the Bernoulli distribution, are introduced to account for the randomly occurring communication delays. The packet dropout phenomenon occurs in a random way and the occurrence probability for each sensor is governed by an individual random variable satisfying a certain probabilistic distribution in the interval. The discrete-time system under consideration is also subject to parameter uncertainties, state-dependent stochastic disturbances and sector-bounded nonlinearities. We aim to design a linear full-order filter such that the estimation error converges to zero exponentially in the mean square while the disturbance rejection attenuation is constrained to a given level by means of the H∞ performance index. Intensive stochastic analysis is carried out to obtain sufficient conditions for ensuring the exponential stability as well as prescribed H∞ performance for the overall filtering error dynamics, in the presence of random delays, random dropouts, nonlinearities, and the parameter uncertainties. These conditions are characterized in terms of the feasibility of a set of linear matrix inequalities (LMIs), and then the explicit expression is given for the desired filter parameters. Simulation results are employed to demonstrate the effectiveness of the proposed filter design technique in this paper.

Journal ArticleDOI
TL;DR: Three modern research arenas in animal movement modelling imply more detail in the movement pattern than general models of movement can accommodate, but realistic empirical evaluation of their predictions requires dense locational data, both in time and space, only available with modern GPS telemetry.
Abstract: Modern animal movement modelling derives from two traditions. Lagrangian models, based on random walk behaviour, are useful for multi-step trajectories of single animals. Continuous Eulerian models describe expected behaviour, averaged over stochastic realizations, and are usefully applied to ensembles of individuals. We illustrate three modern research arenas. (i) Models of home-range formation describe the process of an animal ‘settling down’, accomplished by including one or more focal points that attract the animal's movements. (ii) Memory-based models are used to predict how accumulated experience translates into biased movement choices, employing reinforced random walk behaviour, with previous visitation increasing or decreasing the probability of repetition. (iii) Lévy movement involves a step-length distribution that is over-dispersed, relative to standard probability distributions, and adaptive in exploring new environments or searching for rare targets. Each of these modelling arenas implies more detail in the movement pattern than general models of movement can accommodate, but realistic empirical evaluation of their predictions requires dense locational data, both in time and space, only available with modern GPS telemetry.
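
To see what an over-dispersed step-length distribution means in practice, the short simulation below contrasts a walk with Gaussian step lengths against one with heavy-tailed (Pareto) steps; headings are uniform and all parameters are illustrative, so this is only a caricature of the Lévy-movement arena discussed above.

```python
import numpy as np

def walk(n_steps, step_lengths, seed=0):
    """2-D random walk with uniformly random headings and prescribed step lengths."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, n_steps)
    steps = np.column_stack([step_lengths * np.cos(theta), step_lengths * np.sin(theta)])
    return np.cumsum(steps, axis=0)

rng = np.random.default_rng(1)
n = 5000

gauss_steps = np.abs(rng.normal(0, 1, n))       # Brownian-like: light-tailed step lengths
levy_steps = 1 + rng.pareto(1.5, n)             # Levy-like: over-dispersed, heavy-tailed steps

for name, s in [("gaussian", gauss_steps), ("levy", levy_steps)]:
    path = walk(n, s)
    print(name, "net displacement:", np.linalg.norm(path[-1]).round(1),
          "largest single step:", s.max().round(1))
```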

Proceedings ArticleDOI
Allan Sly
23 Oct 2010
TL;DR: In this paper, it was shown that unless NP$=$RP there is no polynomial time approximation scheme for the partition function (the sum of such weighted independent sets) on graphs of maximum degree $d$ for fugacity parameters $\lambda$ in an interval just above the uniqueness threshold $\lambda_c(d)$.
Abstract: The hardcore model is a model of lattice gas systems which has received much attention in statistical physics, probability theory and theoretical computer science. It is the probability distribution over independent sets $I$ of a graph weighted proportionally to $\lambda^{|I|}$ with fugacity parameter $\lambda$. We prove that at the uniqueness threshold of the hardcore model on the $d$-regular tree, approximating the partition function becomes computationally hard on graphs of maximum degree $d$. Specifically, we show that unless NP$=$RP there is no polynomial time approximation scheme for the partition function (the sum of such weighted independent sets) on graphs of maximum degree $d$ for fugacity $\lambda_c(d) < \lambda < \lambda_c(d) + \epsilon(d)$ for some $\epsilon(d) > 0$. Weitz produced an FPTAS for approximating the partition function when $0 < \lambda < \lambda_c(d)$, so this result demonstrates that the computational transition coincides with the statistical physics phase transition at the uniqueness threshold.
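
Two concrete pieces of this setup are easy to write down: the tree uniqueness threshold, lambda_c(d) = (d-1)^(d-1)/(d-2)^d, and Glauber dynamics, which samples from the hardcore distribution on a small graph. The sketch below does both on the 3-dimensional hypercube; it illustrates the distribution itself, not the hardness reduction, and the graph and run length are arbitrary choices.

```python
import numpy as np

def lambda_c(d):
    """Uniqueness threshold of the hardcore model on the d-regular tree (d >= 3)."""
    return (d - 1) ** (d - 1) / (d - 2) ** d

def glauber_hardcore(adj, lam, n_steps, seed=0):
    """Glauber dynamics for the hardcore model: independent sets I weighted by lam^|I|."""
    rng = np.random.default_rng(seed)
    n = len(adj)
    occupied = np.zeros(n, dtype=bool)
    sizes = []
    for _ in range(n_steps):
        v = rng.integers(n)
        occupied[v] = False
        if not any(occupied[u] for u in adj[v]):       # v may be occupied only if no neighbour is
            occupied[v] = rng.random() < lam / (1.0 + lam)
        sizes.append(occupied.sum())
    return np.mean(sizes)

# 3-regular example: the 3-dimensional hypercube graph (8 vertices, neighbours via bit flips).
adj = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}
print("lambda_c(3) =", lambda_c(3))
print("mean |I| just below threshold:", glauber_hardcore(adj, 0.9 * lambda_c(3), 100000))
```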

Journal ArticleDOI
TL;DR: In this article, the authors proposed a beta-transformed linear opinion pool for the aggregation of probability forecasts from distinct, calibrated or uncalibrated sources, which fits an optimal non-linearly recalibrated forecast combination.
Abstract: Summary: Linear pooling is by far the most popular method for combining probability forecasts. However, any non-trivial weighted average of two or more distinct, calibrated probability forecasts is necessarily uncalibrated and lacks sharpness. In view of this, linear pooling requires recalibration, even in the ideal case in which the individual forecasts are calibrated. Towards this end, we propose a beta-transformed linear opinion pool for the aggregation of probability forecasts from distinct, calibrated or uncalibrated sources. The method fits an optimal non-linearly recalibrated forecast combination, by compositing a beta transform and the traditional linear opinion pool. The technique is illustrated in a simulation example and in a case-study on statistical and National Weather Service probability of precipitation forecasts.
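
The combination rule is short: take a weighted linear pool of the forecasts and push it through a beta cumulative distribution function whose parameters are fitted along with the weights. The sketch below (SciPy, toy binary-event data, a plain maximum-likelihood fit via Nelder-Mead) is a simplified stand-in for the estimation procedure in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta

def blp(forecasts, w, a, b):
    """Beta-transformed linear opinion pool: beta CDF applied to the linear pool."""
    pooled = np.clip(forecasts @ w, 1e-12, 1 - 1e-12)
    return beta.cdf(pooled, a, b)

def fit_blp(forecasts, y):
    """Fit pool weights and beta parameters by maximizing the Bernoulli log likelihood."""
    k = forecasts.shape[1]
    def nll(theta):
        w = np.exp(theta[:k]); w = w / w.sum()          # weights on the simplex
        a, b = np.exp(theta[k:])                        # positive beta parameters
        p = np.clip(blp(forecasts, w, a, b), 1e-12, 1 - 1e-12)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    res = minimize(nll, x0=np.zeros(k + 2), method="Nelder-Mead")
    w = np.exp(res.x[:k]); w = w / w.sum()
    return w, np.exp(res.x[k:])

# Toy example: two correlated probability forecasts of a binary event (made-up data).
rng = np.random.default_rng(0)
truth = rng.uniform(size=2000)
y = (rng.uniform(size=2000) < truth).astype(float)
f = np.column_stack([np.clip(truth + rng.normal(0, 0.05, 2000), 0.01, 0.99),
                     np.clip(truth + rng.normal(0, 0.10, 2000), 0.01, 0.99)])
w, (a, b) = fit_blp(f, y)
print("weights:", w.round(2), " beta parameters:", a.round(2), b.round(2))
```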

Journal ArticleDOI
TL;DR: The cone-complementarity-linearization procedure is employed to cast the controller-design problem into a sequential minimization one that is solved by the semi-definite program method.
Abstract: In this paper, the robust H∞-control problem is investigated for a class of uncertain discrete-time fuzzy systems with both multiple probabilistic delays and multiple missing measurements. A sequence of random variables, all of which are mutually independent but obey the Bernoulli distribution, is introduced to account for the probabilistic communication delays. The measurement-missing phenomenon occurs in a random way. The missing probability for each sensor satisfies a certain probabilistic distribution in the interval. Here, the attention is focused on the analysis and design of H∞ fuzzy output-feedback controllers such that the closed-loop Takagi-Sugeno (T-S) fuzzy-control system is exponentially stable in the mean square. The disturbance-rejection attenuation is constrained to a given level by means of the H∞-performance index. Intensive analysis is carried out to obtain sufficient conditions for the existence of admissible output feedback controllers, which ensures the exponential stability as well as the prescribed H∞ performance. The cone-complementarity-linearization procedure is employed to cast the controller-design problem into a sequential minimization one that is solved by the semi-definite program method. Simulation results are utilized to demonstrate the effectiveness of the proposed design technique in this paper.

Posted Content
TL;DR: In this article, the optimal sample size for all distributions with finite fourth moment was shown to be O(n) up to an iterated logarithmic factor, where n is the dimension of the ambient space.
Abstract: Given a probability distribution in R^n with general (non-white) covariance, a classical estimator of the covariance matrix is the sample covariance matrix obtained from a sample of N independent points. What is the optimal sample size N = N(n) that guarantees estimation with a fixed accuracy in the operator norm? Suppose the distribution is supported in a centered Euclidean ball of radius \sqrt{n}. We conjecture that the optimal sample size is N = O(n) for all distributions with finite fourth moment, and we prove this up to an iterated logarithmic factor. This problem is motivated by the optimal theorem of Rudelson which states that N = O(n \log n) for distributions with finite second moment, and a recent result of Adamczak, Litvak, Pajor and Tomczak-Jaegermann which guarantees that N = O(n) for sub-exponential distributions.
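
An empirical version of the question is easy to run: draw N points in dimension n from a heavy-ish-tailed distribution with identity covariance and finite fourth moment, and watch the operator-norm error of the sample covariance as N grows past n. The Student-t coordinates and constants below are arbitrary illustrative choices, not the distributions analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100                                        # dimension

def cov_error(N, df=5.0):
    """Relative operator-norm error of the sample covariance for N i.i.d. points with
    independent Student-t(df) coordinates (finite fourth moment), scaled to identity covariance."""
    scale = np.sqrt(df / (df - 2.0))           # Var of t(df) is df / (df - 2)
    X = rng.standard_t(df, size=(N, n)) / scale
    sample_cov = X.T @ X / N
    return np.linalg.norm(sample_cov - np.eye(n), 2)

for mult in (1, 2, 4, 8, 16):
    print(f"N = {mult:2d} n : ||Sigma_hat - Sigma||_op = {cov_error(mult * n):.3f}")
```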

Posted Content
TL;DR: It is shown that parameters of a Gaussian mixture distribution with fixed number of components can be learned using a sample whose size is polynomial in dimension and all other parameters.
Abstract: The question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in theoretical computer science and machine learning. However, despite major progress, the general question of polynomial learnability of Gaussian mixture distributions still remained open. The current work resolves the question of polynomial learnability for Gaussian mixtures in high dimension with an arbitrary fixed number of components. The result on learning Gaussian mixtures relies on an analysis of distributions belonging to what we call "polynomial families" in low dimension. These families are characterized by their moments being polynomial in parameters and include almost all common probability distributions as well as their mixtures and products. Using tools from real algebraic geometry, we show that parameters of any distribution belonging to such a family can be learned in polynomial time and using a polynomial number of sample points. The result on learning polynomial families is quite general and is of independent interest. To estimate parameters of a Gaussian mixture distribution in high dimensions, we provide a deterministic algorithm for dimensionality reduction. This allows us to reduce learning a high-dimensional mixture to a polynomial number of parameter estimations in low dimension. Combining this reduction with the results on polynomial families yields our result on learning arbitrary Gaussian mixtures in high dimensions.

Proceedings ArticleDOI
01 Nov 2010
TL;DR: The findings show that the mixing time of social graphs is much larger than anticipated and than the values assumed in the literature, which implies that either the current security systems based on fast mixing have weaker utility guarantees or they have to be less efficient, with fewer security guarantees, in order to compensate for the slower mixing.
Abstract: Social networks provide interesting algorithmic properties that can be used to bootstrap the security of distributed systems. For example, it is widely believed that social networks are fast mixing, and many recently proposed designs of such systems make crucial use of this property. However, whether real-world social networks are really fast mixing has not previously been verified, and this could potentially affect the performance of systems that rely on the fast mixing property. To address this problem, we measure the mixing time of several social graphs, the time that it takes a random walk on the graph to approach the stationary distribution of that graph, using two techniques. First, we use the second largest eigenvalue modulus, which bounds the mixing time. Second, we sample initial distributions and compute the random walk length required to achieve probability distributions close to the stationary distribution. Our findings show that the mixing time of social graphs is much larger than anticipated and than the values assumed in the literature, which implies that either the current security systems based on fast mixing have weaker utility guarantees or they have to be less efficient, with fewer security guarantees, in order to compensate for the slower mixing.
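
Both measurement techniques can be demonstrated on a toy graph: bound the mixing time through the second largest eigenvalue modulus of the lazy random-walk transition matrix, and compare with the walk length actually needed to come within a fixed total-variation distance of the stationary distribution. The two-cluster graph below is only a small stand-in for a real social graph, and the lazy walk and epsilon value are assumed conventions.

```python
import numpy as np

def transition_matrix(A):
    """Lazy random-walk transition matrix P = (I + D^{-1} A) / 2 of an undirected graph."""
    return 0.5 * (np.eye(len(A)) + A / A.sum(axis=1)[:, None])

def slem_bound(P, pi, eps=0.25):
    """Upper bound on the mixing time from the second largest eigenvalue modulus."""
    mu = np.sort(np.abs(np.linalg.eigvals(P)))[-2]
    return np.log(1.0 / (eps * pi.min())) / (1.0 - mu)

def empirical_mixing_time(P, pi, eps=0.25):
    """Walk length needed for every point-mass start to get within eps in total variation."""
    dist, t = np.eye(len(P)), 0                      # one row per starting vertex
    while 0.5 * np.abs(dist - pi).sum(axis=1).max() > eps:
        dist, t = dist @ P, t + 1
    return t

# Toy graph: two dense clusters joined by a single edge, so slow mixing is expected.
n = 20
A = np.zeros((n, n))
A[:10, :10] = 1; A[10:, 10:] = 1
np.fill_diagonal(A, 0)
A[9, 10] = A[10, 9] = 1

P = transition_matrix(A)
pi = A.sum(axis=1) / A.sum()                         # stationary distribution ~ degrees
print("SLEM bound on mixing time:", round(slem_bound(P, pi), 1))
print("empirical mixing time:    ", empirical_mixing_time(P, pi))
```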

Journal ArticleDOI
TL;DR: In this paper, a new analytical approach for the derivation of fragility curves for masonry buildings is proposed, based on nonlinear stochastic analyses of building prototypes, where the mechanical properties of the prototypes are considered as random variables, assumed to vary within appropriate ranges of values.

Journal ArticleDOI
TL;DR: Insight is gained into the decision-making strategy used by human observers in a low-level perceptual task to examine which of three plausible strategies could account for each observer's behavior the best.
Abstract: The question of which strategy is employed in human decision making has been studied extensively in the context of cognitive tasks; however, this question has not been investigated systematically in the context of perceptual tasks. The goal of this study was to gain insight into the decision-making strategy used by human observers in a low-level perceptual task. Data from more than 100 individuals who participated in an auditory-visual spatial localization task was evaluated to examine which of three plausible strategies could account for each observer's behavior the best. This task is very suitable for exploring this question because it involves an implicit inference about whether the auditory and visual stimuli were caused by the same object or independent objects, and provides different strategies of how using the inference about causes can lead to distinctly different spatial estimates and response patterns. For example, employing the commonly used cost function of minimizing the mean squared error of spatial estimates would result in a weighted averaging of estimates corresponding to different causal structures. A strategy that would minimize the error in the inferred causal structure would result in the selection of the most likely causal structure and sticking with it in the subsequent inference of location ("model selection"). A third strategy is one that selects a causal structure in proportion to its probability, thus attempting to match the probability of the inferred causal structure. This type of probability matching strategy has been reported to be used by participants predominantly in cognitive tasks. Comparing these three strategies, the behavior of the vast majority of observers in this perceptual task was most consistent with probability matching. While this appears to be a suboptimal strategy and hence a surprising choice for the perceptual system to adopt, we discuss potential advantages of such a strategy for perception.

Journal ArticleDOI
Emmanuel Vazquez, Julien Bect
TL;DR: The first result is that under some mild hypotheses on the covariance function k of the Gaussian process, the expected improvement algorithm produces a dense sequence of evaluation points in the search domain, when the function to be optimized is in the reproducing kernel Hilbert space generated by k.
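
For reference, the expected-improvement acquisition function itself has the standard closed form EI(x) = (f_min - mu(x)) * Phi(z) + sigma(x) * phi(z) with z = (f_min - mu(x)) / sigma(x) for minimization; the convergence result above concerns the sequence of points this criterion selects. A small sketch (SciPy, with assumed posterior values) follows.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """Closed-form expected improvement for minimization, given the Gaussian-process
    posterior mean mu and standard deviation sigma at candidate points."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (f_best - mu) / sigma
        ei = (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0, ei, 0.0)        # no improvement possible where sigma = 0

# Hypothetical posterior means/standard deviations at three candidates; best value so far 0.5.
print(expected_improvement(mu=[0.4, 0.6, 1.0], sigma=[0.2, 0.3, 0.0], f_best=0.5))
```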

Journal ArticleDOI
TL;DR: In this article, a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior).
Abstract: Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementation of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision—other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts—so long as the measurements have been communicated as a likelihood function or a posterior sampling.

Journal ArticleDOI
TL;DR: A comprehensive software package for two- and three-dimensional stochastic rock fracture simulation using marked point processes is presented, and a case study in rock fracture modelling is provided to demonstrate the application of the software.