
Showing papers on "Probability distribution published in 2011"


01 Jan 2011
TL;DR: In this paper, a polynomial dimensional decomposition (PDD) method for global sensitivity analysis of stochastic systems subject to independent random input following arbitrary probability distributions is presented.
Abstract: This paper presents a polynomial dimensional decomposition (PDD) method for global sensitivity analysis of stochastic systems subject to independent random input following arbitrary probability distributions. The method involves Fourier-polynomial expansions of lower-variate component functions of a stochastic response by measure-consistent orthonormal polynomial bases, analytical formulae for calculating the global sensitivity indices in terms of the expansion coefficients, and dimension-reduction integration for estimating the expansion coefficients. Due to identical dimensional structures of PDD and analysis-of-variance decomposition, the proposed method facilitates simple and direct calculation of the global sensitivity indices. Numerical results of the global sensitivity indices computed for smooth systems reveal significantly higher convergence rates of the PDD approximation than those from existing methods, including polynomial chaos expansion, random balance design, state-dependent parameter, improved Sobol’s method, and sampling-based methods. However, for non-smooth functions, the convergence properties of the PDD solution deteriorate to a great extent, warranting further improvements. The computational complexity of the PDD method is polynomial, as opposed to exponential, thereby alleviating the curse of dimensionality to some extent. Mathematical modeling of complex systems often requires sensitivity analysis to determine how an output variable of interest is influenced by individual or subsets of input variables. A traditional local sensitivity analysis entails gradients or derivatives, often invoked in design optimization, describing changes in the model response due to the local variation of input. Depending on the model output, obtaining gradients or derivatives, if they exist, can be simple or difficult. In contrast, a global sensitivity analysis (GSA), increasingly becoming mainstream, characterizes how the global variation of input, due to its uncertainty, impacts the overall uncertain behavior of the model. In other words, GSA constitutes the study of how the output uncertainty from a mathematical model is divvied up, qualitatively or quantitatively, to distinct sources of input variation in the model [1].
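As a point of reference for the sampling-based methods the abstract compares against (not the PDD method itself), the sketch below estimates first-order Sobol' indices S_i = Var(E[Y|X_i])/Var(Y) by plain pick-freeze Monte Carlo for the Ishigami test function; the test function and sample sizes are illustrative choices, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

def ishigami(x, a=7.0, b=0.1):
    # Common GSA benchmark with three independent inputs, each uniform on [-pi, pi].
    return np.sin(x[:, 0]) + a * np.sin(x[:, 1]) ** 2 + b * x[:, 2] ** 4 * np.sin(x[:, 0])

def first_order_sobol(f, dim, n=200_000):
    # Pick-freeze (Saltelli-type) Monte Carlo estimator of S_i = Var(E[Y|X_i]) / Var(Y).
    A = rng.uniform(-np.pi, np.pi, size=(n, dim))
    B = rng.uniform(-np.pi, np.pi, size=(n, dim))
    yA, yB = f(A), f(B)
    var_y = np.var(np.concatenate([yA, yB]))
    S = np.empty(dim)
    for i in range(dim):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                      # swap only the i-th input column
        S[i] = np.mean(yB * (f(ABi) - yA)) / var_y
    return S

print(first_order_sobol(ishigami, dim=3))        # roughly [0.31, 0.44, 0.00]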

1,296 citations


Proceedings ArticleDOI
06 Jun 2011
TL;DR: In this paper, it was shown that an efficient exact classical simulation of linear optics would collapse the polynomial hierarchy to the third level, and that, under two plausible conjectures, even an approximate or noisy classical simulation would already imply such a collapse.
Abstract: We give new evidence that quantum computers -- moreover, rudimentary quantum computers built entirely out of linear-optical elements -- cannot be efficiently simulated by classical computers. In particular, we define a model of computation in which identical photons are generated, sent through a linear-optical network, then nonadaptively measured to count the number of photons in each mode. This model is not known or believed to be universal for quantum computation, and indeed, we discuss the prospects for realizing the model using current technology. On the other hand, we prove that the model is able to solve sampling problems and search problems that are classically intractable under plausible assumptions. Our first result says that, if there exists a polynomial-time classical algorithm that samples from the same probability distribution as a linear-optical network, then P^#P = BPP^NP, and hence the polynomial hierarchy collapses to the third level. Unfortunately, this result assumes an extremely accurate simulation. Our main result suggests that even an approximate or noisy classical simulation would already imply a collapse of the polynomial hierarchy. For this, we need two unproven conjectures: the Permanent-of-Gaussians Conjecture, which says that it is #P-hard to approximate the permanent of a matrix A of independent N(0,1) Gaussian entries, with high probability over A; and the Permanent Anti-Concentration Conjecture, which says that |Per(A)| ≥ √(n!)/poly(n) with high probability over A. We present evidence for these conjectures, both of which seem interesting even apart from our application. This paper does not assume knowledge of quantum optics. Indeed, part of its goal is to develop the beautiful theory of noninteracting bosons underlying our model, and its connection to the permanent function, in a self-contained way accessible to theoretical computer scientists.
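For orientation only: in this model the output probabilities are proportional to |Per(A)|² for n x n submatrices A of the network's unitary, which is why both conjectures concern permanents of Gaussian matrices. The sketch below simply evaluates a permanent with Ryser's O(2^n·n) inclusion-exclusion formula on a tiny matrix; it illustrates the quantity involved, not the paper's hardness argument.

import itertools
import numpy as np

def permanent(A):
    # Ryser's formula: Per(A) = (-1)^n * sum over nonempty column subsets S of
    # (-1)^|S| * prod_i (sum_{j in S} A[i, j]).
    A = np.asarray(A)
    n = A.shape[0]
    total = 0.0
    for r in range(1, n + 1):
        for cols in itertools.combinations(range(n), r):
            total += (-1) ** r * np.prod(A[:, cols].sum(axis=1))
    return (-1) ** n * total

print(permanent([[1, 2], [3, 4]]))     # 1*4 + 2*3 = 10

# A small matrix of complex Gaussian entries, the kind appearing in the conjectures.
rng = np.random.default_rng(0)
G = (rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))) / np.sqrt(2)
print(abs(permanent(G)))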

606 citations


Journal ArticleDOI
TL;DR: In this paper, the authors considered the solution of the stochastic heat equation ∂_T Z = (1/2) ∂_X^2 Z − Z Ẇ with delta function initial condition Z(T=0, X) = δ_{X=0} and obtained explicit formulas for the one-dimensional marginal distributions, the crossover distributions, which interpolate between a standard Gaussian distribution (small time) and the GUE Tracy-Widom distribution (large time).
Abstract: We consider the solution of the stochastic heat equation ∂_T Z = (1/2) ∂_X^2 Z − Z Ẇ with delta function initial condition Z(T=0, X) = δ_{X=0}, whose logarithm, with appropriate normalization, is the free energy of the continuum directed polymer, or the Hopf-Cole solution of the Kardar-Parisi-Zhang equation with narrow wedge initial conditions. We obtain explicit formulas for the one-dimensional marginal distributions, the crossover distributions, which interpolate between a standard Gaussian distribution (small time) and the GUE Tracy-Widom distribution (large time). The proof is via a rigorous steepest-descent analysis of the Tracy-Widom formula for the asymmetric simple exclusion process with antishock initial data, which is shown to converge to the continuum equations in an appropriate weakly asymmetric limit. The limit also describes the crossover behavior between the symmetric and asymmetric exclusion processes. © 2010 Wiley Periodicals, Inc.

601 citations


Book ChapterDOI
29 May 2011
TL;DR: A new definition of the averaging of discrete probability distributions as a barycenter over the Monge-Kantorovich optimal transport space is proposed and a new fast gradient descent algorithm is introduced to compute Wasserstein barycenters of point clouds.
Abstract: This paper proposes a new definition of the averaging of discrete probability distributions as a barycenter over the Monge-Kantorovich optimal transport space. To overcome the time complexity involved in numerically solving such a problem, the original Wasserstein metric is replaced by a sliced approximation over 1D distributions. This enables us to introduce a new fast gradient descent algorithm to compute Wasserstein barycenters of point clouds. This new notion of barycenter of probabilities is likely to find applications in computer vision, where one wants to average features defined as distributions. We show an application to texture synthesis and mixing, where a texture is characterized by the distribution of the response to a multi-scale oriented filter bank. This leads to a simple way to navigate over a convex domain of color textures.
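A minimal sketch of the sliced idea as I read it (not the authors' code): to average point clouds, repeatedly project everything onto a random 1D direction, where optimal transport reduces to sorting, and nudge the barycenter points toward the weighted mean of their 1D matches. Cloud sizes, step size and iteration count below are arbitrary choices.

import numpy as np

def sliced_barycenter(clouds, weights, n_iter=500, step=1.0, seed=0):
    # All clouds must have the same number of points.
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()
    Y = clouds[0].astype(float)                   # initialize at the first cloud
    n, d = Y.shape
    for _ in range(n_iter):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)            # random unit direction
        proj_Y = Y @ theta
        order_Y = np.argsort(proj_Y)
        displacement = np.zeros(n)
        for w, X in zip(weights, clouds):
            proj_X = np.sort(X @ theta)           # 1D optimal transport = sorting
            displacement[order_Y] += w * (proj_X - proj_Y[order_Y])
        Y += step * displacement[:, None] * theta # move each point along theta
    return Y

rng = np.random.default_rng(1)
A = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(400, 2))
B = rng.normal(loc=[+3.0, 0.0], scale=0.5, size=(400, 2))
print(sliced_barycenter([A, B], [0.5, 0.5]).mean(axis=0))   # close to [0, 0]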

575 citations


Journal ArticleDOI
TL;DR: In this article, the authors introduced a simple and very general theory of compressive sensing, in which the sensing mechanism simply selects sensing vectors independently at random from a probability distribution F; it includes all standard models-e.g., Gaussian, frequency measurements-discussed in the literature, but also provides a framework for new measurement strategies as well.
Abstract: This paper introduces a simple and very general theory of compressive sensing. In this theory, the sensing mechanism simply selects sensing vectors independently at random from a probability distribution F; it includes all standard models-e.g., Gaussian, frequency measurements-discussed in the literature, but also provides a framework for new measurement strategies as well. We prove that if the probability distribution F obeys a simple incoherence property and an isotropy property, one can faithfully recover approximately sparse signals from a minimal number of noisy measurements. The novelty is that our recovery results do not require the restricted isometry property (RIP) to hold near the sparsity level in question, nor a random model for the signal. As an example, the paper shows that a signal with s nonzero entries can be faithfully recovered from about s logn Fourier coefficients that are contaminated with noise.
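A generic illustration of the sensing setup the theory covers, with sensing vectors drawn i.i.d. from a distribution F (here standard Gaussian rather than the Fourier example) and a basic l1 solver (iterative soft thresholding). Dimensions, sparsity level and the regularization weight are arbitrary; this is not the paper's analysis.

import numpy as np

rng = np.random.default_rng(0)
n, m, s = 400, 120, 8                        # ambient dim, measurements, sparsity

x_true = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x_true[support] = rng.normal(size=s)

A = rng.normal(size=(m, n)) / np.sqrt(m)     # rows = random sensing vectors
y = A @ x_true + 0.01 * rng.normal(size=m)   # noisy measurements

def ista(A, y, lam=0.02, n_iter=2000):
    # Iterative soft thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1.
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - y)
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return x

x_hat = ista(A, y)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # relative reconstruction error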

520 citations


Journal ArticleDOI
TL;DR: A detailed derivation of this distribution is given, and its use as a prior is illustrated in an infinite latent feature model and in probabilistic models, such as bipartite graphs, in which the size of at least one class of nodes is unknown.
Abstract: The Indian buffet process is a stochastic process defining a probability distribution over equivalence classes of sparse binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior in probabilistic models that represent objects using a potentially infinite array of features, or that involve bipartite graphs in which the size of at least one class of nodes is unknown. We give a detailed derivation of this distribution, and illustrate its use as a prior in an infinite latent feature model. We then review recent applications of the Indian buffet process in machine learning, discuss its extensions, and summarize its connections to other stochastic processes.
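A short sketch of the "Indian buffet" generative process this distribution comes from, under the usual one-parameter formulation: customer i takes each previously sampled dish k with probability m_k/i (m_k = number of previous customers who took it) and then tries Poisson(alpha/i) new dishes, yielding a binary matrix with finitely many rows and an unbounded number of columns.

import numpy as np

def sample_ibp(n_customers, alpha, rng=np.random.default_rng(0)):
    dishes = []                                   # dishes[k] = list of customers who took dish k
    for i in range(1, n_customers + 1):
        for k, eaters in enumerate(dishes):
            if rng.random() < len(eaters) / i:    # popularity-biased reuse of old dishes
                eaters.append(i)
        for _ in range(rng.poisson(alpha / i)):   # brand-new dishes
            dishes.append([i])
    Z = np.zeros((n_customers, len(dishes)), dtype=int)
    for k, eaters in enumerate(dishes):
        for i in eaters:
            Z[i - 1, k] = 1
    return Z

Z = sample_ibp(n_customers=10, alpha=2.0)
print(Z.shape)        # (10, K) with K random; E[K] = alpha * (1 + 1/2 + ... + 1/10)
print(Z)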

428 citations


Journal ArticleDOI
TL;DR: The class post-IQP of languages decided with bounded error by uniform families of IQP circuits with post-selection is introduced, and it is first proved that post-IQP equals the classical class PP, and then that if the output distributions of uniform IQP circuit families could be classically efficiently sampled, then the infinite tower of classical complexity classes known as the polynomial hierarchy would collapse to its third level.
Abstract: We consider quantum computations comprising only commuting gates, known as IQP computations, and provide compelling evidence that the task of sampling their output probability distributions is unli...

417 citations


Journal ArticleDOI
TL;DR: In this paper, Sousbie presents DisPerSE, a novel approach to the coherent multiscale identification of all types of astrophysical structures, in particular the filaments, in the large-scale distribution of the matter in the Universe.
Abstract: We present DisPerSE, a novel approach to the coherent multiscale identification of all types of astrophysical structures, in particular the filaments, in the large-scale distribution of the matter in the Universe. This method and the corresponding piece of software allows for a genuinely scale-free and parameter-free identification of the voids, walls, filaments, clusters and their configuration within the cosmic web, directly from the discrete distribution of particles in N-body simulations or galaxies in sparse observational catalogues. To achieve that goal, the method works directly over the Delaunay tessellation of the discrete sample and uses the Delaunay tessellation field estimator density computed at each tracer particle; no further sampling, smoothing or processing of the density field is required. The idea is based on recent advances in distinct subdomains of the computational topology, namely the discrete Morse theory which allows for a rigorous application of topological principles to astrophysical data sets, and the theory of persistence, which allows us to consistently account for the intrinsic uncertainty and Poisson noise within data sets. Practically, the user can define a given persistence level in terms of robustness with respect to noise (defined as a ‘number of σ’) and the algorithm returns the structures with the corresponding significance as sets of critical points, lines, surfaces and volumes corresponding to the clusters, filaments, walls and voids – filaments, connected at cluster nodes, crawling along the edges of walls bounding the voids. From a geometrical point of view, the method is also interesting as it allows for a robust quantification of the topological properties of a discrete distribution in terms of Betti numbers or Euler characteristics, without having to resort to smoothing or having to define a particular scale. In this paper, we introduce the necessary mathematical background and describe the method and implementation, while we address the application to 3D simulated and observed data sets in the companion paper (Sousbie, Pichon & Kawahara, Paper II).

408 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the properties of weighted linear combinations of n prediction models, or linear pools, evaluated using the conventional log predictive scoring rule, and derive several interesting formal results: for example, a prediction model with positive weight in a pool may have zero weight if some other models are deleted from that pool.
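To make the objects concrete, here is a toy two-model linear pool evaluated by the conventional log predictive score, with the pool density w*f1 + (1-w)*f2 scored at the realized outcomes; the data-generating process and the candidate models are invented for illustration and are not from the paper.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.standard_t(df=5, size=500)          # realized outcomes (fat-tailed)

f1 = stats.norm(loc=0.0, scale=1.0).pdf     # prediction model 1: N(0, 1)
f2 = stats.t(df=5).pdf                      # prediction model 2: t(5)

def log_score(w):
    # Log predictive score of the pool density w*f1 + (1-w)*f2.
    return np.sum(np.log(w * f1(y) + (1 - w) * f2(y)))

grid = np.linspace(0.0, 1.0, 101)
scores = [log_score(w) for w in grid]
w_star = grid[int(np.argmax(scores))]
print(w_star, log_score(w_star))            # pool weight maximizing the log score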

382 citations


Journal ArticleDOI
TL;DR: An overview article reviewing the necessary tools for complex-valued signal processing, among which are widely linear transformations, augmented statistical descriptions, and Wirtinger calculus, and addressing the topics of model selection, filtering, and source separation.
Abstract: Complex-valued signals occur in many areas of science and engineering and are thus of fundamental interest. In the past, it has often been assumed, usually implicitly, that complex random signals are proper or circular. A proper complex random variable is uncorrelated with its complex conjugate, and a circular complex random variable has a probability distribution that is invariant under rotation in the complex plane. While these assumptions are convenient because they simplify computations, there are many cases where proper and circular random signals are very poor models of the underlying physics. When taking impropriety and noncircularity into account, the right type of processing can provide significant performance gains. There are two key ingredients in the statistical signal processing of complex-valued data: 1) utilizing the complete statistical characterization of complex-valued random signals; and 2) the optimization of real-valued cost functions with respect to complex parameters. In this overview article, we review the necessary tools, among which are widely linear transformations, augmented statistical descriptions, and Wirtinger calculus. We also present some selected recent developments in the field of complex-valued signal processing, addressing the topics of model selection, filtering, and source separation.
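A small sketch of the augmented second-order description the article reviews: alongside the covariance E[|z|²], one tracks the pseudo-covariance E[z²], which vanishes for proper (second-order circular) signals and is nonzero for improper ones. The two synthetic signals below are my own examples.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A proper (circular) signal: independent real and imaginary parts with equal power.
z_proper = (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)

# An improper signal: unequal power on the real and imaginary axes.
z_improper = 2.0 * rng.normal(size=n) + 1j * 0.5 * rng.normal(size=n)

def second_order(z):
    zc = z - z.mean()
    covariance = np.mean(zc * np.conj(zc))   # estimates E[|z|^2]
    pseudo_cov = np.mean(zc * zc)            # estimates E[z^2]; ~0 iff proper
    return covariance, pseudo_cov

print(second_order(z_proper))     # pseudo-covariance close to 0
print(second_order(z_improper))   # pseudo-covariance clearly nonzero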

362 citations


Journal ArticleDOI
TL;DR: In this paper, the authors compare distributions in terms of three different metrics: probability plot R², estimates of average turbine power output, and estimates of extreme wind speed, and show that the widely accepted Weibull distribution provides a poor fit to the distribution of wind speeds when compared with more complicated models.
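One of the comparisons described, sketched with SciPy on synthetic data: fit a two-parameter Weibull to wind-speed samples and compute the probability-plot R². The data here are actually Weibull-distributed, so the fit looks good; real wind records are where the paper finds the model lacking.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
wind = stats.weibull_min.rvs(c=2.0, scale=8.0, size=2000, random_state=rng)  # synthetic "wind speeds"

# Two-parameter Weibull fit (location fixed at zero).
c_hat, loc_hat, scale_hat = stats.weibull_min.fit(wind, floc=0)

# Probability-plot R^2 against the fitted Weibull.
(osm, osr), (slope, intercept, r) = stats.probplot(wind, dist=stats.weibull_min,
                                                   sparams=(c_hat, 0, scale_hat))
print(c_hat, scale_hat, r ** 2)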

Book ChapterDOI
01 Jan 2011
TL;DR: In this paper, some aspects of the estimation of the density function of a univariate probability distribution are discussed, and the asymptotic mean square error of a particular class of estimates is evaluated.
Abstract: This note discusses some aspects of the estimation of the density function of a univariate probability distribution. All estimates of the density function satisfying relatively mild conditions are shown to be biased. The asymptotic mean square error of a particular class of estimates is evaluated.

Journal ArticleDOI
TL;DR: In this paper, a method for the determination of stationary crystal nucleation rates in solutions has been developed, which makes use of the stochastic nature of nucleation, which is reflected in the variation of the induction time in many measurements at a constant supersaturation.
Abstract: A novel method for the determination of stationary crystal nucleation rates in solutions has been developed. This method makes use of the stochastic nature of nucleation, which is reflected in the variation of the induction time in many measurements at a constant supersaturation. A probability distribution function was derived which describes, under the condition of constant supersaturation, the probability of detecting crystals as a function of time, stationary nucleation rate, sample volume, and a time needed to grow the formed nuclei to a detectable size. Cumulative probability distributions of the induction time at constant supersaturation were experimentally determined using at least 80 induction times per supersaturation in 1 mL stirred solutions. The nucleation rate was determined by the best fit of the derived equation to the experimentally obtained distribution. This method was successfully applied to measure the nucleation rates at different supersaturations of two model compounds, m-aminobenzoi...
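A sketch consistent with the description, using synthetic numbers of my own: if nuclei form by a Poisson process at stationary rate J in volume V and detection lags nucleation by a growth time tg, the probability of having detected crystals by time t is P(t) = 1 - exp(-J*V*(t - tg)); fitting this to the empirical cumulative distribution of induction times yields J and tg.

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
J_true, V, tg_true = 2.0e3, 1.0e-6, 50.0        # [1/(m^3 s)], [m^3], [s]
t_induction = tg_true + rng.exponential(1.0 / (J_true * V), size=80)

t_sorted = np.sort(t_induction)
P_empirical = np.arange(1, len(t_sorted) + 1) / len(t_sorted)

def P_model(t, JV, tg):
    # Cumulative detection probability; JV is the product J*V.
    return 1.0 - np.exp(-JV * np.clip(t - tg, 0.0, None))

(JV_fit, tg_fit), _ = curve_fit(P_model, t_sorted, P_empirical, p0=[1.0e-3, 10.0])
print(JV_fit / V, tg_fit)    # J estimate close to 2.0e3, growth time close to 50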

Journal ArticleDOI
TL;DR: This work presents two particle algorithms to compute the score vector and observed information matrix recursively in nonlinear non-Gaussian state space models and shows how both methods can be used to perform batch and recursive parameter estimation.
Abstract: Particle methods are popular computational tools for Bayesian inference in nonlinear non-Gaussian state space models. For this class of models, we present two particle algorithms to compute the score vector and observed information matrix recursively. The first algorithm is implemented with computational complexity O(N) and the second with complexity O(N²), where N is the number of particles. Although cheaper, the performance of the O(N) method degrades quickly, as it relies on the approximation of a sequence of probability distributions whose dimension increases linearly with time. In particular, even under strong mixing assumptions, the variance of the estimates computed with the O(N) method increases at least quadratically in time. The more expensive O(N²) method relies on a nonstandard particle implementation and does not suffer from this rapid degradation. It is shown how both methods can be used to perform batch and recursive parameter estimation.

Journal ArticleDOI
TL;DR: In this article, a series of image processing technologies and geometric measurement methods is introduced to quantify multiple scale microporosity in images, such as probability entropy, probability distribution index and fractal dimension were introduced to describe the distribution of the three major characteristics of pore system.

Posted Content
TL;DR: In this paper, the Recovery Theorem is used to separate risk aversion - the pricing kernel - from the natural probability distribution using state prices alone, which makes it possible to determine the market's forecast of returns, the market's risk aversion, the market risk premium, and the probability of a catastrophe, and to construct model-free tests of the efficient market hypothesis.
Abstract: We can only estimate the distribution of stock returns but we observe the distribution of risk neutral state prices. Risk neutral state prices are the product of risk aversion - the pricing kernel - and the natural probability distribution. The Recovery Theorem enables us to separate these and to determine the market's forecast of returns and the market's risk aversion from state prices alone. Among other things, this allows us to determine the pricing kernel, the market risk premium, the probability of a catastrophe, and to construct model free tests of the efficient market hypothesis.

Journal ArticleDOI
TL;DR: Replacing compact subsets by measures, a notion of distance function to a probability distribution in ℝ^d is introduced and it is shown that it is possible to reconstruct offsets of sampled shapes with topological guarantees even in the presence of outliers.
Abstract: Data often comes in the form of a point cloud sampled from an unknown compact subset of Euclidean space. The general goal of geometric inference is then to recover geometric and topological features (e.g., Betti numbers, normals) of this subset from the approximating point cloud data. It appears that the study of distance functions allows one to address many of these questions successfully. However, one of the main limitations of this framework is that it does not cope well with outliers or with background noise. In this paper, we show how to extend the framework of distance functions to overcome this problem. Replacing compact subsets by measures, we introduce a notion of distance function to a probability distribution in ℝ^d. These functions share many properties with classical distance functions, which make them suitable for inference purposes. In particular, by considering appropriate level sets of these distance functions, we show that it is possible to reconstruct offsets of sampled shapes with topological guarantees even in the presence of outliers. Moreover, in settings where empirical measures are considered, these functions can be easily evaluated, making them of particular practical interest.
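A sketch of the empirical version of this distance function, on a toy example of my own: for n sample points and a mass parameter m0, the distance from x to the empirical measure averages the squared distances to the k = ceil(m0*n) nearest samples, which is what makes it stable under a few outliers, unlike the ordinary distance to the point cloud.

import numpy as np

def distance_to_measure(query, cloud, m0=0.05):
    # Empirical distance-to-measure: sqrt of the mean of the k smallest squared distances.
    k = max(1, int(np.ceil(m0 * len(cloud))))
    d2 = np.sum((cloud[None, :, :] - query[:, None, :]) ** 2, axis=-1)
    knn = np.sort(d2, axis=1)[:, :k]
    return np.sqrt(knn.mean(axis=1))

rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 500)
circle = np.column_stack([np.cos(t), np.sin(t)])          # samples of a circle
outliers = rng.uniform(-3, 3, size=(20, 2))               # background noise
cloud = np.vstack([circle, outliers])

queries = np.array([[1.0, 0.0], [0.0, 0.0], [2.5, 2.5]])
print(distance_to_measure(queries, cloud))
# Small near the circle, larger at the center and far away; a handful of isolated
# outliers barely changes the values.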

Journal ArticleDOI
TL;DR: A mixture gamma (MG) distribution for the signal-to-noise ratio (SNR) of wireless channels is proposed, which is not only a more accurate model for composite fading, but is also a versatile approximation for any fading SNR.
Abstract: Composite fading (i.e., multipath fading and shadowing together) has increasingly been analyzed by means of the K channel and related models. Nevertheless, these models do have computational and analytical difficulties. Motivated by this context, we propose a mixture gamma (MG) distribution for the signal-to-noise ratio (SNR) of wireless channels. Not only is it a more accurate model for composite fading, but is also a versatile approximation for any fading SNR. As this distribution consists of N (≥ 1) component gamma distributions, we show how its parameters can be determined by using probability density function (PDF) or moment generating function (MGF) matching. We demonstrate the accuracy of the MG model by computing the mean square error (MSE) or the Kullback-Leibler (KL) divergence or by comparing the moments. With this model, performance metrics such as the average channel capacity, the outage probability, the symbol error rate (SER), and the detection capability of an energy detector are readily derived.
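To make the object concrete: a mixture-gamma SNR density is a convex combination of gamma densities, so quantities like the outage probability reduce to weighted sums of gamma CDFs. The component weights and parameters below are illustrative, not fitted to any particular composite fading model.

import numpy as np
from scipy import stats

weights = np.array([0.5, 0.3, 0.2])           # mixture weights, must sum to 1
shapes  = np.array([1.0, 2.0, 4.0])           # gamma shape parameters
scales  = np.array([2.0, 1.0, 0.5])           # gamma scale parameters

def mg_pdf(x):
    # Mixture-gamma density: weighted sum of gamma densities.
    return sum(w * stats.gamma(a, scale=s).pdf(x)
               for w, a, s in zip(weights, shapes, scales))

def outage_probability(threshold):
    # P(SNR < threshold) as a weighted sum of gamma CDFs.
    return sum(w * stats.gamma(a, scale=s).cdf(threshold)
               for w, a, s in zip(weights, shapes, scales))

x = np.linspace(0.0, 50.0, 5000)
print(np.sum(mg_pdf(x)) * (x[1] - x[0]))   # crude integral, close to 1
print(outage_probability(1.0))             # outage probability at SNR threshold 1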

Journal Article
TL;DR: In this article, the two-parameter Weibull probability distribution is embedded in a larger family obtained by introducing an additional parameter, which is called the transmuted Weibull distribution.
Abstract: In this article, the two parameter Weibull probability distribution is embedded in a larger family obtained by introducing an additional parameter. We generalize the two parameter Weibull distribution using the quadratic rank transmutation map studied by Shaw et al. [ 9 ] to develop a transmuted Weibull distribution. We provide a comprehensive description of the mathematical properties of the subject distribution along with its reliability behavior. The usefulness of the transmuted Weibull distribution for modeling reliability data is illustrated using real data.
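A sketch of the construction described: the quadratic rank transmutation map turns a baseline CDF G into F(x) = (1 + λ)G(x) − λG(x)² with |λ| ≤ 1, and taking G to be the two-parameter Weibull CDF gives the transmuted Weibull; the parameter values below are arbitrary.

import numpy as np

def weibull_cdf(x, beta, eta):
    return 1.0 - np.exp(-(x / eta) ** beta)

def transmuted_weibull_cdf(x, beta, eta, lam):
    # Quadratic rank transmutation of the Weibull baseline.
    G = weibull_cdf(x, beta, eta)
    return (1.0 + lam) * G - lam * G ** 2

def transmuted_weibull_pdf(x, beta, eta, lam):
    G = weibull_cdf(x, beta, eta)
    g = (beta / eta) * (x / eta) ** (beta - 1) * np.exp(-(x / eta) ** beta)
    return g * (1.0 + lam - 2.0 * lam * G)

x = np.linspace(0.01, 5.0, 500)
for lam in (-0.5, 0.0, 0.5):                 # lam = 0 recovers the ordinary Weibull
    F = transmuted_weibull_cdf(x, beta=1.5, eta=2.0, lam=lam)
    print(lam, F[-1])                        # CDF approaches 1 in every case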

Journal ArticleDOI
TL;DR: It is essential to estimate how much power can be delivered from vehicles to grid, called achievable power capacity (APC), for practical vehicle-to-grid (V2G) services, and a method of estimating the APC in a probabilistic manner is proposed.
Abstract: It is essential to estimate how much power can be delivered from vehicles to grid, called achievable power capacity (APC), for practical vehicle-to-grid (V2G) services. We propose a method of estimating the APC in a probabilistic manner. Its probability distribution is obtained from the normal approximation to the binomial distribution, and hence represented with two parameters, i.e., mean and covariance. Based on the probability distribution of the APC, we calculate the power capacity that V2G regulation providers (or V2G aggregators) are contracted to provide to grid operators, called the contracted power capacity (CPC). Four possible contract types between a grid operator and a V2G regulation provider are suggested and, for each contract type, a profit function is developed from the APC and the penalty imposed on the V2G aggregator. The CPCs for the four contract types are chosen to maximize the corresponding profit functions. Finally, simulations are provided to illustrate the accuracy of the estimated probability distribution of the APC and the effectiveness of the profit functions.
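A toy version of the kind of calculation described, with invented numbers: if N vehicles are each available with probability p and each contributes power p_v, the number available is Binomial(N, p), so the APC is approximately Normal(N·p·p_v, N·p·(1−p)·p_v²), and a conservative capacity can be read off a lower quantile. The contract types and profit functions from the paper are not modeled here.

import numpy as np
from scipy import stats

N, p, p_v = 2000, 0.6, 3.0             # vehicles, availability probability, kW per vehicle

mean_apc = N * p * p_v                 # kW
var_apc = N * p * (1 - p) * p_v ** 2   # kW^2
apc = stats.norm(loc=mean_apc, scale=np.sqrt(var_apc))

# Capacity achievable with 95% probability (5th percentile of the APC distribution).
capacity_95 = apc.ppf(0.05)
print(mean_apc, capacity_95)

# Cross-check against the exact binomial model.
print(stats.binom(N, p).ppf(0.05) * p_v)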

Journal ArticleDOI
TL;DR: In this paper, the authors investigate a Bayesian procedure to include sparsity and a scale matrix in the three-term AVO inversion problem and find an iterative algorithm to solve the Bayesian inversion.
Abstract: Three-term AVO inversion can be used to estimate P-wave velocity, S-wave velocity, and density perturbations from reflection seismic data. The density term, however, exhibits little sensitivity to amplitudes and, therefore, its inversion is unstable. One way to stabilize the density term is by including a scale matrix that provides correlation information between the three unknown AVO parameters. We investigate a Bayesian procedure to include sparsity and a scale matrix in the three-term AVO inversion problem. To this end, we model the prior distribution of the AVO parameters via a Trivariate Cauchy distribution. We found an iterative algorithm to solve the Bayesian inversion and, in addition, comparisons are provided with the classical inversion approach that uses a Multivariate Gaussian prior. It is important to point out that the Multivariate Gaussian prior allows us to include the correlation of the AVO parameters in the solution of the inverse problem. The Trivariate Cauchy prior not only permits us ...

Book
27 Aug 2011
TL;DR: In this paper, the authors expose a more subtle fallacy, based upon a fallacious use of the Central-Limit Theorem, in the claim that a many-period expected-utility maximizer should maximize either the expected logarithm of portfolio outcomes or the expected average compound return of his portfolio.
Abstract: The fallacy that a many-period expected-utility maximizer should maximize (a) the expected logarithm of portfolio outcomes or (b) the expected average compound return of his portfolio is now understood to rest upon a fallacious use of the Law of Large Numbers . This paper exposes a more subtle fallacy based upon a fallacious use of the Central-Limit Theorem . While the properly normalized product of independent random variables does asymptotically approach a log-normal distribution under proper assumptions, it involves a fallacious manipulation of double limits to infer from this that a maximizer of expected utility after many periods will get a useful approximation to his optimal policy by calculating an efficiency frontier based upon (a) the expected log of wealth outcomes and its variance or (b) the expected average compound return and its variance. Expected utilities calculated from the surrogate log-normal function differ systematically from the correct expected utilities calculated from the true probability distribution. A new concept of ‘initial wealth equivalent’ provides a transitive ordering of portfolios that illuminates commonly held confusions. A non-fallacious application of the log-normal limit and its associated mean-variance efficiency frontier is established for a limit where any fixed horizon period is subdivided into ever more independent sub-intervals. Strong mutual-fund Separation Theorems are then shown to be asymptotically valid.

Journal ArticleDOI
TL;DR: In this paper, a restricted family of non-dominated probability measures for smooth processes is studied. But the authors focus on developing stochastic analysis simultaneously under a general family of probability measures that are not dominated by a single probability measure.
Abstract: This paper is on developing stochastic analysis simultaneously under a general family of probability measures that are not dominated by a single probability measure. The interest in this question originates from the probabilistic representations of fully nonlinear partial differential equations and applications to mathematical finance. The existing literature relies either on the capacity theory (Denis and Martini), or on the underlying nonlinear partial differential equation (Peng). In both approaches, the resulting theory requires certain smoothness, the so-called quasi-sure continuity, of the corresponding processes and random variables in terms of the underlying canonical process. In this paper, we investigate this question for a larger class of "non-smooth" processes, but with a restricted family of non-dominated probability measures. For smooth processes, our approach leads to similar results as in previous literature, provided the restricted family satisfies an additional density property.

Journal ArticleDOI
TL;DR: In this paper, a sparse linear discriminant analysis (LDA) was proposed to classify human cancer into two classes of leukemia based on a set of 7,129 genes and a training sample of size 72.
Abstract: In many social, economical, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is usually unknown, the rule of classification is constructed using a training sample. The well-known linear discriminant analysis (LDA) works well for the situation where the number of variables used for classification is much smaller than the training sample size. Because of the advance in technologies, modern statistical studies often face classification problems with the number of variables much larger than the sample size, and the LDA may perform poorly. We explore when and why the LDA has poor performance and propose a sparse LDA that is asymptotically optimal under some sparsity conditions on the unknown parameters. For illustration of application, we discuss an example of classifying human cancer into two classes of leukemia based on a set of 7,129 genes and a training sample of size 72. A simulation is also conducted to check the performance of the proposed method.

Journal ArticleDOI
TL;DR: This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage.
Abstract: Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage . Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.

Journal ArticleDOI
TL;DR: It is shown that if the variance of the Gaussian noise is small in a certain sense, then the homology can be learned with high confidence by an algorithm that has a weak (linear) dependence on the ambient dimension.
Abstract: In this paper, we take a topological view of unsupervised learning. From this point of view, clustering may be interpreted as trying to find the number of connected components of any underlying geometrically structured probability distribution in a certain sense that we will make precise. We construct a geometrically structured probability distribution that seems appropriate for modeling data in very high dimensions. A special case of our construction is the mixture of Gaussians where there is Gaussian noise concentrated around a finite set of points (the means). More generally we consider Gaussian noise concentrated around a low dimensional manifold and discuss how to recover the homology of this underlying geometric core from data that do not lie on it. We show that if the variance of the Gaussian noise is small in a certain sense, then the homology can be learned with high confidence by an algorithm that has a weak (linear) dependence on the ambient dimension. Our algorithm has a natural interpretation as a spectral learning algorithm using a combinatorial Laplacian of a suitable data-derived simplicial complex.

Journal ArticleDOI
TL;DR: This work discovers that the accuracy of a decision tree classifier can be much improved if the "complete information" of a data item (taking into account the probability density function (pdf)) is utilized.
Abstract: Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty include measurement/quantization errors, data staleness, and multiple repeated measurements. With uncertainty, the value of a data item is often represented not by one single value, but by multiple values forming a probability distribution. Rather than abstracting uncertain data by statistical derivatives (such as mean and median), we discover that the accuracy of a decision tree classifier can be much improved if the "complete information" of a data item (taking into account the probability density function (pdf)) is utilized. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments have been conducted which show that the resulting classifiers are more accurate than those using value averages. Since processing pdfs is computationally more costly than processing single values (e.g., averages), decision tree construction on uncertain data is more CPU demanding than that for certain data. To tackle this problem, we propose a series of pruning techniques that can greatly improve construction efficiency.

Journal ArticleDOI
TL;DR: In this paper, it was shown that a stochastic logistic population under regime switching controlled by a Markov chain is either stochastically permanent or extinctive, and sufficient and necessary conditions are obtained under some assumptions.


Journal ArticleDOI
TL;DR: This paper describes the theoretical background, algorithm and validation of a recently developed novel method of ranking based on the sum of ranking differences; the ranking and validation procedures are called Sum of Ranking Differences (SRD) and Comparison of Ranks by Random Numbers (CRNN), respectively.
Abstract: This paper describes the theoretical background, algorithm and validation of a recently developed novel method of ranking based on the sum of ranking differences [TrAC Trends Anal. Chem. 2010; 29: 101–109]. The ranking is intended to compare models, methods, analytical techniques, panel members, etc. and it is entirely general. First, the objects to be ranked are arranged in the rows and the variables (for example model results) in the columns of an input matrix. Then, the results of each model for each object are ranked in the order of increasing magnitude. The difference between the rank of the model results and the rank of the known, reference or standard results is then computed. (If the golden standard ranking is known the rank differences can be completed easily.) In the end, the absolute values of the differences are summed together for all models to be compared. The sum of ranking differences (SRD) arranges the models in a unique and unambiguous way. The closer the SRD value to zero (i.e. the closer the ranking to the golden standard), the better is the model. The proximity of SRD values shows similarity of the models, whereas large variation will imply dissimilarity. Generally, the average can be accepted as the golden standard in the absence of known or reference results, even if bias is also present in the model results in addition to random error. Validation of the SRD method can be carried out by using simulated random numbers for comparison (permutation test). A recursive algorithm calculates the discrete distribution for a small number of objects (n < 14), whereas the normal distribution is used as a reasonable approximation if the number of objects is large. The theoretical distribution is visualized for random numbers and can be used to identify SRD values for models that are far from being random. The ranking and validation procedures are called Sum of Ranking differences (SRD) and Comparison of Ranks by Random Numbers (CRNN), respectively. Copyright © 2010 John Wiley & Sons, Ltd.
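A minimal sketch of the SRD computation as described (the validation against random rankings, CRNN, is omitted): rank each column of model results and the reference results over the objects, then sum the absolute rank differences per model; the smaller the sum, the closer the model is to the golden standard. The data below are synthetic.

import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
n_objects, n_models = 10, 4

reference = rng.normal(size=n_objects)                    # golden standard (e.g. known values or row averages)
models = reference[:, None] + rng.normal(scale=[0.1, 0.5, 1.0, 3.0],
                                         size=(n_objects, n_models))

ref_rank = rankdata(reference)
srd = np.array([np.abs(rankdata(models[:, j]) - ref_rank).sum()
                for j in range(n_models)])
print(srd)    # noisier models tend to get larger SRD values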