
Showing papers in "Statistics and Computing in 2012"


Journal ArticleDOI
TL;DR: Approximate Bayesian Computation (ABC) methods, also known as likelihood-free techniques, have appeared in the past ten years as the most satisfactory approach to intractable likelihood problems, first in genetics then in a broader spectrum of applications as discussed by the authors.
Abstract: Approximate Bayesian Computation (ABC) methods, also known as likelihood-free techniques, have appeared in the past ten years as the most satisfactory approach to intractable likelihood problems, first in genetics then in a broader spectrum of applications. However, these methods suffer to some degree from calibration difficulties that make them rather volatile in their implementation and thus render them suspicious to the users of more traditional Monte Carlo methods. In this survey, we study the various improvements and extensions brought on the original ABC algorithm in recent years.

748 citations
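
For readers new to ABC, the following minimal rejection-ABC sketch in Python shows the basic accept/reject mechanism that the refinements surveyed in the paper build on. The toy Gaussian model, flat prior, summary statistic and tolerance are our own illustrative choices, not anything prescribed by the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "intractable" model: n i.i.d. N(theta, 1) observations, summarised by their mean
n_obs = 50
theta_true = 2.0
y_obs = rng.normal(theta_true, 1.0, size=n_obs)
s_obs = y_obs.mean()

# plain rejection ABC: draw theta from the prior, simulate data, and keep the draws
# whose simulated summary falls within eps of the observed summary
n_draws, eps = 100_000, 0.05
theta = rng.uniform(-10, 10, size=n_draws)            # flat prior on theta
y_sim = rng.normal(theta[:, None], 1.0, size=(n_draws, n_obs))
s_sim = y_sim.mean(axis=1)
posterior = theta[np.abs(s_sim - s_obs) <= eps]

print(f"kept {posterior.size} draws; approximate posterior mean = {posterior.mean():.3f}")
```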


Journal ArticleDOI
TL;DR: An adaptive SMC algorithm is proposed which admits a computational complexity that is linear in the number of samples and adaptively determines the simulation parameters.
Abstract: Approximate Bayesian computation (ABC) is a popular approach to address inference problems where the likelihood function is intractable, or expensive to calculate. To improve over Markov chain Monte Carlo (MCMC) implementations of ABC, the use of sequential Monte Carlo (SMC) methods has recently been suggested. Most effective SMC algorithms that are currently available for ABC have a computational complexity that is quadratic in the number of Monte Carlo samples (Beaumont et al., Biometrika 86:983–990, 2009; Peters et al., Technical report, 2008; Toni et al., J. Roy. Soc. Interface 6:187–202, 2009) and require the careful choice of simulation parameters. In this article an adaptive SMC algorithm is proposed which admits a computational complexity that is linear in the number of samples and adaptively determines the simulation parameters. We demonstrate our algorithm on a toy example and on a birth-death-mutation model arising in epidemiology.

530 citations


Journal ArticleDOI
TL;DR: The circumstances under which space-filling superiority holds are reviewed, some new arguments are provided and some motives to go beyond space- filling are clarified.
Abstract: When setting up a computer experiment, it has become a standard practice to select the inputs spread out uniformly across the available space. These so-called space-filling designs are now ubiquitous in corresponding publications and conferences. The statistical folklore is that such designs have superior properties when it comes to prediction and estimation of emulator functions. In this paper we review the circumstances under which this superiority holds, provide some new arguments and clarify the motives to go beyond space-filling. An overview of the state of the art of space-filling designs introduces and complements these results.

342 citations


Journal ArticleDOI
TL;DR: SUR (stepwise uncertainty reduction) strategies are derived from a Bayesian formulation of the problem of estimating a probability of failure of a function f using a Gaussian process model of f and aim at performing evaluations of f as efficiently as possible to infer the value of the probabilities of failure.
Abstract: This paper deals with the problem of estimating the volume of the excursion set of a function f: ℝ^d → ℝ above a given threshold, under a probability measure on ℝ^d that is assumed to be known. In the industrial world, this corresponds to the problem of estimating a probability of failure of a system. When only an expensive-to-simulate model of the system is available, the budget for simulations is usually severely limited and therefore classical Monte Carlo methods ought to be avoided. One of the main contributions of this article is to derive SUR (stepwise uncertainty reduction) strategies from a Bayesian formulation of the problem of estimating a probability of failure. These sequential strategies use a Gaussian process model of f and aim at performing evaluations of f as efficiently as possible to infer the value of the probability of failure. We compare these strategies to other strategies also based on a Gaussian process model for estimating a probability of failure.

330 citations


Journal ArticleDOI
TL;DR: A new robust adaptive Metropolis algorithm estimating the shape of the target distribution and simultaneously coercing the acceptance rate and showing promising behaviour in an example with Student target distribution having no finite second moment.
Abstract: The adaptive Metropolis (AM) algorithm of Haario, Saksman and Tamminen (Bernoulli 7(2):223–242, 2001) uses the estimated covariance of the target distribution in the proposal distribution. This paper introduces a new robust adaptive Metropolis algorithm estimating the shape of the target distribution and simultaneously coercing the acceptance rate. The adaptation rule is computationally simple, adding no extra cost compared with the AM algorithm. The adaptation strategy can be seen as a multidimensional extension of the previously proposed method adapting the scale of the proposal distribution in order to attain a given acceptance rate. The empirical results show promising behaviour of the new algorithm in an example with a Student target distribution having no finite second moment, where the AM covariance estimate is unstable. In the examples with finite second moments, the performance of the new approach seems to be competitive with the AM algorithm combined with scale adaptation.

267 citations
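
A minimal sketch of the kind of recursion described above, as we read it from the abstract: a Cholesky-type shape factor S of the proposal is updated by a rank-one correction that pushes the realised acceptance probability toward a target rate. The Cauchy example target, the step-size schedule and the target rate 0.234 are our own illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # heavy-tailed example: independent standard Cauchy margins (no finite second moment)
    return -np.sum(np.log1p(x ** 2))

def ram(n_iter=20_000, d=2, target_acc=0.234, gamma=0.66):
    x = np.zeros(d)
    S = np.eye(d)                              # Cholesky-type shape factor of the proposal
    lp = log_target(x)
    chain = np.empty((n_iter, d))
    for n in range(1, n_iter + 1):
        u = rng.normal(size=d)
        y = x + S @ u
        lq = log_target(y)
        alpha = np.exp(min(0.0, lq - lp))      # Metropolis acceptance probability
        if rng.uniform() < alpha:
            x, lp = y, lq
        # rank-one update of the shape, coercing the acceptance rate toward target_acc
        eta = min(1.0, d * n ** (-gamma))
        M = S @ (np.eye(d) + eta * (alpha - target_acc) * np.outer(u, u) / (u @ u)) @ S.T
        S = np.linalg.cholesky(0.5 * (M + M.T))
        chain[n - 1] = x
    return chain, S

chain, S = ram()
print("final proposal shape factor:\n", S)
```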


Journal ArticleDOI
TL;DR: It is shown that estimating a (non-zero) nugget can lead to surrogate models with better statistical properties, such as predictive accuracy and coverage, in a variety of common situations.
Abstract: Most surrogate models for computer experiments are interpolators, and the most common interpolator is a Gaussian process (GP) that deliberately omits a small-scale (measurement) error term called the nugget. The explanation is that computer experiments are, by definition, "deterministic", and so there is no measurement error. We think this is too narrow a focus for a computer experiment and a statistically inefficient way to model them. We show that estimating a (non-zero) nugget can lead to surrogate models with better statistical properties, such as predictive accuracy and coverage, in a variety of common situations.

240 citations
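
A small illustration of the point being made, using scikit-learn rather than the authors' code: fit a deterministic toy function once with an (essentially) interpolating GP and once with a GP whose nugget is estimated through a WhiteKernel, then compare out-of-sample accuracy and pointwise 95% coverage. The test function, design size and kernel choices are ours.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C

rng = np.random.default_rng(0)

def code(x):
    # stand-in for a deterministic computer code
    return np.sin(10 * x) + 0.2 * np.sin(35 * x)

X = rng.uniform(0, 1, size=(25, 1))
y = code(X).ravel()
X_new = np.linspace(0, 1, 200)[:, None]
y_new = code(X_new).ravel()

gp_interp = GaussianProcessRegressor(C(1.0) * RBF(0.1), normalize_y=True)     # ~ no nugget
gp_nugget = GaussianProcessRegressor(C(1.0) * RBF(0.1) + WhiteKernel(1e-3, (1e-10, 1e-1)),
                                     normalize_y=True)                        # estimated nugget

for name, gp in [("interpolator", gp_interp), ("with nugget", gp_nugget)]:
    gp.fit(X, y)
    mean, sd = gp.predict(X_new, return_std=True)
    rmse = float(np.sqrt(np.mean((mean - y_new) ** 2)))
    cover = float(np.mean(np.abs(y_new - mean) <= 1.96 * sd))  # pointwise 95% coverage
    print(f"{name}: rmse={rmse:.3f}, coverage={cover:.2f}")
```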


Journal ArticleDOI
TL;DR: An introduction to the slope heuristics and an overview of the theoretical and practical results about it are presented and a new practical approach is carried out and compared to the standard dimension jump method.
Abstract: Model selection is a general paradigm which includes many statistical problems. One of the most fruitful and popular approaches to carry it out is the minimization of a penalized criterion. Birgé and Massart (Probab. Theory Relat. Fields 138:33–73, 2006) have proposed a promising data-driven method to calibrate such criteria whose penalties are known up to a multiplicative factor: the "slope heuristics". Theoretical works validate this heuristic method in some situations and several papers report a promising practical behavior in various frameworks. The purpose of this work is twofold. First, an introduction to the slope heuristics and an overview of the theoretical and practical results about it are presented. Second, we focus on the practical difficulties occurring for applying the slope heuristics. A new practical approach is carried out and compared to the standard dimension jump method. All the practical solutions discussed in this paper in different frameworks are implemented and brought together in a Matlab graphical user interface called capushe. Supplemental Materials containing further information and an additional application, the capushe package and the datasets presented in this paper, are available on the journal Web site.

231 citations
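
For orientation, here is a sketch of the standard dimension-jump calibration that the paper takes as a baseline (the capushe package implements this and more; the toy model collection below and the penalty grid are our own, illustrative inputs): for each candidate penalty constant, select the model minimising the penalized contrast, locate the constant at which the selected dimension drops the most, and apply the factor-two rule.

```python
import numpy as np

def dimension_jump(dims, contrast, kappa_grid):
    """Dimension-jump calibration: select the model minimising contrast + kappa * dims for each
    kappa, find the kappa where the selected dimension drops the most, and return twice that
    value together with the finally selected dimension (the 'factor two' rule)."""
    dims, contrast = np.asarray(dims, float), np.asarray(contrast, float)
    selected = np.array([dims[np.argmin(contrast + k * dims)] for k in kappa_grid])
    drops = selected[:-1] - selected[1:]          # selected dimension is non-increasing in kappa
    kappa_final = 2.0 * kappa_grid[1:][np.argmax(drops)]
    return kappa_final, int(dims[np.argmin(contrast + kappa_final * dims)])

# toy collection of nested models: fast improvement up to dimension 5, then a linear "slope"
dims = np.arange(1, 51, dtype=float)
contrast = 1000.0 - np.where(dims <= 5, 60.0 * dims, 300.0 + (dims - 5.0))
print(dimension_jump(dims, contrast, np.linspace(0.0, 10.0, 2001)))
```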


Journal ArticleDOI
TL;DR: A novel strategy for simulating rare events and an associated Monte Carlo estimation of tail probabilities using a system of interacting particles and exploits a Feynman-Kac representation of that system to analyze their fluctuations.
Abstract: This paper discusses a novel strategy for simulating rare events and an associated Monte Carlo estimation of tail probabilities. Our method uses a system of interacting particles and exploits a Feynman-Kac representation of that system to analyze their fluctuations. Our precise analysis of the variance of a standard multilevel splitting algorithm reveals an opportunity for improvement. This leads to a novel method that relies on adaptive levels and produces, in the limit of an idealized version of the algorithm, estimates with optimal variance. The motivation for this theoretical work comes from problems occurring in watermarking and fingerprinting of digital contents, which represents a new field of applications of rare event simulation techniques. Some numerical results show performance close to the idealized version of our technique for these practical applications.

216 citations
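
A sketch of an idealised adaptive-level splitting estimator in the spirit of the abstract (not the authors' exact algorithm): to estimate P(score(X) > L) for a standard Gaussian X, repeatedly raise the level to the k-th smallest particle score, discard the particles below it, and regenerate them from the survivors with a conditioned pCN move. The Gaussian toy score and all tuning constants are ours.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def score(x):
    # rare-event score: first coordinate of a standard Gaussian vector
    return x[..., 0]

def ams_estimate(L, d=5, n_particles=200, n_discard=20, n_mcmc=10, rho=0.9):
    """Idealised adaptive multilevel splitting estimate of P(score(X) > L), X ~ N(0, I_d)."""
    X = rng.normal(size=(n_particles, d))
    log_p = 0.0
    while True:
        s = score(X)
        order = np.argsort(s)
        level = s[order[n_discard - 1]]           # adaptive level: n_discard-th smallest score
        if level >= L:
            break
        log_p += np.log(1.0 - n_discard / n_particles)
        idx = order[:n_discard]                   # particles to discard and regenerate
        X[idx] = X[rng.choice(order[n_discard:], size=n_discard)]   # clone survivors
        for _ in range(n_mcmc):                   # pCN move restricted to {score > level}
            prop = rho * X[idx] + np.sqrt(1 - rho ** 2) * rng.normal(size=(n_discard, d))
            ok = score(prop) > level
            X[idx[ok]] = prop[ok]
    return np.exp(log_p) * np.mean(score(X) > L)

print("splitting estimate:", ams_estimate(L=3.5), " exact:", norm.sf(3.5))
```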


Journal ArticleDOI
TL;DR: A novel family of mixture models wherein each component is modeled using a multivariate t-distribution with an eigen-decomposed covariance structure is put forth, known as the tEIGEN family.
Abstract: The last decade has seen an explosion of work on the use of mixture models for clustering. The use of the Gaussian mixture model has been common practice, with constraints sometimes imposed upon the component covariance matrices to give families of mixture models. Similar approaches have also been applied, albeit with less fecundity, to classification and discriminant analysis. In this paper, we begin with an introduction to model-based clustering and a succinct account of the state-of-the-art. We then put forth a novel family of mixture models wherein each component is modeled using a multivariate t-distribution with an eigen-decomposed covariance structure. This family, which is largely a t-analogue of the well-known MCLUST family, is known as the tEIGEN family. The efficacy of this family for clustering, classification, and discriminant analysis is illustrated with both real and simulated data. The performance of this family is compared to its Gaussian counterpart on three real data sets.

151 citations


Journal ArticleDOI
TL;DR: This article presents an ABC approximation designed to perform biased filtering for a Hidden Markov Model when the likelihood function is intractable and uses a sequential Monte Carlo algorithm to both fit and sample from the ABC approximation of the target probability density.
Abstract: Approximate Bayesian computation (ABC) has become a popular technique to facilitate Bayesian inference from complex models. In this article we present an ABC approximation designed to perform biased filtering for a Hidden Markov Model when the likelihood function is intractable. We use a sequential Monte Carlo (SMC) algorithm to both fit and sample from our ABC approximation of the target probability density. This approach is shown to, empirically, be more accurate w.r.t. the original filter than competing methods. The theoretical bias of our method is investigated; it is shown that the bias goes to zero at the expense of increased computational effort. Our approach is illustrated on a constrained sequential lasso for portfolio allocation to 15 constituents of the FTSE 100 share index.

128 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a global sensitivity analysis methodology for stochastic computer codes, for which the result of each code run is itself random and the framework of the joint modeling of the mean and dispersion of heteroscedastic data is used.
Abstract: The global sensitivity analysis method used to quantify the influence of uncertain input variables on the variability in numerical model responses has already been applied to deterministic computer codes; deterministic means here that the same set of input variables always gives the same output value. This paper proposes a global sensitivity analysis methodology for stochastic computer codes, for which the result of each code run is itself random. The framework of the joint modeling of the mean and dispersion of heteroscedastic data is used. To deal with the complexity of computer experiment outputs, nonparametric joint models are discussed and a new Gaussian process-based joint model is proposed. The relevance of these models is analyzed based upon two case studies. Results show that the joint modeling approach yields accurate sensitivity index estimators even when heteroscedasticity is strong.

Journal ArticleDOI
TL;DR: A new Monte Carlo algorithm for the consistent and unbiased estimation of multidimensional integrals and the efficient sampling from multiddimensional densities is described, inspired by the classical splitting method.
Abstract: We describe a new Monte Carlo algorithm for the consistent and unbiased estimation of multidimensional integrals and the efficient sampling from multidimensional densities. The algorithm is inspired by the classical splitting method and can be applied to general static simulation models. We provide examples from rare-event probability estimation, counting, and sampling, demonstrating that the proposed method can outperform existing Markov chain sampling methods in terms of convergence speed and accuracy.

Journal ArticleDOI
TL;DR: An efficient EM algorithm for optimization with provable numerical convergence properties is proposed and the methodology to handle missing values in a sparse regression context is extended.
Abstract: We propose an ℓ1-regularized likelihood method for estimating the inverse covariance matrix in the high-dimensional multivariate normal model in the presence of missing data. Our method is based on the assumption that the data are missing at random (MAR), which also covers the case of data missing completely at random. The implementation of the method is non-trivial as the observed negative log-likelihood generally is a complicated and non-convex function. We propose an efficient EM algorithm for optimization with provable numerical convergence properties. Furthermore, we extend the methodology to handle missing values in a sparse regression context. We demonstrate both methods on simulated and real data.

Journal ArticleDOI
TL;DR: In this paper, a discriminative latent mixture (DLM) model is proposed to fit the data in a latent orthonormal discriminant subspace with an intrinsic dimension lower than the dimension of the original space.
Abstract: Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains but remains a difficult task, both in terms of clustering accuracy and of interpretability of the results. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than the dimension of the original space. By constraining model parameters within and between groups, a family of 12 parsimonious DLM models is obtained, able to fit a wide range of situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets highlight the good performance of the proposed approach as compared to existing clustering methods, while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.

Journal ArticleDOI
TL;DR: A latent Markov quantile regression model for longitudinal data with non-informative drop-out that allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions is proposed.
Abstract: We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set.

Journal ArticleDOI
TL;DR: In this paper, a variant of the sequential Monte Carlo sampler by incorporating the partial rejection control mechanism of Liu (2001) is presented, which can reduce the variance of the incremental importance weights when compared with standard sequential Monte-Carlo samplers.
Abstract: We present a variant of the sequential Monte Carlo sampler by incorporating the partial rejection control mechanism of Liu (2001). We show that the resulting algorithm can be considered as a sequential Monte Carlo sampler with a modified mutation kernel. We prove that the new sampler can reduce the variance of the incremental importance weights when compared with standard sequential Monte Carlo samplers, and provide a central limit theorem. Finally, the sampler is adapted for application under the challenging approximate Bayesian computation modelling framework.

Journal ArticleDOI
TL;DR: A Bayesian extension of the latent block model for model-based block clustering of data matrices considers a block model where block parameters may be integrated out and produces a posterior defined over the number of clusters in rows and columns and cluster memberships.
Abstract: We introduce a Bayesian extension of the latent block model for model-based block clustering of data matrices. Our approach considers a block model where block parameters may be integrated out. The result is a posterior defined over the number of clusters in rows and columns and cluster memberships. The number of row and column clusters need not be known in advance as these are sampled along with cluster memberships using Markov chain Monte Carlo. This differs from existing work on latent block models, where the number of clusters is assumed known or is chosen using some information criteria. We analyze both simulated and real data to validate the technique.

Journal ArticleDOI
TL;DR: This work derives exact, explicit and tractable formulae for the posterior distribution of variables such as the number of change-points or their positions and demonstrates that several classical Bayesian model selection criteria can be computed exactly.
Abstract: In segmentation problems, inference on change-point position and model selection are two difficult issues due to the discrete nature of change-points. In a Bayesian context, we derive exact, explicit and tractable formulae for the posterior distribution of variables such as the number of change-points or their positions. We also demonstrate that several classical Bayesian model selection criteria can be computed exactly. All these results are based on an efficient strategy to explore the whole segmentation space, which is very large. We illustrate our methodology on both simulated data and a comparative genomic hybridization profile.

Journal ArticleDOI
TL;DR: This work proposes an exact estimation procedure to obtain the maximum likelihood estimates of the fixed-effects and variance components, using a stochastic approximation of the EM algorithm, and compares the performance of the normal and the SMN models with two real data sets.
Abstract: Nonlinear mixed-effects models are very useful to analyze repeated measures data and are used in a variety of applications. Normal distributions for random effects and residual errors are usually assumed, but such assumptions make inferences vulnerable to the presence of outliers. In this work, we introduce an extension of a normal nonlinear mixed-effects model considering a subclass of elliptical contoured distributions for both random effects and residual errors. This elliptical subclass, the scale mixtures of normal (SMN) distributions, includes heavy-tailed multivariate distributions, such as the Student-t, the contaminated normal and the slash, among others, and represents an interesting alternative for accommodating outliers while maintaining the elegance and simplicity of maximum likelihood theory. We propose an exact estimation procedure to obtain the maximum likelihood estimates of the fixed effects and variance components, using a stochastic approximation of the EM algorithm. We compare the performance of the normal and the SMN models with two real data sets.

Journal ArticleDOI
TL;DR: This work employs an information-theoretical framework that can be used to construct appropriate (approximately sufficient) statistics by combining different statistics until the loss of information is minimized, and demonstrates that such sets of statistics can be constructed for both parameter estimation and model selection problems.
Abstract: For nearly any challenging scientific problem evaluation of the likelihood is problematic if not impossible. Approximate Bayesian computation (ABC) allows us to employ the whole Bayesian formalism to problems where we can use simulations from a model, but cannot evaluate the likelihood directly. When summary statistics of real and simulated data are compared, rather than the data directly, information is lost, unless the summary statistics are sufficient. Sufficient statistics are, however, rarely available, and without them ABC inferences must be treated with caution. Previously other authors have attempted to combine different statistics in order to construct (approximately) sufficient statistics using search and information heuristics. Here we employ an information-theoretical framework that can be used to construct appropriate (approximately sufficient) statistics by combining different statistics until the loss of information is minimized. We start from a potentially large number of different statistics and choose the smallest set that captures (nearly) the same information as the complete set. We then demonstrate that such sets of statistics can be constructed for both parameter estimation and model selection problems, and we apply our approach to a range of illustrative and real-world model selection problems.
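
As a much simplified stand-in for the information-theoretical construction described above, the following sketch merely screens candidate summary statistics by their estimated marginal mutual information with the parameter, using scikit-learn's k-NN estimator; the paper's procedure combines statistics so that the joint loss of information is minimised, which a marginal screen like this does not capture. The toy simulator, candidate statistics and 0.05 threshold are ours.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)

# toy simulator: theta -> 100 draws from N(theta, 1)
n_sim = 3000
theta = rng.uniform(-5, 5, size=n_sim)
data = rng.normal(theta[:, None], 1.0, size=(n_sim, 100))

# candidate summary statistics (some informative about theta, some not)
stats = np.column_stack([
    data.mean(axis=1),
    np.median(data, axis=1),
    data.std(axis=1),
    data.max(axis=1) - data.min(axis=1),
])
names = ["mean", "median", "sd", "range"]

# screen candidates by estimated mutual information with the parameter
mi = mutual_info_regression(stats, theta, random_state=0)
keep = [n for n, m in sorted(zip(names, mi), key=lambda t: -t[1]) if m > 0.05]
print(dict(zip(names, np.round(mi, 3))), "->", keep)
```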

Journal ArticleDOI
TL;DR: This work considers a variation of the CE method whose performance does not deteriorate as the dimension of the problem increases, and illustrates the algorithm via a high-dimensional estimation problem in risk management.
Abstract: The cross-entropy (CE) method is an adaptive importance sampling procedure that has been successfully applied to a diverse range of complicated simulation problems. However, recent research has shown that in some high-dimensional settings, the likelihood ratio degeneracy problem becomes severe and the importance sampling estimator obtained from the CE algorithm becomes unreliable. We consider a variation of the CE method whose performance does not deteriorate as the dimension of the problem increases. We then illustrate the algorithm via a high-dimensional estimation problem in risk management.
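
For orientation, here is the standard CE importance-sampling scheme that the paper improves upon, not the paper's variant: estimate P(X_1 + ... + X_d > gamma) for i.i.d. standard normals by iteratively tilting the proposal means toward the elite samples. In high dimensions the likelihood ratios in this plain version degenerate, which is exactly the problem the article addresses. The toy event and tuning constants are ours.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def ce_rare_event(gamma, d=10, n=5000, rho=0.1, n_iter=20):
    """Standard CE importance sampling for p = P(X_1 + ... + X_d > gamma), X_i i.i.d. N(0,1)."""
    mu = np.zeros(d)
    for _ in range(n_iter):
        X = rng.normal(mu, 1.0, size=(n, d))
        S = X.sum(axis=1)
        level = min(gamma, np.quantile(S, 1.0 - rho))          # adaptive intermediate level
        # likelihood ratio between the nominal N(0, I) density and the tilted proposal
        logW = norm.logpdf(X).sum(axis=1) - norm.logpdf(X, loc=mu).sum(axis=1)
        elite = S >= level
        w = np.exp(logW[elite])
        mu = (w[:, None] * X[elite]).sum(axis=0) / w.sum()     # CE update of the tilt
        if level >= gamma:
            break
    X = rng.normal(mu, 1.0, size=(n, d))
    S = X.sum(axis=1)
    logW = norm.logpdf(X).sum(axis=1) - norm.logpdf(X, loc=mu).sum(axis=1)
    return np.mean((S > gamma) * np.exp(logW))

gamma = 15.0
print("CE estimate:", ce_rare_event(gamma), " exact:", norm.sf(gamma / np.sqrt(10)))
```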

Journal ArticleDOI
TL;DR: This paper presents Smooth Functional Tempering, a new population Markov Chain Monte Carlo approach for posterior estimation of parameters, which tempers towards data features rather than tempering via approximations to the posterior that are more heavily rooted in the prior.
Abstract: Differential equations are used in modeling diverse system behaviors in a wide variety of sciences. Methods for estimating the differential equation parameters traditionally depend on the inclusion of initial system states and numerically solving the equations. This paper presents Smooth Functional Tempering, a new population Markov Chain Monte Carlo approach for posterior estimation of parameters. The proposed method borrows insights from parallel tempering and model based smoothing to define a sequence of approximations to the posterior. The tempered approximations depend on relaxations of the solution to the differential equation model, reducing the need for estimating the initial system states and obtaining a numerical differential equation solution. Rather than tempering via approximations to the posterior that are more heavily rooted in the prior, this new method tempers towards data features. Using our proposed approach, we observed faster convergence and robustness to both initial values and prior distributions that do not reflect the features of the data. Two variations of the method are proposed and their performance is examined through simulation studies and a real application to the chemical reaction dynamics of producing nylon.

Journal ArticleDOI
TL;DR: This paper constructs kernels that reproduce the computer code complexity by mimicking its interaction structure by constructing a Kriging model suited for a general interaction structure, and will take advantage of the absence of interaction between some inputs.
Abstract: Kriging models have been widely used in computer experiments for the analysis of time-consuming computer codes. Based on kernels, they are flexible and can be tuned to many situations. In this paper, we construct kernels that reproduce the computer code complexity by mimicking its interaction structure. While the standard tensor-product kernel implicitly assumes that all interactions are active, the new kernels are suited for a general interaction structure, and will take advantage of the absence of interaction between some inputs. The methodology is twofold. First, the interaction structure is estimated from the data, using an initial standard Kriging model, and represented by a so-called FANOVA graph. New FANOVA-based sensitivity indices are introduced to detect active interactions. Then this graph is used to derive the form of the kernel, and the corresponding Kriging model is estimated by maximum likelihood. The performance of the overall procedure is illustrated by several 3-dimensional and 6-dimensional simulated and real examples. A substantial improvement is observed when the computer code has a relatively high level of complexity.

Journal ArticleDOI
TL;DR: This paper reviews the method of nonparametric combination of dependent permutation tests and its main properties along with some new results in experimental and observational situations (robust testing, multi-sided alternatives and testing for survival functions).
Abstract: In recent years permutation testing methods have increased both in number of applications and in solving complex multivariate problems. When available, permutation tests are essentially exact and nonparametric in a conditional context, where conditioning is on the pooled observed data set, which is often a set of sufficient statistics under the null hypothesis. In contrast, the reference null distribution of most parametric tests is only known asymptotically. Thus, for most sample sizes of practical interest, the possible lack of efficiency of permutation solutions may be compensated by the lack of approximation of parametric counterparts. There are many complex multivariate problems, quite common in empirical sciences, which are difficult to solve outside the conditional framework and in particular outside the method of nonparametric combination (NPC) of dependent permutation tests. In this paper we review this method and its main properties, along with some new results for experimental and observational situations (robust testing, multi-sided alternatives and testing for survival functions).
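
A compact sketch of the NPC idea with Fisher's combining function, on a toy two-sample problem with two dependent response variables and one-sided partial tests of our own construction: the same permutations are used for every partial test so that their dependence is preserved, and the combined statistic is itself calibrated by permutation.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy two-sample data with two dependent response variables, shifted in group 1
n1, n2 = 30, 30
g = np.repeat([0, 1], [n1, n2])
Y = rng.normal(size=(n1 + n2, 2))
Y[g == 1] += [0.6, 0.4]

def partial_stats(y, g):
    # partial test statistics: one group-mean difference per variable (one-sided, "greater")
    return y[g == 1].mean(axis=0) - y[g == 0].mean(axis=0)

B = 2000
T_obs = partial_stats(Y, g)
T_perm = np.array([partial_stats(Y, rng.permutation(g)) for _ in range(B)])

# partial permutation p-values: the same permutations serve every partial test
p_obs = (1 + np.sum(T_perm >= T_obs, axis=0)) / (B + 1)
p_perm = (1 + np.sum(T_perm[None, :, :] >= T_perm[:, None, :], axis=1)) / (B + 1)

# Fisher nonparametric combination and its permutation-calibrated global p-value
psi = lambda p: -2.0 * np.log(p).sum(axis=-1)
p_global = (1 + np.sum(psi(p_perm) >= psi(p_obs))) / (B + 1)
print("partial p-values:", p_obs, " NPC global p-value:", p_global)
```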

Journal ArticleDOI
TL;DR: The generalized Pareto distribution beyond a given threshold is combined with a nonparametric estimation approach below the threshold and this semiparametric setup is shown to generalize a few existing approaches and enables density estimation over the complete sample space.
Abstract: This paper is concerned with extreme value density estimation. The generalized Pareto distribution (GPD) beyond a given threshold is combined with a nonparametric estimation approach below the threshold. This semiparametric setup is shown to generalize a few existing approaches and enables density estimation over the complete sample space. Estimation is performed via the Bayesian paradigm, which helps identify model components. Estimation of all model parameters, including the threshold and higher quantiles, and prediction for future observations is provided. Simulation studies suggest a few useful guidelines to evaluate the relevance of the proposed procedures. They also provide empirical evidence about the improvement of the proposed methodology over existing approaches. Models are then applied to environmental data sets. The paper is concluded with a few directions for future work.
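
A frequentist toy version of the semiparametric construction (the paper works in a Bayesian framework and also treats the threshold as unknown): a kernel density estimate below a fixed threshold u is glued to a generalized Pareto tail above it, with the two pieces weighted by the empirical tail mass. The sample, the 90% threshold and the `density` helper are illustrative choices of ours.

```python
import numpy as np
from scipy.stats import genpareto, gaussian_kde

rng = np.random.default_rng(0)
x = rng.standard_t(df=4, size=2000)            # heavy-tailed sample

u = np.quantile(x, 0.90)                       # fixed tail threshold
zeta = np.mean(x > u)                          # empirical mass above the threshold

# GPD for the exceedances, kernel density estimate for the body
c, loc, scale = genpareto.fit(x[x > u] - u, floc=0.0)
kde = gaussian_kde(x[x <= u])
body_mass = kde.integrate_box_1d(-np.inf, u)   # renormalise the KDE to the body region

def density(t):
    t = np.atleast_1d(t)
    return np.where(t <= u,
                    (1.0 - zeta) * kde(t) / body_mass,
                    zeta * genpareto.pdf(t - u, c, loc=0.0, scale=scale))

print(density(np.linspace(-6, 8, 5)))
```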

Journal ArticleDOI
TL;DR: This work quantifies the phenomenon that, in certain conditions, the contrast between the nearest and the farthest neighbouring points vanishes as the data dimensionality increases, by bounding, in a distribution-free manner, the tails of the probability that distances become meaningless.
Abstract: Distance concentration is the phenomenon that, in certain conditions, the contrast between the nearest and the farthest neighbouring points vanishes as the data dimensionality increases. It affects high dimensional data processing, analysis, retrieval, and indexing, which all rely on some notion of distance or dissimilarity. Previous work has characterised this phenomenon in the limit of infinite dimensions. However, real data is finite dimensional, and hence the infinite-dimensional characterisation is insufficient. Here we quantify the phenomenon more precisely, for the possibly high but finite dimensional case in a distribution-free manner, by bounding the tails of the probability that distances become meaningless. As an application, we show how this can be used to assess the concentration of a given distance function in some unknown data distribution solely on the basis of an available data sample from it. This can be used to test and detect problematic cases more rigorously than it is currently possible, and we demonstrate the working of this approach on both synthetic data and ten real-world data sets from different domains.
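
The phenomenon itself is easy to reproduce empirically; the few lines below (an illustration of ours, not the paper's bounds) print the relative contrast (D_max − D_min)/D_min from a random query point to a uniform sample as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# empirical relative contrast shrinks as the dimension grows
n = 1000
for d in [2, 10, 100, 1000]:
    X = rng.uniform(size=(n, d))
    q = rng.uniform(size=d)
    dist = np.linalg.norm(X - q, axis=1)
    contrast = (dist.max() - dist.min()) / dist.min()
    print(f"d={d:5d}  relative contrast = {contrast:.3f}")
```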

Journal ArticleDOI
TL;DR: The Gaussian rank correlation as discussed by the authors is the usual correlation coefficient computed from the normal scores of the data and it has attractive robustness properties, in particular, its breakdown point is above 12%.
Abstract: The Gaussian rank correlation equals the usual correlation coefficient computed from the normal scores of the data. Although its influence function is unbounded, it still has attractive robustness properties. In particular, its breakdown point is above 12%. Moreover, the estimator is consistent and asymptotically efficient at the normal distribution. The correlation matrix obtained from pairwise Gaussian rank correlations is always positive semidefinite, and very easy to compute, also in high dimensions. We compare the properties of the Gaussian rank correlation with the popular Kendall and Spearman correlation measures. A simulation study confirms the good efficiency and robustness properties of the Gaussian rank correlation. In the empirical application, we show how it can be used for multivariate outlier detection based on robust principal component analysis.
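
The estimator itself is essentially one line: the Pearson correlation of the normal scores of the ranks. A quick sketch follows, with a contaminated toy sample (the contamination scheme is ours) to illustrate the robustness claim.

```python
import numpy as np
from scipy.stats import norm, rankdata

def gaussian_rank_corr(x, y):
    """Pearson correlation of the normal scores of the ranks."""
    n = len(x)
    zx = norm.ppf(rankdata(x) / (n + 1))
    zy = norm.ppf(rankdata(y) / (n + 1))
    return np.corrcoef(zx, zy)[0, 1]

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.6 * x + 0.8 * rng.normal(size=500)
y[:5] = 100.0                                  # a few gross outliers
print("Pearson:", np.corrcoef(x, y)[0, 1], " Gaussian rank:", gaussian_rank_corr(x, y))
```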

Journal ArticleDOI
TL;DR: A new class of distributions, multivariate t distributions with the Box-Cox transformation, is proposed for mixture modeling, which provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues.
Abstract: Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.

Journal ArticleDOI
TL;DR: The proposed methodology is particularly useful for analyzing multimodal asymmetric data as produced by major biotechnological platforms like flow cytometry and is provided with the help of an illustrative example.
Abstract: This paper deals with the problem of maximum likelihood estimation for a mixture of skew Student-t-normal distributions, which is a novel model-based tool for clustering heterogeneous (multiple groups) data in the presence of skewed and heavy-tailed outcomes. We present two analytically simple EM-type algorithms for iteratively computing the maximum likelihood estimates. The observed information matrix is derived for obtaining the asymptotic standard errors of parameter estimates. A small simulation study is conducted to demonstrate the superiority of the skew Student-t-normal distribution compared to the skew t distribution. The proposed methodology is particularly useful for analyzing multimodal asymmetric data as produced by major biotechnological platforms like flow cytometry. We provide such an application with the help of an illustrative example.

Journal ArticleDOI
TL;DR: This work considers how the tempered transitions algorithm may be tuned to increase the acceptance rates for a given number of temperatures and finds that the commonly assumed geometric spacing of temperatures is reasonable in many but not all applications.
Abstract: The method of tempered transitions was proposed by Neal (Stat. Comput. 6:353–366, 1996) for tackling the difficulties arising when using Markov chain Monte Carlo to sample from multimodal distributions. In common with methods such as simulated tempering and Metropolis-coupled MCMC, the key idea is to utilise a series of successively easier to sample distributions to improve movement around the state space. Tempered transitions does this by incorporating moves through these less modal distributions into the MCMC proposals. Unfortunately the improved movement between modes comes at a high computational cost with a low acceptance rate of expensive proposals. We consider how the algorithm may be tuned to increase the acceptance rates for a given number of temperatures. We find that the commonly assumed geometric spacing of temperatures is reasonable in many but not all applications.
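
A sketch of one tempered-transition move on a toy bimodal target, using the geometric temperature ladder discussed in the paper; the target, the random-walk step size and the number of rungs are our illustrative choices. The chain is pushed up through flatter powers π^β of the target and back down, and the whole excursion is accepted or rejected at once.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # toy bimodal target: equal mixture of N(-4, 1) and N(4, 1)
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def metropolis_step(x, beta, step=2.0):
    # random-walk Metropolis step leaving the tempered density pi^beta invariant
    y = x + step * rng.normal()
    return y if np.log(rng.uniform()) < beta * (log_target(y) - log_target(x)) else x

def tempered_transition(x0, betas, step=2.0):
    # betas[0] = 1 (cold target) > betas[1] > ... > betas[-1] (hottest level)
    x, log_r = x0, 0.0
    for i in range(1, len(betas)):             # upward pass through flatter distributions
        log_r += (betas[i] - betas[i - 1]) * log_target(x)
        x = metropolis_step(x, betas[i], step)
    for i in range(len(betas) - 1, 0, -1):     # downward pass back to the cold level
        x = metropolis_step(x, betas[i], step)
        log_r += (betas[i - 1] - betas[i]) * log_target(x)
    return (x, True) if np.log(rng.uniform()) < log_r else (x0, False)

n_levels, beta_min = 10, 0.05
betas = beta_min ** (np.arange(n_levels + 1) / n_levels)   # geometric temperature ladder

x, accepted, xs = 4.0, 0, []
for _ in range(2000):
    x, ok = tempered_transition(x, betas)
    accepted += ok
    xs.append(x)
print("acceptance rate:", accepted / 2000, " fraction in left mode:", np.mean(np.array(xs) < 0))
```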