
Showing papers in "Journal of Computational and Graphical Statistics in 2004"


Journal ArticleDOI
TL;DR: In this article, a Bayesian version of P-splines is proposed for modeling nonlinear smooth effects of covariates within the additive and varying coefficient models framework; allowing the smoothing parameters to be locally adaptive makes the approach particularly useful in situations with changing curvature of the underlying smooth function or with highly oscillating functions.
Abstract: P-splines are an attractive approach for modeling nonlinear smooth effects of covariates within the additive and varying coefficient models framework. In this article, we first develop a Bayesian version of P-splines and then generalize the approach in several ways. First, the assumption of constant smoothing parameters can be replaced by allowing the smoothing parameters to be locally adaptive. This is particularly useful in situations with changing curvature of the underlying smooth function or with highly oscillating functions. In a second extension, one-dimensional P-splines are generalized to two-dimensional surface fitting for modeling interactions between metrical covariates. In a last step, the approach is extended to situations with spatially correlated responses allowing the estimation of geoadditive models. Inference is fully Bayesian and uses recent MCMC techniques for drawing random samples from the posterior. In a couple of simulation studies the performance of Bayesian P-spline...
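
A minimal frequentist sketch of the underlying idea, assuming a fixed global smoothing parameter lam: a B-spline basis on equally spaced knots combined with a second-order difference penalty, fitted by penalized least squares (the classical Eilers-Marx P-spline). The article's fully Bayesian, locally adaptive MCMC machinery is not shown; the knot count, penalty order, and test function below are illustrative choices.

    import numpy as np
    from scipy.interpolate import BSpline

    def pspline_fit(x, y, nseg=20, degree=3, lam=1.0):
        # Equally spaced B-spline basis covering [min(x), max(x)]
        xl, xr = x.min(), x.max()
        dx = (xr - xl) / nseg
        knots = xl + dx * np.arange(-degree, nseg + degree + 1)
        nb = nseg + degree                            # number of basis functions
        B = np.column_stack([BSpline(knots, np.eye(nb)[i], degree)(x) for i in range(nb)])
        D = np.diff(np.eye(nb), n=2, axis=0)          # second-order difference penalty
        coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
        return B @ coef                               # fitted smooth evaluated at x

    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0, 1, 200))
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=200)
    fit = pspline_fit(x, y, lam=1.0)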

889 citations


Journal ArticleDOI
TL;DR: A split-merge Markov chain algorithm is proposed to address inefficient sampling for conjugate Dirichlet process mixture models; an appropriate proposal for splitting or merging components is obtained via a restricted Gibbs sampling scan.
Abstract: This article proposes a split-merge Markov chain algorithm to address the problem of inefficient sampling for conjugate Dirichlet process mixture models. Traditional Markov chain Monte Carlo methods for Bayesian mixture models, such as Gibbs sampling, can become trapped in isolated modes corresponding to an inappropriate clustering of data points. This article describes a Metropolis-Hastings procedure that can escape such local modes by splitting or merging mixture components. Our algorithm employs a new technique in which an appropriate proposal for splitting or merging components is obtained by using a restricted Gibbs sampling scan. We demonstrate empirically that our method outperforms the Gibbs sampler in situations where two or more components are similar in structure.

527 citations


Journal ArticleDOI
TL;DR: The population Monte Carlo principle iterates importance sampling with importance functions that depend on the previously generated samples; the method is illustrated on a mixture example and on a reanalysis of the ion channel model via a hidden Markov representation, where population Monte Carlo is compared with a corresponding MCMC algorithm.
Abstract: Importance sampling methods can be iterated like MCMC algorithms, while being more robust against dependence and starting values. The population Monte Carlo principle consists of iterated generations of importance samples, with importance functions depending on the previously generated importance samples. The advantage over MCMC algorithms is that the scheme is unbiased at any iteration and can thus be stopped at any time, while iterations improve the performance of the importance function, thus leading to an adaptive importance sampling. We illustrate this method on a mixture example with multiscale importance functions. A second example reanalyzes the ion channel model using an importance sampling scheme based on a hidden Markov representation, and compares population Monte Carlo with a corresponding MCMC algorithm.
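
A rough one-dimensional sketch of the population Monte Carlo idea, assuming a made-up two-component normal mixture target and a Gaussian moving kernel (neither is the paper's example): at each iteration the previous sample is resampled by importance weight, perturbed, and reweighted against the resulting kernel-mixture importance function.

    import numpy as np
    from scipy import stats

    def target_logpdf(x):
        # toy target: two-component normal mixture with mean 1.5
        return np.logaddexp(np.log(0.3) + stats.norm.logpdf(x, -2.0, 0.5),
                            np.log(0.7) + stats.norm.logpdf(x, 3.0, 1.0))

    rng = np.random.default_rng(0)
    N, T, sigma = 1000, 10, 1.0
    x = rng.normal(0.0, 5.0, N)                       # initial importance sample
    logq = stats.norm.logpdf(x, 0.0, 5.0)             # its importance density
    for t in range(T):
        logw = target_logpdf(x) - logq
        w = np.exp(logw - logw.max()); w /= w.sum()
        centers = rng.choice(x, size=N, p=w)          # resample by importance weight
        x = centers + sigma * rng.normal(size=N)      # move with a Gaussian kernel
        # importance density of the new sample: the kernel mixture over the centers
        logq = np.array([np.logaddexp.reduce(stats.norm.logpdf(xi, centers, sigma))
                         for xi in x]) - np.log(N)
    logw = target_logpdf(x) - logq
    w = np.exp(logw - logw.max()); w /= w.sum()
    print("estimated target mean:", np.sum(w * x))    # true value is 1.5

Because every iteration produces a properly weighted sample, the estimate above is valid whenever the loop is stopped, which is the stopping-time advantage described in the abstract.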

435 citations


Journal ArticleDOI
TL;DR: A general method for graphically representing any set of t(t−1)/2 all-pairwise significance statements (p values) for t treatments by a familiar letter display is described, applicable regardless of the underlying data structure or the statistical method used for comparisons.
Abstract: All-pairwise comparisons among a set of t treatments or groups are one of the most frequent tasks in applied statistics. Users of statistical software are accustomed to the familiar lines display, in which treatments that do not differ significantly are connected by a common line or letter. Availability of the lines display is restricted mainly to the balanced analysis of variance setup. This limited availability is at stark variance with the diversity of statistical methods and models, which call for multiple comparisons. This article describes a general method for graphically representing any set of t(t−1)/2 all-pairwise significance statements (p values) for t treatments by a familiar letter display, which is applicable regardless of the underlying data structure or the statistical method used for comparisons. The method reproduces the familiar lines display in case of the balanced analysis of variance. Its broad applicability is demonstrated using data from an international multienvironment wheat yie...
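
One simple (not letter-minimal) way to build such a display from an arbitrary p-value matrix is to connect treatments whose pairwise p value exceeds alpha and assign one letter per maximal clique of that graph. The sketch below assumes networkx and a hypothetical symmetric matrix pvals; it differs from the article's insert-and-absorb algorithm, which economizes on letters.

    import numpy as np
    import networkx as nx

    def letter_display(pvals, names, alpha=0.05):
        # pvals: symmetric t x t matrix of all-pairwise p values
        t = len(names)
        G = nx.Graph()
        G.add_nodes_from(range(t))
        for i in range(t):
            for j in range(i + 1, t):
                if pvals[i, j] > alpha:               # not significantly different
                    G.add_edge(i, j)
        letters = {i: [] for i in range(t)}
        for k, clique in enumerate(nx.find_cliques(G)):   # one letter per maximal clique
            for i in clique:
                letters[i].append(chr(ord("a") + k))
        return {names[i]: "".join(sorted(letters[i])) for i in range(t)}

Two treatments share a letter exactly when their comparison is nonsignificant, which is the defining property of the familiar display.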

287 citations


Journal ArticleDOI
TL;DR: The medcouple is proposed as a robust alternative to the classical skewness coefficient; it has a 25% breakdown value and a bounded influence function. A fast algorithm for its computation is presented, and its finite-sample behavior is investigated through simulated and real datasets.
Abstract: The asymmetry of a univariate continuous distribution is commonly measured by the classical skewness coefficient. Because this estimator is based on the first three moments of the dataset, it is strongly affected by the presence of one or more outliers. This article investigates the medcouple, a robust alternative to the classical skewness coefficient. We show that it has a 25% breakdown value and a bounded influence function. We present a fast algorithm for its computation, and investigate its finite-sample behavior through simulated and real datasets.
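
A naive O(n^2) sketch of the medcouple, assuming no observation is tied with the sample median (the article's fast algorithm runs in O(n log n) and handles ties):

    import numpy as np

    def medcouple_naive(x):
        # median of the kernel h(xi, xj) over pairs with xi <= median <= xj
        x = np.sort(np.asarray(x, dtype=float))
        med = np.median(x)
        lower, upper = x[x <= med], x[x >= med]
        h = [((xj - med) - (med - xi)) / (xj - xi)
             for xi in lower for xj in upper if xj != xi]
        return np.median(h)

    rng = np.random.default_rng(0)
    print(medcouple_naive(rng.normal(size=500)))       # near 0 for symmetric data
    print(medcouple_naive(rng.exponential(size=500)))  # clearly positive: right skew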

265 citations


Journal ArticleDOI
TL;DR: This article proposes an approach to unify exploratory data analysis with more formal statistical methods based on probability models, developed in the context of examples from fields including psychology, medicine, and social science.
Abstract: “Exploratory” and “confirmatory” data analysis can both be viewed as methods for comparing observed data to what would be obtained under an implicit or explicit statistical model. For example, many of Tukey's methods can be interpreted as checks against hypothetical linear models and Poisson distributions. In more complex situations, Bayesian methods can be useful for constructing reference distributions for various plots that are useful in exploratory data analysis. This article proposes an approach to unify exploratory data analysis with more formal statistical methods based on probability models. These ideas are developed in the context of examples from fields including psychology, medicine, and social science.
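
A minimal numerical version of this reference-distribution idea, assuming a made-up conjugate Poisson-Gamma model and a dispersion check statistic; the article emphasizes graphical comparisons, for which the same replicated datasets would be drawn as reference panels around the observed plot.

    import numpy as np

    rng = np.random.default_rng(1)
    y = rng.poisson(4.0, size=50)                     # hypothetical observed counts
    # Gamma(a, b) prior on the Poisson rate gives a Gamma(a + sum(y), b + n) posterior
    a, b = 1.0, 1.0
    post_shape, post_rate = a + y.sum(), b + len(y)
    T_obs = y.var() / y.mean()                        # check statistic: dispersion index
    T_rep = []
    for _ in range(2000):
        lam = rng.gamma(post_shape, 1.0 / post_rate)  # draw a rate from the posterior
        y_rep = rng.poisson(lam, size=len(y))         # replicate data under the model
        T_rep.append(y_rep.var() / y_rep.mean())
    print("posterior predictive p value:", np.mean(np.array(T_rep) >= T_obs))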

204 citations


Journal ArticleDOI
TL;DR: In this paper, the Haar-Fisz transformation is used to bring binned Poisson counts to approximate normality so that Gaussian wavelet shrinkage methods can be applied to estimate the intensity of an inhomogeneous one-dimensional Poisson process.
Abstract: This article introduces a new method for the estimation of the intensity of an inhomogeneous one-dimensional Poisson process. The Haar-Fisz transformation transforms a vector of binned Poisson counts to approximate normality with variance one. Hence we can use any suitable Gaussian wavelet shrinkage method to estimate the Poisson intensity. Since the Haar-Fisz operator does not commute with the shift operator, we can dramatically improve accuracy by always cycle spinning before the Haar-Fisz transform as well as optionally after. Extensive simulations show that our approach usually significantly outperformed state-of-the-art competitors but was occasionally comparable. Our method is fast, simple, automatic, and easy to code. Our technique is applied to the estimation of the intensity of earthquakes in northern California. We show that our technique gives visually similar results to the current state-of-the-art.
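
A direct transcription of the transform as described above (no cycle spinning and no subsequent wavelet shrinkage, both of which the article adds), assuming the count vector has length a power of two:

    import numpy as np

    def haar_fisz(v):
        # Haar-Fisz transform of binned Poisson counts
        s = np.asarray(v, dtype=float)
        details = []
        while len(s) > 1:
            sm = (s[0::2] + s[1::2]) / 2.0            # Haar smooth coefficients
            d = (s[0::2] - s[1::2]) / 2.0             # Haar detail coefficients
            details.append(np.where(sm > 0, d / np.sqrt(sm), 0.0))   # Fisz step
            s = sm
        u = s
        for f in reversed(details):                   # invert the Haar transform
            up = np.empty(2 * len(u))
            up[0::2] = u + f
            up[1::2] = u - f
            u = up
        return u

    counts = np.random.default_rng(2).poisson(5.0, size=256)
    z = haar_fisz(counts)   # approximately Gaussian, variance one; ready for shrinkage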

176 citations


Journal ArticleDOI
TL;DR: This article proposes to fit a piecewise (multiple or simple) linear logistic regression model by recursively partitioning the data and fitting a different logistic regression in each partition, which allows nonlinear features of the data to be modeled without requiring variable transformations.
Abstract: Logistic regression is a powerful technique for fitting models to data with a binary response variable, but the models are difficult to interpret if collinearity, nonlinearity, or interactions are present. Besides, it is hard to judge model adequacy because there are few diagnostics for choosing variable transformations and no true goodness-of-fit test. To overcome these problems, this article proposes to fit a piecewise (multiple or simple) linear logistic regression model by recursively partitioning the data and fitting a different logistic regression in each partition. This allows nonlinear features of the data to be modeled without requiring variable transformations. The binary tree that results from the partitioning process is pruned to minimize a cross-validation estimate of the predicted deviance. This obviates the need for a formal goodness-of-fit test. The resulting model is especially easy to interpret if a simple linear logistic regression is fitted to each partition, because the tree structure...
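
A much-reduced sketch of the partitioning step, assuming scikit-learn: a single binary split is chosen greedily by the drop in deviance when a separate logistic regression is fitted in each half. The article's procedure instead selects split variables, recurses, and prunes the tree by cross-validated predicted deviance; the candidate cutpoints and leaf-size limit below are arbitrary choices.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def deviance(model, X, y):
        p = np.clip(model.predict_proba(X)[:, 1], 1e-10, 1 - 1e-10)
        return -2.0 * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    def best_single_split(X, y, min_leaf=30):
        fit = lambda X, y: LogisticRegression(C=1e6, max_iter=1000).fit(X, y)  # ~unpenalized
        best_dev, best_split = deviance(fit(X, y), X, y), None
        for j in range(X.shape[1]):
            for c in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
                left = X[:, j] <= c
                if min(left.sum(), (~left).sum()) < min_leaf:
                    continue
                if len(np.unique(y[left])) < 2 or len(np.unique(y[~left])) < 2:
                    continue
                dev = (deviance(fit(X[left], y[left]), X[left], y[left]) +
                       deviance(fit(X[~left], y[~left]), X[~left], y[~left]))
                if dev < best_dev:
                    best_dev, best_split = dev, (j, c)
        return best_split, best_dev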

127 citations


Journal ArticleDOI
TL;DR: The flexible moving-average function that is considered is composed of many small rectangles, which eliminates the integration problem; the FFT allows the cross-variogram to be computed on a set of discrete lags, and it is shown how to interpolate the cross-variogram for any continuous lag, which allows flexible models to be fitted using standard minimization routines.
Abstract: Models for spatial autocorrelation and cross-correlation depend on the distance and direction separating two locations, and are constrained so that for all possible sets of locations, the covariance matrices implied from the models remain nonnegative-definite. Based on spatial correlation, optimal linear predictors can be constructed that yield complete maps of spatial fields from incomplete and noisy spatial data. This methodology is called kriging if the data are of only one variable type, and it is called cokriging if it is of two or more variable types. Historically, to satisfy the nonnegative-definite condition, cokriging has used coregionalization models for cross-variograms, even though this class of models is not very flexible. Recent research has shown that moving-average functions may be used to generate a large class of valid, flexible variogram models, and that they can also be used to generate valid cross-variograms that are compatible with component variograms. There are several problems wit...

114 citations


Journal ArticleDOI
TL;DR: In this paper, generalized linear spatial models are considered, and it is demonstrated that parameter estimation and model selection using Markov chain Monte Carlo maximum likelihood is a feasible and very useful technique, illustrated with a dataset of radionuclide concentrations on Rongelap Island.
Abstract: When using a model-based approach to geostatistical problems, often, due to the complexity of the models, inference relies on Markov chain Monte Carlo methods. This article focuses on the generalized linear spatial models, and demonstrates that parameter estimation and model selection using Markov chain Monte Carlo maximum likelihood is a feasible and very useful technique. A dataset of radionuclide concentrations on Rongelap Island is used to illustrate the techniques. For this dataset we demonstrate that the log-link function is not a good choice, and that there exists additional nonspatial variation which cannot be attributed to the Poisson error distribution. We also show that the interpretation of this additional variation as either micro-scale variation or measurement error has a significant impact on predictions. The techniques presented in this article would also be useful for other types of geostatistical models.

106 citations


Journal ArticleDOI
TL;DR: This article demonstrates that the impact of high dimensions is much less severe when the component displays are clustered together according to some index of merit, which reduces the dimensionality and makes interpretation easier.
Abstract: Many graphical methods for displaying multivariate data consist of arrangements of multiple displays of one or two variables; scatterplot matrices and parallel coordinates plots are two such methods. In principle these methods generalize to arbitrary numbers of variables but become difficult to interpret for even moderate numbers of variables. This article demonstrates that the impact of high dimensions is much less severe when the component displays are clustered together according to some index of merit. Effectively, this clustering reduces the dimensionality and makes interpretation easier. For scatterplot matrices and parallel coordinates plots clustering of component displays is achieved by finding suitable permutations of the variables. I discuss algorithms based on cluster analysis for finding permutations, and present examples using various indices of merit.
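
A small sketch of one such permutation strategy, assuming absolute correlation as the index of merit and average-linkage leaf order as the clustering step (the article compares several indices and algorithms):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, leaves_list
    from scipy.spatial.distance import squareform

    def cluster_order(X):
        # Permute variables so strongly associated ones sit next to each other
        R = np.corrcoef(X, rowvar=False)
        D = 1.0 - np.abs(R)                     # dissimilarity: 1 - |correlation|
        np.fill_diagonal(D, 0.0)
        Z = linkage(squareform(D, checks=False), method="average")
        return leaves_list(Z)                   # permutation of the variable indices

    # usage: reorder columns before drawing the scatterplot matrix or parallel
    # coordinates plot, e.g. df.iloc[:, cluster_order(df.to_numpy())]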

Journal ArticleDOI
TL;DR: A new sampling method is proposed, based on an analogy with the Swendsen-Wang algorithm for the Ising model, which can give substantial improvements over alternative sampling schemes in the presence of multicollinearity.
Abstract: The need to explore model uncertainty in linear regression models with many predictors has motivated improvements in Markov chain Monte Carlo sampling algorithms for Bayesian variable selection. Currently used sampling algorithms for Bayesian variable selection may perform poorly when there are severe multicollinearities among the predictors. This article describes a new sampling method based on an analogy with the Swendsen-Wang algorithm for the Ising model, which can give substantial improvements over alternative sampling schemes in the presence of multicollinearity. In linear regression with a given set of potential predictors we can index possible models by a binary parameter vector that indicates which of the predictors are included or excluded. By thinking of the posterior distribution of this parameter as a binary spatial field, we can use auxiliary variable methods inspired by the Swendsen-Wang algorithm for the Ising model to sample from the posterior where dependence among parameters is redu...

Journal ArticleDOI
TL;DR: A method of constructing regression trees within the framework of maximum likelihood is proposed; it inherits the backward fitting idea of classification and regression trees (CART) but has more rigorous justification.
Abstract: We propose a method of constructing regression trees within the framework of maximum likelihood. It inherits the backward fitting idea of classification and regression trees (CART) but has more rigorous justification. Simulation studies show that it provides more accurate tree model selection compared to CART. The analysis of a baseball dataset is given as an illustration.

Journal ArticleDOI
TL;DR: This article provides an efficient implementation of the so-called context algorithm, which requires only O(n log(n)) operations, and includes additional important new features and options: diagnostics, goodness of fit, simulation and bootstrap, residuals, and tuning of the context algorithm.
Abstract: This article presents a tutorial and new, publicly available computational tools for variable length Markov chains (VLMC). VLMCs are Markov chains with the additional attractive structure that their memories depend on a variable number of lagged values, depending on what the actual past (the lagged values) looks like. They build a very flexible class of tree-structured models for categorical time series. Fitting VLMCs from data is a nontrivial computational task. We provide an efficient implementation of the so-called context algorithm which requires only O(n log(n)) operations. The implementation, which is publicly available, includes additional important new features and options: diagnostics, goodness of fit, simulation and bootstrap, residuals, and tuning the context algorithm. Our tutorial is presented with a version in R which is available from the Comprehensive R Archive Network (CRAN). The exposition is self-contained, gives rigorous and partly new mathematical descriptions, and is illustrated by a...

Journal ArticleDOI
TL;DR: In this article, the problem of making simultaneous probability statements in multivariate inferential problems based on samples from a posterior distribution is considered; a Monte Carlo method to estimate contour probabilities and an approach based on Rao-Blackwellization are proposed.
Abstract: This article considers the problem of making simultaneous probability statements in multivariate inferential problems based on samples from a posterior distribution. The calculation of simultaneous credible bands is reviewed and—as an alternative—contour probabilities are proposed. These are defined as 1 minus the content of the highest posterior density region which just covers a certain point of interest. We discuss a Monte Carlo method to estimate contour probabilities and distinguish whether or not the functional form of the posterior density is available. In the latter case, an approach based on Rao-Blackwellization is proposed. We highlight that this new estimate has an important invariance property. We illustrate the performance of the different methods in three applications.
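
For the case where the posterior density is available in closed form, the Monte Carlo estimate is simply the fraction of posterior draws whose density falls below the density at the point of interest. The sketch assumes a made-up bivariate normal posterior and does not show the Rao-Blackwellized estimate for the unknown-density case.

    import numpy as np
    from scipy import stats

    # Contour probability of theta0: Pr{ f(theta) < f(theta0) } under the posterior,
    # i.e. one minus the content of the HPD region that just covers theta0
    rng = np.random.default_rng(0)
    post = stats.multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.5], [0.5, 1.0]])
    draws = post.rvs(size=20000, random_state=rng)
    theta0 = np.array([1.0, -1.0])
    print("contour probability:", np.mean(post.pdf(draws) < post.pdf(theta0)))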

Journal ArticleDOI
TL;DR: In this paper, the authors present an algorithm for accommodating missing data in situations where a natural set of estimating equations exists for the complete data setting; these equations can correspond to the score functions from a standard, partial, or quasi-likelihood, or they can be generalized estimating equations (GEEs).
Abstract: This article presents an algorithm for accommodating missing data in situations where a natural set of estimating equations exists for the complete data setting. The complete data estimating equations can correspond to the score functions from a standard, partial, or quasi-likelihood, or they can be generalized estimating equations (GEEs). In analogy to the EM, which is a special case, the method is called the ES algorithm, because it iterates between an E-Step wherein functions of the complete data are replaced by their expected values, and an S-Step where these expected values are substituted into the complete-data estimating equation, which is then solved. Convergence properties of the algorithm are established by appealing to general theory for iterative solutions to nonlinear equations. In particular, the ES algorithm (and indeed the EM) are shown to correspond to examples of nonlinear Gauss-Seidel algorithms. An added advantage of the approach is that it yields a computationally simple method for es...

Journal ArticleDOI
TL;DR: In this article, a unifying l1-penalized likelihood approach is proposed that regularizes maximum likelihood estimation by adding an l1 penalty on the wavelet coefficients; it works for all types of wavelets and for a range of noise distributions.
Abstract: Wavelet-based denoising techniques are well suited to estimate spatially inhomogeneous signals. Waveshrink (Donoho and Johnstone) assumes independent Gaussian errors and equispaced sampling of the signal. Various articles have relaxed some of these assumptions, but a systematic generalization to distributions such as Poisson, binomial, or Bernoulli is missing. We consider a unifying l1-penalized likelihood approach to regularize the maximum likelihood estimation by adding an l1 penalty of the wavelet coefficients. Our approach works for all types of wavelets and for a range of noise distributions. We develop both an algorithm to solve the estimation problem and rules to select the smoothing parameter automatically. In particular, using results from Poisson processes, we give an explicit formula for the universal smoothing parameter to denoise Poisson measurements. Simulations show that the procedure is an improvement over other methods. An astronomy example is given.

Journal ArticleDOI
TL;DR: In this article, the problem of optimal pair matching with two control groups is shown by a series of transformations to be equivalent to a particular form of optimal nonbipartite matching, a problem for which polynomial time algorithms exist.
Abstract: In an effort to detect hidden biases due to failure to control for an unobserved covariate, some observational or nonrandomized studies include two control groups selected to systematically vary the unobserved covariate. Comparisons of the treated group and two control groups must, of course, control for imbalances in observed covariates. Using the three groups, we form pairs optimally matched for observed covariates—that is, we optimally construct from observational data an incomplete block design. The incomplete block design may use all available data, or it may use data selectively to produce a balanced incomplete block design, or it may be the basis for constructing a matched sample when expensive outcome information is to be collected only for sampled individuals. The problem of optimal pair matching with two control groups is shown by a series of transformations to be equivalent to a particular form of optimal nonbipartite matching, a problem for which polynomial time algorithms exist. In our exampl...
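
The core computational step, optimal nonbipartite pair matching, can be sketched with networkx by negating distances and asking for a maximum-weight perfect matching. The article's reduction additionally encodes the treated group and the two control groups, which is not shown here, and the covariates below are made up.

    import numpy as np
    import networkx as nx

    def optimal_pairs(X):
        # Pair an even number of units so total within-pair covariate distance is minimal
        n = len(X)
        G = nx.Graph()
        for i in range(n):
            for j in range(i + 1, n):
                # max_weight_matching maximizes, so use negated distances
                G.add_edge(i, j, weight=-np.linalg.norm(X[i] - X[j]))
        return nx.max_weight_matching(G, maxcardinality=True)

    rng = np.random.default_rng(3)
    print(optimal_pairs(rng.normal(size=(10, 2))))    # five pairs of unit indices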

Journal ArticleDOI
TL;DR: The goal is to construct graphics that not only detect the influential points but also classify the observations according to their robust distances, giving additional insight into the data by detecting different types of deviating observations.
Abstract: Robust techniques for multivariate statistical methods—such as principal component analysis, canonical correlation analysis, and factor analysis—have been recently constructed. In contrast to the classical approach, these robust techniques are able to resist the effect of outliers. However, there does not yet exist a graphical tool to identify in a comprehensive way the data points that do not obey the model assumptions. Our goal is to construct such graphics based on empirical influence functions. These graphics not only detect the influential points but also classify the observations according to their robust distances. In this way the observations are divided into four different classes which are regular points, nonoutlying influential points, influential outliers, and noninfluential outliers. We thus gain additional insight in the data by detecting different types of deviating observations. Some real data examples will be given to show how these plots can be used in practice.

Journal ArticleDOI
TL;DR: The baseline distribution of the accelerated failure-time (AFT) model is modeled as a mixture of Dirichlet processes, which can be viewed as a simple extension of existing parametric models, and a novel MCMC scheme is introduced for making posterior inferences for the AFT regression model.
Abstract: We model the baseline distribution in the accelerated failure-time (AFT) model as a mixture of Dirichlet processes for interval-censored data. This mixture is distinct from Dirichlet process mixtures, and can be viewed as a simple extension of existing parametric models, which we believe is an advantage in the practical modeling of data. We introduce a novel MCMC scheme for the purpose of making posterior inferences for the AFT regression model and illustrate our methods with several real examples.

Journal ArticleDOI
TL;DR: This article describes a sampling algorithm based on the Swendsen-Wang algorithm for the Ising model, which works well when the predictors are far from orthogonality, and extends a similar algorithm for variable selection problems in linear models.
Abstract: Bayesian approaches to prediction and the assessment of predictive uncertainty in generalized linear models are often based on averaging predictions over different models, and this requires methods for accounting for model uncertainty. When there are linear dependencies among potential predictor variables in a generalized linear model, existing Markov chain Monte Carlo algorithms for sampling from the posterior distribution on the model and parameter space in Bayesian variable selection problems may not work well. This article describes a sampling algorithm based on the Swendsen-Wang algorithm for the Ising model, which works well when the predictors are far from orthogonality. In problems of variable selection for generalized linear models we can index different models by a binary parameter vector, where each binary variable indicates whether or not a given predictor variable is included in the model. The posterior distribution on the model is a distribution on this collection of binary strings, and ...

Journal ArticleDOI
TL;DR: In this article, the authors introduce an estimate of the distribution of the deviance residuals of generalized linear regression models and propose a new Q-Q plot where the observed deviances are plotted against the quantiles of the estimated distribution.
Abstract: The normal quantile–quantile (Q–Q) plot of residuals is a popular diagnostic tool for ordinary linear regression with normal errors. However, for some generalized linear regression models, the distribution of deviance residuals may be very far from normality, and therefore the corresponding normal Q–Q plots may be misleading for checking model adequacy. We introduce an estimate of the distribution of the deviance residuals of generalized linear models. We propose a new Q–Q plot where the observed deviance residuals are plotted against the quantiles of the estimated distribution. The method is illustrated by the analysis of real and simulated data.
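
A crude simulation substitute for the estimated distribution, assuming statsmodels and a made-up Poisson regression: replicate datasets are drawn from the fitted model, and the pointwise means of their sorted deviance residuals serve as reference quantiles for the plot (the article instead derives the estimate analytically).

    import numpy as np
    import matplotlib.pyplot as plt
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    x = rng.uniform(size=200)
    y = rng.poisson(np.exp(0.5 + 1.5 * x))            # hypothetical Poisson regression data
    X = sm.add_constant(x)
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

    sims = []
    for _ in range(200):
        y_rep = rng.poisson(fit.mu)                   # replicate responses from the fit
        rep_fit = sm.GLM(y_rep, X, family=sm.families.Poisson()).fit()
        sims.append(np.sort(rep_fit.resid_deviance))
    ref = np.mean(sims, axis=0)                       # estimated reference quantiles

    plt.scatter(ref, np.sort(fit.resid_deviance), s=8)
    plt.axline((0, 0), slope=1, color="grey")
    plt.xlabel("simulated deviance-residual quantiles")
    plt.ylabel("observed deviance residuals")
    plt.show()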

Journal ArticleDOI
TL;DR: It is argued that level set trees provide a useful method for exploratory data analysis and are applied for visualization of estimates of multivariate density functions.
Abstract: This article presents a method for visualization of multivariate functions. The method is based on a tree structure—called the level set tree—built from separated parts of level sets of a function. The method is applied for visualization of estimates of multivariate density functions. With different graphical representations of level set trees we may visualize the number and location of modes, excess masses associated with the modes, and certain shape characteristics of the estimate. Simulation examples are presented where projecting data to two dimensions does not help to reveal the modes of the density, but with the help of level set trees one may detect the modes. I argue that level set trees provide a useful method for exploratory data analysis.

Journal ArticleDOI
TL;DR: A primal-dual log-barrier interior point algorithm is proposed to solve the corresponding convex programming problem and a rule for the automatic selection of the smoothing parameters is derived, enabling the estimator to be fully automated in practice.
Abstract: A simple and yet powerful method is presented to estimate nonlinearly and nonparametrically the components of additive models using wavelets. The estimator enjoys the good statistical and computational properties of the Waveshrink scatterplot smoother, and it can be efficiently computed using the block coordinate relaxation optimization technique. A rule for the automatic selection of the smoothing parameters, suitable for data mining of large datasets, is derived. The wavelet-based method is then extended to estimate generalized additive models. A primal-dual log-barrier interior point algorithm is proposed to solve the corresponding convex programming problem. Based on an asymptotic analysis, a rule for selecting the smoothing parameters is derived, enabling the estimator to be fully automated in practice. We illustrate the finite sample property with a Gaussian and a Poisson simulation.

Journal ArticleDOI
TL;DR: In this paper, the authors extend the definition of wavelet variance to wavelet packets and apply it to time series of crack widths on the Brunelleschi dome of the Santa Maria del Fiore cathedral in Florence.
Abstract: In this article we extend the definition of wavelet variance to wavelet packets. We also adapt to wavelet packets an iterated cumulative sum of squares algorithm for the location of variance change points. Wavelet packets have greater decorrelation properties than standard wavelets in that they induce a finer partitioning of the frequency domain of the process generating the data. This allows our procedure to be applied to a wide class of processes. We show this on simulated data and on a benchmark time series. Our initial interest in wavelet variance change points location was motivated by an application to time series of crack widths on the Brunelleschi dome of the Santa Maria del Fiore cathedral in Florence. The structure of the dome includes an internal thick dome and an external thin one. In an effort to understand the dynamics of the crack widths we apply wavelet packet variance analysis to measurements from instruments located in the different parts of the outer and inner domes, highlighting differ...

Journal ArticleDOI
TL;DR: In this article, two projection methods, namely the iterative convex minorant algorithm (ICM) and a generalization of the Rosen algorithm (GR), are compared to the well-known EM algorithm.
Abstract: In this article, we study algorithms for computing the nonparametric maximum likelihood estimator (NPMLE) of the failure function with two types of censored data: doubly censored data and (type 2) interval-censored data. We consider two projection methods, namely the iterative convex minorant algorithm (ICM) and a generalization of the Rosen algorithm (GR), and compare these methods to the well-known EM algorithm. The comparison, conducted via simulation studies, shows that hybrid algorithms that alternate between the EM and GR for doubly censored data, or between the EM and ICM for (type 2) interval-censored data, appear to be much more efficient than the EM, especially in large-sample situations.

Journal ArticleDOI
TL;DR: A method is presented to estimate the normalizing constant of the autologistic model in an efficient manner, which allows tasks such as maximum likelihood estimation and inference for model parameters; estimates of model parameters based on these methods are compared with the commonly used pseudo-likelihood approach.
Abstract: The autologistic model is commonly used to model spatial binary data on the lattice. However, if the lattice size is too large, then exact calculation of its normalizing constant poses a major difficulty. Various different methods for estimation of model parameters, such as pseudo-likelihood, have been proposed to overcome this problem. This article presents a method to estimate the normalizing constant in an efficient manner. In particular, this allows tasks such as maximum likelihood estimation and inference for model parameters. We also consider the true likelihood approximated by the product of likelihoods for which the normalizing constant can be found by an analytic computational method by wrapping the lattice on the cylinder. This gives a simulation-free method of inference. We compare estimates of model parameters based on our new methods with the commonly used pseudo-likelihood approach. Although we have not considered Bayesian inferences here, the method can be straightforwardly extended to find...
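
To see why the exact constant is out of reach on realistic lattices, a brute-force enumeration (not the article's method) works only for tiny grids, since the sum runs over 2^(number of sites) configurations:

    import itertools
    import numpy as np

    def autologistic_logZ(nrow, ncol, alpha, beta):
        # Exact log normalizing constant by enumerating every binary configuration
        idx = lambda r, c: r * ncol + c
        pairs = [(idx(r, c), idx(r, c + 1)) for r in range(nrow) for c in range(ncol - 1)]
        pairs += [(idx(r, c), idx(r + 1, c)) for r in range(nrow - 1) for c in range(ncol)]
        logterms = []
        for z in itertools.product((0, 1), repeat=nrow * ncol):
            u = alpha * sum(z) + beta * sum(z[i] * z[j] for i, j in pairs)
            logterms.append(u)
        return np.logaddexp.reduce(logterms)

    print(autologistic_logZ(3, 3, alpha=0.2, beta=0.5))   # 512 terms; a 20 x 20 lattice would need 2^400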

Journal ArticleDOI
TL;DR: Emacs Speaks Statistics (ESS) provides an intelligent and consistent interface between the user and statistics software; it understands the syntax for numerous data analysis languages, provides consistent display and editing features across packages, and assists in the interactive or batch execution of statements by statistics packages.
Abstract: Computer programming is an important component of statistical research and data analysis. It is a necessary skill for using sophisticated statistical packages and for writing custom scripts and software to perform data analysis using modern statistical methods. Emacs Speaks Statistics (ESS) provides an intelligent and consistent interface between the user and statistics software. ESS interfaces with SAS, S-Plus, R, and other statistics packages under the Unix, Microsoft Windows, and Apple Macintosh operating systems. ESS extends the Emacs text editor to streamline the use and creation of statistical software. ESS understands the syntax for numerous data analysis languages, provides consistent display and editing features across packages, and assists in the interactive or batch execution of statements by statistics packages. We describe in detail the features that ESS provides to increase efficiency.

Journal ArticleDOI
TL;DR: In this article, the authors presented methods to project a p-dimensional dataset with classified points from s known classes onto a lower dimensional hyperplane so that the classes appear optimally separated.
Abstract: This article discusses methods to project a p-dimensional dataset with classified points from s known classes onto a lower dimensional hyperplane so that the classes appear optimally separated. Such projections can be used, for example, for data visualization and classification in lower dimensions. New methods, which are asymmetric with respect to the numbering of the groups, are introduced for s = 2. They aim at generating data projections where one class is homogeneous and optimally separated from the other class, while the other class may be widespread. They are compared to classical discriminant coordinates and other symmetric methods from the literature by a simulation study, the application to a 12-dimensional dataset of 74,159 spectra of stellar objects, and to land snails distribution data. Neighborhood-based methods are also investigated, where local information about the separation of the classes is averaged. The use of robust MCD-covariance matrices is suggested.

Journal ArticleDOI
TL;DR: In this paper, a constrained continuous version of simulated annealing based on a Metropolis-Hastings dynamic is proposed to estimate both the space deformation and the isotropic correlation.
Abstract: During the past decade, a useful model for nonstationary random fields has been developed. This consists of reducing the random field of interest to isotropy via a bijective bi-continuous deformation of the index space. Then the problem consists of estimating this space deformation together with the isotropic correlation in the deformed index space. We propose to estimate both this space deformation and this isotropic correlation using a constrained continuous version of the simulated annealing for a Metropolis-Hastings dynamic. This method provides a nonparametric estimation of the deformation which has the required property to be bijective; so far, the previous nonparametric methods do not guarantee this property. We illustrate our work with two examples, one concerning a precipitation dataset. We also give one idea of how spatial prediction should proceed in the new coordinate space.