
Showing papers in "Statistics and Computing in 2000"


Journal ArticleDOI
TL;DR: Discusses how and why various modern computing concepts, such as object-orientation and run-time linking, feature in the software's design, and how the framework may be extended.
Abstract: WinBUGS is a fully extensible modular framework for constructing and analysing Bayesian full probability models. Models may be specified either textually via the BUGS language or pictorially using a graphical interface called DoodleBUGS. WinBUGS processes the model specification and constructs an object-oriented representation of the model. The software offers a user-interface, based on dialogue boxes and menu commands, through which the model may then be analysed using Markov chain Monte Carlo techniques. In this paper we discuss how and why various modern computing concepts, such as object-orientation and run-time linking, feature in the software's design. We also discuss how the framework may be extended. It is possible to write specific applications that form an apparently seamless interface with WinBUGS for users with specialized requirements. It is also possible to interface with WinBUGS at a lower level by incorporating new object types that may be used by WinBUGS without knowledge of the modules in which they are implemented. Neither of these types of extension requires access to, or even recompilation of, the WinBUGS source-code.

5,620 citations


Journal ArticleDOI
TL;DR: An overview of methods for sequential simulation from posterior distributions for discrete time dynamic models that are typically nonlinear and non-Gaussian, showing how to incorporate local linearisation methods similar to those previously employed in the deterministic filtering literature.
Abstract: In this article, we present an overview of methods for sequential simulation from posterior distributions. These methods are of particular interest in Bayesian filtering for discrete time dynamic models that are typically nonlinear and non-Gaussian. A general importance sampling framework is developed that unifies many of the methods which have been proposed over the last few decades in several different scientific disciplines. Novel extensions to the existing methods are also proposed. We show in particular how to incorporate local linearisation methods similar to those which have previously been employed in the deterministic filtering literature; these lead to very effective importance distributions. Furthermore, we describe a method which uses Rao-Blackwellisation in order to take advantage of the analytic structure present in some important classes of state-space models. In a final section we develop algorithms for prediction, smoothing and evaluation of the likelihood in dynamic models.

4,810 citations
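The framework surveyed above underlies the now-standard bootstrap (sampling importance resampling) particle filter. As a concrete illustration, here is a minimal Python sketch for the widely used benchmark nonlinear growth model; the model, noise variances and particle count are illustrative assumptions, and none of the paper's refinements (local linearisation proposals, Rao-Blackwellisation) are included.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_particle_filter(y, n_particles=500):
    """Plain bootstrap (SIR) filter for the benchmark nonlinear model
    x_t = 0.5 x_{t-1} + 25 x_{t-1}/(1 + x_{t-1}^2) + 8 cos(1.2 t) + v_t,
    y_t = x_t^2 / 20 + w_t, with v_t ~ N(0, 10) and w_t ~ N(0, 1)."""
    T = len(y)
    x = rng.normal(0.0, np.sqrt(10.0), size=n_particles)   # initial particle cloud
    means = np.empty(T)
    for t in range(T):
        # propagate particles through the state transition (the "prior" proposal)
        x = (0.5 * x + 25.0 * x / (1.0 + x**2) + 8.0 * np.cos(1.2 * t)
             + rng.normal(0.0, np.sqrt(10.0), size=n_particles))
        # weight each particle by the observation likelihood
        log_w = -0.5 * (y[t] - x**2 / 20.0) ** 2            # unit observation variance
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means[t] = np.sum(w * x)                             # filtering mean estimate
        # multinomial resampling
        x = x[rng.choice(n_particles, size=n_particles, p=w)]
    return means

# simulate data from the same model and run the filter
T = 50
x_true, y = np.empty(T), np.empty(T)
x = rng.normal(0.0, np.sqrt(10.0))
for t in range(T):
    x = 0.5 * x + 25.0 * x / (1.0 + x**2) + 8.0 * np.cos(1.2 * t) + rng.normal(0.0, np.sqrt(10.0))
    x_true[t], y[t] = x, x**2 / 20.0 + rng.normal()
print(bootstrap_particle_filter(y)[:5])
```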


Journal ArticleDOI
TL;DR: The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
Abstract: Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.

903 citations
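For readers who want the mechanics of the ECM updates, the sketch below fits a univariate mixture of t distributions with the degrees of freedom held fixed; this is a simplification of the paper's algorithm, which also estimates the degrees of freedom and handles multivariate data. The component count, fixed df and initialisation are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import t as t_dist

def t_mixture_ecm(x, k=2, nu=4.0, n_iter=200, seed=0):
    """Simplified ECM for a univariate mixture of t distributions with fixed df nu."""
    rng = np.random.default_rng(seed)
    pi = np.full(k, 1.0 / k)
    mu = rng.choice(x, size=k, replace=False)
    sigma = np.full(k, np.std(x))
    for _ in range(n_iter):
        # E-step: responsibilities tau and latent scale weights u
        dens = np.array([pi[j] * t_dist.pdf(x, df=nu, loc=mu[j], scale=sigma[j]) for j in range(k)])
        tau = dens / dens.sum(axis=0)                      # shape (k, n)
        delta = ((x - mu[:, None]) / sigma[:, None]) ** 2  # squared standardised distances
        u = (nu + 1.0) / (nu + delta)                      # downweights outlying points
        # CM-steps: mixing proportions, locations, scales
        pi = tau.mean(axis=1)
        mu = (tau * u * x).sum(axis=1) / (tau * u).sum(axis=1)
        sigma = np.sqrt((tau * u * (x - mu[:, None]) ** 2).sum(axis=1) / tau.sum(axis=1))
    return pi, mu, sigma

# toy data: two groups plus heavy-tailed background noise
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200), rng.standard_cauchy(20)])
print(t_mixture_ecm(x))
```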


Journal ArticleDOI
TL;DR: It is shown that an accurate variational transformation can be used to obtain a closed form approximation to the posterior distribution of the parameters thereby yielding an approximate posterior predictive model.
Abstract: We consider a logistic regression model with a Gaussian prior distribution over the parameters. We show that an accurate variational transformation can be used to obtain a closed form approximation to the posterior distribution of the parameters, thereby yielding an approximate posterior predictive model. This approach is readily extended to binary graphical models with complete observations. For graphical models with incomplete observations we utilize an additional variational transformation and again obtain a closed form approximation to the posterior. Finally, we show that the dual of the regression problem gives a latent variable density model, the variational formulation of which leads to exactly solvable EM updates.

632 citations
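The updates below follow the standard textbook form of the variational bound for the logistic function (one variational parameter ξ per observation, iterated together with the Gaussian posterior moments); the zero prior mean, prior variance, data and iteration count are illustrative assumptions rather than the paper's experiments.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lam(xi):
    """lambda(xi) = (sigmoid(xi) - 1/2) / (2 xi), with the xi -> 0 limit 1/8."""
    xi = np.asarray(xi, dtype=float)
    out = np.full_like(xi, 0.125)
    nz = np.abs(xi) > 1e-8
    out[nz] = (sigmoid(xi[nz]) - 0.5) / (2.0 * xi[nz])
    return out

def variational_logistic(X, y, prior_var=100.0, n_iter=50):
    """Gaussian approximation N(m, S) to the posterior over logistic weights
    (zero prior mean assumed), obtained by iterating the variational-bound updates."""
    n, d = X.shape
    S0_inv = np.eye(d) / prior_var
    xi = np.ones(n)                       # one variational parameter per data point
    for _ in range(n_iter):
        S_inv = S0_inv + 2.0 * (X.T * lam(xi)) @ X
        S = np.linalg.inv(S_inv)
        m = S @ (X.T @ (y - 0.5))
        xi = np.sqrt(np.einsum("ij,jk,ik->i", X, S + np.outer(m, m), X))
    return m, S

# illustrative data
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = (rng.random(200) < sigmoid(X @ np.array([-0.5, 2.0]))).astype(float)
m, S = variational_logistic(X, y)
print("posterior mean:", m)
```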


Journal ArticleDOI
TL;DR: The cross-validation approach, as well as penalized likelihood and McLachlan's bootstrap method, are applied to two data sets and the results from all three methods are in close agreement.
Abstract: Cross-validated likelihood is investigated as a tool for automatically determining the appropriate number of components (given the data) in finite mixture modeling, particularly in the context of model-based probabilistic clustering. The conceptual framework for the cross-validation approach to model selection is straightforward in the sense that models are judged directly on their estimated out-of-sample predictive performance. The cross-validation approach, as well as penalized likelihood and McLachlan's bootstrap method, are applied to two data sets and the results from all three methods are in close agreement. The second data set involves a well-known clustering problem from the atmospheric science literature using historical records of upper atmosphere geopotential height in the Northern hemisphere. Cross-validated likelihood provides an interpretable and objective solution to the atmospheric clustering problem. The clusters found are in agreement with prior analyses of the same data based on non-probabilistic clustering techniques.

318 citations
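A minimal sketch of the cross-validated likelihood criterion for choosing the number of mixture components, using scikit-learn's Gaussian mixture implementation on synthetic data; the fold count, candidate range and data are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold

def cv_loglik(X, n_components, n_splits=5, seed=0):
    """Average held-out log-likelihood per observation for a Gaussian mixture."""
    scores = []
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        gm = GaussianMixture(n_components=n_components, random_state=seed).fit(X[train])
        scores.append(gm.score(X[test]))          # mean log-likelihood on the held-out fold
    return np.mean(scores)

# toy data with three clusters; choose k maximising the cross-validated likelihood
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.7, size=(150, 2)) for c in (-4, 0, 4)])
scores = {k: cv_loglik(X, k) for k in range(1, 7)}
print(scores, "-> chosen k =", max(scores, key=scores.get))
```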


Journal ArticleDOI
TL;DR: Recent progress in developing statistical approaches to image estimation that can overcome limitations in direct modeling of the detector system or of the inherent statistical fluctuations in the data is reviewed.
Abstract: Positron emission tomography is a medical imaging modality for producing 3D images of the spatial distribution of biochemical tracers within the human body. The images are reconstructed from data formed through detection of radiation resulting from the emission of positrons from radioisotopes tagged onto the tracer of interest. These measurements are approximate line integrals from which the image can be reconstructed using analytical inversion formulae. However these direct methods do not allow accurate modeling either of the detector system or of the inherent statistical fluctuations in the data. Here we review recent progress in developing statistical approaches to image estimation that can overcome these limitations. We describe the various components of the physical model and review different formulations of the inverse problem. The wide range of numerical procedures for solving these problems are then reviewed. Finally, we describe recent work aimed at quantifying the quality of the resulting images, both in terms of classical measures of estimator bias and variance, and also using measures that are of more direct clinical relevance.

234 citations
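One of the standard statistical reconstruction algorithms covered by such reviews is ML-EM for the Poisson emission model. The sketch below shows the multiplicative ML-EM update on a tiny random system matrix; it illustrates the update formula only and is not a realistic PET geometry.

```python
import numpy as np

def mlem(A, y, n_iter=100):
    """ML-EM iterations for a Poisson emission model y ~ Poisson(A x)."""
    x = np.ones(A.shape[1])                 # non-negative initial image
    sens = A.sum(axis=0)                    # sensitivity image: sum_i a_ij
    for _ in range(n_iter):
        proj = A @ x                        # forward projection
        ratio = y / np.maximum(proj, 1e-12)
        x *= (A.T @ ratio) / np.maximum(sens, 1e-12)   # multiplicative EM update
    return x

# tiny synthetic example: random non-negative system matrix and true image
rng = np.random.default_rng(0)
A = rng.random((200, 30))
x_true = rng.gamma(2.0, 1.0, size=30)
y = rng.poisson(A @ x_true)
print(np.round(mlem(A, y)[:5], 2), np.round(x_true[:5], 2))
```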


Journal ArticleDOI
TL;DR: This work outlines how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob, uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components and estimation of the relative abundances of the components.
Abstract: Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. and Freeman P.R. 1987. J. Royal Statistical Society (Series B), 49: 240–252; Wallace C.S. and Dowe D.L. (1999). Computer Journal), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. 1986. In: Proceedings of the Nineteenth Australian Computer Science Conference (ACSC-9), Vol. 8, Monash University, Australia, pp. 357–366; Wallace C.S. and Dowe D.L. 1994b. In: Zhang C. et al. (Eds.), Proc. 7th Australian Joint Conf. on Artif. Intelligence. World Scientific, Singapore, pp. 37–44. See http://www.csse.monash.edu.au/-dld/Snob.html) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components and estimation of the relative abundances of the components. The message length is (to within a constant) minus the logarithm of the posterior probability (not a posterior density) of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated within each component, and permits multi-variate data from Gaussian, discrete multi-category (or multi-state or multinomial), Poisson and von Mises circular distributions, as well as missing data. Additionally, Snob can do fully-parameterised mixture modelling, estimating the latent class assignments in addition to estimating the number of components, the relative abundances of the components and the component parameters. We also report on extensions of Snob for data which has sequential or spatial correlations between observations, or correlations between attributes.

180 citations


Journal ArticleDOI
TL;DR: It is believed that the only approach that currently makes fully informative multilocus linkage analysis possible on large extended pedigrees is the novel application of blocked Gibbs sampling for Bayesian networks.
Abstract: The problem of multilocus linkage analysis is expressed as a graphical model, making explicit a previously implicit connection, and recent developments in the field are described in this context. A novel application of blocked Gibbs sampling for Bayesian networks is developed to generate inheritance matrices from an irreducible Markov chain. This is used as the basis for reconstruction of historical meiotic states and approximate calculation of the likelihood function for the location of an unmapped genetic trait. We believe this to be the only approach that currently makes fully informative multilocus linkage analysis possible on large extended pedigrees.

122 citations


Journal ArticleDOI
Dominique Jeulin1
TL;DR: This work considers the construction and properties of some basic random structure models (point processes, random sets and random function models) for the description and for the simulation of heterogeneous materials.
Abstract: We consider the construction and properties of some basic random structure models (point processes, random sets and random function models) for the description and for the simulation of heterogeneous materials. They can be specialized to three dimensional Euclidean space. Their implementation requires the use of image analysis tools.

94 citations
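As a concrete example of one of the basic random set models, the sketch below simulates a 2D Boolean model (Poisson germ points dilated by random discs) on a pixel grid; the intensity and radius distribution are illustrative assumptions.

```python
import numpy as np

def boolean_model(intensity=0.002, mean_radius=6.0, size=256, seed=0):
    """Simulate a 2D Boolean model: Poisson germ points, each dilated by a random disc."""
    rng = np.random.default_rng(seed)
    n_germs = rng.poisson(intensity * size * size)
    centres = rng.uniform(0, size, size=(n_germs, 2))
    radii = rng.exponential(mean_radius, size=n_germs)
    yy, xx = np.mgrid[0:size, 0:size]
    image = np.zeros((size, size), dtype=bool)
    for (cx, cy), r in zip(centres, radii):
        image |= (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2   # union of the grains
    return image

img = boolean_model()
print("area fraction:", img.mean())   # compare with 1 - exp(-intensity * E[pi R^2])
```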


Journal ArticleDOI
TL;DR: Results suggest that the Jansen Winding Stairs method provides better estimates of the Total Sensitivity Indices at small sample sizes.
Abstract: Sensitivity analysis aims to ascertain how each model input factor influences the variation in the model output. In performing global sensitivity analysis, we often encounter the problem of selecting the required number of runs in order to estimate the first order and/or the total indices accurately at a reasonable computational cost. The Winding Stairs sampling scheme (Jansen M.J.W., Rossing W.A.H., and Daamen R.A. 1994. In: Gasman J. and van Straten G. (Eds.), Predictability and Nonlinear Modelling in Natural Sciences and Economics. pp. 334–343.) is designed to provide an economic way to compute these indices. Its main advantage is the multiple use of model evaluations, which reduces the total number of model evaluations by more than half. The scheme is used in three simulation studies to compare its performance with the classic Sobol' LP_τ. Results suggest that the Jansen Winding Stairs method provides better estimates of the Total Sensitivity Indices at small sample sizes.

76 citations
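For orientation, the sketch below estimates first-order and total sensitivity indices with the standard Sobol'-style A/B/AB_i radial design (Saltelli first-order and Jansen total-index estimators) on the Ishigami test function; it illustrates the quantities being estimated rather than the Winding Stairs scheme itself, and the test function and sample size are assumptions.

```python
import numpy as np

def ishigami(X, a=7.0, b=0.1):
    return np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2 + b * X[:, 2] ** 4 * np.sin(X[:, 0])

def sobol_indices(func, d, n=10000, seed=0):
    """First-order (Saltelli) and total (Jansen) sensitivity index estimates
    from the usual A/B/AB_i radial design with (2 + d) * n model runs."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-np.pi, np.pi, size=(n, d))
    B = rng.uniform(-np.pi, np.pi, size=(n, d))
    fA, fB = func(A), func(B)
    var = np.var(np.concatenate([fA, fB]))
    S1, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                 # replace column i of A by column i of B
        fABi = func(ABi)
        S1[i] = np.mean(fB * (fABi - fA)) / var          # Saltelli first-order estimator
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var    # Jansen total-index estimator
    return S1, ST

S1, ST = sobol_indices(ishigami, d=3)
print("first order:", np.round(S1, 2), "total:", np.round(ST, 2))
```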


Journal ArticleDOI
TL;DR: A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model, which is shown to perform well in both theoretical and practical situations.
Abstract: A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model. The procedure admits bandwidth selection which is flexible in terms of the amount of smoothing required. In addition, the basic model can be extended to incorporate local smoothing of the density estimate. The method is shown to perform well in both theoretical and practical situations, and we compare our method with those of Abramson (The Annals of Statistics 10: 1217–1223) and Sain and Scott (Journal of the American Statistical Association 91: 1525–1534). In particular, we note that in certain cases, the Sain and Scott method performs poorly even with relatively large sample sizes. We compare various bandwidth selection methods using standard mean integrated square error criteria to assess the quality of the density estimates. We study situations where the underlying density is assumed both known and unknown, and note that in practice, our method performs well when sample sizes are small. In addition, we also apply the methods to real data, and again we believe our methods perform at least as well as existing methods.
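The paper derives variable bandwidths from a Bayesian graphical model; the sketch below only illustrates the underlying likelihood (leave-one-out) cross-validation criterion for a fixed-bandwidth Gaussian kernel estimator, with an illustrative data set and bandwidth grid.

```python
import numpy as np

def lcv_score(x, h):
    """Leave-one-out log-likelihood of a Gaussian kernel density estimate with bandwidth h."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h                    # pairwise scaled differences
    K = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(K, 0.0)                             # leave each point out of its own estimate
    f_loo = K.sum(axis=1) / ((n - 1) * h)
    return np.sum(np.log(np.maximum(f_loo, 1e-300)))

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 150), rng.normal(5, 0.5, 50)])
grid = np.linspace(0.05, 1.5, 60)
h_best = grid[np.argmax([lcv_score(x, h) for h in grid])]
print("LCV bandwidth:", round(h_best, 3))
```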

Journal ArticleDOI
TL;DR: MML coding considerations allow the derivation of useful results to guide the implementation of a mixture modelling program and allow model search to be controlled based on the minimum variance for a component and the amount of data required to distinguish two overlapping components.
Abstract: We use minimum message length (MML) estimation for mixture modelling. MML estimates are derived to choose the number of components in the mixture model to best describe the data and to estimate the parameters of the component densities for Gaussian mixture models. An empirical comparison of criteria prominent in the literature for estimating the number of components in a data set is performed. We have found that MML coding considerations allow the derivation of useful results to guide our implementation of a mixture modelling program. These advantages allow model search to be controlled based on the minimum variance for a component and the amount of data required to distinguish two overlapping components.

Journal ArticleDOI
TL;DR: The experimentation with several public domain classification datasets suggests that the evidence approach produces the most accurate predictions in the log-score sense, and the evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.
Abstract: In this paper we are interested in discrete prediction problems for a decision-theoretic setting, where the task is to compute the predictive distribution for a finite set of possible alternatives. This question is first addressed in a general Bayesian framework, where we consider a set of probability distributions defined by some parametric model class. Given a prior distribution on the model parameters and a set of sample data, one possible approach for determining a predictive distribution is to fix the parameters to the instantiation with the maximum a posteriori probability. A more accurate predictive distribution can be obtained by computing the evidence (marginal likelihood), i.e., the integral over all the individual parameter instantiations. As an alternative to these two approaches, we demonstrate how to use Rissanen's new definition of stochastic complexity for determining predictive distributions, and show how the evidence predictive distribution with Jeffreys' prior approaches the new stochastic complexity predictive distribution in the limit with increasing amount of sample data. To compare the alternative approaches in practice, each of the predictive distributions discussed is instantiated in the Bayesian network model family case. In particular, to determine Jeffreys' prior for this model family, we show how to compute the (expected) Fisher information matrix for a fixed but arbitrary Bayesian network structure. In the empirical part of the paper the predictive distributions are compared by using the simple tree-structured Naive Bayes model, which is used in the experiments for computational reasons. The experimentation with several public domain classification datasets suggests that the evidence approach produces the most accurate predictions in the log-score sense. The evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.
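To make the MAP-versus-evidence distinction concrete in the simplest possible case, the sketch below compares the MAP plug-in and the evidence (posterior-predictive) distributions for a single multinomial variable under a Jeffreys Dirichlet(1/2, ..., 1/2) prior, scored on held-out counts; the counts are invented for illustration and this is not the paper's Bayesian network experiment.

```python
import numpy as np

def predictive_distributions(counts, alpha=0.5):
    """MAP plug-in vs evidence (posterior-predictive) distributions for a multinomial
    with a symmetric Dirichlet(alpha, ..., alpha) prior; alpha = 0.5 is the Jeffreys prior."""
    counts = np.asarray(counts, dtype=float)
    n, k = counts.sum(), len(counts)
    # evidence / posterior predictive: (n_j + alpha) / (n + k * alpha)
    evidence = (counts + alpha) / (n + k * alpha)
    # MAP plug-in: mode of the Dirichlet posterior (requires every n_j + alpha > 1)
    map_plugin = (counts + alpha - 1.0) / (n + k * (alpha - 1.0))
    return map_plugin, evidence

def log_score(pred, test_counts):
    """Total log-probability the predictive assigns to a held-out sample of counts."""
    return float(np.sum(np.asarray(test_counts) * np.log(pred)))

train = [12, 5, 3, 1]          # training counts for a 4-valued variable (all non-zero)
test = [10, 6, 2, 2]
map_p, ev_p = predictive_distributions(train)
print("MAP log-score:     ", round(log_score(map_p, test), 3))
print("evidence log-score:", round(log_score(ev_p, test), 3))
```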

Journal ArticleDOI
TL;DR: A Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out cross-validation, and implementation either in a one-deep or a two-deep fashion.
Abstract: We describe a Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out cross-validation, and implementation either in a one-deep or a two-deep fashion. We assume an underlying linear model that is being fitted using either ridge regression or partial least squares, and vary a number of design factors such as sample size n relative to number of variables p, and error variance. The investigation encompasses both the non-singular (i.e. n > p) and the singular (i.e. n ≤ p) cases. The latter is now common in areas such as chemometrics but has as yet received little rigorous investigation. Results of the experiments enable us to reach some definite conclusions and to make some practical recommendations.
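A sketch of the two assessment styles using ridge regression in scikit-learn, reading "one-deep" as a single cross-validation loop used both to tune and to report performance and "two-deep" as nested cross-validation; that mapping, and the simulated singular-case data, are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)
n, p = 60, 100                               # singular case: fewer samples than variables
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = 2.0
y = X @ beta + rng.normal(scale=1.0, size=n)

alphas = {"alpha": np.logspace(-3, 3, 13)}
inner = KFold(5, shuffle=True, random_state=1)
outer = KFold(5, shuffle=True, random_state=2)

# one-deep: the same CV loop tunes alpha and reports its (optimistic) best score
search = GridSearchCV(Ridge(), alphas, cv=inner, scoring="neg_mean_squared_error").fit(X, y)
print("one-deep MSE:", -search.best_score_)

# two-deep (nested): an outer loop assesses the whole tuning procedure
nested = cross_val_score(GridSearchCV(Ridge(), alphas, cv=inner,
                                      scoring="neg_mean_squared_error"),
                         X, y, cv=outer, scoring="neg_mean_squared_error")
print("two-deep MSE:", -nested.mean())
```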

Journal ArticleDOI
TL;DR: It is shown experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than that of single models selected by either criterion, and that differences between models chosen by the two criteria can be substantial.
Abstract: Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than that of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial.


Journal ArticleDOI
TL;DR: An overview of popular statistical distributions used to model real, complex, and polarimetric SAR images and two distinct methods of SAR image analysis are focused on: Constant false alarm rate processing for target detection; and pixel classification using statistical models.
Abstract: A Synthetic Aperture Radar (SAR) is an imaging sensor capable of capturing high-resolution aerial images under a variety of imaging conditions. SAR images find application in remote sensing and military target detection and surveillance. Since SAR images exhibit considerable variations in signal strength, even when imaging similar features or objects belonging to the same class, probabilistic descriptions are useful for modeling SAR data. This paper includes an overview of popular statistical distributions used to model real, complex, and polarimetric SAR images. Specialized techniques are necessary for analyzing SAR images due to their unique characteristics when compared to aerial images produced by other sensors. We focus on two distinct methods of SAR image analysis in this paper: Constant false alarm rate processing for target detection; and pixel classification using statistical models. Previous work done in each of these areas is reviewed and compared. Some of the popular image analysis techniques are illustrated with experimental results from real high-resolution SAR data.
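As an illustration of the first of these two methods, here is a minimal one-dimensional cell-averaging CFAR detector for square-law (power) data with exponential clutter; window sizes, the design false-alarm rate and the synthetic data are assumptions.

```python
import numpy as np

def ca_cfar(power, n_train=16, n_guard=2, pfa=1e-3):
    """1D cell-averaging CFAR on square-law (power) samples.
    For exponentially distributed clutter the scale factor giving the design
    false-alarm rate is alpha = N * (pfa**(-1/N) - 1), N = number of training cells."""
    n = len(power)
    n_side = n_train // 2
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)
    detections = np.zeros(n, dtype=bool)
    for i in range(n_side + n_guard, n - n_side - n_guard):
        lead = power[i - n_guard - n_side : i - n_guard]       # training cells before the CUT
        lag = power[i + n_guard + 1 : i + n_guard + 1 + n_side] # training cells after the CUT
        noise = np.concatenate([lead, lag]).mean()              # local clutter estimate
        detections[i] = power[i] > alpha * noise
    return detections

rng = np.random.default_rng(0)
clutter = rng.exponential(1.0, size=500)
clutter[[100, 250, 400]] += 25.0                                # three bright "targets"
print(np.flatnonzero(ca_cfar(clutter)))
```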

Journal ArticleDOI
TL;DR: A new class of hierarchical priors is proposed which indicate a preference for smooth local mean structure, resulting in tree models which shrink predictions from adjacent terminal nodes towards each other.
Abstract: The Bayesian CART (classification and regression tree) approach proposed by Chipman, George and McCulloch (1998) entails putting a prior distribution on the set of all CART models and then using stochastic search to select a model. The main thrust of this paper is to propose a new class of hierarchical priors which enhance the potential of this Bayesian approach. These priors indicate a preference for smooth local mean structure, resulting in tree models which shrink predictions from adjacent terminal nodes towards each other. Past methods for tree shrinkage have searched for trees without shrinking, and applied shrinkage to the identified tree only after the search. By using hierarchical priors in the stochastic search, the proposed method searches for shrunk trees that fit well and improves the tree through shrinkage of predictions.

Journal ArticleDOI
TL;DR: It is shown how the classic morphological opening and closing filters lead to measures of size via granulometries, and the use of connected openings and thinnings will be demonstrated.
Abstract: In this paper we give an overview of both classical and more modern morphological techniques. We will demonstrate their utility through a range of practical examples. After discussing the fundamental morphological ideas, we show how the classic morphological opening and closing filters lead to measures of size via granulometries, and we will discuss briefly their implementation. We also present an overview of morphological segmentation techniques, and the use of connected openings and thinnings will be demonstrated. This then leads us into the more recent set-theoretic notions of graph based approaches to image analysis.
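A small sketch of a binary granulometry: the area removed by openings with discs of increasing radius gives a pattern spectrum whose peaks indicate characteristic object sizes. The test image and radii are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def disk(r):
    """Binary disc-shaped structuring element of radius r."""
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return x ** 2 + y ** 2 <= r ** 2

def granulometry(image, max_radius=10):
    """Pattern spectrum: area removed by openings with discs of increasing radius."""
    areas = [image.sum()]
    for r in range(1, max_radius + 1):
        areas.append(ndimage.binary_opening(image, structure=disk(r)).sum())
    return -np.diff(areas)        # mass at size r = area lost going from radius r-1 to r

# test image: union of random discs of two characteristic sizes
rng = np.random.default_rng(0)
img = np.zeros((256, 256), dtype=bool)
yy, xx = np.mgrid[0:256, 0:256]
for r in [3] * 30 + [8] * 10:
    cx, cy = rng.uniform(10, 246, size=2)
    img |= (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
print(granulometry(img))          # should show peaks near radii 3 and 8
```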

Journal ArticleDOI
TL;DR: From multiscale analysis and noise modeling, a comprehensive methodology for data analysis of 2D images, 1D signals, and point pattern data is developed, focusing on the pivotal issue of measurement noise in the physical sciences.
Abstract: We describe a range of powerful multiscale analysis methods. We also focus on the pivotal issue of measurement noise in the physical sciences. From multiscale analysis and noise modeling, we develop a comprehensive methodology for data analysis of 2D images, 1D signals (or spectra), and point pattern data. Noise modeling is based on the following: (i) multiscale transforms, including wavelet transforms; (ii) a data structure termed the multiresolution support; and (iii) multiple scale significance testing. The latter two aspects serve to characterize signal with respect to noise. The data analysis objectives we deal with include noise filtering and scale decomposition for visualization or feature detection.
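A simplified stand-in for the noise-modelling idea, assuming PyWavelets is available: detail coefficients below an estimated significance level are discarded before reconstruction. This is only a crude analogue of the multiresolution-support significance testing described above; the wavelet, threshold rule and test signal are assumptions.

```python
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=5, k=3.0):
    """Keep only detail coefficients significant at roughly k-sigma;
    a crude analogue of multiple-scale significance testing."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise scale from the finest details
    thresholded = [coeffs[0]] + [pywt.threshold(c, k * sigma, mode="hard") for c in coeffs[1:]]
    return pywt.waverec(thresholded, wavelet)[: len(signal)]

# noisy test signal: a few bumps plus Gaussian noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
clean = np.exp(-((t - 0.3) / 0.02) ** 2) + 0.7 * np.exp(-((t - 0.7) / 0.05) ** 2)
noisy = clean + 0.1 * rng.normal(size=t.size)
denoised = wavelet_denoise(noisy)
print("rms error before/after:", np.std(noisy - clean).round(3), np.std(denoised - clean).round(3))
```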

Journal ArticleDOI
TL;DR: A class of conditional models are proposed to deal with binary longitudinal responses, including unknown sources of heterogeneity in the regression parameters as well as serial dependence of Markovian form, estimated by means of an EM algorithm for nonparametric maximum likelihood.
Abstract: We extend the approach introduced by Aitkin and Alfo (1998, Statistics and Computing, 4, pp. 289–307) to the general framework of random coefficient models and propose a class of conditional models to deal with binary longitudinal responses, including unknown sources of heterogeneity in the regression parameters as well as serial dependence of Markovian form. Furthermore, we discuss the extension of the proposed approach to the analysis of informative drop-outs, which represent a central problem in longitudinal studies, and define, as suggested by Follmann and Wu (1995, Biometrics, 51, pp. 151–168), a conditional specification of the full shared parameter model for the primary response and the missingness indicator. The model is applied to a dataset from a methadone maintenance treatment programme held in Sydney in 1986 and previously analysed by Chan et al. (1998, Australian & New Zealand Journal of Statistics, 40, pp. 1–10). All of the proposed models are estimated by means of an EM algorithm for nonparametric maximum likelihood, without assuming any specific parametric distribution for the random coefficients and for the drop-out process. A small scale simulation work is described to explore the behaviour of the extended approach in a number of different situations where informative drop-outs are present.

Journal ArticleDOI
TL;DR: This paper presents both the theoretical aspects and the experimental results of a prototype recognition system based on COSMOS, the framework for representing and recognizing free-form objects.
Abstract: Three-dimensional object recognition entails a number of fundamental problems in computer vision: representation of a 3D object, identification of the object from its image, estimation of its position and orientation, and registration of multiple views of the object for automatic model construction. This paper surveys three of those topics, namely representation, matching, and pose estimation. It also presents an overview of the free-form surface matching problem, and describes COSMOS, our framework for representing and recognizing free-form objects. The COSMOS system recognizes arbitrarily curved 3D rigid objects from a single view using dense surface data. We present both the theoretical aspects and the experimental results of a prototype recognition system based on COSMOS.

Journal ArticleDOI
TL;DR: This paper proposes non-parametric smoothing procedures for both parts of the location model; the number of parameters to be estimated is dramatically reduced and the range of applicability thereby greatly increased.
Abstract: The location model is a familiar basis for discriminant analysis of mixtures of categorical and continuous variables. Its usual implementation involves second-order smoothing, using multivariate regression for the continuous variables and log-linear models for the categorical variables. In spite of the smoothing, these procedures still require many parameters to be estimated and this in turn restricts the categorical variables to a small number if implementation is to be feasible. In this paper we propose non-parametric smoothing procedures for both parts of the model. The number of parameters to be estimated is dramatically reduced and the range of applicability thereby greatly increased. The methods are illustrated on several data sets, and the performances are compared with a range of other popular discrimination techniques. The proposed method compares very favourably with all its competitors.

Journal ArticleDOI
TL;DR: This paper considers using the CFTP and related algorithms to create tours so as to combine the precision of exact sampling with the efficiency of using entire tours.
Abstract: Propp and Wilson (Random Structures and Algorithms (1996) 9: 223–252, Journal of Algorithms (1998) 27: 170–217) described a protocol called coupling from the past (CFTP) for exact sampling from the steady-state distribution of a Markov chain Monte Carlo (MCMC) process. In it a past time is identified from which the paths of coupled Markov chains starting at every possible state would have coalesced into a single value by the present time; this value is then a sample from the steady-state distribution. Unfortunately, producing an exact sample typically requires a large computational effort. We consider the question of how to make efficient use of the sample values that are generated. In particular, we make use of regeneration events (cf. Mykland et al. Journal of the American Statistical Association (1995) 90: 233–241) to aid in the analysis of MCMC runs. In a regeneration event, the chain is in a fixed reference distribution; this allows the chain to be broken up into a series of tours which are independent, or nearly so (though they do not represent draws from the true stationary distribution). In this paper we consider using the CFTP and related algorithms to create tours. In some cases their elements are exactly in the stationary distribution; their length may be fixed or random. This allows us to combine the precision of exact sampling with the efficiency of using entire tours. Several algorithms and estimators are proposed and analysed.
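For readers unfamiliar with CFTP itself, the sketch below is a minimal Propp-Wilson implementation for a small finite ergodic chain, coupling all states through one shared uniform per time step; the example chain is invented, and the tour-based estimators discussed in the paper are not reproduced.

```python
import numpy as np

def cftp_sample(P, rng, max_doublings=20):
    """Propp-Wilson coupling from the past for a finite ergodic chain with
    transition matrix P, coupling all states with one shared uniform per step."""
    n = P.shape[0]
    cum = np.cumsum(P, axis=1)
    uniforms = []                                 # uniforms[t-1] drives the step from time -t
    T = 1
    for _ in range(max_doublings):
        while len(uniforms) < T:
            uniforms.append(rng.random())         # reuse old randomness when extending the past
        states = np.arange(n)                     # start a copy of the chain in every state at time -T
        for t in range(T, 0, -1):
            u = uniforms[t - 1]
            states = np.array([int(np.searchsorted(cum[s], u)) for s in states])
        if np.all(states == states[0]):           # coalescence: the value at time 0 is an exact draw
            return int(states[0])
        T *= 2
    raise RuntimeError("chains did not coalesce; increase max_doublings")

# small ergodic chain with strictly positive entries (coalescence occurs quickly here)
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
draws = [cftp_sample(P, np.random.default_rng(seed)) for seed in range(5000)]
print(np.bincount(draws) / 5000)                  # compare with the stationary distribution of P
```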

Journal ArticleDOI
TL;DR: An algorithm based on interval analysis is presented to find the globally optimal value for the smoothing parameter, and a numerical example illustrates the performance of the algorithm.
Abstract: Generalized cross-validation is a method for choosing the smoothing parameter in smoothing splines and related regularization problems. This method requires the global minimization of the generalized cross-validation function. In this paper an algorithm based on interval analysis is presented to find the globally optimal value for the smoothing parameter, and a numerical example illustrates the performance of the algorithm.
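The sketch below evaluates the generalized cross-validation criterion for a ridge-type smoother and picks the smoothing parameter by a simple grid search; the basis, data and grid are illustrative, and the interval-analysis global optimization that is the subject of the paper is not implemented here.

```python
import numpy as np

def gcv(X, y, lam):
    """Generalized cross-validation score for the ridge smoother S = X (X'X + lam I)^{-1} X'."""
    n, p = X.shape
    S = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - S @ y
    return (np.sum(resid ** 2) / n) / (1.0 - np.trace(S) / n) ** 2

# smooth a noisy curve with a small polynomial basis; choose lam minimising GCV on a grid
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=t.size)
X = np.vander(t, 8, increasing=True)             # simple 8-term polynomial basis
grid = np.logspace(-6, 1, 40)
lam_best = grid[np.argmin([gcv(X, y, lam) for lam in grid])]
print("GCV-chosen lambda:", lam_best)
```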

Journal ArticleDOI
TL;DR: In this paper, Bayes linear separation is introduced as a second-order generalised conditional independence relation, and graphical models are constructed using this property, and interpretive and diagnostic shadings are given, which summarise the analysis over the associated moral graph.
Abstract: This paper concerns the geometric treatment of graphical models using Bayes linear methods. We introduce Bayes linear separation as a second order generalised conditional independence relation, and Bayes linear graphical models are constructed using this property. A system of interpretive and diagnostic shadings are given, which summarise the analysis over the associated moral graph. Principles of local computation are outlined for the graphical models, and an algorithm for implementing such computation over the junction tree is described. The approach is illustrated with two examples. The first concerns sales forecasting using a multivariate dynamic linear model. The second concerns inference for the error variance matrices of the model for sales, and illustrates the generality of our geometric approach by treating the matrices directly as random objects. The examples are implemented using a freely available set of object-oriented programming tools for Bayes linear local computation and graphical diagnostic display.
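The core Bayes linear operations are the adjusted expectation and adjusted variance; the toy second-order specification below is invented for illustration, and none of the graphical-model machinery (shadings, junction-tree propagation) is reproduced.

```python
import numpy as np

def bayes_linear_adjust(EB, ED, var_B, cov_BD, var_D, d):
    """Bayes linear adjusted expectation and variance of B given observed data d for D."""
    K = cov_BD @ np.linalg.inv(var_D)                # resolution ("regression") of B on D
    E_adj = EB + K @ (d - ED)                        # adjusted expectation
    Var_adj = var_B - K @ cov_BD.T                   # adjusted variance
    return E_adj, Var_adj

# toy second-order specification for two quantities B and two observables D
EB, ED = np.array([10.0, 5.0]), np.array([9.0, 6.0])
var_B = np.array([[4.0, 1.0], [1.0, 3.0]])
var_D = np.array([[2.0, 0.5], [0.5, 2.0]])
cov_BD = np.array([[1.5, 0.4], [0.3, 1.0]])
d_obs = np.array([11.0, 5.0])
E_adj, Var_adj = bayes_linear_adjust(EB, ED, var_B, cov_BD, var_D, d_obs)
print("adjusted expectation:", E_adj)
print("adjusted variance:\n", Var_adj)
```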

Journal ArticleDOI
TL;DR: Analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers are presented and a new weighted k-NN classifier is proposed based on resampling ideas.
Abstract: Euclidean distance k-nearest neighbor (k-NN) classifiers are simple nonparametric classification rules. Bootstrap methods, widely used for estimating the expected prediction error of classification rules, are motivated by the objective of calculating the ideal bootstrap estimate of expected prediction error. In practice, bootstrap methods use Monte Carlo resampling to estimate the ideal bootstrap estimate because exact calculation is generally intractable. In this article, we present analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers and propose a new weighted k-NN classifier based on resampling ideas. The resampling-weighted k-NN classifier replaces the k-NN posterior probability estimates by their expectations under resampling and predicts an unclassified covariate as belonging to the group with the largest resampling expectation. A simulation study and an application involving remotely sensed data show that the resampling-weighted k-NN classifier compares favorably to unweighted and distance-weighted k-NN classifiers.

Journal ArticleDOI
TL;DR: The permutation procedure to test for the equality of selected elements of a covariance or correlation matrix across groups involves either centring or standardising each variable within each group before randomly permuting observations between groups.
Abstract: This paper describes a permutation procedure to test for the equality of selected elements of a covariance or correlation matrix across groups. It involves either centring or standardising each variable within each group before randomly permuting observations between groups. Since the assumption of exchangeability of observations between groups does not strictly hold following such transformations, Monte Carlo simulations were used to compare expected and empirical rejection levels as a function of group size, the number of groups and distribution type (Normal, mixtures of Normals and Gamma with various values of the shape parameter). The Monte Carlo study showed that the estimated probability levels are close to those that would be obtained with an exact test except at very small sample sizes (5 or 10 observations per group). The test appears robust against non-normal data, different numbers of groups or variables per group and unequal sample sizes per group. Power was increased with increasing sample size, effect size and the number of elements in the matrix and power was decreased with increasingly unequal numbers of observations per group.
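A two-group sketch of the procedure for a single covariance element: variables are centred within each group, observations are permuted between groups, and the observed difference is compared with its permutation distribution. Group sizes, the tested element and the simulated data are illustrative assumptions.

```python
import numpy as np

def perm_test_cov_element(x1, x2, i=0, j=1, n_perm=5000, seed=0):
    """Permutation test for equality of covariance element (i, j) across two groups.
    Variables are centred within each group before permuting observations between groups."""
    rng = np.random.default_rng(seed)
    c1, c2 = x1 - x1.mean(axis=0), x2 - x2.mean(axis=0)     # within-group centring
    pooled = np.vstack([c1, c2])
    n1 = len(c1)
    def stat(a, b):
        return abs(np.cov(a, rowvar=False)[i, j] - np.cov(b, rowvar=False)[i, j])
    observed = stat(c1, c2)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        count += stat(pooled[perm[:n1]], pooled[perm[n1:]]) >= observed
    return (count + 1) / (n_perm + 1)                        # permutation p-value

rng = np.random.default_rng(1)
g1 = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=60)
g2 = rng.multivariate_normal([2, 1], [[1.0, 0.0], [0.0, 1.0]], size=60)
print("p-value:", perm_test_cov_element(g1, g2))
```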


Journal ArticleDOI
TL;DR: This paper examines the issues of random sampling of a discrete population and offers four new estimators, each with its own strengths and liabilities, and offers some comparative performances of the four with XBAR.
Abstract: Consider the random sampling of a discrete population. The observations, as they are collected one by one, are enhanced in that the probability mass associated with each observation is also observed. The goal is to estimate the population mean. Without this extra information about probability mass, the best general purpose estimator is the arithmetic average of the observations, XBAR. The issue is whether or not the extra information can be used to improve on XBAR. This paper examines the issues and offers four new estimators, each with its own strengths and liabilities. Some comparative performances of the four with XBAR are made. The motivating application is a Monte Carlo simulation that proceeds in two stages. The first stage independently samples n characteristics to obtain a “configuration” of some kind, together with a configuration probability p obtained, if desired, as a product of n individual probabilities. A relatively expensive calculation then determines an output X as a function of the configuration. A random sample of X could simply be averaged to estimate the mean output, but there are possibly more efficient estimators on account of the known configuration probabilities.