
Showing papers in "Statistics and Computing in 2000"


Journal ArticleDOI
TL;DR: Discusses how and why various modern computing concepts, such as object-orientation and run-time linking, feature in the software's design, and how the framework may be extended.
Abstract: WinBUGS is a fully extensible modular framework for constructing and analysing Bayesian full probability models. Models may be specified either textually via the BUGS language or pictorially using a graphical interface called DoodleBUGS. WinBUGS processes the model specification and constructs an object-oriented representation of the model. The software offers a user-interface, based on dialogue boxes and menu commands, through which the model may then be analysed using Markov chain Monte Carlo techniques. In this paper we discuss how and why various modern computing concepts, such as object-orientation and run-time linking, feature in the software's design. We also discuss how the framework may be extended. It is possible to write specific applications that form an apparently seamless interface with WinBUGS for users with specialized requirements. It is also possible to interface with WinBUGS at a lower level by incorporating new object types that may be used by WinBUGS without knowledge of the modules in which they are implemented. Neither of these types of extension requires access to, or even recompilation of, the WinBUGS source-code.

5,620 citations


Journal ArticleDOI
TL;DR: An overview of methods for sequential simulation from posterior distributions for discrete time dynamic models that are typically nonlinear and non-Gaussian, showing how to incorporate local linearisation methods similar to those previously employed in the deterministic filtering literature.
Abstract: In this article, we present an overview of methods for sequential simulation from posterior distributions. These methods are of particular interest in Bayesian filtering for discrete time dynamic models that are typically nonlinear and non-Gaussian. A general importance sampling framework is developed that unifies many of the methods which have been proposed over the last few decades in several different scientific disciplines. Novel extensions to the existing methods are also proposed. We show in particular how to incorporate local linearisation methods similar to those which have previously been employed in the deterministic filtering literature; these lead to very effective importance distributions. Furthermore, we describe a method which uses Rao-Blackwellisation in order to take advantage of the analytic structure present in some important classes of state-space models. In a final section we develop algorithms for prediction, smoothing and evaluation of the likelihood in dynamic models.

4,810 citations
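The framework surveyed above underlies the now-standard bootstrap (sampling importance resampling) particle filter. As a concrete illustration, here is a minimal Python sketch for the widely used benchmark nonlinear growth model; the model, noise variances and particle count are illustrative assumptions, and none of the paper's refinements (local linearisation proposals, Rao-Blackwellisation) are included.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_particle_filter(y, n_particles=500):
    """Plain bootstrap (SIR) filter for the benchmark nonlinear model
    x_t = 0.5 x_{t-1} + 25 x_{t-1}/(1 + x_{t-1}^2) + 8 cos(1.2 t) + v_t,
    y_t = x_t^2 / 20 + w_t, with v_t ~ N(0, 10) and w_t ~ N(0, 1)."""
    T = len(y)
    x = rng.normal(0.0, np.sqrt(10.0), size=n_particles)   # initial particle cloud
    means = np.empty(T)
    for t in range(T):
        # propagate particles through the state transition (the "prior" proposal)
        x = (0.5 * x + 25.0 * x / (1.0 + x**2) + 8.0 * np.cos(1.2 * t)
             + rng.normal(0.0, np.sqrt(10.0), size=n_particles))
        # weight each particle by the observation likelihood
        log_w = -0.5 * (y[t] - x**2 / 20.0) ** 2            # unit observation variance
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means[t] = np.sum(w * x)                             # filtering mean estimate
        # multinomial resampling
        x = x[rng.choice(n_particles, size=n_particles, p=w)]
    return means

# simulate data from the same model and run the filter
T = 50
x_true, y = np.empty(T), np.empty(T)
x = rng.normal(0.0, np.sqrt(10.0))
for t in range(T):
    x = 0.5 * x + 25.0 * x / (1.0 + x**2) + 8.0 * np.cos(1.2 * t) + rng.normal(0.0, np.sqrt(10.0))
    x_true[t], y[t] = x, x**2 / 20.0 + rng.normal()
print(bootstrap_particle_filter(y)[:5])
```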


Journal ArticleDOI
TL;DR: The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
Abstract: Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.

903 citations
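For readers who want the mechanics of the ECM updates, the sketch below fits a univariate mixture of t distributions with the degrees of freedom held fixed; this is a simplification of the paper's algorithm, which also estimates the degrees of freedom and handles multivariate data. The component count, fixed df and initialisation are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import t as t_dist

def t_mixture_ecm(x, k=2, nu=4.0, n_iter=200, seed=0):
    """Simplified ECM for a univariate mixture of t distributions with fixed df nu."""
    rng = np.random.default_rng(seed)
    pi = np.full(k, 1.0 / k)
    mu = rng.choice(x, size=k, replace=False)
    sigma = np.full(k, np.std(x))
    for _ in range(n_iter):
        # E-step: responsibilities tau and latent scale weights u
        dens = np.array([pi[j] * t_dist.pdf(x, df=nu, loc=mu[j], scale=sigma[j]) for j in range(k)])
        tau = dens / dens.sum(axis=0)                      # shape (k, n)
        delta = ((x - mu[:, None]) / sigma[:, None]) ** 2  # squared standardised distances
        u = (nu + 1.0) / (nu + delta)                      # downweights outlying points
        # CM-steps: mixing proportions, locations, scales
        pi = tau.mean(axis=1)
        mu = (tau * u * x).sum(axis=1) / (tau * u).sum(axis=1)
        sigma = np.sqrt((tau * u * (x - mu[:, None]) ** 2).sum(axis=1) / tau.sum(axis=1))
    return pi, mu, sigma

# toy data: two groups plus heavy-tailed background noise
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200), rng.standard_cauchy(20)])
print(t_mixture_ecm(x))
```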


Journal ArticleDOI
TL;DR: It is shown that an accurate variational transformation can be used to obtain a closed form approximation to the posterior distribution of the parameters thereby yielding an approximate posterior predictive model.
Abstract: We consider a logistic regression model with a Gaussian prior distribution over the parameters. We show that an accurate variational transformation can be used to obtain a closed form approximation to the posterior distribution of the parameters, thereby yielding an approximate posterior predictive model. This approach is readily extended to binary graphical models with complete observations. For graphical models with incomplete observations we utilize an additional variational transformation and again obtain a closed form approximation to the posterior. Finally, we show that the dual of the regression problem gives a latent variable density model, the variational formulation of which leads to exactly solvable EM updates.

632 citations
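The updates below follow the standard textbook form of the variational bound for the logistic function (one variational parameter ξ per observation, iterated together with the Gaussian posterior moments); the zero prior mean, prior variance, data and iteration count are illustrative assumptions rather than the paper's experiments.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lam(xi):
    """lambda(xi) = (sigmoid(xi) - 1/2) / (2 xi), with the xi -> 0 limit 1/8."""
    xi = np.asarray(xi, dtype=float)
    out = np.full_like(xi, 0.125)
    nz = np.abs(xi) > 1e-8
    out[nz] = (sigmoid(xi[nz]) - 0.5) / (2.0 * xi[nz])
    return out

def variational_logistic(X, y, prior_var=100.0, n_iter=50):
    """Gaussian approximation N(m, S) to the posterior over logistic weights
    (zero prior mean assumed), obtained by iterating the variational-bound updates."""
    n, d = X.shape
    S0_inv = np.eye(d) / prior_var
    xi = np.ones(n)                       # one variational parameter per data point
    for _ in range(n_iter):
        S_inv = S0_inv + 2.0 * (X.T * lam(xi)) @ X
        S = np.linalg.inv(S_inv)
        m = S @ (X.T @ (y - 0.5))
        xi = np.sqrt(np.einsum("ij,jk,ik->i", X, S + np.outer(m, m), X))
    return m, S

# illustrative data
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = (rng.random(200) < sigmoid(X @ np.array([-0.5, 2.0]))).astype(float)
m, S = variational_logistic(X, y)
print("posterior mean:", m)
```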


Journal ArticleDOI
TL;DR: The cross-validation approach, as well as penalized likelihood and McLachlan's bootstrap method, are applied to two data sets and the results from all three methods are in close agreement.
Abstract: Cross-validated likelihood is investigated as a tool for automatically determining the appropriate number of components (given the data) in finite mixture modeling, particularly in the context of model-based probabilistic clustering. The conceptual framework for the cross-validation approach to model selection is straightforward in the sense that models are judged directly on their estimated out-of-sample predictive performance. The cross-validation approach, as well as penalized likelihood and McLachlan's bootstrap method, are applied to two data sets and the results from all three methods are in close agreement. The second data set involves a well-known clustering problem from the atmospheric science literature using historical records of upper atmosphere geopotential height in the Northern hemisphere. Cross-validated likelihood provides an interpretable and objective solution to the atmospheric clustering problem. The clusters found are in agreement with prior analyses of the same data based on non-probabilistic clustering techniques.

318 citations
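A minimal sketch of the cross-validated likelihood criterion for choosing the number of mixture components, using scikit-learn's Gaussian mixture implementation on synthetic data; the fold count, candidate range and data are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold

def cv_loglik(X, n_components, n_splits=5, seed=0):
    """Average held-out log-likelihood per observation for a Gaussian mixture."""
    scores = []
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        gm = GaussianMixture(n_components=n_components, random_state=seed).fit(X[train])
        scores.append(gm.score(X[test]))          # mean log-likelihood on the held-out fold
    return np.mean(scores)

# toy data with three clusters; choose k maximising the cross-validated likelihood
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.7, size=(150, 2)) for c in (-4, 0, 4)])
scores = {k: cv_loglik(X, k) for k in range(1, 7)}
print(scores, "-> chosen k =", max(scores, key=scores.get))
```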


Journal ArticleDOI
TL;DR: Recent progress in developing statistical approaches to image estimation that can overcome limitations in direct modeling of the detector system or of the inherent statistical fluctuations in the data is reviewed.
Abstract: Positron emission tomography is a medical imaging modality for producing 3D images of the spatial distribution of biochemical tracers within the human body. The images are reconstructed from data formed through detection of radiation resulting from the emission of positrons from radioisotopes tagged onto the tracer of interest. These measurements are approximate line integrals from which the image can be reconstructed using analytical inversion formulae. However these direct methods do not allow accurate modeling either of the detector system or of the inherent statistical fluctuations in the data. Here we review recent progress in developing statistical approaches to image estimation that can overcome these limitations. We describe the various components of the physical model and review different formulations of the inverse problem. The wide range of numerical procedures for solving these problems are then reviewed. Finally, we describe recent work aimed at quantifying the quality of the resulting images, both in terms of classical measures of estimator bias and variance, and also using measures that are of more direct clinical relevance.

234 citations
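One of the standard statistical reconstruction algorithms covered by such reviews is ML-EM for the Poisson emission model. The sketch below shows the multiplicative ML-EM update on a tiny random system matrix; it illustrates the update formula only and is not a realistic PET geometry.

```python
import numpy as np

def mlem(A, y, n_iter=100):
    """ML-EM iterations for a Poisson emission model y ~ Poisson(A x)."""
    x = np.ones(A.shape[1])                 # non-negative initial image
    sens = A.sum(axis=0)                    # sensitivity image: sum_i a_ij
    for _ in range(n_iter):
        proj = A @ x                        # forward projection
        ratio = y / np.maximum(proj, 1e-12)
        x *= (A.T @ ratio) / np.maximum(sens, 1e-12)   # multiplicative EM update
    return x

# tiny synthetic example: random non-negative system matrix and true image
rng = np.random.default_rng(0)
A = rng.random((200, 30))
x_true = rng.gamma(2.0, 1.0, size=30)
y = rng.poisson(A @ x_true)
print(np.round(mlem(A, y)[:5], 2), np.round(x_true[:5], 2))
```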


Journal ArticleDOI
TL;DR: This work outlines how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob, uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components and estimation of the relative abundances of the components.
Abstract: Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. and Freeman P.R. 1987. J. Royal Statistical Society (Series B), 49: 240–252; Wallace C.S. and Dowe D.L. (1999). Computer Journal), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace C.S. and Boulton D.M. 1968. Computer Journal, 11: 185–194; Wallace C.S. 1986. In: Proceedings of the Nineteenth Australian Computer Science Conference (ACSC-9), Vol. 8, Monash University, Australia, pp. 357–366; Wallace C.S. and Dowe D.L. 1994b. In: Zhang C. et al. (Eds.), Proc. 7th Australian Joint Conf. on Artif. Intelligence. World Scientific, Singapore, pp. 37–44. See http://www.csse.monash.edu.au/-dld/Snob.html) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components and estimation of the relative abundances of the components. The message length is (to within a constant) minus the logarithm of the posterior probability (not a posterior density) of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated within each component, and permits multi-variate data from Gaussian, discrete multi-category (or multi-state or multinomial), Poisson and von Mises circular distributions, as well as missing data. Additionally, Snob can do fully-parameterised mixture modelling, estimating the latent class assignments in addition to estimating the number of components, the relative abundances of the components and the component parameters. We also report on extensions of Snob for data which has sequential or spatial correlations between observations, or correlations between attributes.

180 citations


Journal ArticleDOI
TL;DR: It is believed that the only approach that currently makes fully informative multilocus linkage analysis possible on large extended pedigrees is the novel application of blocked Gibbs sampling for Bayesian networks.
Abstract: The problem of multilocus linkage analysis is expressed as a graphical model, making explicit a previously implicit connection, and recent developments in the field are described in this context. A novel application of blocked Gibbs sampling for Bayesian networks is developed to generate inheritance matrices from an irreducible Markov chain. This is used as the basis for reconstruction of historical meiotic states and approximate calculation of the likelihood function for the location of an unmapped genetic trait. We believe this to be the only approach that currently makes fully informative multilocus linkage analysis possible on large extended pedigrees.

122 citations


Journal ArticleDOI
Dominique Jeulin1
TL;DR: This work considers the construction and properties of some basic random structure models (point processes, random sets and random function models) for the description and for the simulation of heterogeneous materials.
Abstract: We consider the construction and properties of some basic random structure models (point processes, random sets and random function models) for the description and for the simulation of heterogeneous materials. They can be specialized to three dimensional Euclidean space. Their implementation requires the use of image analysis tools.

94 citations
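As a concrete example of one of the basic random set models, the sketch below simulates a 2D Boolean model (Poisson germ points dilated by random discs) on a pixel grid; the intensity and radius distribution are illustrative assumptions.

```python
import numpy as np

def boolean_model(intensity=0.002, mean_radius=6.0, size=256, seed=0):
    """Simulate a 2D Boolean model: Poisson germ points, each dilated by a random disc."""
    rng = np.random.default_rng(seed)
    n_germs = rng.poisson(intensity * size * size)
    centres = rng.uniform(0, size, size=(n_germs, 2))
    radii = rng.exponential(mean_radius, size=n_germs)
    yy, xx = np.mgrid[0:size, 0:size]
    image = np.zeros((size, size), dtype=bool)
    for (cx, cy), r in zip(centres, radii):
        image |= (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2   # union of the grains
    return image

img = boolean_model()
print("area fraction:", img.mean())   # compare with 1 - exp(-intensity * E[pi R^2])
```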


Journal ArticleDOI
TL;DR: Results suggest that the Jansen Winding Stairs method provides better estimates of the Total Sensitivity Indices at small sample sizes.
Abstract: Sensitivity analysis aims to ascertain how each model input factor influences the variation in the model output. In performing global sensitivity analysis, we often encounter the problem of selecting the required number of runs in order to estimate the first order and/or the total indices accurately at a reasonable computational cost. The Winding Stairs sampling scheme (Jansen M.J.W., Rossing W.A.H., and Daamen R.A. 1994. In: Gasman J. and van Straten G. (Eds.), Predictability and Nonlinear Modelling in Natural Sciences and Economics. pp. 334–343.) is designed to provide an economic way to compute these indices. Its main advantage is the multiple use of model evaluations, which reduces the total number of model evaluations by more than half. The scheme is used in three simulation studies to compare its performance with the classic Sobol' LP_τ. Results suggest that the Jansen Winding Stairs method provides better estimates of the Total Sensitivity Indices at small sample sizes.

76 citations
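For orientation, the sketch below estimates first-order and total sensitivity indices with the standard Sobol'-style A/B/AB_i radial design (Saltelli first-order and Jansen total-index estimators) on the Ishigami test function; it illustrates the quantities being estimated rather than the Winding Stairs scheme itself, and the test function and sample size are assumptions.

```python
import numpy as np

def ishigami(X, a=7.0, b=0.1):
    return np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2 + b * X[:, 2] ** 4 * np.sin(X[:, 0])

def sobol_indices(func, d, n=10000, seed=0):
    """First-order (Saltelli) and total (Jansen) sensitivity index estimates
    from the usual A/B/AB_i radial design with (2 + d) * n model runs."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-np.pi, np.pi, size=(n, d))
    B = rng.uniform(-np.pi, np.pi, size=(n, d))
    fA, fB = func(A), func(B)
    var = np.var(np.concatenate([fA, fB]))
    S1, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                 # replace column i of A by column i of B
        fABi = func(ABi)
        S1[i] = np.mean(fB * (fABi - fA)) / var          # Saltelli first-order estimator
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var    # Jansen total-index estimator
    return S1, ST

S1, ST = sobol_indices(ishigami, d=3)
print("first order:", np.round(S1, 2), "total:", np.round(ST, 2))
```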


Journal ArticleDOI
TL;DR: A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model, which is shown to perform well in both theoretical and practical situations.
Abstract: A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model. The procedure admits bandwidth selection which is flexible in terms of the amount of smoothing required. In addition, the basic model can be extended to incorporate local smoothing of the density estimate. The method is shown to perform well in both theoretical and practical situations, and we compare our method with those of Abramson (The Annals of Statistics 10: 1217–1223) and Sain and Scott (Journal of the American Statistical Association 91: 1525–1534). In particular, we note that in certain cases, the Sain and Scott method performs poorly even with relatively large sample sizes. We compare various bandwidth selection methods using standard mean integrated square error criteria to assess the quality of the density estimates. We study situations where the underlying density is assumed both known and unknown, and note that in practice, our method performs well when sample sizes are small. In addition, we also apply the methods to real data, and again we believe our methods perform at least as well as existing methods.
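The paper derives variable bandwidths from a Bayesian graphical model; the sketch below only illustrates the underlying likelihood (leave-one-out) cross-validation criterion for a fixed-bandwidth Gaussian kernel estimator, with an illustrative data set and bandwidth grid.

```python
import numpy as np

def lcv_score(x, h):
    """Leave-one-out log-likelihood of a Gaussian kernel density estimate with bandwidth h."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h                    # pairwise scaled differences
    K = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(K, 0.0)                             # leave each point out of its own estimate
    f_loo = K.sum(axis=1) / ((n - 1) * h)
    return np.sum(np.log(np.maximum(f_loo, 1e-300)))

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 150), rng.normal(5, 0.5, 50)])
grid = np.linspace(0.05, 1.5, 60)
h_best = grid[np.argmax([lcv_score(x, h) for h in grid])]
print("LCV bandwidth:", round(h_best, 3))
```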

Journal ArticleDOI
TL;DR: MML coding considerations allow the derivation of useful results to guide the implementation of a mixture modelling program and allow model search to be controlled based on the minimum variance for a component and the amount of data required to distinguish two overlapping components.
Abstract: We use minimum message length (MML) estimation for mixture modelling. MML estimates are derived to choose the number of components in the mixture model to best describe the data and to estimate the parameters of the component densities for Gaussian mixture models. An empirical comparison of criteria prominent in the literature for estimating the number of components in a data set is performed. We have found that MML coding considerations allow the derivation of useful results to guide our implementation of a mixture modelling program. These advantages allow model search to be controlled based on the minimum variance for a component and the amount of data required to distinguish two overlapping components.

Journal ArticleDOI
TL;DR: The experimentation with several public domain classification datasets suggests that the evidence approach produces the most accurate predictions in the log-score sense, and the evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.
Abstract: In this paper we are interested in discrete prediction problems for a decision-theoretic setting, where the task is to compute the predictive distribution for a finite set of possible alternatives. This question is first addressed in a general Bayesian framework, where we consider a set of probability distributions defined by some parametric model class. Given a prior distribution on the model parameters and a set of sample data, one possible approach for determining a predictive distribution is to fix the parameters to the instantiation with the maximum a posteriori probability. A more accurate predictive distribution can be obtained by computing the evidence (marginal likelihood), i.e., the integral over all the individual parameter instantiations. As an alternative to these two approaches, we demonstrate how to use Rissanen's new definition of stochastic complexity for determining predictive distributions, and show how the evidence predictive distribution with Jeffreys' prior approaches the new stochastic complexity predictive distribution in the limit with increasing amount of sample data. To compare the alternative approaches in practice, each of the predictive distributions discussed is instantiated in the Bayesian network model family case. In particular, to determine Jeffreys' prior for this model family, we show how to compute the (expected) Fisher information matrix for a fixed but arbitrary Bayesian network structure. In the empirical part of the paper the predictive distributions are compared by using the simple tree-structured Naive Bayes model, which is used in the experiments for computational reasons. The experimentation with several public domain classification datasets suggests that the evidence approach produces the most accurate predictions in the log-score sense. The evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.
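To make the MAP-versus-evidence distinction concrete in the simplest possible case, the sketch below compares the MAP plug-in and the evidence (posterior-predictive) distributions for a single multinomial variable under a Jeffreys Dirichlet(1/2, ..., 1/2) prior, scored on held-out counts; the counts are invented for illustration and this is not the paper's Bayesian network experiment.

```python
import numpy as np

def predictive_distributions(counts, alpha=0.5):
    """MAP plug-in vs evidence (posterior-predictive) distributions for a multinomial
    with a symmetric Dirichlet(alpha, ..., alpha) prior; alpha = 0.5 is the Jeffreys prior."""
    counts = np.asarray(counts, dtype=float)
    n, k = counts.sum(), len(counts)
    # evidence / posterior predictive: (n_j + alpha) / (n + k * alpha)
    evidence = (counts + alpha) / (n + k * alpha)
    # MAP plug-in: mode of the Dirichlet posterior (requires every n_j + alpha > 1)
    map_plugin = (counts + alpha - 1.0) / (n + k * (alpha - 1.0))
    return map_plugin, evidence

def log_score(pred, test_counts):
    """Total log-probability the predictive assigns to a held-out sample of counts."""
    return float(np.sum(np.asarray(test_counts) * np.log(pred)))

train = [12, 5, 3, 1]          # training counts for a 4-valued variable (all non-zero)
test = [10, 6, 2, 2]
map_p, ev_p = predictive_distributions(train)
print("MAP log-score:     ", round(log_score(map_p, test), 3))
print("evidence log-score:", round(log_score(ev_p, test), 3))
```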

Journal ArticleDOI
TL;DR: A Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out cross-validation, and implementation either in a one-deep or a two-deep fashion.
Abstract: We describe a Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out cross-validation, and implementation either in a one-deep or a two-deep fashion. We assume an underlying linear model that is being fitted using either ridge regression or partial least squares, and vary a number of design factors such as sample size n relative to number of variables p, and error variance. The investigation encompasses both the non-singular (i.e. n > p) and the singular (i.e. n ≤ p) cases. The latter is now common in areas such as chemometrics but has as yet received little rigorous investigation. Results of the experiments enable us to reach some definite conclusions and to make some practical recommendations.
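A sketch of the two assessment styles using ridge regression in scikit-learn, reading "one-deep" as a single cross-validation loop used both to tune and to report performance and "two-deep" as nested cross-validation; that mapping, and the simulated singular-case data, are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)
n, p = 60, 100                               # singular case: fewer samples than variables
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = 2.0
y = X @ beta + rng.normal(scale=1.0, size=n)

alphas = {"alpha": np.logspace(-3, 3, 13)}
inner = KFold(5, shuffle=True, random_state=1)
outer = KFold(5, shuffle=True, random_state=2)

# one-deep: the same CV loop tunes alpha and reports its (optimistic) best score
search = GridSearchCV(Ridge(), alphas, cv=inner, scoring="neg_mean_squared_error").fit(X, y)
print("one-deep MSE:", -search.best_score_)

# two-deep (nested): an outer loop assesses the whole tuning procedure
nested = cross_val_score(GridSearchCV(Ridge(), alphas, cv=inner,
                                      scoring="neg_mean_squared_error"),
                         X, y, cv=outer, scoring="neg_mean_squared_error")
print("two-deep MSE:", -nested.mean())
```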

Journal ArticleDOI
TL;DR: It is shown experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than that of single models selected by either criterion, and that differences between models chosen by the two criteria can be substantial.
Abstract: Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than that of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial.


Journal ArticleDOI
TL;DR: An overview of popular statistical distributions used to model real, complex, and polarimetric SAR images and two distinct methods of SAR image analysis are focused on: Constant false alarm rate processing for target detection; and pixel classification using statistical models.
Abstract: A Synthetic Aperture Radar (SAR) is an imaging sensor capable of capturing high-resolution aerial images under a variety of imaging conditions. SAR images find application in remote sensing and military target detection and surveillance. Since SAR images exhibit considerable variations in signal strength, even when imaging similar features or objects belonging to the same class, probabilistic descriptions are useful for modeling SAR data. This paper includes an overview of popular statistical distributions used to model real, complex, and polarimetric SAR images. Specialized techniques are necessary for analyzing SAR images due to their unique characteristics when compared to aerial images produced by other sensors. We focus on two distinct methods of SAR image analysis in this paper: Constant false alarm rate processing for target detection; and pixel classification using statistical models. Previous work done in each of these areas is reviewed and compared. Some of the popular image analysis techniques are illustrated with experimental results from real high-resolution SAR data.
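As an illustration of the first of these two methods, here is a minimal one-dimensional cell-averaging CFAR detector for square-law (power) data with exponential clutter; window sizes, the design false-alarm rate and the synthetic data are assumptions.

```python
import numpy as np

def ca_cfar(power, n_train=16, n_guard=2, pfa=1e-3):
    """1D cell-averaging CFAR on square-law (power) samples.
    For exponentially distributed clutter the scale factor giving the design
    false-alarm rate is alpha = N * (pfa**(-1/N) - 1), N = number of training cells."""
    n = len(power)
    n_side = n_train // 2
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)
    detections = np.zeros(n, dtype=bool)
    for i in range(n_side + n_guard, n - n_side - n_guard):
        lead = power[i - n_guard - n_side : i - n_guard]       # training cells before the CUT
        lag = power[i + n_guard + 1 : i + n_guard + 1 + n_side] # training cells after the CUT
        noise = np.concatenate([lead, lag]).mean()              # local clutter estimate
        detections[i] = power[i] > alpha * noise
    return detections

rng = np.random.default_rng(0)
clutter = rng.exponential(1.0, size=500)
clutter[[100, 250, 400]] += 25.0                                # three bright "targets"
print(np.flatnonzero(ca_cfar(clutter)))
```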

Journal ArticleDOI
TL;DR: A new class of hierarchical priors is proposed which indicate a preference for smooth local mean structure, resulting in tree models which shrink predictions from adjacent terminal nodes towards each other.
Abstract: The Bayesian CART (classification and regression tree) approach proposed by Chipman, George and McCulloch (1998) entails putting a prior distribution on the set of all CART models and then using stochastic search to select a model. The main thrust of this paper is to propose a new class of hierarchical priors which enhance the potential of this Bayesian approach. These priors indicate a preference for smooth local mean structure, resulting in tree models which shrink predictions from adjacent terminal nodes towards each other. Past methods for tree shrinkage have searched for trees without shrinking, and applied shrinkage to the identified tree only after the search. By using hierarchical priors in the stochastic search, the proposed method searches for shrunk trees that fit well and improves the tree through shrinkage of predictions.

Journal ArticleDOI
TL;DR: It is shown how the classic morphological opening and closing filters lead to measures of size via granulometries, and the use of connected openings and thinnings will be demonstrated.
Abstract: In this paper we give an overview of both classical and more modern morphological techniques. We will demonstrate their utility through a range of practical examples. After discussing the fundamental morphological ideas, we show how the classic morphological opening and closing filters lead to measures of size via granulometries, and we will discuss briefly their implementation. We also present an overview of morphological segmentation techniques, and the use of connected openings and thinnings will be demonstrated. This then leads us into the more recent set-theoretic notions of graph based approaches to image analysis.
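A small sketch of a binary granulometry: the area removed by openings with discs of increasing radius gives a pattern spectrum whose peaks indicate characteristic object sizes. The test image and radii are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def disk(r):
    """Binary disc-shaped structuring element of radius r."""
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return x ** 2 + y ** 2 <= r ** 2

def granulometry(image, max_radius=10):
    """Pattern spectrum: area removed by openings with discs of increasing radius."""
    areas = [image.sum()]
    for r in range(1, max_radius + 1):
        areas.append(ndimage.binary_opening(image, structure=disk(r)).sum())
    return -np.diff(areas)        # mass at size r = area lost going from radius r-1 to r

# test image: union of random discs of two characteristic sizes
rng = np.random.default_rng(0)
img = np.zeros((256, 256), dtype=bool)
yy, xx = np.mgrid[0:256, 0:256]
for r in [3] * 30 + [8] * 10:
    cx, cy = rng.uniform(10, 246, size=2)
    img |= (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
print(granulometry(img))          # should show peaks near radii 3 and 8
```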

Journal ArticleDOI
TL;DR: From multiscale analysis and noise modeling, a comprehensive methodology for data analysis of 2D images, 1D signals, and point pattern data is developed, focusing on the pivotal issue of measurement noise in the physical sciences.
Abstract: We describe a range of powerful multiscale analysis methods. We also focus on the pivotal issue of measurement noise in the physical sciences. From multiscale analysis and noise modeling, we develop a comprehensive methodology for data analysis of 2D images, 1D signals (or spectra), and point pattern data. Noise modeling is based on the following: (i) multiscale transforms, including wavelet transforms; (ii) a data structure termed the multiresolution support; and (iii) multiple scale significance testing. The latter two aspects serve to characterize signal with respect to noise. The data analysis objectives we deal with include noise filtering and scale decomposition for visualization or feature detection.
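A simplified stand-in for the noise-modelling idea, assuming PyWavelets is available: detail coefficients below an estimated significance level are discarded before reconstruction. This is only a crude analogue of the multiresolution-support significance testing described above; the wavelet, threshold rule and test signal are assumptions.

```python
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=5, k=3.0):
    """Keep only detail coefficients significant at roughly k-sigma;
    a crude analogue of multiple-scale significance testing."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise scale from the finest details
    thresholded = [coeffs[0]] + [pywt.threshold(c, k * sigma, mode="hard") for c in coeffs[1:]]
    return pywt.waverec(thresholded, wavelet)[: len(signal)]

# noisy test signal: a few bumps plus Gaussian noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
clean = np.exp(-((t - 0.3) / 0.02) ** 2) + 0.7 * np.exp(-((t - 0.7) / 0.05) ** 2)
noisy = clean + 0.1 * rng.normal(size=t.size)
denoised = wavelet_denoise(noisy)
print("rms error before/after:", np.std(noisy - clean).round(3), np.std(denoised - clean).round(3))
```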

Journal ArticleDOI
TL;DR: A class of conditional models are proposed to deal with binary longitudinal responses, including unknown sources of heterogeneity in the regression parameters as well as serial dependence of Markovian form, estimated by means of an EM algorithm for nonparametric maximum likelihood.
Abstract: We extend the approach introduced by Aitkin and Alfo (1998, Statistics and Computing, 4, pp. 289–307) to the general framework of random coefficient models and propose a class of conditional models to deal with binary longitudinal responses, including unknown sources of heterogeneity in the regression parameters as well as serial dependence of Markovian form. Furthermore, we discuss the extension of the proposed approach to the analysis of informative drop-outs, which represent a central problem in longitudinal studies, and define, as suggested by Follmann and Wu (1995, Biometrics, 51, pp. 151–168), a conditional specification of the full shared parameter model for the primary response and the missingness indicator. The model is applied to a dataset from a methadone maintenance treatment programme held in Sydney in 1986 and previously analysed by Chan et al. (1998, Australian & New Zealand Journal of Statistics, 40, pp. 1–10). All of the proposed models are estimated by means of an EM algorithm for nonparametric maximum likelihood, without assuming any specific parametric distribution for the random coefficients and for the drop-out process. A small scale simulation work is described to explore the behaviour of the extended approach in a number of different situations where informative drop-outs are present.

Journal ArticleDOI
TL;DR: This paper presents both the theoretical aspects and the experimental results of a prototype recognition system based on COSMOS, the framework for representing and recognizing free-form objects.
Abstract: Three-dimensional object recognition entails a number of fundamental problems in computer vision: representation of a 3D object, identification of the object from its image, estimation of its position and orientation, and registration of multiple views of the object for automatic model construction. This paper surveys three of those topics, namely representation, matching, and pose estimation. It also presents an overview of the free-form surface matching problem, and describes COSMOS, our framework for representing and recognizing free-form objects. The COSMOS system recognizes arbitrarily curved 3D rigid objects from a single view using dense surface data. We present both the theoretical aspects and the experimental results of a prototype recognition system based on COSMOS.

Journal ArticleDOI
TL;DR: This paper proposes non-parametric smoothing procedures for both parts of the location model; the number of parameters to be estimated is dramatically reduced and the range of applicability thereby greatly increased.
Abstract: The location model is a familiar basis for discriminant analysis of mixtures of categorical and continuous variables. Its usual implementation involves second-order smoothing, using multivariate regression for the continuous variables and log-linear models for the categorical variables. In spite of the smoothing, these procedures still require many parameters to be estimated and this in turn restricts the categorical variables to a small number if implementation is to be feasible. In this paper we propose non-parametric smoothing procedures for both parts of the model. The number of parameters to be estimated is dramatically reduced and the range of applicability thereby greatly increased. The methods are illustrated on several data sets, and the performances are compared with a range of other popular discrimination techniques. The proposed method compares very favourably with all its competitors.

Journal ArticleDOI
TL;DR: This paper considers using the CFTP and related algorithms to create tours so as to combine the precision of exact sampling with the efficiency of using entire tours.
Abstract: Propp and Wilson (Random Structures and Algorithms (1996) 9: 223–252, Journal of Algorithms (1998) 27: 170–217) described a protocol called coupling from the past (CFTP) for exact sampling from the steady-state distribution of a Markov chain Monte Carlo (MCMC) process. In it a past time is identified from which the paths of coupled Markov chains starting at every possible state would have coalesced into a single value by the present time; this value is then a sample from the steady-state distribution. Unfortunately, producing an exact sample typically requires a large computational effort. We consider the question of how to make efficient use of the sample values that are generated. In particular, we make use of regeneration events (cf. Mykland et al. Journal of the American Statistical Association (1995) 90: 233–241) to aid in the analysis of MCMC runs. In a regeneration event, the chain is in a fixed reference distribution; this allows the chain to be broken up into a series of tours which are independent, or nearly so (though they do not represent draws from the true stationary distribution). In this paper we consider using the CFTP and related algorithms to create tours. In some cases their elements are exactly in the stationary distribution; their length may be fixed or random. This allows us to combine the precision of exact sampling with the efficiency of using entire tours. Several algorithms and estimators are proposed and analysed.
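For readers unfamiliar with CFTP itself, the sketch below is a minimal Propp-Wilson implementation for a small finite ergodic chain, coupling all states through one shared uniform per time step; the example chain is invented, and the tour-based estimators discussed in the paper are not reproduced.

```python
import numpy as np

def cftp_sample(P, rng, max_doublings=20):
    """Propp-Wilson coupling from the past for a finite ergodic chain with
    transition matrix P, coupling all states with one shared uniform per step."""
    n = P.shape[0]
    cum = np.cumsum(P, axis=1)
    uniforms = []                                 # uniforms[t-1] drives the step from time -t
    T = 1
    for _ in range(max_doublings):
        while len(uniforms) < T:
            uniforms.append(rng.random())         # reuse old randomness when extending the past
        states = np.arange(n)                     # start a copy of the chain in every state at time -T
        for t in range(T, 0, -1):
            u = uniforms[t - 1]
            states = np.array([int(np.searchsorted(cum[s], u)) for s in states])
        if np.all(states == states[0]):           # coalescence: the value at time 0 is an exact draw
            return int(states[0])
        T *= 2
    raise RuntimeError("chains did not coalesce; increase max_doublings")

# small ergodic chain with strictly positive entries (coalescence occurs quickly here)
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
draws = [cftp_sample(P, np.random.default_rng(seed)) for seed in range(5000)]
print(np.bincount(draws) / 5000)                  # compare with the stationary distribution of P
```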

Journal ArticleDOI
TL;DR: An algorithm based on interval analysis is presented to find the globally optimal value for the smoothing parameter, and a numerical example illustrates the performance of the algorithm.
Abstract: Generalized cross-validation is a method for choosing the smoothing parameter in smoothing splines and related regularization problems. This method requires the global minimization of the generalized cross-validation function. In this paper an algorithm based on interval analysis is presented to find the globally optimal value for the smoothing parameter, and a numerical example illustrates the performance of the algorithm.
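The sketch below evaluates the generalized cross-validation criterion for a ridge-type smoother and picks the smoothing parameter by a simple grid search; the basis, data and grid are illustrative, and the interval-analysis global optimization that is the subject of the paper is not implemented here.

```python
import numpy as np

def gcv(X, y, lam):
    """Generalized cross-validation score for the ridge smoother S = X (X'X + lam I)^{-1} X'."""
    n, p = X.shape
    S = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - S @ y
    return (np.sum(resid ** 2) / n) / (1.0 - np.trace(S) / n) ** 2

# smooth a noisy curve with a small polynomial basis; choose lam minimising GCV on a grid
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=t.size)
X = np.vander(t, 8, increasing=True)             # simple 8-term polynomial basis
grid = np.logspace(-6, 1, 40)
lam_best = grid[np.argmin([gcv(X, y, lam) for lam in grid])]
print("GCV-chosen lambda:", lam_best)
```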

Journal ArticleDOI
TL;DR: In this paper, Bayes linear separation is introduced as a second-order generalised conditional independence relation, and graphical models are constructed using this property, and interpretive and diagnostic shadings are given, which summarise the analysis over the associated moral graph.
Abstract: This paper concerns the geometric treatment of graphical models using Bayes linear methods. We introduce Bayes linear separation as a second order generalised conditional independence relation, and Bayes linear graphical models are constructed using this property. A system of interpretive and diagnostic shadings are given, which summarise the analysis over the associated moral graph. Principles of local computation are outlined for the graphical models, and an algorithm for implementing such computation over the junction tree is described. The approach is illustrated with two examples. The first concerns sales forecasting using a multivariate dynamic linear model. The second concerns inference for the error variance matrices of the model for sales, and illustrates the generality of our geometric approach by treating the matrices directly as random objects. The examples are implemented using a freely available set of object-oriented programming tools for Bayes linear local computation and graphical diagnostic display.
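The core Bayes linear operations are the adjusted expectation and adjusted variance; the toy second-order specification below is invented for illustration, and none of the graphical-model machinery (shadings, junction-tree propagation) is reproduced.

```python
import numpy as np

def bayes_linear_adjust(EB, ED, var_B, cov_BD, var_D, d):
    """Bayes linear adjusted expectation and variance of B given observed data d for D."""
    K = cov_BD @ np.linalg.inv(var_D)                # resolution ("regression") of B on D
    E_adj = EB + K @ (d - ED)                        # adjusted expectation
    Var_adj = var_B - K @ cov_BD.T                   # adjusted variance
    return E_adj, Var_adj

# toy second-order specification for two quantities B and two observables D
EB, ED = np.array([10.0, 5.0]), np.array([9.0, 6.0])
var_B = np.array([[4.0, 1.0], [1.0, 3.0]])
var_D = np.array([[2.0, 0.5], [0.5, 2.0]])
cov_BD = np.array([[1.5, 0.4], [0.3, 1.0]])
d_obs = np.array([11.0, 5.0])
E_adj, Var_adj = bayes_linear_adjust(EB, ED, var_B, cov_BD, var_D, d_obs)
print("adjusted expectation:", E_adj)
print("adjusted variance:\n", Var_adj)
```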

Journal ArticleDOI
TL;DR: Analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers are presented and a new weighted k-NN classifier is proposed based on resampling ideas.
Abstract: Euclidean distance k-nearest neighbor (k-NN) classifiers are simple nonparametric classification rules. Bootstrap methods, widely used for estimating the expected prediction error of classification rules, are motivated by the objective of calculating the ideal bootstrap estimate of expected prediction error. In practice, bootstrap methods use Monte Carlo resampling to estimate the ideal bootstrap estimate because exact calculation is generally intractable. In this article, we present analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers and propose a new weighted k-NN classifier based on resampling ideas. The resampling-weighted k-NN classifier replaces the k-NN posterior probability estimates by their expectations under resampling and predicts an unclassified covariate as belonging to the group with the largest resampling expectation. A simulation study and an application involving remotely sensed data show that the resampling-weighted k-NN classifier compares favorably to unweighted and distance-weighted k-NN classifiers.

Journal ArticleDOI
TL;DR: The permutation procedure to test for the equality of selected elements of a covariance or correlation matrix across groups involves either centring or standardising each variable within each group before randomly permuting observations between groups.
Abstract: This paper describes a permutation procedure to test for the equality of selected elements of a covariance or correlation matrix across groups. It involves either centring or standardising each variable within each group before randomly permuting observations between groups. Since the assumption of exchangeability of observations between groups does not strictly hold following such transformations, Monte Carlo simulations were used to compare expected and empirical rejection levels as a function of group size, the number of groups and distribution type (Normal, mixtures of Normals and Gamma with various values of the shape parameter). The Monte Carlo study showed that the estimated probability levels are close to those that would be obtained with an exact test except at very small sample sizes (5 or 10 observations per group). The test appears robust against non-normal data, different numbers of groups or variables per group and unequal sample sizes per group. Power was increased with increasing sample size, effect size and the number of elements in the matrix and power was decreased with increasingly unequal numbers of observations per group.
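A two-group sketch of the procedure for a single covariance element: variables are centred within each group, observations are permuted between groups, and the observed difference is compared with its permutation distribution. Group sizes, the tested element and the simulated data are illustrative assumptions.

```python
import numpy as np

def perm_test_cov_element(x1, x2, i=0, j=1, n_perm=5000, seed=0):
    """Permutation test for equality of covariance element (i, j) across two groups.
    Variables are centred within each group before permuting observations between groups."""
    rng = np.random.default_rng(seed)
    c1, c2 = x1 - x1.mean(axis=0), x2 - x2.mean(axis=0)     # within-group centring
    pooled = np.vstack([c1, c2])
    n1 = len(c1)
    def stat(a, b):
        return abs(np.cov(a, rowvar=False)[i, j] - np.cov(b, rowvar=False)[i, j])
    observed = stat(c1, c2)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        count += stat(pooled[perm[:n1]], pooled[perm[n1:]]) >= observed
    return (count + 1) / (n_perm + 1)                        # permutation p-value

rng = np.random.default_rng(1)
g1 = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=60)
g2 = rng.multivariate_normal([2, 1], [[1.0, 0.0], [0.0, 1.0]], size=60)
print("p-value:", perm_test_cov_element(g1, g2))
```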


Journal ArticleDOI
TL;DR: This paper examines the issues of random sampling of a discrete population and offers four new estimators, each with its own strengths and liabilities, and offers some comparative performances of the four with XBAR.
Abstract: Consider the random sampling of a discrete population. The observations, as they are collected one by one, are enhanced in that the probability mass associated with each observation is also observed. The goal is to estimate the population mean. Without this extra information about probability mass, the best general purpose estimator is the arithmetic average of the observations, XBAR. The issue is whether or not the extra information can be used to improve on XBAR. This paper examines the issues and offers four new estimators, each with its own strengths and liabilities. Some comparative performances of the four with XBAR are made. The motivating application is a Monte Carlo simulation that proceeds in two stages. The first stage independently samples n characteristics to obtain a “configuration” of some kind, together with a configuration probability p obtained, if desired, as a product of n individual probabilities. A relatively expensive calculation then determines an output X as a function of the configuration. A random sample of X could simply be averaged to estimate the mean output, but there are possibly more efficient estimators on account of the known configuration probabilities.