
Showing papers in "Statistics and Computing in 1994"


Journal ArticleDOI
TL;DR: This tutorial covers the canonical genetic algorithm as well as more experimental forms of genetic algorithms, including parallel island models and parallel cellular genetic algorithms.
Abstract: This tutorial covers the canonical genetic algorithm as well as more experimental forms of genetic algorithms, including parallel island models and parallel cellular genetic algorithms. The tutorial also illustrates genetic search by hyperplane sampling. The theoretical foundations of genetic algorithms are reviewed, including the schema theorem as well as recently developed exact models of the canonical genetic algorithm.
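The canonical algorithm the tutorial describes fits in a few lines. Below is a minimal sketch, assuming bit-string individuals, fitness-proportionate (roulette-wheel) selection, one-point crossover, and bit-flip mutation; the OneMax fitness function is our illustrative choice, not from the paper.

```python
# Minimal canonical genetic algorithm sketch (OneMax fitness is illustrative).
import random

def canonical_ga(fitness, n_bits=20, pop_size=50, p_cross=0.7,
                 p_mut=0.01, generations=100):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        # Fitness-proportionate (roulette-wheel) selection.
        parents = random.choices(pop, weights=scores, k=pop_size)
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            if random.random() < p_cross:            # one-point crossover
                cut = random.randrange(1, n_bits)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            nxt += [a, b]
        # Bit-flip mutation.
        pop = [[bit ^ (random.random() < p_mut) for bit in ind] for ind in nxt]
    return max(pop, key=fitness)

best = canonical_ga(sum)   # OneMax: fitness is the number of 1-bits
```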

3,967 citations


Journal ArticleDOI
John R. Koza
TL;DR: The recently developed genetic programming paradigm described herein provides a way to search the space of possible computer programs for a highly fit individual computer program to solve (or approximately solve) a surprising variety of different problems from different fields.
Abstract: Many seemingly different problems in machine learning, artificial intelligence, and symbolic processing can be viewed as requiring the discovery of a computer program that produces some desired output for particular inputs. When viewed in this way, the process of solving these problems becomes equivalent to searching a space of possible computer programs for a highly fit individual computer program. The recently developed genetic programming paradigm described herein provides a way to search the space of possible computer programs for a highly fit individual computer program to solve (or approximately solve) a surprising variety of different problems from different fields. In genetic programming, populations of computer programs are genetically bred using the Darwinian principle of survival of the fittest and using a genetic crossover (sexual recombination) operator appropriate for genetically mating computer programs. Genetic programming is illustrated via an example of machine learning of the Boolean 11-multiplexer function and symbolic regression of the econometric exchange equation from noisy empirical data.
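The crossover operator for "genetically mating" programs is the distinctive ingredient. Below is a toy sketch, assuming Lisp-style programs represented as nested lists and an arbitrary arithmetic function set, not Koza's multiplexer setup:

```python
# Toy subtree crossover for genetic programming on nested-list program trees.
import copy
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) pairs for every node in a nested-list program."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):   # tree[0] is the operator
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return copy.deepcopy(new)
    tree = copy.deepcopy(tree)
    node = tree
    for i in path[:-1]:
        node = node[i]
    node[path[-1]] = copy.deepcopy(new)
    return tree

def crossover(p1, p2):
    """Swap randomly chosen subtrees between two parent programs."""
    path1, sub1 = random.choice(list(subtrees(p1)))
    path2, sub2 = random.choice(list(subtrees(p2)))
    return replace(p1, path1, sub2), replace(p2, path2, sub1)

parent1 = ['+', ['*', 'x', 'x'], 'y']    # x*x + y
parent2 = ['-', 'y', ['+', 'x', 1]]      # y - (x + 1)
child1, child2 = crossover(parent1, parent2)
```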

950 citations


Journal ArticleDOI
TL;DR: It is demonstrated that the opportunity exists to develop more advanced procedures that make fuller use of scatter search strategies and their recent extensions.
Abstract: We provide a tutorial survey of connections between genetic algorithms and scatter search that have useful implications for developing new methods for optimization problems. The links between these approaches are rooted in principles underlying mathematical relaxations, which were inherited and extended by scatter search. Hybrid methods incorporating elements of genetic algorithms and scatter search are beginning to be explored in the literature, and we demonstrate that the opportunity exists to develop more advanced procedures that make fuller use of scatter search strategies and their recent extensions.

144 citations


Journal ArticleDOI
TL;DR: Three false steps are identified and discussed: they concern constraints on parameters, neglect of marginality constraints, and confusion between non-centrality parameters and corresponding hypotheses.
Abstract: Inference from the fitting of linear models is basic to statistical practice, but the development of strategies for analysis has been hindered by unnecessary complexities in the descriptions of such models. Three false steps are identified and discussed: they concern constraints on parameters, neglect of marginality constraints, and confusion between non-centrality parameters and corresponding hypotheses. Useful primitive statistical steps are discussed, and the need for strategies, rather than tactics, of analysis is stressed. The implications for the development of good, fully interactive computing software are set out, and illustrated with examples.

126 citations


Journal ArticleDOI
TL;DR: Although completely error-free classification has not been, nor is ever likely to be, achieved, error rates have been reduced to levels that are acceptable for many routine purposes and the subject remains of interest to those involved in statistical classification.
Abstract: Computer-aided imaging systems are now widely used in cytogenetic laboratories to reduce the tedium and labour-intensiveness of traditional methods of chromosome analysis. Automatic chromosome classification is an essential component of such systems, and we review here the statistical techniques that have contributed towards it. Although completely error-free classification has not been, nor is ever likely to be, achieved, error rates have been reduced to levels that are acceptable for many routine purposes. Further reductions are likely to be achieved through advances in basic biology rather than in statistical methodology. Nevertheless, the subject remains of interest to those involved in statistical classification, because of its intrinsic challenges and because of the large body of existing results with which to compare new approaches. Also, the existence of very large databases of correctly-classified chromosomes provides a valuable resource for empirical investigations of the statistical properties of classifiers.

90 citations


Journal ArticleDOI
TL;DR: On-line filtering for non-Gaussian dynamic (state space) models by approximate computation of the first two posterior moments using efficient numerical integration is demonstrated, and it is proved that the posterior moments of the state vector are related to the posterior moments of the linear predictor in a simple way.
Abstract: The main topic of the paper is on-line filtering for non-Gaussian dynamic (state space) models by approximate computation of the first two posterior moments using efficient numerical integration. Based on approximating the prior of the state vector by a normal density, we prove that the posterior moments of the state vector are related to the posterior moments of the linear predictor in a simple way. For the linear predictor Gauss-Hermite integration is carried out with automatic reparametrization based on an approximate posterior mode filter. We illustrate how further topics in applied state space modelling, such as estimating hyperparameters, computing model likelihoods and predictive residuals, are managed by integration-based Kalman-filtering. The methodology derived in the paper is applied to on-line monitoring of ecological time series and filtering for small count data.
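The central numerical step can be illustrated directly. Below is a sketch assuming a Poisson observation model and centring the quadrature rule at the prior mean; the paper's automatic reparametrization around an approximate posterior mode is more refined than this.

```python
# Posterior moments of a linear predictor via Gauss-Hermite quadrature,
# under a normal prior and a (here Poisson, illustrative) likelihood.
import numpy as np

def posterior_moments(y, m, s, n_nodes=20):
    """First two posterior moments of eta ~ N(m, s^2) given y ~ Poisson(exp(eta))."""
    t, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes for weight exp(-t^2)
    eta = m + np.sqrt(2.0) * s * t                    # change of variables
    lik = np.exp(y * eta - np.exp(eta))               # Poisson likelihood kernel
    norm = np.sum(w * lik)                            # constants cancel in ratios
    mean = np.sum(w * lik * eta) / norm
    var = np.sum(w * lik * eta**2) / norm - mean**2
    return mean, var

mean, var = posterior_moments(y=3, m=1.0, s=0.5)
```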

68 citations


Journal ArticleDOI
TL;DR: A simple combinatorial scheme for systematically running through a complete enumeration of sample reuse procedures such as the bootstrap, Hartigan's subsets, and various permutation tests is introduced.
Abstract: We introduce a simple combinatorial scheme for systematically running through a complete enumeration of sample reuse procedures such as the bootstrap, Hartigan's subsets, and various permutation tests. The scheme is based on Gray codes which give ‘tours’ through various spaces, changing only one or two points at a time. We use updating algorithms to avoid recomputing statistics and achieve substantial speedups. Several practical examples and computer codes are given.
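The mechanics are easy to show for subset enumeration. In this sketch (function names are ours, not the paper's), the standard binary reflected Gray code changes exactly one element per step, so the statistic, here a subset sum, is updated with a single addition or subtraction instead of being recomputed.

```python
# Gray-code tour through all subsets with O(1) updating of a subset sum.
def gray_code_subset_sums(x):
    n = len(x)
    in_subset = [False] * n
    total = 0.0
    sums = [total]                       # statistic for the empty subset
    for k in range(1, 2 ** n):
        j = (k & -k).bit_length() - 1    # index of the element that flips
        if in_subset[j]:
            total -= x[j]
        else:
            total += x[j]
        in_subset[j] = not in_subset[j]
        sums.append(total)               # one add/subtract per subset
    return sums

sums = gray_code_subset_sums([2.0, 3.0, 5.0])   # all 8 subset sums
```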

57 citations


Journal ArticleDOI
TL;DR: In this paper, a dynamic programming approach is used to find the temperature schedule which is optimal for a simple minimization problem, and the optimal schedule is compared with certain standard non-optimal choices.
Abstract: It is well known that the behaviour of the simulated annealing approach to optimization is crucially dependent on the choice of temperature schedule. In this paper, a dynamic programming approach is used to find the temperature schedule which is optimal for a simple minimization problem. The optimal schedule is compared with certain standard non-optimal choices. These generally perform well provided the first and last temperatures are suitably selected. Indeed, these temperatures can be chosen in such a way as to make the performance of the logarithmic schedule almost optimal. This optimal performance is fairly robust to the choice of the first temperature. The dynamic programming approach cannot be applied directly to problems of more realistic size, such as those arising in statistical image reconstruction. Nevertheless, some simulation experiments suggest that the general conclusions from the simple minimization problem do carry over to larger problems. Various families of schedules can be made to perform well with suitable choice of the first and last temperatures, and the logarithmic schedule combines good performance with reasonable robustness to the choice of the first temperature.
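A minimal sketch of annealing under the logarithmic schedule T_k = c / log(k + 1) follows; the one-dimensional multimodal objective and the constant c are stand-ins for the paper's careful choice of first and last temperatures.

```python
# Simulated annealing with a logarithmic temperature schedule (toy objective).
import math
import random

def anneal_log_schedule(f, x0, step=0.5, c=1.0, n_iter=10_000):
    x, fx = x0, f(x0)
    for k in range(1, n_iter + 1):
        temp = c / math.log(k + 1)               # logarithmic schedule
        y = x + random.uniform(-step, step)      # propose a neighbour
        fy = f(y)
        if fy <= fx or random.random() < math.exp((fx - fy) / temp):
            x, fx = y, fy                        # accept the move
    return x, fx

# Multimodal test objective (an illustration, not from the paper).
x_best, f_best = anneal_log_schedule(lambda x: x**2 + 2*math.sin(5*x), x0=3.0)
```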

37 citations


Journal ArticleDOI
TL;DR: The most important features of ESs, namely their self-adaptation, as well as their robustness and potential for parallelization which they share with other evolutionary algorithms, are presented.
Abstract: Evolution strategies (ESs) are a special class of probabilistic, direct, global optimization methods. They are similar to genetic algorithms but work in continuous spaces and have the additional capability of self-adapting their major strategy parameters. This paper presents the most important features of ESs, namely their self-adaptation, as well as their robustness and potential for parallelization which they share with other evolutionary algorithms.
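Self-adaptation is the distinctive mechanism. Here is a sketch of a (1, lambda) evolution strategy in which each offspring first perturbs its own step size log-normally and then mutates the object variables; the sphere objective and parameter settings are illustrative assumptions.

```python
# (1, lambda) evolution strategy with log-normal self-adaptation of sigma.
import math
import random

def es_one_comma_lambda(f, x0, sigma0=1.0, lam=10, generations=200):
    n = len(x0)
    tau = 1.0 / math.sqrt(n)                 # learning rate for sigma
    x, sigma = list(x0), sigma0
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            s = sigma * math.exp(tau * random.gauss(0, 1))   # self-adapt sigma
            y = [xi + s * random.gauss(0, 1) for xi in x]
            offspring.append((f(y), y, s))
        _, x, sigma = min(offspring)         # comma selection: best offspring only
    return x, sigma

x, sigma = es_one_comma_lambda(lambda v: sum(t * t for t in v), [5.0, -3.0])
```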

35 citations


Journal ArticleDOI
TL;DR: This paper reviews some of the early development of the method and focuses on three current avenues of research: pattern discovery, system identification and automatic control.
Abstract: Evolutionary programming was originally proposed in 1962 as an alternative method for generating machine intelligence. This paper reviews some of the early development of the method and focuses on three current avenues of research: pattern discovery, system identification and automatic control. Recent efforts along these lines are described. In addition, the application of evolutionary algorithms to autonomous system design on parallel processing computers is briefly discussed.

35 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion, and develop a computational framework based on the use of the SWEEP operator.
Abstract: Stepwise variable selection procedures are computationally inexpensive methods for constructing useful regression models for a single dependent variable. At each step a variable is entered into or deleted from the current model, based on the criterion of minimizing the error sum of squares (SSE). When there is more than one dependent variable, the situation is more complex. In this article we propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion. Specifically, we suggest minimizing some function of the estimated error covariance matrix: the trace, the determinant, or the largest eigenvalue. The computations associated with these criteria may be burdensome. We develop a computational framework based on the use of the SWEEP operator which greatly reduces these calculations for stepwise variable selection in multivariate regression.
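The key computational device is the SWEEP operator. In this sketch (Goodnight's convention, simulated data), sweeping the augmented cross-products matrix on the current predictors leaves the residual SSCP matrix of the responses in the lower-right block, from which the trace, determinant, and largest-eigenvalue criteria can be read off.

```python
# SWEEP operator and the three multivariate stepwise criteria.
import numpy as np

def sweep(A, k):
    """Sweep the symmetric matrix A on pivot k (Goodnight's convention)."""
    A = A.copy()
    d = A[k, k]
    A[k, :] = A[k, :] / d
    for i in range(A.shape[0]):
        if i != k:
            b = A[i, k]
            A[i, :] = A[i, :] - b * A[k, :]
            A[i, k] = -b / d
    A[k, k] = 1.0 / d
    return A

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                                 # predictors
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(50, 2))   # two responses
A = np.hstack([X, Y]).T @ np.hstack([X, Y])                  # cross-products matrix
for k in range(3):                      # sweep in the current predictors
    A = sweep(A, k)
E = A[3:, 3:]                           # residual SSCP matrix of the responses
criteria = np.trace(E), np.linalg.det(E), np.linalg.eigvalsh(E).max()
```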

Journal ArticleDOI
TL;DR: This paper illustrates how the selection of starting points can be made automatically by using a method based upon simulated annealing, and presents a hybrid algorithm, possessing the accuracy of traditional routines, whilst incorporating the reliability of annealing methods.
Abstract: Traditional (non-stochastic) iterative methods for optimizing functions with multiple optima require a good procedure for selecting starting points. This paper illustrates how the selection of starting points can be made automatically by using a method based upon simulated annealing. We present a hybrid algorithm, possessing the accuracy of traditional routines, whilst incorporating the reliability of annealing methods, and illustrate its performance for a particularly complex practical problem.
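A minimal sketch of the hybrid idea: a short annealing run supplies the starting point, and a traditional local optimizer refines it to full accuracy. Nelder-Mead and the Himmelblau test function are our choices here, not the paper's.

```python
# Simulated annealing to pick a starting point, then a local polish.
import math
import random
from scipy.optimize import minimize

def sa_starting_point(f, x0, temp0=5.0, cooling=0.99, n_iter=2000, step=0.5):
    x, fx = x0, f(x0)
    temp = temp0
    for _ in range(n_iter):
        y = [xi + random.uniform(-step, step) for xi in x]
        fy = f(y)
        if fy <= fx or random.random() < math.exp((fx - fy) / temp):
            x, fx = y, fy
        temp *= cooling                       # geometric cooling
    return x

f = lambda v: (v[0]**2 + v[1] - 11)**2 + (v[0] + v[1]**2 - 7)**2  # Himmelblau
start = sa_starting_point(f, [0.0, 0.0])
result = minimize(f, start, method="Nelder-Mead")   # accurate local refinement
```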

Journal ArticleDOI
TL;DR: In this paper, a new shrinkage estimator of the coefficients of a linear model is derived, motivated by the gradient-descent algorithm used to minimize the sum of squared errors and results from early stopping of the algorithm.
Abstract: A new shrinkage estimator of the coefficients of a linear model is derived. The estimator is motivated by the gradient-descent algorithm used to minimize the sum of squared errors and results from early stopping of the algorithm. The statistical properties of the estimator are examined and compared with other well-established methods such as least squares and ridge regression, both analytically and through a simulation study. An important result is that the new estimator is shown to be comparable to other shrinkage estimators in terms of mean squared error of parameters and of predictions, and superior under certain circumstances.
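The estimator is easy to state in code. In this sketch (simulated data and an arbitrary stopping iteration), gradient descent on the least-squares loss, started at zero and stopped early, shrinks the coefficients toward zero much as ridge regression does.

```python
# Early-stopping gradient descent as a shrinkage estimator.
import numpy as np

def early_stopping_estimator(X, y, lr=None, n_steps=50):
    n, p = X.shape
    if lr is None:
        lr = 1.0 / np.linalg.norm(X, 2) ** 2   # step size safe for convergence
    beta = np.zeros(p)                         # start at full shrinkage
    for _ in range(n_steps):                   # stopping early = shrinkage
        beta += lr * X.T @ (y - X @ beta)      # gradient step on SSE/2
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, -0.5, 2.0]) + rng.normal(size=100)
beta_gd = early_stopping_estimator(X, y, n_steps=25)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]    # the no-shrinkage limit
```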

Journal ArticleDOI
TL;DR: The idea of searching for orthogonal projections, from a multidimensional space into a linear subspace, as an aid to detecting non-linear structure has been named exploratory projection pursuit as discussed by the authors.
Abstract: The idea of searching for orthogonal projections, from a multidimensional space into a linear subspace, as an aid to detecting non-linear structure has been named exploratory projection pursuit.

Journal ArticleDOI
TL;DR: The purpose of the software is to provide a practical alternative to difficult manual algebraic computations and the result is a method that is quick and free of clerical error.
Abstract: We describe a set of procedures that automate many algebraic calculations common in statistical asymptotic theory. The procedures are very general and serve to unify the study of likelihood and likelihood type functions. The procedures emulate techniques one would normally carry out by hand; this strategy is emphasised throughout the paper. The purpose of the software is to provide a practical alternative to difficult manual algebraic computations. The result is a method that is quick and free of clerical error.

Journal ArticleDOI
TL;DR: In this article, Gibbs sampling is used for obtaining accurate approximations to marginal densities for a large and flexible family of posterior distributions, the A family, and two alternative Monte Carlo strategies are also discussed.
Abstract: The full Bayesian analysis of multinomial data using informative and flexible prior distributions has, in the past, been restricted by the technical problems involved in performing the numerical integrations required to obtain marginal densities for parameters and other functions thereof. In this paper it is shown that Gibbs sampling is suitable for obtaining accurate approximations to marginal densities for a large and flexible family of posterior distributions—the A family. The method is illustrated with a three-way contingency table. Two alternative Monte Carlo strategies are also discussed.
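The mechanics of Gibbs sampling for multinomial data can be shown on the classic genetic-linkage example, a simpler model than the A family but the same idea: augmenting the first cell with a latent count makes both full conditionals standard distributions.

```python
# Gibbs sampler for the classic genetic-linkage multinomial example.
import numpy as np

rng = np.random.default_rng(2)
y = np.array([125, 18, 20, 34])     # multinomial counts with cell probabilities
                                    # (1/2 + th/4, (1-th)/4, (1-th)/4, th/4)
theta, draws = 0.5, []
for _ in range(5000):
    # z | theta, y: the part of cell 1 attributable to the theta/4 component.
    z = rng.binomial(y[0], (theta / 4) / (0.5 + theta / 4))
    # theta | z, y is Beta under a uniform prior.
    theta = rng.beta(z + y[3] + 1, y[1] + y[2] + 1)
    draws.append(theta)
posterior_mean = np.mean(draws[500:])    # discard burn-in
```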

Journal ArticleDOI
TL;DR: The Swendsen-Wang algorithm, for simulating Potts models, may be used to simulate certain types of posterior Gibbs distribution, as a special case of Edwards and Sokal (1988), and the behaviour of the algorithm is empirically compared with that of the Gibbs sampler.
Abstract: We show in detail how the Swendsen-Wang algorithm, for simulating Potts models, may be used to simulate certain types of posterior Gibbs distribution, as a special case of Edwards and Sokal (1988), and we empirically compare the behaviour of the algorithm with that of the Gibbs sampler. Some marginal posterior mode and simulated annealing image restorations are also examined. Our results demonstrate the importance of the starting configuration. If this is inappropriate, the Swendsen-Wang method can suffer from critical slowing in moderately noise-free situations where the Gibbs sampler convergence is very fast, whereas the reverse is true when noise level is high.
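One Swendsen-Wang sweep for a q-state Potts model is sketched below, with the coupling absorbed into beta and the paper's image-analysis details (data term, posterior) omitted: open a bond between equal neighbouring spins with probability 1 - exp(-beta), find the clusters of open bonds, and give each cluster a fresh random state.

```python
# One Swendsen-Wang sweep on a grid, using union-find for bond clusters.
import numpy as np

def find(parent, i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]   # path halving
        i = parent[i]
    return i

def swendsen_wang_sweep(spins, beta, q, rng):
    n, m = spins.shape
    parent = list(range(n * m))
    flat = spins.ravel()
    for i in range(n):
        for j in range(m):
            a = i * m + j
            for b in ((i + 1) * m + j if i + 1 < n else None,
                      i * m + j + 1 if j + 1 < m else None):
                if b is not None and flat[a] == flat[b] \
                        and rng.random() < 1 - np.exp(-beta):
                    ra, rb = find(parent, a), find(parent, b)
                    parent[ra] = rb          # join the two clusters
    labels = np.array([find(parent, i) for i in range(n * m)])
    new_state = {lab: rng.integers(q) for lab in set(labels)}
    return np.array([new_state[l] for l in labels]).reshape(n, m)

rng = np.random.default_rng(3)
spins = rng.integers(2, size=(32, 32))            # q = 2: the Ising case
spins = swendsen_wang_sweep(spins, beta=0.9, q=2, rng=rng)
```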

Journal ArticleDOI
TL;DR: In this paper, the probability density functions of observed open (closed) sojourn-times incorporating time interval omission are computed using a system of Volterra integral equations, whose solution governs the required density function.
Abstract: The dynamical aspects of single ion channel gating can be modelled by a semi-Markov process. There is aggregation of states, corresponding to the receptor channel being open or closed, and there is time interval omission, brief sojourns in either the open or closed classes of states not being detected. This paper is concerned with the computation of the probability density functions of observed open (closed) sojourn-times incorporating time interval omission. A system of Volterra integral equations is derived, whose solution governs the required density function. Numerical procedures, using iterative and multistep methods, are described for solving these equations. Examples are given, and in the special case of Markov models results are compared with those obtained by alternative methods. Probabilistic interpretations are given for the iterative methods, which also give lower bounds for the solutions.
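The numerical core is the solution of a second-kind Volterra equation f(t) = g(t) + int_0^t K(t,s) f(s) ds. Below is a sketch using the trapezoidal rule, on a toy kernel with known solution exp(t) rather than the paper's sojourn-time equations.

```python
# Trapezoidal (multistep) solver for a Volterra equation of the second kind.
import numpy as np

def solve_volterra(g, K, t_max, n):
    h = t_max / n
    t = np.linspace(0.0, t_max, n + 1)
    f = np.empty(n + 1)
    f[0] = g(t[0])
    for i in range(1, n + 1):
        # Trapezoidal weights: half at the endpoints, one in between.
        s = 0.5 * K(t[i], t[0]) * f[0]
        s += sum(K(t[i], t[j]) * f[j] for j in range(1, i))
        # Solve the implicit equation for f[i].
        f[i] = (g(t[i]) + h * s) / (1.0 - 0.5 * h * K(t[i], t[i]))
    return t, f

# Toy problem with known solution f(t) = exp(t):  g(t) = 1, K(t, s) = 1.
t, f = solve_volterra(lambda t: 1.0, lambda t, s: 1.0, t_max=1.0, n=100)
```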

Journal ArticleDOI
TL;DR: These methods maintain populations of individuals with a nonlinear chromosomal structure and use ‘genetic’ operators enhanced by problem-specific knowledge.
Abstract: The paper presents non-standard methods in evolutionary computation and discusses their applicability to various optimization problems. These methods maintain populations of individuals with a nonlinear chromosomal structure and use ‘genetic’ operators enhanced by problem-specific knowledge.

Journal ArticleDOI
TL;DR: In this article, computer software can monitor the types of contrasts a user examines, and select the smallest family of contrasts that is likely to be of interest, and calculate simultaneous confidence intervals for these families using a hybrid of the Bonferroni and Scheffé methods.
Abstract: Statisticians often employ simultaneous confidence intervals to reduce the likelihood of their drawing false conclusions when they must make a number of comparisons. To do this properly, it is necessary to consider the family of comparisons over which simultaneous confidence must be assured. Sometimes it is not clear what family of comparisons is appropriate. We describe how computer software can monitor the types of contrasts a user examines, and select the smallest family of contrasts that is likely to be of interest. We also describe how to calculate simultaneous confidence intervals for these families using a hybrid of the Bonferroni and Scheffé methods. Our method is especially suitable for problems with discrete and continuous predictors.
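The hybrid itself is a one-line decision: for a family of m contrasts in a model with p parameters, use whichever of the Bonferroni and Scheffé critical values is smaller, since each alone guarantees simultaneous coverage. A sketch with illustrative numbers:

```python
# Hybrid Bonferroni/Scheffe critical value for m simultaneous contrasts.
import numpy as np
from scipy import stats

def hybrid_critical_value(m, p, df, alpha=0.05):
    bonferroni = stats.t.ppf(1 - alpha / (2 * m), df)     # two-sided Bonferroni
    scheffe = np.sqrt(p * stats.f.ppf(1 - alpha, p, df))  # Scheffe bound
    return min(bonferroni, scheffe)

# Small families favour Bonferroni, large families favour Scheffe:
for m in (3, 10, 1000):
    print(m, hybrid_critical_value(m, p=4, df=30))
```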

Journal ArticleDOI
TL;DR: Presto is a software package that automatically generates FORTRAN code corresponding to approximation procedures for the solutions of stochastic differential systems; it is an INRIA product, free for academic institutions and universities that already have MAPLE and X-Windows licences.
Abstract: Presto is a software package that automatically generates FORTRAN code corresponding to approximation procedures for the solutions of stochastic differential systems.
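For concreteness, here is the kind of approximation scheme such a generator emits, written directly in Python rather than generated FORTRAN: the Euler-Maruyama discretization of dX = a(X) dt + b(X) dW, with an Ornstein-Uhlenbeck process as a toy choice.

```python
# Euler-Maruyama scheme for a scalar stochastic differential equation.
import numpy as np

def euler_maruyama(a, b, x0, t_max, n, rng):
    h = t_max / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dw = rng.normal(0.0, np.sqrt(h))              # Brownian increment
        x[k + 1] = x[k] + a(x[k]) * h + b(x[k]) * dw  # one Euler step
    return x

rng = np.random.default_rng(4)
path = euler_maruyama(a=lambda x: -0.5 * x, b=lambda x: 0.3,
                      x0=1.0, t_max=10.0, n=1000, rng=rng)
```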

Journal ArticleDOI
TL;DR: This paper studies the performance of a portable parallel unconstrained non-gradient optimization algorithm, when executed on various shared-memory multiprocessor systems, compared with its non-portable counterpart.
Abstract: It is well known that the availability of cost-effective and powerful parallel computers has enhanced the ability of the operations research community to solve laborious computational problems. But many researchers argue that the lack of portability of parallel algorithms is a major drawback to utilizing parallel computers. This paper studies the performance of a portable parallel unconstrained non-gradient optimization algorithm, when executed on various shared-memory multiprocessor systems, compared with its non-portable counterpart. Analysis of covariance is used to analyse how the algorithm's performance is affected by several factors of interest. The results yield further insight into parallel computing.

Journal ArticleDOI
TL;DR: In this paper, a conditional simulation technique is used to estimate probabilities associated with the distribution of the maximum of a real-valued process which can be written in the form of a moving average.
Abstract: This paper describes a conditional simulation technique which can be used to estimate probabilities associated with the distribution of the maximum of a real-valued process which can be written in the form of a moving average. The class of processes to which the technique applies includes non-stationary and spatial processes, and autoregressive processes. The technique is shown to achieve a considerable variance reduction compared with the obvious simulation-based estimator, particularly for estimating small upper-tail probabilities.
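For concreteness, the "obvious" baseline estimator that the conditional technique improves on can be sketched as follows; the MA(2) coefficients and the threshold are arbitrary illustrative choices, and the paper's conditional simulation replaces this crude estimator with a much lower-variance one.

```python
# Crude Monte Carlo estimate of P(max_t X_t > u) for a moving-average process.
import numpy as np

def crude_max_prob(theta, n, u, n_sims, rng):
    q = len(theta)
    hits = 0
    for _ in range(n_sims):
        eps = rng.normal(size=n + q)
        x = np.convolve(eps, theta, mode="valid")   # moving average of the noise
        hits += x.max() > u
    return hits / n_sims

rng = np.random.default_rng(5)
p_hat = crude_max_prob(theta=[1.0, 0.6, 0.3], n=200, u=3.5,
                       n_sims=10_000, rng=rng)
```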