
Showing papers in "Statistics and Computing in 1994"


Journal ArticleDOI
TL;DR: This tutorial covers the canonical genetic algorithm as well as more experimental forms of genetic algorithms, including parallel island models and parallel cellular genetic algorithms.
Abstract: This tutorial covers the canonical genetic algorithm as well as more experimental forms of genetic algorithms, including parallel island models and parallel cellular genetic algorithms. The tutorial also illustrates genetic search by hyperplane sampling. The theoretical foundations of genetic algorithms are reviewed, including the schema theorem as well as recently developed exact models of the canonical genetic algorithm.
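The canonical algorithm the tutorial describes fits in a few lines. Below is a minimal sketch, assuming bit-string individuals, fitness-proportionate (roulette-wheel) selection, one-point crossover, and bit-flip mutation; the OneMax fitness function is our illustrative choice, not from the paper.

```python
# Minimal canonical genetic algorithm sketch (OneMax fitness is illustrative).
import random

def canonical_ga(fitness, n_bits=20, pop_size=50, p_cross=0.7,
                 p_mut=0.01, generations=100):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        # Fitness-proportionate (roulette-wheel) selection.
        parents = random.choices(pop, weights=scores, k=pop_size)
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            if random.random() < p_cross:            # one-point crossover
                cut = random.randrange(1, n_bits)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            nxt += [a, b]
        # Bit-flip mutation.
        pop = [[bit ^ (random.random() < p_mut) for bit in ind] for ind in nxt]
    return max(pop, key=fitness)

best = canonical_ga(sum)   # OneMax: fitness is the number of 1-bits
```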

3,967 citations


Journal ArticleDOI
John R. Koza
TL;DR: The recently developed genetic programming paradigm described herein provides a way to search the space of possible computer programs for a highly fit individual computer program to solve (or approximately solve) a surprising variety of different problems from different fields.
Abstract: Many seemingly different problems in machine learning, artificial intelligence, and symbolic processing can be viewed as requiring the discovery of a computer program that produces some desired output for particular inputs. When viewed in this way, the process of solving these problems becomes equivalent to searching a space of possible computer programs for a highly fit individual computer program. The recently developed genetic programming paradigm described herein provides a way to search the space of possible computer programs for a highly fit individual computer program to solve (or approximately solve) a surprising variety of different problems from different fields. In genetic programming, populations of computer programs are genetically bred using the Darwinian principle of survival of the fittest and using a genetic crossover (sexual recombination) operator appropriate for genetically mating computer programs. Genetic programming is illustrated via an example of machine learning of the Boolean 11-multiplexer function and symbolic regression of the econometric exchange equation from noisy empirical data.
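The crossover operator for "genetically mating" programs is the distinctive ingredient. Below is a toy sketch, assuming Lisp-style programs represented as nested lists and an arbitrary arithmetic function set, not Koza's multiplexer setup:

```python
# Toy subtree crossover for genetic programming on nested-list program trees.
import copy
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) pairs for every node in a nested-list program."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):   # tree[0] is the operator
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return copy.deepcopy(new)
    tree = copy.deepcopy(tree)
    node = tree
    for i in path[:-1]:
        node = node[i]
    node[path[-1]] = copy.deepcopy(new)
    return tree

def crossover(p1, p2):
    """Swap randomly chosen subtrees between two parent programs."""
    path1, sub1 = random.choice(list(subtrees(p1)))
    path2, sub2 = random.choice(list(subtrees(p2)))
    return replace(p1, path1, sub2), replace(p2, path2, sub1)

parent1 = ['+', ['*', 'x', 'x'], 'y']    # x*x + y
parent2 = ['-', 'y', ['+', 'x', 1]]      # y - (x + 1)
child1, child2 = crossover(parent1, parent2)
```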

950 citations


Journal ArticleDOI
TL;DR: It is demonstrated that the opportunity exists to develop more advanced procedures that make fuller use of scatter search strategies and their recent extensions.
Abstract: We provide a tutorial survey of connections between genetic algorithms and scatter search that have useful implications for developing new methods for optimization problems. The links between these approaches are rooted in principles underlying mathematical relaxations, which were inherited and extended by scatter search. Hybrid methods incorporating elements of genetic algorithms and scatter search are beginning to be explored in the literature, and we demonstrate that the opportunity exists to develop more advanced procedures that make fuller use of scatter search strategies and their recent extensions.

144 citations


Journal ArticleDOI
TL;DR: Three false steps are identified and discussed: they concern constraints on parameters, neglect of marginality constraints, and confusion between non-centrality parameters and corresponding hypotheses.
Abstract: Inference from the fitting of linear models is basic to statistical practice, but the development of strategies for analysis has been hindered by unnecessary complexities in the descriptions of such models. Three false steps are identified and discussed: they concern constraints on parameters, neglect of marginality constraints, and confusion between non-centrality parameters and corresponding hypotheses. Useful primitive statistical steps are discussed, and the need for strategies, rather than tactics, of analysis is stressed. The implications for the development of good, fully interactive computing software are set out, and illustrated with examples.

126 citations


Journal ArticleDOI
TL;DR: Although completely error-free classification has not been, nor is ever likely to be, achieved, error rates have been reduced to levels that are acceptable for many routine purposes and the subject remains of interest to those involved in statistical classification.
Abstract: Computer-aided imaging systems are now widely used in cytogenetic laboratories to reduce the tedium and labour-intensiveness of traditional methods of chromosome analysis. Automatic chromosome classification is an essential component of such systems, and we review here the statistical techniques that have contributed towards it. Although completely error-free classification has not been, nor is ever likely to be, achieved, error rates have been reduced to levels that are acceptable for many routine purposes. Further reductions are likely to be achieved through advances in basic biology rather than in statistical methodology. Nevertheless, the subject remains of interest to those involved in statistical classification, because of its intrinsic challenges and because of the large body of existing results with which to compare new approaches. Also, the existence of very large databases of correctly-classified chromosomes provides a valuable resource for empirical investigations of the statistical properties of classifiers.

90 citations


Journal ArticleDOI
TL;DR: On-line filtering for non-Gaussian dynamic (state space) models by approximate computation of the first two posterior moments using efficient numerical integration is demonstrated, and it is proved that the posterior moments of the state vector are related to the posterior moments of the linear predictor in a simple way.
Abstract: The main topic of the paper is on-line filtering for non-Gaussian dynamic (state space) models by approximate computation of the first two posterior moments using efficient numerical integration. Based on approximating the prior of the state vector by a normal density, we prove that the posterior moments of the state vector are related to the posterior moments of the linear predictor in a simple way. For the linear predictor Gauss-Hermite integration is carried out with automatic reparametrization based on an approximate posterior mode filter. We illustrate how further topics in applied state space modelling, such as estimating hyperparameters, computing model likelihoods and predictive residuals, are managed by integration-based Kalman-filtering. The methodology derived in the paper is applied to on-line monitoring of ecological time series and filtering for small count data.
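The central numerical step can be illustrated directly. Below is a sketch assuming a Poisson observation model and centring the quadrature rule at the prior mean; the paper's automatic reparametrization around an approximate posterior mode is more refined than this.

```python
# Posterior moments of a linear predictor via Gauss-Hermite quadrature,
# under a normal prior and a (here Poisson, illustrative) likelihood.
import numpy as np

def posterior_moments(y, m, s, n_nodes=20):
    """First two posterior moments of eta ~ N(m, s^2) given y ~ Poisson(exp(eta))."""
    t, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes for weight exp(-t^2)
    eta = m + np.sqrt(2.0) * s * t                    # change of variables
    lik = np.exp(y * eta - np.exp(eta))               # Poisson likelihood kernel
    norm = np.sum(w * lik)                            # constants cancel in ratios
    mean = np.sum(w * lik * eta) / norm
    var = np.sum(w * lik * eta**2) / norm - mean**2
    return mean, var

mean, var = posterior_moments(y=3, m=1.0, s=0.5)
```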

68 citations


Journal ArticleDOI
TL;DR: A simple combinatorial scheme for systematically running through a complete enumeration of sample reuse procedures such as the bootstrap, Hartigan's subsets, and various permutation tests is introduced.
Abstract: We introduce a simple combinatorial scheme for systematically running through a complete enumeration of sample reuse procedures such as the bootstrap, Hartigan's subsets, and various permutation tests. The scheme is based on Gray codes which give ‘tours’ through various spaces, changing only one or two points at a time. We use updating algorithms to avoid recomputing statistics and achieve substantial speedups. Several practical examples and computer codes are given.
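The mechanics are easy to show for subset enumeration. In this sketch (function names are ours, not the paper's), the standard binary reflected Gray code changes exactly one element per step, so the statistic, here a subset sum, is updated with a single addition or subtraction instead of being recomputed.

```python
# Gray-code tour through all subsets with O(1) updating of a subset sum.
def gray_code_subset_sums(x):
    n = len(x)
    in_subset = [False] * n
    total = 0.0
    sums = [total]                       # statistic for the empty subset
    for k in range(1, 2 ** n):
        j = (k & -k).bit_length() - 1    # index of the element that flips
        if in_subset[j]:
            total -= x[j]
        else:
            total += x[j]
        in_subset[j] = not in_subset[j]
        sums.append(total)               # one add/subtract per subset
    return sums

sums = gray_code_subset_sums([2.0, 3.0, 5.0])   # all 8 subset sums
```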

57 citations


Journal ArticleDOI
TL;DR: In this paper, a dynamic programming approach is used to find the temperature schedule which is optimal for a simple minimization problem, and the optimal schedule is compared with certain standard non-optimal choices.
Abstract: It is well known that the behaviour of the simulated annealing approach to optimization is crucially dependent on the choice of temperature schedule. In this paper, a dynamic programming approach is used to find the temperature schedule which is optimal for a simple minimization problem. The optimal schedule is compared with certain standard non-optimal choices. These generally perform well provided the first and last temperatures are suitably selected. Indeed, these temperatures can be chosen in such a way as to make the performance of the logarithmic schedule almost optimal. This optimal performance is fairly robust to the choice of the first temperature. The dynamic programming approach cannot be applied directly to problems of more realistic size, such as those arising in statistical image reconstruction. Nevertheless, some simulation experiments suggest that the general conclusions from the simple minimization problem do carry over to larger problems. Various families of schedules can be made to perform well with suitable choice of the first and last temperatures, and the logarithmic schedule combines good performance with reasonable robustness to the choice of the first temperature.
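A minimal sketch of annealing under the logarithmic schedule T_k = c / log(k + 1) follows; the one-dimensional multimodal objective and the constant c are stand-ins for the paper's careful choice of first and last temperatures.

```python
# Simulated annealing with a logarithmic temperature schedule (toy objective).
import math
import random

def anneal_log_schedule(f, x0, step=0.5, c=1.0, n_iter=10_000):
    x, fx = x0, f(x0)
    for k in range(1, n_iter + 1):
        temp = c / math.log(k + 1)               # logarithmic schedule
        y = x + random.uniform(-step, step)      # propose a neighbour
        fy = f(y)
        if fy <= fx or random.random() < math.exp((fx - fy) / temp):
            x, fx = y, fy                        # accept the move
    return x, fx

# Multimodal test objective (an illustration, not from the paper).
x_best, f_best = anneal_log_schedule(lambda x: x**2 + 2*math.sin(5*x), x0=3.0)
```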

37 citations


Journal ArticleDOI
TL;DR: The most important features of ESs, namely their self-adaptation, as well as their robustness and potential for parallelization which they share with other evolutionary algorithms, are presented.
Abstract: Evolution strategies (ESs) are a special class of probabilistic, direct, global optimization methods. They are similar to genetic algorithms but work in continuous spaces and have the additional capability of self-adapting their major strategy parameters. This paper presents the most important features of ESs, namely their self-adaptation, as well as their robustness and potential for parallelization which they share with other evolutionary algorithms.
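Self-adaptation is the distinctive mechanism. Here is a sketch of a (1, lambda) evolution strategy in which each offspring first perturbs its own step size log-normally and then mutates the object variables; the sphere objective and parameter settings are illustrative assumptions.

```python
# (1, lambda) evolution strategy with log-normal self-adaptation of sigma.
import math
import random

def es_one_comma_lambda(f, x0, sigma0=1.0, lam=10, generations=200):
    n = len(x0)
    tau = 1.0 / math.sqrt(n)                 # learning rate for sigma
    x, sigma = list(x0), sigma0
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            s = sigma * math.exp(tau * random.gauss(0, 1))   # self-adapt sigma
            y = [xi + s * random.gauss(0, 1) for xi in x]
            offspring.append((f(y), y, s))
        _, x, sigma = min(offspring)         # comma selection: best offspring only
    return x, sigma

x, sigma = es_one_comma_lambda(lambda v: sum(t * t for t in v), [5.0, -3.0])
```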

35 citations


Journal ArticleDOI
TL;DR: This paper reviews some of the early development of the method and focuses on three current avenues of research: pattern discovery, system identification and automatic control.
Abstract: Evolutionary programming was originally proposed in 1962 as an alternative method for generating machine intelligence. This paper reviews some of the early development of the method and focuses on three current avenues of research: pattern discovery, system identification and automatic control. Recent efforts along these lines are described. In addition, the application of evolutionary algorithms to autonomous system design on parallel processing computers is briefly discussed.

35 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion, and develop a computational framework based on the use of the SWEEP operator.
Abstract: Stepwise variable selection procedures are computationally inexpensive methods for constructing useful regression models for a single dependent variable. At each step a variable is entered into or deleted from the current model, based on the criterion of minimizing the error sum of squares (SSE). When there is more than one dependent variable, the situation is more complex. In this article we propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion. Specifically, we suggest minimizing some function of the estimated error covariance matrix: the trace, the determinant, or the largest eigenvalue. The computations associated with these criteria may be burdensome. We develop a computational framework based on the use of the SWEEP operator which greatly reduces these calculations for stepwise variable selection in multivariate regression.
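The key computational device is the SWEEP operator. In this sketch (Goodnight's convention, simulated data), sweeping the augmented cross-products matrix on the current predictors leaves the residual SSCP matrix of the responses in the lower-right block, from which the trace, determinant, and largest-eigenvalue criteria can be read off.

```python
# SWEEP operator and the three multivariate stepwise criteria.
import numpy as np

def sweep(A, k):
    """Sweep the symmetric matrix A on pivot k (Goodnight's convention)."""
    A = A.copy()
    d = A[k, k]
    A[k, :] = A[k, :] / d
    for i in range(A.shape[0]):
        if i != k:
            b = A[i, k]
            A[i, :] = A[i, :] - b * A[k, :]
            A[i, k] = -b / d
    A[k, k] = 1.0 / d
    return A

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                                 # predictors
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(50, 2))   # two responses
A = np.hstack([X, Y]).T @ np.hstack([X, Y])                  # cross-products matrix
for k in range(3):                      # sweep in the current predictors
    A = sweep(A, k)
E = A[3:, 3:]                           # residual SSCP matrix of the responses
criteria = np.trace(E), np.linalg.det(E), np.linalg.eigvalsh(E).max()
```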

Journal ArticleDOI
TL;DR: This paper illustrates how the selection of starting points can be made automatically by using a method based upon simulated annealing, and presents a hybrid algorithm, possessing the accuracy of traditional routines, whilst incorporating the reliability of annealing methods.
Abstract: Traditional (non-stochastic) iterative methods for optimizing functions with multiple optima require a good procedure for selecting starting points. This paper illustrates how the selection of starting points can be made automatically by using a method based upon simulated annealing. We present a hybrid algorithm, possessing the accuracy of traditional routines, whilst incorporating the reliability of annealing methods, and illustrate its performance for a particularly complex practical problem.
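A minimal sketch of the hybrid idea: a short annealing run supplies the starting point, and a traditional local optimizer refines it to full accuracy. Nelder-Mead and the Himmelblau test function are our choices here, not the paper's.

```python
# Simulated annealing to pick a starting point, then a local polish.
import math
import random
from scipy.optimize import minimize

def sa_starting_point(f, x0, temp0=5.0, cooling=0.99, n_iter=2000, step=0.5):
    x, fx = x0, f(x0)
    temp = temp0
    for _ in range(n_iter):
        y = [xi + random.uniform(-step, step) for xi in x]
        fy = f(y)
        if fy <= fx or random.random() < math.exp((fx - fy) / temp):
            x, fx = y, fy
        temp *= cooling                       # geometric cooling
    return x

f = lambda v: (v[0]**2 + v[1] - 11)**2 + (v[0] + v[1]**2 - 7)**2  # Himmelblau
start = sa_starting_point(f, [0.0, 0.0])
result = minimize(f, start, method="Nelder-Mead")   # accurate local refinement
```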

Journal ArticleDOI
TL;DR: In this paper, a new shrinkage estimator of the coefficients of a linear model is derived, motivated by the gradient-descent algorithm used to minimize the sum of squared errors and results from early stopping of the algorithm.
Abstract: A new shrinkage estimator of the coefficients of a linear model is derived. The estimator is motivated by the gradient-descent algorithm used to minimize the sum of squared errors and results from early stopping of the algorithm. The statistical properties of the estimator are examined and compared with other well-established methods such as least squares and ridge regression, both analytically and through a simulation study. An important result is that the new estimator is shown to be comparable to other shrinkage estimators in terms of mean squared error of parameters and of predictions, and superior under certain circumstances.
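The estimator is easy to state in code. In this sketch (simulated data and an arbitrary stopping iteration), gradient descent on the least-squares loss, started at zero and stopped early, shrinks the coefficients toward zero much as ridge regression does.

```python
# Early-stopping gradient descent as a shrinkage estimator.
import numpy as np

def early_stopping_estimator(X, y, lr=None, n_steps=50):
    n, p = X.shape
    if lr is None:
        lr = 1.0 / np.linalg.norm(X, 2) ** 2   # step size safe for convergence
    beta = np.zeros(p)                         # start at full shrinkage
    for _ in range(n_steps):                   # stopping early = shrinkage
        beta += lr * X.T @ (y - X @ beta)      # gradient step on SSE/2
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, -0.5, 2.0]) + rng.normal(size=100)
beta_gd = early_stopping_estimator(X, y, n_steps=25)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]    # the no-shrinkage limit
```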

Journal ArticleDOI
TL;DR: The idea of searching for orthogonal projections, from a multidimensional space into a linear subspace, as an aid to detecting non-linear structure has been named exploratory projection pursuit as discussed by the authors.
Abstract: The idea of searching for orthogonal projections, from a multidimensional space into a linear subspace, as an aid to detecting non-linear structure has been named exploratory projection pursuit.

Journal ArticleDOI
TL;DR: The purpose of the software is to provide a practical alternative to difficult manual algebraic computations and the result is a method that is quick and free of clerical error.
Abstract: We describe a set of procedures that automate many algebraic calculations common in statistical asymptotic theory. The procedures are very general and serve to unify the study of likelihood and likelihood type functions. The procedures emulate techniques one would normally carry out by hand; this strategy is emphasised throughout the paper. The purpose of the software is to provide a practical alternative to difficult manual algebraic computations. The result is a method that is quick and free of clerical error.

Journal ArticleDOI
TL;DR: In this article, Gibbs sampling is used for obtaining accurate approximations to marginal densities for a large and flexible family of posterior distributions, the A family, and two alternative Monte Carlo strategies are also discussed.
Abstract: The full Bayesian analysis of multinomial data using informative and flexible prior distributions has, in the past, been restricted by the technical problems involved in performing the numerical integrations required to obtain marginal densities for parameters and other functions thereof. In this paper it is shown that Gibbs sampling is suitable for obtaining accurate approximations to marginal densities for a large and flexible family of posterior distributions—the A family. The method is illustrated with a three-way contingency table. Two alternative Monte Carlo strategies are also discussed.
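The mechanics of Gibbs sampling for multinomial data can be shown on the classic genetic-linkage example, a simpler model than the A family but the same idea: augmenting the first cell with a latent count makes both full conditionals standard distributions.

```python
# Gibbs sampler for the classic genetic-linkage multinomial example.
import numpy as np

rng = np.random.default_rng(2)
y = np.array([125, 18, 20, 34])     # multinomial counts with cell probabilities
                                    # (1/2 + th/4, (1-th)/4, (1-th)/4, th/4)
theta, draws = 0.5, []
for _ in range(5000):
    # z | theta, y: the part of cell 1 attributable to the theta/4 component.
    z = rng.binomial(y[0], (theta / 4) / (0.5 + theta / 4))
    # theta | z, y is Beta under a uniform prior.
    theta = rng.beta(z + y[3] + 1, y[1] + y[2] + 1)
    draws.append(theta)
posterior_mean = np.mean(draws[500:])    # discard burn-in
```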

Journal ArticleDOI
TL;DR: The Swendsen-Wang algorithm, for simulating Potts models, may be used to simulate certain types of posterior Gibbs distribution, as a special case of Edwards and Sokal (1988), and the behaviour of the algorithm is empirically compared with that of the Gibbs sampler.
Abstract: We show in detail how the Swendsen-Wang algorithm, for simulating Potts models, may be used to simulate certain types of posterior Gibbs distribution, as a special case of Edwards and Sokal (1988), and we empirically compare the behaviour of the algorithm with that of the Gibbs sampler. Some marginal posterior mode and simulated annealing image restorations are also examined. Our results demonstrate the importance of the starting configuration. If this is inappropriate, the Swendsen-Wang method can suffer from critical slowing in moderately noise-free situations where the Gibbs sampler convergence is very fast, whereas the reverse is true when noise level is high.
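One Swendsen-Wang sweep for a q-state Potts model is sketched below, with the coupling absorbed into beta and the paper's image-analysis details (data term, posterior) omitted: open a bond between equal neighbouring spins with probability 1 - exp(-beta), find the clusters of open bonds, and give each cluster a fresh random state.

```python
# One Swendsen-Wang sweep on a grid, using union-find for bond clusters.
import numpy as np

def find(parent, i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]   # path halving
        i = parent[i]
    return i

def swendsen_wang_sweep(spins, beta, q, rng):
    n, m = spins.shape
    parent = list(range(n * m))
    flat = spins.ravel()
    for i in range(n):
        for j in range(m):
            a = i * m + j
            for b in ((i + 1) * m + j if i + 1 < n else None,
                      i * m + j + 1 if j + 1 < m else None):
                if b is not None and flat[a] == flat[b] \
                        and rng.random() < 1 - np.exp(-beta):
                    ra, rb = find(parent, a), find(parent, b)
                    parent[ra] = rb          # join the two clusters
    labels = np.array([find(parent, i) for i in range(n * m)])
    new_state = {lab: rng.integers(q) for lab in set(labels)}
    return np.array([new_state[l] for l in labels]).reshape(n, m)

rng = np.random.default_rng(3)
spins = rng.integers(2, size=(32, 32))            # q = 2: the Ising case
spins = swendsen_wang_sweep(spins, beta=0.9, q=2, rng=rng)
```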

Journal ArticleDOI
TL;DR: In this paper, the probability density functions of observed open (closed) sojourn-times incorporating time interval omission are computed using a system of Volterra integral equations, whose solution governs the required density function.
Abstract: The dynamical aspects of single ion channel gating can be modelled by a semi-Markov process. There is aggregation of states, corresponding to the receptor channel being open or closed, and there is time interval omission, brief sojourns in either the open or closed classes of states not being detected. This paper is concerned with the computation of the probability density functions of observed open (closed) sojourn-times incorporating time interval omission. A system of Volterra integral equations is derived, whose solution governs the required density function. Numerical procedures, using iterative and multistep methods, are described for solving these equations. Examples are given, and in the special case of Markov models results are compared with those obtained by alternative methods. Probabilistic interpretations are given for the iterative methods, which also give lower bounds for the solutions.
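The numerical core is the solution of a second-kind Volterra equation f(t) = g(t) + int_0^t K(t,s) f(s) ds. Below is a sketch using the trapezoidal rule, on a toy kernel with known solution exp(t) rather than the paper's sojourn-time equations.

```python
# Trapezoidal (multistep) solver for a Volterra equation of the second kind.
import numpy as np

def solve_volterra(g, K, t_max, n):
    h = t_max / n
    t = np.linspace(0.0, t_max, n + 1)
    f = np.empty(n + 1)
    f[0] = g(t[0])
    for i in range(1, n + 1):
        # Trapezoidal weights: half at the endpoints, one in between.
        s = 0.5 * K(t[i], t[0]) * f[0]
        s += sum(K(t[i], t[j]) * f[j] for j in range(1, i))
        # Solve the implicit equation for f[i].
        f[i] = (g(t[i]) + h * s) / (1.0 - 0.5 * h * K(t[i], t[i]))
    return t, f

# Toy problem with known solution f(t) = exp(t):  g(t) = 1, K(t, s) = 1.
t, f = solve_volterra(lambda t: 1.0, lambda t, s: 1.0, t_max=1.0, n=100)
```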

Journal ArticleDOI
TL;DR: These methods maintain populations of individuals with a nonlinear chromosomal structure and use ‘genetic’ operators enhanced by problem-specific knowledge.
Abstract: The paper presents non-standard methods in evolutionary computation and discusses their applicability to various optimization problems. These methods maintain populations of individuals with a nonlinear chromosomal structure and use ‘genetic’ operators enhanced by problem-specific knowledge.

Journal ArticleDOI
TL;DR: In this article, computer software can monitor the types of contrasts a user examines, and select the smallest family of contrasts that is likely to be of interest, and calculate simultaneous confidence intervals for these families using a hybrid of the Bonferroni and Scheffé methods.
Abstract: Statisticians often employ simultaneous confidence intervals to reduce the likelihood of their drawing false conclusions when they must make a number of comparisons. To do this properly, it is necessary to consider the family of comparisons over which simultaneous confidence must be assured. Sometimes it is not clear what family of comparisons is appropriate. We describe how computer software can monitor the types of contrasts a user examines, and select the smallest family of contrasts that is likely to be of interest. We also describe how to calculate simultaneous confidence intervals for these families using a hybrid of the Bonferroni and Scheffé methods. Our method is especially suitable for problems with discrete and continuous predictors.
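The hybrid itself is a one-line decision: for a family of m contrasts in a model with p parameters, use whichever of the Bonferroni and Scheffé critical values is smaller, since each alone guarantees simultaneous coverage. A sketch with illustrative numbers:

```python
# Hybrid Bonferroni/Scheffe critical value for m simultaneous contrasts.
import numpy as np
from scipy import stats

def hybrid_critical_value(m, p, df, alpha=0.05):
    bonferroni = stats.t.ppf(1 - alpha / (2 * m), df)     # two-sided Bonferroni
    scheffe = np.sqrt(p * stats.f.ppf(1 - alpha, p, df))  # Scheffe bound
    return min(bonferroni, scheffe)

# Small families favour Bonferroni, large families favour Scheffe:
for m in (3, 10, 1000):
    print(m, hybrid_critical_value(m, p=4, df=30))
```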

Journal ArticleDOI
TL;DR: Presto is a software package that automatically generates FORTRAN code corresponding to approximation procedures for the solutions of stochastic differential systems; it is an INRIA product, free for academic institutions and universities that already have MAPLE and X-Windows licences.
Abstract: Presto is a software package that automatically generates FORTRAN code corresponding to approximation procedures for the solutions of stochastic differential systems.
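For concreteness, here is the kind of approximation scheme such a generator emits, written directly in Python rather than generated FORTRAN: the Euler-Maruyama discretization of dX = a(X) dt + b(X) dW, with an Ornstein-Uhlenbeck process as a toy choice.

```python
# Euler-Maruyama scheme for a scalar stochastic differential equation.
import numpy as np

def euler_maruyama(a, b, x0, t_max, n, rng):
    h = t_max / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dw = rng.normal(0.0, np.sqrt(h))              # Brownian increment
        x[k + 1] = x[k] + a(x[k]) * h + b(x[k]) * dw  # one Euler step
    return x

rng = np.random.default_rng(4)
path = euler_maruyama(a=lambda x: -0.5 * x, b=lambda x: 0.3,
                      x0=1.0, t_max=10.0, n=1000, rng=rng)
```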

Journal ArticleDOI
TL;DR: This paper studies the performance of a portable parallel unconstrained non-gradient optimization algorithm, when executed on various shared-memory multiprocessor systems, compared with its non-portable counterpart.
Abstract: It is well known that the availability of cost-effective and powerful parallel computers has enhanced the ability of the operations research community to solve laborious computational problems. But many researchers argue that the lack of portability of parallel algorithms is a major drawback to utilizing parallel computers. This paper studies the performance of a portable parallel unconstrained non-gradient optimization algorithm, when executed on various shared-memory multiprocessor systems, compared with its non-portable counterpart. Analysis of covariance is used to analyse how the algorithm's performance is affected by several factors of interest. The results yield further insight into parallel computing.

Journal ArticleDOI
TL;DR: In this paper, a conditional simulation technique is used to estimate probabilities associated with the distribution of the maximum of a real-valued process which can be written in the form of a moving average.
Abstract: This paper describes a conditional simulation technique which can be used to estimate probabilities associated with the distribution of the maximum of a real-valued process which can be written in the form of a moving average. The class of processes to which the technique applies includes non-stationary and spatial processes, and autoregressive processes. The technique is shown to achieve a considerable variance reduction compared with the obvious simulation-based estimator, particularly for estimating small upper-tail probabilities.
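For concreteness, the "obvious" baseline estimator that the conditional technique improves on can be sketched as follows; the MA(2) coefficients and the threshold are arbitrary illustrative choices, and the paper's conditional simulation replaces this crude estimator with a much lower-variance one.

```python
# Crude Monte Carlo estimate of P(max_t X_t > u) for a moving-average process.
import numpy as np

def crude_max_prob(theta, n, u, n_sims, rng):
    q = len(theta)
    hits = 0
    for _ in range(n_sims):
        eps = rng.normal(size=n + q)
        x = np.convolve(eps, theta, mode="valid")   # moving average of the noise
        hits += x.max() > u
    return hits / n_sims

rng = np.random.default_rng(5)
p_hat = crude_max_prob(theta=[1.0, 0.6, 0.3], n=200, u=3.5,
                       n_sims=10_000, rng=rng)
```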