
Showing papers on "Model selection published in 1995"


Proceedings Article
Ron Kohavi1
20 Aug 1995
TL;DR: The results indicate that for real-world datasets similar to the authors', the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
Abstract: We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment--over half a million runs of C4.5 and a Naive-Bayes algorithm--to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
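
A minimal sketch of the recommended procedure, ten-fold stratified cross-validation for choosing between two classifiers. The library (scikit-learn), dataset and candidate models below are illustrative stand-ins, not the original C4.5 / Naive-Bayes setup.

# Sketch: ten-fold stratified cross-validation for model selection.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "naive_bayes": GaussianNB(),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
# The model with the highest mean stratified-CV accuracy is selected.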

11,185 citations


Journal ArticleDOI
TL;DR: In this article, a Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented; it is straightforward to implement through the simple and accurate BIC approximation and can be carried out using the output from standard software.
Abstract: It is argued that P-values and the tests based upon them give unsatisfactory results, especially in large samples. It is shown that, in regression, when there are many candidate independent variables, standard variable selection procedures can give very misleading results. Also, by selecting a single model, they ignore model uncertainty and so underestimate the uncertainty about quantities of interest. The Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented. Implementing this is straightforward through the use of the simple and accurate BIC approximation, and it can be done using the output from standard software. Specific results are presented for most of the types of model commonly used in sociology. It is shown that this approach overcomes the difficulties with P-values and standard model selection procedures based on them. It also allows easy comparison of nonnested models, and permits the quantification of the evidence for a null hypothesis of interest, such as a convergence theory or a hypothesis about societal norms.
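
The BIC approximation the article relies on is easy to compute from standard regression output: BIC = -2 log(maximized likelihood) + k log(n). A hedged sketch in Python (simulated data, illustrative models) comparing two candidate regressions:

# Sketch: comparing two linear models with the BIC approximation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)              # x2 is irrelevant

def bic(y, X):
    fit = sm.OLS(y, sm.add_constant(X)).fit()
    k = fit.df_model + 1                             # parameters incl. intercept
    return -2 * fit.llf + k * np.log(len(y))

bic_small = bic(y, np.column_stack([x1]))
bic_large = bic(y, np.column_stack([x1, x2]))
print(bic_small, bic_large)
# The smaller BIC is preferred; exp(-0.5 * (bic_small - bic_large))
# approximates the Bayes factor in favour of the smaller model.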

6,100 citations


Journal ArticleDOI
TL;DR: The effects of model uncertainty, such as too narrow prediction intervals, and the non-trivial biases in parameter estimates which can follow data-based modelling are reviewed.
Abstract: This paper takes a broad, pragmatic view of statistical inference to include all aspects of model formulation. The estimation of model parameters traditionally assumes that a model has a prespecified known form and takes no account of possible uncertainty regarding the model structure. This implicitly assumes the existence of a 'true' model, which many would regard as a fiction. In practice model uncertainty is a fact of life and likely to be more serious than other sources of uncertainty which have received far more attention from statisticians. This is true whether the model is specified on subject-matter grounds or, as is increasingly the case, when a model is formulated, fitted and checked on the same data set in an iterative, interactive way. Modern computing power allows a large number of models to be considered and data-dependent specification searches have become the norm in many areas of statistics. The term data mining may be used in this context when the analyst goes to great lengths to obtain a good fit. This paper reviews the effects of model uncertainty, such as too narrow prediction intervals, and the non-trivial biases in parameter estimates which can follow data-based modelling. Ways of assessing and overcoming the effects of model uncertainty are discussed, including the use of simulation and resampling methods, a Bayesian model averaging approach and collecting additional data wherever possible. Perhaps the main aim of the paper is to ensure that statisticians are aware of the problems and start addressing the issues even if there is no simple, general theoretical fix.
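
One remedy the paper reviews, resampling to expose how unstable a data-based model choice can be, is easy to illustrate. The simulation below (data-generating process, candidate subsets and resample count are all illustrative) tallies how often best-subset selection by BIC picks each model across bootstrap resamples:

# Sketch: bootstrap resampling to expose model-selection uncertainty.
import numpy as np
import statsmodels.api as sm
from collections import Counter
from itertools import combinations

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 3))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)

def best_subset_by_bic(y, X):
    best = None
    for r in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), r):
            fit = sm.OLS(y, sm.add_constant(X[:, list(cols)])).fit()
            if best is None or fit.bic < best[0]:
                best = (fit.bic, cols)
    return best[1]

counts = Counter()
for _ in range(200):                      # bootstrap resamples
    idx = rng.integers(0, n, size=n)
    counts[best_subset_by_bic(y[idx], X[idx])] += 1
print(counts)  # selection frequencies typically spread over several models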

995 citations


Journal ArticleDOI
TL;DR: In this paper, a simple kernel procedure based on marginal integration that estimates the relevant univariate quantity in both additive and multiplicative nonparametric regression is defined, which is used as a preliminary diagnostic tool.
Abstract: SUMMARY We define a simple kernel procedure based on marginal integration that estimates the relevant univariate quantity in both additive and multiplicative nonparametric regression. Nonparametric regression is frequently used as a preliminary diagnostic tool. It is a convenient method of summarising the relationship between a dependent and a univariate independent variable. However, when the explanatory variables are multidimensional, these methods are less satisfactory. In particular, the rate of convergence of standard estimators is poorer, while simple plots are not available to aid model selection. There are a number of simplifying structures that have been used to avoid these problems. These include the regression tree structure of Gordon & Olshen (1980), the projection pursuit model of Friedman & Stuetzle (1981), semiparametric models such as considered
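
The marginal integration idea can be sketched in a few lines: estimate the full regression surface with a bivariate kernel smoother, then average it over the empirical distribution of the other covariate to recover one additive component up to a constant. Bandwidth, kernel and data below are illustrative choices, not those of the paper.

# Sketch: marginal integration for an additive model y = m1(x1) + m2(x2) + noise.
import numpy as np

rng = np.random.default_rng(2)
n = 400
x1, x2 = rng.uniform(-2, 2, n), rng.uniform(-2, 2, n)
y = np.sin(x1) + 0.5 * x2**2 + 0.2 * rng.normal(size=n)

def nw_surface(t1, t2, h=0.3):
    # Bivariate Nadaraya-Watson estimate of E[y | x1 = t1, x2 = t2].
    w = np.exp(-0.5 * (((x1 - t1) / h) ** 2 + ((x2 - t2) / h) ** 2))
    return np.sum(w * y) / np.sum(w)

def m1_marginal_integration(t1):
    # Average the fitted surface over the observed x2 values.
    return np.mean([nw_surface(t1, x2j) for x2j in x2])

grid = np.linspace(-1.5, 1.5, 7)
est = np.array([m1_marginal_integration(t) for t in grid])
print(est - est.mean())   # should roughly track sin(grid) minus its mean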

553 citations


Ron Kohavi1
01 Sep 1995
TL;DR: This doctoral dissertation concludes that repeated runs of five-fold cross-validation give a good tradeoff between bias and variance for the problem of model selection used in later chapters.
Abstract: In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious read-once decision graphs (OODGs). For accuracy estimation, we investigate cross-validation and the .632 bootstrap. We show examples where they fail and conduct a large-scale study comparing them. We conclude that repeated runs of five-fold cross-validation give a good tradeoff between bias and variance for the problem of model selection used in later chapters. We define the wrapper approach and use it for feature subset selection and parameter tuning. We relate definitions of feature relevancy to the set of optimal features, which is defined with respect to both a concept and an induction algorithm. The wrapper approach requires a search space, operators, a search engine, and an evaluation function. We investigate all of them in detail and introduce compound operators for feature subset selection. Finally, we abstract the search problem into search with probabilistic estimates. We introduce decision tables with a default majority rule (DTMs) to test the conjecture that feature subset selection is a very powerful bias. The accuracy of induced DTMs is surprisingly high, and we conclude that this bias is extremely important for many real-world datasets. We show that the resulting decision tables are very small and can be succinctly displayed. We study properties of oblivious read-once decision graphs (OODGs) and show that they do not suffer from some inherent limitations of decision trees. We describe a general framework for constructing OODGs bottom-up and specialize it using the wrapper approach. We show that the graphs produced use fewer features than C4.5, the state-of-the-art decision tree induction algorithm, and are usually easier for humans to comprehend.
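
A minimal sketch of the wrapper idea the dissertation develops: treat the induction algorithm as a black box and search over feature subsets, scoring each candidate subset by the algorithm's cross-validated accuracy. The greedy forward search, classifier, dataset and CV settings below are illustrative simplifications of the thesis's search engines and operators.

# Sketch: wrapper feature subset selection via greedy forward search.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_wine(return_X_y=True)
clf = GaussianNB()

def cv_score(features):
    return cross_val_score(clf, X[:, features], y, cv=5).mean()

selected, remaining = [], list(range(X.shape[1]))
best_score = 0.0
while remaining:
    score, f = max((cv_score(selected + [f]), f) for f in remaining)
    if score <= best_score:        # stop when no feature improves the estimate
        break
    best_score = score
    selected.append(f)
    remaining.remove(f)
print(selected, best_score)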

338 citations


Journal ArticleDOI
TL;DR: In this article, a predictive Bayesian viewpoint is advocated to avoid the specification of prior probabilities for the candidate models and the detailed interpretation of the parameters in each model, and using criteria derived from a certain predictive density and a prior specification that emphasizes the observables, they implement the proposed methodology for three common problems arising in normal linear models: variable subset selection, selection of a transformation of predictor variables and estimation of a parametric variance function.
Abstract: We consider the problem of selecting one model from a large class of plausible models. A predictive Bayesian viewpoint is advocated to avoid the specification of prior probabilities for the candidate models and the detailed interpretation of the parameters in each model. Using criteria derived from a certain predictive density and a prior specification that emphasizes the observables, we implement the proposed methodology for three common problems arising in normal linear models: variable subset selection, selection of a transformation of predictor variables and estimation of a parametric variance function. Interpretation of the relative magnitudes of the criterion values for various models is facilitated by a calibration of the criteria. Relationships between the proposed criteria and other well-known criteria are examined.

337 citations


Journal ArticleDOI
TL;DR: A small sample criterion (AICc) for the selection of extended quasi-likelihood models provides a more nearly unbiased estimator for the expected Kullback-Leibler information and often selects better models than AIC in small samples.
Abstract: We develop a small sample criterion (AICc) for the selection of extended quasi-likelihood models. In contrast to the Akaike information criterion (AIC), AICc provides a more nearly unbiased estimator for the expected Kullback-Leibler information. Consequently, it often selects better models than AIC in small samples. For the logistic regression model, Monte Carlo results show that AICc outperforms AIC, Pregibon's (1979, Data Analytic Methods for Generalized Linear Models. Ph.D. thesis, University of Toronto) Cp*, and the Cp selection criteria of Hosmer et al. (1989, Biometrics 45, 1265-1270). Two examples are presented.
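
The small-sample correction has a simple closed form, AICc = AIC + 2k(k+1)/(n - k - 1), where k is the number of estimated parameters and n the sample size. A hedged sketch for a logistic regression fit (simulated data; the paper's extended quasi-likelihood setting is more general):

# Sketch: AIC and its small-sample correction AICc for logistic regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 40                                     # deliberately small sample
X = sm.add_constant(rng.normal(size=(n, 3)))
p = 1.0 / (1.0 + np.exp(-(X[:, 1] - 0.5 * X[:, 2])))
y = rng.binomial(1, p)

fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
k = X.shape[1]                             # number of estimated parameters
aic = -2 * fit.llf + 2 * k
aicc = aic + 2 * k * (k + 1) / (n - k - 1)
print(aic, aicc)                           # AICc penalizes complexity more at small n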

310 citations



Journal ArticleDOI
TL;DR: A model-selection approach to the question of whether forward-interest rates are useful in predicting future spot rates indicates that the premium of the forward rate over the spot rate helps to predict the sign of future changes in the interest rate.
Abstract: We take a model-selection approach to the question of whether forward-interest rates are useful in predicting future spot rates, using a variety of out-of-sample forecast-based model-selection criteria—forecast mean squared error, forecast direction accuracy, and forecast-based trading-system profitability. We also examine the usefulness of a class of novel prediction models called artificial neural networks and investigate the issue of appropriate window sizes for rolling-window-based prediction methods. Results indicate that the premium of the forward rate over the spot rate helps to predict the sign of future changes in the interest rate. Furthermore, model selection based on an in-sample Schwarz information criterion (SIC) does not appear to be a reliable guide to out-of-sample performance in the case of short-term interest rates. Thus, the in-sample SIC apparently fails to offer a convenient shortcut to true out-of-sample performance measures.
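
The paper's contrast between in-sample SIC and out-of-sample, forecast-based criteria can be mimicked with a rolling-window exercise. Everything below (simulated series, window length, AR orders) is an illustrative stand-in for the interest-rate data and models of the paper.

# Sketch: in-sample SIC versus out-of-sample MSE and direction accuracy
# from rolling one-step-ahead forecasts.
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(4)
y = np.cumsum(rng.normal(size=400)) * 0.1 + rng.normal(size=400)   # toy series

window, orders = 200, [1, 4]
results = {p: {"err": [], "hit": []} for p in orders}
for t in range(window, len(y)):
    train = y[t - window:t]
    for p in orders:
        fcast = AutoReg(train, lags=p).fit().forecast(1)[0]
        results[p]["err"].append((y[t] - fcast) ** 2)
        results[p]["hit"].append(np.sign(fcast - train[-1]) == np.sign(y[t] - train[-1]))

for p in orders:
    sic = AutoReg(y[:window], lags=p).fit().bic    # in-sample SIC on the first window
    print(p, sic, np.mean(results[p]["err"]), np.mean(results[p]["hit"]))
# The in-sample SIC ranking need not agree with the out-of-sample rankings.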

274 citations


Journal ArticleDOI
TL;DR: The artificial neural network approach for the synthesis of reservoir inflow series differs from the traditional approaches in synthetic hydrology in that it belongs to a class of data-driven, rather than model-driven, approaches.
Abstract: The artificial neural network (ANN) approach described in this paper for the synthesis of reservoir inflow series differs from the traditional approaches in synthetic hydrology in the sense that it belongs to a class of data-driven approaches, as opposed to traditional model-driven approaches. Most of the time series modelling procedures fall within the framework of multivariate autoregressive moving average (ARMA) models. Formal statistical modelling procedures suggest a four-stage iterative process, namely, model selection, model order identification, parameter estimation and diagnostic checks. Although a number of statistical tools are already available to follow such a modelling process, it is not an easy task, especially if higher order vector ARMA models are used. This paper investigates the use of artificial neural networks in the field of synthetic inflow generation. The various steps involved in the development of a neural network and a multivariate autoregressive model for synthesis are pr...

271 citations


Journal ArticleDOI
TL;DR: It is argued that it is better to use model selection procedures rather than formal hypothesis testing when deciding on model specification, because testing favors the null hypothesis, typically relies on an arbitrary choice of significance level, and can leave researchers using the same data with different final models.

Journal ArticleDOI
TL;DR: This paper evaluates information theoretic approaches to selection of a parsimonious model and compares them to the use of likelihood ratio tests using four α levels, and finds that two information theoretic criteria have a balance between underfitting and overfitting when compared to models where the average minimum RSS was known.
Abstract: Analysis of capture-recapture data is critically dependent upon selection of a proper model for inference. Model selection is particularly important in the analysis of multiple, interrelated data sets. This paper evaluates information theoretic approaches to selection of a parsimonious model and compares them to the use of likelihood ratio tests using four α levels. The purpose of the evaluation is to compare model selection strategies based on the quality of the inference, rather than on the degree to which differing selection strategies select the "true model." A measure of squared bias and variance (termed RSS) is used as a basis for comparing different data-based selection strategies, assuming that a minimum RSS value is a reasonable target. In general, the information theoretic approaches consistently selected models with a smaller RSS than did the likelihood ratio testing approach. Two information theoretic criteria have a balance between underfitting and overfitting when compared to models where the average minimum RSS was known. Other findings are presented along with a discussion of the concept of a "true model" and dimension consistency in model selection.
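
The contrast between information-theoretic selection and likelihood ratio testing at a fixed α can be seen in a few lines. The nested Poisson regressions below are an illustrative stand-in for the paper's capture-recapture models.

# Sketch: AIC-based selection versus a likelihood ratio test between nested models.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 150
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = rng.poisson(np.exp(0.3 + 0.4 * x1 + 0.05 * x2))   # x2 has a weak effect

small = sm.GLM(y, sm.add_constant(np.column_stack([x1])),
               family=sm.families.Poisson()).fit()
large = sm.GLM(y, sm.add_constant(np.column_stack([x1, x2])),
               family=sm.families.Poisson()).fit()

lr = 2 * (large.llf - small.llf)                       # LR statistic, 1 df
print("LRT keeps the extra term:", stats.chi2.sf(lr, df=1) < 0.05)
print("AIC prefers the larger model:", large.aic < small.aic)
# With one extra parameter, AIC keeps the term whenever the LR statistic
# exceeds 2, a different implicit cutoff than any fixed alpha level.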

Proceedings ArticleDOI
05 Jul 1995
TL;DR: A detailed comparison of three well-known model selection methods for finding a balance between the complexity of the hypothesis chosen and its observed error on a random training sample of limited size, when the goal is that of minimizing the resulting generalization error.
Abstract: We investigate the problem of model selection in the setting of supervised learning of boolean functions from independent random examples. More precisely, we compare methods for finding a balance between the complexity of the hypothesis chosen and its observed error on a random training sample of limited size, when the goal is that of minimizing the resulting generalization error. We undertake a detailed comparison of three well-known model selection methods: a variation of Vapnik's Guaranteed Risk Minimization (GRM), an instance of Rissanen's Minimum Description Length Principle (MDL), and (hold-out) cross validation (CV). We introduce a general class of model selection methods (called penalty-based methods) that includes both GRM and MDL, and provide general methods for analyzing such rules. We provide both controlled experimental evidence and formal theorems to support the following conclusions. (1) Even on simple model selection problems, the behavior of the methods examined can be both complex and incomparable; furthermore, no amount of "tuning" of the rules investigated (such as introducing constant multipliers on the complexity penalty terms, or a distribution-specific "effective dimension") can eliminate this incomparability. (2) It is possible to give rather general bounds on the generalization error, as a function of sample size, for penalty-based methods; the quality of such bounds depends in a precise way on the extent to which the method considered automatically limits the complexity of the hypothesis selected. (3) For any model selection problem, the additional error of cross validation compared to any other method can be bounded above by the sum of two terms. The first term is large only if the learning curve of the underlying function classes experiences a "phase transition" between (1 - γ)m and m examples (where γ is the fraction saved for testing in CV); the second and competing term can be made arbitrarily small by increasing γ. (4) The class of penalty-based methods is fundamentally handicapped in the sense that there exist two types of model selection problems for which every penalty-based method must incur large generalization error on at least one, while CV enjoys small generalization error on both.
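
A generic penalty-based rule of the kind analysed here picks the complexity d minimizing training error plus a complexity penalty, while hold-out CV picks the d with the lowest validation error. In the sketch below the hypothesis class, the noise level and the penalty sqrt(d log m / m) are illustrative stand-ins; the exact GRM and MDL penalties in the paper differ.

# Sketch: a penalty-based selection rule versus hold-out cross validation.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(6)
def sample(m):
    X = rng.uniform(-3, 3, size=(m, 1))
    y = ((np.sin(2 * X[:, 0]) > 0) ^ (rng.random(m) < 0.2)).astype(int)  # 20% label noise
    return X, y

m = 200
X, y = sample(m)
X_test, y_test = sample(5000)
gamma = 0.3                                  # fraction held out for CV
split = int(m * (1 - gamma))

def fit_and_err(d, X, y):
    h = DecisionTreeClassifier(max_leaf_nodes=d, random_state=0).fit(X, y)
    return h, 1 - h.score(X, y)

best_pen, best_cv = None, None
for d in range(2, 40):                       # d = number of leaves (complexity)
    _, err = fit_and_err(d, X, y)
    pen_score = err + np.sqrt(d * np.log(m) / m)
    h_cv, _ = fit_and_err(d, X[:split], y[:split])
    cv_score = 1 - h_cv.score(X[split:], y[split:])
    if best_pen is None or pen_score < best_pen[0]:
        best_pen = (pen_score, d)
    if best_cv is None or cv_score < best_cv[0]:
        best_cv = (cv_score, d)

h_pen, _ = fit_and_err(best_pen[1], X, y)
h_cv, _ = fit_and_err(best_cv[1], X, y)
print("penalty-based d:", best_pen[1], "test error:", 1 - h_pen.score(X_test, y_test))
print("hold-out CV d:  ", best_cv[1], "test error:", 1 - h_cv.score(X_test, y_test))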

Journal ArticleDOI
TL;DR: A new paradigm for the segmentation of range images into piecewise continuous surfaces is presented through an effective combination of simple component algorithms, which stands in contrast to methods which attempt to solve the problem in a single processing step using sophisticated means.
Abstract: Segmentation of range images has long been considered in computer vision as an important but extremely difficult problem. In this paper we present a new paradigm for the segmentation of range images into piecewise continuous surfaces. Data aggregation is performed via model recovery in terms of variable-order bi-variate polynomials using iterative regression. Model recovery is initiated independently in regularly placed seed regions in the image. All the recovered models are potential candidates for the final description of the data. Selection of the models is defined as a quadratic Boolean problem, and the solution is sought by the WTA (winner-takes-all) technique, which turns out to be a good compromise between the speed of computation and the accuracy of the solution. The overall efficiency of the method is achieved by combining model recovery and model selection in an iterative way. Partial recovery of the models is followed by the selection (optimization) procedure and only the “best” models are allowed to develop further. The major novelty of the approach lies in an effective combination of simple component algorithms, which stands in contrast to methods which attempt to solve the problem in a single processing step using sophisticated means. We present the results on several real range images.

Journal ArticleDOI
TL;DR: The numerous methods most often used to determine the number of relevant components in principal component analysis are presented and it is shown why unfortunately most of them fail.

Journal ArticleDOI
Paul Kabaila1
TL;DR: In this paper, the authors consider the effect of model selection on prediction regions and show that the use of asymptotic results for the construction of prediction regions requires the same sort of care as use of such results for constructing confidence regions for the parameters of interest, and that a great deal of care must be exercised in any attempt at such an application.
Abstract: Potscher (1991, Econometric Theory 7, 163–181) has recently considered the question of how the use of a model selection procedure affects the asymptotic distribution of parameter estimators and related statistics. An important potential application of such results is to the generation of confidence regions for the parameters of interest. It is demonstrated that a great deal of care must be exercised in any attempt at such an application. We also consider the effect of model selection on prediction regions. It is demonstrated that the use of asymptotic results for the construction of prediction regions requires the same sort of care as the use of such results for the construction of confidence regions.

Journal ArticleDOI
TL;DR: Raftery's paper as discussed by the authors addresses two important problems in the statistical analysis of social science data: (1) choosing an appropriate model when so much data are available that standard P-values reject all parsimonious models; and (2) making estimates and predictions when there are not enough data available to fit the desired model using standard techniques.
Abstract: Raftery's paper addresses two important problems in the statistical analysis of social science data: (1) choosing an appropriate model when so much data are available that standard P-values reject all parsimonious models; and (2) making estimates and predictions when there are not enough data available to fit the desired model using standard techniques. For both problems, we agree with Raftery that classical frequentist methods fail and that Raftery's suggested methods based on BIC can point in better directions. Nevertheless, we disagree with his solutions because, in principle, they are still directed off-target and only by serendipity manage to hit the target in special circumstances. Our primary criticisms of Raftery's proposals are that (1) he promises the impossible: the selection of a model that is adequate for specific purposes without consideration of those purposes; and (2) he uses the same limited tool for model averaging as for model selection, thereby depriving himself of the benefits of the broad range of available Bayesian procedures. Despite our criticisms, we applaud Raftery's desire to improve practice by providing methods and computer programs for all to use and applying these methods to real problems. We believe that his paper makes a positive contribution to social science, by focusing on

Journal ArticleDOI
TL;DR: The authors use the Bayesian cross-validated likelihood (BCVL) approach to compare non-nested models involving the same variables, and generalize it to the case where some variables may be unique to each model.

Journal ArticleDOI
TL;DR: It is shown that trees are useful not only in summarizing the prognostic information contained in a set of covariates (prognostic classification), but also in detecting and displaying treatment-covariates interactions (subgroup analysis).

Journal ArticleDOI
TL;DR: It is suggested that general model comparison, model selection, and model probability estimation be performed using the Schwarz criterion, which can be implemented given the model log likelihoods using only a hand calculator.
Abstract: We investigate the performance of empirical criteria for comparing and selecting quantitative models from among a candidate set. A simulation based on empirically observed parameter values is used to determine which criterion is the most accurate at identifying the correct model specification. The simulation is composed of both nested and nonnested linear regression models. We then derive posterior probability estimates of the superiority of the alternative models from each of the criteria and evaluate the relative accuracy, bias, and information content of these probabilities. To investigate whether additional accuracy can be derived from combining criteria, a method for obtaining a joint prediction from combinations of the criteria is proposed and the incremental improvement in selection accuracy considered. Based on the simulation, we conclude that most leading criteria perform well in selecting the best model, and several criteria also produce accurate probabilities of model superiority. Computationally intensive criteria failed to perform better than criteria which were computationally simpler. Also, the use of several criteria in combination failed to appreciably outperform the use of a single criterion. The Schwarz criterion performed best overall in terms of selection accuracy, accuracy of posterior probabilities, and ease of use. Thus, we suggest that general model comparison, model selection, and model probability estimation be performed using the Schwarz criterion, which can be implemented given the model log likelihoods using only a hand calculator.
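
The "hand calculator" computation the study recommends follows directly from Schwarz criterion differences: subtract the smallest value, exponentiate minus half the differences, and renormalize to get approximate posterior model probabilities. The log likelihoods, parameter counts and sample size below are placeholders.

# Sketch: approximate posterior model probabilities from the Schwarz criterion.
import numpy as np

n = 120                                        # observations
loglik = np.array([-310.2, -305.8, -305.1])    # maximized log likelihoods
k = np.array([2, 3, 5])                        # parameters per model

sic = -2 * loglik + k * np.log(n)              # Schwarz criterion for each model
delta = sic - sic.min()
post = np.exp(-0.5 * delta)
post /= post.sum()                             # approx. P(model | data), equal priors
print(sic, post)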

Journal ArticleDOI
TL;DR: Bayesian methods for the Jelinski and Moranda and the Littlewood and Verrall models in software reliability are studied and model selection based on the mean squared prediction error and the prequential likelihood of the conditional predictive ordinates is developed.
Abstract: Bayesian methods for the Jelinski and Moranda and the Littlewood and Verrall models in software reliability are studied. A Gibbs sampling approach is employed to compute the Bayes estimates. In addition, prediction of future failure times and future reliabilities is examined. Model selection based on the mean squared prediction error and the prequential likelihood of the conditional predictive ordinates is developed.

Posted Content
TL;DR: The purpose of this Guide is to provide an expositional review of the underlying methodology and to walk the user through an application, hoping that the Guide will be essentially self contained and that very little reference to the cited literature will be required to use the program and the SNP method.
Abstract: SNP is a method of nonparametric time series analysis. The method employs a polynomial series expansion to approximate the conditional density of a multivariate process. An appealing feature of the expansion is that it directly nests familiar models such as a pure VAR, a pure ARCH, a nonlinear process with homogeneous innovations, etc. An SNP model is fitted using conventional maximum likelihood together with a model selection strategy that determines the appropriate degree of the polynomial. A Fortran program implementing the SNP method is available via anonymous ftp at ftp.econ.duke.edu in directory ~ftp/home/arg/snp or from the Carnegie-Mellon University e-mail server by sending the one-line e-mail message "send snp from general" to statlib@lib.stat.cmu.edu. The code is provided at no charge for research purposes without warranty. The program has switches that allow direct computation of functionals of the fitted density such as conditional means, conditional variances, and points for plotting the density. Other switches generate simulated sample paths, which can be used to compute nonlinear functionals of the density by Monte Carlo integration, notably the nonlinear analogs of the impulse-response mean and volatility profiles used in traditional VAR and ARCH analysis. Simulated sample paths can also be used to set bootstrapped sup-norm confidence bands on these and other functionals. The purpose of this Guide is to provide an expositional review of the underlying methodology and to walk the user through an application. Our hope is that the Guide will be essentially self contained and that very little reference to the cited literature will be required to use the program and the SNP method.

Journal ArticleDOI
TL;DR: The theoretical background for a program for establishing expert systems on the basis of observations and expert knowledge is presented, along with various methods for automatic model selection.

Journal ArticleDOI
TL;DR: It turns out that there are a number of prior choices in the problem formulation, which are crucial for the estimators' behavior, and the role of the prior choices is clarified.

Journal ArticleDOI
TL;DR: A new model, called the acceleration model, is proposed in the framework of the heterogeneous case of the graded response model, based on processing functions defined for a finite or enumerable number of steps; it is expected to be useful in cognitive assessment and more traditional areas of application of latent trait models.
Abstract: A new model, called the acceleration model, is proposed in the framework of the heterogeneous case of the graded response model, based on processing functions defined for a finite or enumerable number of steps. The model is expected to be useful in cognitive assessment, as well as in more traditional areas of application of latent trait models. Criteria for evaluating models are proposed, and soundness and robustness of the acceleration model are discussed. Graded response models based on individual choice behavior are also discussed, and criticisms on model selection in terms of fitnesses of models to the data are also given.

Journal ArticleDOI
TL;DR: In this paper, the authors used nine closed populations of gray-tailed voles (Microtus canicaudus ) in 0.2-ha enclosures to empirically select the best fit among 11 probabilistic estimators of population size.
Abstract: We used nine closed populations of gray-tailed voles (Microtus canicaudus) in 0.2-ha enclosures to empirically select the best fit among 11 probabilistic estimators of population size. We also examined the influence of population size and number of trap occasions on performance of estimators. Population size was known in all instances, providing a basis for comparison of performance of estimators. Three replicates of three population sizes (30, 60, and 90 voles/enclosure) were used in this experiment. The most accurate and precise estimators, selected on the basis of four consecutive trapping occasions, were Pollock and Otto's Mbh, Chao's Mh, and the jackknife estimators. Examination of the hypothesis tests included in the Model Selection Procedure of the program CAPTURE identified individual heterogeneity as the prevailing source of variation in capture probabilities and suggested that the appropriate estimator would be the jackknife. Reliability of the heterogeneity estimators (the jackknife and Chao's estimators for Mh and Mth) was positively related to population size, whereas reliability of almost all of the other estimators varied inversely with population size. The jackknife estimator was unique in the stability and quality of its performance in the first few trap occasions. Using the jackknife estimator and three trap occasions offered the best tradeoff between reliability and trapping effort.
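
For reference, the first-order jackknife estimator that performed well here has a simple closed form in the Burnham-Overton formulation: N_hat = S + f1(k-1)/k, where S is the number of distinct animals captured, f1 the number captured on exactly one occasion, and k the number of trapping occasions. The capture histories below are simulated for illustration only; they are not the enclosure data of the study.

# Sketch: first-order jackknife estimator for model Mh (individual heterogeneity).
import numpy as np

rng = np.random.default_rng(7)
k, N_true = 4, 60
p_i = rng.beta(2, 5, size=N_true)                     # heterogeneous capture probabilities
histories = rng.random((N_true, k)) < p_i[:, None]    # True = captured on that occasion

caught = histories.any(axis=1)
captures = histories[caught].sum(axis=1)              # captures per observed animal
S, f1 = caught.sum(), (captures == 1).sum()
N_jack1 = S + f1 * (k - 1) / k
print(S, f1, round(N_jack1, 1), "(true N =", N_true, ")")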

Posted Content
TL;DR: In this article, a model selection approach to real-time macroeconomic forecasting using linear and nonlinear models is presented, and true ex-ante forecasting is constructed by using unrevised as opposed to fully revised data.
Abstract: We take a model selection approach to real-time macroeconomic forecasting using linear and nonlinear models. True ex-ante forecasts are constructed by using unrevised, as opposed to fully revised, data. Model selection as well as model performance measures are considered.

Dissertation
01 Jan 1995
TL;DR: Wavelet neural networks are introduced as a new class of elliptic basis function neural networks and wavelet networks, and are applied to the numerical modeling and classification of EEGs, finding them to be ideally suited for problems of EEG analysis.
Abstract: Wavelet neural networks (WNNs) are introduced as a new class of elliptic basis function neural networks and wavelet networks, and are applied to the numerical modeling and classification of EEGs. The implementation of the networks is achieved in two possibly cyclical stages of structure and parameter identification. For structure identification, two methods are developed: one generic, based on data clusterings, and one specific, using wavelet analysis. For parameter identification, two methods are also implemented: the Levenberg-Marquardt algorithm and a genetic algorithm of ranking type. The problem of model generalization is considered from both a cross-validation and a regularization point of view. For the latter, a corrected average squared error (CASE) is derived as a new model selection criterion that does not rely on assumptions about error distributions or modeling paradigms. For EEG modeling, the nonlinear dynamics framework is employed in the reconstruction of state-spaces via the embedding scheme. Preprocessing for the resulting state-vector is introduced in terms of decorrelation and compression. The naive application of chaos theory to EEGs is shown to be useful in feature extraction, but not in corroborating theories about the nature of EEGs. For the latter, the concept of modeling resolution is introduced. It is shown that the chaos-in-the-brain question becomes meaningful only as a function of modeling resolution. For EEG classification, a general WNN classification system is implemented as a cascade of synergistic feature selection, WNN nonlinear discrimination, and decision logic. A feature library is described including raw and model-based features, ranging from traditional measures to chaotic indicators. Training for maximum-likelihood classification is shown to be inductively feasible via a decoder-type WNN classifier adjusted with nonanalytic methods. WNNs were found to be ideally suited for problems of EEG analysis due to the long-duration/low-frequency and short-duration/high-frequency structure of EEG signals.

Journal ArticleDOI
TL;DR: A comparative study of several recently proposed one-dimensional sedimentation models has been made by fitting these models to steady-state and dynamic concentration profiles obtained in a down-scaled secondary decanter, and it could be concluded that the model of Takacs et al.

Journal ArticleDOI
TL;DR: The results indicate that the neural network approach can assist the practitioner in the selection of the appropriate forecast model.