
Showing papers on "Model selection published in 1989"


Journal ArticleDOI
TL;DR: In this article, the authors propose simple, directional likelihood-ratio tests for discriminating and choosing between two competing models, whether the models are nonnested, overlapping, or nested and whether both, one, or neither is misspecified.
Abstract: In this paper, we propose a classical approach to model selection. Using the Kullback-Leibler Information measure, we propose simple and directional likelihood-ratio tests for discriminating and choosing between two competing models whether the models are nonnested, overlapping or nested and whether both, one, or neither is misspecified. As a prerequisite, we fully characterize the asymptotic distribution of the likelihood ratio statistic under the most general conditions.
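The core of the test is easy to sketch: compute the pointwise log-likelihood differences between the two fitted models and standardize their sum. A minimal pure-Python illustration with simulated data and two non-nested families (normal vs. Laplace); the data, fits, and 1.96 threshold are illustrative assumptions, not the paper's exact procedure:

```python
import math, random, statistics

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(500)]

# MLE fits for two non-nested location-scale families.
mu = statistics.fmean(data)
sigma = statistics.pstdev(data)                  # normal MLE scale
med = statistics.median(data)
b = statistics.fmean(abs(x - med) for x in data) # Laplace MLE scale

def ll_normal(x):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def ll_laplace(x):
    return -math.log(2 * b) - abs(x - med) / b

# Pointwise log-likelihood differences; the statistic standardizes their sum.
d = [ll_normal(x) - ll_laplace(x) for x in data]
n = len(d)
z = sum(d) / (math.sqrt(n) * statistics.pstdev(d))

# |z| > 1.96 rejects "the models are equally close to the truth" at the 5%
# level; the sign is directional (positive favors the normal model here).
```

Because the statistic is directional, a single fit of each model suffices; no nesting structure is required.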

5,661 citations


Journal ArticleDOI
TL;DR: An overview of problems in multivariate modeling of epidemiologic data is provided and some proposed solutions are examined; among the conclusions, model and variable forms should be selected based on regression diagnostic procedures in addition to goodness-of-fit tests.
Abstract: This paper provides an overview of problems in multivariate modeling of epidemiologic data, and examines some proposed solutions. Special attention is given to the task of model selection, which involves selection of the model form, selection of the variables to enter the model, and selection of the form of these variables in the model. Several conclusions are drawn, among them: a) model and variable forms should be selected based on regression diagnostic procedures, in addition to goodness-of-fit tests; b) variable-selection algorithms in current packaged programs, such as conventional stepwise regression, can easily lead to invalid estimates and tests of effect; and c) variable selection is better approached by direct estimation of the degree of confounding produced by each variable than by significance-testing algorithms. As a general rule, before using a model to estimate effects, one should evaluate the assumptions implied by the model against both the data and prior information.
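Conclusion (c), assessing confounding by the change in the effect estimate rather than by significance tests, can be sketched with a made-up stratified 2x2 table; the counts and the 10% cutoff below are illustrative assumptions:

```python
# Stratified 2x2 tables, one per level of the candidate confounder:
# (exposed cases, exposed controls, unexposed cases, unexposed controls).
strata = [(90, 10, 60, 40),   # covariate present
          (40, 60, 10, 90)]   # covariate absent

# Crude odds ratio: collapse the strata and ignore the covariate.
a, b, c, d = (sum(t[i] for t in strata) for i in range(4))
or_crude = (a * d) / (b * c)

# Mantel-Haenszel covariate-adjusted odds ratio.
num = sum(t[0] * t[3] / sum(t) for t in strata)
den = sum(t[1] * t[2] / sum(t) for t in strata)
or_mh = num / den

# Change-in-estimate rule: retain the covariate when adjusting for it
# moves the odds ratio by more than ~10%, regardless of p-values.
change = abs(or_crude - or_mh) / or_mh
keep_covariate = change > 0.10
```

In this toy table the crude odds ratio is about 3.4 while both strata show an odds ratio of 6, so the covariate is retained as a confounder even though no significance test was performed.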

2,117 citations


Journal ArticleDOI
TL;DR: The purpose of this note is to illustrate that for one of the more frequently used nonnormal regression models, logistic regression, one may perform the Lawless-Singhal analysis with any best subsets linear regression program that allows for case weights.
Abstract: Selection of covariates is an important step in any regression modeling situation. An effective strategy when the model is a linear regression with normal errors is to use a software package that implements a best subsets selection algorithm. A number of years ago Lawless and Singhal (1978) proposed a method for efficiently screening nonnormal regression models, thus providing the basis for best subsets nonlinear regression. Recently Lawless and Singhal (1987a, 1987b) have made available a software package that implements their method. The purpose of this note is to illustrate that for one of the more frequently used nonnormal regression models, logistic regression, we may perform the Lawless-Singhal analysis with any best subsets linear regression program that allows for case weights. The methods presented may also be obtained from the general method for model selection proposed by Gilks (1986). We also discuss the use and interpretation of Mallows's measure of predictive squared error as a statistic for comparing models with different subsets of variables.
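Mallows's measure for comparing subsets can be sketched in pure Python by enumerating all subsets of a tiny simulated regression problem. The data, and the use of plain OLS rather than the paper's weighted logistic setup, are illustrative assumptions:

```python
import itertools, random

random.seed(1)
n = 60
# Hypothetical data: y depends on x0 and x1; x2 is pure noise.
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [2.0 * r[0] - 1.0 * r[1] + random.gauss(0, 1) for r in X]

def ols_rss(cols):
    """Residual sum of squares for OLS on the given predictor columns
    plus an intercept, solved via the normal equations."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        b[k], b[piv] = b[piv], b[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            for j in range(k, p):
                A[r][j] -= f * A[k][j]
            b[r] -= f * b[k]
    beta = [0.0] * p
    for k in reversed(range(p)):
        beta[k] = (b[k] - sum(A[k][j] * beta[j] for j in range(k + 1, p))) / A[k][k]
    return sum((yi - sum(be * zi for be, zi in zip(beta, z))) ** 2
               for z, yi in zip(Z, y))

# Mallows' Cp = RSS_p / s2 - n + 2p, with s2 from the full model.
full = (0, 1, 2)
s2 = ols_rss(full) / (n - len(full) - 1)
cp = {cols: ols_rss(cols) / s2 - n + 2 * (len(cols) + 1)
      for k in range(1, 4) for cols in itertools.combinations(range(3), k)}
best = min(cp, key=cp.get)
```

Subsets missing a real predictor incur a large Cp penalty through their inflated RSS, so the selected subset should contain x0 and x1.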

577 citations


Journal ArticleDOI
TL;DR: A bootstrap investigation of the stability of a Cox proportional hazards regression model, from a clinical trial of azathioprine versus placebo in patients with primary biliary cirrhosis, shows graphically that bootstrap confidence intervals for the estimated survival probability are markedly wider than those obtained from the original model.
Abstract: We describe a bootstrap investigation of the stability of a Cox proportional hazards regression model resulting from the analysis of a clinical trial of azathioprine versus placebo in patients with primary biliary cirrhosis. We have considered stability to refer both to the choice of variables included in the model and, more importantly, to the predictive ability of the model. In stepwise Cox regression analyses of 100 bootstrap samples using 17 candidate variables, the most frequently selected variables were those selected in the original analysis, and no other important variable was identified. Thus there was no reason to doubt the model obtained in the original analysis. For each patient in the trial, bootstrap confidence intervals were constructed for the estimated probability of surviving two years. It is shown graphically that these intervals are markedly wider than those obtained from the original model.
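The selection-frequency idea carries over to simpler settings. A hedged pure-Python sketch on a linear toy problem, with a crude correlation-threshold rule standing in for stepwise Cox regression (all data and thresholds are made up):

```python
import math, random

random.seed(2)
n = 80
# Toy data: the outcome depends on x0 only; x1 and x2 are noise candidates.
rows = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [1.5 * r[0] + random.gauss(0, 1) for r in rows]

def selected(idx):
    """Variables whose sample correlation with y on the resample exceeds
    a rough 5%-level threshold, |r| > 1.96 / sqrt(n)."""
    keep, ys = [], [y[i] for i in idx]
    my = sum(ys) / len(ys)
    sy = math.sqrt(sum((v - my) ** 2 for v in ys))
    for j in range(3):
        xs = [rows[i][j] for i in idx]
        mx = sum(xs) / len(xs)
        sx = math.sqrt(sum((v - mx) ** 2 for v in xs))
        r = sum((u - mx) * (v - my) for u, v in zip(xs, ys)) / (sx * sy)
        if abs(r) > 1.96 / math.sqrt(len(idx)):
            keep.append(j)
    return keep

# Selection frequencies over 100 bootstrap resamples: a variable chosen
# in nearly every resample is a stable choice; one chosen rarely is not.
freq = [0, 0, 0]
for _ in range(100):
    idx = [random.randrange(n) for _ in range(n)]
    for j in selected(idx):
        freq[j] += 1
```

As in the paper, agreement between the original selection and the bootstrap frequencies is evidence that the selected model is not an artifact of one sample.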

307 citations


Journal ArticleDOI
TL;DR: In this paper, a decision rule is presented for the choice of a model that is strongly consistent for the true model as n → ∞. The result holds under mild conditions; in particular, normality of the distribution of the components of En is not assumed.
Abstract: We consider the multiple regression model Yn = Xnβ + En, where Yn and En are n-vector random variables, Xn is an n × m matrix, and β is an m-vector of unknown regression parameters. Each component of β may be zero or nonzero, which gives rise to 2^m possible models for multiple regression. We provide a decision rule for the choice of a model which is strongly consistent for the true model as n → ∞. The result is proved under certain mild conditions, for instance without assuming normality of the distribution of the components of En.

174 citations


Book ChapterDOI
03 Jan 1989
TL;DR: It is shown that TIC is asymptotically equivalent to cross-validation in a general context, whereas AIC is asymptotically equivalent only for independent identically distributed observations, and a useful criterion RIC (Regularization Information Criterion), which extends both TIC and AIC, is derived to select both the model and the weight of penalty.
Abstract: Various aspects of statistical model selection are discussed from the viewpoint of a statistician. Our concern here is with selection procedures based on the Kullback-Leibler information number. A derivation of AIC (Akaike's Information Criterion) is given, from which a natural extension of AIC, called TIC (Takeuchi's Information Criterion), follows. It is shown that TIC is asymptotically equivalent to cross-validation in a general context, although AIC is asymptotically equivalent only for the case of independent identically distributed observations. Next, the maximum penalized likelihood estimate is considered in place of the maximum likelihood estimate for estimating the parameters after a model is selected. The weight of the penalty must then also be selected. We show that, starting from the same Kullback-Leibler information number, a useful criterion RIC (Regularization Information Criterion) is derived to select both the model and the weight of penalty. This criterion is in fact an extension of TIC as well as of AIC. A comparison of various criteria, including consistency and efficiency, is summarized in Section 5. Applications of such criteria to time series models are given in the last section.
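AIC itself is simple to apply once maximized likelihoods are available. A minimal sketch comparing two nested Gaussian models on simulated data (the data and the candidate models are illustrative assumptions, not from the chapter):

```python
import math, random, statistics

random.seed(3)
x = [random.gauss(0.5, 1.0) for _ in range(200)]   # true mean is nonzero
n = len(x)

def loglik(mu, var):
    return sum(-0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
               for v in x)

# Model 1: mean fixed at 0, variance free -> 1 parameter.
var1 = statistics.fmean(v * v for v in x)
aic1 = -2 * loglik(0.0, var1) + 2 * 1

# Model 2: mean and variance both free -> 2 parameters.
mu2 = statistics.fmean(x)
var2 = statistics.pvariance(x, mu2)
aic2 = -2 * loglik(mu2, var2) + 2 * 2

# AIC estimates relative Kullback-Leibler divergence to the truth; the
# smaller value wins (here the free-mean model should win by a wide margin).
chosen = "free mean" if aic2 < aic1 else "zero mean"
```

TIC replaces the parameter count 2k with a trace term that remains valid under misspecification, which is where its equivalence with cross-validation enters.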

135 citations


Journal ArticleDOI
TL;DR: In this article, the authors give sufficient conditions for strong consistency of estimators for the order of general nonstationary autoregressive models based on the minimization of an information criterion a la Akaike's (1969) AIC.
Abstract: We give sufficient conditions for strong consistency of estimators for the order of general nonstationary autoregressive models based on the minimization of an information criterion a la Akaike's (1969) AIC. The case of a time-dependent error variance is also covered by the analysis. Furthermore, the more general case of regressor selection in stochastic regression models is treated.
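Order selection by minimizing an information criterion can be sketched with the Levinson-Durbin recursion, which yields the prediction-error variance at every candidate order in one pass over the sample autocovariances. A pure-Python illustration on a simulated AR(2) series, using a BIC/MDL-type penalty (a strongly consistent choice, in the spirit of the paper; the simulated process is an assumption):

```python
import math, random

random.seed(4)
n = 500
# Simulate an AR(2) process y_t = 0.6 y_{t-1} - 0.3 y_{t-2} + e_t.
y = [0.0, 0.0]
for _ in range(n + 2):
    y.append(0.6 * y[-1] - 0.3 * y[-2] + random.gauss(0, 1))
y = y[-n:]

# Sample autocovariances (the biased /n version keeps the sequence psd).
m = sum(y) / n
c = [sum((y[t] - m) * (y[t + k] - m) for t in range(n - k)) / n
     for k in range(6)]

# Levinson-Durbin: prediction-error variance sigma2[p] for each order p.
sigma2 = [c[0]]
a = []
for k in range(1, 6):
    refl = (c[k] - sum(a[j] * c[k - 1 - j] for j in range(k - 1))) / sigma2[-1]
    a = [a[j] - refl * a[k - 2 - j] for j in range(k - 1)] + [refl]
    sigma2.append(sigma2[-1] * (1 - refl ** 2))

# Strongly consistent selection: minimize n*log(sigma2_p) + p*log(n)
# (a BIC/MDL-type penalty; AIC's 2p penalty is not consistent).
crit = [n * math.log(sigma2[p]) + p * math.log(n) for p in range(6)]
order = min(range(6), key=lambda p: crit[p])
```

The log(n) penalty grows with the sample size, which is what rules out asymptotic overfitting and delivers the strong consistency the paper studies.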

108 citations


Journal ArticleDOI
TL;DR: A concise review of the theory of adaptive modeling, identification, and control of dynamic structural systems based on discrete‐time recordings is presented and guidelines for model selection and model validation and the computational aspects of the method are discussed.
Abstract: A concise review of the theory of adaptive modeling, identification, and control of dynamic structural systems based on discrete‐time recordings is presented. Adaptive methods have four major advantages over the classical methods: (1) Removal of the noise from the signal is done over the whole frequency band; (2) time‐varying characteristics of systems can be tracked; (3) systems with unknown characteristics can be controlled; and (4) a small segment of the data is needed during the computations. Included in the paper are the discrete‐time representation of single‐input single‐output (SISO) systems, models for SISO systems with noise, the concept of stochastic approximation, the recursive prediction error method (RPEM) for system identification, and adaptive control. Guidelines for model selection and model validation and the computational aspects of the method are also discussed in the paper. The present paper is the first of two companion papers. The theory given in the paper is limited to that which is...

102 citations


Journal ArticleDOI
TL;DR: In this paper, a Monte Carlo comparison of five model selection criteria indicates that when the true relationship is weak and delayed, Theil's criterion or the selection of the most significant lag structure is most likely to result in the correct inference.

89 citations


Journal ArticleDOI
TL;DR: A model selection criterion based on the minimum description length principle due to Rissanen is discussed and strong consistency in the stationary situation is shown and modifications required to introduce forgetting when the procedures are implemented in the non-stationary case are presented.
Abstract: This paper is concerned with the recursive estimation of autoregressive models, in particular the real-time determination of the order of such a specification. A model selection criterion based on the minimum description length principle due to Rissanen is discussed and strong consistency in the stationary situation is shown. Alternative criteria are also considered, and modifications required to introduce forgetting when the procedures are implemented in the non-stationary case are presented. Some simulation evidence on the performance of the criteria when applied to stationary and non-stationary processes is given.

61 citations


Journal ArticleDOI
TL;DR: It is argued that, because of relationships existing between the simulation residuals and estimation residuals, no new information is forthcoming from a simulation, and current proposals using simulation output for the purpose of model evaluation are analysed.

Journal ArticleDOI
Yuzo Hosoya1
TL;DR: In this article, a generalized likelihood ratio test with equal marginal error rate was proposed as a basic tool for statistical inference for statistical models which have a nested structure, and the power performance of this test was characterized and some applications were illustrated.
Abstract: This paper deals with statistical inference for statistical models which have a nested structure. The emphasis here is on the use of a classical test of significance and on a confidence set construction approach for model identification, and the paper proposes a generalized likelihood ratio test with equal marginal error rate as a basic tool. The power performance of this test is characterized and some applications are illustrated.

Journal ArticleDOI
TL;DR: In this article, the authors deal with the selection and estimation of the appropriate form of a growth curve for technological forecasting, including the determination of the shape of the growth curve (logistic or Gompertz) and the structure of the error underlying the model.

Journal ArticleDOI
TL;DR: A non-asymptotic approach to model selection and to the estimation of performance and other parameters affected by the model selection is presented; it treats data-based model selection as an integrated part of the estimation procedure.


01 Jan 1989
TL;DR: In this paper, simple likelihood-ratio based statistics are proposed for testing the null hypothesis that the competing models are equally close to the true data-generating process against the alternative hypothesis that one model is closer.
Abstract: In this paper, we develop a classical approach to model selection. Using the Kullback-Leibler Information Criterion to measure the closeness of a model to the truth, we propose simple likelihood-ratio based statistics for testing the null hypothesis that the competing models are equally close to the true data generating process against the alternative hypothesis that one model is closer. The tests are directional and are derived successively for the cases where the competing models are non-nested, overlapping, or nested and whether both, one, or neither is misspecified. As a prerequisite, we fully characterize the asymptotic distribution of the likelihood ratio statistic under the most general conditions. We show that it is a weighted sum of chi-square distributions or a normal distribution depending on whether the distributions in the competing models closest to the truth are observationally identical. We also propose a test of this latter condition.

Journal ArticleDOI
TL;DR: In this paper, the identifiability problem for factor analysis models representing two blocks of variables is discussed and a model selection procedure for choosing the "exact part" of the modelled variables is proposed.

Journal ArticleDOI
TL;DR: A novel approach to the prediction of lake acidification that utilizes each model in those cases where it is most applicable, an approach which requires both quantitative and qualitative judgements or rules to choose the proper model.

Journal ArticleDOI
TL;DR: In this paper, the authors developed and evaluated a confirmatory approach to assess test structure using multidimensional item response theory (MIRT), which involves adding to the exponent of the MIRT model an item structure matrix that allows the user to specify the ability dimensions measured by an item.
Abstract: The purpose of this research was to develop and evaluate a confirmatory approach to assessing test structure using multidimensional item response theory (MIRT). The approach investigated involves adding to the exponent of the MIRT model an item structure matrix that allows the user to specify the ability dimensions measured by an item. Various combinations of item structures were fit to two sets of simulation data with known true structures, and the results were evaluated using a likelihood ratio chi-square statistic and two information-based model selection criteria. The results of these analyses support the use of the confirmatory MIRT approach, since it was found that the procedures could recover the true item structures. It was also found that adding an additional ability dimension that forces together items that ought not to be together noticeably deteriorates the quality of the solution. On the other hand, imposing structures different from, but not inconsistent with, the true structures does not necessarily yield worse fit. Finally, in terms of model fit statistics, the consistent Akaike information criterion performed better than the simple Akaike information criterion, while the likelihood ratio chi-square was clearly inadequate.

Journal ArticleDOI
TL;DR: In this paper, a model based on the discrete autoregressive moving average (DARMA) family of stochastic processes, which includes the Markov chain as a particular case, was developed for simulation of sequences of dry and wet days.
Abstract: A new statistical model has been developed for the simulation of sequences of dry and wet days. The model is based on the discrete autoregressive moving average (DARMA) family of stochastic processes, which includes the Markov chain as a particular case. The model building is based on a three‐step procedure consisting of identification, estimation, and model selection. The model identification and parameter estimation are based on the best fit of the autocorrelation function, while the selection of the optimum model is based on the best reproduction of the probability distribution function of the lengths of the runs of dry days and wet days. The model thus has the property of reproducing the persistence of dry spells and wet spells, which is important in the evaluation and forecast of droughts and floods. Excellent results were obtained with rainfall data from Indiana, which indicate that the models are useful for scheduling irrigation of crops in the Central United States and possibly elsewhere.
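The Markov-chain special case of the family is easy to simulate, and the run-length behaviour used for model selection can be checked directly. A sketch with made-up transition probabilities:

```python
import random

random.seed(5)
# Two-state chain for daily rainfall occurrence, the simplest member of
# the DARMA family; the transition probabilities below are made up.
p_wet_after_wet, p_wet_after_dry = 0.7, 0.2

state, days = 0, []                    # 0 = dry, 1 = wet
for _ in range(10000):
    p = p_wet_after_wet if state == 1 else p_wet_after_dry
    state = 1 if random.random() < p else 0
    days.append(state)

# Dry-spell run lengths: under this chain they are geometric with
# mean 1 / p_wet_after_dry = 5 days, the persistence the paper targets.
runs, length = [], 0
for s in days:
    if s == 0:
        length += 1
    elif length:
        runs.append(length)
        length = 0
mean_dry_run = sum(runs) / len(runs)
```

Model selection in the paper amounts to asking whether the empirical run-length distribution produced this way matches the observed one better than richer DARMA members do.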

Journal ArticleDOI
TL;DR: In this paper, the authors use several statistical criteria for model selection in Granger-causality detection; the results indicate some sensitivity to the chosen criterion.

Book ChapterDOI
Charles A. Micchelli1
01 Jan 1989
TL;DR: In this article, building on work of Milanese and Vicino, two formulations of p-widths subject to measurement errors are presented and optimal linear models are identified in some concrete cases for a given class of signals constrained to lie in some known set.
Abstract: We follow the interesting developments in the recent work of Milanese and Belforte [7], Belforte, Bona, and Frediani [1,2], and Milanese and Vicino [8] and address some questions suggested by these papers on the subject of parameter estimation in linear models subject to unknown additive and bounded noise. In addition, we treat the question of model selection for a given class of signals constrained to lie in some known set. We approach this latter question by presenting two formulations of p-widths subject to measurement errors and identify optimal linear models in some concrete cases. Both stochastic and deterministic p-widths are studied.

Journal ArticleDOI
Luc Devroye1
TL;DR: In this paper, the authors consider the problem of choosing between two density estimates, a non-parametric estimate with the standard properties of nonparametric estimates (universal consistency, robustness, but not extremely good rate of convergence) and a special estimate designed to perform well on a given set T of densities.
Abstract: We consider the problem of choosing between two density estimates: a nonparametric estimate with the standard properties of nonparametric estimates (universal consistency, robustness, but not an extremely good rate of convergence) and a special estimate designed to perform well on a given set T of densities. The special estimate can often be thought of as a parametric estimate. The selection we propose is based upon the L1 distance between both estimates. Among other things, we show how one should proceed to ensure that the selected estimate matches the special estimate's rate on T, and that it matches the nonparametric estimate's rate off T.
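The selection rule can be sketched by measuring the L1 distance between the two estimates on a grid. A pure-Python toy with a histogram as the nonparametric estimate and a fitted normal as the special (parametric) estimate; the bin width, grid, and decision threshold are illustrative assumptions:

```python
import math, random, statistics

random.seed(6)
x = [random.gauss(0, 1) for _ in range(1000)]    # data happen to be normal

# Special (parametric) estimate: normal density with MLE parameters.
mu, sd = statistics.fmean(x), statistics.pstdev(x)
def f_par(t):
    return math.exp(-((t - mu) / sd) ** 2 / 2) / (sd * math.sqrt(2 * math.pi))

# Nonparametric estimate: histogram on [-4, 4) with 20 bins.
lo, hi, nb = -4.0, 4.0, 20
w = (hi - lo) / nb
counts = [0] * nb
for v in x:
    if lo <= v < hi:
        counts[int((v - lo) / w)] += 1
def f_hist(t):
    return counts[int((t - lo) / w)] / (len(x) * w) if lo <= t < hi else 0.0

# L1 distance between the two estimates, approximated on a fine grid.
step = 0.01
l1 = sum(abs(f_par(lo + i * step) - f_hist(lo + i * step)) * step
         for i in range(int((hi - lo) / step)))

# Selection in the spirit of the paper: if the two estimates agree in L1,
# trust the special estimate and inherit its better rate on T.
choice = "parametric" if l1 < 0.4 else "histogram"
```

When the data leave T (say, a bimodal truth), the L1 distance stays large and the rule falls back to the universally consistent nonparametric estimate.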

Book ChapterDOI
01 Jan 1989
TL;DR: The results of a Bayesian model selection calculation are presented, and it is shown that the Bayesian answer to this question is essentially a quantitative statement of Occam's razor: When two models fit the evidence in the data equally well, choose the simpler model.
Abstract: The model selection problem is one of the most basic problems in data analysis. Given a data set one can always expand the model almost indefinitely. How does one pick a model which explains the data, but does not contain spurious features relating to the noise? Here we present the results of a Bayesian model selection calculation started in [1] and then extended in [2], and show that the Bayesian answer to this question is essentially a quantitative statement of Occam's razor: When two models fit the evidence in the data equally well, choose the simpler model.
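The quantitative Occam's razor shows up already in the simplest conjugate example, a coin flip, where both marginal likelihoods have closed forms (the data below are made up):

```python
from math import factorial

# Toy coin-flip data: n tosses, h of them heads.
n = 20

def evidence_simple(h):
    # "Simple" model: a fair coin, no free parameters.
    # Probability of the observed sequence:
    return 0.5 ** n

def evidence_complex(h):
    # "Complex" model: unknown bias theta with a uniform prior. The
    # marginal likelihood of the sequence is Beta(h+1, n-h+1):
    return factorial(h) * factorial(n - h) / factorial(n + 1)

# The flexible model spreads its probability over all possible data sets,
# so with mildly unbalanced data the simple model wins the Bayes factor:
bf_mild = evidence_simple(12) / evidence_complex(12)
# while strongly unbalanced data overwhelm the Occam penalty:
bf_extreme = evidence_simple(16) / evidence_complex(16)
```

No explicit complexity penalty is added anywhere; the razor emerges automatically from integrating the likelihood over the prior.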

Journal ArticleDOI
TL;DR: In this paper, Monte Carlo analysis is employed to investigate the relative importance of input parameter uncertainty versus process aggregation error, and an expanded form of the Streeter-Phelps dissolved oxygen equation is used to demonstrate the application of this technique.
Abstract: Modeling error can be divided into two basic components: use of an incorrect model and input parameter uncertainty. Incorrect model usage can be further subdivided into inappropriate model selection and inherent modeling error due to process aggregation. Total modeling error is a culmination of these various modeling error components, with overall optimization requiring reductions in all. A technique utilizing Monte Carlo analysis is employed to investigate the relative importance of input parameter uncertainty versus process aggregation error. An expanded form of the Streeter-Phelps dissolved oxygen equation is used to demonstrate the application of this technique. A variety of scenarios are analyzed to illustrate the relative contribution of each modeling error component. Under certain circumstances an aggregated model performs better than a more complex model that perfectly simulates the real system. Alternately, process aggregation error dominates total modeling error for other situations. The ability to differentiate modeling error impact is a function of the desired or imposed model performance level (accuracy tolerance).
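Propagating input-parameter uncertainty through the classical Streeter-Phelps deficit equation takes only a plain Monte Carlo loop; all rate constants, loads, and distributions below are illustrative assumptions, not the paper's values:

```python
import math, random

random.seed(7)

def deficit(t, k1, k2, L0=20.0, D0=2.0):
    """Streeter-Phelps dissolved-oxygen deficit at time t (days), with
    deoxygenation rate k1, reaeration rate k2 (1/day), initial BOD L0
    and initial deficit D0 (mg/L)."""
    return (k1 * L0 / (k2 - k1)) * (math.exp(-k1 * t) - math.exp(-k2 * t)) \
           + D0 * math.exp(-k2 * t)

# Monte Carlo propagation of input-parameter uncertainty: sample the two
# rate constants and look at the spread of the predicted deficit at t = 2.
samples = [deficit(2.0, random.gauss(0.35, 0.05), random.gauss(0.7, 0.1))
           for _ in range(5000)]
mean = sum(samples) / len(samples)
spread = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
```

Comparing this spread against the bias introduced by an aggregated model is exactly the trade-off the paper analyzes: when parameter uncertainty dominates, the simpler model can outperform the "perfect" one.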

Journal ArticleDOI
TL;DR: In this article, it is shown that model selection is influenced by whether the model is correctly specified, and eight model selection criteria are compared with respect to this influence.

Journal ArticleDOI
TL;DR: An algorithm for model selection in discrimination with categorical variables is presented, based on four models applied hierarchically and linked with a build-up procedure of feature selection; an application in medical diagnostics is described.
Abstract: An algorithm for model selection in discrimination with categorical variables is presented. It is based on four models applied hierarchically and linked with a build-up procedure of feature selection. The choice of models and features is validated throughout by cross-validation. Results of an application in medical diagnostics are described.

Journal ArticleDOI
TL;DR: It is indeed possible to quantify errors in the dependent variables of logit models arising from errors in the independent variables, and this error consideration can be used as a tool for identifying the optimal model from a set of candidate models.

Journal ArticleDOI
TL;DR: It is concluded that the range of methods now available makes old controversies less relevant and researchers may now look for strategies that combine the advantages of non-parametric and parametric approaches without automatically accepting the limitations of either.

Journal ArticleDOI
TL;DR: In this article, an internally balanced state space model is constructed by setting the observability and controllability Gramians equal to the diagonal matrix of singular values of an impulse-response Hankel matrix.
Abstract: A numerically superior method of vector time series model reduction is proposed. The internally balanced state space model is constructed by setting the observability and controllability Gramians equal to the diagonal matrix of singular values of an impulse-response Hankel matrix. This construction yields a stable model, and stability is retained by reduced models derived from it. The full model and the reduced model satisfy an optimization criterion with respect to conditioning. Two objective reduced-order model selection criteria are proposed. When applied to the Quenouille U.S. hog, corn, and farm wages data, the internally balanced model reduction method suggests that the data are efficiently represented by a state space model with only five or six states.