
Showing papers on "Cross-validation published in 1990"


Journal ArticleDOI
Naomi Altman
TL;DR: The mean squared error of kernel estimators is computed for processes with correlated errors, and the estimators are shown to be consistent when the sequence of error processes converges to a mixing sequence.
Abstract: Kernel smoothing is a common method of estimating the mean function in the nonparametric regression model y = f(x) + e, where f(x) is a smooth deterministic mean function and e is an error process with mean zero. In this article, the mean squared error of kernel estimators is computed for processes with correlated errors, and the estimators are shown to be consistent when the sequence of error processes converges to a mixing sequence. The standard techniques for bandwidth selection, such as cross-validation and generalized cross-validation, are shown to perform very badly when the errors are correlated. Standard selection techniques are shown to favor undersmoothing when the correlations are predominantly positive and oversmoothing when negative. The selection criteria can, however, be adjusted to correct for the effect of correlation. In simulations, the standard selection criteria are shown to behave as predicted. The corrected criteria are shown to be very effective when the correlation functi...

304 citations
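The bandwidth-selection procedure this abstract critiques can be sketched as ordinary leave-one-out cross-validation for a Nadaraya-Watson kernel smoother. The following is a minimal illustrative sketch, not the paper's method: the simulated data, Gaussian kernel, and bandwidth grid are all assumptions, and the noise here is independent (with correlated noise, the paper's setting, this same criterion tends to undersmooth).

```python
import math
import random

random.seed(0)
n = 60
xs = sorted(random.uniform(0.0, 1.0) for _ in range(n))
# smooth mean function plus *independent* noise; correlated errors would
# bias the cross-validated bandwidth choice, as the paper shows
ys = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]

def nw(x0, xd, yd, h):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in xd]
    s = sum(w)
    return sum(wi * yi for wi, yi in zip(w, yd)) / s

def loo_cv(xd, yd, h):
    """Leave-one-out cross-validation score for bandwidth h."""
    sse = 0.0
    for i in range(len(xd)):
        xt = xd[:i] + xd[i + 1:]
        yt = yd[:i] + yd[i + 1:]
        sse += (yd[i] - nw(xd[i], xt, yt, h)) ** 2
    return sse / len(xd)

grid = [0.01, 0.02, 0.05, 0.1, 0.2, 0.4]
scores = {h: loo_cv(xs, ys, h) for h in grid}
h_cv = min(scores, key=scores.get)
print("CV bandwidth:", h_cv)
```

The paper's corrected criteria adjust this score for the error correlation structure; the sketch above is only the uncorrected baseline.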


Journal ArticleDOI
01 Dec 1990
TL;DR: In this article, a systematic report on mean squared error matrix comparisons of competing biased estimators is given, where the parameter vector to be estimated is assumed to belong to a subset of the p-dimensional Euclidean space.
Abstract: In the following we give a systematic report on mean squared error matrix comparisons of competing biased estimators. Our approach is quite general: the parameter vector to be estimated is assumed to belong to a subset of the p-dimensional Euclidean space. To illustrate our results, however, we pay particular attention to the linear regression model, where biased estimation is very popular. We are especially interested in generalized ridge and restricted least squares estimation.

154 citations


Journal ArticleDOI
TL;DR: The present study applies a k-means, optimal weighting procedure to two empirical data sets and contrasts its cross-validation performance with that of unit (i.e., equal) weighting of the variables.
Abstract: Recently, algorithms for optimally weighting variables in non-hierarchical and hierarchical clustering methods have been proposed. Preliminary Monte Carlo research has shown that at least one of these algorithms cross-validates extremely well. The present study applies a k-means, optimal weighting procedure to two empirical data sets and contrasts its cross-validation performance with that of unit (i.e., equal) weighting of the variables. We find that the optimal weighting procedure cross-validates better in one of the two data sets. In the second data set its comparative performance strongly depends on the approach used to find seed values for the initial k-means partitioning.

73 citations


Journal ArticleDOI
TL;DR: In this article, an adaptation of least squares cross-validation is proposed for bandwidth choice in the kernel estimation of the derivatives of a probability density, which is demonstrated by an example and a simulation study.
Abstract: An adaptation of least squares cross-validation is proposed for bandwidth choice in the kernel estimation of the derivatives of a probability density. The practicality of the method is demonstrated by an example and a simulation study. Theoretical justification is provided by an asymptotic optimality result.

69 citations
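The paper adapts least squares cross-validation to density *derivatives*; as background, here is a sketch of the standard least-squares cross-validation criterion for the density itself. Everything concrete (the simulated sample, Gaussian kernel, and bandwidth grid) is an illustrative assumption:

```python
import math
import random

random.seed(1)
n = 100
data = [random.gauss(0.0, 1.0) for _ in range(n)]

def phi(u, s):
    """N(0, s^2) density evaluated at u."""
    return math.exp(-0.5 * (u / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def lscv(xs, h):
    """LSCV score: integral of fhat^2 minus twice the mean leave-one-out fhat(x_i).
    For a Gaussian kernel the integral term has a closed form, because the
    convolution of two N(0, h^2) kernels is an N(0, 2 h^2) kernel."""
    m = len(xs)
    t1 = sum(phi(xi - xj, h * math.sqrt(2.0)) for xi in xs for xj in xs) / (m * m)
    t2 = sum(phi(xs[i] - xs[j], h)
             for i in range(m) for j in range(m) if i != j)
    t2 = 2.0 * t2 / (m * (m - 1))
    return t1 - t2

grid = [0.1, 0.2, 0.3, 0.5, 0.8, 1.2]
h_lscv = min(grid, key=lambda h: lscv(data, h))
print("LSCV bandwidth:", h_lscv)
```

The paper's contribution replaces the squared-density target with squared density derivatives; the selection-by-minimization structure is the same.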


Journal ArticleDOI
TL;DR: This paper revisits Wahba's comparison of generalized cross validation (GCV) and modified maximum likelihood (MML) procedures for choosing the smoothing parameter of a smoothing spline.
Abstract: Wahba compared the performance of generalized cross validation (GCV) and modified maximum likelihood (MML) procedures for choosing the smoothing parameter of a smoothing spline. This work makes a more careful study of the two procedures when the stochastic model motivating the modified maximum likelihood estimate is correct. In particular, it is shown that in the case of the linear smoothing spline with equally spaced observations, both estimates are asymptotically normal with the GCV estimate having twice the asymptotic variance of the MML estimate. The impact of using these estimates on the subsequent predictions is also calculated. Conjectures on how these results should generalize to higher order smoothing splines are developed. These conjectures suggest that the penalty for using GCV instead of MML when the stochastic model is correct is greater for higher order smoothing splines, both in terms of the efficiency in estimating the smoothing parameter and the impact on subsequent predictions.

68 citations
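Fitting an actual smoothing spline requires linear algebra beyond a short sketch, but the GCV criterion itself applies to any linear smoother y_hat = A(λ) y. As an illustrative stand-in (a kernel smoother instead of a spline; data and bandwidth grid are assumptions), here is GCV computed directly, exploiting the fact that the hat-matrix diagonal of a Nadaraya-Watson smoother is available in closed form:

```python
import math
import random

random.seed(2)
n = 80
xs = sorted(random.uniform(0.0, 1.0) for _ in range(n))
ys = [math.cos(2 * math.pi * x) + random.gauss(0.0, 0.25) for x in xs]

def gcv(xd, yd, h):
    """GCV score for a Gaussian-kernel smoother:
    GCV(h) = (mean squared residual) / (1 - tr(A)/n)^2,
    where A is the smoother ("hat") matrix."""
    m = len(xd)
    rss = 0.0
    tr = 0.0
    for i in range(m):
        w = [math.exp(-0.5 * ((xd[i] - xj) / h) ** 2) for xj in xd]
        s = sum(w)
        fit = sum(wj * yj for wj, yj in zip(w, yd)) / s
        rss += (yd[i] - fit) ** 2
        tr += 1.0 / s  # diagonal entry A_ii = K(0) / sum_j K = 1/s
    return (rss / m) / (1.0 - tr / m) ** 2

grid = [0.02, 0.05, 0.1, 0.2, 0.4]
h_gcv = min(grid, key=lambda h: gcv(xs, ys, h))
print("GCV bandwidth:", h_gcv)
```

The MML alternative studied in the paper instead maximizes the likelihood of the stochastic (Gaussian-prior) model; the paper's result concerns the relative asymptotic variance of the two selectors for splines, which this sketch does not reproduce.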


Proceedings ArticleDOI
01 May 1990
TL;DR: The cross-validation principle is used to address the task of model selection, and a selection rule is derived via Bayesian predictive densities for the set of nested normal linear regression models.
Abstract: The cross-validation principle is used to address the task of model selection. Assuming that a set of probabilistic models is given or constructed, the derivation of a selection rule via Bayesian predictive densities is discussed. A selection rule is derived for the set of nested normal linear regression models. Conditioned on the assumption that the true model is in the set of examined models, this rule asymptotically yields consistent selection of the true model. Some simulation results to demonstrate the performance of the selection criterion are included.

56 citations


Journal ArticleDOI
TL;DR: In this paper, a method for robust nonparametric regression is proposed, which is shown to possess a theoretical asymptotic optimality property, while some simulated examples confirm that the approach is practicable.
Abstract: A method for robust nonparametric regression is discussed. We consider kernel M-estimates of the regression function using Huber's ψ-function and extend results of Hardle and Gasser to the case of random designs. A practical adaptive procedure is proposed consisting of simultaneously minimising a cross-validatory criterion with respect to both the smoothing parameter and a robustness parameter occurring in the ψ-function. This method is shown to possess a theoretical asymptotic optimality property, while some simulated examples confirm that the approach is practicable.

55 citations
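A kernel M-estimate with Huber's ψ can be computed at a point by iteratively reweighted averaging, where each observation's weight is its kernel weight times the Huber downweighting min(1, c/|r|) of its residual. The sketch below is illustrative only: the data, the outlier, the bandwidth h, and the robustness constant c are assumptions, and the paper's joint cross-validation over (h, c) is omitted.

```python
import math
import random

random.seed(3)
xs = [i / 50.0 for i in range(51)]
ys = [math.sin(math.pi * x) + random.gauss(0.0, 0.1) for x in xs]
ys[25] += 5.0  # inject a gross outlier at x = 0.5

def huber_kernel_fit(x0, xd, yd, h, c=1.345, iters=30):
    """Kernel M-estimate at x0: a weighted Huber location estimate whose
    weights combine a Gaussian kernel in x with Huber residual downweighting."""
    kw = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in xd]
    theta = sum(k * y for k, y in zip(kw, yd)) / sum(kw)  # start at kernel mean
    for _ in range(iters):
        w = []
        for k, y in zip(kw, yd):
            r = abs(y - theta)
            w.append(k * (1.0 if r <= c else c / r))  # Huber weight min(1, c/|r|)
        theta = sum(wi * y for wi, y in zip(w, yd)) / sum(w)
    return theta

x0 = 0.5
robust = huber_kernel_fit(x0, xs, ys, h=0.1)
kw = [math.exp(-0.5 * ((x0 - xi) / 0.1) ** 2) for xi in xs]
plain = sum(k * y for k, y in zip(kw, ys)) / sum(kw)  # non-robust kernel mean
print("robust:", robust, "plain:", plain)
```

With the outlier sitting at x0, the non-robust kernel mean is pulled upward while the Huber-weighted estimate stays close to the true value sin(π/2) = 1.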


Journal ArticleDOI
TL;DR: The estimate of the average posterior variance of a spline estimate proposed by Wahba is shown to converge in probability to a quantity proportional to the expected average squared error.
Abstract: A smoothing spline estimator can be interpreted in two ways: either as the solution to a variational problem or as the posterior mean when a particular Gaussian prior is placed on the unknown regression function. In order to explain the remarkable performance of her Bayesian "confidence intervals" in a simulation study, Wahba conjectured that the average posterior variance of a spline estimate evaluated at the observation points will be close to the expected average squared error. The estimate of the average posterior variance proposed by Wahba is shown to converge in probability to a quantity proportional to the expected average squared error. This result is established by relating this statistic to a consistent risk estimate based on generalized cross-validation.

34 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed two approaches to estimate the number of factors in a data compression model: the first uses the estimated standard error of the model and the second is an approximation to a leave-one-out method.
Abstract: Overcoming the collinearity problem in regression by data compression techniques [i.e., principal component regression (PCR) and partial least-squares (PLS)] requires estimation of the number of factors (principal components) to use for the model. The most common approach is to use cross-validation for this purpose. Unfortunately, cross-validation is time consuming to carry out. Accordingly, we have searched for time-saving methods to estimate the number of factors. Two approaches were considered. The first uses the estimated standard error of the model and the second is an approximation to a cross-validation leave-one-out method. Both alternatives have been tested on spectroscopic data. It has been found that, when the number of wavelengths is limited, both methods give results similar to those obtained by full cross-validation both for PCR and PLS. However, when the number of wavelengths is large, the tested methods are reliable only for PCR and not for PLS.

27 citations
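The appeal of approximating leave-one-out cross-validation rests on a classical fact: for linear least-squares fits, the leave-one-out residual is available in closed form as e_i / (1 - h_ii), where h_ii is the leverage, so no refitting is needed. The sketch below verifies this identity in simple linear regression (an illustrative stand-in for PCR/PLS; the simulated data are assumptions):

```python
import random

random.seed(4)
n = 30
xs = [random.uniform(0.0, 10.0) for _ in range(n)]
ys = [2.0 + 0.5 * x + random.gauss(0.0, 1.0) for x in xs]

def ols(xd, yd):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    m = len(xd)
    xbar = sum(xd) / m
    ybar = sum(yd) / m
    sxx = sum((x - xbar) ** 2 for x in xd)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xd, yd)) / sxx
    return ybar - b * xbar, b

a, b = ols(xs, ys)
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)
press_fast = 0.0  # hat-matrix shortcut, no refitting
press_slow = 0.0  # brute-force leave-one-out refits
for i in range(n):
    e_i = ys[i] - (a + b * xs[i])
    h_ii = 1.0 / n + (xs[i] - xbar) ** 2 / sxx  # leverage of point i
    press_fast += (e_i / (1.0 - h_ii)) ** 2     # closed-form LOO residual
    xt = xs[:i] + xs[i + 1:]
    yt = ys[:i] + ys[i + 1:]
    ai, bi = ols(xt, yt)
    press_slow += (ys[i] - (ai + bi * xs[i])) ** 2
print(press_fast, press_slow)
```

The two PRESS values agree to floating-point precision. For PCR/PLS the analogous shortcut is only approximate (the factor decomposition itself changes when a sample is deleted), which is exactly the reliability question the paper examines.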


Journal ArticleDOI
TL;DR: In this paper, the use of cross-validation is considered in conjunction with orthogonal series estimators for a probability density function, and a data-based procedure which will select both the optimal choice of series, and the best trade-off between bias-squared and variance, i.e. series length.
Abstract: The use of cross-validation is considered in conjunction with orthogonal series estimators for a probability density function. We attempt to establish a data-based procedure which will select both the optimal choice of series, and the best trade-off between bias-squared and variance, i.e. series length. Although the expected value of the estimator looks promising, the rate of convergence is very slow. Simulations illustrate the theoretical results.

4 citations
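An orthogonal series density estimator truncates a basis expansion of the density, and cross-validation can choose the truncation point (the bias-squared versus variance trade-off in the abstract). A minimal sketch with an assumed cosine basis on [0, 1] and simulated data (the paper also considers the choice *between* series, which is omitted here):

```python
import math
import random

random.seed(5)
n = 200
# sample from a triangular density on [0, 1] (mean of two uniforms)
data = [(random.random() + random.random()) / 2.0 for _ in range(n)]

def basis(j, x):
    """Orthonormal cosine basis on [0, 1]."""
    return 1.0 if j == 0 else math.sqrt(2.0) * math.cos(j * math.pi * x)

Jmax = 12
theta = [sum(basis(j, x) for x in data) / n for j in range(Jmax + 1)]

def cv_score(J):
    """LSCV-type score: integral of fhat_J^2 minus 2/n times the sum of
    leave-one-out fhat_J(x_i). Orthonormality gives
    integral(fhat^2) = sum_j theta_j^2."""
    t1 = sum(theta[j] ** 2 for j in range(J + 1))
    t2 = 0.0
    for x in data:
        f_loo = 0.0
        for j in range(J + 1):
            th_loo = (n * theta[j] - basis(j, x)) / (n - 1)  # drop x from theta_j
            f_loo += th_loo * basis(j, x)
        t2 += f_loo
    return t1 - 2.0 * t2 / n

J_cv = min(range(1, Jmax + 1), key=cv_score)
print("CV series length:", J_cv)
```

The slow convergence rate reported in the abstract means such a data-chosen J can be quite variable in practice, even when the criterion is unbiased on average.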


Journal ArticleDOI
TL;DR: This paper proposes a model selection strategy that combines strongly held economic priors with a search for models having low values of the final prediction error criterion, restricting attention to models whose regressors are predetermined.
Abstract: selection algorithm like the state space models do. ACE does internally make use of the idea of cross validation, which is closely related to prediction error. We have carried that empirical philosophy over to the area of model selection and combined strongly held economic priors with a search for models with low values of the final prediction error criterion. Our model selection strategy was to consider only those models with homogeneity and per capita quantities. Because our interest is in prediction, we considered only those models in which the regressors would be predetermined.

Journal ArticleDOI
Jeffrey Pliskin
TL;DR: In this paper, the orientations of the true coefficient vector for which the ridge estimator performs best and worst, identified by Newhouse and Oman for ordinary ridge regression, are derived for generalized ridge regression under two risk functions: mean squared error and mean squared error of prediction.
Abstract: Newhouse and Oman (1971) identified the orientations, with respect to the eigenvectors of X'X, of the true coefficient vector of the linear regression model for which the ordinary ridge regression estimator performs best and worst when mean squared error is the measure of performance. In this paper the corresponding result is derived for generalized ridge regression for two risk functions: mean squared error and mean squared error of prediction.

01 Jan 1990
TL;DR: In this article, a linear shrinkage of the observed value towards the normal value is used to estimate the expected number of accidents at a road junction, and the uncertainty of the estimate is assessed by combining cross-validation with the bootstrap.
Abstract: In traffic accident models for estimating the expected number of accidents at a road junction, the estimate is obtained as a weighted average of the observed number of accidents at the junction and some normal value. This normal value can be the mean value for a group of similar junctions or an otherwise predicted value. The estimate of the expected number of accidents can be regarded as a linear shrinkage of the observed value towards the normal value. This thesis consists of four separate research papers, (A), (B), (C) and (D). General formulas for the shrinkage parameter are given in (A). The formulas are obtained by applying the method of least squares to linear models. In (B) and (C), the shrinkage parameter is estimated by cross-validation. The uncertainty of the estimate is assessed by combining cross-validation with the bootstrap. In (D), maximum likelihood estimation is used to estimate the shrinkage parameter in a case of accident rates.
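The shrinkage idea can be sketched with simulated data: each junction's estimate is w·(observed count) + (1 − w)·(group mean), and w is chosen by predictive validation. This is an illustrative variant, not the thesis's procedure: the Poisson simulation, the two-period validation scheme, and the weight grid are all assumptions.

```python
import math
import random

random.seed(6)
m = 40
# each junction has its own true accident rate; we observe Poisson counts
# in two periods and validate period-1 shrinkage estimates against period 2
rates = [random.uniform(1.0, 8.0) for _ in range(m)]

def poisson(lam):
    """Poisson draw via Knuth's multiplication algorithm."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

y1 = [poisson(r) for r in rates]
y2 = [poisson(r) for r in rates]
normal = sum(y1) / m  # the "normal value": the group mean

def pred_error(w):
    """Validate w*observed + (1-w)*normal from period 1 against period 2."""
    return sum((w * y1[i] + (1.0 - w) * normal - y2[i]) ** 2
               for i in range(m)) / m

grid = [i / 10.0 for i in range(11)]
w_cv = min(grid, key=pred_error)
print("shrinkage weight:", w_cv)
```

An intermediate weight typically wins: w = 1 trusts the noisy observed count entirely, w = 0 ignores junction-specific information, and the cross-validated choice balances the two.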