
Showing papers on "Cross-validation published in 1985"


Journal ArticleDOI
TL;DR: In this paper, a bandwidth-selection rule is formulated in terms of cross validation, and under mild assumptions on the kernel and the unknown regression function, it is seen that this rule is asymptotically optimal.
Abstract: Kernel estimators of an unknown multivariate regression function are investigated. A bandwidth-selection rule is considered, which can be formulated in terms of cross validation. Under mild assumptions on the kernel and the unknown regression function, it is seen that this rule is asymptotically optimal.
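
As a rough illustration of the bandwidth-selection idea, not the paper's multivariate estimator or its assumptions, a leave-one-out cross-validation rule for a one-dimensional Gaussian-kernel (Nadaraya-Watson) estimator might be sketched as follows; the kernel, the bandwidth grid, and the simulated data are assumptions made purely for the example.

```python
import numpy as np

def nw_estimate(x0, x, y, h):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel and bandwidth h."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def loo_cv_score(h, x, y):
    """Leave-one-out cross-validation score for bandwidth h."""
    n = len(x)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        pred = nw_estimate(x[i], x[mask], y[mask], h)
        errs.append((y[i] - pred) ** 2)
    return np.mean(errs)

# Hypothetical data: pick the bandwidth minimizing the CV score over a grid.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 100)
grid = np.linspace(0.01, 0.3, 30)
h_cv = grid[int(np.argmin([loo_cv_score(h, x, y) for h in grid]))]
```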

447 citations


Journal ArticleDOI
TL;DR: In this paper, the authors considered a nonparametric regression model where the zero-mean errors are uncorrelated with common variance σ² and the response function f is assumed only to have a bounded, square-integrable qth derivative.
Abstract: Linear estimation is considered in nonparametric regression models of the form Y_i = f(x_i) + e_i, x_i ∈ (a, b), where the zero-mean errors are uncorrelated with common variance σ² and the response function f is assumed only to have a bounded, square-integrable qth derivative. The linear estimator which minimizes the maximum mean squared error summed over the observation points is derived, and the exact minimax rate of convergence is obtained. For practical problems where bounds on ||f^(q)|| and σ² may be unknown, generalized cross-validation is shown to give an adaptive estimator which achieves the minimax optimal rate under the additional assumption of normality.
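
For a linear smoother ŷ = A(h)y, the generalized cross-validation score is GCV(h) = (1/n)||(I − A(h))y||² / (1 − tr A(h)/n)², and the smoothing level is chosen to minimize it. The following is a minimal sketch of that criterion applied to a simple Gaussian-kernel smoother, not to the paper's minimax construction; the smoother, the grid, and the data are assumptions for illustration.

```python
import numpy as np

def gcv_score(y, A):
    """GCV score for a linear smoother with hat matrix A (y_hat = A @ y)."""
    n = len(y)
    resid = y - A @ y
    return (np.sum(resid ** 2) / n) / (1.0 - np.trace(A) / n) ** 2

def kernel_hat_matrix(x, h):
    """Hat matrix of a Gaussian-kernel (Nadaraya-Watson) smoother at the design points."""
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return W / W.sum(axis=1, keepdims=True)

# Hypothetical data: choose the smoothing level minimizing GCV over a grid.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 80))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 80)
grid = np.linspace(0.02, 0.3, 25)
h_gcv = min(grid, key=lambda h: gcv_score(y, kernel_hat_matrix(x, h)))
```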

185 citations


Journal ArticleDOI
TL;DR: In this paper, a new approach to generalized cross validation based on Stein estimates and the associated unbiased risk estimates is developed, and the consistency results are obtained for the cross-validated (Steinized) estimates in the contexts of nearest neighbor nonparametric regression, model selection, ridge regression, and smoothing splines.
Abstract: This paper concerns the method of generalized cross validation (GCV), a promising way of choosing between linear estimates. Based on Stein estimates and the associated unbiased risk estimates (Stein, 1981), a new approach to GCV is developed. Many consistency results are obtained for the cross-validated (Steinized) estimates in the contexts of nearest-neighbor nonparametric regression, model selection, ridge regression, and smoothing splines. Moreover, the associated Stein's unbiased risk estimate is shown to be uniformly consistent in assessing the true loss (not the risk). Consistency properties are examined as well when the sampling error is unknown. Finally, we propose a variant of GCV to handle the case that the dimension of the raw data is known to be greater than that of their expected values.
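
The unbiased-risk idea underlying this approach can be written down for a generic linear estimator: if y = μ + e with i.i.d. N(0, σ²) errors and μ̂ = Ay, then ||y − Ay||² + 2σ² tr(A) − nσ² is an unbiased estimate of the risk E||μ̂ − μ||². The sketch below uses ridge-type hat matrices purely to illustrate that formula, not as the paper's construction; the data, the penalty grid, and the assumption of a known σ² are all hypothetical.

```python
import numpy as np

def sure_linear(y, A, sigma2):
    """Stein's unbiased risk estimate for a linear estimator mu_hat = A @ y
    under i.i.d. Gaussian noise with known variance sigma2."""
    n = len(y)
    resid = y - A @ y
    return np.sum(resid ** 2) + 2.0 * sigma2 * np.trace(A) - n * sigma2

# Hypothetical use: compare ridge estimators by SURE (sigma2 assumed known here;
# the paper also examines the case where the error variance is unknown).
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(0, 1.0, 60)
sigma2 = 1.0

def ridge_hat(lam):
    """Hat matrix of ridge regression with penalty lam."""
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)

lams = np.logspace(-2, 2, 20)
lam_sure = min(lams, key=lambda lam: sure_linear(y, ridge_hat(lam), sigma2))
```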

173 citations


Journal ArticleDOI
TL;DR: In this paper, the mean squared error matrix of a natural estimator of an individual's regression coefficients is derived for any individual k, in the general situation where all parameters of the model must be estimated, and is shown to be smaller than the mean squared error matrix of the individual least squares estimator for every individual.
Abstract: Estimation of an individual's regression coefficients is considered in a multivariate general linear model, where it is assumed that the individual's coefficients β_k are subject to both fixed effects and random effects over different individuals. The mean squared error matrix of a natural estimator of β_k is derived for any individual k, in the general situation where all parameters of the model must be estimated, and is shown to be smaller than the mean squared error matrix of the individual least squares estimator for every individual. Extension of this result to more general multiparameter estimation situations is also considered.

67 citations


Journal ArticleDOI
TL;DR: This paper presents a simulation comparing various resampling procedures for estimating classification error rate, for the two-class and three-class problems.

60 citations


Journal ArticleDOI
TL;DR: In this paper, the shape of a two-dimensional obstacle is determined from far-field scattering data; by parametrising the boundary of the obstacle, a one-dimensional problem is obtained which is non-linear and ill-posed.
Abstract: The author considers the problem of determining the shape of a two-dimensional obstacle from far-field scattering data. A model for sparse, discrete, error-contaminated data is introduced. This model has applications for both linear acoustic and electromagnetic inverse scattering problems. By parametrising the boundary of the obstacle, a one-dimensional problem, which is non-linear and ill-posed, is obtained. A numerical method is presented to deal with the ill-posedness, non-linearity and error in the data. This method yields a parametrised sequence of smooth approximate solutions. A statistical technique known as generalised cross validation is then used to determine an appropriate value of the smoothing parameter. Two numerical examples are given, showing that the method is quite robust.

27 citations


Journal ArticleDOI
TL;DR: A simple classifier is investigated and a direct fidelity estimation is proposed, avoiding the dilemma that splitting a small sample gives either a well-trained classifier whose fidelity estimate has a wide confidence interval, or a poorly trained one with a narrow interval.
Abstract: To estimate the correct classification rate of a classifier, many different methods exist (test sample, bootstrap, cross-validation). The test-sample method requires very little computational expense. Sometimes only a small number of objects is available (rare diseases, high costs of experiments). When we split such a sample into a training set and a test set, we obtain either a good or a bad fidelity but, conversely, a wide or a narrow confidence interval for its estimate. Overcoming this dilemma is only possible for simple classifiers. Such a simple classifier is investigated and a direct fidelity estimation is proposed.
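
The abstract does not specify the classifier, so the sketch below uses a hypothetical nearest-class-mean rule just to make the dilemma concrete: with a single train/test split, enlarging the training set improves the classifier but shrinks the test set and widens the confidence interval, while leave-one-out cross-validation reuses almost all of the data for training. The classifier, the split fraction, and the simulated data are all assumptions.

```python
import numpy as np

def nearest_mean_classify(train_x, train_y, test_x):
    """A deliberately simple classifier: assign each point to the nearer class mean."""
    m0 = train_x[train_y == 0].mean(axis=0)
    m1 = train_x[train_y == 1].mean(axis=0)
    d0 = np.linalg.norm(test_x - m0, axis=1)
    d1 = np.linalg.norm(test_x - m1, axis=1)
    return (d1 < d0).astype(int)

def holdout_error(x, y, train_frac=0.5, seed=0):
    """Test-sample estimate: one random split into a training set and a test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(train_frac * len(y))
    train, test = idx[:cut], idx[cut:]
    pred = nearest_mean_classify(x[train], y[train], x[test])
    return np.mean(pred != y[test])

def loo_cv_error(x, y):
    """Leave-one-out cross-validation estimate: trains on all but one object each time."""
    wrong = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        pred = nearest_mean_classify(x[mask], y[mask], x[i:i + 1])
        wrong.append(pred[0] != y[i])
    return np.mean(wrong)

# Hypothetical small sample (e.g. a rare disease): two classes, 15 objects each.
rng = np.random.default_rng(3)
x = np.vstack([rng.normal(0.0, 1.0, (15, 2)), rng.normal(1.5, 1.0, (15, 2))])
y = np.repeat([0, 1], 15)
print(holdout_error(x, y), loo_cv_error(x, y))
```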

1 citation


01 Aug 1985
TL;DR: In this article, the use of Laplacian smoothing splines with generalized cross validation (GCV) to choose the smoothing parameter for the objective analysis problem is investigated. Simulated 500 mb pressure height fields are approximated from first-guess data with spatially correlated errors and observed values having independent errors.
Abstract: The use of Laplacian smoothing splines (LSS) with generalized cross validation (GCV) to choose the smoothing parameter for the objective analysis problem is investigated. Simulated 500 mb pressure height fields are approximated from first-guess data with spatially correlated errors and observed values having independent errors. It is found that GCV does not allow LSS to adapt to variations in individual realizations, and that specification of a single suitable parameter value for all realizations leads to smaller rms error overall. While the tests were performed in the context of data from a meteorology problem, it is expected that the results carry over to data from other sources. A comparison shows that significantly better approximations can be obtained using LSS applied in a unified manner to both first-guess and observed values, rather than in a correction-to-first-guess scheme (as in Optimum Interpolation), when the first-guess error has low spatial correlation.

1 citation


Journal ArticleDOI
TL;DR: In this paper, a multi-stage selection procedure (MSSP) is proposed for macroeconomic models in the form of a system of simultaneous equations; it combines the principle of grouping of variables, which removes the consequences of multicollinearity and undersized data samples, with a cross-validation criterion that is asymptotically equivalent to Akaike's criterion for model choice.
Abstract: This paper discusses the design of macroeconomic models in the form of a system of simultaneous equations. A new approach in this field is described. The main idea is that the synthesis of the model is based on a multi-stage selection procedure (MSSP). Characteristic of this procedure is that at each stage of selection a variety of hypotheses about the desired model are generated. Each generated hypothesis is verified and estimated, after which only a few of them, up to a given number, are selected as "best" in the sense of predefined selection criteria. MSSP combines the principle of grouping of variables [1], which makes it possible to remove the consequences of multicollinearity and undersized data samples, with the principle of external addition ("cross-validation"), which is asymptotically equivalent to Akaike's criterion for model choice [4], among others.
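
The multi-stage selection idea resembles GMDH-style procedures: at each stage a set of candidate models is generated, each candidate is estimated, and only a fixed number of the best ones, judged by an external criterion such as cross-validation, survive to the next stage. The sketch below is a heavily simplified, hypothetical single-equation illustration of that loop, not the authors' MSSP for simultaneous-equation systems; the candidate-generation rule, the survivor count, the PRESS criterion, and the data are all assumptions.

```python
import numpy as np

def loo_cv_press(X, y):
    """Leave-one-out prediction error (PRESS) of an OLS fit, used as the external criterion."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        errs.append(float((y[i] - X[i] @ beta) ** 2))
    return float(np.mean(errs))

def multi_stage_select(X, y, n_survivors=3, n_stages=3):
    """At each stage, extend every surviving variable subset by one regressor,
    score all candidates by cross-validation, and keep only the best few."""
    p = X.shape[1]
    survivors = [()]          # start from the empty subset
    best = (np.inf, ())
    for _ in range(n_stages):
        candidates = {tuple(sorted(s + (j,))) for s in survivors for j in range(p) if j not in s}
        scored = sorted((loo_cv_press(X[:, list(c)], y), c) for c in candidates)
        survivors = [c for _, c in scored[:n_survivors]]
        best = min(best, scored[0])
    return best               # (cv score, selected regressor indices)

# Hypothetical data: 8 candidate regressors, 3 of them truly relevant.
rng = np.random.default_rng(4)
X = rng.normal(size=(40, 8))
y = X[:, [0, 2, 5]] @ np.array([1.5, -1.0, 0.8]) + rng.normal(0, 0.5, 40)
print(multi_stage_select(X, y))
```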