Open Access Journal Article (DOI)

Biased and Unbiased Cross-Validation in Density Estimation

TL;DR
This article introduces biased cross-validation criteria for selecting smoothing parameters for kernel and histogram density estimators, closely related to a criterion investigated in Scott and Factor (1981).
Abstract
Nonparametric density estimation requires the specification of smoothing parameters. The demands of statistical objectivity make it highly desirable to base the choice on properties of the data set. In this article we introduce some biased cross-validation criteria for selection of smoothing parameters for kernel and histogram density estimators, closely related to one investigated in Scott and Factor (1981). These criteria are obtained by estimating L2 norms of derivatives of the unknown density and provide slightly biased estimates of the average squared L2 error or mean integrated squared error. These criteria are roughly the analog of Wahba's (1981) generalized cross-validation procedure for orthogonal series density estimators. We present the relationship of the biased cross-validation procedure to the least squares cross-validation procedure, which provides unbiased estimates of the mean integrated squared error. Both methods are shown to be based on U-statistics. We compare the two methods…
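As a concrete illustration of the two criteria, here is a minimal numerical sketch for a Gaussian kernel (not the paper's code; the helper names lscv_score and bcv_score are ours): the unbiased least-squares CV score uses the leave-one-out closed form, while the biased CV score plugs a diagonal-free estimate of R(f'') = int (f'')^2 into the AMISE expansion.

```python
import numpy as np

def lscv_score(data, h):
    """Unbiased (least-squares) CV score for a Gaussian-kernel KDE:
    LSCV(h) = int fhat^2 - (2/n) * sum_i fhat_{-i}(X_i)."""
    n = len(data)
    d = data[:, None] - data[None, :]
    # closed form for int fhat^2: the convolved kernel is N(0, 2h^2)
    int_f2 = np.exp(-d**2 / (4 * h**2)).sum() / (2 * n**2 * h * np.sqrt(np.pi))
    k = np.exp(-0.5 * (d / h) ** 2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(k, 0.0)                      # leave one out
    loo = k.sum(axis=1) / ((n - 1) * h)
    return int_f2 - 2 * loo.mean()

def bcv_score(data, h, grid_size=512):
    """Biased CV score: plug an estimate of R(f'') into the AMISE expansion
    R(K)/(n h) + h^4 R(f'')/4, dropping the diagonal (i = j) terms of
    R(fhat'') that would otherwise inflate the estimate."""
    n = len(data)
    x = np.linspace(data.min() - 4 * h, data.max() + 4 * h, grid_size)
    dx = x[1] - x[0]
    u = (x[:, None] - data[None, :]) / h
    phi = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    d2 = (u**2 - 1) * phi / (n * h**3)            # column i: X_i's term of fhat''
    fpp = d2.sum(axis=1)
    r_fpp = (fpp**2).sum() * dx - (d2**2).sum() * dx   # remove diagonal terms
    return 1 / (2 * n * h * np.sqrt(np.pi)) + h**4 * max(r_fpp, 0.0) / 4

rng = np.random.default_rng(0)
data = rng.normal(size=200)
hs = np.linspace(0.05, 1.0, 40)
h_lscv = hs[np.argmin([lscv_score(data, h) for h in hs])]
h_bcv = hs[np.argmin([bcv_score(data, h) for h in hs])]
print(f"LSCV bandwidth: {h_lscv:.3f}, BCV bandwidth: {h_bcv:.3f}")
```

Minimizing each score over h gives the two data-based bandwidths; the grid integration in bcv_score is a convenience, since closed forms exist for the Gaussian kernel.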


Citations
Journal Article (DOI)

Neural networks and the bias/variance dilemma

TL;DR: It is suggested that current-generation feedforward neural networks are largely inadequate for difficult problems in machine perception and machine learning, regardless of parallel-versus-serial hardware or other implementation issues.
Journal Article (DOI)

A Brief Survey of Bandwidth Selection for Density Estimation

TL;DR: In this article, the authors recommend a "solve-the-equation" plug-in bandwidth selector as being most reliable in terms of overall performance for kernel density estimation.
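The "solve-the-equation" rule itself is more elaborate, but the underlying plug-in idea can be sketched as follows: estimate R(f'') with a normal-reference pilot bandwidth, then plug it into the AMISE-optimal formula h = [R(K) / (n * sigma_K^4 * R(f''))]^(1/5). This one-step version is a simplification of the recommended selector, and the function names are illustrative.

```python
import numpy as np

def r_fpp_hat(data, g, grid_size=512):
    """Estimate R(f'') = int (f'')^2 from a pilot Gaussian KDE with bandwidth g."""
    x = np.linspace(data.min() - 4 * g, data.max() + 4 * g, grid_size)
    u = (x[:, None] - data[None, :]) / g
    phi = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    fpp = ((u**2 - 1) * phi).sum(axis=1) / (len(data) * g**3)
    return (fpp**2).sum() * (x[1] - x[0])

def plugin_bandwidth(data):
    """One-step direct plug-in: normal-reference pilot, then the AMISE formula."""
    n = len(data)
    iqr = np.percentile(data, 75) - np.percentile(data, 25)
    sigma = min(data.std(ddof=1), iqr / 1.349)    # robust scale estimate
    g = 1.06 * sigma * n ** (-1 / 5)              # Silverman's normal-reference rule
    r = r_fpp_hat(data, g)
    # h = [R(K) / (n * sigma_K^4 * R(f''))]^(1/5); R(K) = 1/(2*sqrt(pi)) for Gaussian K
    return (1 / (2 * np.sqrt(np.pi) * n * r)) ** (1 / 5)

rng = np.random.default_rng(1)
print(f"plug-in bandwidth: {plugin_bandwidth(rng.normal(size=500)):.3f}")
```

The solve-the-equation refinement replaces the fixed pilot g with a pilot that depends on h itself, and then solves the resulting fixed-point equation for h.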
Book

Local Regression and Likelihood

Clive Loader
TL;DR: Covers the origins of local regression, fitting with LOCFIT, and the optimization of local regression methods.
Book

Nonparametric and Semiparametric Models

TL;DR: Covers histograms and nonparametric density estimation, along with generalized additive models and generalized partial linear models.
Book

Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning

TL;DR: Techniques covered range from traditional multivariate methods, such as multiple regression, principal components, canonical variates, linear discriminant analysis, factor analysis, clustering, multidimensional scaling, and correspondence analysis, to the newer methods of density estimation, projection pursuit, neural networks, and classification and regression trees.
References
Book

Approximation Theorems of Mathematical Statistics

TL;DR: Covers basic sample statistics, asymptotic theory in parametric inference, U-statistics, and the asymptotic relative efficiency of competing statistics.
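Since the abstract above notes that both cross-validation criteria are U-statistics (a central topic of this reference), here is a minimal illustration of the concept: averaging the symmetric kernel k(x1, x2) = (x1 - x2)^2 / 2 over all distinct pairs gives an unbiased estimate of the variance.

```python
import numpy as np
from itertools import combinations

def u_statistic(data, kernel):
    """Average a symmetric kernel over all distinct pairs of observations."""
    return np.mean([kernel(a, b) for a, b in combinations(data, 2)])

rng = np.random.default_rng(5)
x = rng.normal(scale=2.0, size=300)
var_u = u_statistic(x, lambda a, b: (a - b) ** 2 / 2)
# the pairwise U-statistic reproduces the unbiased sample variance exactly
print(f"U-statistic: {var_u:.3f}, sample variance: {x.var(ddof=1):.3f}")
```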
Journal ArticleDOI

On optimal and data-based histograms

TL;DR: In this article, a data-based procedure for choosing the bin width parameter is proposed, which assumes a Gaussian reference standard and requires only the sample size and an estimate of the standard deviation.
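That data-based rule, under a Gaussian reference standard, is the bin width h = 3.49 * sigma_hat * n^(-1/3); a short sketch:

```python
import numpy as np

def scott_bin_width(data):
    """Data-based histogram bin width: 3.49 * sigma_hat * n^(-1/3)."""
    return 3.49 * data.std(ddof=1) * len(data) ** (-1 / 3)

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
h = scott_bin_width(x)
print(f"bin width: {h:.3f}, bins: {int(np.ceil((x.max() - x.min()) / h))}")
```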
Journal Article (DOI)

An alternative method of cross-validation for the smoothing of density estimates

TL;DR: Derives an alternative method of cross-validation based on integrated squared error, also proposed independently by Rudemo (1982); Hall (1983) established the consistency and asymptotic optimality of the new method.

Empirical Choice of Histograms and Kernel Density Estimators

Mats Rudemo
TL;DR: Methods of choosing the histogram bin width and the smoothing parameter of kernel density estimators from the data are studied, and two closely related risk-function estimators are given.
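A minimal sketch of the risk-estimator idea for histograms, using the cross-validation estimate of the L2 risk as it appears in later textbook treatments of Rudemo's method (the function name is ours):

```python
import numpy as np

def histogram_cv_risk(data, h):
    """CV estimate of L2 risk for a histogram with bin width h:
    J_hat(h) = 2/((n-1)h) - (n+1)/((n-1)h) * sum_j p_hat_j^2."""
    n = len(data)
    edges = np.arange(data.min(), data.max() + h, h)
    counts, _ = np.histogram(data, bins=edges)
    p = counts / n
    return 2 / ((n - 1) * h) - (n + 1) / ((n - 1) * h) * (p**2).sum()

rng = np.random.default_rng(3)
data = rng.normal(size=500)
hs = np.linspace(0.05, 1.5, 60)
best = hs[np.argmin([histogram_cv_risk(data, h) for h in hs])]
print(f"CV-chosen bin width: {best:.3f}")
```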
Journal Article (DOI)

Bandwidth Choice for Nonparametric Regression

John Rice
01 Dec 1984
TL;DR: In this article, the problem of choosing a bandwidth parameter for nonparametric regression is studied; a choice based on an unbiased estimate of mean squared error is shown to be asymptotically optimal, and the relationship of this estimate to a kernel estimate is discussed.
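A hedged sketch of the unbiased-risk construction for a linear smoother (Rice's own criterion differs in detail; the Mallows-type correction and the difference-based variance estimate shown here are one standard form of the idea):

```python
import numpy as np

def nw_weights(x, h):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel."""
    k = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return k / k.sum(axis=1, keepdims=True)

def risk_estimate(x, y, h):
    """Unbiased estimate of MSE for the linear smoother m_hat = W(h) y:
    R_hat(h) = RSS(h)/n + 2 * sigma2_hat * tr(W)/n - sigma2_hat."""
    n = len(x)
    W = nw_weights(x, h)
    rss = ((y - W @ y) ** 2).sum()
    order = np.argsort(x)
    sigma2 = np.sum(np.diff(y[order]) ** 2) / (2 * (n - 1))  # difference-based
    return rss / n + 2 * sigma2 * np.trace(W) / n - sigma2

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=150)
hs = np.linspace(0.01, 0.3, 40)
best = hs[np.argmin([risk_estimate(x, y, h) for h in hs])]
print(f"chosen bandwidth: {best:.3f}")
```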