Bootstrapping for penalized spline regression
Göran Kauermann, Gerda Claeskens and Jean D. Opsomer
DEPARTMENT OF DECISION SCIENCES AND INFORMATION MANAGEMENT (KBI)
Faculty of Economics and Applied Economics
KBI 0609

Bootstrapping for Penalized Spline Regression
Göran Kauermann
Universität Bielefeld
Gerda Claeskens
Katholieke Universiteit Leuven
J. D. Opsomer
Iowa State University
14th February 2006
Abstract
We describe and contrast several different bootstrapping procedures for penalized spline smoothers. The bootstrapping procedures considered are variations on existing methods, developed under two different probabilistic frameworks. Under the first framework, penalized spline regression is considered an estimation technique to find an unknown smooth function. The smooth function is represented in a high dimensional spline basis, with spline coefficients estimated in a penalized form. Under the second framework, the unknown function is treated as a realization of a set of random spline coefficients, which are then predicted in a linear mixed model. We describe how bootstrapping methods can be implemented under both frameworks, and we show in theory and through simulations and examples that bootstrapping provides valid inference in both cases. We compare the inference obtained under both frameworks, and conclude that the latter generally produces better results than the former. The bootstrapping ideas are extended to hypothesis testing, where parametric components in a model are tested against nonparametric alternatives.
Abbreviated title: “Bootstrapping for Penalized Splines.”
AMS 1991 subject classifications. Primary 62G08; secondary 62G09.
Key words: Mixed Model, Nonparametric Regression, Resampling, Nonparametric Hypothesis Testing.

1 Introduction
The objective of nonparametric regression is to model the mean function of a response variable Y by some smooth but otherwise unspecified function µ(x), with x a continuous covariate. Based on a sample of data pairs (x_i, y_i), i = 1, …, n, two important classes of methods for estimating µ(x) are local approaches (see for instance Fan and Gijbels, 1996) and spline smoothing (see for instance Wahba, 1992 or Eubank, 1999). Both methods can be applied in more complex models like Additive Models (Hastie and Tibshirani, 1990), Varying Coefficient Models (Hastie and Tibshirani, 1993) or in generalized response models (Green and Silverman, 1994 or Bowman and Azzalini, 1997). In recent years, penalized spline regression (often referred to as P-splines) has received renewed attention as a powerful alternative smoothing method. Originally suggested by O’Sullivan (1986), the method has been made popular by Eilers and Marx (1996) and more recently through the book by Ruppert, Wand, and Carroll (2003). The main idea of penalized spline regression is to fit the function µ(x) parametrically with a sufficiently flexible spline basis. Instead of simple parametric estimation, however, a penalty is imposed on the spline coefficients to achieve a smooth fit. One technical benefit of this approach is that it reveals a link to linear mixed models (see Wand, 2003). The resulting affinity to linear mixed models is advantageous and can be exploited in various ways. In particular, the smoothing or penalty parameters play the role of a ratio of variances in the mixed model, which suggests the application of maximum likelihood theory for estimation (see for instance Kauermann, 2004).
For notational simplicity, we restrict the presentation to the standard smoothing model Y = µ(x) + ε with ε as zero mean residuals, even though the examples later in this article mirror more complex models. Estimation of µ(x) is carried out by penalized spline regression. Under this method, we first replace µ(x) by the parametric form Xβ + Zu, where X is some low dimensional basis, e.g. a line, while Z is high dimensional, e.g. a basis built from truncated line segments. The main assumption is that Z is sufficiently complex and high dimensional, so that the modelling bias µ(x) − (Xβ + Zu) is of ignorable size compared to the stochastic estimation error. Theoretical results on how large the dimension of the spline basis should be in relation to the sample size are rudimentary, even though Cardot (2002) provides a good starting point. However, it has been found in practice that the actual specification of Z and its dimension has little influence on the fit as long as the dimension of Z is sufficiently large and a penalized fit is pursued. In fact, Ruppert (2002) concludes that “it may be surprising that a default that uses at most 35 or 40 knots [= the dimension of basis Z] could be recommended for effectively all sample sizes and for all smooth regression functions without too many oscillations”.
Once a basis is selected, a penalized fit is pursued by imposing a penalty on the spline
coefficients u and estimating by least squares regression, which results in a ridge regression
estimate. The resulting penalized fit is equivalently achieved by assuming the spline
coefficients u to be random, that is formulating an a priori distribution on u. This leads
to a linear mixed model and the best linear unbiased prediction (BLUP) of u is equivalent
to the penalized smooth fit, if the penalty is selected to be equal to the ratio of the
variances of ε and u.
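
To make the ridge form concrete, here is a minimal NumPy sketch (our illustration, not code from the paper) of the penalized least squares estimate: the low dimensional coefficients β are left unpenalized while the spline coefficients u receive a ridge penalty λ, which coincides with the mixed-model BLUP when λ = σ²_ε/σ²_u.

```python
import numpy as np

def penalized_fit(X, Z, y, lam):
    """Penalized least squares for y = X beta + Z u + eps with penalty lam*||u||^2.

    Solving (C'C + lam*D) theta = C'y is a ridge regression on u; with
    lam = sigma_eps^2 / sigma_u^2 it equals the mixed-model BLUP of u.
    """
    C = np.hstack([X, Z])                        # full design matrix (X, Z)
    p, K = X.shape[1], Z.shape[1]
    D = np.diag(np.r_[np.zeros(p), np.ones(K)])  # penalize only the u block
    theta = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return theta[:p], theta[p:]                  # beta_hat, u_hat
```
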
Our objective is to develop a bootstrap that takes advantage of the mixed model structure, and to compare it with a bootstrap that treats µ(x) as fixed and only ε as random. Bootstrapping for such “smoothing models” has a long history, with Härdle and Bowman (1988) and Härdle and Marron (1991) as two important examples. See also Mammen (1993), Härdle, Huet, and Jolivet (1995) or Galindo, Liang, Kauermann, and Carroll (2001) for some extensions. We refer to Shao and Tu (1995) for an overview. A major concern when bootstrapping in smooth models is the bias occurring due to smoothing, which is not accounted for if one applies a naive bootstrap. This requires the use of a pilot estimate with a relatively large smoothing parameter before the actual bootstrapping is pursued (see Härdle and Marron, 1991). Following the discussion in Ruppert, Wand, and Carroll (2003, ch. 6), we show here that the bias problem can be circumvented in penalized spline smoothing if a mixed model formulation is used for bootstrapping.
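
As a rough illustration of the pilot-estimate idea (a sketch under our own naming, not the authors’ implementation), a residual bootstrap in the smoothing model generates responses around a deliberately oversmoothed pilot fit so that the bias of the final estimator is reflected in the bootstrap replicates. Here `fit` stands in for any penalized smoother returning fitted values, e.g. one assembled from the ridge solve sketched above.

```python
import numpy as np

def residual_bootstrap(y, fit, lam, lam_pilot, n_boot=500, seed=0):
    """Residual bootstrap around an oversmoothed pilot fit.

    `fit(y, lam)` is assumed to return the fitted values mu_hat(x_i);
    choosing lam_pilot > lam gives the oversmoothed pilot estimate
    used to generate the bootstrap responses.
    """
    rng = np.random.default_rng(seed)
    mu_pilot = fit(y, lam_pilot)        # pilot fit, large smoothing parameter
    resid = y - mu_pilot
    resid = resid - resid.mean()        # centre the residuals
    curves = np.empty((n_boot, len(y)))
    for b in range(n_boot):
        y_star = mu_pilot + rng.choice(resid, size=len(y), replace=True)
        curves[b] = fit(y_star, lam)    # refit at the working smoothing level
    return curves                       # pointwise quantiles give confidence bands
```
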
We describe a number of bootstrap versions for both the mixed model and the smoothing model formulations, including simple residual resampling, wild bootstrapping and bootstrapping of correlated spline coefficients. We also show how residuals can be adjusted to compensate for any small sample bias. The adjustment again depends on the model used, that is a smoothing model or a mixed model, respectively. Bootstrapping is employed in our paper for two purposes. First, it serves to mirror estimation variability. That is, we derive bootstrap based confidence bands for our smooth fit. Second, we take advantage of the technique for model validation and model checking. In particular, we use bootstrapping for testing particular components of the model.
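
For the wild bootstrap variant mentioned above, residuals are not resampled but multiplied by independent weights with zero mean and unit variance. One common concrete choice (our illustration; the paper does not prescribe this particular law) is Mammen’s two-point distribution, which also matches the third moment:

```python
import numpy as np

def wild_residuals(resid, rng):
    """Wild bootstrap draws r_i* = r_i * v_i with E[v] = 0, E[v^2] = 1.

    Mammen's two-point distribution additionally gives E[v^3] = 1,
    which is helpful under heteroscedastic errors.
    """
    a = (1 - np.sqrt(5)) / 2                   # ~ -0.618
    b = (1 + np.sqrt(5)) / 2                   # ~  1.618
    p = (np.sqrt(5) + 1) / (2 * np.sqrt(5))    # P(v = a) ~ 0.724
    v = np.where(rng.random(len(resid)) < p, a, b)
    return resid * v

# Example: r_star = wild_residuals(residuals, np.random.default_rng(1))
```
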

The article is organized as follows. In Section 2, we introduce penalized spline smoothing in the two models considered, i.e. the smoothing model and the linear mixed model. We then suggest two resulting bootstrap procedures. Before providing simulations, we propose some small sample adjustment to improve the performance of the bootstrap routine. The bootstrap is then applied in Section 3 to two data examples making use of additive models. In Section 4 we employ the bootstrap in testing for nonparametric and semiparametric models, which shows the applicability of our suggestions in more complicated regression settings.
2 Penalized Spline Smoothing
2.1 Estimation
We consider the smoothing model

y_i | x_i = µ(x_i) + ε_i

with ε_i ∼ N(0, σ²_ε) as independent errors. Function µ(x) is assumed to be smooth but
otherwise unspecified. Following the idea of penalized spline smoothing sketched in Section 1, we approximate µ(x) by µ(x_i) = C(x_i)θ + δ(x_i), where C(x_i) is a high dimensional
basis chosen in advance. In this form, δ(x) denotes the approximation bias of the spline
basis in C(x). If C(x) is chosen as a sufficiently flexible basis, δ(x) does not contain
relevant information and will therefore be dropped subsequently. This means we assume
the function µ(x) to be representable by a high dimensional parametric form C(x)θ. It
is convenient to decompose C(x) into a low dimensional part X and a high dimensional
component Z (see Ruppert, Wand, and Carroll, 2003). For instance X = (1, x, …, x^p) can contain a low dimensional polynomial form, while Z is a truncated polynomial basis Z = ((x − τ_1)^p_+, …, (x − τ_K)^p_+), where (x)^p_+ = x^p for x > 0 and zero otherwise. Following
Ruppert (2002), we choose K large but less than the sample size n (or n − p − 1). As a
practical choice, we suggest K = min(n/4, 40). Alternatively, one may use the selection
routine suggested in Ruppert (2002), but to keep the approach simple we fix K with the
above rule of thumb. Once K is chosen, we select the knots τ_k to cover the range of x values using quantiles. This formulation brings us to the parametric model

Y | x, u ∼ N(Xβ + Zu, σ²_ε I)    (1)
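
The following Python sketch (our construction, following the rule of thumb stated above) builds the design pieces X and Z with quantile-based knots:

```python
import numpy as np

def truncated_poly_basis(x, p=1, K=None):
    """Build X = (1, x, ..., x^p) and Z = ((x - tau_1)_+^p, ..., (x - tau_K)_+^p).

    K defaults to the rule of thumb K = min(n/4, 40); the knots tau_k are
    interior quantiles of x, covering its range.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if K is None:
        K = int(min(n / 4, 40))
    taus = np.quantile(x, np.linspace(0, 1, K + 2)[1:-1])  # interior quantile knots
    X = np.vander(x, p + 1, increasing=True)               # columns 1, x, ..., x^p
    Z = np.maximum(x[:, None] - taus[None, :], 0.0) ** p   # truncated polynomials
    return X, Z, taus
```
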

References

de Boor, C. A Practical Guide to Splines. Springer.

Efron, B. and Tibshirani, R. J. An Introduction to the Bootstrap. Chapman & Hall.

Hastie, T. J. and Tibshirani, R. J. Generalized Additive Models. Chapman & Hall.

Wahba, G. Spline Models for Observational Data. SIAM.