

A comparison of different nonparametric methods for inference on additive models

01 Jan 2005-Journal of Nonparametric Statistics (Taylor & Francis)-Vol. 17, Iss: 1, pp 57-81

Abstract: In this article, we highlight the main differences of available methods for the analysis of regression functions that are possibly additively separable. We first discuss the definition and interpretation of the most common estimators in practice, explaining the different ideas of modeling behind each estimator as well as what the procedures do to the data. Computational aspects are mentioned explicitly. The discussion concludes with a simulation study on the mean squared error for different marginal integration approaches. Next, various test statistics for checking additive separability are introduced and accompanied by asymptotic theory. For the various statistics and the different smoothing and bootstrap methods, we perform a detailed simulation study. A main focus in the reported results is the (non-)reliability of the methods when the covariates are strongly correlated among themselves. We found that the most striking differences lie in the different pre-smoothers that are used, but less in the different constructions of test statistics.


Summary

1 Introduction

  • In the last ten years additive models have attracted an increasing amount of interest in nonparametric statistics.
  • A consequence could be to prefer marginal integration for the construction of additivity tests.
  • Further, Dette and Munk (1998) pointed out several drawbacks in the application of Fourier series estimation for checking model assumptions.
  • Therefore, the present article is mainly concerned with the practical performance of the different procedures and with a better understanding of some of the above-mentioned problems in estimating and testing.
  • The authors will investigate and explain that the use of the internalized Nadaraya–Watson estimator for the marginal integration can partly ameliorate this problem.

2 Marginal Integration and Additive Models

  • Further, m, σ are unknown functions and the regression function m(·) has to be estimated nonparametrically.
  • Notice first that in case of additivity, i.e. there exist functions mα, m−α such that m(X) = mα(Xα) + m−α(X−α) (2.2) with X−α being the vector X without the component Xα, the marginal impact of Xα corresponds exactly to the additive component mα.
  • The authors estimate the right hand side of equation (2.3) by replacing the expectation by an average and the unknown multidimensional regression function m by a pre-smoother m̃.

2.1 Formal Definition

  • Theory has always been derived for kernel estimators [note that the same happened to the backfitting (Opsomer and Ruppert 1997, Mammen, Linton and Nielsen 1999)].
  • Therefore the authors will concentrate only on the kernel based definitions even though spline implementation is known to be computationally more advantageous.
  • The authors first give the definition of the classic marginal integration method (CMIE).
  • The modification giving us the internalized marginal integration estimate (IMIE) concerns the definition of m̂, equation (2.7), where f̂(xα, Xk,−α) is substituted by f̂(Xjα, Xj,−α), see Jones, Davies and Park (1994) or Kim, Linton and Hengartner (2000) for details.
  • Notice that the fraction before Yj in (2.11) is the inverse of the conditional density fα|−α(Xα|X−α).

2.2 On a Better Understanding of Marginal Integration

  • Although Sperlich, Linton and Härdle (1999) already emphasized the differences between backfitting and marginal integration, they are often still interpreted as competing estimators for the same aim.
  • For a better understanding of the difference between orthogonal projection into the additive space and measuring the marginal impact (marginal integration) the authors give two more examples.
  • Obvious advantages of the IMIE are the possibility of changing the sums and getting rid of the xα in the density estimates, see (2.11).
  • Camlong-Viot (2000) chose this for the simpler theoretical analysis of this estimator, while Hengartner (1996) showed that the bandwidth conditions for the nuisance corrections depend only on the smoothness of the densities but not, as for the CMIE, on the smoothness of their component functions.
  • Differences in the finite sample performance have not been investigated so far.

2.3 Some Simulation Results

  • Since asymptotically all methods are consistent, differences and problems can better be observed for small samples.
  • The bandwidths were chosen from h = 0.25std(X) to 1.6std(X) where std(X) is the vector of the empirical standard deviations of the particular design.
  • For the calculation of the CV values the authors used the same trimming at 1.96 (tr5) and 1.645 (tr10), respectively, in each direction.
  • Due to the sparseness of data the IMIE substantially outperforms the other methods, especially in the presence of high correlation.
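The bandwidth search described in these bullets (a grid from 0.25·std(X) to 1.6·std(X), cross-validation values computed with trimming) can be sketched in a one-dimensional toy version. The model, the Nadaraya–Watson smoother and the leave-one-out criterion below are our own illustrative choices, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = rng.standard_normal(n)
y = np.sin(2 * x) + 0.3 * rng.standard_normal(n)

def loo_cv(h, trim=1.96):
    # leave-one-out CV error of a Nadaraya-Watson fit (Gaussian kernel);
    # the error is only accumulated on the trimmed, central design points
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)                 # leave the i-th point out
    yhat = (w @ y) / w.sum(axis=1)
    keep = np.abs(x) <= trim * x.std()       # trimming at 1.96 (tr5)
    return np.mean((y[keep] - yhat[keep]) ** 2)

# grid from 0.25*std(X) to 1.6*std(X), as in the paper's simulations
grid = np.linspace(0.25, 1.6, 10) * x.std()
h_cv = grid[np.argmin([loo_cv(h) for h in grid])]
print(h_cv)
```

For the rapidly oscillating sin(2x) target the criterion should pick a bandwidth near the lower end of the grid; for smoother targets it moves up.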

3 Testing Additivity

  • In this section the authors investigate several tests but will only concentrate on statistics based on residuals from an internal marginal integration fit.
  • The authors prove asymptotic normality of the corresponding test statistics under the null hypothesis of additivity and fixed alternatives with different rates of convergence corresponding to both cases.
  • In the following section the authors investigate the asymptotic behavior of these statistics under the null hypothesis and fixed alternatives.

3.1 Theoretical Results

  • The authors assume that the following assumptions are satisfied.
  • The authors will come back to this point in the next section.
  • Note further that Gozalo and Linton (2000) and Dette and von Lieres (2000) considered weight functions in the definition of the corresponding test statistics based on residuals from the classical marginal integration fit.
  • Obviously, which test has more power depends on many factors such as the density of the covariates, the kernel choice, the error variance function, and the functional ∆ = m − m0.
  • T4n might give more reliable results in such cases.

4 Simulation Comparison of Additivity Tests

  • In this section the authors continue the considerations of the last part of Section 2 but extend them to the various tests for checking additivity.
  • The authors concentrate especially on the differences caused by the use of different pre-smoothers, i.e. they compare the CMIE with the IMIE, but certainly also consider differences among the statistics T1n to T4n.
  • Finally, the authors compare the difference in performance between tests using the bootstrap based on residuals taken from Y − m̂0 (B0), as e.g. Gozalo and Linton (2000) or Härdle and Mammen (1993), versus bootstrap based on residuals taken from Y − m̂ (B1) as e.g. Dette and von Lieres (2000).
  • The authors always took the bandwidths minimizing the average of the CV values for trimming tr5 and covariance Σ2.
  • Due to computational restrictions the authors did the simulations only for 500 bootstrap replications.
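The bootstrap scheme B0 described above (residuals taken from Y − m̂0, i.e. from the fit under the null hypothesis) can be sketched for a generic distance-based statistic. The statistic below is a simplified stand-in for T1n–T4n, the additive fit is a crude sum of one-dimensional smooths rather than a marginal integration fit, and all names are our own; the sketch only illustrates the resampling logic:

```python
import numpy as np

rng = np.random.default_rng(2)

def nw(x_query, x_data, y_data, h):
    # Nadaraya-Watson regression with a product Gaussian kernel
    d2 = ((x_query[:, None, :] - x_data[None, :, :]) / h) ** 2
    w = np.exp(-0.5 * d2.sum(axis=2))
    return (w @ y_data) / w.sum(axis=1)

def additivity_pvalue(X, Y, h=0.4, n_boot=99):
    n, d = X.shape
    def stat(y):
        c = y.mean()
        # crude additive fit under H0: constant plus one-dimensional smooths
        # (a stand-in for the marginal integration fit used in the paper)
        m0 = c + sum(nw(X[:, [k]], X[:, [k]], y - c, h) for k in range(d))
        m_full = nw(X, X, y, h)                 # unrestricted d-dimensional fit
        return np.mean((m_full - m0) ** 2), m0
    T, m0 = stat(Y)
    e = Y - m0                                  # residuals under H0 (scheme B0)
    exceed = 0
    for _ in range(n_boot):
        v = rng.choice([-1.0, 1.0], size=n)     # Rademacher wild-bootstrap weights
        T_star, _ = stat(m0 + e * v)
        exceed += T_star >= T
    return (1 + exceed) / (1 + n_boot)

n = 300
X = rng.standard_normal((n, 2))
eps = 0.5 * rng.standard_normal(n)
Y_null = np.sin(X[:, 0]) + np.sin(X[:, 1]) + eps     # additive: H0 holds
Y_alt = Y_null + 2.0 * X[:, 0] * X[:, 1]             # interaction: H0 violated
print(additivity_pvalue(X, Y_null), additivity_pvalue(X, Y_alt))
```

Under a strong interaction alternative the observed statistic should land far in the tail of the bootstrap distribution, so the p-value becomes small; the Rademacher weights are one common wild-bootstrap choice that preserves the conditional residual variance.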

4.1 The case d = 2

  • Finally, since results for the test statistic T4 depend strongly on the choice of bandwidth g, the authors tried out various bandwidths and report the results for 0.1std(X) (g1), and 0.2std(X) (g2).
  • Note that for the ease of presentation all tables will have the same structure.
  • In the left part of each Table the results are given under the null hypothesis of additivity, i.e. for scalar a = 0.0; in the right part the authors present results under some alternative (a = 1.0).
  • Tables for independent and correlated designs are separated.
  • For these reasons all results presented here and in the following are based on bootstrap taking residuals under the null hypothesis.

4.2 The case d = 3

  • As for estimation, also for testing the results change significantly when the authors increase the dimension of the model.
  • Thus, a power statement or comparison would not make much sense.
  • The authors restrict themselves to some remarks.
  • For sample sizes bigger than n = 150, the simulations with the CMIE took about 10 times longer than with the IMIE (measured in days).
  • The authors turn to highly correlated designs, i.e. using Σ3.


A comparison of different nonparametric methods for inference on additive models

Holger Dette
Ruhr-Universität Bochum
Fakultät für Mathematik
D - 44780 Bochum, Germany

Carsten von Lieres und Wilkau
Ruhr-Universität Bochum
Fakultät für Mathematik
D - 44780 Bochum, Germany

Stefan Sperlich
Universidad Carlos III de Madrid
Departamento de Estadística y Econometría
E - 28903 Getafe, Spain

May 10, 2001
Abstract
In this article we highlight the main differences of available methods for the analysis of regression functions that are possibly additively separable. We first discuss the definition and interpretation of the most common estimators in practice. This is done by explaining the different ideas of modeling behind each estimator as well as what the procedures do to the data. Computational aspects are mentioned explicitly. The illustrated discussion concludes with a simulation study on the mean squared error for different marginal integration approaches. Next, various test statistics for checking additive separability are introduced and accompanied by asymptotic theory. Based on the asymptotic results under the hypothesis as well as under the alternative of non-additivity we compare the tests in a brief discussion. For the various statistics and the different smoothing and bootstrap methods we perform a detailed simulation study. A main focus in the reported results is the (non-)reliability of the methods when the covariates are strongly correlated among themselves. A further point of comparison are the computational aspects. We found that the most striking differences lie in the different pre-smoothers that are used, but less in the different constructions of test statistics. Moreover, although some of the observed differences are strong, they surprisingly cannot be revealed by asymptotic theory.
AMS Subject Classification: 62G07, 62G10
Keywords: marginal integration, additive models, test of additivity.
Acknowledgements: This research was financially supported by the Spanish “Dirección General de Enseñanza Superior” (DGES), reference number PB98-0025, and the Deutsche Forschungsgemeinschaft (SFB 475: Komplexitätsreduktion in multivariaten Datenstrukturen, Teilprojekt A2; Sachbeihilfe: Validierung von Hypothesen, De 502/9-1). Parts of this paper were written while the first author was visiting Purdue University and this author would like to thank the Department of Statistics for its hospitality.

1 Introduction
In the last ten years additive models have attracted an increasing amount of interest in nonparametric
statistics. Also in the econometric literature these methods have a long history and are widely used
today in both, theoretical considerations and empirical research. Deaton and M¨ullbauer (1980) pro-
vided many examples in microeconomics where the additive structure follows from economic theory
of separable decision making like two step budgeting or optimization. Furthermore, additivity is the
natural structure when production processes have independent substitution rates for separable goods.
In statistics, additivity leads to the circumvention of the curse of dimensionality (see Stone 1985) that
usually affects multidimensional nonparametric regression.
The most common and best known nonparametric estimation approaches in these models can be
divided into three main groups: the backfitting (see Buja, Hastie and Tibshirani 1989, or Hastie and
Tibshirani 1990 for algorithms, and Opsomer and Ruppert 1997 or Mammen, Linton and Nielsen 1999),
series estimators (see Andrews and Whang 1990 or Li 2000), and the marginal integration estimator
(see Tjøstheim and Auestad 1994, Linton and Nielsen 1995, and also Kim, Linton, Hengartner 2000
for an important modification). Certainly, we have mentioned here only the main references for the basic ideas and theory. Among them, to our knowledge, the series estimator has so far not been explored in practice, i.e. although straightforward implementation and good performance are claimed, we could not find a simulation study or an application of this method. Moreover, usually hardly feasible assumptions are made on the series and its “smoothing parameters”, e.g. reducing bias and variance simultaneously, but without giving a correct idea of how to choose them in practice. The backfitting of Buja, Hastie and
Tibshirani (1989) is maybe the most studied additive model estimator in practice, and algorithms are
developed for various regression problems. However, the backfitting version of Mammen, Linton and Nielsen (1999), for which closed theory is provided but no Monte-Carlo studies, differs considerably in definition and implementation from that one. The marginal integration, finally, has experienced the most extensions in theory but has actually a quite different interpretation than the aforementioned estimators. This was first theoretically highlighted by Nielsen and Linton (1997) and empirically investigated by Sperlich, Linton and Härdle (1999) in a detailed simulation study. The main point is that backfitting, at least the
version of Mammen et al. (1999), and series estimators are orthogonal projections of the regression
into the additive space whereas the marginal integration estimator always estimates the marginal
impact of the explanatory variables taking into account possible correlation among them. This led
Pinske (2000) to the interpretation of the marginal integration estimator as a consistent estimator of
weak separable components, which, in the case of additivity, coincide with the additive components.
From this it can be expected that the distance between the real regression function and its estimate
increases especially fast when the data generating regression function is not additive but estimated by
the sum of component estimates obtained from marginal integration instead of backfitting or series
estimates. A consequence could be to prefer marginal integration for the construction of additivity
tests. Nevertheless, until now backfitting was not used for testing simply because of the lack of theory
for the estimator.
Due to the mentioned econometric results and statistical advantages there is an increasing interest in
testing the additive structure. Eubank, Hart, Simpson and Stefanski (1995) constructed such a test but
used special series estimates that apply only on data observed on a grid. Gozalo and Linton (2000) as
well as Sperlich, Tjøstheim and Yang (2000) introduced a bootstrap based additivity test applying the
marginal integration. Here, Sperlich et al. (2000) concentrated on the analysis of particular interaction
terms rather than on general separability. Finally, Dette and von Lieres (2000) have summarized the
test statistics considered by Gozalo and Linton (2000) and compared them theoretically and also in a
small simulation study. Their motivation for using the marginal integration was its direct definition
which allows an asymptotic treatment of the test statistics using central limit theorems for degenerate
U-statistics. They argued that such an approach based on backfitting seems to be intractable, because
their asymptotic analysis does not require the asymptotic properties of the estimators as e.g. derived
by Mammen, Linton and Nielsen (1999) but an explicit representation of the residuals. Further, Dette
and Munk (1998) pointed out several drawbacks in the application of Fourier series estimation for
checking model assumptions. For these and the former mentioned reasons we do not consider series
estimators for the construction of tests for additivity in this paper.
For the empirical researcher it would be of essential interest how the different methods perform in
finite samples and which method should be preferred. Therefore, the present article is mainly concerned with the practical performance of the different procedures and with a better understanding of some of the above-mentioned problems in estimating and testing. The main part studies the performance, feasibility and technical differences of estimation and testing procedures based on different
estimators. We concentrate especially on the differences caused by the use of different (pre-)smoothers
in marginal integration, in particular on the classic approach of Linton and Nielsen (1995) and on the
internalized Nadaraya–Watson estimator (Jones, Davies and Park 1994) as suggested by Kim, Linton
and Hengartner (2000). Notice that this study is not intended as an illustration of the general statement
of consistency and convergence. Our main interest is directed to the investigation and comparison of
finite sample behavior of these procedures.
The marginal integration estimator becomes inefficient with increasing correlation in the regressors,
see Linton (1997). He suggested combining the marginal integration with a one-step backfitting
afterwards to reach efficiency. Unfortunately, this combination destroys any interpretability of the
estimate when the additivity assumption is violated. The same loss of efficiency was also observed
in a simulation study by Sperlich, Linton and Härdle (1999) for the backfitting estimator, although
these results do not reflect the asymptotic theory. In their article it is further demonstrated that with
increasing dimension the additive components are still estimated with a reasonable precision, whereas
the estimation of the regression function becomes problematic. This fact could cause problems for
prediction and for bootstrap tests. We will investigate and explain that the use of the internalized
Nadaraya–Watson estimator for the marginal integration can partly ameliorate this problem. This is
actually not based on theoretical results but more on numerical circumstances regarding the handling
of “poor data areas”. Throughout this paper we will call the classical marginal integration estimator
CMIE, and IMIE the one using the internalized Nadaraya–Watson estimator as multidimensional
pre-smoother.
The rest of the paper is organized as follows. In Section 2 we give the definitions of the analyzed
estimators and some more discussion about their advantages and disadvantages. Finally we provide
some simulation results on the Cross-Validation mean squared errors for the different methods of
estimation. In Section 3 we introduce various test statistics based on the IMIE to check the additivity assumption, present closed-form asymptotic theory and a theoretical comparison. Notice that for the IMIE, at least for testing, little theory has been developed until now and hardly any empirical studies exist. Therefore we provide both in this work: an extensive simulation study as well as a closed theory about the asymptotic properties of any new estimator and test we consider. Section 4 finally is
dedicated to an intensive simulation study for these test statistics, all using bootstrap methods. The
proofs of the asymptotic results are cumbersome and deferred to the Appendix in Section 5.
2 Marginal Integration and Additive Models
Let us consider the general regression model

Y = m(X) + σ(X)ε (2.1)

where X = (X_1, ..., X_d)^T is a d-dimensional random variable with density f, Y is the real valued response, and ε the error, independent of X with mean 0 and variance 1. Further, m, σ are unknown (smooth) functions and the regression function m(·) has to be estimated nonparametrically. As indicated above the marginal integration estimator is constructed to catch the marginal impact of one or some regressors X_α ∈ IR^{d_α}, d_α < d. For the ease of notation we will restrict ourselves to the case d_α = 1 for all α. Notice first that in case of additivity, i.e. there exist functions m_α, m_{-α} such that

m(X) = m_α(X_α) + m_{-α}(X_{-α}) (2.2)

with X_{-α} being the vector X without the component X_α, the marginal impact of X_α corresponds exactly to the additive component m_α. For identification we set E[m_α(X_α)] = 0 and consequently E[Y] = E[m_{-α}(X_{-α})] = c. The marginal integration estimator is defined noting that

E_{X_{-α}}[m(x_α, X_{-α})] = ∫ m(x_α, x_{-α}) f_{-α}(x_{-α}) dx_{-α} (2.3)
                          = E_{X_{-α}}[m_{-α}(X_{-α}) + m_α(x_α)] = c + m_α(x_α), (2.4)

where f_{-α} denotes the marginal density of X_{-α}, and the second line follows from the first line in the case of additivity, see equation (2.2). So marginal integration yields the function m_α up to a constant that can easily be estimated by the average over the observations Y_i. We estimate the right hand side of equation (2.3) by replacing the expectation by an average and the unknown multidimensional regression function m by a pre-smoother m̃. Certainly, having a completely additive separable model of the form

m(X) = c + Σ_{α=1}^{d} m_α(X_α), (2.5)

this method can be applied to estimate all components m_α, and finally the regression function m is estimated by summing up an estimator ĉ of c with the estimates m̂_α.
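The identity (2.3)–(2.4) is easy to verify numerically for a known additive function: averaging m(x_α, X_{-α}) over draws of X_{-α} recovers m_α(x_α) up to the constant c. A minimal sketch (the toy components m1, m2 are our own choice, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy additive model m(x1, x2) = m1(x1) + m2(x2) with centered components
m1 = lambda x: np.sin(x)        # E[m1(X1)] = 0 for X1 ~ N(0, 1)
m2 = lambda x: x ** 2 - 1.0     # E[m2(X2)] = 0 for X2 ~ N(0, 1)

# Monte Carlo version of E_{X_{-1}}[m(x1, X_{-1})]: average over draws of X2
X2 = rng.standard_normal(200_000)
x1_grid = np.linspace(-1.5, 1.5, 7)
marginal = np.array([np.mean(m1(x) + m2(X2)) for x in x1_grid])

# Under additivity this equals c + m1(x1), and here c = E[m2(X2)] = 0
err = np.max(np.abs(marginal - m1(x1_grid)))
print(err)   # small Monte Carlo error
```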
2.1 Formal Definition
Although the pre-smoother m̃ could be calculated applying any smoothing method, theory has always been derived for kernel estimators [note that the same happened to the backfitting (Opsomer and Ruppert 1997, Mammen, Linton and Nielsen 1999)]. Therefore we will concentrate only on the kernel based definitions even though spline implementation is known to be computationally more advantageous. We first give the definition of the classic marginal integration method (CMIE). Let K_i(·) (i = 1, 2) denote one- and (d-1)-dimensional Lipschitz-continuous kernels of order p and q, respectively, with compact support, and define for a bandwidth h_i > 0, i = 1, 2, t_1 ∈ IR, t_2 ∈ IR^{d-1}

K_{1,h_1}(t_1) = (1/h_1) K_1(t_1/h_1),   K_{2,h_2}(t_2) = (1/h_2^{d-1}) K_2(t_2/h_2). (2.6)

For the sample (X_i, Y_i)_{i=1}^{n}, X_i = (X_{i1}, ..., X_{id})^T the CMIE is defined by

m̂_α(x_α) = (1/n) Σ_{k=1}^{n} m̃(x_α, X_{k,-α})
          = (1/n²) Σ_{k=1}^{n} Σ_{j=1}^{n} K_{1,h_1}(X_{jα} - x_α) K_{2,h_2}(X_{j,-α} - X_{k,-α}) Y_j / f̂(x_α, X_{k,-α}) (2.7)

f̂(x_α, x_{-α}) = (1/n) Σ_{i=1}^{n} K_{1,h_1}(X_{iα} - x_α) K_{2,h_2}(X_{i,-α} - x_{-α}) (2.8)

ĉ = (1/n) Σ_{j=1}^{n} Y_j (2.9)

and X_{i,-α} denotes the vector X_i without the component X_{iα}. Note that f̂ is an estimator of the joint density of X and m̃ denotes the Nadaraya–Watson estimator with kernel K_{1,h_1} · K_{2,h_2}.

The modification giving us the internalized marginal integration estimate (IMIE) concerns the definition of m̂, equation (2.7), where f̂(x_α, X_{k,-α}) is substituted by f̂(X_{jα}, X_{j,-α}), see Jones, Davies and Park (1994) or Kim, Linton and Hengartner (2000) for details. The resulting definition of the IMIE is

m̂^I_α(x_α) = (1/n²) Σ_{k=1}^{n} Σ_{j=1}^{n} K_{1,h_1}(X_{jα} - x_α) K_{2,h_2}(X_{j,-α} - X_{k,-α}) Y_j / f̂(X_{jα}, X_{j,-α}) (2.10)
            = (1/n) Σ_{j=1}^{n} K_{1,h_1}(X_{jα} - x_α) [f̂_{-α}(X_{j,-α}) / f̂(X_{jα}, X_{j,-α})] Y_j, (2.11)

where f̂_{-α} is an estimate of the marginal density f_{-α}. Notice that the fraction before Y_j in (2.11) is the inverse of the conditional density f_{α|-α}(X_α|X_{-α}). It is well known that under the hypothesis of an additive model m̂_α and m̂^I_α are consistent estimates of m_α (α = 1, ..., d) (see Tjøstheim and Auestad, 1994, and Kim, Linton and Hengartner, 2000).
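Definitions (2.7)–(2.11) translate almost line by line into code. The sketch below implements both estimators for d = 2 and α = 1 on simulated additive data; we use Gaussian kernels for convenience (the theory above assumes compact support), and the data-generating model and all names are our own illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def gauss(u):
    # one-dimensional Gaussian kernel, used here in place of the
    # compactly supported kernels assumed by the theory
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

# Data from an additive model, d = 2: Y = m1(X1) + m2(X2) + noise
n = 800
X = rng.standard_normal((n, 2))
Y = np.sin(X[:, 0]) + 0.5 * (X[:, 1] ** 2 - 1.0) + 0.1 * rng.standard_normal(n)
h1, h2 = 0.3, 0.3

def cmie(x_alpha):
    # eq. (2.7): average the Nadaraya-Watson pre-smoother m~(x_alpha, X_{k,-alpha})
    est = 0.0
    for k in range(n):
        w = gauss((X[:, 0] - x_alpha) / h1) * gauss((X[:, 1] - X[k, 1]) / h2)
        est += np.sum(w * Y) / np.sum(w)     # NW ratio; bandwidth factors cancel
    return est / n

def imie(x_alpha):
    # eq. (2.11): weight Y_j by fhat_{-alpha}(X_{j,-alpha}) / fhat(X_j)
    f_marg = np.array([np.mean(gauss((X[:, 1] - X[j, 1]) / h2)) / h2
                       for j in range(n)])
    f_joint = np.array([np.mean(gauss((X[:, 0] - X[j, 0]) / h1)
                                * gauss((X[:, 1] - X[j, 1]) / h2)) / (h1 * h2)
                        for j in range(n)])
    w = gauss((X[:, 0] - x_alpha) / h1) / h1
    return np.mean(w * f_marg / f_joint * Y)

c_hat = Y.mean()                             # eq. (2.9)
print(cmie(0.5) - c_hat, imie(0.5) - c_hat)  # both near m1(0.5) = sin(0.5)
```

Both estimators target c + m_1(x_1), so subtracting ĉ should recover the centered component up to smoothing bias and sampling noise; note that the IMIE avoids evaluating the density estimate at the query point x_α, which is exactly the "changing the sums" advantage mentioned in Section 2.2.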
2.2 On a Better Understanding of Marginal Integration
Although the papers of Nielsen and Linton (1997) and Sperlich, Linton and Härdle (1999) already emphasized the differences between backfitting and marginal integration, they are often still interpreted as competing estimators for the same aim. For a better understanding of the difference between orthogonal projection into the additive space (backfitting) and measuring the marginal impact (marginal integration) we give two more examples.
As has been explained in Stone (1994) and Sperlich, Tjøstheim and Yang (2000), any model can be written in the form

m(x) = c + Σ_{α=1}^{d} m_α(x_α) + Σ_{1≤α<β≤d} m_{αβ}(x_α, x_β) + Σ_{1≤α<β<γ≤d} m_{αβγ}(x_α, x_β, x_γ) + ··· . (2.12)

The latter article, even though it worked out the details only for second-order interactions, showed that all these components can be identified and consistently estimated by marginal integration, obtaining the optimal convergence rate in smoothing. The main reason for this nice property is that definition, algorithm and thus the numerical results for the estimates do not differ whatever the chosen extension or the true model is. This certainly is different for an orthogonal projection. At first we note that so far model (2.12) cannot be estimated by backfitting. Secondly, Stone (1994) gives (formal)

