Home
/
Authors
/
Grace Wahba

Author

Grace Wahba

Other affiliations: University of Missouri, Stanford University

Bio: Grace Wahba is an academic researcher from University of Wisconsin-Madison. The author has contributed to research in topics: Smoothing spline & Smoothing. The author has an hindex of 58, co-authored 184 publications receiving 28593 citations. Previous affiliations of Grace Wahba include University of Missouri & Stanford University.

Papers published on a yearly basis

2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1993
1992
1991
1990
1989
1988
1987
1986
1985
1984
1983
1982
1981
1980
1979
1978
1977
1976
1975
1974
1973
1971
1970
1969
1968
1965

Papers

PDF

Open Access

More filters

Book•

Spline models for observational data

[...]

Grace Wahba

01 Mar 1990

TL;DR: In this paper, a theory and practice for the estimation of functions from noisy data on functionals is developed, where convergence properties, data based smoothing parameter selection, confidence intervals, and numerical methods are established which are appropriate to a number of problems within this framework.

...read moreread less

Abstract: This book serves well as an introduction into the more theoretical aspects of the use of spline models. It develops a theory and practice for the estimation of functions from noisy data on functionals. The simplest example is the estimation of a smooth curve, given noisy observations on a finite number of its values. Convergence properties, data based smoothing parameter selection, confidence intervals, and numerical methods are established which are appropriate to a number of problems within this framework. Methods for including side conditions and other prior information in solving ill posed inverse problems are provided. Data which involves samples of random variables with Gaussian, Poisson, binomial, and other distributions are treated in a unified optimization context. Experimental design questions, i.e., which functionals should be observed, are studied in a general context. Extensions to distributed parameter system identification problems are made by considering implicitly defined functionals.

...read moreread less

6,120 citations

Journal Article•DOI•

Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter

[...]

Gene H. Golub¹, Michael T. Heath², Grace Wahba³•Institutions (3)

Stanford University¹, Oak Ridge National Laboratory², University of Wisconsin-Madison³

01 May 1979-Technometrics

TL;DR: The generalized cross-validation (GCV) method as discussed by the authors is a generalized version of Allen's PRESS, which can be used in subset selection and singular value truncation, and even to choose from among mixtures of these methods.

...read moreread less

Abstract: Consider the ridge estimate (λ) for β in the model unknown, (λ) = (X T X + nλI)−1 X T y. We study the method of generalized cross-validation (GCV) for choosing a good value for λ from the data. The estimate is the minimizer of V(λ) given by where A(λ) = X(X T X + nλI)−1 X T . This estimate is a rotation-invariant version of Allen's PRESS, or ordinary cross-validation. This estimate behaves like a risk improvement estimator, but does not require an estimate of σ2, so can be used when n − p is small, or even if p ≥ 2 n in certain cases. The GCV method can also be used in subset selection and singular value truncation methods for regression, and even to choose from among mixtures of these methods.

...read moreread less

3,697 citations

Journal Article•DOI•

Smoothing Noisy Data with Spline Functions Estimating the Correct Degree of Smoothing by the Method of Generalized Cross-Validation*

[...]

Peter Craven, Grace Wahba

01 Dec 1978-Numerische Mathematik

TL;DR: In this paper, a method for estimating the optimum amount of smoothing from the data is presented, based on smoothing splines, which is well known to provide nice curves which smooth discrete, noisy data.

...read moreread less

2,799 citations

Journal Article•DOI•

Smoothing noisy data with spline functions

[...]

Grace Wahba¹•Institutions (1)

Stanford University¹

01 Oct 1975-Numerische Mathematik

TL;DR: In this article, a generalized cross-validation estimate for smoothing polynomial splines is proposed, where the tradeoff between the "roughness" of the solution, as measured by the average square error of the smoothing spline, is defined.

...read moreread less

Abstract: Smoothing splines are well known to provide nice curves which smooth discrete, noisy data. We obtain a practical, effective method for estimating the optimum amount of smoothing from the data. Derivatives can be estimated from the data by differentiating the resulting (nearly) optimally smoothed spline. We consider the modely i (t i )+? i ,i=1, 2, ...,n,t i?[0, 1], whereg?W 2 (m) ={f:f,f?, ...,f (m?1) abs. cont.,f (m)??2[0,1]}, and the {? i } are random errors withE? i =0,E? i ? j =?2? ij . The error variance ?2 may be unknown. As an estimate ofg we take the solutiong n, ? to the problem: Findf?W 2 (m) to minimize $$\frac{1}{n}\sum\limits_{j = 1}^n {(f(t_j ) - y_j )^2 + \lambda \int\limits_0^1 {(f^{(m)} (u))^2 du} }$$ . The functiong n, ? is a smoothing polynomial spline of degree 2m?1. The parameter ? controls the tradeoff between the "roughness" of the solution, as measured by $$\int\limits_0^1 {[f^{(m)} (u)]^2 du}$$ , and the infidelity to the data as measured by $$\frac{1}{n}\sum\limits_{j = 1}^n {(f(t_j ) - y_j )^2 }$$ , and so governs the average square errorR(?; g)=R(?) defined by $$R(\lambda ) = \frac{1}{n}\sum\limits_{j = 1}^n {(g_{n,\lambda } (t_j ) - g(t_j ))^2 }$$ . We provide an estimate $$\hat \lambda$$ , called the generalized cross-validation estimate, for the minimizer ofR(?). The estimate $$\hat \lambda$$ is the minimizer ofV(?) defined by $$V(\lambda ) = \frac{1}{n}\parallel (I - A(\lambda ))y\parallel ^2 /\left[ {\frac{1}{n}{\text{Trace(}}I - A(\lambda ))} \right]^2$$ , wherey=(y 1, ...,y n)t andA(?) is then×n matrix satisfying(g n, ? (t 1), ...,g n, ? (t n))t=A (?) y. We prove that there exist a sequence of minimizers $$\tilde \lambda = \tilde \lambda (n)$$ ofEV(?), such that as the (regular) mesh{t i} i=1 n becomes finer, $$\mathop {\lim }\limits_{n \to \infty } ER(\tilde \lambda )/\mathop {\min }\limits_\lambda ER(\lambda ) \downarrow 1$$ . A Monte Carlo experiment with several smoothg's was tried withm=2,n=50 and several values of ?2, and typical values of $$R(\hat \lambda )/\mathop {\min }\limits_\lambda R(\lambda )$$ were found to be in the range 1.01---1.4. The derivativeg? ofg can be estimated by $$g'_{n,\hat \lambda } (t)$$ . In the Monte Carlo examples tried, the minimizer of $$R_D (\lambda ) = \frac{1}{n}\sum\limits_{j = 1}^n {(g'_{n,\lambda } (t_j ) - } g'(t_j ))$$ tended to be close to the minimizer ofR(?), so that $$\hat \lambda$$ was also a good value of the smoothing parameter for estimating the derivative.

...read moreread less

1,735 citations

Journal Article•DOI•

Some results on Tchebycheffian spline functions

[...]

George Kimeldorf¹, Grace Wahba²•Institutions (2)

Florida State University¹, University of Wisconsin-Madison²

01 Jan 1971-Journal of Mathematical Analysis and Applications

TL;DR: This article derived explicit solutions to problems involving Tchebycheffian spline functions using a reproducing kernel Hilbert space which depends on the smoothness criterion, but not on the form of the data, to solve explicitly Hermite-Birkhoff interpolation and smoothing problems.

...read moreread less

1,365 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37

Collapse

Cited by

PDF

Open Access

More filters

Book•

Neural networks for pattern recognition

[...]

Christopher M. Bishop¹•Institutions (1)

Aston University¹

01 Jan 1995

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.

...read moreread less

Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimalization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

...read moreread less

19,056 citations

Report•DOI•

A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix

[...]

Whitney K. Newey, Kenneth D. West

01 May 1987-Econometrica

TL;DR: In this article, a simple method of calculating a heteroskedasticity and autocorrelation consistent covariance matrix that is positive semi-definite by construction is described.

...read moreread less

Abstract: This paper describes a simple method of calculating a heteroskedasticity and autocorrelation consistent covariance matrix that is positive semi-definite by construction. It also establishes consistency of the estimated covariance matrix under fairly general conditions.

...read moreread less

18,117 citations

Journal Article•DOI•

Deep learning in neural networks

[...]

Jürgen Schmidhuber¹•Institutions (1)

University of Lugano¹

01 Jan 2015-Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

...read moreread less

14,635 citations

Journal Article•DOI•

Bayesian measures of model complexity and fit

[...]

David Spiegelhalter¹, Nicola G. Best², Bradley P. Carlin³, Angelika van der Linde⁴•Institutions (4)

Medical Research Council¹, Imperial College London², University of Minnesota³, University of Bremen⁴

01 Oct 2002-Journal of The Royal Statistical Society Series B-statistical Methodology

TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined and derive a measure pD for the effective number in a model as the difference between the posterior mean of the deviances and the deviance at the posterior means of the parameters of interest, which is related to other information criteria and has an approximate decision theoretic justification.

...read moreread less

Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.

...read moreread less

11,691 citations

Book•

Gaussian Processes for Machine Learning

[...]

Carl Edward Rasmussen¹, Christopher Williams•Institutions (1)

Max Planck Society¹

23 Nov 2005

TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.

...read moreread less

Abstract: A comprehensive and self-contained introduction to Gaussian processes, which provide a principled, practical, probabilistic approach to learning in kernel machines. Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.

...read moreread less

11,357 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse