Topic
Statistical learning theory
About: Statistical learning theory is a research topic. Over its lifetime, 1618 publications have been published within this topic, receiving 158033 citations.
Papers published on a yearly basis
Papers
01 Jul 2016
TL;DR: An up-to-date overview of research on complexity measure techniques in GP learning, covering methods based on information theory (such as the Bayesian Information Criterion), methods based on statistical machine learning theory and generalization error bounds, and methods based on structural complexity.
Abstract: Model complexity of Genetic Programming (GP) as a learning machine is currently attracting considerable interest from the research community. Here we provide an up-to-date overview of the research concerning complexity measure techniques in GP learning. The scope of this review includes methods based on information theory techniques, such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC); those based on statistical machine learning theory and generalization error bounds, namely Vapnik-Chervonenkis (VC) theory; and some based on structural complexity. The research contributions from each of these are systematically summarized and compared, allowing us to clearly define existing research challenges and to highlight promising new research directions. The findings of this review provide valuable insights into the current GP literature and are a good source for anyone interested in research on model complexity and applying statistical learning theory to GP.
25 citations
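The information-theoretic criteria the review covers have simple closed forms: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of parameters, n the sample size, and L the maximized likelihood. A minimal sketch of how they trade fit against complexity, using hypothetical log-likelihood values (not from any paper above):

```python
import numpy as np

def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * np.log(n) - 2 * log_likelihood

# Hypothetical fits of two candidate models to n = 100 samples:
# the larger model fits slightly better but pays a complexity penalty.
n = 100
models = {"simple": (-120.0, 3), "complex": (-115.0, 10)}
for name, (logL, k) in models.items():
    print(name, "AIC:", aic(logL, k), "BIC:", bic(logL, k, n))
```

Lower scores are preferred; BIC's ln n factor penalizes extra parameters more heavily than AIC once n exceeds e² ≈ 7.4.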
••
06 Dec 1999
TL;DR: It is shown that the expectation of the generalization error in the unidentifiable cases is larger than what is given by the usual asymptotic theory, and dependent on the rank of the target function.
Abstract: The statistical asymptotic theory is often used in theoretical results in computational and statistical learning theory. It describes the limiting distribution of the maximum likelihood estimator (MLE) as a normal distribution. However, in layered models such as neural networks, the regularity condition of the asymptotic theory is not necessarily satisfied. The true parameter is not identifiable if the target function can be realized by a network of smaller size than the model. Little has been known about the behavior of the MLE in these cases of neural networks. In this paper, we analyze the expectation of the generalization error of three-layer linear neural networks and elucidate a strange behavior in unidentifiable cases. We show that the expectation of the generalization error in the unidentifiable cases is larger than what is given by the usual asymptotic theory, and depends on the rank of the target function.
24 citations
TL;DR: The input and weight Hessians are used to quantify a network's ability to generalize to unseen data, and to show how the generalization capability can be controlled during training through the learning rate, batch size, and number of training iterations.
24 citations
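The weight Hessian of the loss is a standard sharpness proxy for generalization studies. A generic numerical sketch (not the paper's method) computing it by finite differences for a linear model, where the largest eigenvalue serves as a sharpness measure:

```python
import numpy as np

def loss(w, X, y):
    # Squared-error loss of a linear model; stands in for a network loss.
    return 0.5 * np.mean((X @ w - y) ** 2)

def hessian_fd(f, w, eps=1e-4):
    """Finite-difference weight Hessian of a scalar loss f at w."""
    d = w.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            e_i = np.zeros(d); e_i[i] = eps
            e_j = np.zeros(d); e_j[j] = eps
            # Mixed second difference approximates d^2 f / dw_i dw_j.
            H[i, j] = (f(w + e_i + e_j) - f(w + e_i)
                       - f(w + e_j) + f(w)) / eps**2
    return H

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
H = hessian_fd(lambda v: loss(v, X, y), w)
sharpness = np.linalg.eigvalsh(H).max()  # larger values ~ sharper minimum
```

For this quadratic loss the Hessian is exactly XᵀX/n, so the finite-difference result can be checked directly; for a real network one would use automatic differentiation instead.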
TL;DR: This paper introduces a new classifier design method based on a kernel extension of the classical Ho-Kashyap procedure that leads to robustness against outliers and a better approximation of the misclassification error.
Abstract: This paper introduces a new classifier design method based on a kernel extension of the classical Ho-Kashyap procedure. The proposed method uses an approximation of the absolute error rather than the squared error to design a classifier, which leads to robustness against outliers and a better approximation of the misclassification error. Additionally, easy control of the generalization ability is obtained using the structural risk minimization induction principle from statistical learning theory. Finally, examples are given to demonstrate the validity of the introduced method.
24 citations
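For context, the classical Ho-Kashyap procedure that the paper extends iteratively solves Yw = b with a positive margin vector b, increasing only the margins that are already exceeded. A minimal sketch of the classical linear version on toy data (the paper's kernel and absolute-error extensions are not reproduced here):

```python
import numpy as np

def ho_kashyap(X, y, rho=0.5, n_iter=200):
    """Classical Ho-Kashyap: find w with Y w = b, b > 0, for labels y in {-1, +1}."""
    # Augment with a bias column and negate class -1 rows ("normalized" patterns).
    Y = np.hstack([X, np.ones((X.shape[0], 1))]) * y[:, None]
    Y_pinv = np.linalg.pinv(Y)
    b = np.ones(Y.shape[0])
    w = Y_pinv @ b
    for _ in range(n_iter):
        e = Y @ w - b
        b = b + rho * (e + np.abs(e))  # raise only the positive errors
        w = Y_pinv @ b                 # least-squares refit to the new margins
    return w

# Toy linearly separable problem.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = ho_kashyap(X, y)
preds = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```

The classical version minimizes a squared error; the paper's contribution is to replace this with an approximate absolute error and a kernel formulation for robustness against outliers.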
29 Mar 1999
TL;DR: It is shown that the 2ε² term under the exponential of the deviation bound can be replaced by the corresponding Cramér transform, as suggested by large deviations theorems, and that these theoretical results can lead to practical estimates of the effective VC dimension of learning structures.
Abstract: Vapnik-Chervonenkis (VC) bounds play an important role in statistical learning theory, as they are the fundamental result explaining the generalization ability of learning machines. There has been substantial mathematical work over the years on improving the VC rates of convergence of empirical means to their expectations. The result obtained by Talagrand in 1994 seems to provide more or less the final word on this issue as far as universal bounds are concerned. For fixed distributions, however, this bound can be outperformed in practice. We show that it is indeed possible to replace the 2ε² term under the exponential of the deviation bound by the corresponding Cramér transform, as given by large deviations theorems. We then formulate rigorous distribution-sensitive VC bounds and explain why these theoretical results can lead to practical estimates of the effective VC dimension of learning structures.
24 citations
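The substitution the abstract describes can be made concrete. Assuming the standard Hoeffding-type form of the deviation term (a generic sketch, not the paper's exact statement), the universal bound and its distribution-sensitive refinement read:

```latex
% Universal (Hoeffding-type) deviation term:
\Pr\left( \left| \tfrac{1}{n}\sum_{i=1}^{n} f(X_i) - \mathbb{E}f \right| > \varepsilon \right)
  \le 2 \exp\left( -2 n \varepsilon^2 \right)

% Distribution-sensitive replacement via the Cram\'er transform:
\Lambda^{*}(\varepsilon)
  = \sup_{\lambda} \left[ \lambda \varepsilon
      - \log \mathbb{E}\, e^{\lambda \left( f(X) - \mathbb{E}f \right)} \right],
\qquad
\Pr(\cdot) \le 2 \exp\left( -n\, \Lambda^{*}(\varepsilon) \right)
```

Since Λ*(ε) ≥ 2ε² for bounded f taking values in an interval of length 1 (by Hoeffding's lemma), the Cramér-transform exponent can only tighten the bound for a fixed distribution.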