Author

Sylvain Arlot

Bio: Sylvain Arlot is an academic researcher from the Département de Mathématiques. The author has contributed to research on topics including model selection and estimation. The author has an h-index of 25, has co-authored 58 publications, and has received 6,493 citations. Previous affiliations of Sylvain Arlot include Université Paris-Saclay and École Normale Supérieure.


Papers
Journal Article
TL;DR: This survey intends to relate the model selection performances of cross-validation procedures to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results.
Abstract: Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.

2,980 citations
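
To make the survey's setting concrete, here is a minimal sketch of V-fold cross-validation used to select among candidate estimators (polynomial regressions of different degrees). The data, the squared loss, and all function names are illustrative choices, not material from the paper.

```python
# Minimal sketch of V-fold cross-validation for model selection.
# Everything here (data, candidate models, names) is illustrative.
import numpy as np

def vfold_cv_risk(X, y, fit, predict, V=5, seed=None):
    """Estimate the prediction risk of an estimator by V-fold cross-validation."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), V)
    risks = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        model = fit(X[train_idx], y[train_idx])
        y_hat = predict(model, X[test_idx])
        risks.append(np.mean((y[test_idx] - y_hat) ** 2))  # squared loss on the held-out fold
    return np.mean(risks)

# Example: choose a polynomial degree by minimizing the cross-validated risk.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = np.sin(3 * X) + 0.3 * rng.standard_normal(200)

def make_fit(degree):
    return lambda X_tr, y_tr: np.polyfit(X_tr, y_tr, degree)

predict = lambda coeffs, X_te: np.polyval(coeffs, X_te)

degrees = list(range(1, 11))
cv_risks = [vfold_cv_risk(X, y, make_fit(d), predict, V=5, seed=1) for d in degrees]
print("selected degree:", degrees[int(np.argmin(cv_risks))])
```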

Journal Article
TL;DR: In this paper, a survey on the model selection performances of cross-validation procedures is presented, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results, and guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.
Abstract: Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.

2,720 citations

Journal Article
TL;DR: A completely data-driven calibration algorithm for the multiplicative constants of penalization procedures is proposed in the least-squares regression framework, without assuming a particular shape for the penalty; it is based on the concept of minimal penalty, recently introduced by Birgé and Massart (2007).
Abstract: Penalization procedures often suffer from their dependence on multiplying factors, whose optimal values are either unknown or hard to estimate from data. We propose a completely data-driven calibration algorithm for these parameters in the least-squares regression framework, without assuming a particular shape for the penalty. Our algorithm relies on the concept of minimal penalty, recently introduced by Birgé and Massart (2007) in the context of penalized least squares for Gaussian homoscedastic regression. On the positive side, the minimal penalty can be evaluated from the data themselves, leading to a data-driven estimation of an optimal penalty which can be used in practice; on the negative side, their approach heavily relies on the homoscedastic Gaussian nature of their stochastic framework. The purpose of this paper is twofold: stating a more general heuristics for designing a data-driven penalty (the slope heuristics) and proving that it works for penalized least-squares regression with a random design, even for heteroscedastic non-Gaussian data. For technical reasons, some exact mathematical results will be proved only for regressogram bin-width selection. This is at least a first step towards further results, since the approach and the method that we use are indeed general.

187 citations
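
The calibration algorithm described above can be sketched as follows: scan a grid of constants C, locate the "dimension jump" that reveals the minimal penalty, and then select a model with twice that constant. This is a rough illustration of the slope heuristics under simplifying assumptions (a known penalty shape per model and a naive jump detector); it is not the authors' reference implementation.

```python
# Rough sketch of the "dimension jump" version of the slope heuristics.
# Inputs are arrays indexed by candidate model m; all names are illustrative.
import numpy as np

def slope_heuristics_select(empirical_risk, pen_shape, dims, c_grid):
    """Select a model via the slope heuristics.

    empirical_risk[m]: training (empirical) risk of model m
    pen_shape[m]:      shape of the penalty for model m (e.g. its dimension / n)
    dims[m]:           dimension of model m
    c_grid:            increasing grid of trial constants C
    """
    empirical_risk = np.asarray(empirical_risk, dtype=float)
    pen_shape = np.asarray(pen_shape, dtype=float)
    dims = np.asarray(dims, dtype=float)

    # Dimension of the model selected when the penalty is C * pen_shape.
    selected_dim = np.array(
        [dims[int(np.argmin(empirical_risk + c * pen_shape))] for c in c_grid]
    )
    # The minimal-penalty constant sits at the largest drop ("dimension jump").
    jumps = selected_dim[:-1] - selected_dim[1:]
    c_min = c_grid[int(np.argmax(jumps)) + 1]

    # Final selection with the penalty 2 * C_min * pen_shape (the slope heuristics).
    m_hat = int(np.argmin(empirical_risk + 2.0 * c_min * pen_shape))
    return m_hat, c_min
```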

Posted Content
TL;DR: In this paper, a survey on cross-validation procedures for estimating the risk of a given estimator, and selecting the best estimator among a given family, is presented.
Abstract: This text is a survey on cross-validation. We define all classical cross-validation procedures, and we study their properties for two different goals: estimating the risk of a given estimator, and selecting the best estimator among a given family. For the risk estimation problem, we compute the bias (which can also be corrected) and the variance of cross-validation methods. For estimator selection, we first provide a first-order analysis (based on expectations). Then, we explain how to take into account second-order terms (from variance computations, and by taking into account the usefulness of overpenalization). This allows us, in the end, to provide some guidelines for choosing the best cross-validation method for a given learning problem.

149 citations
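
As one concrete instance of the bias computations mentioned in this survey, the sketch below implements a Burman-type bias-corrected V-fold cross-validation estimate of the risk; the squared loss and the fit/predict helper names are assumptions made for the illustration.

```python
# Sketch of bias-corrected V-fold cross-validation (Burman-type correction).
# The loss, the fit/predict interface, and all names are illustrative.
import numpy as np

def corrected_vfold_cv(X, y, fit, predict, V=5, seed=None):
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), V)

    cv_terms, full_sample_terms = [], []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        model_v = fit(X[train_idx], y[train_idx])
        # Ordinary CV term: error of the fold-v model on the held-out fold.
        cv_terms.append(np.mean((y[test_idx] - predict(model_v, X[test_idx])) ** 2))
        # Error of the fold-v model on the whole sample (used by the correction).
        full_sample_terms.append(np.mean((y - predict(model_v, X)) ** 2))

    # Resubstitution error of the model fitted on all the data.
    model_full = fit(X, y)
    resubstitution = np.mean((y - predict(model_full, X)) ** 2)

    cv = np.mean(cv_terms)
    # Corrected estimate: CV + resubstitution - average full-sample error of the
    # fold models; this removes part of the pessimistic bias of V-fold CV.
    return cv + resubstitution - np.mean(full_sample_terms)
```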

Journal Article
TL;DR: A penalty for choosing the number of change-points in the kernel-based method of Harchaoui and Cappé (2007) is built and a non-asymptotic oracle inequality is proved for the proposed method, thanks to a new concentration result for some function of Hilbert-space valued random variables.
Abstract: We tackle the change-point problem with data belonging to a general set. We build a penalty for choosing the number of change-points in the kernel-based method of Harchaoui and Cappé (2007). This penalty generalizes the one proposed by Lebarbier (2005) for one-dimensional signals. We prove a non-asymptotic oracle inequality for the proposed method, thanks to a new concentration result for some function of Hilbert-space valued random variables. Experiments on synthetic data illustrate the accuracy of our method, showing that it can detect changes in the whole distribution of data, even when the mean and variance are constant.

84 citations
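
A rough sketch of this kind of kernel change-point procedure is given below: a dynamic program over a kernel least-squares segmentation cost, followed by a penalized choice of the number of segments. The RBF kernel, the bandwidth, and the simple d*log(n) penalty are placeholders chosen for illustration, not the calibrated penalty constructed in the paper.

```python
# Illustrative kernel change-point detection: dynamic programming over a
# kernel least-squares segmentation cost, with a toy penalty on the number
# of segments. Kernel choice, bandwidth, and penalty constant are placeholders.
import numpy as np

def rbf_gram(X, bandwidth=1.0):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def segment_costs(K):
    """cost[i, j]: kernel least-squares cost of the segment x_i, ..., x_{j-1} (unoptimised)."""
    n = K.shape[0]
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + 1, n + 1):
            block = K[i:j, i:j]
            cost[i, j] = np.trace(block) - block.sum() / (j - i)
    return cost

def detect_change_points(X, d_max=5, bandwidth=1.0, penalty_const=0.1):
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    n = X.shape[0]
    cost = segment_costs(rbf_gram(X, bandwidth))

    # L[d, t]: best cost of splitting x_0, ..., x_{t-1} into d segments.
    L = np.full((d_max + 1, n + 1), np.inf)
    L[0, 0] = 0.0
    back = np.zeros((d_max + 1, n + 1), dtype=int)
    for d in range(1, d_max + 1):
        for t in range(d, n + 1):
            cand = L[d - 1, :t] + cost[:t, t]
            back[d, t] = int(np.argmin(cand))
            L[d, t] = cand[back[d, t]]

    # Penalized choice of the number of segments (toy penalty, not the paper's).
    crit = [L[d, n] + penalty_const * d * np.log(n) for d in range(1, d_max + 1)]
    d_hat = int(np.argmin(crit)) + 1

    # Backtrack the selected segmentation to read off the change points.
    change_points, t = [], n
    for d in range(d_hat, 0, -1):
        t = back[d, t]
        if t > 0:
            change_points.append(t)
    return sorted(change_points)
```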


Cited by
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance, and it describes numerous important application areas such as image-based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and sufficient detail to build useful applications. Users learn techniques that have proven useful through first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image-based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal Article
TL;DR: This survey intends to relate the model selection performances of cross-validation procedures to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results.
Abstract: Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.

2,980 citations

Journal Article
TL;DR: In this paper, a survey on the model selection performances of cross-validation procedures is presented, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results, and guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.
Abstract: Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.

2,720 citations

Journal Article
TL;DR: In this article, leave-one-out cross-validation (LOO) and the widely applicable information criterion (WAIC) are used to estimate pointwise out-of-sample prediction accuracy from a fitted Bayesian model using the log-likelihood evaluated at the posterior simulations of the parameter values.
Abstract: Leave-one-out cross-validation (LOO) and the widely applicable information criterion (WAIC) are methods for estimating pointwise out-of-sample prediction accuracy from a fitted Bayesian model using the log-likelihood evaluated at the posterior simulations of the parameter values. LOO and WAIC have various advantages over simpler estimates of predictive error such as AIC and DIC but are less used in practice because they involve additional computational steps. Here we lay out fast and stable computations for LOO and WAIC that can be performed using existing simulation draws. We introduce an efficient computation of LOO using Pareto-smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. Although WAIC is asymptotically equal to LOO, we demonstrate that PSIS-LOO is more robust in the finite case with weak priors or influential observations. As a byproduct of our calculations, we also obtain approximate standard errors for estimated predictive errors and for comparing predictive errors between two models. We implement the computations in an R package called 'loo' and demonstrate using models fit with the Bayesian inference package Stan.

2,455 citations
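
The following sketch illustrates importance-sampling LOO from a matrix of pointwise log-likelihood draws, the same quantity the paper works with. For simplicity it truncates the largest importance weights instead of Pareto-smoothing them, so it is a crude stand-in for PSIS-LOO rather than the algorithm of the paper or the 'loo' package.

```python
# Simplified importance-sampling LOO from posterior draws.
# Truncated weights stand in for the paper's Pareto smoothing (PSIS).
import numpy as np

def truncated_is_loo(log_lik):
    """log_lik[s, i] = log p(y_i | theta_s) for S posterior draws and n observations."""
    S, n = log_lik.shape
    # Raw LOO importance ratios: r_{s,i} proportional to 1 / p(y_i | theta_s).
    log_w = -log_lik
    log_w -= log_w.max(axis=0, keepdims=True)                # numerical stabilisation
    w = np.exp(log_w)
    # Truncate the largest weights (a cruder stabilisation than Pareto smoothing).
    w = np.minimum(w, np.sqrt(S) * w.mean(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    # Pointwise LOO predictive density: sum_s w_{s,i} * p(y_i | theta_s).
    elpd_i = np.log(np.sum(w * np.exp(log_lik), axis=0))
    elpd_loo = elpd_i.sum()
    se = np.sqrt(n * elpd_i.var())                           # approximate standard error
    return elpd_loo, se, elpd_i
```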