Author

Bertrand Thirion

Other affiliations: French Institute for Research in Computer Science and Automation, French Institute of Health and Medical Research, École Polytechnique

Bio: Bertrand Thirion is an academic researcher at Université Paris-Saclay who has contributed to research on cluster analysis and cognition. The author has an h-index of 51 and has co-authored 311 publications that have received 73,839 citations. Previous affiliations of Bertrand Thirion include the French Institute for Research in Computer Science and Automation and the French Institute of Health and Medical Research.

Papers published on a yearly basis (chart; x-axis: years 2000 to 2023)

Papers

Very large fMRI study using the IMAGEN database: sensitivity-specificity and population effect modelling in relation to the underlying anatomy

Benjamin Thyreau, Yannick Schwartz, Bertrand Thirion, Vincent Frouin, Eva Loth, Sabine Vollstädt-Klein, Tomas Paus, Eric Artiges, Patricia J. Conrod, Gunter Schumann, Robert Whelan, Jean-Baptiste Poline

01 Jan 2012

TL;DR: It is concluded that adding matter information consistently improves the quantitative analysis of BOLD responses in some areas of the brain, particularly those where accurate inter-subject registration remains challenging.

Abstract: In this paper we investigate the use of classical fMRI Random Effect (RFX) group statistics when analysing a very large cohort, and the possible improvement brought by anatomical information. Using 1326 subjects from the IMAGEN study, we first give a global picture of how the group-effect t-value from a simple face-watching contrast evolves with increasing cohort size. We obtain a wide "activated" pattern, far from being limited to the reasonably expected brain areas, illustrating the difference between statistical significance and practical significance. This motivates us to inject tissue-probability information into the group estimation: we model the BOLD contrast using a matter-weighted mixture of Gaussians and compare it to the common single-Gaussian model. In both cases, the model parameters are estimated per voxel on one subgroup, and the likelihood of both models is computed on a second, separate subgroup to reflect the models' generalization capacity. Various group sizes are tested, and significance is assessed using a 10-fold cross-validation scheme. We conclude that adding matter information consistently improves the quantitative analysis of BOLD responses in some areas of the brain, particularly those where accurate inter-subject registration remains challenging.
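
The held-out comparison described in this abstract can be illustrated with a minimal sketch for a single voxel. Everything below is an assumption made for illustration and is not the authors' implementation: the mixture form (two Gaussians whose mixing weight is each subject's known tissue probability), the EM fitting, the simulated data, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_single_gaussian(ys):
    """Maximum-likelihood mean and variance of one Gaussian."""
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / len(ys)
    return mean, max(var, 1e-6)


def fit_weighted_mixture(ys, ws, n_iter=100):
    """EM for y_i ~ w_i N(m1, v1) + (1 - w_i) N(m0, v0), with each w_i known.

    Hypothetical reading of a "matter-weighted" mixture: w_i stands in for
    the tissue probability of subject i at this voxel.
    """
    mean, var = fit_single_gaussian(ys)
    sd = math.sqrt(var)
    m1, v1, m0, v0 = mean + sd, var, mean - sd, var
    for _ in range(n_iter):
        # E-step: posterior probability that subject i came from component 1.
        resp = []
        for y, w in zip(ys, ws):
            a = w * normal_pdf(y, m1, v1)
            b = (1.0 - w) * normal_pdf(y, m0, v0)
            resp.append(a / (a + b) if a + b > 0.0 else w)
        # M-step: responsibility-weighted means and variances.
        n1 = max(sum(resp), 1e-12)
        n0 = max(len(ys) - sum(resp), 1e-12)
        m1 = sum(r * y for r, y in zip(resp, ys)) / n1
        m0 = sum((1.0 - r) * y for r, y in zip(resp, ys)) / n0
        v1 = max(sum(r * (y - m1) ** 2 for r, y in zip(resp, ys)) / n1, 1e-6)
        v0 = max(sum((1.0 - r) * (y - m0) ** 2 for r, y in zip(resp, ys)) / n0, 1e-6)
    return m1, v1, m0, v0


def loglik_single(ys, params):
    """Log-likelihood of ys under a single Gaussian."""
    mean, var = params
    return sum(math.log(max(normal_pdf(y, mean, var), 1e-300)) for y in ys)


def loglik_mixture(ys, ws, params):
    """Log-likelihood of ys under the weighted two-component mixture."""
    m1, v1, m0, v0 = params
    total = 0.0
    for y, w in zip(ys, ws):
        p = w * normal_pdf(y, m1, v1) + (1.0 - w) * normal_pdf(y, m0, v0)
        total += math.log(max(p, 1e-300))
    return total


def simulate_voxel(n_subjects, seed=0):
    """Toy data: the response mean depends on a latent tissue class."""
    rng = random.Random(seed)
    ys, ws = [], []
    for _ in range(n_subjects):
        w = rng.uniform(0.05, 0.95)      # assumed tissue probability
        in_tissue = rng.random() < w     # latent class drawn from it
        ys.append(rng.gauss(2.0 if in_tissue else 0.0, 0.5))
        ws.append(w)
    return ys, ws


def heldout_comparison(ys, ws):
    """Fit both models on the first half, score them on the second half."""
    half = len(ys) // 2
    single = fit_single_gaussian(ys[:half])
    mixture = fit_weighted_mixture(ys[:half], ws[:half])
    return (loglik_single(ys[half:], single),
            loglik_mixture(ys[half:], ws[half:], mixture))
```

Calling `heldout_comparison(*simulate_voxel(400))` returns the two held-out log-likelihoods. Scoring on a separate subgroup, rather than on the fitting subgroup, is what keeps the extra parameters of the mixture from being rewarded automatically.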

Causal mediation analysis with one or multiple mediators: a comparative study

Judith Abecassis, Julie Josse, Bertrand Thirion

TL;DR: The authors provide a thorough evaluation of estimators for direct and indirect effects in the context of causal mediation analysis for binary, continuous and multi-dimensional mediators, and propose and assess the relevance of several extensions inspired by double or debiased machine learning.

Abstract: Mediation analysis breaks down the causal effect of a treatment on an outcome into an indirect effect, acting through a third group of variables called mediators, and a direct effect, operating through other mechanisms. We provide a thorough evaluation of estimators for direct and indirect effects in the context of causal mediation analysis for binary, continuous and multi-dimensional mediators. We consider standard parametric implementations of classical estimators, and propose and assess the relevance of several extensions inspired by double or debiased machine learning, in particular non-parametric models, regularization, probability calibration and cross-fitting. Our results show that most methods obtain reasonable estimates under model misspecification, but some methods, including multiply-robust methods, are very sensitive to (near-)violations of the overlap assumption. This trend is even more pronounced in multi-dimensional settings. We also describe settings where the use of more complex non-parametric models for estimation is relevant. To illustrate the considered methods on real data, we examine the causal path from higher education graduation to middle-age general intelligence in the UK Biobank, which includes several potential binary, continuous and multi-dimensional mediators. This analysis shows that this effect is partially mediated by having a physical occupation and by brain characteristics measured through MRI, but not by brain age, a popular MRI-derived phenotype.

Posted Content

Functional Magnetic Resonance Imaging data augmentation through conditional ICA

Badr Tajini, Hugo Richard, Bertrand Thirion

11 Jul 2021 - arXiv: Image and Video Processing

TL;DR: Conditional Independent Components Analysis (Conditional ICA) is a fast functional Magnetic Resonance Imaging (fMRI) data augmentation technique that leverages abundant resting-state data to create images by sampling from an ICA decomposition.

Abstract: Advances in computational cognitive neuroimaging research are related to the availability of large amounts of labeled brain imaging data, but such data are scarce and expensive to generate. While powerful data generation mechanisms, such as Generative Adversarial Networks (GANs), have been designed in the last decade for computer vision, such improvements have not yet carried over to brain imaging. A likely reason is that GAN training is ill-suited to the noisy, high-dimensional and small-sample data available in functional neuroimaging. In this paper, we introduce Conditional Independent Components Analysis (Conditional ICA): a fast functional Magnetic Resonance Imaging (fMRI) data augmentation technique that leverages abundant resting-state data to create images by sampling from an ICA decomposition. We then propose a mechanism to condition the generator on classes observed with few samples. We first show that the generative mechanism is successful at synthesizing data indistinguishable from observations, and that it yields gains in classification accuracy in brain decoding problems. In particular, it outperforms GANs while being much easier to optimize and interpret. Lastly, Conditional ICA enhances classification accuracy in eight datasets without further parameter tuning.

Proceedings Article

A Conditional Randomization Test for Sparse Logistic Regression in High-Dimension

Binh Nguyen, Bertrand Thirion, Sylvain Arlot

29 May 2022

TL;DR: This work proposes CRT-logit, an algorithm that improves the Conditional Randomization Test by combining a variable-distillation step with a decorrelation step that takes into account the geometry of the $\ell_1$-penalized logistic regression problem.

Abstract: Identifying the relevant variables for a classification model with correct confidence levels is a central but difficult task in high dimension. Despite the core role of sparse logistic regression in statistics and machine learning, it still lacks a good solution for accurate inference in the regime where the number of features $p$ is as large as or larger than the number of samples $n$. Here, we tackle this problem by improving the Conditional Randomization Test (CRT). The original CRT algorithm shows promise as a way to output p-values while making few assumptions on the distribution of the test statistics. As it comes with a prohibitive computational cost even in mildly high-dimensional problems, faster solutions based on distillation have been proposed. Yet, they rely on unrealistic hypotheses and result in low-power solutions. To improve this, we propose CRT-logit, an algorithm that combines a variable-distillation step and a decorrelation step that takes into account the geometry of the $\ell_1$-penalized logistic regression problem. We provide a theoretical analysis of this procedure, and demonstrate its effectiveness on simulations, along with experiments on large-scale brain-imaging and genomics datasets.
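
For readers unfamiliar with the original CRT that this abstract builds on, a minimal sketch of the generic procedure follows. It is not CRT-logit and includes neither the distillation nor the decorrelation step. It assumes the conditional law of each feature given the others is known (here, hypothetically, independent standard Gaussians) and uses a simple covariance statistic; the function names and the simulated data are illustrative assumptions.

```python
import math
import random


def crt_pvalue(X, y, j, sample_conditional, n_draws=500, seed=0):
    """Generic conditional randomization test for feature j.

    X is a list of rows, y a list of outcomes. sample_conditional(row, j, rng)
    must draw a value of feature j given the other entries of row.
    """
    rng = random.Random(seed)
    y_bar = sum(y) / len(y)

    def statistic(column):
        # Absolute empirical covariance (up to scale) between column and y.
        return abs(sum(x * (yi - y_bar) for x, yi in zip(column, y)))

    observed = statistic([row[j] for row in X])
    exceed = 0
    for _ in range(n_draws):
        # Resample feature j from its conditional law, leaving y untouched.
        fake = [sample_conditional(row, j, rng) for row in X]
        if statistic(fake) >= observed:
            exceed += 1
    # Standard finite-sample correction: the observed statistic counts once.
    return (1 + exceed) / (1 + n_draws)


def independent_gaussian(row, j, rng):
    """Assumed conditional law: features are independent N(0, 1)."""
    return rng.gauss(0.0, 1.0)


def simulate_logistic(n, p, seed=1):
    """Toy binary outcome that depends on feature 0 only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * row[0])) else 0
         for row in X]
    return X, y
```

With `X, y = simulate_logistic(300, 5)`, calling `crt_pvalue(X, y, 0, independent_gaussian)` tests the only feature that drives the outcome, and any other index tests a null feature. Each draw recomputes the statistic on the whole sample, which hints at why the cost becomes prohibitive once the statistic requires refitting a penalized model.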

Posted Content

Continuous Evaluation of Denoising Strategies in Resting-State fMRI Connectivity Using fMRIPrep and Nilearn

Hao-Ting Wang, Steven L. Meisler, Hanad Sharmarke, Christopher J. Markiewicz, Francois Paugam, Bertrand Thirion, P. Bellec

05 Jul 2023 - bioRxiv

TL;DR: This article presents a denoising benchmark for functional magnetic resonance imaging (fMRI) connectivity analyses based on the fMRIPrep software. The benchmark is implemented in a fully reproducible framework whose research objects enable readers to reproduce or modify core computations.

Abstract: Reducing contributions from non-neuronal sources is a crucial step in functional magnetic resonance imaging (fMRI) connectivity analyses. Many viable strategies for denoising fMRI are used in the literature, and practitioners rely on denoising benchmarks for guidance in selecting an appropriate strategy for their study. However, fMRI denoising software is an ever-evolving field, and the benchmarks can quickly become obsolete as the techniques or implementations change. In this work, we present a denoising benchmark featuring a range of denoising strategies, datasets and evaluation metrics for connectivity analyses, based on the popular fMRIPrep software. The benchmark is implemented in a fully reproducible framework, where the provided research objects enable readers to reproduce or modify core computations, as well as the figures of the article, using the Jupyter Book project and the Neurolibre reproducible preprint server (https://neurolibre.org/). We demonstrate how such a reproducible benchmark can be used for continuous evaluation of research software by comparing two versions of the fMRIPrep software package. The majority of benchmark results were consistent with prior literature. Scrubbing, a technique which excludes time points with excessive motion, combined with global signal regression, is generally effective at noise removal. Scrubbing, however, disrupts the continuous sampling of brain images and is incompatible with some statistical analyses, e.g. auto-regressive modeling. In this case, a simple strategy using motion parameters, average activity in select brain compartments, and global signal regression should be preferred. Importantly, we found that certain denoising strategies behaved inconsistently across datasets and/or versions of fMRIPrep, or behaved differently than in previously published benchmarks. We hope this work will provide useful guidelines for the fMRIPrep user community and highlight the importance of continuous evaluation of research methods. Our reproducible benchmark infrastructure will facilitate such continuous evaluation in the future, and may also be applied broadly to different tools or even research fields.
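
Scrubbing, as described in this abstract, can be sketched as thresholding a per-volume motion summary. The sketch below is an assumption-laden illustration and is not the benchmark's code or the fMRIPrep or Nilearn API: it uses the common framewise displacement definition (sum of absolute frame-to-frame changes in the six rigid-body parameters, with rotations converted to millimetres on a sphere), and the 50 mm radius, the 0.5 mm threshold and the function names are assumed values chosen for illustration.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Framewise displacement per volume.

    motion is a list of rows [tx, ty, tz, rx, ry, rz], with translations in
    millimetres and rotations in radians. The first volume has no
    predecessor, so its displacement is set to 0.
    """
    fd = [0.0]
    for prev, cur in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], cur[:3]))
        rotation = sum(abs(c - p) for p, c in zip(prev[3:], cur[3:]))
        # Rotations become arc lengths on a sphere of the assumed radius.
        fd.append(translation + radius_mm * rotation)
    return fd


def scrub_mask(fd, threshold_mm=0.5):
    """True for volumes to keep, False for volumes with excessive motion."""
    return [d <= threshold_mm for d in fd]


def apply_scrubbing(volumes, keep):
    """Drop the flagged volumes; the result is no longer evenly sampled."""
    return [v for v, k in zip(volumes, keep) if k]
```

Because `apply_scrubbing` removes time points, the remaining series has gaps, which is the incompatibility with auto-regressive modeling that the abstract points out.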

Cited by

Journal Article

Scikit-learn: Machine Learning in Python

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel¹, Peter Prettenhofer², Ron Weiss³, Vincent Dubourg, Jake Vanderplas⁴, Alexandre Passos⁵, David Cournapeau, Matthieu Brucher⁶, Matthieu Perrot, Edouard Duchesnay

Institutions: Kobe University¹, Bauhaus University, Weimar², Google³, University of Washington⁴, University of Massachusetts Amherst⁵, Total S.A.⁶

01 Feb 2011 - Journal of Machine Learning Research

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.

Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Posted Content

Scikit-learn: Machine Learning in Python

Fabian Pedregosa¹, Gaël Varoquaux¹, Alexandre Gramfort¹, Vincent Michel¹, Bertrand Thirion¹, Olivier Grisel, Mathieu Blondel, Andreas Müller², Joel Nothman, Gilles Louppe², Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay

Institutions: French Institute for Research in Computer Science and Automation¹, University of Liège²

02 Jan 2012 - arXiv: Learning

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.

28,898 citations

Changes in the switching rate of Plasmodium var genes lead to antigenic variation [English] / Paul H, Robert P, Christodoulou Z, et al // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

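TL;DR: PfEMP1 interacts with single or multiple receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with single or multiple receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.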

18,940 citations

Proceedings Article

XGBoost: A Scalable Tree Boosting System

Tianqi Chen¹, Carlos Guestrin¹

University of Washington¹

13 Aug 2016

TL;DR: XGBoost is a scalable end-to-end tree boosting system, widely used by data scientists to achieve state-of-the-art results on many machine learning challenges. The paper proposes a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning.

Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Proceedings Article

XGBoost: A Scalable Tree Boosting System

Tianqi Chen¹, Carlos Guestrin¹

University of Washington¹

09 Mar 2016 - arXiv: Learning

TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.

13,333 citations
