Author

Bertrand Thirion

Bio: Bertrand Thirion is an academic researcher from Université Paris-Saclay. The author has contributed to research on topics including cluster analysis and cognition. The author has an h-index of 51 and has co-authored 311 publications receiving 73839 citations. Previous affiliations of Bertrand Thirion include French Institute for Research in Computer Science and Automation & French Institute of Health and Medical Research.


Papers
Book ChapterDOI
16 Dec 2011
TL;DR: The proposed approach relies on randomization techniques which have been proved to be consistent for support recovery and makes use of a spatially constrained hierarchical clustering algorithm to account for the spatial correlations between voxels.
Abstract: The prediction of behavioral covariates from functional MRI (fMRI) is known as brain reading. From a statistical standpoint, this challenge is a supervised learning task. The ability to predict cognitive states from new data gives a model selection criterion: prediction accuracy. While a good prediction score implies that some of the voxels used by the classifier are relevant, one cannot state that these voxels form the brain regions involved in the cognitive task. The best predictive model may have selected by chance noninformative regions, and neglected relevant regions that provide duplicate information. In this contribution, we address the support identification problem. The proposed approach relies on randomization techniques which have been proved to be consistent for support recovery. To account for the spatial correlations between voxels, our approach makes use of a spatially constrained hierarchical clustering algorithm. Results are provided on simulations and a visual experiment.

11 citations

Proceedings Article
01 Jan 2013
TL;DR: This work proposes a method that predicts the experimental paradigms across different studies using a large corpus of imaging studies and a predictive engine, and is the first demonstration of predicting the cognitive content of completely new brain images.
Abstract: Imaging neuroscience links brain activation maps to behavior and cognition via correlational studies. Due to the nature of the individual experiments, based on eliciting neural response from a small number of stimuli, this link is incomplete, and unidirectional from the causal point of view. To come to conclusions on the function implied by the activation of brain regions, it is necessary to combine a wide exploration of the various brain functions and some inversion of the statistical inference. Here we introduce a methodology for accumulating knowledge towards a bidirectional link between observed brain activity and the corresponding function. We rely on a large corpus of imaging studies and a predictive engine. Technically, the challenges are to find commonality between the studies without denaturing the richness of the corpus. The key elements that we contribute are labeling the tasks performed with a cognitive ontology, and modeling the long tail of rare paradigms in the corpus. To our knowledge, our approach is the first demonstration of predicting the cognitive content of completely new brain images. To that end, we propose a method that predicts the experimental paradigms across different studies.

11 citations

Book ChapterDOI
23 Aug 2010
TL;DR: This chapter describes the statistical aspects of the data and the challenges, such as the multiple comparison problem, created by a large imaging genetics study with a very high number of variables, and suggests possible strategies.
Abstract: The IMAGEN study, a very large European Research Project, seeks to identify and characterize biological and environmental factors that influence teenagers' mental health. To this aim, the consortium plans to collect data for more than 2000 subjects at 8 neuroimaging centres. These data comprise neuroimaging data, behavioral tests (for up to 5 hours of testing), and also white blood samples which are collected and processed to obtain 650 k single nucleotide polymorphisms (SNP) per subject. Data for more than 1000 subjects have already been collected. We describe the statistical aspects of these data and the challenges, such as the multiple comparison problem, created by such a large imaging genetics study (i.e., 650 k for the SNP, 50 k data per neuroimage). We also suggest possible strategies, and present some first investigations using univariate or multivariate methods in association with re-sampling techniques. Specifically, because the number of variables is very high, we first reduce the data size and then use multivariate (CCA, PLS) techniques in association with re-sampling techniques.

11 citations

Book ChapterDOI
05 Oct 2015
TL;DR: A convex region-selecting penalty is introduced that leads to segmentation of medical images in a target-informed manner and an efficient optimization scheme that brings significant computational gains.
Abstract: Prediction from medical images is a valuable aid to diagnosis. For instance, anatomical MR images can reveal certain disease conditions, while their functional counterparts can predict neuropsychiatric phenotypes. However, a physician will not rely on predictions by black-box models: understanding the anatomical or functional features that underpin decision is critical. Generally, the weight vectors of classifiers are not easily amenable to such an examination: often there is no apparent structure. Indeed, this is not only a prediction task, but also an inverse problem that calls for adequate regularization. We address this challenge by introducing a convex region-selecting penalty. Our penalty combines total-variation regularization, enforcing spatial contiguity, and l1 regularization, enforcing sparsity, into one group: voxels are either active with non-zero spatial derivative or zero with inactive spatial derivative. This leads to segmenting contiguous spatial regions inside which the signal can vary freely against a background of zeros. Such segmentation of medical images in a target-informed manner is an important analysis tool. On several prediction problems from brain MRI, the penalty shows good segmentation. Given the size of medical images, computational efficiency is key. Keeping this in mind, we contribute an efficient optimization scheme that brings significant computational gains.
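The idea of putting a voxel's value and its spatial derivative into one group can be illustrated with a minimal sketch. It assumes a 1D signal, a forward difference as the spatial derivative (set to zero at the last position), and a hypothetical mixing weight `rho`. The function name and these choices are illustrative and are not taken from the paper.

```python
import math


def sparse_variation_penalty(w, rho=0.5):
    """Group penalty on a 1D signal (illustrative, not the paper's code).

    Each position contributes the Euclidean norm of the pair
    (rho * value, (1 - rho) * forward difference), so a position adds
    nothing only when both its value and its derivative are zero.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    total = 0.0
    n = len(w)
    for i in range(n):
        # Forward difference; assumed to be 0 at the last position.
        grad = w[i + 1] - w[i] if i + 1 < n else 0.0
        total += math.hypot(rho * w[i], (1.0 - rho) * grad)
    return total


# A flat "active" block against a background of zeros.
signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
penalty = sparse_variation_penalty(signal, rho=0.5)
```

Under these assumptions, `rho=1.0` reduces the sketch to the l1 norm and `rho=0.0` to a 1D total variation. For the signal above with `rho=0.5`, the grouped value is 1 + sqrt(0.5), which is smaller than the 2.0 obtained by adding the weighted l1 and total-variation terms separately.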

11 citations

Posted ContentDOI
08 Dec 2020-bioRxiv
TL;DR: It is found that functional alignment generally improves inter-subject decoding accuracy though the best performing method depends on the research context, and two new extensions of functional alignment methods are introduced: piecewise Shared Response Modelling, and intra-subject alignment.
Abstract: Inter-individual variability in the functional organization of the brain presents a major obstacle to identifying generalizable neural coding principles. Functional alignment—a class of methods that matches subjects’ neural signals based on their functional similarity—is a promising strategy for addressing this variability. At present, however, a range of functional alignment methods have been proposed and their relative performance is still unclear. In this work, we benchmark five functional alignment methods for inter-subject decoding on four publicly available datasets. Specifically, we consider piecewise Procrustes, searchlight Procrustes, piecewise Optimal Transport, Shared Response Modelling (SRM), and intra-subject alignment; as well as associated methodological choices such as ROI definition. We find that functional alignment generally improves inter-subject decoding accuracy though the best performing method depends on the research context. Specifically, SRM performs best within a region-of-interest while piecewise Optimal Transport performs best at a whole-brain scale. We also benchmark the computational efficiency of each of the surveyed methods, providing insight into their usability and scalability. Taking inter-subject decoding accuracy as a quantification of inter-subject similarity, our results support the use of functional alignment to improve inter-subject comparisons in the face of variable structure-function organization. We provide open implementations of the methods used.
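As a rough illustration of what a Procrustes-style functional alignment does, the sketch below fits an orthogonal transform that maps one subject's signals onto another's within a single region. It assumes NumPy is available, that the data are arranged as time points by voxels, and that the subjects are simulated. It is the plain single-region variant, not the piecewise or searchlight versions benchmarked in the paper, and the function name is hypothetical.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F."""
    # Classical orthogonal Procrustes solution via the SVD of source^T target.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)

# Simulated "target subject": 50 time points, 8 voxels.
target = rng.standard_normal((50, 8))

# Simulated "source subject": same signals under a hidden orthogonal mixing.
q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
source = target @ q.T

r = procrustes_align(source, target)
aligned = source @ r  # source data expressed in the target's voxel space
```

In this noiseless toy case the fitted transform undoes the hidden mixing, so `aligned` matches `target`. With real data the fit is only approximate, and the alignment is typically learned on one set of images and applied to held-out ones.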

10 citations


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Posted Content
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from this http URL.

28,898 citations

28 Jul 2005
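TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.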

18,940 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost is a scalable end-to-end tree boosting system that introduces a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and is widely used to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations