Author

Bertrand Thirion

Bio: Bertrand Thirion is an academic researcher from Université Paris-Saclay. The author has contributed to research in topics: Cluster analysis & Cognition. The author has an h-index of 51 and has co-authored 311 publications receiving 73,839 citations. Previous affiliations of Bertrand Thirion include the French Institute for Research in Computer Science and Automation & the French Institute of Health and Medical Research.


Papers
Journal ArticleDOI
TL;DR: A hierarchical model for patterns in multi-subject fMRI datasets is proposed, akin to the mixed-effects group models used in linear-model-based analysis, together with an estimation procedure, CanICA (Canonical ICA), based on i) probabilistic dimension reduction of the individual data, ii) canonical correlation analysis to identify a data subspace common to the group, and iii) ICA-based pattern extraction.

176 citations
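As a rough illustration of the three-step procedure summarized above, here is a much-simplified sketch using scikit-learn on synthetic data; the group-level canonical correlation step is approximated by a second PCA, and all array shapes and parameter values are illustrative assumptions rather than the authors' actual pipeline (an implementation of CanICA is available in the nilearn package).

```python
# Simplified sketch of the CanICA idea on synthetic data:
# (i) per-subject PCA for dimension reduction,
# (ii) a second, group-level PCA as a stand-in for the canonical analysis
#     that identifies a subspace common to the group,
# (iii) ICA to extract group-level spatial patterns.
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.RandomState(0)
n_subjects, n_timepoints, n_voxels = 5, 100, 2000
subject_data = [rng.randn(n_timepoints, n_voxels) for _ in range(n_subjects)]

# (i) dimension reduction of each subject's time series
n_components = 20
reduced = [PCA(n_components=n_components, random_state=0).fit(X).components_
           for X in subject_data]           # each: (n_components, n_voxels)

# (ii) identify a subspace shared across subjects (stand-in for CCA)
stacked = np.vstack(reduced)                # (n_subjects * n_components, n_voxels)
group_basis = PCA(n_components=n_components, random_state=0).fit(stacked).components_

# (iii) ICA-based extraction of spatial patterns in the group subspace
ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
group_maps = ica.fit_transform(group_basis.T).T   # (n_components, n_voxels)
```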

Posted Content
TL;DR: Scikit-learn as mentioned in this paper contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
Abstract: Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g. resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.

173 citations
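A minimal sketch of the kind of decoding analysis described above, predicting an experimental condition from brain activation patterns with scikit-learn; the data here are random placeholders for masked fMRI samples, and the shapes and estimator choice are assumptions made only for illustration.

```python
# Decoding sketch: relate (synthetic) activation patterns to a behavioral label
# with a cross-validated linear classifier.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.randn(80, 5000)            # 80 trials x 5000 in-mask voxels (placeholder)
y = rng.randint(0, 2, size=80)     # binary experimental condition per trial

decoder = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
scores = cross_val_score(decoder, X, y, cv=5)   # 5-fold cross-validated accuracy
print(scores.mean())
```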

Journal ArticleDOI
TL;DR: In this paper, the l1 norm of the image gradient, also known as total variation (TV), is used as a regularization method and applied to fMRI data for both brain mapping and brain decoding.
Abstract: While medical imaging typically provides massive amounts of data, the extraction of relevant information for predictive diagnosis remains a difficult challenge. Functional magnetic resonance imaging (fMRI) data, that provide an indirect measure of task-related or spontaneous neuronal activity, are classically analyzed in a mass-univariate procedure yielding statistical parametric maps. This analysis framework disregards some important principles of brain organization: population coding, distributed and overlapping representations. Multivariate pattern analysis, i.e., the prediction of behavioral variables from brain activation patterns better captures this structure. To cope with the high dimensionality of the data, the learning method has to be regularized. However, the spatial structure of the image is not taken into account in standard regularization methods, so that the extracted features are often hard to interpret. More informative and interpretable results can be obtained with the l1 norm of the image gradient, also known as its total variation (TV), as regularization. We apply for the first time this method to fMRI data, and show that TV regularization is well suited to the purpose of brain mapping while being a powerful tool for brain decoding. Moreover, this article presents the first use of TV regularization for classification.

169 citations
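For concreteness, the regularizer in question is the l1 norm of the spatial image gradient; the sketch below computes an anisotropic version of it for a 3D weight map with NumPy. The array shape is made up, and this is only the penalty term, not the authors' optimization code.

```python
# Total variation of a 3D weight image: l1 norm of its spatial gradient
# (anisotropic form, using finite differences along each axis).
import numpy as np

def total_variation(w):
    """Sum of absolute finite differences along every axis of the array."""
    return sum(np.abs(np.diff(w, axis=axis)).sum() for axis in range(w.ndim))

rng = np.random.RandomState(0)
weight_map = rng.randn(40, 48, 40)      # hypothetical 3D brain weight image
penalty = total_variation(weight_map)
# In the penalized estimator this term is added to the data-fit loss, e.g.
#   minimize  loss(y, X @ w) + lambda * total_variation(w_as_3d_image)
```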

Journal ArticleDOI
TL;DR: This collection of individual fMRI data will help to describe the cerebral inter-subject variability of the correlates of some language, calculation and sensorimotor tasks and will serve as the cornerstone to establish a hybrid database of hundreds of subjects suitable to study the range and causes of variation in the cerebral bases of numerous mental processes.
Abstract: Background Although cognitive processes such as reading and calculation are associated with reproducible cerebral networks, inter-individual variability is considerable. Understanding the origins of this variability will require the elaboration of large multimodal databases compiling behavioral, anatomical, genetic and functional neuroimaging data over hundreds of subjects. With this goal in mind, we designed a simple and fast acquisition procedure based on a 5-minute functional magnetic resonance imaging (fMRI) sequence that can be run as easily and as systematically as an anatomical scan, and is therefore used in every subject undergoing fMRI in our laboratory. This protocol captures the cerebral bases of auditory and visual perception, motor actions, reading, language comprehension and mental calculation at an individual level.

164 citations

01 Jan 2009
TL;DR: In this article, the authors used multivariate pattern recognition on high-resolution functional imaging data to decode the information content of fine-scale signals evoked by different individual numbers, and demonstrated partial format invariance of individual number codes that is compatible with more numerous but more broadly tuned populations for nonsymbolic than for symbolic numbers, as postulated by recent computational models.
Abstract: Background: Neuropsychology and human functional neuroimaging have implicated human parietal cortex in numerical processing, and macaque electrophysiology has shown that intraparietal areas house neurons tuned to numerosity. Yet although the areas responding overall during numerical tasks have been well defined by neuroimaging, a direct demonstration of individual number coding by spatial patterns has thus far been elusive. Results: We used multivariate pattern recognition on high-resolution functional imaging data to decode the information content of fine-scale signals evoked by different individual numbers. Parietal activation patterns for individual numerosities could be accurately discriminated and generalized across changes in low-level stimulus parameters. Distinct patterns were evoked by symbolic and nonsymbolic number formats, and individual digits were less accurately decoded (albeit still with significant accuracy) than numbers of dots. Interestingly, the numerosity of dot sets could be predicted above chance from the brain activation patterns evoked by digits, but not vice versa. Finally, number-evoked patterns changed in a gradual fashion as a function of numerical distance for the nonsymbolic notation, compatible with some degree of orderly layout of individual number representations. Conclusions: Our findings demonstrate partial format invariance of individual number codes that is compatible with more numerous but more broadly tuned populations for nonsymbolic than for symbolic numbers, as postulated by recent computational models. In more general terms, our results illustrate the potential of functional magnetic resonance imaging pattern recognition to understand the detailed format of representations within a single semantic category, and beyond sensory cortical areas for which columnar architectures are well established.

154 citations
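The cross-format generalization analysis described above can be sketched as training a classifier on patterns evoked by one notation and testing it on patterns evoked by the other; the snippet below uses random placeholder data and a linear SVM purely as illustrative assumptions.

```python
# Cross-format decoding sketch: train on digit-evoked patterns,
# test on dot-evoked patterns (synthetic placeholders for parietal fMRI data).
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
n_trials, n_voxels = 60, 500
X_digits = rng.randn(n_trials, n_voxels)        # patterns for symbolic numbers
X_dots = rng.randn(n_trials, n_voxels)          # patterns for dot numerosities
y = rng.randint(0, 4, size=n_trials)            # which number was shown (4 classes)

clf = LinearSVC().fit(X_digits, y)              # train on one notation
cross_format_accuracy = clf.score(X_dots, y)    # test on the other
print(cross_format_accuracy)
```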


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations
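The API consistency emphasized in the abstract means every estimator follows the same fit/predict pattern; a minimal usage sketch follows, with the dataset and estimator chosen arbitrarily for illustration.

```python
# Minimal scikit-learn workflow: load data, split, fit an estimator, evaluate.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)            # every estimator exposes fit()
print(clf.score(X_test, y_test))     # and predict()/score()
```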

Posted Content
TL;DR: Scikit-learn as mentioned in this paper is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from this http URL.

28,898 citations

28 Jul 2005
TL;DR: Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1) interacts with one or more receptors on infected red blood cells, dendritic cells and the placenta, playing a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade host immune responses. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected red blood cells, interacts with one or more receptors on infected red blood cells, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. Each haploid genome encodes roughly 60 members of the var gene family, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.

18,940 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost as discussed by the authors proposes a sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations
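A brief sketch of training a model with XGBoost's scikit-learn-style wrapper, assuming the xgboost Python package is installed; the dataset and hyperparameters are arbitrary illustrations, not the paper's benchmarks.

```python
# Fit a gradient-boosted tree ensemble with XGBoost on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # held-out accuracy
```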

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations