Author

Bertrand Thirion

Bio: Bertrand Thirion is an academic researcher from Université Paris-Saclay. The author has contributed to research in topics: Cluster analysis & Cognition. The author has an h-index of 51 and has co-authored 311 publications receiving 73839 citations. Previous affiliations of Bertrand Thirion include French Institute for Research in Computer Science and Automation & French Institute of Health and Medical Research.


Papers
Proceedings ArticleDOI
TL;DR: A simple method of rapid diagnostic classification for the clinic using Support Vector Machines (SVM) and easy to obtain geometrical measurements that, together with a cortical and sub-cortical brain parcellation, create a robust framework capable of automatic diagnosis with high accuracy.
Abstract: Magnetic Resonance Imaging (MRI) has been gaining popularity in the clinic in recent years as a safe in-vivo imaging technique. As a result, large troves of data are being gathered and stored daily that may be used as clinical training sets in hospitals. While numerous machine learning (ML) algorithms have been implemented for Alzheimer's disease classification, their outputs are usually difficult to interpret in the clinical setting. Here, we propose a simple method of rapid diagnostic classification for the clinic using Support Vector Machines (SVM) and easy to obtain geometrical measurements that, together with a cortical and sub-cortical brain parcellation, create a robust framework capable of automatic diagnosis with high accuracy. On a significantly large imaging dataset consisting of over 800 subjects taken from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, classification-success indexes of up to 99.2% are reached with a single measurement. Keywords: Alzheimer's disease, machine learning, mild cognitive impairment, support vector machines, fast clinical diagnosis.

3 citations

Proceedings Article
10 Dec 2016
TL;DR: In this article, the authors present a matrix factorization algorithm that scales to input matrices that are large in both dimensions (i.e., that contain more than 1TB of data).
Abstract: We present a matrix factorization algorithm that scales to input matrices that are large in both dimensions (i.e., that contain more than 1TB of data). The algorithm streams the matrix columns while subsampling them, resulting in low complexity per iteration and a reasonable memory footprint. In contrast to previous online matrix factorization methods, our approach relies on low-dimensional statistics from past iterates to control the extra variance introduced by subsampling. We present a convergence analysis that guarantees convergence to a stationary point of the problem. Large speed-ups can be obtained compared to previous online algorithms that do not perform subsampling, thanks to the feature redundancy that often exists in high-dimensional settings.

3 citations

Dissertation
01 Jan 2003
TL;DR: We first address the modeling of the time series obtained for each voxel separately, using linear prediction techniques and computing the information of the modeled processes.
Abstract: In this thesis, we discuss and propose a number of methods for the analysis of functional MRI (magnetic resonance imaging) data. Functional MRI is a recent modality for exploring the brain: it produces sequences of images reflecting local metabolic activity, which in turn reflects neuronal activity. We first address the modeling of the time series obtained for each voxel separately, using linear prediction techniques and computing the information of the modeled processes. We then study various multivariate generalizations of this model. After reviewing and discussing some classical techniques (independent component analysis, clustering), we successively propose a linear approach based on state-space system theory and a non-linear approach based on kernel decompositions. The common goal of these methods, which can complement one another, is to propose decompositions that best preserve the dynamics of the data. We then introduce a new approach based on reducing the dimension of the data; this approach offers a more structured representation that is relatively pleasant to visualize. We show its advantages over classical linear techniques. Finally, we describe an analysis methodology that synthesizes a large part of this work and relies on very flexible assumptions. Our results thus offer a global description of the dynamic processes that are imaged during functional MRI experiments.

3 citations

Book ChapterDOI
28 Jun 2015
TL;DR: In this article, a bootstrapped permutation test (BPT) was proposed to identify statistically significant features from sparse multiresponse regression (SMR) models with unknown parameter distribution.
Abstract: Although diagnosis of neurological disorders commonly involves a collection of behavioral assessments, most neuroimaging studies investigating the associations between brain and behavior analyze each behavioral measure in isolation. To jointly model multiple behavioral scores, sparse multiresponse regression (SMR) is often used. However, directly applying SMR without statistically controlling for false positives could result in many spurious findings. For models, such as SMR, where the distribution of the model parameters is unknown, permutation test and stability selection are typically used to control for false positives. In this paper, we present another technique for inferring statistically significant features from models with unknown parameter distribution. We refer to this technique as bootstrapped permutation test (BPT), which uses Studentized statistics to exploit the intuition that the variability in parameter estimates associated with relevant features would likely be higher with responses permuted. On synthetic data, we show that BPT provides higher sensitivity in identifying relevant features from the SMR model than permutation test and stability selection, while retaining strong control on the false positive rate. We further apply BPT to study the associations between brain connectivity estimated from pseudo-rest fMRI data of 1139 fourteen-year-olds and behavioral measures related to ADHD. Significant connections are found between brain networks known to be implicated in the behavioral tasks involved. Moreover, we validate the identified connections by fitting a regression model on pseudo-rest data with only those connections and applying this model to resting state fMRI data of 337 left-out subjects to predict their behavioral scores. The predicted scores significantly correlate with the actual scores, hence verifying the behavioral relevance of the found connections.
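The abstract above names the plain permutation test as one of the baselines that BPT is compared against. As a rough illustration of that baseline only (not of BPT or SMR), here is a minimal sketch using a single feature and a single score. The function names, the toy data, and the choice of Pearson correlation as the test statistic are all assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = abs(pearson(x, y))
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permuting responses breaks any real association
        if abs(pearson(x, shuffled)) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical toy data: the score depends strongly on the feature.
feature = [float(i) for i in range(20)]
score = [2.0 * v + (1.0 if i % 2 else -1.0) for i, v in enumerate(feature)]
p_value = permutation_p_value(feature, score)
print(p_value)
```

The null distribution is built by shuffling the responses, so no assumption about the distribution of the statistic is needed, which is the reason the abstract gives for using such tests with models like SMR.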

3 citations

01 Jan 2013
TL;DR: It is shown that conventional functional connectivity estimates based on Pearson's correlation and anatomical connectivity estimates based on fiber counts are actually not that highly correlated for typical RS-fMRI and dMRI data, and it is illustrated that these inconsistencies can be useful in fMRI-dMRI integration for improving brain connectivity estimation.
Abstract: There is a recent trend towards integrating resting state functional magnetic resonance imaging (RS-fMRI) and diffusion MRI (dMRI) for brain connectivity estimation, as motivated by how estimates from these modalities are presumably two views reflecting the same underlying brain circuitry. In this paper, we show on a cohort of 60 subjects that conventional functional connectivity (FC) estimates based on Pearson's correlation and anatomical connectivity (AC) estimates based on fiber counts are actually not that highly correlated for typical RS-fMRI (approximately 7 min) and dMRI (approximately 32 gradient directions) data. The FC-AC correlation can be significantly increased by considering sparse partial correlation and modeling fiber endpoint uncertainty, but the resulting FC-AC correlation is still rather low in absolute terms. We further exemplify the inconsistencies between FC and AC estimates by integrating them as priors into activation detection and demonstrating significant differences in their detection sensitivity. Importantly, we illustrate that these inconsistencies can be useful in fMRI-dMRI integration for improving brain connectivity estimation.
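To make the distinction between Pearson's correlation and partial correlation concrete, here is a small sketch on three toy signals. It uses the standard first-order partial correlation formula for three variables, not the sparse partial correlation estimator the abstract refers to, and the signals and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy signals: regions x and y are both driven by region z,
# each with its own small perturbation.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x = [v + e for v, e in zip(z, [0.1, -0.1, 0.1, -0.1, 0.1, -0.1])]
y = [v + e for v, e in zip(z, [0.1, -0.1, -0.1, 0.1, 0.1, -0.1])]

full = pearson(x, y)             # close to 1: dominated by the shared driver z
partial = partial_corr(x, y, z)  # much smaller once z is accounted for
print(full, partial)
```

In this toy case the plain correlation between x and y is high only because both follow z; conditioning on z removes most of it, which is the kind of indirect coupling that partial correlation is meant to discount.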

3 citations


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Posted Content
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from this http URL.

28,898 citations

28 Jul 2005
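TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and switching on transcription of different var gene variants provides the molecular basis for antigenic variation.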

18,940 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost is a scalable tree boosting system that uses a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations