Author

Bertrand Thirion

Bio: Bertrand Thirion is an academic researcher from Université Paris-Saclay. The author has contributed to research in topics: Cluster analysis & Cognition. The author has an h-index of 51, co-authored 311 publications receiving 73839 citations. Previous affiliations of Bertrand Thirion include French Institute for Research in Computer Science and Automation & French Institute of Health and Medical Research.


Papers
Journal ArticleDOI
08 Oct 2020
TL;DR: This comparison applies linear models both for identifying significantly contributing variables and for finding the most predictive variable sets, using systematic data simulations and common medical datasets to explore how variables identified as significantly relevant and variables identified as predictively relevant can agree or diverge.
Abstract: Summary In the 20th century, many advances in biological knowledge and evidence-based medicine were supported by p values and accompanying methods. In the early 21st century, ambitions toward precision medicine place a premium on detailed predictions for single individuals. The shift causes tension between traditional regression methods used to infer statistically significant group differences and burgeoning predictive analysis tools suited to forecast an individual's future. Our comparison applies linear models for identifying significant contributing variables and for finding the most predictive variable sets. In systematic data simulations and common medical datasets, we explored how variables identified as significantly relevant and variables identified as predictively relevant can agree or diverge. Across analysis scenarios, even small predictive performances typically coincided with finding underlying significant statistical relationships, but not vice versa. More complete understanding of different ways to define “important” associations is a prerequisite for reproducible research and advances toward personalizing medical care.
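The contrast the abstract draws can be made concrete with a small sketch (not the authors' code; data and coefficients are invented for illustration): per-coefficient t-tests answer the "significance" question, while cross-validated out-of-sample R² answers the "prediction" question, on the same linear model.

```python
# Hedged sketch: "significant" vs "predictive" variables with plain
# linear models, assuming numpy, scipy, and scikit-learn are available.
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
# Only the first two variables truly drive the outcome.
y = 1.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# Classical inference: per-coefficient t-tests from an OLS fit.
beta, residuals, *_ = np.linalg.lstsq(X, y, rcond=None)
dof = n - p
sigma2 = residuals[0] / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
p_values = 2 * stats.t.sf(np.abs(beta / se), dof)
significant = p_values < 0.05

# Predictive analysis: out-of-sample R^2 via cross-validation.
r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()

print("significant variables:", np.flatnonzero(significant))
print("cross-validated R^2: %.2f" % r2)
```

In this toy setup the two criteria agree; the paper's point is that on real medical data they can diverge, with weakly predictive variables still showing significant associations but not vice versa.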

38 citations

Book ChapterDOI
28 Jun 2013
TL;DR: A new sparse group Gaussian graphical model (SGGGM) is proposed that facilitates joint estimation of intra-subject and group-level connectivity and significantly improves brain activation detection over connectivity priors derived from other graphical modeling approaches.
Abstract: The estimation of intra-subject functional connectivity is greatly complicated by the small sample size and complex noise structure in functional magnetic resonance imaging (fMRI) data. Pooling samples across subjects improves the conditioning of the estimation, but loses subject-specific connectivity information. In this paper, we propose a new sparse group Gaussian graphical model (SGGGM) that facilitates joint estimation of intra-subject and group-level connectivity. This is achieved by casting functional connectivity estimation as a regularized consensus optimization problem, in which information across subjects is aggregated in learning group-level connectivity and group information is propagated back in estimating intra-subject connectivity. On synthetic data, we show that incorporating group information using SGGGM significantly enhances intra-subject connectivity estimation over existing techniques. More accurate group-level connectivity is also obtained. On real data from a cohort of 60 subjects, we show that integrating intra-subject connectivity estimated with SGGGM significantly improves brain activation detection over connectivity priors derived from other graphical modeling approaches.

37 citations

Journal ArticleDOI
TL;DR: A collection of test statistics accounting for estimation uncertainties at the within-subject level that can be used as alternatives to the standard t statistic in one-sample random-effect analyses, i.e. when testing the mean effect of a population.

37 citations

Book ChapterDOI
03 Jul 2011
TL;DR: The results show that functional connectivity can be explained by anatomical connectivity on a rigorous statistical basis, and that a proper model of functional connectivity is essential to assess this link.
Abstract: We present a novel probabilistic framework to learn, across several subjects, a mapping from brain anatomical connectivity to functional connectivity, i.e. the covariance structure of brain activity. This prediction problem must be formulated as a structured-output learning task, as the predicted parameters are strongly correlated. We introduce a model selection framework based on cross-validation with a parametrization-independent loss function suitable to the manifold of covariance matrices. Our model is based on constraining the conditional independence structure of functional activity by the anatomical connectivity. Subsequently, we learn a linear predictor of a stationary multivariate autoregressive model. This natural parameterization of functional connectivity also enforces the positive-definiteness of the predicted covariance and thus matches the structure of the output space. Our results show that functional connectivity can be explained by anatomical connectivity on a rigorous statistical basis, and that a proper model of functional connectivity is essential to assess this link.
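One idea in the abstract, choosing a parameterization that guarantees positive-definite predicted covariances, can be sketched in the matrix-logarithm (log-Euclidean) domain. This is a stand-in for intuition only; the paper's own multivariate autoregressive parameterization is not reproduced.

```python
# Hedged sketch: linear operations on covariance matrices performed in
# the matrix-log domain map back to symmetric positive-definite (SPD)
# matrices through the matrix exponential, so any linear predictor in
# that domain yields a valid covariance. Assumes numpy and scipy.
import numpy as np
from scipy.linalg import expm, logm

def random_spd(dim, rng):
    a = rng.normal(size=(dim, dim))
    return a @ a.T + dim * np.eye(dim)

rng = np.random.default_rng(1)
covs = [random_spd(4, rng) for _ in range(5)]

# Arbitrary linear combinations of raw SPD matrices need not be SPD,
# but in the log domain any linear combination exponentiates back to
# an SPD matrix by construction.
log_mean = sum(logm(c) for c in covs) / len(covs)
predicted = expm(log_mean)

print("min eigenvalue of prediction:", np.linalg.eigvalsh(predicted).min())
```

This mirrors the abstract's point that the output space is a manifold, not a vector space, and that the parameterization should respect it.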

36 citations

Proceedings ArticleDOI
14 May 2006
TL;DR: This work addresses the open question of an optimal parameterization (number of parcels) for brain parcellations using information-theoretic criteria and cross-validation, and shows that a finer analysis of variance components better characterizes intra- and inter-subject variability sources in parcellation models.
Abstract: The acquisition of brain images in fMRI yields rich topographic information about the functional structure of the brain. However, these descriptions are limited by strong inter-subject variability. A recent approach to represent the gross functional architecture across the population as seen in fMRI consists in automatically defining across-subject brain parcels. This technique yields large-scale inter-subject correspondences while allowing some spatial relaxation in the alignment of the brains. We address here the open question of an optimal parameterization (number of parcels) of brain parcellations using information theoretic criteria and cross-validation. Moreover, a finer analysis of variance components enables us to better characterize intra- and inter-subject variability sources in parcellation models.
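The model-selection question (how many parcels?) can be illustrated on toy data with cross-validated held-out likelihood; this uses a Gaussian mixture as a hypothetical stand-in clustering model, not the authors' parcellation pipeline.

```python
# Hedged sketch: choosing the number of "parcels" by cross-validated
# held-out log-likelihood, with scikit-learn's GaussianMixture as a
# stand-in for the parcellation model.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
# Toy "voxels": three well-separated functional clusters in 2-D.
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centers])

def cv_log_likelihood(n_parcels, n_splits=3):
    scores = []
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        gm = GaussianMixture(n_parcels, random_state=0).fit(X[train])
        scores.append(gm.score(X[test]))  # mean held-out log-likelihood
    return float(np.mean(scores))

candidates = [2, 3, 4, 6]
best = max(candidates, key=cv_log_likelihood)
print("selected number of parcels:", best)
```

On real fMRI data the same principle applies, but the score must also account for the intra- and inter-subject variance components the abstract discusses.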

36 citations


Cited by
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
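The "ease of use and API consistency" the abstract emphasizes amounts to every estimator sharing the same fit/predict/score interface; a minimal example:

```python
# Minimal scikit-learn usage: the uniform estimator API described in
# the abstract (fit on training data, score on held-out data).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy: %.2f" % clf.score(X_te, y_te))
```

Swapping `LogisticRegression` for any other scikit-learn classifier leaves the rest of the script unchanged, which is the design point.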

47,974 citations

Posted Content
TL;DR: Scikit-learn as mentioned in this paper is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from this http URL.

28,898 citations

28 Jul 2005
TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade host immune responses. Plasmodium falciparum erythrocyte membrane protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. Each haploid genome encodes roughly 60 members of the var gene family; switching transcription to different var gene variants provides the molecular basis for antigenic variation.

18,940 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost as discussed by the authors proposes a sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations