Showing papers by "Bertrand Thirion" published in 2021


Journal ArticleDOI
TL;DR: In this article, a multidomain brain decoder was proposed to automatically learn the spatiotemporal dynamics of brain response within a short time window using a deep learning approach.

34 citations


Journal ArticleDOI
TL;DR: In this paper, the authors developed a parcel-wise linear regression model based on a dictionary of reference topographies and aggregated estimates across these parcellations to predict functional features in left-out subjects.

19 citations


Posted ContentDOI
24 Sep 2021-bioRxiv
TL;DR: In this article, the authors applied machine learning on more than 10,000 individuals from the general population to define empirical approximations of health-related psychological measures that do not require human judgment.
Abstract: Background: Biological aging is revealed by physical measures, e.g., DNA probes or brain scans. In contrast, individual differences in mental function are explained by psychological constructs, e.g., intelligence or neuroticism. These constructs are typically assessed by tailored neuropsychological tests that build on expert judgement and require careful interpretation. Could machine learning on large samples from the general population be used to build proxy measures of these constructs that do not require human intervention? Results: Here, we built proxy measures by applying machine learning on multimodal MR images and rich sociodemographic information from the largest biomedical cohort to date: the UK Biobank. Objective model comparisons revealed that all proxies captured the target constructs and were as useful, and sometimes more useful, than the original measures for characterizing real-world health behavior (sleep, exercise, tobacco, alcohol consumption). We observed this complementarity of proxy measures and original measures when modeling from brain signals or sociodemographic data, capturing multiple health-related constructs. Conclusions: Population modeling with machine learning can derive measures of mental health from brain signals and questionnaire data, which may complement or even substitute for psychometric assessments in clinical populations. Key Points: We applied machine learning on more than 10,000 individuals from the general population to define empirical approximations of health-related psychological measures that do not require human judgment. We found that machine learning enriched the given psychological measures via approximation from brain and sociodemographic data: the resulting proxy measures related as well or better to real-world health behavior than the original measures. Model comparisons showed that sociodemographic information contributed most to characterizing psychological traits beyond aging.

16 citations


Journal ArticleDOI
TL;DR: In this article, the authors applied machine learning on multimodal MR images and rich sociodemographic information from the largest biomedical cohort to date: the UK Biobank.
Abstract: Background: Biological aging is revealed by physical measures, e.g., DNA probes or brain scans. In contrast, individual differences in mental function are explained by psychological constructs, e.g., intelligence or neuroticism. These constructs are typically assessed by tailored neuropsychological tests that build on expert judgement and require careful interpretation. Could machine learning on large samples from the general population be used to build proxy measures of these constructs that do not require human intervention? Results: Here, we built proxy measures by applying machine learning on multimodal MR images and rich sociodemographic information from the largest biomedical cohort to date: the UK Biobank. Objective model comparisons revealed that all proxies captured the target constructs and were as useful, and sometimes more useful, than the original measures for characterizing real-world health behavior (sleep, exercise, tobacco, alcohol consumption). We observed this complementarity of proxy measures and original measures at capturing multiple health-related constructs when modeling from both brain signals and sociodemographic data. Conclusion: Population modeling with machine learning can derive measures of mental health from heterogeneous inputs including brain signals and questionnaire data. This may complement or even substitute for psychometric assessments in clinical populations.

15 citations


Journal ArticleDOI
TL;DR: In this paper, the authors introduce a new methodology to analyze brain responses across tasks without a joint model of the psychological processes, which boosts statistical power in small studies with specific cognitive focus by analyzing them jointly with large studies that probe less focal mental processes.
Abstract: Cognitive brain imaging is accumulating datasets about the neural substrate of many different mental processes. Yet, most studies are based on few subjects and have low statistical power. Analyzing data across studies could bring more statistical power; yet the current brain-imaging analytic framework cannot be used at scale as it requires casting all cognitive tasks in a unified theoretical framework. We introduce a new methodology to analyze brain responses across tasks without a joint model of the psychological processes. The method boosts statistical power in small studies with specific cognitive focus by analyzing them jointly with large studies that probe less focal mental processes. Our approach improves decoding performance for 80% of 35 widely-different functional-imaging studies. It finds commonalities across tasks in a data-driven way, via common brain representations that predict mental processes. These are brain networks tuned to psychological manipulations. They outline interpretable and plausible brain structures. The extracted networks have been made available; they can be readily reused in new neuro-imaging studies. We provide a multi-study decoding tool to adapt to new data.

11 citations


Journal ArticleDOI
TL;DR: In this article, the authors benchmark five functional alignment methods for inter-subject decoding on four publicly available datasets and find that functional alignment generally improves intersubject decoding accuracy though the best performing method depends on the research context.

8 citations
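
To make the idea of functional alignment concrete, here is a minimal sketch of one widely used approach, orthogonal Procrustes alignment. The listing above does not name the five benchmarked methods, so treating Procrustes as representative is an assumption; the function name `procrustes_align` and the simulated data are hypothetical.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F.

    source, target: arrays of shape (n_samples, n_voxels).
    """
    # The SVD of the cross-product matrix yields the optimal orthogonal map.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)
target = rng.standard_normal((50, 8))  # simulated "target subject" responses

# Hypothetical "source subject": the same responses under a random rotation.
true_rotation, _ = np.linalg.qr(rng.standard_normal((8, 8)))
source = target @ true_rotation.T

R = procrustes_align(source, target)
aligned = source @ R  # source data mapped into the target subject's space
```

In this noiseless toy case the rotation is recovered exactly, so `aligned` matches `target`; with real fMRI data the fit is only approximate and is typically estimated within local regions.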


Journal ArticleDOI
TL;DR: In this article, the authors leverage the Individual Brain Charting (IBC) dataset, a high-resolution task-fMRI dataset acquired in a fixed environment, to study the feasibility of individual mapping.
Abstract: Functional Magnetic Resonance Imaging (fMRI) has opened the possibility to investigate how brain activity is modulated by behavior. Most studies so far are bound to a single task, in which functional responses to a handful of contrasts are analyzed and reported as a group-average brain map. By contrast, recent data-collection efforts have started to target a systematic spatial representation of multiple mental functions. In this paper, we leverage the Individual Brain Charting (IBC) dataset, a high-resolution task-fMRI dataset acquired in a fixed environment, in order to study the feasibility of individual mapping. First, we verify that the IBC brain maps reproduce those obtained from previous, large-scale datasets using the same tasks. Second, we confirm that the elementary spatial components, inferred across all tasks, are consistently mapped within and, to a lesser extent, across participants. Third, we demonstrate the relevance of the topographic information of the individual contrast maps, showing that contrasts from one task can be predicted by contrasts from other tasks. Finally, we showcase the benefit of contrast accumulation for the fine functional characterization of brain regions within a pre-specified network. To this end, we analyze the cognitive profile of functional territories pertaining to the language network and show that these profiles generalize across participants.

7 citations


Journal ArticleDOI
TL;DR: Machine learning (ML) has been proposed for tissue fate prediction after acute ischemic stroke (AIS), with the aim of helping treatment decisions and patient management; in this article, the authors compare three different ML models.
Abstract: Machine Learning (ML) has been proposed for tissue fate prediction after acute ischemic stroke (AIS), with the aim to help treatment decision and patient management. We compared three different ML ...

5 citations


Journal ArticleDOI
TL;DR: This article introduces the concept of functional fingerprint, which subsumes the accumulation of functional information at a given brain location, and discusses it in detail through concrete examples taken from the Individual Brain Charting dataset.
Abstract: How can neuroimaging inform us about the function of brain structures? This simple question immediately brings out two pertinent issues: (i) an inference problem, namely the fact that the function of a region can only be asserted after observing a large array of experimental conditions or contrasts; and (ii) the fact that the identity of a region can only be defined with accuracy at the individual level, because of intrinsic differences between subjects. To overcome this double challenge, we consider an approach based on the deep phenotyping of behavioral responses from task data acquired using functional Magnetic Resonance Imaging. The concept of functional fingerprint, which subsumes the accumulation of functional information at a given brain location, is herein discussed in detail through concrete examples taken from the Individual Brain Charting dataset.

4 citations


Journal ArticleDOI
TL;DR: In this paper, the Ensemble of Clustered Desparsified Lasso (EnCluDL) procedure for multivariate statistical inference on high-dimensional structured data was proposed.

3 citations


Book ChapterDOI
27 Sep 2021
TL;DR: Conditional Independent Components Analysis (Conditional ICA) is a fast functional Magnetic Resonance Imaging (fMRI) data augmentation technique that leverages abundant resting-state data to create images by sampling from an ICA decomposition.
Abstract: Advances in computational cognitive neuroimaging research are related to the availability of large amounts of labeled brain imaging data, but such data are scarce and expensive to generate. While powerful data generation mechanisms, such as Generative Adversarial Networks (GANs), have been designed in the last decade for computer vision, such improvements have not yet carried over to brain imaging. A likely reason is that GAN training is ill-suited to the noisy, high-dimensional and small-sample data available in functional neuroimaging. In this paper, we introduce Conditional Independent Components Analysis (Conditional ICA): a fast functional Magnetic Resonance Imaging (fMRI) data augmentation technique that leverages abundant resting-state data to create images by sampling from an ICA decomposition. We then propose a mechanism to condition the generator on classes observed with few samples. We first show that the generative mechanism is successful at synthesizing data indistinguishable from observations, and that it yields gains in classification accuracy in brain decoding problems. In particular, it outperforms GANs while being much easier to optimize and interpret. Lastly, Conditional ICA enhances classification accuracy in eight datasets without further parameter tuning.

Posted Content
TL;DR: Adaptive multi-view ICA (AVICA) as mentioned in this paper is a noisy ICA model where each view is a linear mixture of shared independent sources with additive noise on the sources.
Abstract: We consider a multi-view learning problem known as group independent component analysis (group ICA), where the goal is to recover shared independent sources from many views. The statistical modeling of this problem requires taking noise into account. When the model includes additive noise on the observations, the likelihood is intractable. By contrast, we propose Adaptive multiView ICA (AVICA), a noisy ICA model where each view is a linear mixture of shared independent sources with additive noise on the sources. In this setting, the likelihood has a tractable expression, which enables either direct optimization of the log-likelihood using a quasi-Newton method, or generalized EM. Importantly, we consider that the noise levels are also parameters that are learned from the data. This enables source estimation with a closed-form Minimum Mean Squared Error (MMSE) estimator which weights each view according to its relative noise level. On synthetic data, AVICA yields better source estimates than other group ICA methods thanks to its explicit MMSE estimator. On real magnetoencephalography (MEG) data, we provide evidence that the decomposition is less sensitive to sampling noise and that the noise variance estimates are biologically plausible. Lastly, on functional magnetic resonance imaging (fMRI) data, AVICA exhibits the best performance in transferring information across views.
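
The intuition behind weighting each view according to its relative noise level can be illustrated with a plain inverse-variance weighted average. This is a simplified sketch, not the AVICA estimator itself: it assumes the views have already been unmixed, that the noise variances are known, and it ignores the source prior that a full MMSE estimator would include. The function name `precision_weighted_average` and the simulated data are hypothetical.

```python
import numpy as np


def precision_weighted_average(view_estimates, noise_variances):
    """Combine noisy per-view copies of one source, weighting by 1 / variance.

    view_estimates: array of shape (n_views, n_samples).
    noise_variances: array of shape (n_views,), assumed known here.
    """
    precisions = 1.0 / np.asarray(noise_variances, dtype=float)
    weights = precisions / precisions.sum()  # weights sum to one
    return weights @ np.asarray(view_estimates, dtype=float), weights


rng = np.random.default_rng(0)
n_samples = 1000
source = rng.laplace(size=n_samples)  # one non-Gaussian shared source

# Three simulated views of the same source with different noise levels.
noise_variances = np.array([0.1, 1.0, 4.0])
noise = np.sqrt(noise_variances)[:, None] * rng.standard_normal((3, n_samples))
views = source + noise

weighted, weights = precision_weighted_average(views, noise_variances)
plain = views.mean(axis=0)  # unweighted baseline

mse_weighted = np.mean((weighted - source) ** 2)
mse_plain = np.mean((plain - source) ** 2)
```

The least noisy view receives the largest weight, and in this simulation the weighted combination has a lower mean squared error than the unweighted mean.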

Posted Content
TL;DR: ShICA-J as mentioned in this paper models each view as a linear transform of shared independent components contaminated by additive Gaussian noise and uses joint diagonalization after multiset CCA to solve this problem.
Abstract: We consider shared response modeling, a multi-view learning problem where one wants to identify common components from multiple datasets or views. We introduce Shared Independent Component Analysis (ShICA) that models each view as a linear transform of shared independent components contaminated by additive Gaussian noise. We show that this model is identifiable if the components are either non-Gaussian or have enough diversity in noise variances. We then show that in some cases multi-set canonical correlation analysis can recover the correct unmixing matrices, but that even a small amount of sampling noise makes Multiset CCA fail. To solve this problem, we propose to use joint diagonalization after Multiset CCA, leading to a new approach called ShICA-J. We show via simulations that ShICA-J leads to improved results while being very fast to fit. While ShICA-J is based on second-order statistics, we further propose to leverage non-Gaussianity of the components using a maximum-likelihood method, ShICA-ML, that is both more accurate and more costly. Further, ShICA comes with a principled method for shared components estimation. Finally, we provide empirical evidence on fMRI and MEG datasets that ShICA yields more accurate estimation of the components than alternatives.

Posted Content
TL;DR: In this article, the authors study the properties of ensembled clustered inference algorithms which combine three techniques: spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
Abstract: We consider the inference problem for high-dimensional linear models, when covariates have an underlying spatial organization reflected in their correlation. A typical example of such a setting is high-resolution imaging, in which neighboring pixels are usually very similar. Accurate estimation of point values and confidence intervals is not possible in this context, which combines many more covariates than samples with high correlation between covariates. This calls for a reformulation of the statistical inference problem that takes into account the underlying spatial structure: if covariates are locally correlated, it is acceptable to detect them up to a given spatial uncertainty. We thus propose to rely on the $\delta$-FWER, that is, the probability of making a false discovery at a distance greater than $\delta$ from any true positive. With this target measure in mind, we study the properties of ensembled clustered inference algorithms which combine three techniques: spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions. We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter. We complement the theoretical analysis with empirical results, demonstrating accurate $\delta$-FWER control and decent power achieved by such inference algorithms.
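
The $\delta$-FWER definition above can be checked empirically on simulated runs. The sketch below assumes covariates lie on a one-dimensional grid, so that distance is the absolute difference of indices; the function names and the toy discovery sets are hypothetical and only illustrate the definition, not the inference algorithms studied in the paper.

```python
import numpy as np


def has_delta_false_discovery(discoveries, true_support, delta):
    """True if some discovery lies farther than delta from every true positive."""
    discoveries = np.asarray(discoveries)
    true_support = np.asarray(true_support)
    if discoveries.size == 0:
        return False  # no discovery, hence no false discovery
    if true_support.size == 0:
        return True  # with an empty support, every discovery is false
    # Distance from each discovery to its nearest true positive (1D grid).
    nearest = np.abs(discoveries[:, None] - true_support[None, :]).min(axis=1)
    return bool((nearest > delta).any())


def empirical_delta_fwer(runs, true_support, delta):
    """Fraction of runs containing at least one delta-false discovery."""
    flags = [has_delta_false_discovery(d, true_support, delta) for d in runs]
    return float(np.mean(flags))


true_support = [10, 11, 12]
runs = [[10, 13], [9, 12], [20], []]  # discoveries from four simulated runs

fwer_tolerant = empirical_delta_fwer(runs, true_support, delta=2)  # 0.25
fwer_strict = empirical_delta_fwer(runs, true_support, delta=0)  # 0.75
```

With a tolerance of two grid steps, only the run that selects index 20 counts as an error; with no tolerance, the near misses at indices 9 and 13 count as well.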

Posted Content
TL;DR: In this paper, low-dimensional embedding spaces were derived from the UK Biobank population dataset and used to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics.
Abstract: High-quality data accumulation is now becoming ubiquitous in the health domain. There is increasing opportunity to exploit rich data from normal subjects to improve supervised estimators in specific diseases with notorious data scarcity. We demonstrate that low-dimensional embedding spaces can be derived from the UK Biobank population dataset and used to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics. Phenotype predictions facilitated by Variational Autoencoder manifolds typically scaled better with increasing unlabeled data than dimensionality reduction by PCA or Isomap. Performance gains from semi-supervised approaches will probably become an important ingredient for various medical data science applications.
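
The general semi-supervised recipe described above (learn an embedding from abundant unlabeled data, then fit a supervised model on a few labeled samples in that embedding) can be sketched with PCA, one of the baselines named in the abstract. Everything below is an assumption made for illustration: the data are simulated rather than taken from the UK Biobank, the ridge penalty `alpha` is arbitrary, and a Variational Autoencoder would replace the PCA step in the approach the abstract favors.

```python
import numpy as np

rng = np.random.default_rng(0)
n_unlabeled, n_labeled, n_test = 2000, 20, 200
n_features, n_latent = 50, 3

# Simulated population: features and target both depend on a few latent factors.
mixing = rng.standard_normal((n_latent, n_features))
target_weights = rng.standard_normal(n_latent)


def simulate(n):
    latent = rng.standard_normal((n, n_latent))
    features = latent @ mixing + 0.1 * rng.standard_normal((n, n_features))
    return features, latent @ target_weights


X_unlabeled, _ = simulate(n_unlabeled)  # labels are discarded on purpose
X_labeled, y_labeled = simulate(n_labeled)
X_test, y_test = simulate(n_test)

# Step 1: learn a low-dimensional embedding (PCA via SVD) from unlabeled data.
mean = X_unlabeled.mean(axis=0)
_, _, vt = np.linalg.svd(X_unlabeled - mean, full_matrices=False)
components = vt[:n_latent]


def embed(X):
    return (X - mean) @ components.T


# Step 2: ridge regression on the few labeled samples, in the embedding space.
Z = embed(X_labeled)
z_mean, y_mean = Z.mean(axis=0), y_labeled.mean()
alpha = 1e-3  # arbitrary small penalty
gram = (Z - z_mean).T @ (Z - z_mean) + alpha * np.eye(n_latent)
coef = np.linalg.solve(gram, (Z - z_mean).T @ (y_labeled - y_mean))

# Step 3: predict held-out samples and score with R^2.
pred = (embed(X_test) - z_mean) @ coef + y_mean
r2 = 1 - np.sum((y_test - pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2)
```

Because the embedding is estimated from 2000 unlabeled samples, the supervised step only has to fit three coefficients from the 20 labeled ones, which is the source of the data-efficiency gain in this toy setting.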