
Showing papers by "Bertrand Thirion" published in 2012


Posted Content
TL;DR: Scikit-learn is a Python module that integrates a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from this http URL.

28,898 citations


Journal ArticleDOI
TL;DR: A method that combines signals from many brain regions observed in functional Magnetic Resonance Imaging to predict the subject's behavior during a scanning session yields higher prediction accuracy than standard voxel-based approaches and infers an explicit weighting of the regions involved in the regression or classification task.

137 citations



Journal ArticleDOI
TL;DR: A sparse hierarchical structured regularization that encodes the spatial structure of the data at different scales into the regularization, which makes the overall prediction procedure more robust to inter-subject variability.
Abstract: Inverse inference, or "brain reading", is a recent paradigm for analyzing functional magnetic resonance imaging (fMRI) data, based on pattern recognition and statistical learning. By predicting some cognitive variables related to brain activation maps, this approach aims at decoding brain activity. Inverse inference takes into account the multivariate information between voxels and is currently the only way to assess how precisely some cognitive information is encoded by the activity of neural populations within the whole brain. However, it relies on a prediction function that is plagued by the curse of dimensionality, since there are far more features than samples, i.e., more voxels than fMRI volumes. To address this problem, different methods have been proposed, such as, among others, univariate feature selection, feature agglomeration and regularization techniques. In this paper, we consider a sparse hierarchical structured regularization. Specifically, the penalization we use is constructed from a tree that is obtained by spatially-constrained agglomerative clustering. This approach encodes the spatial structure of the data at different scales into the regularization, which makes the overall prediction procedure more robust to inter-subject variability. The regularization used induces the selection of spatially coherent predictive brain regions simultaneously at different scales. We test our algorithm on real data acquired to study the mental representation of objects, and we show that the proposed algorithm not only delineates meaningful brain regions but yields as well better prediction accuracy than reference methods.

95 citations


Journal ArticleDOI
TL;DR: A systematic comparison of 2D versus 3D group-level inference procedures, by using cluster-level and voxel-level statistics assessed by permutation, in random effects (RFX) and mixed-effects analyses (MFX).

76 citations


Proceedings Article
26 Jun 2012
TL;DR: In this paper, sparse regression models over new variables obtained by clustering of the original variables are proposed to overcome the difficulties of support recovery on functional MRI (fMRI) data, where the number of samples is small and the variables are strongly correlated.
Abstract: Functional neuroimaging can measure the brain's response to an external stimulus. It is used to perform brain mapping: identifying from these observations the brain regions involved. This problem can be cast into a linear supervised learning task where the neuroimaging data are used as predictors for the stimulus. Brain mapping is then seen as a support recovery problem. On functional MRI (fMRI) data, this problem is particularly challenging as i) the number of samples is small due to limited acquisition time and ii) the variables are strongly correlated. We propose to overcome these difficulties using sparse regression models over new variables obtained by clustering of the original variables. The use of randomization techniques, e.g. bootstrap samples, and clustering of the variables improves the recovery properties of sparse methods. We demonstrate the benefit of our approach on an extensive simulation study as well as two fMRI datasets.

72 citations


Journal ArticleDOI
TL;DR: This work introduces regularization in the Minimum Covariance Determinant (MCD) framework, investigates different regularization schemes, and demonstrates on functional neuroimaging datasets that outlier detection can be performed with small sample sizes and improves group studies.

54 citations


Book ChapterDOI
01 Oct 2012
TL;DR: A novel multimodal integration approach based on a sparse Gaussian graphical model is proposed for estimating brain connectivity. It casts functional connectivity estimation as a sparse inverse covariance learning problem and adjusts the level of sparse penalization on each connection based on its anatomical capacity for functional interactions.
Abstract: Despite the clear potential benefits of combining fMRI and diffusion MRI in learning the neural pathways that underlie brain functions, little methodological progress has been made in this direction. In this paper, we propose a novel multimodal integration approach based on a sparse Gaussian graphical model for estimating brain connectivity. Casting functional connectivity estimation as a sparse inverse covariance learning problem, we adapt the level of sparse penalization on each connection based on its anatomical capacity for functional interactions. Functional connections with little anatomical support are thus more heavily penalized. For validation, we showed on real data collected from a cohort of 60 subjects that additionally modeling anatomical capacity significantly increases subject consistency in the detected connection patterns. Moreover, we demonstrated that incorporating a connectivity prior learned with our multimodal connectivity estimation approach improves activation detection.

51 citations



Posted Content
TL;DR: Randomization techniques, e.g. bootstrap samples, combined with clustering of the variables improve the recovery properties of sparse methods; sparse regression models over new variables obtained by clustering the original variables overcome the difficulties encountered on functional MRI data.
Abstract: Functional neuroimaging can measure the brain's response to an external stimulus. It is used to perform brain mapping: identifying from these observations the brain regions involved. This problem can be cast into a linear supervised learning task where the neuroimaging data are used as predictors for the stimulus. Brain mapping is then seen as a support recovery problem. On functional MRI (fMRI) data, this problem is particularly challenging as i) the number of samples is small due to limited acquisition time and ii) the variables are strongly correlated. We propose to overcome these difficulties using sparse regression models over new variables obtained by clustering of the original variables. The use of randomization techniques, e.g. bootstrap samples, and clustering of the variables improves the recovery properties of sparse methods. We demonstrate the benefit of our approach on an extensive simulation study as well as two fMRI datasets.

40 citations


Journal ArticleDOI
TL;DR: It is found that summarizing the structure as strongly-connected networks can give a good description only for very large and overlapping networks, highlighting that Markov models are good tools to identify the structure of brain connectivity from fMRI signals, but for this purpose they must reflect the small-world properties of the underlying neural systems.
Abstract: Correlations in the signal observed via functional Magnetic Resonance Imaging (fMRI) are expected to reveal the interactions in the underlying neural populations through the hemodynamic response. In particular, they highlight distributed sets of mutually correlated regions that correspond to brain networks related to different cognitive functions. Yet graph-theoretical studies of neural connections give a different picture: that of a highly integrated system with small-world properties, namely local clustering but with short pathways across the complete structure. We examine the conditional independence properties of the fMRI signal, i.e. its Markov structure, to find realistic assumptions on the connectivity structure that are required to explain the observed functional connectivity. In particular we seek a decomposition of the Markov structure into segregated functional networks using decomposable graphs: a set of strongly-connected and partially overlapping cliques. We introduce a new method to efficiently extract such cliques on a large, strongly-connected graph. We compare methods learning different graph structures from functional connectivity by testing the goodness of fit of the model they learn on new data. We find that summarizing the structure as strongly-connected networks can give a good description only for very large and overlapping networks. These results highlight that Markov models are good tools to identify the structure of brain connectivity from fMRI signals, but for this purpose they must reflect the small-world properties of the underlying neural systems.

Journal ArticleDOI
TL;DR: The uncovered interplay between the two regions is proposed to reflect a generic binding process that dynamically weights the perceptual evidence supporting the different shape and motion interpretations according to the reliability of the neural activity in these regions.
Abstract: Visual shape and motion information, processed in distinct brain regions, should be combined to elicit a unitary coherent percept of an object in motion. In an fMRI study, we identified brain regions underlying the perceptual binding of motion and shape independently of the features-contrast, motion, and shape-used to design the moving displays. These displays alternately elicited a bound (moving diamond) or an unbound (disconnected moving segments) percept, and were either physically unchanging yet perceptually bistable or physically changing over time. The joint analysis of the blood-oxygen-level-dependent (BOLD) signals recorded during bound or unbound perception with these different stimuli revealed a network comprising the occipital lobe and ventral and dorsal visual regions. Bound percepts correlated with in-phase BOLD increases within the occipital lobe and a ventral area and decreased activity in a dorsal area, while unbound percepts elicited moderate BOLD modulations in these regions. This network was similarly activated by bistable unchanging displays and by displays periodically changing over time. The uncovered interplay between the two regions is proposed to reflect a generic binding process that dynamically weights the perceptual evidence supporting the different shape and motion interpretations according to the reliability of the neural activity in these regions.

Book ChapterDOI
01 Oct 2012
TL;DR: This paper uses a linear model for its robustness in high dimension and its possible interpretation, and shows that this approach is able to predict the correct ordering on pairs of images, yielding higher prediction accuracy than standard regression and multiclass classification techniques.
Abstract: Medical images can be used to predict a clinical score coding for the severity of a disease, a pain level or the complexity of a cognitive task. In all these cases, the predicted variable has a natural order. While a standard classifier discards this information, we would like to take it into account in order to improve prediction performance. A standard linear regression does model such information; however, the linearity assumption is unlikely to be satisfied when predicting from pixel intensities in an image. In this paper we address these modeling challenges with a supervised learning procedure where the model aims to order or rank images. We use a linear model for its robustness in high dimension and its possible interpretation. We show on simulations and two fMRI datasets that this approach is able to predict the correct ordering on pairs of images, yielding higher prediction accuracy than standard regression and multiclass classification techniques.
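The idea of learning to order images with a linear model can be illustrated with a pairwise transform: each pair of samples with different targets becomes one difference vector labeled by which sample ranks higher. The following is a minimal sketch under stated assumptions, not the authors' implementation: the perceptron update stands in for whatever linear classifier the paper uses, and the function names and toy data are hypothetical.

```python
import random


def pairwise_differences(X, y):
    """Turn samples into difference vectors x_i - x_j labeled by sign(y_i - y_j)."""
    diffs, signs = [], []
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if y[i] == y[j]:
                continue  # ties carry no ordering information
            diffs.append([a - b for a, b in zip(X[i], X[j])])
            signs.append(1 if y[i] > y[j] else -1)
    return diffs, signs


def dot(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))


def fit_linear_ranker(X, y, epochs=50, seed=0):
    """Perceptron on pairwise differences (an assumed stand-in classifier)."""
    diffs, signs = pairwise_differences(X, y)
    w = [0.0] * len(X[0])
    rng = random.Random(seed)
    order = list(range(len(diffs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for k in order:
            if signs[k] * dot(w, diffs[k]) <= 0:  # pair ordered wrongly
                w = [wi + signs[k] * di for wi, di in zip(w, diffs[k])]
    return w


def pairwise_accuracy(w, X, y):
    """Fraction of non-tied pairs whose order is predicted correctly."""
    diffs, signs = pairwise_differences(X, y)
    correct = sum(1 for d, s in zip(diffs, signs) if s * dot(w, d) > 0)
    return correct / len(diffs)


# Toy data: the target is monotone but non-linear (cubic) in the first feature,
# so the ordering is recoverable by a linear model even though regression
# on the raw target would be misspecified.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [x[0] ** 3 for x in X]
w = fit_linear_ranker(X, y)
print("weights:", w, "pairwise accuracy:", pairwise_accuracy(w, X, y))
```

Because only the sign of each pairwise comparison is used, any monotone transformation of the target leaves the training problem unchanged, which is the property the abstract relies on when the linearity assumption fails.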

Proceedings ArticleDOI
02 Jul 2012
TL;DR: This paper proposes incorporating connectivity into sparse classifier learning so that both local and long-range connections can be jointly modeled and demonstrates that integrating connectivity information inferred from diffusion tensor imaging (DTI) data provides higher classification accuracy and more interpretable classifier weight patterns than standard classifiers.
Abstract: In recent years, sparse regularization has become a dominant means for handling the curse of dimensionality in functional magnetic resonance imaging (fMRI) based brain decoding problems. Enforcing sparsity alone, however, neglects the interactions between connected brain areas. Methods that additionally impose spatial smoothness would account for local but not long-range interactions. In this paper, we propose incorporating connectivity into sparse classifier learning so that both local and long-range connections can be jointly modeled. On real data, we demonstrate that integrating connectivity information inferred from diffusion tensor imaging (DTI) data provides higher classification accuracy and more interpretable classifier weight patterns than standard classifiers. Our results thus illustrate the benefits of adding neurologically-relevant priors in fMRI brain decoding.

Book ChapterDOI
01 Oct 2012
TL;DR: A new graph-based framework designed to deal with inter-subject functional variability present in fMRI data is introduced and it is shown that it is the only approach to perform above chance level, among a wide range of tested methods.
Abstract: Classification of medical images in multi-subject settings is a difficult challenge due to the variability that exists between individuals. Here we introduce a new graph-based framework designed to deal with inter-subject functional variability present in fMRI data. A graphical model is constructed to encode the functional, geometric and structural properties of local activation patterns. We then design a specific graph kernel that makes it possible to conduct SVM classification in graph space. Experiments conducted in an inter-subject classification task of patterns recorded in the auditory cortex show that it is the only approach to perform above chance level, among a wide range of tested methods.

Book ChapterDOI
01 Oct 2012
TL;DR: An extension of the diffeomorphic Geometric Demons algorithm which combines the iconic registration with geometric constraints works in the log-domain space, so that one can efficiently compute the deformation field of the geometry.
Abstract: We present an extension of the diffeomorphic Geometric Demons algorithm which combines the iconic registration with geometric constraints. Our algorithm works in the log-domain space, so that one can efficiently compute the deformation field of the geometry. We represent the shape of objects of interest in the space of currents, which is sensitive to both location and geometric structure of objects. Currents provide a distance between geometric structures that can be defined without specifying explicit point-to-point correspondences. We demonstrate this framework by registering simultaneously T1 images and 65 fiber bundles consistently extracted in 12 subjects and compare it against non-linear T1, tensor, and multi-modal T1 + Fractional Anisotropy (FA) registration algorithms. Results show the superiority of the Log-domain Geometric Demons over their purely iconic counterparts.

Journal Article
TL;DR: The A-Brain project addresses the computational problem using cloud computing techniques on Microsoft Azure, relying on complementary expertise in the area of scalable cloud data management and in the field of neuroimaging and genetics data analysis.
Abstract: Joint genetic and neuroimaging data analysis on large cohorts of subjects is a new approach used to assess and understand the variability that exists between individuals. This approach has remained poorly understood so far and brings forward very significant challenges, as progress in this field can open pioneering directions in biology and medicine. As both neuroimaging- and genetic-domain observations represent a huge amount of variables (of the order of 10^6), performing statistically rigorous analyses on such Big Data represents a computational challenge that cannot be addressed with conventional computational techniques. In the A-Brain project, we address this computational problem using cloud computing techniques on Microsoft Azure, relying on our complementary expertise in the area of scalable cloud data management and in the field of neuroimaging and genetics data analysis.

Proceedings ArticleDOI
02 Jul 2012
TL;DR: This work builds a decoder that predicts the visual percept formed by four letter words, allowing us to identify words that were not present in the training data, and addresses a challenging estimation problem.
Abstract: Word reading involves multiple cognitive processes. To infer which word is being visualized, the brain first processes the visual percept, deciphers the letters, bigrams, and activates different words based on context or prior expectation like word frequency. In this contribution, we use supervised machine learning techniques to decode the first step of this processing stream using functional Magnetic Resonance Images (fMRI). We build a decoder that predicts the visual percept formed by four letter words, allowing us to identify words that were not present in the training data. To do so, we cast the learning problem as multiple classification problems after describing words with multiple binary attributes. This work goes beyond the identification or reconstruction of single letters or simple geometrical shapes and addresses a challenging estimation problem, that is the prediction of multiple variables from a single observation, hence facing the problem of learning multiple predictors from correlated inputs.
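Describing words by binary attributes is what allows a decoder to name words it never saw during training: classifiers are learned per attribute, and a new word is identified by matching the predicted attribute vector against the codes of candidate words. The sketch below illustrates only the matching step. The (position, letter) encoding, the function names, and the soft scores standing in for classifier outputs are all assumptions for illustration and are not taken from the paper.

```python
import string

ALPHABET = string.ascii_lowercase


def word_to_attributes(word):
    """Hypothetical encoding: one binary attribute per (position, letter) pair."""
    return [1 if word[pos] == letter else 0
            for pos in range(len(word))
            for letter in ALPHABET]


def identify_word(predicted, candidates):
    """Return the candidate whose attribute code is closest to the prediction."""
    def squared_distance(word):
        code = word_to_attributes(word)
        return sum((p - c) ** 2 for p, c in zip(predicted, code))
    return min(candidates, key=squared_distance)


# Stand-in for the outputs of 4 x 26 independent binary classifiers:
# a soft score of 0.9 for attributes that are on and 0.1 for those that are off.
true_code = word_to_attributes("lamp")
predicted = [0.8 * bit + 0.1 for bit in true_code]

# Candidates may include words absent from the training set, since only
# the attribute classifiers need to be trained.
candidates = ["lamp", "lump", "camp", "word"]
decoded = identify_word(predicted, candidates)
print(decoded)
```

The design choice here is that the learning problem scales with the number of attributes rather than the number of words, at the cost of having to predict many correlated binary variables from one observation.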

10 Jun 2012
TL;DR: This work proposes an extension of the well-established diffeomorphic Demons registration algorithm to take into account geometric constraints, and defines a mathematically sound framework to jointly register images and geometric descriptors such as fibers or sulcal lines.
Abstract: Image registration is undoubtedly one of the most active areas of research in medical imaging. For inter-individual comparison, registration should align images as well as cortical and external structures such as sulcal lines and fibers. With image-based registration [1], neural fibers appear uniformly white, giving no information to the registration. Tensor-based registration was recently proposed to improve white-matter alignment [2,3]; however, misregistration may also persist in regions where the tensor field appears uniform [4]. We propose a hybrid approach by extending the Diffeomorphic Demons (D) [5] registration to incorporate geometric constraints. Combining the deformation field induced by the image and the geometry, we define a mathematically sound framework to jointly register images and geometric descriptors such as fibers or sulcal lines.

Book ChapterDOI
01 Oct 2012
TL;DR: In this paper, the problem of voxel selection based on transfer learning has been studied in the context of brain imaging and it has been shown that selecting predictive voxels on the reference task leads to higher detection power on small cohorts.
Abstract: Typical cohorts in brain imaging studies are not large enough for systematic testing of all the information contained in the images. To build testable working hypotheses, investigators thus rely on analysis of previous work, sometimes formalized in a so-called meta-analysis. In brain imaging, this approach underlies the specification of regions of interest (ROIs) that are usually selected on the basis of the coordinates of previously detected effects. In this paper, we propose to use a database of images, rather than coordinates, and frame the problem as transfer learning: learning a discriminant model on a reference task to apply it to a different but related new task. To facilitate statistical analysis of small cohorts, we use a sparse discriminant model that selects predictive voxels on the reference task and thus provides a principled procedure to define ROIs. The benefits of our approach are twofold. First it uses the reference database for prediction, i.e. to provide potential biomarkers in a clinical setting. Second it increases statistical power on the new task. We demonstrate on a set of 18 pairs of functional MRI experimental conditions that our approach gives good prediction. In addition, on a specific transfer situation involving different scanners at different locations, we show that voxel selection based on transfer learning leads to higher detection power on small cohorts.

27 Aug 2012
TL;DR: The MapReduce framework coupled with efficient algorithms makes it possible to deliver a scalable analysis tool that deals with high-dimensional data and thousands of permutations in a few hours, and shows promising results with a genetic variant that survives the very strict correction for multiple testing.
Abstract: In the last few years, it has become possible to acquire high-dimensional neuroimaging and genetic data on relatively large cohorts of subjects, which provides novel means to understand the large between-subject variability observed in brain organization. Genetic association studies aim at unveiling correlations between the genetic variants and the numerous phenotypes extracted from brain images and thus face a dire multiple comparisons issue. While these statistics can be accumulated across the brain volume for the sake of sensitivity, the significance of the resulting summary statistics can only be assessed through permutations. Fortunately, the increase of computational power can be exploited, but this requires designing new parallel algorithms. The MapReduce framework coupled with efficient algorithms makes it possible to deliver a scalable analysis tool that deals with high-dimensional data and thousands of permutations in a few hours. On a real functional MRI dataset, this tool shows promising results with a genetic variant that survives the very strict correction for multiple testing.
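Permutation-based correction for multiple comparisons can be sketched with a max-statistic test: the genotype is shuffled across subjects, the largest association statistic over all phenotypes is recorded for each shuffle, and each observed statistic is compared with that distribution of maxima. This is a simplified illustration, assuming a plain absolute correlation as the statistic rather than the accumulated summary statistics the abstract describes. The function names and simulated data are hypothetical, and the loop runs serially where the paper distributes the work with MapReduce.

```python
import random


def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def max_stat_permutation_test(genotype, phenotypes, n_perm=200, seed=0):
    """Family-wise corrected p-value for each phenotype via the max statistic."""
    rng = random.Random(seed)
    observed = [abs(correlation(genotype, ph)) for ph in phenotypes]
    shuffled = list(genotype)
    null_max = []
    for _ in range(n_perm):
        # Permutations are independent of each other, so this loop is the
        # part that could be distributed as a "map" step.
        rng.shuffle(shuffled)
        null_max.append(max(abs(correlation(shuffled, ph)) for ph in phenotypes))
    # "Reduce" step: compare each observed statistic with the null maxima.
    return [(1 + sum(m >= obs for m in null_max)) / (1 + n_perm)
            for obs in observed]


# Simulated cohort: 30 subjects, one associated phenotype and four noise ones.
rng = random.Random(42)
genotype = [0, 1, 2] * 10
phenotypes = [[g + rng.gauss(0, 0.1) for g in genotype]]
phenotypes += [[rng.gauss(0, 1) for _ in genotype] for _ in range(4)]
p_values = max_stat_permutation_test(genotype, phenotypes)
print(p_values)
```

Comparing against the maximum over all phenotypes controls the family-wise error rate without assuming independence between phenotypes, which is why the cost is dominated by the number of permutations.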

Posted Content
TL;DR: In this paper, a supervised learning procedure is proposed in which a linear model aims to order or rank images. The approach predicts the correct ordering on pairs of images and yields higher prediction accuracy than standard regression and multiclass classification techniques.
Abstract: Medical images can be used to predict a clinical score coding for the severity of a disease, a pain level or the complexity of a cognitive task. In all these cases, the predicted variable has a natural order. While a standard classifier discards this information, we would like to take it into account in order to improve prediction performance. A standard linear regression does model such information; however, the linearity assumption is unlikely to be satisfied when predicting from pixel intensities in an image. In this paper we address these modeling challenges with a supervised learning procedure where the model aims to order or rank images. We use a linear model for its robustness in high dimension and its possible interpretation. We show on simulations and two fMRI datasets that this approach is able to predict the correct ordering on pairs of images, yielding higher prediction accuracy than standard regression and multiclass classification techniques.

Proceedings ArticleDOI
02 Jul 2012
TL;DR: In this article, transfer learning and selection transfer are compared based on their ability to identify the common patterns between brain activation maps related to two functional tasks and provide some preliminary quantification of these similarities, and show that selection transfer makes it possible to set a spatial scale yielding ROIs that are more specific to the context of interest.
Abstract: Researchers in functional neuroimaging mostly use activation coordinates to formulate their hypotheses. Instead, we propose to use the full statistical images to define regions of interest (ROIs). This paper presents two machine learning approaches, transfer learning and selection transfer, that are compared upon their ability to identify the common patterns between brain activation maps related to two functional tasks. We provide some preliminary quantification of these similarities, and show that selection transfer makes it possible to set a spatial scale yielding ROIs that are more specific to the context of interest than with transfer learning. In particular, selection transfer outlines well known regions such as the Visual Word Form Area when discriminating between different visual tasks.

Posted Content
TL;DR: This paper demonstrates that further improvement can be made by accounting for non-linearities using a ranking approach rather than the commonly used least-square regression and demonstrates the superiority of ranking with a real fMRI dataset.
Abstract: Inferring the functional specificity of brain regions from functional Magnetic Resonance Images (fMRI) data is a challenging statistical problem. While the General Linear Model (GLM) remains the standard approach for brain mapping, supervised learning techniques (a.k.a. decoding) have proven to be useful to capture multivariate statistical effects distributed across voxels and brain regions. Up to now, much effort has been made to improve decoding by incorporating prior knowledge in the form of a particular regularization term. In this paper we demonstrate that further improvement can be made by accounting for non-linearities using a ranking approach rather than the commonly used least-square regression. Through simulation, we compare the recovery properties of our approach to linear models commonly used in fMRI based decoding. We demonstrate the superiority of ranking with a real fMRI dataset.

Proceedings ArticleDOI
02 Jul 2012
TL;DR: The output coefficients are used to fit blood oxygen level dependent (BOLD) signal in visual areas using functional magnetic resonance imaging and significant improvement in the prediction accuracy is shown when using the second layer in addition to the first, suggesting biological relevance of the features extracted in layer two or linear combinations thereof.
Abstract: The scattering transform is a hierarchical signal transformation that has been designed to be robust to signal deformations. It can be used to compute representations with invariance or tolerance to any transformation group, such as translations, rotations or scaling. In image analysis, going beyond edge detection, its second layer captures higher order features, providing a fine-grain dissection of the signal. Here we use the output coefficients to fit blood oxygen level dependent (BOLD) signal in visual areas using functional magnetic resonance imaging. Significant improvement in the prediction accuracy is shown when using the second layer in addition to the first, suggesting biological relevance of the features extracted in layer two or linear combinations thereof.

Journal ArticleDOI
TL;DR: In this article, a decomposition of the Markov structure of the fMRI signal into segregated functional networks is sought using decomposable graphs. Markov models are found to be good tools to identify the structure of brain connectivity from fMRI signals, but for this purpose they must reflect the small-world properties of the underlying neural systems.
Abstract: Correlations in the signal observed via functional Magnetic Resonance Imaging (fMRI) are expected to reveal the interactions in the underlying neural populations through the hemodynamic response. In particular, they highlight distributed sets of mutually correlated regions that correspond to brain networks related to different cognitive functions. Yet graph-theoretical studies of neural connections give a different picture: that of a highly integrated system with small-world properties, namely local clustering but with short pathways across the complete structure. We examine the conditional independence properties of the fMRI signal, i.e. its Markov structure, to find realistic assumptions on the connectivity structure that are required to explain the observed functional connectivity. In particular we seek a decomposition of the Markov structure into segregated functional networks using decomposable graphs: a set of strongly-connected and partially overlapping cliques. We introduce a new method to efficiently extract such cliques on a large, strongly-connected graph. We compare methods learning different graph structures from functional connectivity by testing the goodness of fit of the model they learn on new data. We find that summarizing the structure as strongly-connected networks can give a good description only for very large and overlapping networks. These results highlight that Markov models are good tools to identify the structure of brain connectivity from fMRI signals, but for this purpose they must reflect the small-world properties of the underlying neural systems.

Book ChapterDOI
01 Oct 2012
TL;DR: This work uses Local Component Analysis as a non-parametric density estimator for this purpose and shows that it outperforms state-of-the-art approaches, in particular those involving a Gaussian assumption.
Abstract: The statistical analysis of medical images is challenging because of the high dimensionality and low signal-to-noise ratio of the data. Simple parametric statistical models, such as Gaussian distributions, are well-suited to high-dimensional settings. In practice, on medical data reflecting heterogeneous subjects, the Gaussian hypothesis seldom holds. In addition, alternative parametric models of the data tend to break down due to the presence of outliers that are usually removed manually from studies. Here we focus on interactive detection of these outlying observations, to guide the practitioner through the data inclusion process. Our contribution is to use Local Component Analysis as a non-parametric density estimator for this purpose. Experiments on real and simulated data show that our procedure separates well deviant observations from the relevant and representative ones. We show that it outperforms state-of-the-art approaches, in particular those involving a Gaussian assumption.

Proceedings ArticleDOI
02 Jul 2012
TL;DR: In this article, a ranking approach is proposed to account for non-linearities in place of the commonly used least-square regression, and its recovery properties are compared to those of linear models commonly used in fMRI based decoding.
Abstract: Inferring the functional specificity of brain regions from functional Magnetic Resonance Images (fMRI) data is a challenging statistical problem. While the General Linear Model (GLM) remains the standard approach for brain mapping, supervised learning techniques (a.k.a. decoding) have proven to be useful to capture multivariate statistical effects distributed across voxels and brain regions. Up to now, much effort has been made to improve decoding by incorporating prior knowledge in the form of a particular regularization term. In this paper we demonstrate that further improvement can be made by accounting for non-linearities using a ranking approach rather than the commonly used least-square regression. Through simulation, we compare the recovery properties of our approach to linear models commonly used in fMRI based decoding. We demonstrate the superiority of ranking with a real fMRI dataset.

Posted Content
TL;DR: To facilitate statistical analysis of small cohorts, a sparse discriminant model is used that selects predictive voxels on the reference task and thus provides a principled procedure to define ROIs and it is shown that voxel selection based on transfer learning leads to higher detection power on small cohorts.
Abstract: Typical cohorts in brain imaging studies are not large enough for systematic testing of all the information contained in the images. To build testable working hypotheses, investigators thus rely on analysis of previous work, sometimes formalized in a so-called meta-analysis. In brain imaging, this approach underlies the specification of regions of interest (ROIs) that are usually selected on the basis of the coordinates of previously detected effects. In this paper, we propose to use a database of images, rather than coordinates, and frame the problem as transfer learning: learning a discriminant model on a reference task to apply it to a different but related new task. To facilitate statistical analysis of small cohorts, we use a sparse discriminant model that selects predictive voxels on the reference task and thus provides a principled procedure to define ROIs. The benefits of our approach are twofold. First it uses the reference database for prediction, i.e. to provide potential biomarkers in a clinical setting. Second it increases statistical power on the new task. We demonstrate on a set of 18 pairs of functional MRI experimental conditions that our approach gives good prediction. In addition, on a specific transfer situation involving different scanners at different locations, we show that voxel selection based on transfer learning leads to higher detection power on small cohorts.

01 Jan 2012
TL;DR: It is concluded that adding matter information consistently improves the quantitative analysis of BOLD responses in some areas of the brain, particularly those where accurate inter-subject registration remains challenging.
Abstract: In this paper we investigate the use of classical fMRI Random Effect (RFX) group statistics when analysing a very large cohort and the possible improvement brought from anatomical information. Using 1326 subjects from the IMAGEN study, we first give a global picture of the evolution of the group effect t-value from a simple face-watching contrast with increasing cohort size. We obtain a wide "activated" pattern, far from being limited to the reasonably expected brain areas, illustrating the difference between statistical significance and practical significance. This motivates us to inject tissue-probability information into the group estimation: we model the BOLD contrast using a matter-weighted mixture of Gaussians and compare it to the common, single-Gaussian model. In both cases, the model parameters are estimated per-voxel for one subgroup, and the likelihood of both models is computed on a second, separate subgroup to reflect the models' generalization capacity. Various group sizes are tested, and significance is asserted using a 10-fold cross-validation scheme. We conclude that adding matter information consistently improves the quantitative analysis of BOLD responses in some areas of the brain, particularly those where accurate inter-subject registration remains challenging.