
Showing papers by "Bertrand Thirion published in 2011"


Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations
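The API consistency the abstract emphasizes means every scikit-learn estimator exposes the same fit/predict (and optionally score) methods. As a rough sketch of that convention — deliberately not the library's own code — here is a toy nearest-centroid classifier in plain NumPy that follows the same API shape:

```python
import numpy as np

class NearestCentroid:
    """Toy classifier following scikit-learn's fit/predict/score convention."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self  # fit returns self, as in scikit-learn

    def predict(self, X):
        # Squared distance from each sample to each class centroid
        d2 = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d2.argmin(axis=1)]

    def score(self, X, y):
        return float((self.predict(X) == y).mean())

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.repeat([0, 1], 50)
clf = NearestCentroid().fit(X, y)
acc = clf.score(X, y)
```

Because every estimator shares this interface, any scikit-learn model can be swapped in wherever such an object is used — which is precisely what makes the consistent API valuable to non-specialists.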


Book ChapterDOI
03 Jul 2011
TL;DR: A new hierarchical probabilistic model for brain activity patterns that does not require an experimental design to be specified is presented; the model is estimated in the dictionary learning framework, learning simultaneously latent spatial maps and the corresponding brain activity time series.
Abstract: Fluctuations in the brain's on-going activity can be used to reveal its intrinsic functional organization. To mine this information, we introduce a new hierarchical probabilistic model for brain activity patterns that does not require an experimental design to be specified. We estimate this model in the dictionary learning framework, learning simultaneously latent spatial maps and the corresponding brain activity time series. Unlike previous dictionary learning frameworks, we introduce an explicit difference between subject-level spatial maps and their corresponding population-level maps, forming an atlas. We give a novel algorithm using convex optimization techniques to solve this problem efficiently, with non-smooth penalties well suited to image denoising. We show on simulated data that it can recover population-level maps as well as subject specificities. On resting-state fMRI data, we extract the first atlas of spontaneous brain activity and show how it defines a subject-specific functional parcellation of the brain into localized regions.

205 citations
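The alternation the abstract describes — update the latent sparse spatial maps, then the time-series dictionary — can be sketched with a generic dictionary-learning loop: one proximal-gradient (ISTA) step on the maps followed by a least-squares dictionary update. This is a plain l1-penalized sketch in NumPy, not the paper's hierarchical model or its structured non-smooth penalties:

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(0)
n_time, n_vox, n_atoms = 40, 200, 4
# Synthetic data: a time-series dictionary mixing sparse spatial maps
D_true = rng.normal(size=(n_time, n_atoms))
V_true = soft_threshold(rng.normal(size=(n_atoms, n_vox)), 1.0)
Y = D_true @ V_true + 0.01 * rng.normal(size=(n_time, n_vox))

D = rng.normal(size=(n_time, n_atoms))
D /= np.linalg.norm(D, axis=0)
V = np.zeros((n_atoms, n_vox))
err0 = np.linalg.norm(Y)          # error of the trivial (all-zero) fit
lam = 0.1
for _ in range(50):
    # Sparse-map update: one ISTA (proximal gradient) step on V
    L = np.linalg.eigvalsh(D.T @ D).max()
    V = soft_threshold(V + D.T @ (Y - D @ V) / L, lam / L)
    # Dictionary update: ridge-stabilized least squares ...
    D = Y @ V.T @ np.linalg.inv(V @ V.T + 1e-6 * np.eye(n_atoms))
    # ... then renormalize atoms, transferring the scale onto the maps
    norms = np.linalg.norm(D, axis=0) + 1e-12
    D /= norms
    V *= norms[:, None]
err = np.linalg.norm(Y - D @ V)
```

The paper's contribution sits on top of this generic scheme: an explicit subject-level/population-level hierarchy and penalties adapted to brain images, solved with convex optimization.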


Journal ArticleDOI
TL;DR: In this paper, the l1 norm of the image gradient, also known as its total variation (TV), is used as a regularization for fMRI analysis, and is shown to be well suited to brain mapping and a powerful tool for brain decoding.
Abstract: While medical imaging typically provides massive amounts of data, the extraction of relevant information for predictive diagnosis remains a difficult challenge. Functional magnetic resonance imaging (fMRI) data, which provide an indirect measure of task-related or spontaneous neuronal activity, are classically analyzed in a mass-univariate procedure yielding statistical parametric maps. This analysis framework disregards some important principles of brain organization: population coding, distributed and overlapping representations. Multivariate pattern analysis, i.e., the prediction of behavioral variables from brain activation patterns, better captures this structure. To cope with the high dimensionality of the data, the learning method has to be regularized. However, the spatial structure of the image is not taken into account in standard regularization methods, so that the extracted features are often hard to interpret. More informative and interpretable results can be obtained with the l1 norm of the image gradient, also known as its total variation (TV), as regularization. We apply this method to fMRI data for the first time, and show that TV regularization is well suited to the purpose of brain mapping while being a powerful tool for brain decoding. Moreover, this article presents the first use of TV regularization for classification.

169 citations
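The regularizer at the heart of the paper — the l1 norm of the image gradient — is simple to state. Below is a minimal NumPy sketch of the (anisotropic) discrete TV seminorm, showing that it is small for spatially coherent images and large for noisy ones:

```python
import numpy as np

def total_variation(img):
    """Anisotropic TV: the l1 norm of the discrete image gradient."""
    return float(np.abs(np.diff(img, axis=0)).sum()
                 + np.abs(np.diff(img, axis=1)).sum())

# A spatially coherent image (smooth ramp) vs. the same image plus noise
smooth = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
rng = np.random.default_rng(0)
noisy = smooth + 0.2 * rng.normal(size=smooth.shape)
tv_smooth = total_variation(smooth)
tv_noisy = total_variation(noisy)
```

Penalizing this quantity during estimation therefore pushes the learned weight maps toward piecewise-constant, spatially coherent regions — the property the paper exploits for interpretable brain mapping.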


Journal ArticleDOI
TL;DR: A clustering method is presented that detects the fiber bundles embedded in any MR-diffusion-based tractography dataset, as a crucial preprocessing step before further analysis of huge fiber datasets.

145 citations


Journal ArticleDOI
TL;DR: This paper quantified maturation within the linguistic network in fourteen 1- to 4-month-old infants using an index based on the normalized T2-weighted magnetic resonance signal, and found that the ventral superior temporal sulcus (STS) is the least mature perisylvian region.
Abstract: Human infants, unlike even closely related primates, exhibit a remarkable capacity for language learning. Yet how the underlying anatomical network matures remains largely unknown. The classical view is that of a largely immature brain comprising only a few islands of maturity in primary cortices. This view has favored a description of learning based on bottom-up algorithms and has tended to discard the role of frontal regions, which were assumed to be barely functional early on. Here, using an index based on the normalized T2-weighted magnetic resonance signal, we have quantified maturation within the linguistic network in fourteen 1- to 4-month-old infants. Our results show first that the ventral superior temporal sulcus (STS), and not the inferior frontal area, is the least mature perisylvian region. A significant maturation difference in the STS favoring the right side is early evidence of the distinctive left-right development of this structure observed throughout life. Second, asymmetries of maturation in Broca's area were correlated with asymmetries in the posterior STS and in the parietal segment of the arcuate fasciculus, suggesting that an efficient frontotemporal dorsal pathway might provide infants with a phonological loop circuitry much earlier than expected.

134 citations


Journal ArticleDOI
TL;DR: This article applies for the first time this method to fMRI data, and shows that TV regularization is well suited to the purpose of brain mapping while being a powerful tool for brain decoding.
Abstract: While medical imaging typically provides massive amounts of data, the extraction of relevant information for predictive diagnosis remains a difficult challenge. Functional MRI (fMRI) data, which provide an indirect measure of task-related or spontaneous neuronal activity, are classically analyzed in a mass-univariate procedure yielding statistical parametric maps. This analysis framework disregards some important principles of brain organization: population coding, distributed and overlapping representations. Multivariate pattern analysis, i.e., the prediction of behavioral variables from brain activation patterns, better captures this structure. To cope with the high dimensionality of the data, the learning method has to be regularized. However, the spatial structure of the image is not taken into account in standard regularization methods, so that the extracted features are often hard to interpret. More informative and interpretable results can be obtained with the l_1 norm of the image gradient, a.k.a. its Total Variation (TV), as regularization. We apply this method to fMRI data for the first time, and show that TV regularization is well suited to the purpose of brain mapping while being a powerful tool for brain decoding. Moreover, this article presents the first use of TV regularization for classification.

90 citations


Proceedings ArticleDOI
16 May 2011
TL;DR: A hierarchical structured regularization is proposed that encodes spatial prior information in the regularization process, making the overall prediction procedure more robust to inter-subject variability.
Abstract: Inverse inference, or "brain reading", is a recent paradigm for analyzing functional magnetic resonance imaging (fMRI) data, based on pattern recognition tools. By predicting some cognitive variables related to brain activation maps, this approach aims at decoding brain activity. Inverse inference takes into account the multivariate information between voxels and is currently the only way to assess how precisely some cognitive information is encoded by the activity of neural populations within the whole brain. However, it relies on a prediction function that is plagued by the curse of dimensionality, as we have far more features than samples, i.e., more voxels than fMRI volumes. To address this problem, different methods have been proposed. Among them are univariate feature selection, feature agglomeration and regularization techniques. In this paper, we consider a hierarchical structured regularization. Specifically, the penalization we use is constructed from a tree that is obtained by spatially constrained agglomerative clustering. This approach encodes the spatial prior information in the regularization process, which makes the overall prediction procedure more robust to inter-subject variability. We test our algorithm on real data acquired to study the mental representation of objects, and we show that the proposed algorithm yields better prediction accuracy than reference methods.

54 citations
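The tree underlying the penalization comes from spatially constrained agglomerative clustering: only spatially adjacent clusters are allowed to merge. Here is a toy 1-D NumPy sketch of that constraint, with a greedy Ward-style merge cost standing in for the actual algorithm used in the paper:

```python
import numpy as np

def constrained_agglomerative(x, n_clusters):
    """Greedy bottom-up clustering of a 1-D signal in which only spatially
    adjacent clusters may merge (a toy stand-in for connectivity-constrained
    Ward clustering)."""
    clusters = [[i] for i in range(len(x))]
    while len(clusters) > n_clusters:
        best, best_cost = None, np.inf
        for j in range(len(clusters) - 1):  # only neighboring pairs
            merged = clusters[j] + clusters[j + 1]
            # Ward-style cost: increase in within-cluster sum of squares
            cost = (np.var(x[merged]) * len(merged)
                    - np.var(x[clusters[j]]) * len(clusters[j])
                    - np.var(x[clusters[j + 1]]) * len(clusters[j + 1]))
            if cost < best_cost:
                best, best_cost = j, cost
        clusters[best:best + 2] = [clusters[best] + clusters[best + 1]]
    labels = np.empty(len(x), dtype=int)
    for lab, idx in enumerate(clusters):
        labels[idx] = lab
    return labels

# Piecewise-constant signal with three plateaus
signal = np.array([0.0, 0.1, 0.05, 5.0, 5.1, 4.9, 10.0, 10.2, 9.9])
labels = constrained_agglomerative(signal, 3)
```

In the paper the same idea operates on 3-D voxel grids, and the successive merges form the tree from which the structured penalty is built.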


Book ChapterDOI
03 Jul 2011
TL;DR: The results show that functional connectivity can be explained by anatomical connectivity on a rigorous statistical basis, and that a proper model of functional connectivity is essential to assess this link.
Abstract: We present a novel probabilistic framework to learn across several subjects a mapping from brain anatomical connectivity to functional connectivity, i.e. the covariance structure of brain activity. This prediction problem must be formulated as a structured-output learning task, as the predicted parameters are strongly correlated. We introduce a model selection framework based on cross-validation with a parametrization-independent loss function suitable to the manifold of covariance matrices. Our model is based on constraining the conditional independence structure of functional activity by the anatomical connectivity. Subsequently, we learn a linear predictor of a stationary multivariate autoregressive model. This natural parameterization of functional connectivity also enforces the positive-definiteness of the predicted covariance and thus matches the structure of the output space. Our results show that functional connectivity can be explained by anatomical connectivity on a rigorous statistical basis, and that a proper model of functional connectivity is essential to assess this link.

36 citations
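The output side of the model is a stationary multivariate autoregressive (MAR) parameterization of functional connectivity. The minimal NumPy sketch below — omitting the paper's structured-output learning and cross-validated model selection — simulates an MAR(1) process and recovers its coefficient matrix by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
# Ground-truth stable MAR(1) coefficient matrix (spectral radius < 1)
A_true = np.array([[0.5, 0.2, 0.0],
                   [0.0, 0.4, 0.1],
                   [0.1, 0.0, 0.3]])
T = 5000
x = np.zeros((T, 3))
for t in range(T - 1):
    x[t + 1] = A_true @ x[t] + 0.1 * rng.normal(size=3)

# Least-squares estimate of A from the pairs (x_t, x_{t+1})
X_now, X_next = x[:-1], x[1:]
A_hat = np.linalg.lstsq(X_now, X_next, rcond=None)[0].T
max_err = float(np.abs(A_hat - A_true).max())
```

In the paper, the MAR coefficients are not estimated freely as here but predicted linearly from anatomical connectivity, which is what turns the fit into a structured-output learning problem.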


Journal ArticleDOI
TL;DR: The methodology developed in this study points to further developments in time-resolved analyses of distributed visual processes in the millisecond range, and to new ways of exploring the dynamics of functional processes within the human visual cortex non-invasively.

31 citations


Book ChapterDOI
18 Sep 2011
TL;DR: The classical Minimum Covariance Determinant approach is modified by adding a regularization term that ensures the estimation is well-posed in high-dimensional settings and in the presence of many outliers, and it is shown that outliers can be detected satisfactorily.
Abstract: Medical imaging datasets used in clinical studies or basic research often comprise highly variable multi-subject data. Statistically controlled inclusion of a subject in a group study, i.e. deciding whether its images should be considered as samples from a given population or whether they should be rejected as outlier data, is a challenging issue. While the informal approaches often used do not provide any statistical assessment that a given dataset is indeed an outlier, traditional statistical procedures are not well-suited to the noisy, high-dimensional settings encountered in medical imaging, e.g. with functional brain images. In this work, we modify the classical Minimum Covariance Determinant approach by adding a regularization term that ensures that the estimation is well-posed in high-dimensional settings and in the presence of many outliers. We show on simulated and real data that outliers can be detected satisfactorily, even in situations where the number of dimensions of the data exceeds the number of observations.

22 citations
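The abstract's key point — that a regularization term keeps covariance estimation well-posed when dimensions exceed observations — can be illustrated without the full Minimum Covariance Determinant machinery (the robust subset-search step is omitted here). Shrinking the empirical covariance toward a scaled identity makes it invertible, so Mahalanobis distances, and hence outlier scores, can be computed:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 50                       # fewer observations than dimensions
X = np.vstack([rng.normal(size=(n, p)),            # inliers
               rng.normal(loc=4.0, size=(3, p))])  # 3 clear outliers

mu = X.mean(axis=0)
emp_cov = np.cov(X, rowvar=False)   # singular: rank < p
# Regularize: shrink toward a scaled identity so the estimate is invertible
alpha = 0.5
reg_cov = (1 - alpha) * emp_cov + alpha * (np.trace(emp_cov) / p) * np.eye(p)
prec = np.linalg.inv(reg_cov)
centered = X - mu
d2 = np.einsum('ij,jk,ik->i', centered, prec, centered)  # squared Mahalanobis
flagged = np.argsort(d2)[-3:]       # the three most atypical observations
```

The paper goes further: it combines this kind of regularization with the MCD's robust location/scatter search, so that the covariance itself is not contaminated by the very outliers being sought.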


Book ChapterDOI
18 Sep 2011
TL;DR: A novel generative model is proposed that integrates RS-connectivity and stimulus-evoked responses under a unified analytical framework that permits exact closed-form solutions for both the posterior activation effect estimates and the model evidence.
Abstract: A growing interest has emerged in studying the correlation structure of spontaneous and task-induced brain activity to elucidate the functional architecture of the brain. In particular, functional networks estimated from resting state (RS) data were shown to exhibit high resemblance to those evoked by stimuli. Motivated by these findings, we propose a novel generative model that integrates RS-connectivity and stimulus-evoked responses under a unified analytical framework. Our model permits exact closed-form solutions for both the posterior activation effect estimates and the model evidence. To learn RS networks, graphical LASSO and the oracle approximating shrinkage technique are deployed. On a cohort of 65 subjects, we demonstrate increased sensitivity in fMRI activation detection using our connectivity-informed model over the standard univariate approach. Our results thus provide further evidence for the presence of an intrinsic relationship between brain activity during rest and task, the exploitation of which enables higher detection power in task-driven studies.
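Of the two ingredients used to learn the RS networks, the oracle approximating shrinkage (OAS) estimator is the easier to sketch; the graphical LASSO is not reproduced here. A NumPy version following the published OAS formula (Chen et al., 2010):

```python
import numpy as np

def oas_covariance(X):
    """Oracle Approximating Shrinkage: shrink the empirical covariance
    toward a scaled identity, with a data-driven shrinkage weight rho
    (formula from Chen et al., 2010)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)
    mu = np.trace(S) / p
    tr_s2 = np.sum(S * S)            # trace(S @ S) for symmetric S
    num = (1.0 - 2.0 / p) * tr_s2 + np.trace(S) ** 2
    den = (n + 1.0 - 2.0 / p) * (tr_s2 - np.trace(S) ** 2 / p)
    rho = 1.0 if den == 0 else min(1.0, num / den)
    return (1.0 - rho) * S + rho * mu * np.eye(p), rho

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 40))        # n < p: empirical covariance is singular
cov_oas, rho = oas_covariance(X)
rank_emp = np.linalg.matrix_rank(np.cov(X, rowvar=False))
rank_oas = np.linalg.matrix_rank(cov_oas)
```

A well-conditioned, full-rank covariance of this kind is what makes the connectivity-informed priors in the paper's generative model numerically tractable.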

Journal ArticleDOI
TL;DR: A new model, called Multiclass Sparse Bayesian Regression (MCBR), is introduced that automatically adapts the amount of regularization to the available data: features are grouped into several classes, and each class is then regularized differently, yielding an adaptive and efficient regularization.
Abstract: Inverse inference has recently become a popular approach for analyzing neuroimaging data, by quantifying the amount of information contained in brain images on perceptual, cognitive, and behavioral parameters. As it outlines brain regions that convey information for an accurate prediction of the parameter of interest, it makes it possible to understand how the corresponding information is encoded in the brain. However, it relies on a prediction function that is plagued by the curse of dimensionality, as there are far more features (voxels) than samples (images), and dimension reduction is thus a mandatory step. We introduce in this paper a new model, called Multiclass Sparse Bayesian Regression (MCBR), that, unlike classical alternatives, automatically adapts the amount of regularization to the available data. MCBR consists in grouping features into several classes and then regularizing each class differently in order to apply an adaptive and efficient regularization. We detail this framework and validate our algorithm on simulated and real neuroimaging data sets, showing that it performs better than reference methods while yielding interpretable clusters of features.
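The core mechanism — grouping features into classes and regularizing each class differently — can be illustrated with a generalized ridge regression in NumPy. Note that the per-group penalties are fixed by hand below, whereas MCBR infers them automatically from the data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[:5] = 2.0                    # only the first group is informative
y = X @ w_true + 0.1 * rng.normal(size=n)

groups = np.array([0] * 5 + [1] * 15)
# Per-group penalties: weak on the informative group, strong on the rest.
# (MCBR learns these; here they are hand-picked for illustration.)
lam = np.where(groups == 0, 0.1, 100.0)
# Generalized ridge: w = (X'X + diag(lam))^-1 X'y
w_hat = np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)
```

With adaptive per-class penalties, informative features are barely shrunk while uninformative ones are pushed toward zero — the behavior the Bayesian machinery of MCBR obtains without manual tuning.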

Book ChapterDOI
16 Dec 2011
TL;DR: The proposed approach relies on randomization techniques which have been proved to be consistent for support recovery and makes use of a spatially constrained hierarchical clustering algorithm to account for the spatial correlations between voxels.
Abstract: The prediction of behavioral covariates from functional MRI (fMRI) is known as brain reading. From a statistical standpoint, this challenge is a supervised learning task. The ability to predict cognitive states from new data gives a model selection criterion: prediction accuracy. While a good prediction score implies that some of the voxels used by the classifier are relevant, one cannot state that these voxels form the brain regions involved in the cognitive task. The best predictive model may have selected by chance noninformative regions, and neglected relevant regions that provide duplicate information. In this contribution, we address the support identification problem. The proposed approach relies on randomization techniques which have been proved to be consistent for support recovery. To account for the spatial correlations between voxels, our approach makes use of a spatially constrained hierarchical clustering algorithm. Results are provided on simulations and a visual experiment.
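The randomization idea can be illustrated with a plain subsampling-based stability selection: refit a sparse model on random halves of the data and keep only the features that enter the support consistently. The sketch below uses a minimal coordinate-descent lasso in NumPy; the paper's specific randomization scheme and its spatially constrained clustering are not reproduced:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=100):
    """Minimal coordinate-descent lasso: (1/2n)||y - Xw||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]       # partial residual
            rho = X[:, j] @ r / n
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

rng = np.random.default_rng(0)
n, p = 80, 30
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[[2, 7]] = 1.5                             # true support: features 2 and 7
y = X @ w_true + 0.1 * rng.normal(size=n)

# Stability selection: refit on random subsamples and count how often
# each feature enters the estimated support
n_rounds = 20
counts = np.zeros(p)
for _ in range(n_rounds):
    idx = rng.choice(n, size=n // 2, replace=False)
    w = lasso_cd(X[idx], y[idx], lam=0.2)
    counts += (np.abs(w) > 1e-8)
freq = counts / n_rounds
support = np.where(freq > 0.8)[0]
```

Selection frequencies, unlike a single fit, separate stably informative features from ones picked up by chance — the consistency property the abstract appeals to.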

Posted Content
TL;DR: In this paper, a sparse hierarchical structured regularization is used to identify brain regions simultaneously at different scales, and the penalization is constructed from a tree that is obtained by spatially-constrained agglomerative clustering.
Abstract: Inverse inference, or "brain reading", is a recent paradigm for analyzing functional magnetic resonance imaging (fMRI) data, based on pattern recognition and statistical learning. By predicting some cognitive variables related to brain activation maps, this approach aims at decoding brain activity. Inverse inference takes into account the multivariate information between voxels and is currently the only way to assess how precisely some cognitive information is encoded by the activity of neural populations within the whole brain. However, it relies on a prediction function that is plagued by the curse of dimensionality, since there are far more features than samples, i.e., more voxels than fMRI volumes. To address this problem, different methods have been proposed, such as, among others, univariate feature selection, feature agglomeration and regularization techniques. In this paper, we consider a sparse hierarchical structured regularization. Specifically, the penalization we use is constructed from a tree that is obtained by spatially-constrained agglomerative clustering. This approach encodes the spatial structure of the data at different scales into the regularization, which makes the overall prediction procedure more robust to inter-subject variability. The regularization used induces the selection of spatially coherent predictive brain regions simultaneously at different scales. We test our algorithm on real data acquired to study the mental representation of objects, and we show that the proposed algorithm not only delineates meaningful brain regions but also yields better prediction accuracy than reference methods.

Book ChapterDOI
16 Dec 2011
TL;DR: The optimality of decoding methods in two different settings, namely intra- and inter-subject decoding, is discussed, and it is shown that using spatial regularization improves reverse inference in the challenging context of inter-subject prediction.
Abstract: Functional Magnetic Resonance Imaging (fMRI) provides a unique opportunity to study brain functional architecture, while being minimally invasive. Reverse inference, a.k.a. decoding, is a recent statistical analysis approach that has been used with success for deciphering activity patterns that are thought to fit the neuroscientific concept of population coding. Decoding relies on the selection of brain regions in which the observed activity is predictive of certain cognitive tasks. The accuracy of such a procedure is quantified by the prediction of the behavioral variable of interest, the target. In this paper, we discuss the optimality of decoding methods in two different settings, namely intra- and inter-subject decoding. While inter-subject prediction aims at finding predictive regions that are stable across subjects, it is plagued by the additional inter-subject variability (lack of voxel-to-voxel correspondence), so that the best suited prediction algorithms used in reverse inference may not be the same in both cases. We benchmark different prediction algorithms in both intra- and inter-subject analyses, and we show that using spatial regularization improves reverse inference in the challenging context of inter-subject prediction. Moreover, we also study the different maps of weights, and show that methods with similar accuracy may yield maps with very different spatial layouts of the predictive regions.

Book ChapterDOI
16 Dec 2011
TL;DR: A model selection framework based on cross-validation is proposed that selects the appropriate sparsity of the connectivity matrices and demonstrates that choosing an ordering for the MAR that leads to sparser models is more appropriate than a random one.
Abstract: We aim to learn across several subjects a mapping from brain anatomical connectivity to functional connectivity. Following [1], we formulate this problem as estimating a multivariate autoregressive (MAR) model with sparse linear regression. We introduce a model selection framework based on cross-validation. We select the appropriate sparsity of the connectivity matrices and demonstrate that choosing an ordering for the MAR that leads to sparser models is more appropriate than a random one. Finally, we suggest randomized Least Absolute Shrinkage and Selection Operator (LASSO) in order to identify relevant anatomo-functional links with better recovery of the ground truth.

Book ChapterDOI
16 Dec 2011
TL;DR: Exploring two specific descriptions of resting-state fMRI, namely spatial analysis and connectivity graphs, this work discusses the progress brought by statistical learning techniques, the neuroscientific picture that they paint, and possible modeling pitfalls.
Abstract: In the absence of external stimuli, fluctuations in cerebral activity can be used to reveal intrinsic structures. Well-conditioned probabilistic models of this so-called resting-state activity are needed to support neuroscientific hypotheses. Exploring two specific descriptions of resting-state fMRI, namely spatial analysis and connectivity graphs, we discuss the progress brought by statistical learning techniques, but also the neuroscientific picture that they paint, and possible modeling pitfalls.

01 Jan 2011
TL;DR: An open-source library called CLONES (Closed Loop Neural Simulation) is presented: a communication interface between the BRIAN neural simulator and SOFA, a physics engine for biomedical applications, both of which are intuitive, high-performance simulation environments.
Abstract: The activity of neurons does not only reflect cognitive states; it is also highly constrained by mechanical properties of the body: sensors, muscles, tendons, and their interaction with the environment. Therefore, in order to model the activity of neurons, it seems necessary to take a holistic approach, where the behavior of an animal is modeled through the interaction between neurons, muscles and the environment. This involves knowledge in both neurophysiology and biomechanics. It also requires the development of software that can simulate the interaction between neurons and muscles efficiently. In order to achieve this, we have developed an open-source library called CLONES (Closed Loop Neural Simulation). CLONES is a communication interface between the BRIAN neural simulator and SOFA, a physics engine for biomedical applications. BRIAN and SOFA are both intuitive, high-performance simulation environments. BRIAN is based on the Python programming language. SOFA uses an interpreted XML description of a physical scene. CLONES contains a SOFA plugin and a Python module. The Python module provides a function to be called after each step of the simulation. The SOFA plugin provides components for the simulation of muscles, sensors, and drag forces. Communication between BRIAN and SOFA is achieved through shared memory and semaphores. A single step of simulation typically takes place on different CPU cores. Once a simulation step has been completed in both simulators, sensory inputs to the neurons and motor outputs to the muscles are updated. Several demonstration examples are provided with the library. In particular, a neuro-mechanical model of undulatory locomotion in the worm Caenorhabditis elegans is available.
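The handshake described above — each simulator completes a step, exchanges data, then lets the other run — is a classic two-semaphore pattern. The sketch below uses Python threads and a shared dict for brevity, whereas CLONES synchronizes separate processes through OS shared memory and semaphores:

```python
import threading

def step_loop(my_turn, partner_turn, state, n_steps):
    """One simulator: wait for its turn, advance the shared state, signal partner."""
    for _ in range(n_steps):
        my_turn.acquire()       # block until the partner has finished its step
        state["t"] += 1         # stand-in for exchanging sensory/motor data
        partner_turn.release()  # hand control to the partner

sem_neural = threading.Semaphore(1)   # the "neural" side takes the first step
sem_physics = threading.Semaphore(0)
state = {"t": 0}
n_steps = 10
a = threading.Thread(target=step_loop, args=(sem_neural, sem_physics, state, n_steps))
b = threading.Thread(target=step_loop, args=(sem_physics, sem_neural, state, n_steps))
a.start()
b.start()
a.join()
b.join()
```

The strict alternation guarantees that neither side reads the shared state while the other is mid-step, which is the same correctness property the shared-memory/semaphore design of CLONES provides across processes.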
Abstract: The activity of neurons does not only reflect cognitive states, it is also highly constrained by mechanical properties of the body: sensors, muscles, tendons, and their interaction with the environment. Therefore, in order to model the activity of neurons, it seems necessary to take a holistic approach, where the behavior of an animal is modeled through the interaction between neurons, muscles and the environment. This involves knowledge in both neurophysiology and biomechanics. It also requires the development of software that can simulate the interaction between neurons and muscles efficiently. In order to achieve this, we have developed an open-source library called CLONES (Closed Loop Neural Simulation). CLONES is a communication interface between the BRIAN neural simulator, and SOFA, a physics engine for biomedical applications. BRIAN and SOFA are both intuitive and high performance simulation environments. BRIAN is based on the Python programming language. SOFA uses an interpreted XML description of a physical scene. CLONES contains a SOFA plugin and a PYTHON module. The Python module provides a function to be called after each step of the simulation. The Sofa plugin provides components for the simulation of muscles, sensors, drag forces. Communication between BRIAN and SOFA is achieved through shared memory and semaphores. A single step of simulation typically takes place on different CPU cores. Once a simulation step has been completed in both simulators, sensory inputs to the neurons and motor outputs to the muscles are updated. Several demonstration examples are provided with the library. In particular, a neuro-mechanical model of undulatory locomotion in the worm caenorhabditis elegans is available. 6/39 sciencesconf.org:pythonneuro:1021