
Analysis of a large fMRI cohort: Statistical and methodological issues for group analyses.

TL;DR: The study shows that inter-subject variability plays a prominent role in the relatively low sensitivity and reliability of group studies and focuses on the notion of reproducibility by bootstrapping.
About: This article is published in NeuroImage. The article was published on 2007-03-01 and is currently open access. It has received 541 citations to date. The article focuses on the topics: Sample size determination & Random effects model.

Summary (10 min read)


1.1 Motivation

  • Traditional software effort estimation techniques rely on analytic equations, statistical data fitting, expert judgment or some combination of the three.
  • They are still notoriously inaccurate.
  • There are two bases that make the approach taken in this dissertation feasible and practical.
  • Secondly, a key characteristic of the object-oriented paradigm is the continual realization and refinement of the same system artifacts/objects at each phase of development or within each development iteration (depending on the chosen project life cycle).
  • Data can therefore be gathered unobtrusively from the CASE tools in the stages of development preceding implementation and used to predict the effort required to further realize, refine, and develop these and other system artifacts, regardless of the level of realization or refinement of the existing artifacts.

1.2 SysML

  • This dissertation proposes an effort prediction model – the SysML Point Model - for object-oriented development systems that is based on a common, structured and comprehensive modeling language (OMG SysML), which can be built using the CASE tools from which data can be unobtrusively gathered and applied to prediction equations.
  • OMG SysML [98] is a specification that defines a general-purpose modeling language for systems engineering applications.
  • The Block Definition Diagram in SysML defines features of a block and relationships between blocks such as associations, generalizations, and dependencies.
  • The state machine represents behavior as the state history of an object in terms of its transitions and states.
  • Use case diagrams include the use case and actors and the associated communications between them.

1.3 Research Objectives

  • The focus of this dissertation is to define and validate the Pattern Points (PP) method of the SysML Point approach.
  • Object-oriented analysis (OOA) is concerned with the transformation of software engineering requirements and specifications into a system's object model, which is composed of a population of interacting objects (rather than the functional views or traditional data of systems) [108].
  • The Pattern Points (PP) model is an empirical parametric estimation method that uses object interactions and the class structure of object-oriented design patterns to predict development effort in the late analysis phase of an object-oriented project.
  • In software engineering, a design pattern is a common reusable solution to a frequently occurring problem in software design.
  • The Use Case Point model, which is based on use case counts called use case points, is defined in Carol et al [9].

1.4 Organization of the Dissertation

  • Following this introductory section, this dissertation is presented in six additional sections.
  • Section 2 reviews the literature on the subject of software project management and effort estimation.
  • Design patterns and the Pattern Point model are described in Section 3.
  • Section 5 explains the project experiment results used to empirically test the research model.

2.1 Software Project Management

  • Software project management is a major endeavor that helps to realize a successful software project.
  • Planning is central to software project management; it involves the identification of the activities, milestones, and deliverables produced by a project [39].
  • Estimates for the software project’s effort and cost are derived according to a documented procedure.
  • If a project is behind schedule, the manager can increase resources or decrease features.
  • Process is significant because it lets people efficiently build products by imposing a structure on the progression of the project.

2.2 Effort Estimation

  • Even though the difficulties of software cost estimation were discussed 30 years ago in “The Mythical Man Month” [42], it is as much a relevant area of research now as it was then.
  • Effort estimation is critical for the following reasons [107]: exploring the practicality of developing or purchasing a new system; determining a price or schedule for a new system.
  • Planning how to staff a software development project.
  • Understanding the impact of changing the functions of an existing system.
  • In spite of its importance, software cost estimates are more often than not imprecise, and there is no indication that the software engineering community is making significant gains in making better predictions.

2.2.1 Empirical parametric models

  • The most prevalent estimation models are empirical parametric models.
  • An alternative empirical parametric methodology is to calibrate a model by estimating values for the parameters (a and b in the case of (2.1)).
  • COCOMO was first published in 1981 by Barry J. Boehm [43] as a model for estimating effort, cost, and schedule for software projects.
  • The amount of effort required to produce a software product, the defects remaining in a software product, and the time required to create a software product are all estimated using Volume attributes.
  • Even though function points are a popular measure, they too have drawbacks.
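
Equation (2.1) itself is not reproduced in this summary; a common form for such a parametric model is Effort = a * Size^b (the form used, for example, in basic COCOMO). The sketch below, with hypothetical project data, shows how the parameters a and b mentioned above could be calibrated by a least-squares fit in log space; it is an illustration, not the dissertation's calibration procedure.

```python
import numpy as np

# Hypothetical historical data: size in KLOC, effort in person-months.
size = np.array([10.0, 23.0, 46.0, 70.0, 120.0])
effort = np.array([24.0, 62.0, 130.0, 210.0, 390.0])

# Effort = a * Size**b  =>  log(Effort) = log(a) + b * log(Size),
# so a and b can be recovered with a linear least-squares fit in log space.
b, log_a = np.polyfit(np.log(size), np.log(effort), 1)
a = np.exp(log_a)

predicted = a * size ** b
print(f"a = {a:.2f}, b = {b:.2f}")
print("relative errors:", np.round(np.abs(predicted - effort) / effort, 3))
```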

2.2.2 Empirical nonparametric models

  • Nonparametric models typically involve the use of artificial intelligence techniques in producing an effort estimate.
  • In the comparison, the OSR methodology produced a lower mean absolute relative error than both parametric models, with the COCOMO model performing least favorably.
  • Even though they may provide better effort estimates, empirical nonparametric methods such as a neural network are hard to set up and they typically require more work than preparing a statistical regression model [91].
  • There have been several attempts to use regression and decision trees to estimate aspects of software engineering.
  • A single organization can provide a large enough data set but it is hard to believe that all the projects would come from the same environment.
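
As a toy illustration of the nonparametric, tree-based estimators mentioned above (not an implementation from the dissertation), a regression tree can be fitted to a handful of project attributes; the feature names and values below are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical project data: [size in KLOC, team size, reuse fraction].
X = np.array([[10, 3, 0.2], [25, 5, 0.1], [40, 6, 0.5],
              [60, 8, 0.3], [90, 10, 0.4], [120, 12, 0.2]])
effort = np.array([30, 75, 90, 160, 230, 340])  # person-months

# A shallow tree keeps the model interpretable on such a small data set.
model = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, effort)
print(model.predict([[50, 7, 0.3]]))  # effort estimate for a new project
```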

2.2.3 Analogical models

  • Effort estimation by analogy (EBA) is an established method for software effort prediction.
  • In EBA, the estimated effort of the project under consideration (target project) is a function of the known effort values from analogous historical projects.
  • The data set used to develop ESTOR is a subset of 10 projects from the Kemerer [74] data set.
  • It avoids the problems associated with knowledge elicitation as well as with extracting and codifying knowledge.
  • Analogy-based systems only need deal with those problems that actually occur in practice, while generative (i.e., algorithmic) systems must handle all possible problems.
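
A minimal sketch of the analogy idea (not the ESTOR system itself): normalize the project features, find the k historical projects closest to the target project, and derive the estimate from their known efforts. The feature set and data below are hypothetical.

```python
import numpy as np

def estimate_by_analogy(target, historical_features, historical_effort, k=3):
    """Return the mean effort of the k historical projects nearest to `target`."""
    X = np.asarray(historical_features, dtype=float)
    # Min-max normalization so that no single feature dominates the distance.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    Xn = (X - lo) / span
    tn = (np.asarray(target, dtype=float) - lo) / span
    dist = np.linalg.norm(Xn - tn, axis=1)        # distance to each analogue
    nearest = np.argsort(dist)[:k]
    return np.asarray(historical_effort, dtype=float)[nearest].mean()

# Hypothetical features: [function points, team experience (1-5), # interfaces].
features = [[120, 3, 4], [300, 4, 10], [180, 2, 6], [450, 5, 14], [90, 3, 2]]
effort = [14, 40, 26, 55, 9]                      # person-months
print(estimate_by_analogy([200, 3, 7], features, effort, k=2))
```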

2.2.4 Theoretical models

  • In comparison to the algorithmic (parametric and non-parametric) and analogical models, there is less research on the development of theoretical models for software effort estimation.
  • Wang and Yuan [113] have developed a ‘coherent’ theory on the nature of collaborative work and their mathematical models in software engineering.
  • The FEMSEC model provides a theoretical foundation for software engineering decision optimizations on the optimal labor allocation, the shortest duration determination, and the lowest workload/effort and costs estimation.
  • The model works from an initial estimate for overall effort and then explores how the actual effort is influenced by the model’s assumptions about the interactions and feedback between project and decisions.
  • Simulations of project management scenarios can be run to investigate the effects of management policies and decisions.

2.2.5 Heuristic models

  • Heuristics are rules of thumb, developed through experience, that capture knowledge about relationships between attributes of the empirical model.
  • Since initial software cost estimates are made based on preliminary data, re-estimating is desirable when additional information becomes available.
  • The process of re-estimation is made more complicated by such issues, but in order to successfully estimate the total effort or time to complete, effort estimation models need to incorporate these measures.
  • Models that are more difficult to develop and apply are typically based on a large number of variables, such as that of Abdel-Hamid and Madnick [89].
  • In the following sections the authors explore Function Point analysis and the variations to the Function Point method that were designed to suit the object-oriented development model.

2.3 Function Point Analysis

  • The Function Point method was introduced in 1979 by Albrecht [30] to measure the size of a data-processing system from the end-user’s point of view.
  • The first step is the identification of all functions - each function is classified as belonging to one of the following function types: external input (EI), external output (EO), external inquiry (EQ), internal logical file (ILF), and external interface file (EIF).
  • Each function is then weighted based on its type and on the level of its complexity, in agreement with standard values as specified in the Counting Practices Manual.
  • As an example, for transactions (EI, EO, and EQ), the rating is based on the number of Data Element Types (DETs) and File Types Referenced (FTRs).
  • The FP measure has been used by application developers to estimate productivity, in terms of Function Points per person-month, and quality, in terms of the number of defects per Function Point with respect to requirements, design, coding, and user documentation phases.
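
For concreteness, the unadjusted FP count is a weighted sum of the identified functions by type and complexity, adjusted by a factor derived from the 14 general system characteristics. The sketch below uses the commonly published IFPUG weights and the standard 0.65 + 0.01*TDI adjustment; the counts and ratings are hypothetical, and the constants should be checked against the Counting Practices Manual rather than taken from this sketch.

```python
# Unadjusted Function Points as a weighted count of the five function types.
# Weights are the commonly published IFPUG values (low/average/high); the
# counts below are hypothetical and purely illustrative.
WEIGHTS = {
    "EI":  {"low": 3, "avg": 4, "high": 6},
    "EO":  {"low": 4, "avg": 5, "high": 7},
    "EQ":  {"low": 3, "avg": 4, "high": 6},
    "ILF": {"low": 7, "avg": 10, "high": 15},
    "EIF": {"low": 5, "avg": 7, "high": 10},
}

counts = {  # (function type, complexity level) -> number of functions identified
    ("EI", "avg"): 6, ("EO", "high"): 3, ("EQ", "low"): 4,
    ("ILF", "avg"): 5, ("EIF", "low"): 2,
}

ufp = sum(WEIGHTS[ftype][level] * n for (ftype, level), n in counts.items())

# Value Adjustment Factor from the 14 general system characteristics,
# each rated 0-5 (the ratings below are made up).
ratings = [3, 2, 4, 3, 1, 0, 2, 3, 4, 2, 1, 3, 2, 1]
vaf = 0.65 + 0.01 * sum(ratings)

print(f"UFP = {ufp}, adjusted FP = {ufp * vaf:.1f}")
```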

2.4 Use Case Points (UCP) Model

  • The Use Case Points (UCP) model [9] is a software sizing estimation method based on use case counts called use case points.
  • Use cases describe the interaction between a primary actor—the initiator of the interaction—and the system itself, represented as a sequence of simple steps.
  • An actor is someone or something that exists outside the system under study and takes part in a sequence of activities in a dialogue with the system to achieve some goal: actors may be end users, other systems, or hardware devices.
  • Use case modeling is part of the UML 2.0 and is therefore applicable in the early estimation of an object oriented software development project.
  • Weighing the Environmental Factor is an exercise to calculate a Use Case Point modifier that adjusts the UUCP by the weight of the environmental factors.
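
A sketch of the usual UCP arithmetic as commonly described for Karner's method (the dissertation's own presentation is not reproduced here): actors and use cases are weighted by complexity to give the Unadjusted Use Case Points (UUCP), which are then multiplied by the Technical Complexity Factor and the Environmental Factor. The counts, ratings, and the 0.6 + 0.01*TF and 1.4 - 0.03*EF constants below are illustrative assumptions.

```python
# Unadjusted Use Case Points from actor and use case complexity counts.
actor_weights = {"simple": 1, "average": 2, "complex": 3}
usecase_weights = {"simple": 5, "average": 10, "complex": 15}

actors = {"simple": 2, "average": 3, "complex": 1}      # hypothetical counts
use_cases = {"simple": 4, "average": 6, "complex": 2}

uucp = (sum(actor_weights[k] * n for k, n in actors.items())
        + sum(usecase_weights[k] * n for k, n in use_cases.items()))

# Technical and environmental adjustment, using the commonly quoted constants.
tf = 30   # hypothetical weighted sum of the 13 technical factor ratings
ef = 16   # hypothetical weighted sum of the 8 environmental factor ratings
tcf = 0.6 + 0.01 * tf
ecf = 1.4 - 0.03 * ef

ucp = uucp * tcf * ecf
print(f"UUCP = {uucp}, UCP = {ucp:.1f}")
```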

2.5 Object-Oriented Function Point (OOFP) Model

  • Another related work in the sizing of OOP is the Object-Oriented Function Point (OOFP) model.
  • The function point size metric uses functional, logical entities such as inputs, outputs, and inquiries that tend to relate more closely to the functions performed by the software as compared to other measures, such as lines of code.
  • Inputs, Outputs and Inquiries are all treated in the same way: they are generically called “service requests” and correspond to class methods.
  • Classes within the application boundary correspond to ILFs, while classes outside the application boundary (including libraries) correspond to EIFs.
  • The OOFP is an adaptation of the original FP and, although it attempts to use object-oriented metrics, the framework itself is not very well suited to the object-oriented paradigm.

2.6 Class Point (CP) Model

  • The Class Point model, as defined by Costagliola et al., 2005 [29], is similar to the OOFP approach in that it attempts to give an estimate of the size metric based on design/structural artifacts.
  • There are two forms of the Class Point metric, named CP1 and CP2 respectively.
  • The latter is used later in the design stage, when more information is available, whereas CP1 is meant to be used a bit earlier, at the beginning of the design process, to carry out a preliminary size estimate.
  • Classes are grouped into four types: the problem domain type (PDT)/entity classes, the human interaction type (HIT)/boundary classes, the data management type (DMT)/data classes, and the task management type (TMT)/control classes.
  • C) Estimating the Total Unadjusted Class Point: this consists of computing a weighted total of the classes with their complexity levels determined.

2.7 SysML Point Overview

  • In object-oriented development projects, it is desirable to have an estimation model that imitates the continuous realization and refinement of the same system artifacts through the pre-implementation activities of the project development.
  • Use case models are realized into object interaction diagrams and analysis classes, and these are further refined into the class structures that will be coded.
  • The Pattern Point model is a constituent of the proposed SysML point approach (Figure 3).
  • The remainder of this dissertation defines and validates the Pattern Point estimation model.

3.1 Design Patterns

  • In software engineering, a design pattern is a common reusable solution to a frequently occurring problem in software design.
  • A design pattern is not a finished design that can be transformed directly into code.
  • It is a description or template for how to solve a problem that can be used in many different situations.
  • Typically, object-oriented design patterns display relationships and interactions between classes or objects without specifying the final application classes or objects that are involved.
  • Algorithms are not considered design patterns because they solve computational problems and not design problems.

3.1.1 History

  • The concept of a design pattern was not formalized for several years.
  • Patterns, in general, emerged as an architectural concept introduced by Christopher Alexander in 1977.
  • In 1987, Kent Beck and Ward Cunningham began experimenting with the concept of applying patterns to computer programming and presented their results at the OOPSLA conference that year [20], [21].
  • In the following years, Beck, Cunningham and others followed up on this work.
  • In 1994, the first Pattern Languages of Programming conference was held, and the following year the Portland Pattern Repository was created for the documentation of design patterns.

3.1.2 Uses

  • Design patterns provide tested, proven development paradigms and can thus speed up the development process.
  • Effective software design demands the consideration of issues that may not come to light until later in the implementation stage.
  • These techniques are difficult to apply to a broader range of problems.
  • Design patterns provide general solutions, documented in a format that doesn't require specifics tied to a particular problem.
  • Moreover, patterns enable developers to communicate using established names for software interactions.

3.1.3 Classification

  • Object-oriented design patterns are classified into the categories: Creational Patterns, Structural Patterns, and Behavioral Patterns, and described using the concepts of aggregation, delegation, and consultation [21].
  • Creational Patterns are design patterns that are concerned with object creation mechanisms; trying to create objects in a manner suitable to the situation.
  • Lastly, Behavioral Patterns are design patterns that identify common communication patterns between objects and realize these patterns.
  • By doing so, these patterns increase flexibility in carrying out this communication.
  • Table 1 lists design patterns classified into the three categories.

3.2 The Pattern Point Model

  • The Pattern Points (PP) model is an empirical parametric estimation method that utilizes UML sequence diagrams (object interactions) to predict development effort in the analysis phase of an object-oriented development project.
  • Each pattern is sized based on a pattern ranking and an implementation ranking.
  • As the interaction model is refined and designers have identified which patterns to use in the construction of each object interaction, a single unadjusted component size estimate can be attained.
  • Size estimates are then adjusted to account for technical and environmental factors such as lead programmer experience and requirements volatility.
  • At the late analysis stage where the object interactions have been further refined to reflect some initial design elements, the PP metric is computed a little differently.

3.2.1 The pattern point method

  • The Pattern Point size estimation process is composed of three main phases, corresponding to analogous phases in the FP approach [30].
  • PP1 is applicable at the beginning of the analysis phase, where a majority of the design constructs have not yet been formalized, whereas PP2 takes into account the structural constructs that have been identified in the late analysis phase.
  • Following are the three main steps in estimating the Pattern Point size.

3.3 Identification and Classification of User Objects

  • The user objects that form the design patterns are classified into four groups.
  • Table 1 shows a default grouping of the objects that comprise the 23 design patterns defined by Gamma et al. [3].
  • With regard to the previous example, the objects EmergencyReportForm and ReportEmergencyButton belong to HIT.
  • In the example [3], a DMT component is the IncidentManagement subsystem containing classes responsible for issuing SQL queries in order to store and retrieve records representing Incidents in the database.
  • D. Task management type (TMT) - TMT objects are responsible for the definition and control of tasks.

3.4 Evaluation of a Pattern Complexity Level

  • The second step is to evaluate the complexity level of the design patterns that are found in the object interaction analysis of the system.
  • The structural complexity is a function of the number of classes and the number of associations that are identified in the structure of the design pattern.
  • These are the Interface pattern and the Filter pattern as defined in [4].
  • The PP1 metric is a function of the Degree of Difficulty (DD) and Structural Complexity (SC) of the design pattern, and PP2 takes the number of implemented concrete classes in the pattern also into consideration.

3.5 Estimating the Total Unadjusted Pattern Point

  • After estimating the complexity of each of the design patterns found in the object interaction analysis of the system according to Table 2, the authors can now compute the Total Unadjusted Pattern Point (TUPP).
  • To achieve this, Table 3 below, as defined in the Class Point estimation [29] is completed for Pattern Point estimation.
  • Typology and complexity level are given by the corresponding row and column, respectively.
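
Since Tables 2 and 3 are not reproduced in this summary, the sketch below only illustrates the shape of the computation: the TUPP is a weighted total of the pattern counts over each typology/complexity-level cell. Both the weights and the counts are hypothetical placeholders.

```python
# Total Unadjusted Pattern Point as a weighted total over the table of
# pattern counts by typology (row) and complexity level (column).
# Both the weights and the counts below are hypothetical placeholders.
weights = {          # typology -> weight per complexity level
    "PDT": {"low": 3, "average": 6, "high": 10},
    "HIT": {"low": 4, "average": 7, "high": 12},
    "DMT": {"low": 5, "average": 8, "high": 13},
    "TMT": {"low": 4, "average": 6, "high": 9},
}
counts = {
    "PDT": {"low": 2, "average": 3, "high": 1},
    "HIT": {"low": 1, "average": 2, "high": 0},
    "DMT": {"low": 0, "average": 1, "high": 1},
    "TMT": {"low": 1, "average": 0, "high": 0},
}

tupp = sum(weights[t][lvl] * counts[t][lvl]
           for t in weights for lvl in weights[t])
print("TUPP =", tupp)
```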

3.6 Technical Complexity and Environmental Factor Estimation

  • The Technical Complexity Factor (TCF) [9] is determined by assigning the degree of influence (ranging from 0 to 5) that 13 general system characteristics have on the application, from the designer’s point of view.
  • The estimates given for the degrees of influence are recorded in the Technical factors table illustrated in Table 4.
  • The final value of the Adjusted Pattern Point (PP) is obtained by multiplying the Total Unadjusted Pattern Point value by the TCF and the EAF: PP = TUPP × TCF × EAF.
  • It is worth mentioning that the Technical Complexity Factor and Environmental Adjustment Factor are determined by taking into account the characteristics that are considered in the FP.
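
A sketch of the final adjustment step described above. The degrees of influence and, in particular, the 0.6 + 0.01 * sum(...) form of the two adjustment factors are assumptions made for illustration; the exact constants used in the Pattern Point method are not given in this summary.

```python
# Adjusted Pattern Point: PP = TUPP * TCF * EAF.
# Degrees of influence (0-5) for the 13 technical and the environmental
# characteristics are hypothetical, and so is the 0.6 + 0.01 * sum(...) form
# of the adjustment (the exact constants are not reproduced in this summary).
technical_ratings = [3, 2, 4, 1, 0, 2, 3, 5, 2, 1, 3, 2, 4]   # 13 factors
environmental_ratings = [4, 3, 2, 5, 1, 2, 3, 2]

tcf = 0.6 + 0.01 * sum(technical_ratings)
eaf = 0.6 + 0.01 * sum(environmental_ratings)

tupp = 220          # hypothetical Total Unadjusted Pattern Point
pp = tupp * tcf * eaf
print(f"TCF = {tcf:.2f}, EAF = {eaf:.2f}, PP = {pp:.1f}")
```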

4. THEORETICAL VALIDATION

  • The PP metric, as well as its constituent metrics DD, SC, and PC, has been defined so far, but a software measure can be acceptable and effectively usable only if its usefulness has been proven by means of a validation process.
  • The goal of such a process is to show that a measure really measures the attribute it is supposed to and that it is practically useful [29].
  • Theoretical validation is a fundamental step in the validation process and should allow one to demonstrate that a measure satisfies the properties characterizing the concept (e.g., size, complexity, coupling) it is intended to capture [5].
  • The framework contributes to the definition of a stronger theoretical ground of software measurement by providing convenient and intuitive properties for several measurement concepts, such as complexity, cohesion, length, coupling and size.
  • Within the framework, a system is characterized as a set of elements and a set of relationships between those elements, as formalized in the following definition.

4.1 Representation of Systems and Modules

  • A system S will be represented as a pair <E, R>, where E represents the set of elements of S and R is a binary relation on E (R ⊆ E × E) representing the relationships between S’s elements.
  • The basic properties of size measures are very intuitive; they ensure that the size cannot be negative, it is null when the system has no element, and it can be obtained as the sum of the size of its modules when they are disjoint.
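
For reference, the three basic size properties referred to above can be stated formally for a system S = <E, R> partitioned into modules; this is a restatement of the property-based framework cited in [5], not new material from the dissertation.

```latex
% Basic size properties for a system S = <E, R> with modules m_1, m_2
\begin{align*}
\text{Nonnegativity:}     &\quad \mathrm{Size}(S) \geq 0 \\
\text{Null value:}        &\quad E = \emptyset \;\Rightarrow\; \mathrm{Size}(S) = 0 \\
\text{Module additivity:} &\quad S = m_1 \cup m_2,\ E_{m_1} \cap E_{m_2} = \emptyset
                           \;\Rightarrow\; \mathrm{Size}(S) = \mathrm{Size}(m_1) + \mathrm{Size}(m_2)
\end{align*}
```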

4.3 Proof

  • Since the PP value is obtained as a weighted sum of nonnegative numbers, the Nonnegativity property holds.
  • If no design pattern (i.e., no classes/objects or associations/calls) is present in the system analysis, the PP value is trivially null and the Null Value property is also verified.
  • This means that for each pattern, the values for DD and SC will be unchanged after the partitioning.

5. EMPIRICAL VALIDATION

  • In the literature, it is largely accepted that system size is strongly correlated with development effort [17]-[20].
  • The theoretical validation conducted in the previous section illustrates that the Pattern Point measures satisfy properties that are considered requisite for size measures.
  • A theoretical validation alone does not guarantee the usefulness of the measures as predictors of effort and cost.
  • Thus, the author has performed an empirical study aimed at determining whether the Pattern Point measures can be used to predict the development effort of OO systems in terms of person-days (8 hours per day).
  • The subject of the study was the initial release of the IBM Lotus Quickr software product.

5.1 IBM Lotus Quickr 8.0

  • Lotus Quickr is IBM’s team collaboration and content sharing software that helps users access and interact with the people, information and project materials that they need to get their work done.
  • The software was released in June 2007, and the Pattern Point method was applied retroactively on the recorded data.

5.2 Applying the Pattern Point Method to Lotus Quickr

  • As in many software development projects, the documentation was incomplete, particularly with respect to the artifacts from the analysis phase of the software product; i.e., there was little or no documentation of the object interaction analyses, including sequence diagrams.
  • There was ample data on implemented use case scenarios, and the package structure of the code was designed for easy identification of the design patterns in play, which helped in the reverse engineering of the object interaction diagrams described in the following section.
  • The reverse engineering tool MaintainJ was employed to reverse engineer the object interaction diagrams involved in a particular use case.

5.3 Reverse Engineering Using MaintainJ

  • MaintainJ is an Eclipse plug-in that generates runtime UML sequence and class diagrams for a use case.
  • In Step 2, the user can now log in to the Quickr application and perform use case scenarios with the MaintainJ application running.
  • The author has written a separate tool that takes as input the trace file and it outputs class and method names involved in the object interactions to a text file.
  • The same was also verified in the Class Point approach, i.e., whether or not the 14 Function Point factors are useful in that context, and whether the four additional Class Point factors enhance the prediction accuracy [29].
  • In the sections that follow, the cross validation process applied to PP1 and PP2 is described.

5.4 The Cross Validation Process

  • To carry out the cross validation process on the 78 selected use cases from Lotus Quickr, the following steps were performed: 1. The whole data set was partitioned into eight randomly selected test sets: seven of equal size (10 use cases each) and a last test set with two fewer elements (8).
  • 2. For each test set, the remaining use cases were analyzed to identify the corresponding training set, obtained by removing influential outliers.
  • 3. An Ordinary Least-Squares (OLS) regression analysis was performed on each training set to derive the effort prediction model.
  • 4. Accuracy was separately calculated for each test set and the resulting values were aggregated across all 8 test sets.
  • In what follows, the authors describe each of the above steps.
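
The following sketch, with hypothetical data and a simple univariate OLS fit standing in for the models actually derived in the dissertation, illustrates the mechanics of the 8-fold procedure listed above (outlier removal is omitted).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data set: one size measure (e.g. PP1) and actual effort
# in person-days for 78 use cases.
pp = rng.uniform(5, 60, size=78)
effort = 2.0 + 1.5 * pp + rng.normal(0, 5, size=78)

# 1. Partition into eight randomly selected test sets (7 x 10 and 1 x 8).
order = rng.permutation(78)
test_sets = [order[i * 10:(i + 1) * 10] for i in range(7)] + [order[70:]]

mmre_per_fold = []
for test_idx in test_sets:
    train_idx = np.setdiff1d(np.arange(78), test_idx)
    # 2./3. OLS regression on the training set: effort = b0 + b1 * PP.
    b1, b0 = np.polyfit(pp[train_idx], effort[train_idx], 1)
    # 4. Accuracy on the corresponding test set.
    pred = b0 + b1 * pp[test_idx]
    mre = np.abs(effort[test_idx] - pred) / effort[test_idx]
    mmre_per_fold.append(mre.mean())

print("aggregate MMRE:", np.round(np.mean(mmre_per_fold), 3))
```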

5.5 Partitioning the Data Set

  • Table 6 reports the data of the 78 use cases, following the order resulting from the random partition performed.
  • Thus, the first ten use cases form the first test set, the subsequent ten use cases form the second one, and so on.

5.6 OLS Regression Analysis to Derive Effort Prediction Models

  • An Ordinary Least-Squares regression analysis was applied in order to perform an empirical validation of the PP1 and PP2 measures.
  • When applying the OLS regression, a number of indicators were taken into account to establish the quality of the prediction.
  • Furthermore, to evaluate statistical significance, a t-test was performed and the p-value and t-value of the coefficient and the intercept were determined for each model.
  • When the p-value is less than 0.05, the authors can reject the hypothesis that the coefficient is zero; the reliability of the predictor is then given by the t-value of the coefficient.
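
For illustration only (the dissertation does not specify the statistical package used), the p-value and t-value of a coefficient can be read directly from an OLS fit; the data below are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
pp1 = rng.uniform(5, 60, size=40)                  # hypothetical PP1 values
effort = 3.0 + 1.4 * pp1 + rng.normal(0, 6, 40)    # hypothetical effort (person-days)

model = sm.OLS(effort, sm.add_constant(pp1)).fit()
# A p-value below 0.05 lets us reject the hypothesis that the coefficient is
# zero; the t-value then indicates the reliability of the predictor.
print("coefficient p-value:", model.pvalues[1])
print("coefficient t-value:", model.tvalues[1])
print("R-squared:", model.rsquared)
```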

5.7 Accuracy Evaluation of the Prediction Models

  • In order to assess the acceptability of the effort prediction models, the criteria suggested by Conte et al. [31] were adopted.
  • For each test set, the prediction accuracy has been evaluated by taking into account a summary measure, given by the Mean of MRE (MMRE), to measure the aggregation of MRE over the 10 observations.
  • The values of such measures are reported in Tables 18 to 25.
  • An MMRE of at most 0.25 represents an acceptable threshold for an effort prediction model, as suggested by Conte et al. [31]; this is confirmed by the aggregate (mean) and median MMRE values for PP1 and PP2 in Table 17, which are both ≤ 0.25.
  • This suggests the use of the PP1 measure at the beginning of the development process, in order to obtain a preliminary effort estimation, which can be refined by employing PP2 when the number of Pattern Concrete classes is known.
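
The accuracy measures used above are the standard criteria of Conte et al.; written out, with e_i the actual and ê_i the predicted effort of observation i:

```latex
\mathrm{MRE}_i = \frac{\lvert e_i - \hat{e}_i \rvert}{e_i}, \qquad
\mathrm{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MRE}_i, \qquad
\mathrm{PRED}(0.25) = \frac{\lvert \{\, i : \mathrm{MRE}_i \le 0.25 \,\} \rvert}{n}
```

An MMRE of at most 0.25 is the threshold referred to in the text; PRED(0.25) ≥ 0.75 is the companion criterion commonly associated with the same source.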

6.1 Single Measures and Their Sums

  • Courtney et al. [71] report that researchers who set out to learn empirical relationships by experimenting with different combinations of measures and functional forms before choosing the one with the highest correlation tend to obtain apparently good models on small data sets, even when no real relationship exists.
  • Then, the performance of the derived models for all considered measures was evaluated using the data coming from the corresponding testing sets.
  • Table 27 shows a summary descriptive statistics of the measures considered.
  • In fact, all the measures with SC fare slightly better than the PP1 metric.
  • First, the PP2 metric is better correlated to effort than any single measure composing it.

6.2 Multivariate OLS Regression

  • In order to complete the analysis, a multivariate OLS regression was carried out using the basic measures of the Pattern Point approach as independent variables.
  • Again, the 8-fold cross validation technique was applied by carrying out a multivariate OLS regression on the eight training sets, and then evaluating the performance of the derived models, using the data coming from the corresponding testing sets.
  • Table 29 reports the aggregate MMRE and PRED (0.25) resulting from this analysis.
  • Compared with the values reported in Table 26, it can be deduced that the PP2 measure exhibits a more accurate predictive capability.
  • In any case, this study has confirmed once again that the use of the PP2 measure may yield better predictive accuracy in models based on a multivariate regression as well.

7.1 Conclusions

  • There are several models in existence that are used to estimate the size of software systems.
  • The Pattern Point model provides a system-level size measure using the design patterns from object interaction analyses in the late OOA phase of development.
  • The empirical study presented in the dissertation has suggested that the PP1 measure may have a predictive capability equal to or lower than that of its constituent SC metric.
  • A multi-project study is desired to assess the possible effects of the Technical Complexity Factors and Environmental Factors in the Pattern Point method.


Citations
Journal Article
TL;DR: In this paper, the organization of networks in the human cerebrum was explored using resting-state functional connectivity MRI data from 1,000 subjects and a clustering approach was employed to identify and replicate networks of functionally coupled regions across the cerebral cortex.
Abstract: Information processing in the cerebral cortex involves interactions among distributed areas. Anatomical connectivity suggests that certain areas form local hierarchical relations such as within the visual system. Other connectivity patterns, particularly among association areas, suggest the presence of large-scale circuits without clear hierarchical relations. In this study the organization of networks in the human cerebrum was explored using resting-state functional connectivity MRI. Data from 1,000 subjects were registered using surface-based alignment. A clustering approach was employed to identify and replicate networks of functionally coupled regions across the cerebral cortex. The results revealed local networks confined to sensory and motor cortices as well as distributed networks of association regions. Within the sensory and motor cortices, functional connectivity followed topographic representations across adjacent areas. In association cortex, the connectivity patterns often showed abrupt transitions between network boundaries. Focused analyses were performed to better understand properties of network connectivity. A canonical sensory-motor pathway involving primary visual area, putative middle temporal area complex (MT+), lateral intraparietal area, and frontal eye field was analyzed to explore how interactions might arise within and between networks. Results showed that adjacent regions of the MT+ complex demonstrate differential connectivity consistent with a hierarchical pathway that spans networks. The functional connectivity of parietal and prefrontal association cortices was next explored. Distinct connectivity profiles of neighboring regions suggest they participate in distributed networks that, while showing evidence for interactions, are embedded within largely parallel, interdigitated circuits. We conclude by discussing the organization of these large-scale cerebral networks in relation to monkey anatomy and their potential evolutionary expansion in humans to support cognition.

6,284 citations

Journal Article
TL;DR: Three atlases at the 100-, 200- and 300-parcellation levels derived from 79 healthy normal volunteers are made freely available online along with tools to interface this atlas with SPM, BioImage Suite and other analysis packages.

822 citations


Cites background from "Analysis of a large fMRI cohort: St..."

  • ...Analysis of a wide range of anatomic registration methods has also shown significant spatial mismatch (of the order of 1 cm) between subjects in many brain regions (Hellier et al., 2003; Thirion et al., 2007)....

Journal Article
TL;DR: Although neuroimaging is unlikely to be cheaper than other tools in the near future, there is growing evidence that it may provide hidden information about the consumer experience.
Abstract: The application of neuroimaging methods to product marketing - neuromarketing - has recently gained considerable popularity. We propose that there are two main reasons for this trend. First, the possibility that neuroimaging will become cheaper and faster than other marketing methods; and second, the hope that neuroimaging will provide marketers with information that is not obtainable through conventional marketing methods. Although neuroimaging is unlikely to be cheaper than other tools in the near future, there is growing evidence that it may provide hidden information about the consumer experience. The most promising application of neuroimaging methods to marketing may come before a product is even released - when it is just an idea being developed.

744 citations

Journal Article
TL;DR: A model is proposed that extends the original idea of the MNS to include forward and inverse internal models and motor and sensory simulation, distinguishing the MNS from a more general concept of sVx.
Abstract: Many neuroimaging studies of the mirror neuron system (MNS) examine if certain voxels in the brain are shared between action observation and execution (shared voxels, sVx). Unfortunately, finding sVx in standard group analyses is not a guarantee that sVx exist in individual subjects. Using unsmoothed, single-subject analyses we show sVx can be reliably found in all 16 investigated participants. Beside the ventral premotor (BA6/44) and inferior parietal cortex (area PF) where mirror neurons (MNs) have been found in monkeys, sVx were reliably observed in dorsal premotor, supplementary motor, middle cingulate, somatosensory (BA3, BA2, and OP1), superior parietal, middle temporal cortex and cerebellum. For the premotor, somatosensory and parietal areas, sVx were more numerous in the left hemisphere. The hand representation of the primary motor cortex showed a reduced BOLD during hand action observation, possibly preventing undesired overt imitation. This study provides a more detailed description of the location and reliability of sVx and proposes a model that extends the original idea of the MNS to include forward and inverse internal models and motor and sensory simulation, distinguishing the MNS from a more general concept of sVx.

647 citations


Cites background from "Analysis of a large fMRI cohort: St..."

  • ...This helped preserve statistical power, a critical issue in neuroimaging (Thirion et al. 2007)....

Journal Article
21 Apr 2009-PLOS ONE
TL;DR: The results demonstrate the highly organized modular architecture and associated topological properties in the temporal and spatial brain functional networks of the human brain that underlie spontaneous neuronal dynamics, which provides important implications for understanding of how intrinsically coherent spontaneous brain activity has evolved into an optimal neuronal architecture to support global computation and information integration in the absence of specific stimuli or behaviors.
Abstract: The characterization of topological architecture of complex brain networks is one of the most challenging issues in neuroscience. Slow (<0.1 Hz), spontaneous fluctuations of the blood oxygen level dependent (BOLD) signal in functional magnetic resonance imaging are thought to be potentially important for the reflection of spontaneous neuronal activity. Many studies have shown that these fluctuations are highly coherent within anatomically or functionally linked areas of the brain. However, the underlying topological mechanisms responsible for these coherent intrinsic or spontaneous fluctuations are still poorly understood. Here, we apply modern network analysis techniques to investigate how spontaneous neuronal activities in the human brain derived from the resting-state BOLD signals are topologically organized at both the temporal and spatial scales. We first show that the spontaneous brain functional networks have an intrinsically cohesive modular structure in which the connections between regions are much denser within modules than between them. These identified modules are found to be closely associated with several well known functionally interconnected subsystems such as the somatosensory/motor, auditory, attention, visual, subcortical, and the “default” system. Specifically, we demonstrate that the module-specific topological features can not be captured by means of computing the corresponding global network parameters, suggesting a unique organization within each module. Finally, we identify several pivotal network connectors and paths (predominantly associated with the association and limbic/paralimbic cortex regions) that are vital for the global coordination of information flow over the whole network, and we find that their lesions (deletions) critically affect the stability and robustness of the brain functional system. Together, our results demonstrate the highly organized modular architecture and associated topological properties in the temporal and spatial brain functional networks of the human brain that underlie spontaneous neuronal dynamics, which provides important implications for our understanding of how intrinsically coherent spontaneous brain activity has evolved into an optimal neuronal architecture to support global computation and information integration in the absence of specific stimuli or behaviors.

597 citations


Cites background from "Analysis of a large fMRI cohort: St..."

  • ...One of the key characteristics of fMRI data, is their large intersubject variability, which may dramatically influence on the robustness of group analysis [60]....

References
Journal Article
TL;DR: The standard nonparametric randomization and permutation testing ideas are developed at an accessible level, using practical examples from functional neuroimaging, and the extensions for multiple comparisons described.
Abstract: Requiring only minimal assumptions for validity, nonparametric permutation testing provides a flexible and intuitive methodology for the statistical analysis of data from functional neuroimaging experiments, at some computational expense. Introduced into the functional neuroimaging literature by Holmes et al. ([1996]: J Cereb Blood Flow Metab 16:7-22), the permutation approach readily accounts for the multiple comparisons problem implicit in the standard voxel-by-voxel hypothesis testing framework. When the appropriate assumptions hold, the nonparametric permutation approach gives results similar to those obtained from a comparable Statistical Parametric Mapping approach using a general linear model with multiple comparisons corrections derived from random field theory. For analyses with low degrees of freedom, such as single subject PET/SPECT experiments or multi-subject PET/SPECT or fMRI designs assessed for population effects, the nonparametric approach employing a locally pooled (smoothed) variance estimate can outperform the comparable Statistical Parametric Mapping approach. Thus, these nonparametric techniques can be used to verify the validity of less computationally expensive parametric approaches. Although the theory and relative advantages of permutation approaches have been discussed by various authors, there has been no accessible explication of the method, and no freely distributed software implementing it. Consequently, there have been few practical applications of the technique. This article, and the accompanying MATLAB software, attempts to address these issues. The standard nonparametric randomization and permutation testing ideas are developed at an accessible level, using practical examples from functional neuroimaging, and the extensions for multiple comparisons described. Three worked examples from PET and fMRI are presented, with discussion, and comparisons with standard parametric approaches made where appropriate. Practical considerations are given throughout, and relevant statistical concepts are expounded in appendices.

5,777 citations


"Analysis of a large fMRI cohort: St..." refers background in this paper

  • ...Non-parametric tests may avoid these issues (Holmes et al., 1996; Brammer et al., 1997; Bullmore et al., 1999; Nichols and Holmes, 2002; Hayasaka and Nichols, 2003; Mériaux et al., 2006a), but at a higher computational cost....

Journal Article
TL;DR: This paper introduces to the neuroscience literature statistical procedures for controlling the false discovery rate (FDR) and demonstrates this approach using both simulations and functional magnetic resonance imaging data from two simple experiments.

4,838 citations

Journal Article
TL;DR: This work has developed a means for generating an average folding pattern across a large number of individual subjects as a function on the unit sphere and of nonrigidly aligning each individual with the average, establishing a spherical surface‐based coordinate system that is adapted to the folding pattern of each individual subject, allowing for much higher localization accuracy of structural and functional features of the human brain.
Abstract: The neurons of the human cerebral cortex are arranged in a highly folded sheet, with the majority of the cortical surface area buried in folds. Cortical maps are typically arranged with a topography oriented parallel to the cortical surface. Despite this unambiguous sheetlike geometry, the most commonly used coordinate systems for localizing cortical features are based on 3-D stereotaxic coordinates rather than on position relative to the 2-D cortical sheet. In order to address the need for a more natural surface-based coordinate system for the cortex, we have developed a means for generating an average folding pattern across a large number of individual subjects as a function on the unit sphere and of nonrigidly aligning each individual with the average. This establishes a spherical surface-based coordinate system that is adapted to the folding pattern of each individual subject, allowing for much higher localization accuracy of structural and functional features of the human brain.

3,024 citations

Book
01 Dec 1971
TL;DR: Theoretical Bases for Calculating the ARE Examples of the Calculations of Efficacy and ARE Analysis of Count Data.
Abstract: Introduction and Fundamentals Introduction Fundamental Statistical Concepts Order Statistics, Quantiles, and Coverages Introduction Quantile Function Empirical Distribution Function Statistical Properties of Order Statistics Probability-Integral Transformation Joint Distribution of Order Statistics Distributions of the Median and Range Exact Moments of Order Statistics Large-Sample Approximations to the Moments of Order Statistics Asymptotic Distribution of Order Statistics Tolerance Limits for Distributions and Coverages Tests of Randomness Introduction Tests Based on the Total Number of Runs Tests Based on the Length of the Longest Run Runs Up and Down A Test Based on Ranks Tests of Goodness of Fit Introduction The Chi-Square Goodness-of-Fit Test The Kolmogorov-Smirnov One-Sample Statistic Applications of the Kolmogorov-Smirnov One-Sample Statistics Lilliefors's Test for Normality Lilliefors's Test for the Exponential Distribution Anderson-Darling Test Visual Analysis of Goodness of Fit One-Sample and Paired-Sample Procedures Introduction Confidence Interval for a Population Quantile Hypothesis Testing for a Population Quantile The Sign Test and Confidence Interval for the Median Rank-Order Statistics Treatment of Ties in Rank Tests The Wilcoxon Signed-Rank Test and Confidence Interval The General Two-Sample Problem Introduction The Wald-Wolfowitz Runs Test The Kolmogorov-Smirnov Two-Sample Test The Median Test The Control Median Test The Mann-Whitney U Test and Confidence Interval Linear Rank Statistics and the General Two-Sample Problem Introduction Definition of Linear Rank Statistics Distribution Properties of Linear Rank Statistics Usefulness in Inference Linear Rank Tests for the Location Problem Introduction The Wilcoxon Rank-Sum Test and Confidence Interval Other Location Tests Linear Rank Tests for the Scale Problem Introduction The Mood Test The Freund-Ansari-Bradley-David-Barton Tests The Siegel-Tukey Test The Klotz Normal-Scores Test The Percentile Modified Rank Tests for Scale The Sukhatme Test Confidence-Interval Procedures Other Tests for the Scale Problem Applications Tests of the Equality of k Independent Samples Introduction Extension of the Median Test Extension of the Control Median Test The Kruskal-Wallis One-Way ANOVA Test and Multiple Comparisons Other Rank-Test Statistics Tests against Ordered Alternatives Comparisons with a Control Measures of Association for Bivariate Samples Introduction: Definition of Measures of Association in a Bivariate Population Kendall's Tau Coefficient Spearman's Coefficient of Rank Correlation The Relations between R and T E(R), tau, and rho Another Measure of Association Applications Measures of Association in Multiple Classifications Introduction Friedman's Two-Way Analysis of Variance by Ranks in a k x n Table and Multiple Comparisons Page's Test for Ordered Alternatives The Coefficient of Concordance for k Sets of Rankings of n Objects The Coefficient of Concordance for k Sets of Incomplete Rankings Kendall's Tau Coefficient for Partial Correlation Asymptotic Relative Efficiency Introduction Theoretical Bases for Calculating the ARE Examples of the Calculations of Efficacy and ARE Analysis of Count Data Introduction Contingency Tables Some Special Results for k x 2 Contingency Tables Fisher's Exact Test McNemar's Test Analysis of Multinomial Data Summary Appendix of Tables Answers to Problems References Index A Summary and Problems appear at the end of each chapter.

2,988 citations

Book
23 Sep 1997
TL;DR: Principles and methods: Linking brain and behaviour, C. Frith Analyzing brain images - principles and overview, K. Friston Characterizing brain images with the general linear model, A. Holmes et al Making statistical inferences, J. Poline et al Characterizing distributed functional systems, K Friston characterizing functional integration.
Abstract: Principles and methods: Linking brain and behaviour, C. Frith Analyzing brain images - principles and overview, K. Friston Registering brain images to anatomy, J. Ashbumer, K. Friston Characterizing brain images with the general linear model, A. Holmes et al Making statistical inferences, J. Poline et al Characterizing distributed functional systems, K. Friston Characterizing functional integration, C. Buechel, K. Friston A taxonomy of study design, K. Friston et al. Functional anatomy: Dynamism of a PET image - studies of visual function, S. Zeki Mapping somatosensory systems, E. Paulesu, R. Frackowiak Functional organization of the motor system, R. Passingham The cerebral basis of functional recovery, R. Frackowiak Functional anatomy of reading, C. Price Higher cognitive processes, C. Frith, R. Dolan Human memory systems, R. Dolan et al Measuring neuromodulation with functional imaging, R. Dolan et al Brain maps - linking the present with the future, J. Mazziotta et al Functional imaging with magnetic resonance, R. Turner et al.

1,816 citations

Frequently Asked Questions (13)
Q1. What have the authors contributed in "Analysis of a large fMRI cohort: Statistical and methodological issues for group analyses"?

While many efforts have been made to control the rate of false detections, the statistical characteristics of the data have rarely been studied, and the reliability of the results (supra-threshold areas that are considered activated regions) has rarely been assessed. In this work, the authors take advantage of the large cohort of subjects who underwent the Localizer experiment to study the statistical nature of group data, propose some measures of the reliability of group studies, and address simple methodological questions such as: is there, from the point of view of reliability, an optimal statistical threshold for activity maps? Their results suggest that (i) optimal thresholds can indeed be found, and are rather lower than the usual thresholds corrected for multiple comparisons; (ii) 20 subjects or more should be included in functional neuroimaging studies in order to have sufficient reliability; (iii) non-parametric significance assessment should be preferred to parametric methods; (iv) cluster-level thresholding is more reliable than voxel-based thresholding; and (v) mixed-effects tests are much more reliable than random-effects tests.

Several directions may be addressed in the future: • First, trying to relate inter-subject variability to behavioral differences and individual or psychological characteristics of the subjects. Once again, such an investigation may be undertaken only on large databases of subjects, and the database used in this experiment might, and probably will, be used in such a framework. • Second, efforts will further be made to relate spatial functional variability to anatomical variability. While some cortex-based analysis reports have indicated a greater sensitivity than standard volume-based mappings [Fischl et al., 1999], statistical evidence is still lacking, and it is not clear at all how much can be gained by taking into account macroanatomical features, e.g. sulco-gyral anatomy.

Appropriate penalty terms are used to handle the case I(r) = 0. The authors have performed some experiments using η = 10 voxels or η = 30 voxels, and use δ = 6mm. 

Many scientists spend a great deal of effort to obtain statistically significant results in neuroimaging studies in order to validate a prior hypothesis on brain function, and it is certainly true that one of the greatest difficulties they have to face is the high variability present in their datasets across subjects.

Voxel-based random effects analysis is the standard way to analyse data from group studies (although the extraction of discrete local maxima [Worsley, 2005] presents an attractive alternative). 

While parametric tests are particularly efficient and computationally cheap, they are based on possibly unrealistic hypotheses that may reduce their sensitivity. 

In order to estimate the reliability of a statistical model, the authors need a method to compare statistical maps obtained with the same technique but sampled from different groups of subjects.

It is worthwhile to note that the implementation of the tests in C keeps the computation time very reasonable (cluster-level P-values can, e.g., be computed in less than one minute on a dataset of ten subjects).

The reliability measure is computed for 100 different splits of the population of subjects into R = 5 groups of S = 16 subjects, in the case of the left click-right click contrast. 

Although the statistic function does not take into account the group variance - as argued earlier, this is probably the reason for its higher performance - its distribution under the null hypothesis is tabulated by random swaps of the signs of the effects, so that it is indeed a valid group inference technique.

Using a spatial independence assumption, the log-likelihood of the data writes

$$\log P(G \mid \lambda, \pi^0_A, \pi^0_I) = \mathrm{cst} + \sum_{v=1}^{V} \log\!\Big( \lambda\,(\pi^0_A)^{R-G(v)}(\pi^1_A)^{G(v)} + (1-\lambda)\,(\pi^0_I)^{R-G(v)}(\pi^1_I)^{G(v)} \Big) \qquad (7)$$

Assuming R ≥ 3, the three free parameters $\pi^0_A$, $\pi^0_I$, and $\lambda$ can be estimated using EM or Newton's methods.
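
The three parameters can be estimated with a few lines of EM. The sketch below is a minimal illustration under the stated independence assumption, writing π¹ = 1 − π⁰ for the per-map detection probability of each class and taking G(v) as the number of the R group maps in which voxel v is above threshold; it is not the authors' implementation.

```python
import numpy as np

def em_active_inactive(G, R, n_iter=200, tol=1e-8):
    """Fit the two-class (active/inactive) mixture of Eq. (7) by EM.

    G is a 1-D array where G[v] is the number of the R group maps in which
    voxel v is above threshold (hypothetical input; names follow the text).
    Returns (lam, pi1_A, pi1_I): mixing weight and per-map detection
    probabilities of the "active" and "inactive" classes.
    """
    G = np.asarray(G, dtype=float)
    lam, pi1_A, pi1_I = 0.5, 0.8, 0.1            # rough starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly active.
        f_A = pi1_A ** G * (1.0 - pi1_A) ** (R - G)
        f_I = pi1_I ** G * (1.0 - pi1_I) ** (R - G)
        resp = lam * f_A / (lam * f_A + (1.0 - lam) * f_I + 1e-300)
        # M-step: re-estimate the mixing weight and detection probabilities.
        lam_new = resp.mean()
        pi1_A_new = (resp * G).sum() / (R * resp.sum())
        pi1_I_new = ((1.0 - resp) * G).sum() / (R * (1.0 - resp).sum())
        converged = max(abs(lam_new - lam), abs(pi1_A_new - pi1_A),
                        abs(pi1_I_new - pi1_I)) < tol
        lam, pi1_A, pi1_I = lam_new, pi1_A_new, pi1_I_new
        if converged:
            break
    return lam, pi1_A, pi1_I
```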

The order of magnitude of such local shifts is probably as large as 1 cm in many instances (this can be observed for functional regions like the motor cortex or the visual areas [Thirion et al., in press, Stiers et al., 2006] or for the position of anatomical landmarks [Collins et al., 1998, Hellier et al., 2003]).

Since the parcel centres are defined at the group level in Talairach space, the voxels in the group result map are assigned to the parcel with the closest centre in Talairach space.