
Journal Article

A multitrait-multimethod validation of the Implicit Association Test: implicit and explicit attitudes are related but distinct constructs.

01 Jan 2007-Experimental Psychology (Hogrefe & Huber Publishers)-Vol. 54, Iss: 1, pp 14-29

Abstract: Recent theoretical and methodological innovations suggest a distinction between implicit and explicit evaluations. We applied Campbell and Fiske's (1959) classic multitrait-multimethod design precepts to test the construct validity of implicit attitudes as measured by the Implicit Association Test (IAT). Participants (N = 287) were measured on both self-report and IAT for up to seven attitude domains. Through a sequence of latent-variable structural models, systematic method variance was distinguished from attitude variance, and a correlated two-factors-per-attitude model (implicit and explicit factors) was superior to a single-factor-per-attitude specification. That is, despite sometimes strong relations between implicit and explicit attitude factors, collapsing their indicators into a single attitude factor resulted in relatively inferior model fit. We conclude that these implicit and explicit measures assess related but distinct attitude constructs. This provides a basis for, but does not distinguish between, dual-process and dual-representation theories that account for the distinctions between constructs.

Summary (5 min read)

Introduction

  • Also, this multitrait-multimethod (MTMM) approach does not identify the cognitive processes that distinguish the constructs.
  • The context of this analysis follows from Cronbach and Meehl’s (1955) classic discussion of construct validation where a construct is an indeterminant function of representation and process.
  • The authors' research provides another avenue of evidence for this growing nomological net by examining the relationship between implicit and explicit attitude measures to determine whether they can be fairly interpreted as measuring a single construct, or whether they assess related, but distinct constructs.

Preliminary Evidence

  • Greenwald and Farnham (2000) observed that a model describing implicit and explicit self-esteem as distinct latent factors provided a better fit than a single self-esteem conceptualization.
  • Likewise, Cunningham, Preacher, and Banaji (2001) found implicit and explicit measures of racial attitudes to reveal related, but distinct factors, as did Cunningham, Nezlek, and Banaji (2004) for implicit and explicit ethnocentrism.
  • Following this approach of comparing single versus dual factors in structural equation modeling, the authors reanalyzed a large dataset reported by Nosek (2005).
  • To transcend this inferential limitation, here, guided by principles articulated by Campbell and Fiske (1959), the authors use a MTMM design and comparative structural modeling analyses.
  • The authors' findings demonstrate (1) convergent and discriminant validity of the IAT, (2) that a model of distinct, but related implicit and explicit attitudes best fits the data, and (3) that this characterization is not attributable to attitude-irrelevant method variance of the IAT or of self-report.

MTMM and Confirmatory Factor Analysis

  • In their classic article on construct validation, Campbell and Fiske (1959) articulated a strategy for using MTMM matrices to evaluate convergent and discriminant validity.
  • Campbell and Fiske argued that “the clear-cut demonstration of the presence of method variance requires both several traits and several methods” (p. 85).
  • Confirmatory factor analysis (CFA) has emerged as a tool well-suited to the partitioning of MTMM data envisioned by Campbell and Fiske (Jöreskog, 1974; Widaman, 1985).
  • Cunningham, Nezlek et al. (2004) footnoted this limitation, but argued that, since a control IAT, birds vs. trees, did not load with five ingroup-outgroup IATs on an implicit ethnocentrism factor, IAT method variance was not a strong driver of the two-factor solution.
  • By comparing the fits of nested structural models, the relative merits of alternative hypotheses concerning the structure of trait and method variance can be systematically tested (Jöreskog & Sörbom, 1979; Loehlin, 2004; McDonald, 1985).

Study Overview

  • Data are from four laboratory studies in which attitudes toward seven different attitude-object pairs were measured: flowers–insects, Democrats–Republicans, humanities–science, straight–gay, Whites–Blacks, creationism–evolution, and thin people–fat people.
  • These domains were selected because on their face they cover a broad range of attitudes.
  • They reported support for this idea and also found that implicit and explicit ethnocentrism factors were related, but distinct.
  • The authors sought to demonstrate discriminant validity between attitude domains – hypothesizing that the attitude domains would form distinct factors; and convergent validity across measurement types (IAT and self-report) – hypothesizing that the implicit and explicit attitude constructs would be related, but retain distinctiveness not accounted for by method factors.
  • This simultaneous examination of discriminant and convergent validity is the core value of the MTMM approach.

Implicit Association Test (IAT)

  • One of the four samples received IATs for all seven object pairs, while the others received subsets of at least four pairs, including the flowers–insects and Democrats–Republicans pairs.
  • For the remaining IATs, response blocks for all tasks were randomized and single-discrimination practice blocks were eliminated.
  • The authors allow substantial opportunity for method factors to influence IAT performance and challenge the hypothesis that distinguishable attitude factors can be identified despite intermixed performance blocks.
  • Using four indicators was preferable, however, in terms of attaining stable estimates in the more complex models.

Self-Report Measures

  • Participants reported attitudes toward each of the target objects per pair independently using two 9-point semantic differentials.
  • Anchors for these differentials varied across data collections, including, for a given study, two of the following four pairs: cold–warm, unpleasant–pleasant, bad–good, or unfavorable–favorable.
  • Positive values indicate greater liking for the object that was implicitly preferred on average (–8 to +8).
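
The scoring described in this list can be sketched in a few lines. This is only an illustration under stated assumptions: the ratings are taken to be coded 1 to 9, and the function name `preference_score` is hypothetical rather than anything named in the paper.

```python
from statistics import mean


def preference_score(ratings_a, ratings_b):
    """Relative preference for object A over object B.

    ratings_a, ratings_b: ratings of each object on 9-point semantic
    differentials, assumed here to be coded 1 (negative anchor) to
    9 (positive anchor). The result ranges from -8 to +8.
    """
    for rating in (*ratings_a, *ratings_b):
        if not 1 <= rating <= 9:
            raise ValueError("ratings must lie on the 1-9 scale")
    # Average the items for each object, then take the difference.
    return mean(ratings_a) - mean(ratings_b)


# Hypothetical participant: warm toward flowers, cool toward insects.
score = preference_score([8, 9], [3, 2])
```

With this coding, a participant who rates both objects identically scores 0, which is the rational zero point the follow-up analyses rely on.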

Procedure

  • Similar procedures were used across all four data collections.
  • After informed consent, participants completed a selection of IATs and self-report measures.
  • The order of implicit and explicit measures was counterbalanced between subjects.
  • The correlation matrices are substantively similar when each data collection is considered independently.

Analyses

  • Following guidelines suggested by Campbell and Fiske (1959), the authors first describe and offer interpretations of the reliability and validity relations in the MTMM matrix.
  • Thus, if the 95% CI for the εaΔ includes .05, the models being compared are considered close to one another in fit, and one would be preferred only to the extent that it involves fewer parameter estimates (i.e., is more parsimonious).
  • In Model 1 the authors specify two method factors to account for the covariances among the 42 observed indicators (14 explicit and 28 implicit) across all seven general attitude domains.
  • This is accomplished by specifying models that are identical to Model 3 except that, in turn, a common implicit method factor is not specified (Model 4) and a common explicit method factor is not specified (Model 5).
  • By comparing the fits of each with that of Model 3, the authors may discern to what extent accounting for common method variance is important to understanding the structure of relations among these measures.

Description of the MTMM Correlations

  • Table 2 is a MTMM matrix for the two methods and seven attitude object pairs.
  • The first three rows of the table provide descriptive statistics for each of the fourteen measurements (i.e., full IAT D scores and averages of the self-reports).
  • Reliability estimates (Cronbach’s α) are shown in parentheses along the main diagonal in the top-left (IAT) and bottom-right (self-report) panels, and intramethod (Campbell and Fiske’s monomethod-heterotrait) correlations are listed in the other cells of these respective panels.
  • For the IAT, reliabilities are based on D scores for split-halves formed from alternating couplets of trials, since a couplet consisting of an object stimulus (e.g., a science word) and an evaluative stimulus (e.g., a “good” word) occurred every two trials.
  • That is, little evidence of common method variance is apparent across attitude domains.

IAT

  • Correlations in the gray diagonal of the bottom-left panel can be used to assess convergent validity (monotrait-heteromethod), and discriminant validity (heterotrait-heteromethod) can be assessed by those off the diagonal in this panel.
  • These data are consistent with their hypotheses for the convergent and discriminant validity of the IAT and self-report across attitude domains.
  • The power of the MTMM design is not fully harnessed by scrutinizing correlation matrices.
  • The relative merits of competing hypotheses about the structure of the data can be tested formally by comparing the fits of nested, but differentially specified, structural equation models.

MTMM Structural Equation Models

  • Summary statistics from the confirmatory structural models are listed in Table 3 (details of each model’s specifications and parameter estimates – all fit with Mplus statistical software (Muthén & Muthén, 1998–2004) – are available in the supplement to this paper at http://briannosek.com/).
  • In Model 1, the first of substantive interest, two method factors are specified, one loading on the explicit indicators and one on the implicit.
  • Comparing this model’s fit with that of Model 2 tests whether specifying distinct implicit and explicit attitude factors is superior to a single-attitude factor per domain model for these data.
  • To summarize, there was relatively little common method variance to account for in these data; statistically significant, but small amounts were found for both the explicit and implicit measurements.

Follow-Up Analyses

  • The authors conducted additional analyses to evaluate the possibility that method variance is underestimated in these models because both the IAT and the explicit measures have rational zero points; both indicate a preference for one category compared to another.
  • Some factors may primarily influence the extremity of the score away from neutrality (0).
  • People who are more skilled at task-switching will achieve less extreme scores regardless of whether, for example, they are pro-Democrat or pro-Republican.
  • This provides a liberal test for the method factor influences because it reduces the construct-valid variance by treating positive and negative score values as the same, and enhances the opportunity to see influences of extremity (distance from 0) as indicating common influence on the implicit or explicit measures.
  • Refitting the sequence of models summarized in Table 3 with the absolute values for each indicator yielded the same pattern of results (a table in which these results are listed is part of the online supplement, http://briannosek.com/).

Highly Correlated Domains

  • To more rigorously test the generality of their two-attitude specification, the authors fit the sequence of models only to the data for the three domains with relatively strong implicit–explicit correlations: creationism–evolution, Democrats–Republicans, and science–humanities.
  • Thus, by using the most highly correlated domains, and again partitioning common method variance, the authors increased the likelihood that a single attitude specification would suffice to account for variable interrelations.

Discussion

  • The results of this study add to construct validation evidence for the IAT as a measure of attitudes (Greenwald & Nosek, 2001).
  • The convergent validity of the IAT was evidenced by significant factor correlations between the implicit and explicit attitude constructs in five of the seven attitude domains, while its discriminant validity was simultaneously evidenced by the statistical superiority of the two-attitude model to the single-attitude model.
  • There was a modest amount of common method variance to account for, with statistically significant but small portions isolated from both the explicit and implicit measures, and this was also observed when absolute values of the indicators were used.
  • As Campbell and Fiske (1959, p. 84) observed, “In any given psychological measuring device, there are certain features or stimuli introduced specifically to represent the trait that it is intended to measure.
  • The MTMM design, coupled with comparative structural modeling, allowed common implicit method variance to be distinguished from implicit attitude variance.

Other Components of the Nomological Net for the Implicit Attitude Construct

  • This research provides a basis for some key components of validation of the IAT and implicit attitudes.
  • If the IAT reflected evaluations of the stimulus items and not the categories, then the implicit–explicit distinction might be explained by this difference.
  • The effort required for the chemist to specify the snowboarder’s constructs in terms of a single H2O construct would produce ungainly process theories that would likely “miss the point” of the snow and ice constructs.
  • In short, the present evidence for distinct implicit and explicit attitude constructs does not rule out the possibility that the two constructs derive from common evaluative content.

Conclusion

  • Convergent evidence across a variety of research programs suggests that the IAT is a valid measure of attitudes (see Nosek et al., 2006, for a review).
  • Like other methods such as semantic differentials, Likert scales, sequential priming, and the Stroop task, the IAT can be adapted to measure evaluations of many types of social categories.
  • The cumulative evidence identifies design factors that will influence the method’s validity, and provides a nomological net of knowledge to accelerate validation of novel applications of the IAT.
  • The emergence of the implicit attitude construct has spurred investigations to test the strength and limitations of this concept and its measurement tools, like the IAT.
  • The authors found simultaneous evidence of convergent and discriminant validity of the IAT and self-report as measures of related but distinct attitude constructs, and as distinct from methodological variation.


B.A. Nosek & F.L. Smyth: A Multitrait-Multimethod Validation of the Implicit Association Test. Experimental Psychology 2007; Vol. 54(1):14–29 © 2007 Hogrefe & Huber Publishers
A Multitrait-Multimethod Validation
of the Implicit Association Test
Implicit and Explicit Attitudes Are Related
but Distinct Constructs
Brian A. Nosek and Frederick L. Smyth
University of Virginia, Charlottesville, VA, USA
Abstract. Recent theoretical and methodological innovations suggest a distinction between implicit and explicit evaluations. We applied
Campbell and Fiske’s (1959) classic multitrait-multimethod design precepts to test the construct validity of implicit attitudes as measured
by the Implicit Association Test (IAT). Participants (N = 287) were measured on both self-report and IAT for up to seven attitude domains.
Through a sequence of latent-variable structural models, systematic method variance was distinguished from attitude variance, and a
correlated two-factors-per-attitude model (implicit and explicit factors) was superior to a single-factor-per-attitude specification. That is,
despite sometimes strong relations between implicit and explicit attitude factors, collapsing their indicators into a single attitude factor
resulted in relatively inferior model fit. We conclude that these implicit and explicit measures assess related but distinct attitude constructs.
This provides a basis for, but does not distinguish between, dual-process and dual-representation theories that account for the distinctions
between constructs.
Keywords: implicit social cognition, attitudes, individual differences, construct validity, structural equation modeling
Realizing that the human mind is more than the sum of its
conscious processes, a number of theorists have proposed
a conceptual distinction between evaluations that are the
products of introspection, called explicit attitudes, and
those that occur automatically and may exist outside of
conscious awareness, called implicit attitudes (Greenwald
& Banaji, 1995; Wilson, Lindsey, & Schooler, 2000).
Greenwald and Banaji (1995, p. 8), for example, defined
implicit attitudes as “introspectively unidentified (or inac-
curately identified) traces of past experience that mediate
favorable or unfavorable feelings toward an attitude ob-
ject.” This theory has developed in conjunction with the
invention of measurement tools that assess automatic eval-
uative associations without introspection (e.g., Fazio, San-
bonmatsu, Powell, & Kardes, 1986; Greenwald, McGhee,
& Schwartz, 1998; Nosek & Banaji, 2001; Wittenbrink,
Judd, & Park, 1997).
Some experiences with these new measurement tools
have spawned doubts about whether they measure attitudes
at all (Karpinski & Hilton, 2001), and whether a conceptual
distinction between implicit and explicit attitudes is worth-
while (Fazio & Olson, 2003). Fazio and Olson contend “it
is more appropriate to view the measures as implicit or
explicit, not the attitude (or whatever other construct)”
(2003, p. 303). The purpose of the research we report was
to test whether the structure of attitude variance derived
from an implicit measure (the Implicit Association Test
[IAT]; Greenwald et al., 1998) and from an explicit one
(semantic differentials) is best represented by one latent
factor or by two correlated latent factors, when stripped of
confounding method variance. If our hypothesis that the
latter specification will fit the data better is sustained, it will
support a view that substantively different attitude con-
structs, distinguishable from the techniques used to mea-
sure them, underlie data collected by explicit and implicit
methods.
The finding would not, however, lend weight to one side
or the other in the debate about origins of the distinction
between implicit and explicit attitudes, i.e., are they derived
from a single representation at different stages of processing
(Fazio & Olson, 2003) or do they reflect distinct
evaluative sources (Strack & Deutsch, 2004; Wilson et al.,
2000). Also, this multitrait-multimethod (MTMM) ap-
proach does not identify the cognitive processes that dis-
tinguish the constructs.
The context of this analysis follows from Cronbach and
Meehl’s (1955) classic discussion of construct validation
where a construct is an indeterminant function of represen-
tation and process. So, our reference to distinct implicit and
explicit attitude constructs should be interpreted as refer-
ring to distinguishable attitudinal components, without im-
plying a specific commitment to distinguishable formative
processes, single versus dual mental representations, or sin-
gle versus dual operative processes.
Any of these theoretical perspectives can explain dual
constructs by postulating combinations of representations
DOI 10.1027/1618-3169.54.1.14

and processes to account for the observed differences be-
tween constructs. Dual-construct validation justifies the
need to have theoretical models account for the distinction
without providing evidence for or against any particular
explanation. Further illustration of the difference between
construct validation versus commitments to dual-process
or dual-representation theories appears in the discussion
(see also Greenwald, Nosek, & Banaji, in press).
Construct Validation
The conceptual and empirical justification for a psycholog-
ical construct requires development of a nomological net
of facts, relationships, and validity evidence that clarifies
the identity and utility of the construct (Cronbach & Meehl,
1955; McArdle & Prescott, 1992). The nomological net
supporting the validity of implicit attitudes has been gain-
ing strength (Greenwald & Banaji, 1995; Nosek, Green-
wald, & Banaji, in press; Wilson et al., 2000). For example,
Poehlman, Uhlmann, Greenwald, and Banaji (2004) con-
ducted a meta-analysis of studies examining the predictive
validity of the IAT, a measure thought to be influenced by
automatic associations, and found that the IAT had robust
predictive validity across domains, and outperformed self-
report measures in some domains (stereotyping and preju-
dice), while self-report outperformed the IAT in other do-
mains (e.g., political preferences). Also, recent social-neu-
roscience research finds evidence for a neurological
distinction between automatic and controlled evaluative
processes (Cunningham, Johnson, Gatenby, Gore, & Bana-
ji, 2003; Cunningham, Johnson et al., 2004). Our research
provides another avenue of evidence for this growing no-
mological net by examining the relationship between im-
plicit and explicit attitude measures to determine whether
they can be fairly interpreted as measuring a single con-
struct, or whether they assess related, but distinct con-
structs.
Preliminary Evidence
Greenwald and Farnham (2000) observed that a model de-
scribing implicit and explicit self-esteem as distinct latent
factors provided a better fit than a single self-esteem con-
ceptualization. Likewise, Cunningham, Preacher, and Ba-
naji (2001) found implicit and explicit measures of racial
attitudes to reveal related, but distinct factors, as did
Cunningham, Nezlek, and Banaji (2004) for implicit and
explicit ethnocentrism. Following this approach of comparing
single versus dual factors in structural equation modeling,
we reanalyzed a large dataset reported by Nosek (2005).
We found support for the generalizability of a model of
distinct-but-related latent implicit and explicit attitude
constructs across 56 of 57¹ widely varying attitude domains,
showing that this observation is quite general (see Table 1
and the supplement to this paper available at
http://briannosek.com/ for a full report). Even so, in this and the
other previous structural modeling studies, specification of
implicit and explicit attitude constructs is confounded with
measurement method. As a consequence, a two-factor solution
is an indeterminant function of both attitude and
method variance.²
To transcend this inferential limitation, here, guided by
principles articulated by Campbell and Fiske (1959), we use
a MTMM design and comparative structural modeling anal-
yses. This approach allowed us to distinguish attitude and
method variance from IAT and thermometer ratings, the op-
erationalizations of implicit and explicit attitudes, respective-
ly. Our findings demonstrate (1) convergent and discriminant
validity of the IAT, (2) that a model of distinct, but related
implicit and explicit attitudes best fits the data, and (3) that
this characterization is not attributable to attitude-irrelevant
method variance of the IAT or of self-report.
MTMM and Confirmatory Factor
Analysis
In their classic article on construct validation, Campbell
and Fiske (1959) articulated a strategy for using MTMM
matrices to evaluate convergent and discriminant validity.
This strategy requires measurement of two or more osten-
sibly distinct trait constructs by two or more measurement
methods. Convergent validity is demonstrated when indi-
cators of a given trait (or, in our study, attitude) correlate
highly across measurement method, while discriminant va-
lidity obtains when correlations between ostensibly differ-
ent traits are low. Campbell and Fiske argued that “the
clear-cut demonstration of the presence of method variance
requires both several traits and several methods” (p. 85).
They described ways to statistically evaluate the respective
contributions of trait and method factors, but looked for-
ward to continued progress in developing more rigorous
validation methods.
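
The logic of an MTMM matrix can be illustrated with a small simulation. The sketch below is not the authors' analysis: the data are simulated, and the trait labels (`T1`, `T2`), method labels (`M1`, `M2`), sample size, and weights are all assumptions chosen for illustration. Each measure is built from a trait component, a smaller method component, and noise, so the same-trait, different-method correlation should come out high while different-trait correlations stay low.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate_mtmm(n=500, seed=0):
    """Simulate two traits measured by two methods (illustrative only)."""
    rng = random.Random(seed)
    traits = {t: [rng.gauss(0, 1) for _ in range(n)] for t in ("T1", "T2")}
    methods = {m: [rng.gauss(0, 1) for _ in range(n)] for m in ("M1", "M2")}

    def measure(trait, method):
        # Observed score = trait + small method effect + random error.
        return [
            traits[trait][i] + 0.3 * methods[method][i] + 0.5 * rng.gauss(0, 1)
            for i in range(n)
        ]

    scores = {(t, m): measure(t, m) for t in traits for m in methods}
    return {
        # Same trait, different methods: convergent validity.
        "monotrait_heteromethod": pearson(scores["T1", "M1"], scores["T1", "M2"]),
        # Different traits, different methods: discriminant validity.
        "heterotrait_heteromethod": pearson(scores["T1", "M1"], scores["T2", "M2"]),
        # Different traits, same method: reflects shared method variance.
        "heterotrait_monomethod": pearson(scores["T1", "M1"], scores["T2", "M1"]),
    }


result = simulate_mtmm()
```

Raising the method weight (0.3 in this sketch) inflates the heterotrait-monomethod correlation, which is the signature of method variance that the design is meant to expose.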
Confirmatory factor analysis (CFA) has emerged as a
tool well-suited to the partitioning of MTMM data envi-
sioned by Campbell and Fiske (Jöreskog, 1974; Widaman,
1985). According to Marsh and Grayson (1995, p. 181),
1 The two-factor model for the males-females attitude domain failed to converge, leaving the hypothesis untested for this domain.
2 Cunningham, Nezlek et al. (2004) footnoted this limitation, but argued that, since a control IAT, birds vs. trees, did not load with five
ingroup-outgroup IATs on an implicit ethnocentrism factor, IAT method variance was not a strong driver of the two-factor solution. They
further suggested that if “systematic measurement error alone” was responsible for the implicit ethnocentrism factor, then the substantial
correlation between implicit and explicit ethnocentrism (r = .47) would be unlikely. We do not disagree, and we test this supposition
systematically.

Table 1. Results of one- and two-factor (oblique) attitude models fit to measures of 57 topics, a structural equation model reanalysis of Nosek (2005)

| Attitude object A | Attitude object B | N | One-factor χ² | One-factor εa | One-factor 90% CI εa | Two-factor χ² | Two-factor εa | Two-factor 90% CI εa | Factors r (t) |
|---|---|---|---|---|---|---|---|---|---|
| Whites | Asians | 279 | 44 | .28 | .21–.35 | 1.4 | .04 | .00–.17 | .01 (0.1) |
| Cold | Hot | 211 | 262 | .79 | .71–.87 | 0.5 | .00 | .00–.16 | .13 (2.1) |
| Skirts | Pants | 255 | 73 | .37 | .30–.45 | 0.1 | .00 | .00–.10 | .15 (3.1) |
| Future | Past | 235 | 76 | .40 | .32–.48 | 3 | .10 | .00–.23 | .26 (1.9) |
| Thin people | Fat people | 275 | 50 | .30 | .23–.37 | 0.4 | .00 | .00–.14 | .27 (2.0) |
| Approaching | Avoiding | 180 | 28 | .27 | .19–.36 | 0.3 | .00 | .00–.16 | .29 (1.3) |
| Simple | Difficult | 210 | 56 | .36 | .28–.44 | 0.0 | .00 | .00–.01 | .29 (3.4) |
| Public | Private | 196 | 40 | .31 | .23–.40 | 0.0 | .00 | .00–.05 | .30 (3.7) |
| Freedom | Security | 220 | 67 | .38 | .31–.47 | 1 | .00 | .00–.16 | .31 (2.5) |
| Short people | Tall people | 226 | 53 | .34 | .26–.42 | 3 | .09 | .00–.22 | .31 (2.9) |
| Family | Career | 238 | 42 | .29 | .22–.37 | 0.1 | .00 | .00–.12 | .33 (2.7) |
| Married | Single | 261 | 169 | .57 | .50–.64 | 0.5 | .00 | .00–.14 | .33 (3.4) |
| Rich people | Poor people | 222 | 89 | .44 | .37–.52 | 1 | .00 | .00–.17 | .34 (3.2) |
| Education | Defense | 214 | 50 | .33 | .26–.42 | 2 | .07 | .00–.21 | .36 (3.7) |
| Letters | Numbers | 228 | 75 | .40 | .33–.48 | 0.1 | .00 | .00–.12 | .37 (3.4) |
| Nerds | Jocks | 239 | 80 | .40 | .33–.48 | 0.0 | .00 | .00–.00 | .38 (4.0) |
| Young people | Old people | 250 | 42 | .28 | .21–.36 | 0.0 | .00 | .00–.00 | .40 (3.1) |
| Imprisonment | Capital punishment | 260 | 66 | .35 | .28–.43 | 0.2 | .00 | .00–.13 | .41 (3.7) |
| Yankees | Diamondbacks | 200 | 42 | .32 | .24–.40 | 1 | .04 | .00–.20 | .41 (4.5) |
| Flexible | Stable | 201 | 35 | .29 | .21–.37 | 0.0 | .00 | .00–.10 | .42 (4.4) |
| Meg Ryan | Julia Roberts | 255 | 41 | .28 | .21–.35 | 1 | .03 | .00–.17 | .43 (4.7) |
| Emotion | Reason | 175 | 84 | .49 | .40–.58 | 0.0 | .00 | .00–.09 | .44 (3.9) |
| Conforming | Rebellious | 208 | 132 | .56 | .48–.64 | 0.4 | .00 | .00–.16 | .44 (4.2) |
| Summer | Winter | 260 | 133 | .50 | .43–.58 | 0.4 | .00 | .00–.14 | .44 (5.5) |
| Leaders | Helpers | 265 | 60 | .33 | .26–.41 | 2 | .07 | .00–.19 | .45 (5.4) |
| Tom Cruise | Denzel Washington | 242 | 52 | .32 | .25–.40 | 0.3 | .00 | .00–.14 | .46 (4.8) |
| Management | Labor | 194 | 47 | .34 | .26–.43 | 0.1 | .00 | .00–.14 | .47 (3.9) |
| Exercising | Relaxing | 258 | 185 | .60 | .53–.67 | 0.3 | .00 | .00–.14 | .48 (5.8) |
| Jay Leno | David Letterman | 196 | 49 | .35 | .27–.43 | 0.0 | .00 | .00–.04 | .48 (4.5) |
| American places | Foreign places | 205 | 44 | .32 | .24–.40 | 0.0 | .00 | .00–.11 | .49 (5.3) |
| Microsoft | Apple | 205 | 77 | .43 | .35–.51 | 3 | .09 | .00–.23 | .49 (4.1) |
| California | New York | 253 | 66 | .36 | .29–.43 | 2 | .05 | .00–.18 | .49 (4.3) |
| Tea | Coffee | 250 | 90 | .42 | .35–.50 | 0.0 | .00 | .00–.00 | .50 (5.6) |
| Abstaining | Drinking | 249 | 100 | .44 | .37–.52 | 0.2 | .00 | .00–.13 | .50 (5.9) |
| Christian | Jewish | 253 | 77 | .39 | .31–.46 | 0.0 | .00 | .00–.00 | .50 (5.2) |
| Classical | Hip Hop | 243 | 106 | .47 | .39–.54 | 1 | .00 | .00–.16 | .51 (6.0) |
| Northerners | Southerners | 207 | 62 | .39 | .30–.47 | 1 | .00 | .00–.17 | .51 (5.8) |
| Jews | Muslims | 243 | 52 | .32 | .25–.40 | 2 | .07 | .00–.20 | .52 (5.5) |
| Books | Television | 233 | 77 | .40 | .33–.48 | 0.1 | .00 | .00–.11 | .54 (5.2) |
| Cats | Dogs | 258 | 77 | .38 | .31–.46 | 1 | .00 | .00–.15 | .54 (5.6) |

“the operationalizations of convergent validity, discrimi-
nant validity, and method effects in the CFA approach ap-
parently better reflect Campbell and Fiske’s (1959) original
intentions than do their own guidelines.” By comparing the
fits of nested structural models, the relative merits of alter-
native hypotheses concerning the structure of trait (atti-
tude) and method variance can be systematically tested
(Jöreskog & Sörbom, 1979; Loehlin, 2004; McDonald,
1985). We used this approach to distinguish method vari-
ance from trait variance for seven attitude object pairs.
And, within this framework, we tested whether a single
attitude construct or distinct implicit and explicit attitude
constructs provides a better fit for the data.
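
To make the idea of comparing fits concrete, the sketch below computes two common fit summaries from the kind of figures reported in Table 1. It is an illustration, not the authors' procedure: the function names are hypothetical, the RMSEA formula uses the N − 1 denominator (one common convention; some software uses N), and the chi-square difference test is a standard technique for nested models that the paper does not necessarily rely on. The inputs are the rounded Cold–Hot values from Table 1, so results only approximate the tabled entries.

```python
from math import erfc, sqrt


def rmsea(chi2, df, n):
    """Point estimate of the root-mean-square error of approximation.

    Uses the (n - 1) denominator; this is an assumed convention.
    """
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def chi2_diff_p_1df(chi2_restricted, chi2_full):
    """p value for a chi-square difference test with exactly 1 df.

    For 1 df the chi-square survival function reduces to
    erfc(sqrt(x / 2)), so no statistics library is needed.
    """
    delta = chi2_restricted - chi2_full
    if delta < 0:
        raise ValueError("the restricted model cannot fit better")
    return erfc(sqrt(delta / 2.0))


# Cold-Hot row of Table 1: N = 211, one-factor df = 2, two-factor df = 1.
one_factor_rmsea = rmsea(262, df=2, n=211)
two_factor_rmsea = rmsea(0.5, df=1, n=211)
p_value = chi2_diff_p_1df(262, 0.5)
```

Under these assumptions the one-factor estimate lands close to the tabled .79 and the two-factor estimate is 0, since its chi-square is below its degrees of freedom.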
This MTMM design, because it enables direct modeling
of method factors, increases confidence that a finding of
distinct implicit and explicit attitude factors indicates a sub-
stantive distinction and not one that is driven by confound-
ing influences in the measurement requirements. Our im-
plicit measurement instrument, the IAT, measures associa-
tions between concepts (e.g., thin–fat) and attributes (e.g.,
good–bad) by comparing the average response times for
sorting exemplars of those concepts and attributes in two
distinct response conditions: one in which sorting thin and
good exemplars requires a single response and sorting fat
and bad exemplars requires an alternate response, and a
second in which sorting fat and good exemplars requires a
single response and sorting thin and bad exemplars requires
an alternate response.
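
The comparison of response times across the two conditions can be sketched as a standardized latency difference. This is a deliberately simplified illustration, not the published IAT scoring procedure: it omits trial exclusions and error handling, and the function name, argument names, and latencies are hypothetical.

```python
from statistics import mean, stdev


def iat_effect(thin_good_ms, fat_good_ms):
    """Standardized latency difference between two sorting conditions.

    thin_good_ms: latencies (ms) when thin and good share a response.
    fat_good_ms:  latencies (ms) when fat and good share a response.
    Positive values mean faster sorting when thin is paired with good.
    """
    # Standard deviation across all trials from both conditions.
    pooled_sd = stdev(list(thin_good_ms) + list(fat_good_ms))
    return (mean(fat_good_ms) - mean(thin_good_ms)) / pooled_sd


# Hypothetical latencies for one participant.
effect = iat_effect([600, 650, 700], [800, 850, 900])
```

Dividing by the participant's own latency variability puts the effect on a scale that is less sensitive to overall response speed than a raw millisecond difference.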
This method is distinct from attitude self-report in which
a participant self-assesses attitudes by reporting the mag-
nitude of good or bad feelings on a response scale. Because
of their radically different measurement properties, it is
possible that the unique components of the better two-fac-
tor models observed in earlier research are the result of
sources of method variance such as cognitive fluency or
task-switching ability, two known influences on IAT effects
(Mierke & Klauer, 2003; see Nosek et al., 2006, for a re-
view).
Study Overview
Data are from four laboratory studies in which attitudes
toward seven different attitude-object pairs were measured:
flowers–insects, Democrats–Republicans, humanities–sci-
ence, straight–gay, Whites–Blacks, creationism–evolution,
and thin people–fat people. These domains were selected
because on their face they cover a broad range of attitudes.
This was important so that our goal of accounting for
common method variance would not be confounded with
common substantive variance. For example, Cunningham,
Nezlek, and Banaji’s (2004) examination of implicit and
explicit ethnocentrism specifically hypothesized that atti-
tudes for a variety of ingroup-outgroup domains (e.g.,
poor–rich, Blacks–Whites, Jews–Christians) would share a
Table 1 (continued).

| Attitude object A | Attitude object B | N | One-factor χ² | One-factor εa | One-factor 90% CI εa | Two-factor χ² | Two-factor εa | Two-factor 90% CI εa | Factors r (t) |
|---|---|---|---|---|---|---|---|---|---|
| European Americans | African Americans | 254 | 39 | .27 | .20–.35 | 0.0 | .00 | .00–.09 | .55 (4.7) |
| American | Canadian | 290 | 44 | .27 | .20–.34 | 0.5 | .00 | .00–.14 | .55 (4.7) |
| Teen pop | Jazz | 239 | 81 | .41 | .33–.48 | 1 | .00 | .00–.15 | .56 (6.1) |
| Vegetables | Meat | 234 | 76 | .40 | .32–.48 | 0.3 | .00 | .00–.14 | .56 (5.4) |
| Social programs | Tax reductions | 188 | 77 | .45 | .36–.53 | 5 | .15 | .04–.28 | .57 (5.6) |
| USA | Japan | 246 | 49 | .31 | .24–.39 | 0.0 | .00 | .00–.00 | .57 (5.2) |
| Gun rights | Gun control | 216 | 85 | .44 | .36–.52 | 1 | .04 | .00–.19 | .59 (6.8) |
| Straight people | Gay people | 175 | 35 | .31 | .22–.40 | 0.1 | .00 | .00–.14 | .60 (5.3) |
| Religion | Atheism | 211 | 160 | .61 | .53–.69 | 2 | .05 | .00–.20 | .61 (7.1) |
| Coke | Pepsi | 250 | 71 | .37 | .30–.45 | 1 | .00 | .00–.16 | .66 (7.1) |
| Liberals | Conservatives | 215 | 124 | .53 | .46–.61 | 1 | .00 | .00–.17 | .67 (6.9) |
| Creationism | Evolution | 231 | 99 | .46 | .39–.54 | 3 | .08 | .00–.21 | .68 (7.2) |
| Feminism | Traditional values | 226 | 75 | .40 | .33–.48 | 0.4 | .00 | .00–.15 | .71 (7.4) |
| Gore | Bush | 211 | 61 | .37 | .30–.46 | 0.0 | .00 | .00–.09 | .74 (7.2) |
| Democrats | Republicans | 195 | 42 | .32 | .24–.41 | 2 | .08 | .00–.23 | .75 (6.7) |
| Prochoice | Prolife | 242 | 52 | .32 | .25–.40 | 0.4 | .00 | .00–.10 | .79 (8.6) |
| Females | Males | 289 | 112 | .44 | .37–.51 | * | * | * | * |

Note. Attitude object A was implicitly preferred on average. All one-factor models have df = 2, and two-factor models have df = 1. εa = root-mean-square error of approximation. CI = confidence interval. t = r/se. Factor correlations in boldface have t 2.0. * model did not converge.
B.A. Nosek & F.L. Smyth: A Multitrait-Multimethod Validation of the Implicit Association Test 17
© 2007 Hogrefe & Huber Publishers Experimental Psychology 2007; Vol. 54(1):14–29

common ethnocentrism factor. They reported support for
this idea and also found that implicit and explicit ethnocen-
trism factors were related, but distinct.
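
The εₐ values in Table 1 can be approximately recovered from the reported χ², df, and N. The sketch below uses the common RMSEA point estimate, sqrt(max(χ² − df, 0) / (df · (N − 1))). That formula is not spelled out in the article text, so it and the function name `rmsea` should be read as assumptions. Because the table reports rounded χ² values, small discrepancies are to be expected.

```python
import math

def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of the root-mean-square error of approximation.

    Assumes the common sample formula with N - 1 in the denominator.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Table 1 row "European Americans / African Americans" (N = 254).
one_factor = rmsea(39, df=2, n=254)   # one-factor model, chi-square = 39
two_factor = rmsea(0.0, df=1, n=254)  # two-factor model, chi-square = 0.0
print(f"one-factor: {one_factor:.2f}, two-factor: {two_factor:.2f}")
```

For this row the sketch gives .27 for the one-factor model and .00 for the two-factor model, in line with the tabled values.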
Our goal, in one sense, was the opposite of Cunningham
et al.’s: they sought to demonstrate convergent validity
between conceptually related attitude domains, revealing an
ethnocentrism factor. We sought to demonstrate (a) discriminant
validity between attitude domains, hypothesizing that
the attitude domains would form distinct factors, and (b) convergent
validity across measurement types (IAT and self-report),
hypothesizing that the implicit and explicit attitude
constructs would be related but would retain distinctiveness
not accounted for by method factors. This simultaneous
examination of discriminant and convergent validity is the
core value of the MTMM approach.
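
As a rough illustration of the MTMM logic, the sketch below applies the classic Campbell and Fiske comparison to a small implicit-by-explicit correlation matrix: the same-attitude (convergent) correlation should exceed the different-attitude correlations in its row and column. This is not the latent-variable modeling used in the present study. The function name and the correlations are hypothetical and were invented for illustration.

```python
def validity_diagonal_holds(r, traits):
    """Check the convergent value of each trait against its rivals.

    r[(a, b)] is the correlation between the implicit measure of trait a
    and the explicit measure of trait b.
    """
    results = {}
    for t in traits:
        convergent = r[(t, t)]
        # Different-trait, different-method values in the same row or column.
        rivals = [r[(t, u)] for u in traits if u != t]
        rivals += [r[(u, t)] for u in traits if u != t]
        results[t] = all(convergent > abs(v) for v in rivals)
    return results

# Hypothetical correlations, not taken from the study.
traits = ["race", "politics"]
correlations = {
    ("race", "race"): 0.30, ("race", "politics"): 0.05,
    ("politics", "race"): 0.02, ("politics", "politics"): 0.70,
}
print(validity_diagonal_holds(correlations, traits))
```

A trait maps to `True` only when its convergent correlation is larger in magnitude than every rival value, which is the pattern discriminant validity requires.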
Campbell and Fiske (1959) stressed that “One cannot
define [a construct] without implying distinctions, and the
verification of these distinctions is an important part of the
validational processes” (p. 84). Self-report and IAT mea-
sures were obtained for each attitude object pair (e.g.,
Whites–Blacks) and participants were measured on multi-
ple pairs. We fitted a sequence of nested covariance struc-
ture models, beginning with one in which method variance
was partitioned into latent factors, and we predicted that a
model specifying distinct implicit and explicit attitude fac-
tors would provide the best data fit, whether or not the par-
titioning of method variance proved important.
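
Nested covariance structure models are commonly compared with a chi-square difference test. The sketch below illustrates that comparison using the df = 2 and df = 1 values from the Table 1 note. Whether the authors used exactly this test for the present models is an assumption, and the function name is hypothetical. Only a 1-df difference is handled, since that case has a closed form in the standard library. Models differing by more degrees of freedom would need a general chi-square distribution function.

```python
import math

def chi_square_difference_p(chi_restricted, df_restricted, chi_full, df_full):
    """Return (delta chi-square, p-value) for two nested models.

    Only the 1-df case is supported: for one degree of freedom the upper
    tail probability of chi-square equals erfc(sqrt(x / 2)).
    """
    delta_chi = chi_restricted - chi_full
    delta_df = df_restricted - df_full
    if delta_df != 1:
        raise ValueError("this sketch only supports a 1-df difference")
    if delta_chi < 0:
        raise ValueError("the restricted model cannot fit better than the full model")
    return delta_chi, math.erfc(math.sqrt(delta_chi / 2.0))

# One-factor (chi-square = 39, df = 2) versus two-factor (0.0, df = 1),
# taken from the first row of Table 1.
delta, p = chi_square_difference_p(39, 2, 0.0, 1)
print(f"delta chi-square = {delta:.1f}, p = {p:.2e}")
```

A small p-value indicates that the more restricted one-factor model fits reliably worse than the two-factor model.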
Method
Participants
A total of 287 Yale University undergraduates from four
data collections in 2000 and 2001 (n = 81, 86, 60, 60)
comprised the study sample.
Materials
Implicit Association Test (IAT)
One of the four samples received IATs for all seven object
pairs, while the others received subsets of at least four
pairs, including the flowers–insects and Democrats–Re-
publicans pairs (patterns of measured variables are listed
in the Appendix). All IAT category headings and exem-
plars are listed in the Appendix. IAT D scores were com-
puted based on the scoring algorithm suggested by Green-
wald, Nosek, and Banaji (2003), that is, by taking the dif-
ference in mean response latency between the two critical
block conditions and scaling it by the participant’s aver-
age latency standard deviation for both blocks. Most of
the IATs administered across the four samples consisted
of 56 trials in each of the critical double-discrimination
blocks.³ IAT scores were removed from analysis if more
than 10% of trials were unreasonably fast (< 300 ms) or
if the error rate for any block of trials was greater than
39%. This cleaning resulted in elimination of less than 1%
of IAT scores (14 of 1475).
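
The scoring and screening rules just described can be sketched as follows. This is a simplified illustration, not the full Greenwald et al. (2003) algorithm. It assumes the scaling standard deviation is computed over the trials of both critical blocks taken together, although the description could also be read as the mean of the two per-block standard deviations. The function names, constants, and latencies are hypothetical.

```python
from statistics import mean, stdev

FAST_MS = 300           # latencies below this are treated as unreasonably fast
MAX_FAST_SHARE = 0.10   # drop the score if more than 10% of trials are that fast
MAX_ERROR_RATE = 0.39   # drop the score if any block's error rate exceeds 39%

def d_score(first_block_ms, second_block_ms):
    """Mean latency difference (second - first) divided by the SD of all trials.

    Assumption: the scaling SD is computed over both blocks combined.
    """
    pooled_sd = stdev(first_block_ms + second_block_ms)
    return (mean(second_block_ms) - mean(first_block_ms)) / pooled_sd

def should_exclude(latencies_ms, block_error_rates):
    """Apply the two screening rules described in the text."""
    fast_share = sum(ms < FAST_MS for ms in latencies_ms) / len(latencies_ms)
    too_many_errors = any(rate > MAX_ERROR_RATE for rate in block_error_rates)
    return fast_share > MAX_FAST_SHARE or too_many_errors

# Hypothetical latencies in milliseconds, invented for illustration only.
compatible = [600, 650, 700, 750]
incompatible = [800, 850, 900, 950]
print(round(d_score(compatible, incompatible), 2))
print(should_exclude(compatible + incompatible, block_error_rates=[0.05, 0.10]))
```

Swapping the two arguments of `d_score` flips the sign, so the direction of the score depends on which block condition is treated as the reference.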
In all data collections, participants first completed a
flowers–insects IAT with the order of blocks conforming
to that suggested by Greenwald et al. (2003), i.e., (1) a
single-discrimination block of trials for bad–good exem-
plars, followed by (2) a single-discrimination block for
flower–insect exemplars, then (3) a double-discrimination
block of (counterbalanced) either flowers+good/in-
sects+bad or flowers+bad/insects+good, followed by (4)
another single-discrimination block to practice the
switched flower–insect key assignments for the (5) final,
reversed, double-discrimination block. For the remaining
IATs, response blocks for all tasks were randomized and
single-discrimination practice blocks were eliminated.
For example, after the flowers–insects IAT, a participant
could receive the race attitude compatible block of trials
(i.e., compatible with the dominant prejudice) in which
Black faces and bad words are to be categorized with one
response key and White faces and good words are catego-
rized with another key, without any opportunity to prac-
tice the simple, single-discrimination task of sorting Black
from White faces; then the participant could receive the
incompatible block for the gay–straight IAT (gay+good
with one key, straight+bad with the other); then the Dem-
ocrat+bad/Republican+good block; then the incompatible
block for race attitude, etc. This atypical approach pro-
vides a tough test for identifying attitude factors because
the component performance tasks are intermixed with per-
formance tasks for the other attitude domains. In this way,
we allow substantial opportunity for method factors to in-
fluence IAT performance and challenge the hypothesis
that distinguishable attitude factors can be identified de-
spite intermixed performance blocks.
To facilitate latent variable analyses, four IAT D scores
were calculated for each attitude domain based on the dif-
ference between the means of each fourth of the trials, in
turn, across the critical blocks. That is, for IATs with 56
critical trials in each double-discrimination block, the
mean latency for trials 1–14 in one double-discrimination
condition was compared with that of trials 1–14 in the
other, and so on for sets of trials 15–28, 29–42, and
43–56.⁴
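
The four-indicator parceling described above can be sketched as follows. Each critical block is cut into quarters by trial position, and one score is computed per matched pair of quarters. The passage does not say which standard deviation scales each parcel, so dividing by the SD of the trials in the two matched quarters is an assumption. The function names and latencies are hypothetical.

```python
from statistics import mean, stdev

def quarter_bounds(n_trials, n_parts=4):
    """Return (start, stop) index pairs that cut n_trials into equal parts."""
    if n_trials % n_parts != 0:
        raise ValueError("n_trials must divide evenly into n_parts")
    size = n_trials // n_parts
    return [(i * size, (i + 1) * size) for i in range(n_parts)]

def parcel_d_scores(block_a_ms, block_b_ms, n_parts=4):
    """One score per matched part: (mean of b - mean of a) / SD of both parts."""
    if len(block_a_ms) != len(block_b_ms):
        raise ValueError("both critical blocks need the same number of trials")
    scores = []
    for start, stop in quarter_bounds(len(block_a_ms), n_parts):
        part_a = block_a_ms[start:stop]
        part_b = block_b_ms[start:stop]
        # Assumption: scale by the SD of the trials in this pair of quarters.
        scores.append((mean(part_b) - mean(part_a)) / stdev(part_a + part_b))
    return scores

# Hypothetical 56-trial blocks; index ranges 0-14, 14-28, 28-42, 42-56
# correspond to trials 1-14, 15-28, 29-42, and 43-56.
block_a = [600, 700] * 28
block_b = [800, 900] * 28
print(quarter_bounds(56))
print([round(d, 2) for d in parcel_d_scores(block_a, block_b)])
```

Calling `parcel_d_scores` with `n_parts=2` gives the first-half versus second-half split mentioned in footnote 4.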
3 Exceptions to the 56-trial critical block format were the flowers–insects IAT for two of the samples (40 trials in critical blocks) and the
creation–evolution and thin–fat IATs for 13 participants in one of the samples, also 40 critical trials.
4 We also conducted the analysis with just two IAT indicators, first half vs. second half of trials from each block, and found comparable results
in our comparative structural modeling. Using four indicators was preferable, however, in terms of attaining stable estimates in the more
complex models.


References

Journal ArticleDOI
Abstract: This article is concerned with measures of fit of a model. Two types of error involved in fitting a model are considered. The first is error of approximation which involves the fit of the model, wi...

23,630 citations


Journal ArticleDOI
TL;DR: This transmutability of the validation matrix argues for the comparisons within the heteromethod block as the most generally relevant validation data, and illustrates the potential interchangeability of trait and method components.
Abstract: Content Memory (Learning Ability) As Comprehension 82 Vocabulary Cs .30 ( ) .23 .31 ( ) .31 .31 .35 ( ) .29 .48 .35 .38 ( ) .30 .40 .47 .58 .48 ( ) As judged against these latter values, comprehension (.48) and vocabulary (.47), but not memory (.31), show some specific validity. This transmutability of the validation matrix argues for the comparisons within the heteromethod block as the most generally relevant validation data, and illustrates the potential interchangeability of trait and method components. Some of the correlations in Chi's (1937) prodigious study of halo effect in ratings are appropriate to a multitrait-multimethod matrix in which each rater might be regarded as representing a different method. While the published report does not make these available in detail because it employs averaged values, it is apparent from a comparison of his Tables IV and VIII that the ratings generally failed to meet the requirement that ratings of the same trait by different raters should correlate higher than ratings of different traits by the same rater. Validity is shown to the extent that of the correlations in the heteromethod block, those in the validity diagonal are higher than the average heteromethod-heterotrait values. A conspicuously unsuccessful multitrait-multimethod matrix is provided by Campbell (1953, 1956) for rating of the leadership behavior of officers by themselves and by their subordinates. Only one of 11 variables (Recognition Behavior) met the requirement of providing a validity diagonal value higher than any of the heterotrait-heteromethod values, that validity being .29. For none of the variables were the validities higher than heterotrait-monomethod values. A study of attitudes toward authority and nonauthority figures by Burwen and Campbell (1957) contains a complex multitrait-multimethod matrix, one symmetrical excerpt from which is shown in Table 6. Method variance was strong for most of the procedures in this study. 
Where validity was found, it was primarily at the level of validity diagonal values higher than heterotrait-heteromethod values. As illustrated in Table 6, attitude toward father showed this kind of validity, as did attitude toward peers to a lesser degree. Attitude toward boss showed no validity. There was no evidence of a generalized attitude toward authority which would include father and boss, although such values as the VALIDATION BY THE MULTITRAIT-MULTIMETHOD MATRIX

14,992 citations


Journal ArticleDOI
TL;DR: The present interpretation of construct validity is not "official" and deals with some areas where the Committee would probably not be unanimous, but the present writers are solely responsible for this attempt to explain the concept and elaborate its implications.
Abstract: Validation of psychological tests has not yet been adequately conceptualized, as the APA Committee on Psychological Tests learned when it undertook (1950-54) to specify what qualities should be investigated before a test is published. In order to make coherent recommendations the Committee found it necessary to distinguish four types of validity, established by different types of research and requiring different interpretation. The chief innovation in the Committee's report was the term construct validity.[2] This idea was first formulated by a subcommittee (Meehl and R. C. Challman) studying how proposed recommendations would apply to projective techniques, and later modified and clarified by the entire Committee (Bordin, Challman, Conrad, Humphreys, Super, and the present writers). The statements agreed upon by the Committee (and by committees of two other associations) were published in the Technical Recommendations (59). The present interpretation of construct validity is not "official" and deals with some areas where the Committee would probably not be unanimous. The present writers are solely responsible for this attempt to explain the concept and elaborate its implications.

9,262 citations


Journal ArticleDOI
TL;DR: An implicit association test (IAT) measures differential association of 2 target concepts with an attribute when instructions oblige highly associated categories to share a response key, and performance is faster than when less associated categories share a key.
Abstract: An implicit association test (IAT) measures differential association of 2 target concepts with an attribute. The 2 concepts appear in a 2-choice task (e.g., flower vs. insect names), and the attribute in a 2nd task (e.g., pleasant vs. unpleasant words for an evaluation attribute). When instructions oblige highly associated categories (e.g., flower + pleasant) to share a response key, performance is faster than when less associated categories (e.g., insect + pleasant) share a key. This performance difference implicitly measures differential association of the 2 concepts with the attribute. In 3 experiments, the IAT was sensitive to (a) near-universal evaluative differences (e.g., flower vs. insect), (b) expected individual differences in evaluative associations (Japanese + pleasant vs. Korean + pleasant for Japanese vs. Korean subjects), and (c) consciously disavowed evaluative differences (Black + pleasant vs. White + pleasant for self-described unprejudiced White subjects).

9,091 citations


Journal ArticleDOI
Abstract: A framework for hypothesis testing and power analysis in the assessment of fit of covariance structure models is presented. We emphasize the value of confidence intervals for fit indices, and we stress the relationship of confidence intervals to a framework for hypothesis testing. The approach allows for testing null hypotheses of not-good fit, reversing the role of the null hypothesis in conventional tests of model fit, so that a significant result provides strong support for good fit. The approach also allows for direct estimation of power, where effect size is defined in terms of a null and alternative value of the root-mean-square error of approximation fit index proposed by J. H. Steiger and J. M. Lind (1980). It is also feasible to determine minimum sample size required to achieve a given level of power for any test of fit in this framework. Computer programs and examples are provided for power analyses and calculation of minimum sample sizes.

7,456 citations

