
Convergent and incremental predictive validity of clinician, self-report, and structured interview diagnoses for personality disorders over 5 years.



Wesleyan University
From the SelectedWorks of Charles A. Sanislow, Ph.D.

Convergent and Incremental Predictive Validity of Clinician, Self-Report,
and Structured Interview Diagnoses for Personality Disorders Over 5 Years
Douglas B. Samuel
Yale School of Medicine
Charles A. Sanislow
Wesleyan University
Christopher J. Hopwood
Michigan State University
M. Tracie Shea
Veterans Affairs Medical Center, Providence, Rhode Island, and
Alpert Medical School of Brown University
Andrew E. Skodol
The Sunbelt Collaborative, Tucson, Arizona, and University of
Arizona College of Medicine
Leslie C. Morey
Texas A&M University
Emily B. Ansell
Yale School of Medicine
John C. Markowitz
New York State Psychiatric Institute, New York, New York,
and Columbia University College of Physicians and Surgeons
Mary C. Zanarini
Harvard Medical School
Carlos M. Grilo
Yale School of Medicine
Objective: Research has demonstrated poor agreement between clinician-assigned personality disorder (PD)
diagnoses and those generated by self-report questionnaires and semistructured diagnostic interviews. No
research has compared prospectively the predictive validity of these methods. We investigated the conver-
gence of these 3 diagnostic methods and tested their relative and incremental validity in predicting indepen-
dent, multimethod assessments of psychosocial functioning performed prospectively over 5 years. Method:
Participants were 320 patients in the Collaborative Longitudinal Personality Disorders Study diagnosed with
PDs by therapist, self-report, and semistructured interview at baseline. We examined the relative incremental
validity of therapists’ naturalistic ratings relative to these other diagnostic methods for predicting psychosocial
functioning at 5-year follow-up. Results: Hierarchical linear regression analyses revealed that both the
self-report questionnaire and semistructured interview PD diagnoses had significant incremental predictive
validity over the PD diagnoses assigned by a treating clinician. Although, in some cases, the clinicians’ ratings
for individual PDs did have validity for predicting subsequent functioning, they did not generally provide
incremental prediction beyond the other methods. These findings remained robust in a series of analyses
restricted to a subsample of therapist ratings based on clinical contact of 1 year or greater. Conclusions: These
results from a large clinical sample echo previous research documenting limited agreement between clinicians’
naturalistic PD diagnoses and those from self-report and semistructured interview methods. They extend prior work
by providing the first evidence about the relative predictive validity of these different methods. Our findings
challenge the validity of naturalistic PD diagnoses and suggest the use of structured diagnostic instruments.
Keywords: personality disorder, semistructured interview, self-report, diagnostic agreement, clinician
This article was published Online First May 6, 2013.
Douglas B. Samuel, Department of Psychiatry, Yale School of Medi-
cine; Charles A. Sanislow, Department of Psychology, Wesleyan Univer-
sity; Christopher J. Hopwood, Department of Psychology, Michigan State
University; M. Tracie Shea, Veterans Affairs Medical Center, Providence,
Rhode Island, and Department of Psychiatry, Alpert Medical School of
Brown University; Andrew E. Skodol, The Sunbelt Collaborative, Tucson,
Arizona, and Department of Psychiatry, University of Arizona College of
Medicine; Leslie C. Morey, Department of Psychology, Texas A&M
University; Emily B. Ansell, Department of Psychiatry, Yale School of
Medicine; John C. Markowitz, New York State Psychiatric Institute, New
York, New York, and Department of Psychiatry, Columbia University
College of Physicians and Surgeons; Mary C. Zanarini, Department of
Psychiatry, Harvard Medical School; Carlos M. Grilo, Department of
Psychiatry, Yale School of Medicine.
Writing of this article was supported by the Office of Academic Affiliations,
Advanced Fellowship Program in Mental Illness Research and Treatment,
Department of Veterans Affairs. Research was supported by National Institute
of Mental Health Grants MH 50837, 50838, 50839, 50840, 50850, and
MH073708, awarded to Charles A. Sanislow. This publication has been
reviewed and approved by the Publications Committee of the Collaborative
Longitudinal Personality Disorders Study.
Correspondence concerning this article should be addressed to Doug-
las B. Samuel, who is now at Department of Psychological Sciences,
Purdue University, West Lafayette, IN 47907. E-mail: dbsamuel@
purdue.edu
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Journal of Consulting and Clinical Psychology © 2013 American Psychological Association
2013, Vol. 81, No. 4, 650–659 0022-006X/13/$12.00 DOI: 10.1037/a0032813

The complexity of diagnosing personality disorders (PDs) has
been a long-standing issue in psychiatry (Westen, 1997; Zimmer-
man, 1994). Several methods exist for diagnosing PDs, including
semistructured diagnostic interviews, self-report questionnaires,
clinician-rated Q-sort instruments, as well as unstructured diagno-
ses made by treating clinicians (McDermut & Zimmerman, 2005).
Although research has relied primarily on semistructured diagnos-
tic interviews and self-report questionnaires, therapists typically
base PD diagnoses on their unstructured interviews and clinical
contacts with patients (Perry, 1992; Westen, 1997; Zimmerman,
2011). Despite debate regarding the relative merits of different
diagnostic methods (Westen & Muderrisoglu, 2003; Zimmerman
& Mattia, 1999), no study has yet compared the predictive validity
of clinicians’ naturalistic PD diagnoses with those from self-report
questionnaires or semistructured interviews (Zimmerman, 2011).
Existing research has repeatedly indicated that clinician-
generated PD diagnoses do not agree well with those from self-
report measures (Davidson, Obonsawin, Seils, & Patience, 2003;
Hyler, Rieder, Williams, & Spitzer, 1989; Morey, Blashfield,
Webb, & Jewell, 1988; Rossi, Van den Brande, Tobac, Sloore, &
Hauben, 2003) or semistructured interviews (Dreessen & Arntz,
1999; Fridell & Hesse, 2006; Samuel & Widiger, 2010). This poor
agreement is not unique to PDs, and has been noted for various
psychiatric diagnoses (Rettew, Lynch, Achenbach, Dumenci, &
Ivanova, 2009). More importantly, fundamental questions regard-
ing the incremental predictive validity of diagnoses assigned by
clinicians relative to different methods have not been answered.
Research has compared the validity of self- and informant reports
of PD (Klein, 2003; Oltmanns & Turkheimer, 2009), but there is
a critical need for analogous work comparing clinical diagnoses
with other methods. Such work is crucial for determining whether
and how different sources of information might be usefully com-
bined. Currently, the optimal approach for how researchers and
clinicians should most validly identify PDs remains unclear.
Although research on clinical judgment offers reasons for skep-
ticism about the validity of clinician ratings in general (Grove,
Zald, Lebow, Snitz, & Nelson, 2000; Meehl, 1954), there are
compelling reasons to believe that their PD diagnoses might be
useful and valid. Therapists’ diagnostic impressions rely on exten-
sive training and take into consideration information about the
client’s life gleaned across extended periods of clinical interac-
tions. Pilkonis, Heape, Ruddy, and Serrao (1991) noted “clinical
judgment, of course, has its own limitations, but it would seem
unwise to develop assessment tools that are unrelated to thoughtful
clinical experience” (p. 46). In addition, Westen (1997) suggested
that clinicians take a holistic approach to diagnosis, situating them
well to describe complex personality pathology. Others contend
that clinicians’ PD ratings are superior to self-report because
patients’ ability to accurately assess their own personality might be
limited by mood states, lack of insight, or presentation biases
(Ganellen, 2007; Huprich, Bornstein, & Schmitt, 2011). Finally,
Morey et al. (1988) suggested that semistructured diagnostic in-
terviews also might have limitations because “a relatively brief
interview situation does not seem particularly well suited to the
task of assessing long-term personological characteristics” (p. 47).
Despite these concerns, there are reasons to believe that patient-
reported information from semistructured interviews and/or self-
report questionnaires can usefully contribute to PD diagnoses
(Zimmerman & Mattia, 1999). Thus, although clinicians might not
routinely ask direct questions about PD symptoms or use semi-
structured interviews and self-report questionnaires, they incorpo-
rate such information to inform their diagnoses if it is available.
Importantly, because semistructured interviews explicitly assess
the longitudinal presence of PD symptoms, they might have
greater ability to disentangle episodic state artifacts from more
durable trait-based PD syndromes (Loranger et al., 1991; Morey et
al., 2010).
As treating therapists almost always play the primary role in
diagnosing PDs in clinical settings, understanding the relative
validity of their impressions carries particular importance. Com-
paring clinicians’ diagnoses with those from self-report question-
naires or semistructured diagnostic interviews would be useful for
prospectively predicting clinically relevant outcomes that extend
beyond specific diagnostic features, such as psychosocial function-
ing. We conducted such a comparison using data from the Col-
laborative Longitudinal Personality Disorders Study (CLPS;
Gunderson et al., 2000). The CLPS is well suited for this investi-
gation as the baseline assessment included diagnoses from treating
clinicians collected using a modified version of the Personality
Assessment Form (PAF; Shea, Glass, Pilkonis, Watkins, & Do-
cherty, 1987; Shea et al., 1990). This allowed them to record the
degree to which patients evinced the prototypical characteristics of
each of four study PDs (viz., schizotypal, borderline, avoidant,
obsessive-compulsive).
The PAF provides a relevant, externally valid method for con-
ducting such an analysis as it closely approximates the way clini-
cians make PD diagnoses in clinical practice. The PAF’s format is
also timely, as it uses a prototype-matching approach that mirrors
the original proposal for diagnosing PDs in the Diagnostic and
Statistical Manual of Mental Disorders, fifth edition (DSM–5). In
fact, the PAF and research that had used it were cited as primary
support for the Work Group’s proposal (Skodol, Bender, et al.,
2011; Skodol, Clark, et al., 2011). This proposal subsequently was
criticized by a number of PD scholars (Pilkonis, Hallquist, Morse,
& Stepp, 2011; Widiger, 2011; Zimmerman, 2011) and aban-
doned. Nonetheless, other prominent researchers and clinicians
have strongly argued that the prototype-matching approach should
become the standard method of PD diagnosis (Shedler et al.,
2010).
The benefit and goal of using the PAF for collecting clinicians’
impressions is to maximize external validity (i.e., most closely
match the type of PD diagnoses typically made in clinical prac-
tice), not to provide equivalence with other methods (Westen &
Weinberger, 2004). Westen and colleagues have demonstrated that
when clinicians administer a systematic clinical interview (i.e., the
Clinical Diagnostic Interview, CDI; Westen, 2004) and record
their impressions using the Shedler-Westen Assessment Procedure
(SWAP; Westen & Shedler, 1999), their PD diagnoses become
more reliable across independent raters (Westen & Muderrisoglu,
2006; Westen, Shedler, Bradley, & DeFife, 2012). Although in-
formative, such a diagnostic strategy (i.e., a 2-hr administration of
the CDI followed by the sorting of 200 SWAP items) is not
standard practice in naturalistic settings. Perhaps recognizing this,
Westen and his colleagues have also been the primary proponents
of the prototype-matching approach (Shedler & Westen, 2004;
Westen, DeFife, Bradley, & Hilsenroth, 2010; Westen, Shedler, &
Bradley, 2006) that helped inform the original DSM–5 proposal
(Skodol, Bender, et al., 2011; Skodol, Clark, et al., 2011). The
PAF’s prototype-matching format makes it a reasonable choice for
collecting treating clinicians’ PD diagnoses in this study.
We compared the incremental validity of clinicians’ diagnoses
of these four PDs assigned via the PAF with those generated by a
semistructured interview and self-report questionnaire for predict-
ing psychosocial functioning assessed prospectively over 5 years.
Given the published support for the validity of the prototype-
matching approach (Westen et al., 2012), we hypothesized that
clinicians’ PAF ratings would account for variance in functioning
beyond that captured by self-report questionnaires or semistruc-
tured interviews. Nonetheless, we also recognized that all previous
findings concerning the relative validity of alternative diagnostic
methods have suggested that the methods are mutually informative
(Hopwood et al., 2008; Klein, 2003). Thus, we also hypothesized
that the self-report and semistructured interview methods would
have unique strengths and demonstrate incremental predictive va-
lidity beyond the clinician-assigned diagnoses. Finally, to account
for inadequate familiarity with patients that might disadvantage the
clinicians’ PAF ratings, we conducted additional analyses using
only the subset of cases whom clinicians had treated for at least 1
year prior to providing the diagnoses. This choice of a 1-year
interval of treatment ensured adequate familiarity with a patient’s
personality pathology.
Method
Study participants were drawn from the 668 participants re-
cruited from the multiple CLPS clinical sites. Appropriate Institu-
tional Review Boards approved the study. Participants who pro-
vided written, informed consent underwent diagnostic interviews
and completed self-report questionnaires as part of a standardized
battery. Detailed recruitment and diagnostic procedures have been
published elsewhere (Gunderson et al., 2000). Briefly, participants
were assigned to one of four PD groups (borderline, avoidant,
schizotypal, and obsessive-compulsive [OC]), or to major depres-
sive disorder (MDD) without any PD. These PD diagnostic as-
signments were based on the Diagnostic Interview for DSM–IV
Personality Disorders (DIPD-IV; Zanarini, Frankenburg, Sickel, &
Yong, 1996), reliably administered by trained research personnel.
For inclusion, these diagnoses required confirmation by a self-
report questionnaire (e.g., Schedule for Nonadaptive and Adaptive
Personality–2; SNAP-2; Clark, Simms, Wu, & Casillas, in press)
and/or the treating clinician’s PAF ratings. Furthermore, because
inclusion demanded either a self-report or clinician-assigned diag-
nosis, in a subset of participants the semistructured interview-
assigned diagnosis disagreed with the clinicians’ ratings and was
instead confirmed by the self-report questionnaire.
Participants used for the current analyses were 320 individuals
from the CLPS with available PAF ratings completed by a treating
clinician at baseline. Independent samples t tests and chi-square
tests demonstrated no significant differences between participants
with PAF scores and the larger CLPS sample in gender, age, or
ethnicity. Independent samples t tests revealed that this subsample
differed in diagnosis and functioning, perhaps reflecting that par-
ticipants with PAF ratings were in ongoing psychiatric or psycho-
logical treatment. Participants with PAF ratings met more criteria
for borderline PD according to the DIPD-IV at baseline (M = 4.4,
SD = 2.7) than did those without available PAF ratings (M = 2.6,
SD = 2.5), t(729) = 9.3, p < .01. Differences for the other three
studied PDs on the DIPD-IV were nonsignificant. Baseline
SNAP-2 PD scores were significantly greater for the studied group
for all four PDs. Participants with available PAF ratings did not
differ from those without in terms of psychosocial functioning
measured by the Social Adjustment Scale, Self-Report (SAS-SR),
t(700) = 1.1, p = .28, but did differ significantly according to the
Longitudinal Interval Follow-Up Evaluation (LIFE), t(727) = 5.3,
p < .01.
Average age of the participants at baseline was 32.9 years
(SD = 7.9, range = 18–45); 199 (62%) were women; the ethnic
breakdown was 237 (74%) Caucasian, 35 (11%) African Ameri-
can, 39 (12%) Hispanic, six (2%) Asian American, and three (1%)
“other.” Of the participants, 73 (23%) were assigned to the
avoidant, 128 (40%) to the borderline, 54 (17%) to the obsessive-
compulsive, 37 (12%) to the schizotypal, and 28 (9%) to the MDD
without PD groups. Clinicians reported clinical contact with the
patients ranging from 0 to 884 weeks, with a mean of 53.7 (SD =
89.7) at the time of providing the PAF ratings. Their confidence in
their diagnostic ratings evinced a mean of 2.26 (on a 1–5 metric,
where 1 = high and 5 = low; SD = 1.12).
Personality Disorder Measures
DIPD-IV (Zanarini et al., 1996). The DIPD-IV is a semi-
structured diagnostic interview for assessing the Diagnostic and
Statistical Manual of Mental Disorders, fourth edition (DSM–IV;
American Psychiatric Association, 1994) PDs. Each criterion is
assessed with one or more questions rated on a 3-point scale (0 =
not present; 1 = present but of uncertain clinical significance; 2 =
present and clinically significant). The DIPD-IV requires that
criteria be pervasive, present for at least 2 years, and characteristic
of the person for most of his or her adult life. In the CLPS sample,
interrater reliability (based on 84 pairs of raters) kappa coefficients
ranged from .58 to 1.00 (Zanarini et al., 2000). The current report
considered only the DIPD-IV scores for the four PDs studied in
CLPS.
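As an illustration of the chance-corrected agreement statistic (kappa) reported above, the following minimal Python sketch computes Cohen's kappa for two raters. The function name and the ratings are hypothetical and are not CLPS data; the sketch shows only the general statistic, not the study's own reliability procedure.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of cases on which the raters match.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical present/absent diagnoses (1 = present, 0 = absent) for 10 cases.
rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
rater_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

In this made-up example the raters agree on 8 of 10 cases, but half of that agreement is expected by chance given the marginals, so kappa is lower than the raw agreement rate.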
SNAP-2 (Clark et al., in press). Comprising 390 true/false
statements, the SNAP-2 provides a self-report assessment of 12
pathological personality traits derived from an iterative factor
analytic process. The SNAP-2 includes scales assessing the
DSM–IV PDs, ranging in length from 19 (avoidant) to 34 (antiso-
cial) items. Although most DSM–IV PD scale items are also scored
for one of the trait scales, a number of items were added to
explicitly tap additional content. The PD scales can be scored
dimensionally or by individual diagnostic criteria to yield categor-
ical diagnoses. In the full CLPS sample, the SNAP-2 PD scale
internal consistencies ranged from .69 (OCPD) to .88 (avoidant),
with an overall median of .83. The SNAP-2 PD scores correlate
consistently with those from other self-report PD inventories (Wi-
diger & Boyd, 2009) and structured PD diagnostic interviews
(Samuel et al., 2011). The current report only included the SNAP-2
scores for the four CLPS PDs.
PAF (Shea et al., 1987, 1990). The PAF was adapted for the
DSM–IV PDs from a measure developed for the National Institute
of Mental Health Treatment of Depression Collaborative Research
Program (Elkin, Parloff, Hadley, & Autry, 1985). Its purpose was
to provide a standardized method to quantify clinicians’ routine
clinical diagnoses. Thus, it was designed to maximize external
validity and mirror the type of PD ratings and diagnoses made in
clinical practice. The PAF used in CLPS contained three- to four-
sentence prototypical descriptions for each of the four PDs studied
(schizotypal, borderline, avoidant, and obsessive-compulsive) as
well as several “cues” to aid clinicians in rating a patient’s match
to the prototypes. The instrument is available by request to the first
author. Clinicians rated all four of the studied PDs on a 1–6 scale,
where 1 indicated not at all and 6 indicated that the patient
matched the prototype to an extreme degree. Consistent with
previous research (Shea et al., 1990), a score ≥ 4 indicated a
categorical diagnosis. Clinicians could also indicate no informa-
tion or insufficient data for a particular PD, although they used this
only rarely (24 times across the four PDs in the sample of 320).
Those values were recoded as missing for the current analyses. The
mean PAF ratings were 1.95 (SD = 1.20) for schizotypal PD, 2.94
(SD = 1.55) for borderline PD, 2.49 (SD = 1.34) for avoidant PD,
and 2.08 (SD = 1.33) for OCPD.
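The PAF coding rules described in this paragraph can be sketched in a few lines of Python. This is a hedged illustration, not the study's scoring code: the function name, the use of None for a "no information" or "insufficient data" response, and the example ratings are all hypothetical, and the cutoff is read here as a rating of 4 or higher.

```python
import math

# Assumption: the categorical cutoff is a rating of 4 or higher on the 1-6 scale.
PAF_DIAGNOSIS_CUTOFF = 4


def code_paf_rating(raw):
    """Return (dimensional_score, diagnosis) for one PAF prototype rating.

    `raw` is an integer from 1 to 6, or None when the clinician indicated
    no information / insufficient data (a hypothetical encoding).
    """
    if raw is None:
        # Recode as missing on both the dimensional and categorical variables.
        return (math.nan, None)
    if not 1 <= raw <= 6:
        raise ValueError("PAF ratings must fall between 1 and 6")
    return (float(raw), raw >= PAF_DIAGNOSIS_CUTOFF)


# Hypothetical ratings for one patient across the four study PDs.
ratings = {
    "schizotypal": 2,
    "borderline": 5,
    "avoidant": 4,
    "obsessive_compulsive": None,
}
coded = {pd: code_paf_rating(r) for pd, r in ratings.items()}
```

Keeping the dimensional score alongside the categorical flag reflects the fact that the analyses use the dimensional ratings, while the categorical diagnoses are used for agreement statistics.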
Psychosocial Functioning Measures Serving as
Independent External Criteria
Multiple measures of psychosocial functioning served as exter-
nal outcome criteria. These were independent of specific PD
symptoms and used two independent assessment methods. Both
aspects are crucial for the current purposes, as independent, exter-
nal criteria provide the only opportunity to discriminate validity
among different methods of PD diagnosis. To assess psychosocial
functioning, CLPS research team interviewers administered the
LIFE (Keller et al., 1987), a structured interview assessing func-
tioning in interpersonal relationships and occupational and recre-
ational domains. Most areas of functioning are rated on 5-point
severity scales (1 = no impairment, high level of functioning or
very good functioning and 5 = severe impairment or very poor
functioning). Participants also completed the SAS-SR (Weissman
& Bothwell, 1976), a self-report instrument yielding estimates of
interpersonal, occupational, and recreational functioning. The
LIFE and SAS-SR were administered at baseline and repeated at
predetermined intervals, including the 5-year follow-up. The same
interviewers administered both interviews (i.e., the LIFE and
DIPD-IV) at a given assessment interval; however, it was unlikely
that the interviewer who administered the DIPD-IV at baseline
also administered the LIFE at 5-year follow-up.
Data Analytic Procedures
We first examined the convergent validity of clinicians’ PAF
diagnoses with those from a semistructured diagnostic interview
(DIPD-IV) and self-report questionnaire (SNAP-2). PAF dimen-
sional ratings were compared with those from the DIPD-IV and
SNAP-2 (all at baseline) for their ability to predict functional
outcomes at the 60-month follow-up (via the LIFE and SAS-SR)
using hierarchical regression analyses. For example, the clinicians’
baseline PAF ratings for the four PDs were entered simultaneously
in one step, followed by the baseline PD ratings from the
SNAP-2. This was then repeated with the order of entry reversed.
To account for possible contamination due to shared method
variance, we conducted these analyses separately using the self-
report criterion and again with the interview-based criterion vari-
able.
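The two-step entry procedure described above amounts to comparing R² before and after a second block of predictors is added. The sketch below is a generic, dependency-free illustration of that idea, not the authors' analysis: the function names (`r_squared`, `delta_r_squared`) are hypothetical, ordinary least squares with an intercept is assumed, and the significance test for the increment is omitted.

```python
from typing import List, Sequence, Tuple


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R^2 from an OLS regression of y on the predictors plus an intercept.

    Each row of `predictors` is one participant.
    """
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def delta_r_squared(
    block1: Sequence[Sequence[float]],
    block2: Sequence[Sequence[float]],
    y: Sequence[float],
) -> Tuple[float, float]:
    """Return (R^2 after step 1, increment in R^2 when block2 is added in step 2)."""
    r2_step1 = r_squared(block1, y)
    combined = [list(a) + list(b) for a, b in zip(block1, block2)]
    r2_step2 = r_squared(combined, y)
    return r2_step1, r2_step2 - r2_step1
```

Calling `delta_r_squared(paf, snap, outcome)` and then `delta_r_squared(snap, paf, outcome)` (with hypothetical arrays holding one row of four PD ratings per participant) would reproduce the "order of entry reversed" comparison: each call reports how much variance the second block explains beyond the first.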
PAF diagnoses had been used to confirm the DIPD-IV diagnosis
for a subset of participants, creating a potential confound. Al-
though our use of functional outcomes rather than diagnostic
information as criteria attenuates this possibility, we nonetheless
examined it by performing a parallel set of analyses restricted to a
subsample of 110 participants for whom the PAF disagreed with
the DIPD-IV at baseline and thus was not required for study
inclusion. In this subsample, PAF ratings would potentially have
greater ability to increment the DIPD-IV scores.
Results
Categorical and Dimensional Agreement
Table 1 provides the agreement between PAF ratings and those
from the DIPD-IV and SNAP-2. Categorical agreement (kappas)
between treating clinicians’ diagnoses and the semistructured diagnostic
interview ranged from .21 (avoidant) to .42 (schizotypal),
whereas dimensional agreement (Pearson correlations)
ranged from .30 (avoidant) to .44 (borderline). Agreement between
clinicians’ ratings and self-report questionnaire was lower than
between clinicians’ ratings and semistructured diagnostic inter-
views, with kappas ranging from .00 (OCPD) to .20 (borderline)
and Pearson correlations ranging from .18 (schizotypal) to .28
(borderline). For context, we note that agreement between
DIPD-IV and SNAP-2 in the current sample ranged from .25
(OCPD) to .51 (avoidant) for categorical diagnoses and from .57
(schizotypal) to .72 (avoidant) for dimensional ratings.
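For readers less familiar with the two agreement indices reported here, the following sketch shows how Cohen's kappa (for paired yes/no diagnoses) and the Pearson correlation (for paired dimensional scores) are computed. It is a generic textbook illustration with hypothetical function names, not the authors' code, and it assumes complete pairs with no missing values.

```python
from math import sqrt
from typing import Sequence


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two sets of paired binary diagnoses.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # base rate of a diagnosis by method A
    p_b = sum(b) / n  # base rate of a diagnosis by method B
    p_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two sets of paired dimensional scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)
```

Because kappa corrects for chance agreement, two methods can agree on most patients and still yield a low kappa when a diagnosis is rare, which is worth keeping in mind when reading values near zero.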
Incremental Predictive Validity
Tables 2–6 summarize the hierarchical regression analyses. Table 2
shows that the DIPD-IV provided significant increment
beyond the PAF for predicting functioning assessed by both the
SAS-SR and LIFE. In contrast, clinicians’ ratings did not signifi-
cantly increment the DIPD-IV interview results for either criterion.
The nonsignificant ΔR² when the PAF block was added does not
indicate that all PAF diagnoses lacked validity, as the individual
schizotypal rating from the PAF was a significant predictor (β =
.15; p < .05). Table 3 summarizes the parallel series of analyses on
Table 1
Dimensional and Categorical Agreement of Clinician PD Diagnostic Ratings With Interview Generated and Self-Report PD Scores

                 DIPD-IV criteria counts     SNAP-2 PD scores
PAF ratings      κ         r                 κ         r
Schizotypal      .42       .40               .01       .18
Borderline       .38       .44               .20       .28
Avoidant         .21       .30               .14       .23
OCPD             .24       .30               .00       .20

Note. n = 320. Kappa between diagnoses provided by PAF (≥4) and
from DIPD-IV and SNAP-2 (meeting diagnostic criteria threshold).
Dimensional agreements represent Pearson correlations of PAF ratings (1–6)
with scores from DIPD-IV and SNAP-2. PD = personality disorder;
PAF = Personality Assessment Form; DIPD-IV = Diagnostic Interview
for DSM–IV Personality Disorders; SNAP-2 = Schedule for Nonadaptive
and Adaptive Personality–2; OCPD = obsessive-compulsive personality
disorder.

Citations

Determinants and Predictive Value of Clinician Assessment of Short-Term Suicide Risk.

TL;DR: Clinical judgment appears to be informed both by concrete risk factors and clinicians' emotional responses to suicidal patients, highlighting emotional awareness as a promising area for research and training.

Utilizing interview and self-report assessment of the Five-Factor Model to examine convergence with the alternative model for personality disorders

TL;DR: This study expands on recent research to examine the relationship of the PID-5 with an interview measure of the Five-Factor Model and provides evidence for the convergence of the 2 models using self-report and interview measures of the FFM.

When and How to use Multiple Informants to Improve Clinical Assessments

TL;DR: De Los Reyes et al. as discussed by the authors simplified the OTM model by identifying context and insight as the critical factors necessary for determining if multiple informants improve diagnostic accuracy and providing decision-making heuristics for determining when and how to use multiple informants in clinical research and practice.

Clinicians and clients disagree: Five implications for clinical science.

TL;DR: Evidence that, despite criticisms, self-reported psychopathology may be at least as valid as clinicians' unstructured diagnoses is provided, and the need for research that provides clinicians with the most valid tools, including those that focus on dimensional constructs, rather than diagnostic categories is highlighted.
References

Diagnostic and Statistical Manual of Mental Disorders

TL;DR: An issue concerning the criteria for tic disorders is highlighted, and how this might affect classification of dyskinesias in psychotic spectrum disorders.

Guidelines, Criteria, and Rules of Thumb for Evaluating Normed and Standardized Assessment Instruments in Psychology.

TL;DR: In this paper, the authors provide criteria, guidelines, and simple rules of thumb to assist the clinician faced with the challenge of choosing an appropriate test instrument for a given psychological assessment.

Assessment of Social Adjustment by Patient Self-Report

TL;DR: The derivation and testing of a simple and inexpensive method, the Social Adjustment Scale Self-Report, is described, which covers the patient's role performance, interpersonal relationships, friction, feelings and satisfaction in work, and social and leisure activities with the extended family.

The Longitudinal Interval Follow-up Evaluation. A comprehensive method for assessing outcome in prospective longitudinal studies

TL;DR: The Longitudinal Interval Follow-up Evaluation (LIFE) is an integrated system for assessing the longitudinal course of psychiatric disorders that consists of a semistructured interview, an Instruction booklet, a coding sheet, and a set of training materials.
Frequently Asked Questions (2)
Q1. What contributions have the authors mentioned in the paper "Convergent and incremental predictive validity of clinician, self-report, and structured interview diagnoses for personality disorders over 5 years"?

In this paper, the authors showed that PD diagnoses made by treating clinicians agree poorly with semistructured interviews and self-report questionnaires, and that the latter two methods have greater utility than clinicians' PD diagnoses for predicting psychosocial functioning.

Future research that examines the incremental predictive validity of clinicians’ diagnoses derived from more structured assessments, such as therapists completing the SWAP (Westen & Shedler, 1999), the Personality Disorder Schedule (Nestadt et al., 2012), or even an informant version of an existing PD questionnaire, would help to address this possibility. Future research exploring the validity of other psychiatric diagnoses provided by clinicians in routine practice warrants attention. Rather, their results suggest that clinicians use standardized assessment instruments to inform PD diagnoses.