
Showing papers in "Psychological Assessment in 2019"


Journal ArticleDOI
TL;DR: This update of Clark and Watson (1995) provides a synopsis of major points of an earlier article and discusses issues in scale construction that have become more salient as clinical and personality assessment has progressed over the past quarter-century.
Abstract: In this update of Clark and Watson (1995), we provide a synopsis of major points of our earlier article and discuss issues in scale construction that have become more salient as clinical and personality assessment has progressed over the past quarter-century. It remains true that the primary goal of scale development is to create valid measures of underlying constructs and that Loevinger's theoretical scheme provides a powerful model for scale development. We still discuss practical issues to help developers maximize their measures' construct validity, reiterating the importance of (a) clear conceptualization of target constructs, (b) an overinclusive initial item pool, (c) paying careful attention to item wording, (d) testing the item pool against closely related constructs, (e) choosing validation samples thoughtfully, and (f) emphasizing unidimensionality over internal consistency. We have added (g) consideration of the hierarchical structures of personality and psychopathology in scale development, discussion of (h) codeveloping scales in the context of these structures, (i) "orphan," and "interstitial" constructs, which do not fit neatly within these structures, (j) problems with "conglomerate" constructs, and (k) developing alternative versions of measures, including short forms, translations, informant versions, and age-based adaptations. Finally, we have expanded our discussions of (l) item-response theory and of external validity, emphasizing (m) convergent and discriminant validity, (n) incremental validity, and (o) cross-method analyses, such as questionnaires and interviews. We conclude by reaffirming that all mature sciences are built on the bedrock of sound measurement and that psychology must redouble its efforts to develop reliable and valid measures. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

310 citations


Journal ArticleDOI
TL;DR: Results support use of the SCS to examine 6 subscale scores or a total score, but not separate scores representing compassionate and uncompassionate self-responding; fit was excellent using ESEM for the 6-factor correlated, single-bifactor, and correlated 2-bifactor models.
Abstract: This study examined the factor structure of the Self-Compassion Scale (SCS) using secondary data drawn from 20 samples (N = 11,685; 7 English and 13 non-English), including 10 community, 6 student, 1 mixed community/student, 1 meditator, and 2 clinical samples. Self-compassion is theorized to represent a system with 6 constituent components: self-kindness, common humanity, mindfulness, and reduced self-judgment, isolation, and overidentification. There has been controversy as to whether a total score on the SCS should be used or separate scores representing compassionate versus uncompassionate self-responding. The current study examined the factor structure of the SCS using confirmatory factor analyses (CFA) and Exploratory Structural Equation Modeling (ESEM) to examine 5 distinct models: 1-factor, 2-factor correlated, 6-factor correlated, single-bifactor (1 general self-compassion factor and 6 group factors), and 2-bifactor models (2 correlated general factors each with 3 group factors representing compassionate or uncompassionate self-responding). Results indicated that a 1- and 2-factor solution to the SCS had inadequate fit in every sample examined using both CFA and ESEM, whereas fit was excellent using ESEM for the 6-factor correlated, single-bifactor, and correlated 2-bifactor models. However, factor loadings for the correlated 2-bifactor models indicated that 2 separate factors were not well specified. A general factor explained 95% of the reliable item variance in the single-bifactor model. Results support use of the SCS to examine 6 subscale scores (representing the constituent components of self-compassion) or a total score (representing overall self-compassion), but not separate scores representing compassionate and uncompassionate self-responding. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

204 citations
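
The single-bifactor result above (a general factor explaining 95% of the reliable item variance) is the kind of figure usually summarized with an explained common variance (ECV) index. As a minimal illustration of how that index is computed from bifactor loadings, here is a short Python sketch; the loadings are invented for the example and are not the SCS estimates.

```python
# Illustrative only: explained common variance (ECV) for the general factor of a
# bifactor solution. Loadings are made up for this example, not the SCS estimates.
import numpy as np

# Hypothetical standardized loadings: 6 items, one general factor plus 2 group factors.
# Each row: [general, group1, group2]; zeros mean the item does not load on that group factor.
L = np.array([
    [0.65, 0.30, 0.00],
    [0.60, 0.25, 0.00],
    [0.70, 0.20, 0.00],
    [0.55, 0.00, 0.35],
    [0.62, 0.00, 0.28],
    [0.58, 0.00, 0.22],
])

common = (L ** 2).sum()          # total common (reliable) item variance
general = (L[:, 0] ** 2).sum()   # common variance attributable to the general factor
ecv = general / common
print(f"ECV = {ecv:.2f}")        # about .84 with these made-up loadings; the SCS paper reports ~.95
```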


Journal ArticleDOI
TL;DR: For the case of a single test administration, this work considers multiple measures of reliability, ranging from the worst to average to best split-half estimates, explains why model-based estimates should be reported, and addresses the utility of test-retest and alternate-form reliabilities.
Abstract: Reliability is a fundamental problem for measurement in all of science. Although defined in multiple ways, and estimated in even more ways, the basic concepts seem straightforward and need to be understood by practitioners as well as methodologists. Reliability theory is not just for the psychometrician estimating latent variables, it is for everyone who wants to make inferences from measures of individuals or of groups. For the case of a single test administration, we consider multiple measures of reliability, ranging from the worst (β) to average (α, λ3) to best (λ4) split half reliabilities, and consider why model-based estimates (ωh, ωt) should be reported. We also address the utility of test-retest and alternate form reliabilities. The advantages of immediate versus delayed retests to decompose observed score variance into specific, state, and trait scores are discussed. But reliability is not just for test scores, it is also important when evaluating the use of ratings. Estimates that may be applied to continuous data include a set of intraclass correlations, while discrete categorical data need to take advantage of the family of κ statistics. Examples of these various reliability estimates are given using state and trait measures of anxiety given with different delays and under different conditions. Online supplemental materials provide more detail and elaboration; they are also used to demonstrate applications of open source software to examples of real data, and comparisons are made between the many types of reliability. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

171 citations
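
For readers who want to connect the coefficients named above to concrete computation, the sketch below shows two of the simpler single-administration estimates, coefficient α (the average split half, λ3) and one arbitrary odd-even split half with the Spearman-Brown correction, on simulated data. The model-based ω estimates discussed in the article additionally require a fitted factor model and are not shown; this is an illustration, not the article's supplemental code.

```python
# Minimal sketch (not the authors' code): Cronbach's alpha and a Spearman-Brown
# corrected odd-even split-half reliability for a persons-by-items data matrix.
import numpy as np

rng = np.random.default_rng(0)
true_score = rng.normal(size=(200, 1))                    # 200 simulated persons
X = true_score + rng.normal(scale=1.0, size=(200, 8))     # 8 roughly parallel items

def cronbach_alpha(items):
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def odd_even_split_half(items):
    odd, even = items[:, ::2].sum(axis=1), items[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)                                # Spearman-Brown correction to full length

print(f"alpha = {cronbach_alpha(X):.2f}, odd-even split half = {odd_even_split_half(X):.2f}")
```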


Journal ArticleDOI
TL;DR: There is a need to be more careful about item distribution properties in light of their potential impact on model estimation, and a very strong caution is given against item parceling in the evaluation of psychological test instruments.
Abstract: This article provides a summary and discussion of major challenges and pitfalls in factor analysis as observed in psychological assessment research, as well as our recommendations within each of these areas. More specifically, we discuss a need to be more careful about item distribution properties in light of their potential impact on model estimation, and provide a very strong caution against item parceling in the evaluation of psychological test instruments. Moreover, we consider the important issue of estimation, with a particular emphasis on selecting the most appropriate estimator to match the scaling properties of test item indicators. Next, we turn our attention to the issues of model fit and comparison of alternative models with the strong recommendation to allow for theoretical guidance rather than being overly influenced by model fit indices. In addition, since most models in psychological assessment research involve multidimensional items that often do not map neatly onto a priori confirmatory models, we provide recommendations about model respecification. Finally, we end our article with a discussion of alternative forms of model specification that have become particularly popular recently: exploratory structural equation modeling (ESEM) and bifactor modeling. We discuss various important areas of consideration for the applied use of these model specifications, with a conclusion that, whereas ESEM models can offer a useful avenue for the evaluation of internal structure of test items, researchers should be very careful about using bifactor models for this purpose. Instead, we highlight other, more appropriate applications of such models. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

167 citations
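
The recommendation above to attend to item distribution properties before choosing an estimator can be operationalized as a simple screening step. The Python sketch below (simulated Likert items, not data from the article) reports the number of observed categories, skewness, and excess kurtosis per item, the kind of diagnostics that would argue for an ordinal, polychoric-based estimator such as WLSMV rather than normal-theory maximum likelihood.

```python
# Illustrative screening of item distributions before factor analysis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
items = rng.integers(1, 6, size=(500, 4)).astype(float)                 # fake 5-point Likert items
items[:, 0] = np.clip(np.round(rng.exponential(1.0, 500)) + 1, 1, 5)    # make item 1 strongly skewed

for j in range(items.shape[1]):
    x = items[:, j]
    print(f"item {j + 1}: categories = {np.unique(x).size}, "
          f"skew = {stats.skew(x):.2f}, excess kurtosis = {stats.kurtosis(x):.2f}")
```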


Journal ArticleDOI
TL;DR: Results revealed attenuated psychometric precision for response scales with 2 to 5 response options; interestingly, however, the criterion validity results did not follow this pattern, and no psychometric advantages were revealed for any response scales beyond 6 options, including visual analogs.
Abstract: Psychological tests typically include a response scale whose purpose it is to organize and constrain the options available to respondents and facilitate scoring. One such response scale is the Likert scale, which initially was introduced to have a specific 5-point form. In practice, such scales have varied considerably in the nature and number of response options. However, relatively little consensus exists regarding several questions that have emerged regarding the use of Likert-type items. First, is there a "psychometrically optimal" number of response options? Second, is it better to include an even or odd number of response options? Finally, do visual analog items offer any advantages over Likert-type items? We studied these questions in a sample of 1,358 undergraduates who were randomly assigned to groups to complete a common personality measure using response scales ranging from 2 to 11 options, and a visual analog condition. Results revealed attenuated psychometric precision for response scales with 2 to 5 response options; interestingly, however, the criterion validity results did not follow this pattern. Also, no psychometric advantages were revealed for any response scales beyond 6 options, including visual analogs. These results have important implications for psychological scale development. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

163 citations
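
One way to build intuition for why very coarse response scales attenuate precision is a toy simulation in which a continuous report is cut into k ordered categories and correlated with a criterion. The sketch below is purely illustrative: it is not the study's design, all values are simulated, and it cannot reproduce the empirical finding that criterion validity did not track precision.

```python
# Toy simulation: discretizing a continuous report into k response options and
# observing how the observed item-criterion correlation attenuates.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
trait = rng.normal(size=n)
criterion = 0.5 * trait + rng.normal(scale=np.sqrt(1 - 0.25), size=n)   # true r with trait ~ .50
response_cont = trait + rng.normal(scale=0.6, size=n)                   # continuous (visual analog-like) report

for k in [2, 3, 5, 7, 11]:
    cuts = np.quantile(response_cont, np.linspace(0, 1, k + 1)[1:-1])   # k equal-frequency categories
    coarse = np.digitize(response_cont, cuts)
    r = np.corrcoef(coarse, criterion)[0, 1]
    print(f"{k:2d} options: r(item, criterion) = {r:.3f}")
print(f"continuous : r = {np.corrcoef(response_cont, criterion)[0, 1]:.3f}")
```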


Journal ArticleDOI
TL;DR: The value of applied AA will be most fully realized by balancing the ability to develop personalized models with ensuring comparability among individuals, given the inherent tension between idiographic and nomothetic principles of measurement.
Abstract: Ambulatory assessment (AA; also known as ecological momentary assessment) has enjoyed enthusiastic implementation in psychological research. The ability to assess thoughts, feelings, behavior, physiology, and context intensively and repeatedly in the moment in an individual's natural ecology affords access to data that can answer exciting questions about sequences of events and dynamic processes in daily life. AA also holds unique promise for developing personalized models of individuals (i.e., precision or person-specific assessment) that might be transformative for applied settings such as clinical practice. However, successfully translating AA from bench to bedside is challenging because of the inherent tension between idiographic and nomothetic principles of measurement. We argue that the value of applied AA will be most fully realized by balancing the ability to develop personalized models with ensuring comparability among individuals. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

104 citations


Journal ArticleDOI
TL;DR: Meta-analytic results indicate that self-report CE scores account for only approximately 1% of the variance in behavioral cognitive empathy assessments and that, perhaps equally importantly, this relation is not significantly different from that demonstrated by affective empathy scores.
Abstract: Empathy is widely regarded as relevant to a diverse range of psychopathological constructs, such as autism spectrum disorder, psychopathy, and borderline personality disorder. Cognitive empathy (CE) is the ability to accurately recognize or infer the thoughts and feelings of others. Although behavioral task paradigms are frequently used to assess such abilities, a large proportion of published studies reporting on CE use self-report questionnaires. For decades, however, a number of theorists have cautioned that individuals may not possess the metacognitive insight needed to validly gauge their own mindreading abilities. To investigate this possibility, we examined the aggregate relations between behavioral CE task performance and self-report CE scale scores, as well as with self-report affective empathy scale scores for comparison. Meta-analytic results, based on random effects models, from 85 studies (total N = 14,327) indicate that self-report CE scores account for only approximately 1% of the variance in behavioral cognitive empathy assessments and that, perhaps equally importantly, this relation is not significantly different from that demonstrated by affective empathy scores. Effect sizes were not moderated by self-report empathy domain, gender composition, unisensory versus multisensory behavioral stimuli presentation, child versus adult samples, or by normative versus clinical/forensic samples. Effect size estimates were not markedly affected by publication bias. These results raise serious concerns regarding the widespread use of self-report CE scores as proxies for CE ability, as well as the extensive theoretical conclusions that have been based on their use in past studies. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

98 citations
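
The pooled estimate above comes from random-effects meta-analysis of correlations. As a hedged illustration of that machinery (not the authors' analysis script), the sketch below pools a handful of invented study correlations via Fisher's z transformation and the DerSimonian-Laird between-study variance estimator; a pooled r near .10 corresponds to roughly 1% of variance explained.

```python
# Random-effects pooling of correlations (Fisher z, DerSimonian-Laird tau^2).
# Study correlations and sample sizes below are invented, not the 85 studies in the paper.
import numpy as np

r = np.array([0.05, 0.12, 0.02, 0.18, 0.08, -0.03])   # hypothetical study correlations
n = np.array([120, 340, 95, 210, 150, 80])            # hypothetical sample sizes

z = np.arctanh(r)               # Fisher z transform
v = 1 / (n - 3)                 # sampling variance of z
w = 1 / v

z_fixed = np.sum(w * z) / np.sum(w)
Q = np.sum(w * (z - z_fixed) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - (len(r) - 1)) / c)               # DerSimonian-Laird between-study variance

w_re = 1 / (v + tau2)
r_pooled = np.tanh(np.sum(w_re * z) / np.sum(w_re))
print(f"pooled r = {r_pooled:.3f}, variance explained = {r_pooled**2:.1%}, tau^2 = {tau2:.4f}")
```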


Journal ArticleDOI
TL;DR: This review discusses various conceptualizations of social-emotional skills, demonstrates their overlap with related constructs such as emotional intelligence and the Big Five personality dimensions, and proposes an integrative set of social-emotional skill domains that has been developed recently.
Abstract: The development and promotion of social-emotional skills in childhood and adolescence contributes to subsequent well-being and positive life outcomes. However, the assessment of these skills is associated with conceptual and methodological challenges. This review discusses how social-emotional skill measurement in youth could be improved in terms of skills' conceptualization and classification, and in terms of assessment techniques and methodologies. The first part of the review discusses various conceptualizations of social-emotional skills, demonstrates their overlap with related constructs such as emotional intelligence and the Big Five personality dimensions, and proposes an integrative set of social-emotional skill domains that has been developed recently. Next, methodological approaches that are innovative and may improve social-emotional assessments are presented, illustrated by concrete examples. We discuss how these innovations could advance social-emotional assessments, and demonstrate links to similar issues in related fields. We conclude the review by providing several concrete assessment recommendations that follow from this discussion. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

88 citations


Journal ArticleDOI
TL;DR: A new sample of 186 children with ADHD was evaluated using community detection analysis to determine whether meaningful subprofiles existed and whether they replicated those previously identified; the results offer prospects for a fresh approach to assessing ADHD heterogeneity focused on the distinction between ADHD with and without anger/irritability.
Abstract: Attention deficit hyperactivity disorder (ADHD) is emblematic of unresolved heterogeneity in psychiatric disorders-the variation in biological, clinical, and psychological correlates that impedes progress on etiology. One approach to this problem is to characterize subgroups using measures rooted in biological or psychological theory, consistent with the National Institute of Mental Health's research domain criteria initiative. Within ADHD, a promising application involves using emotion trait profiles that can address the role of irritability as a complicating feature for ADHD. Here, a new sample of 186 children with ADHD was evaluated using community detection analysis to determine if meaningful subprofiles existed and if they replicated those previously identified. The new sample and a prior sample were pooled for evaluation of (a) method dependence, (b) longitudinal assessment of the stability of classifications, and (c) clinical prediction 2 years later. Three temperament profiles were confirmed within the ADHD group: one with normative emotional functioning ("mild"), one with high surgency ("surgent"), and one with high negative affect ("irritable"). Profiles were similar across statistical clustering approaches. The irritable group had the highest external validity: It was moderately stable over time and it enhanced prospective prediction of clinical outcomes beyond standard baseline indicators. The irritable group was not reducible to ADHD + oppositional defiant disorder, ADHD + disruptive mood dysregulation disorder, or other patterns of comorbidity. Among the negative affect domains studied, trait proneness to anger uniquely contributed to clinical prediction. Results extend our understanding of chronic irritability in psychiatric disorders and provide prospects for a fresh approach to assessing ADHD heterogeneity focused on the distinction between ADHD with and without anger/irritability. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

80 citations


Journal ArticleDOI
TL;DR: Psychometric findings across diverse cultural contexts supported the robustness and validity of the F-SozU K-6 for cross-cultural epidemiologic studies.
Abstract: The present study evaluates a brief, cross-cultural scale that maps a wide range of social resources, useful in large-scale assessments of perceived social support. The Brief Perceived Social Support Questionnaire (Fragebogen zur Sozialen Unterstützung Kurzform mit sechs Items, F-SozU K-6) was examined in representative and university student samples from the United States (representative N = 3,038), Germany (representative N = 2,007; student N = 5,406), Russia (representative N = 3,020; student N = 4,001), and China (student N = 13,582). Cross-cultural measurement invariance testing was conducted in both representative and student samples across countries. Scores on the F-SozU K-6 demonstrated good reliability and strong model fit for a unidimensional structure in all samples, with the exception of poor model fit for German students. The scores on the F-SozU K-6 correlated negatively with scores on depression, anxiety, and stress measures and positively with scores on positive mental health measures. Norms for gender and age groups were established separately based on each representative sample. Cross-cultural measurement invariance testing found partial strong measurement invariance across three general population samples and three student samples. Furthermore, a simulation study showed that the amount of invariance observed in the partial invariance model had only a negligible impact on mean comparisons. Psychometric findings across diverse cultural contexts supported the robustness and validity of the F-SozU K-6 for cross-cultural epidemiologic studies. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

74 citations


Journal ArticleDOI
TL;DR: Findings suggest acceptable compliance in an ESM protocol of 4 to 6 study days with a high frequency of 10 assessments per day, despite fluctuations across and within study days.
Abstract: Intensive repeated measurement techniques, such as the experience sampling method (ESM), put high demands on participants and may lead to low response compliance, which, in turn, may affect data quality. Therefore, the objective of this study was to investigate ESM compliance and predictors thereof based on a pooled dataset of 10 ESM studies with a total of 92,394 momentary assessments from 1,717 individuals with different mental health conditions. All included studies used an ESM paper-and-pencil diary protocol of 4 to 6 study days with 10 random time assessments per day. Analyses were conducted using multilevel mixed-effects logistic regression models. Results indicated overall acceptable compliance with an average response rate of 78% (95% CI [0.74, 0.82]). However, compliance declined across days (p < .001), reaching a low on the 5th day with 73% (95% CI [0.68, 0.77]). Compliance also varied significantly across assessments depending on the time within a day (p < .001), with the highest compliance between 12 p.m. and 1:30 p.m. (83%; 95% CI [0.80, 0.86]) and the lowest compliance between 7:30 a.m. and 9 a.m. (56%; 95% CI [0.50, 0.62]). Persons with psychosis were less compliant than healthy participants (70% vs. 83%, respectively; p < .001). Females (p = .002) and older participants (p < .001) were also slightly more compliant. The findings suggest acceptable compliance in an ESM protocol of 4 to 6 study days with a high frequency of 10 assessments per day despite fluctuations across and within study days. Further evidence on compliance and its predictors in different ESM protocols is needed, especially in clinical populations. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
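
The compliance figures above are proportions of answered prompts with confidence intervals, estimated in the article with multilevel mixed-effects logistic regression to respect the nesting of prompts within persons. The simplified sketch below (simulated prompts, no multilevel structure) only reproduces the descriptive step of computing per-day response rates with normal-approximation 95% CIs.

```python
# Descriptive sketch on simulated prompts (not the pooled ESM dataset): compliance per
# study day with normal-approximation 95% CIs. The article's multilevel models
# additionally account for prompts being nested within persons.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame({"day": np.repeat(np.arange(1, 7), 1000)})              # 6 study days, 1,000 prompts each
p_true = df["day"].map({1: 0.82, 2: 0.80, 3: 0.78, 4: 0.76, 5: 0.73, 6: 0.75})
df["answered"] = (rng.random(len(df)) < p_true).astype(int)               # 1 = prompt answered

summary = df.groupby("day")["answered"].agg(["mean", "count"])
se = np.sqrt(summary["mean"] * (1 - summary["mean"]) / summary["count"])
summary["ci_low"] = summary["mean"] - 1.96 * se
summary["ci_high"] = summary["mean"] + 1.96 * se
print(summary.round(3))
```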

Journal ArticleDOI
TL;DR: Daily diary methods have the potential to integrate within- and between-person approaches to personality assessment by applying measures like the Personality Dynamics Diary to gain insight into the psychological mechanisms that give rise to, and maintain, a person’s maladaptive dispositions and ultimately find individualized leverage points for targeted therapeutic interventions.
Abstract: Both theories and cutting-edge research highlight the dynamic nature of personality and personality pathology, thereby posing significant challenges for an exclusively between-person, trait-based approach to personality assessment. In a series of 3 studies, we explored the viability of integrating within-person, dynamic aspects into clinical personality assessment by means of daily diary methods. In the 1st study, 314 students filled out a 73-item questionnaire capturing daily behaviors and situation experiences across 7-10 consecutive days. We used multilevel exploratory factor analyses to construct a shortened version, the Personality Dynamics Diary (PDD). In the 2nd study, the PDD was applied in a sample of 77 psychotherapy inpatients across 40 days, on average. In the 3rd study, 35 psychotherapy outpatients as well as their therapists judged the clinical utility of a smartphone version of the PDD. Taken together, we were able to construct a relatively brief self-report measure that assesses major dimensions of within- and between-person differences of situations and behaviors in daily life with acceptable reliability. Application in clinical samples provided further evidence for the reliability, validity, and clinical utility of the PDD but also highlighted possible obstacles in clinical practice as well as the need for further replication and refinement. We conclude that daily diary methods have the potential to integrate within- and between-person approaches to personality assessment. By applying measures like the PDD, clinicians may gain insight into the psychological mechanisms that give rise to, and maintain, a person's maladaptive dispositions and ultimately find individualized leverage points for targeted therapeutic interventions. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: The above findings suggest that the EDE-Q factor structure may require further reassessment, with greater focus on the qualitative differences in interpretation of EDE-Q items between females and males.
Abstract: The Eating Disorder Examination Questionnaire (EDE-Q) is a widely used assessment of eating disorder psychopathology; however, EDE-Q norms are yet to be provided within a nonclinical U.K. adult sample. Second, there is considerable disagreement regarding the psychometric properties of this measure. Several alternative factor structures have been previously proposed, but very few have subsequently validated their new structure in independent samples and many are often confined to specific subpopulations. Therefore, in the current study, we provide norms of the original four-factor EDE-Q structure, and subsequently assess the psychometric properties of the EDE-Q in females and males using a large nonclinical U.K. sample (total N = 2459). EDE-Q norms were consistently higher in females compared with males across all samples. Initial confirmatory factor analyses (CFA) did not support the original 4-factor structure for females or males (Phase 1). However, subsequent exploratory factor analyses (EFA) revealed a 3-factor structure as being the optimal fit for both females and males, using an 18-item and 16-item model, respectively (Phase 2). For females, the newly proposed 18-item structure was validated within an independent student sample and further validated in an additional nonstudent sample. The 16-item 3-factor male structure was also validated within an independent nonstudent sample, but was marginally below accepted fit indices within an independent student sample (Phase 3). Taken together, the above findings suggest that the EDE-Q factor structure may require further reassessment, with greater focus on the qualitative differences in interpretation of EDE-Q items between females and males. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: This review begins with an overview of test bias and fairness, followed by a discussion of issues involving group classification, focusing on categorizations of race/ethnicity and sex/gender, and describes procedures used to establish measurement invariance (MI).
Abstract: One of the most important considerations in psychological and educational assessment is the extent to which a test is free of bias and fair for groups with diverse backgrounds. Establishing measurement invariance (MI) of a test or items is a prerequisite for meaningful comparisons across groups as it ensures that test items do not function differently across groups. Demonstration of MI is particularly important in assessment settings where test scores are used in decision making. In this review, we begin with an overview of test bias and fairness, followed by a discussion of issues involving group classification, focusing on categorizations of race/ethnicity and sex/gender. We then describe procedures used to establish MI, detailing steps in the implementation of multigroup confirmatory factor analysis, and discussing recent developments in alternative procedures for establishing MI, such as the alignment method and moderated nonlinear factor analysis, which accommodate reconceptualization of group categorizations. Lastly, we discuss a variety of important statistical and conceptual issues to be considered in conducting multigroup confirmatory factor analysis and related methods and conclude with some recommendations for applications of these procedures. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: An alternative trait-based coding of the DSM–5 LPFS is examined, suggesting that its coverage of diverse maladaptivity may not be because it assesses the core of personality disorder, but rather because it has items specific to the different domains of personality.
Abstract: Proposed for the ICD-11 is a dimensional model of personality disorder that, if approved, would be a paradigm shift in the conceptualization of personality disorder. The proposal consists of a general severity rating, 5 maladaptive personality trait domains, and a borderline pattern qualifier. The general severity rating can be assessed by the Standardized Assessment of Severity of Personality Disorder (SASPD), the trait domains by the Personality Inventory for ICD-11 (PiCD), and the borderline pattern by the Borderline Pattern Scale (BPS), which is developed in the present study. To date, no study has examined the relations among all 3 components, due in part to the absence of direct measures for each component (until recently). The current study develops and provides initial validation evidence for the BPS, and examines the relations among the BPS, SASPD, and PiCD. Also considered is their relationship with the 5-factor model of general personality as well as with 2 other measures of personality disorder severity (including the DSM-5 Level of Personality Functioning Scale [LPFS]). Further, an alternative trait-based coding of the DSM-5 LPFS is examined (modeled after the ICD-11 SASPD), suggesting that its coverage of diverse maladaptivity may not be because it assesses the core of personality disorder, but rather because it has items specific to the different domains of personality. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: An introduction to the GIMME method is provided, followed by a demonstration of its use in a sample of individuals diagnosed with personality disorder who completed daily diaries over 100 consecutive days.
Abstract: Personality and psychopathology are composed of dynamic and interactive processes among diverse psychological systems, manifesting over time and in response to an individual's natural environment. Ambulatory assessment techniques promise to revolutionize assessment practices by allowing access to the dynamic data necessary to study these processes directly. Assessing manifestations of personality and psychopathology naturalistically in an individual's own ecology allows for dynamic modeling of key behavioral processes. However, advances in dynamic data collection have highlighted the challenges of both fully understanding an individual (via idiographic models) and how s/he compares with others (as seen in nomothetic models). Methods are needed that can simultaneously model idiographic (i.e., person-specific) processes and nomothetic (i.e., general) structure from intensive longitudinal personality assessments. Here we present a method, group iterative multiple model estimation (GIMME) for simultaneously studying general, shared (i.e., in subgroups), and person-specific processes in intensive longitudinal behavioral data. We first provide an introduction to the GIMME method, followed by a demonstration of its use in a sample of individuals diagnosed with personality disorder who completed daily diaries over 100 consecutive days. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: The PI-99 showed strong psychometric characteristics, primals plausibly shape many personality and wellbeing variables, and a broad research effort examining these relationships is warranted.
Abstract: Beck's insight-that beliefs about one's self, future, and environment shape behavior-transformed depression treatment. Yet environment beliefs remain relatively understudied. We introduce a set of environment beliefs-primal world beliefs or primals-that concern the world's overall character (e.g., the world is interesting, the world is dangerous). To create a measure, we systematically identified candidate primals (e.g., analyzing tweets, historical texts, etc.); conducted exploratory factor analysis (N = 930) and two confirmatory factor analyses (N = 524; N = 529); examined sequence effects (N = 219) and concurrent validity (N = 122); and conducted test-retests over 2 weeks (n = 122), 9 months (n = 134), and 19 months (n = 398). The resulting 99-item Primals Inventory (PI-99) measures 26 primals with three overarching beliefs-Safe, Enticing, and Alive (mean α = .93)-that typically explain ∼55% of the common variance. These beliefs were normally distributed; stable (2 weeks, 9 months, and 19 month test-retest results averaged .88, .75, and .77, respectively); strongly correlated with many personality and wellbeing variables (e.g., Safe and optimism, r = .61; Enticing and depression, r = -.52; Alive and meaning, r = .54); and explained more variance in life satisfaction, transcendent experience, trust, and gratitude than the BIG 5 (3%, 3%, 6%, and 12% more variance, respectively). In sum, the PI-99 showed strong psychometric characteristics, primals plausibly shape many personality and wellbeing variables, and a broad research effort examining these relationships is warranted. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
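
The claim that the PI-99 explains variance beyond the Big Five rests on incremental-validity comparisons of R² between nested predictor sets. The sketch below illustrates that logic on simulated data (the predictors and outcome are invented, not PI-99 variables): fit the baseline model, fit the expanded model, and report the change in R².

```python
# Illustrative incremental-validity sketch on simulated data (not PI-99 variables):
# R^2 for Big Five predictors alone vs. Big Five plus primal-belief scores.
import numpy as np

rng = np.random.default_rng(4)
n = 500
big5 = rng.normal(size=(n, 5))
primals = 0.3 * big5[:, [0]] + rng.normal(size=(n, 3))    # beliefs overlap with traits by design
outcome = big5 @ np.array([0.3, 0.1, 0.0, 0.2, -0.1]) + 0.4 * primals[:, 0] + rng.normal(size=n)

def r_squared(X, y):
    X1 = np.column_stack([np.ones(len(y)), X])            # add intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return 1 - ((y - X1 @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_base = r_squared(big5, outcome)
r2_full = r_squared(np.column_stack([big5, primals]), outcome)
print(f"R2(Big Five) = {r2_base:.3f}, R2(Big Five + primals) = {r2_full:.3f}, "
      f"increment = {r2_full - r2_base:.3f}")
```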

Journal ArticleDOI
TL;DR: Testing the role of Criterion A in the AMPD raised questions about whether the model may need revision moving forward, and multivariate regression analyses suggested that the traits account for substantially more unique variance in DSM-5 Section II PDs than does personality impairment.
Abstract: An alternative diagnostic model of personality disorders (AMPD) was introduced in DSM-5 that diagnoses PDs based on the presence of personality impairment (Criterion A) and pathological personality traits (Criterion B). Research examining Criterion A has been limited to date, due to the lack of a specific measure to assess it; this changed, however, with the recent publication of a self-report assessment of personality dysfunction as defined by Criterion A (Levels of Personality Functioning Scale-Self-report; LPFS-SR; Morey, 2017). The aim of the current study was to test several key propositions regarding the role of Criterion A in the AMPD including the underlying factor structure of the LPFS-SR, the discriminant validity of the hypothesized factors, whether Criterion A distinguishes personality psychopathology from Axis I symptoms, the overlap between Criterion A and B, and the incremental predictive utility of Criterion A and B in the statistical prediction of traditional PD symptom counts. Neither a single factor model nor an a priori four-factor model of dysfunction fit the data well. The LPFS-SR dimensions were highly interrelated and manifested little evidence of discriminant validity. In addition, the impairment dimensions manifested robust correlations with measures of both Axis I and II constructs, challenging the notion that personality dysfunction is unique to PDs. Finally, multivariate regression analyses suggested that the traits account for substantially more unique variance in DSM-5 Section II PDs than does personality impairment. These results provide important information as to the functioning of the two main components of the DSM-5 AMPD and raise questions about whether the model may need revision moving forward. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: This article proposes an iterative psychoneurometric approach that provides a means to establish multimethod measurement models for core biobehavioral traits that influence functioning across diverse areas of life; the authors believe this model-oriented strategy can provide a viable pathway toward effective use of neurophysiological measures in routine clinical assessments.
Abstract: Recent scientific initiatives have called for increased use of neurobiological variables in clinical and other applied assessments. However, the task of incorporating neural measures into psychological assessments entails significant methodological challenges that have not been effectively addressed to date. As a result, neurophysiological measures remain underutilized in clinical and applied assessments, and formal procedures for integrating such measures with report-based measures are lacking. In this article, we discuss major methodological issues that have impeded progress in this direction, and propose a systematic research strategy for integrating neurophysiological measures into psychological assessment protocols. The strategy we propose is an iterative psychoneurometric approach that provides a means to establish multimethod (MM) measurement models for core biobehavioral traits that influence functioning across diverse areas of life. We provide a detailed illustration of a MM model for one such trait, inhibitory control (inhibition-disinhibition), and highlight work being done to develop counterpart models for other biobehavioral traits (i.e., threat sensitivity, reward sensitivity, affiliative capacity). We discuss how these measurement models can be refined and extended through use of already existing data sets, and outline steps that can be taken to establish norms for MM assessments and optimize the feasibility of their use in everyday practice. We believe this model-oriented strategy can provide a viable pathway toward effective use of neurophysiological measures in routine clinical assessments. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: A measurement tool was constructed to assess the Big Five personality traits and the Situation Five simultaneously, and evidence for the reliability, convergent and discriminant validity, and predictive validity of the B5PS test scores is presented.
Abstract: We present the psychometric evaluation of a personality measure that assesses the Big Five and situation perception based on a newly developed taxonomy of situation characteristics. Following the lexical approach, more than 15,000 adjectives were extracted from an authoritative German dictionary. In a first exploratory study, 521 participants rated everyday situations on 300 adjectives selected as potential situation descriptors. Seven dimensions of situation perception were initially extracted. In a second study with N = 387, five of these seven factors were confirmed: Outcome-Expectancy, Briskness, Cognitive Load, Psychological and Physical Load, and Lack of Stimuli, together referred to as the Situation Five. Finally, a measurement tool, the Big Five of Personality in Occupational Situations (B5PS), was constructed to assess the Big Five personality traits and the Situation Five simultaneously. We present evidence for the reliability, convergent and discriminant validity, and predictive validity of the B5PS test scores. Our study highlights the relevance of situation perception as a trait and discusses its applicability in diverse contexts. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: A systematic review and meta-analysis of 37 randomized controlled trials in which mindfulness questionnaires were administered before and after an evidence-based MBI and a nonmindfulness-based active control condition found that participants in MBIs showed significantly greater pre–post changes in mindfulness scores than were seen in active control conditions with no explicit mindfulness elements.
Abstract: [Correction Notice: An Erratum for this article was reported in Vol 31(10) of Psychological Assessment (see record 2019-58643-005). The article should have been published under the terms of the Creative Commons Attribution License (CC BY 3.0). Therefore, the article was amended to list the authors as copyright holders, and information about the terms of the CC BY 3.0 was added to the author note. In addition, the article is now open access. All versions of this article have been corrected.] In support of the construct validity of mindfulness questionnaires, meta-analytic reviews have reported that scores increase in mindfulness-based interventions (MBIs). However, several studies have also found increased mindfulness scores in interventions with no explicit mindfulness training, raising a question about differential sensitivity to change with treatment. We conducted a systematic review and meta-analysis of 37 randomized controlled trials in which mindfulness questionnaires were administered before and after an evidence-based MBI and a nonmindfulness-based active control condition. The central question was whether increases in mindfulness scores would be greater in the MBI than in the comparison group. On average, participants in MBIs showed significantly greater pre-post changes in mindfulness scores than were seen in active control conditions with no explicit mindfulness elements, with a small overall effect size. This effect was moderated by which mindfulness questionnaire was used, by the type of active control condition, and by whether the MBI and control were matched for amount of session time. When mindfulness facet scores were analyzed separately, MBIs showed significantly greater pre-post increases than active controls in observing, nonjudging, and nonreactivity but not in describing or acting with awareness. Although findings provide partial support for the differential sensitivity of mindfulness questionnaires to change with treatment, the nonsignificant difference in pre-post change when the MBI and control were matched for session time highlights the need to clarify how mindfulness skills are acquired in MBIs and in other interventions and whether revisions to mindfulness questionnaires would increase their specificity to changes in mindfulness skills. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: Results indicated that signal- and event-contingent recording techniques provided equivalent data quality, suggesting that researchers can use the 2 methodologies interchangeably to draw conclusions about means, variances, and associations when examining social interactions.
Abstract: Ambulatory assessment (e.g., ecological momentary assessment) is now widely used in psychological research, yet key design decisions remain largely informed by methodological lore as opposed to systematic inquiry. The present study experimentally tested whether signal- (e.g., random prompt) and event- (e.g., complete a survey every time a target event occurs) contingent recording procedures of interpersonal behavior and affect in social situations yield equivalent quality and quantity of data. Participants (N = 286) completed baseline questionnaires, underwent cluster randomization to either a signal- or event-contingent condition, and then completed 1 week of ambulatory assessment, during which they answered questions about their social behavior and affect tied to their social interactions. Conditions were compared on response frequency, means and variances of interpersonal behavior and affect, correlations between interpersonal behavior and affect within-subject, and associations between momentary behavior and affect and baseline variables (e.g., Big Five traits). Results indicated that signal- and event-contingent recording techniques provided equivalent data quality, suggesting that researchers can use the 2 methodologies interchangeably to draw conclusions about means, variances, and associations when examining social interactions. However, results also showed that event-contingent recording returned, on average, a higher number of reported social interactions per individual, and this was true for most time periods of the day. Thus, event-contingent recording may hold advantages for studying frequency and timing of social interactions. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: The new technological frameworks provide unprecedented opportunities for remote self-administered behavioral assessments but will be most productive in multidisciplinary teams to ensure the highest level of user satisfaction and data quality, and to guarantee the highestlevel of data protection.
Abstract: Behavioral assessment using smart devices affords novel methods, notably remote self-administration by the individuals themselves. However, this new approach requires navigating complex legal and technical terrain. Given the limited empirical data that currently exists, we provide and discuss anecdotes of the methodological, technical, legal, and cultural issues associated with an implementation in both U.S. and European settings of a mobile software application for regular psychological monitoring purposes. The tasks required participants to listen, watch, speak, and touch to interact with the smart device, thus assessing cognition, motor skill, and language. Four major findings merit mention: First, moving assessment out of the hands of a trained investigator necessitates excellent usability engineering, such that the tool is easily usable by the participant and the resulting data relevant to the investigator. Second, remote assessment requires that the data are transferred safely back to the investigator, and that risk of compromising participant confidentiality is minimized. Third, frequent data collection over long periods of time is associated with a possibility that participants may choose to withdraw consent for participation thus requiring data retraction. Fourth, data collection and analysis across international borders creates new challenges and new opportunities because of important cultural and language issues that may inform the underlying behavioral constructs of interest. In conclusion, the new technological frameworks provide unprecedented opportunities for remote self-administered behavioral assessments but will be most productive in multidisciplinary teams to ensure the highest level of user satisfaction and data quality, and to guarantee the highest level of data protection. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: Advances in applications of IRT to clinical measurement are reviewed in an effort to identify tangible improvements that can be attributed to the methodology.
Abstract: Item response theory (IRT) is moving to the forefront of methodologies used to develop, evaluate, and score clinical measures. Funding agencies and test developers are routinely supporting IRT work, and the theory has become closely tied to technological advances within the field. As a result, familiarity with IRT has grown increasingly relevant to mental health research and practice. But to what end? This article reviews advances in applications of IRT to clinical measurement in an effort to identify tangible improvements that can be attributed to the methodology. Although IRT shares similarities with classical test theory and factor analysis, the approach has certain practical benefits, but also limitations, when applied to measurement challenges. Major opportunities include the use of computerized adaptive tests to prevent conditional measurement error, multidimensional models to prevent misinterpretation of scores, and analyses of differential item functioning to prevent bias. Whereas these methods and technologies were once only discussed as future possibilities, they are now accessible because of recent support of IRT-focused clinical research. Despite this, much work still remains in widely disseminating methods and technologies from IRT into mental health research and practice. Clinicians have been reluctant to fully embrace the approach, especially in terms of prospective test development and adaptive item administration. Widespread use of IRT technologies will require continued cooperation among psychometricians, clinicians, and other stakeholders. There are also many opportunities to expand the methodology, especially with respect to integrating modern measurement theory with models from personality and cognitive psychology as well as neuroscience. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
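
Two of the building blocks behind the applications named above, adaptive testing and conditional measurement error, are the item response function and item information. A minimal sketch for the two-parameter logistic (2PL) model is given below; the item parameters are made up for illustration.

```python
# Minimal 2PL IRT sketch: item response function and item information.
# Computerized adaptive tests pick the item with the most information at the current theta.
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a keyed response given trait level theta under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1 - p)

theta_grid = np.linspace(-3, 3, 7)
items = [(1.8, -0.5), (1.0, 0.0), (2.2, 1.0)]   # hypothetical (discrimination a, difficulty b) pairs
for a, b in items:
    print(f"a = {a}, b = {b}: information over theta grid =",
          np.round(info_2pl(theta_grid, a, b), 2))
```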

Journal ArticleDOI
TL;DR: These narcissism adjective scales are well-suited for momentary narcissism assessment in studies wishing to examine fluctuations in grandiosity and vulnerability.
Abstract: There is growing interest in understanding the fluctuations in grandiose and vulnerable narcissistic states over time. Momentary data collection is vital in facilitating this new area of inquiry. Two narcissism adjective scales, the Narcissistic Grandiosity Scale and the Narcissistic Vulnerability Scale, have recently been developed for this purpose. In the present study, the validity of these 2 scales was examined across 3 different samples. Results indicate that these measures perform well psychometrically at both the momentary- and trait-level. Multilevel exploratory factor analyses reveal a clear 2-factor structure at both the within- and between-person level. Within-person correlations between these scales and other momentary scales (e.g., the Positive and Negative Affect Schedule) are consistent with theoretical expectations. Finally, individual differences in endorsement of both of these scales correlated with other dispositional measures, including existing narcissism measures (e.g., The Pathological Narcissism Inventory and The Five Factor Narcissism Inventory), at the between-person level in the expected manner. We conclude these scales are well-suited for momentary narcissism assessment in studies wishing to examine fluctuations in grandiosity and vulnerability. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: Self and observer ratings of symptoms as well as performance task data from 514 postsecondary students assessed for ADHD at a university-affiliated clinic were examined; requiring clinically significant ratings from multiple informants reduced the effect of noncredible data and reduced the number of diagnoses by approximately half.
Abstract: Prior research supports the use of multiple types of evidence from multiple sources when assessing ADHD in adults. However, limited research has examined how to best integrate the resulting set of data into a well-supported diagnostic conclusion. Moreover, clients sometimes overreport symptoms or display low effort on performance tasks, further complicating the interpretation of assessment data. The present study examined self-ratings and observer (e.g., parent) ratings of symptoms as well as performance task data from 514 postsecondary students assessed for ADHD at a university-affiliated clinic. Observer ratings were more reliable than self-ratings and were more likely to be corroborated by other data. The 2 types of ratings showed moderate to large relationships with each other as continuous variables (.32 < r < .52) while agreement around categorical symptom cutoffs was slight or fair (.12 < κ < .32). Both types of ratings showed only small relationships with a performance test designed to assess ADHD symptoms. Approximately half of the cases in the sample had at least 1 piece of potentially noncredible data (suggesting potential symptom overreporting, inconsistent responding, or inadequate effort). Requiring ratings from multiple informants (as opposed to a single informant) of clinically significant symptoms for a diagnosis substantially reduced the effect of noncredible data, while also reducing the number of diagnoses by approximately half. Implications of these and other findings for practice and future research are discussed. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
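
The contrast above between moderate-to-large continuous correlations and only slight-to-fair categorical agreement can be made concrete with Cohen's κ. The sketch below uses simulated self and observer ratings (hypothetical data, not the clinic sample) to show how dichotomizing at a clinical cutoff typically yields a κ well below the continuous correlation.

```python
# Illustrative sketch (hypothetical ratings, not the clinic data): agreement between
# self and observer reports expressed as a correlation and as Cohen's kappa at a cutoff.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(5)
latent = rng.normal(size=300)
self_rating = latent + rng.normal(scale=1.0, size=300)
observer_rating = latent + rng.normal(scale=1.0, size=300)

r = np.corrcoef(self_rating, observer_rating)[0, 1]      # continuous agreement
self_flag = (self_rating > 1.0).astype(int)              # "clinically significant" cutoff
obs_flag = (observer_rating > 1.0).astype(int)
kappa = cohen_kappa_score(self_flag, obs_flag)
print(f"r = {r:.2f}, kappa at cutoff = {kappa:.2f}")      # kappa comes out noticeably lower here
```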

Journal ArticleDOI
TL;DR: A framework capable of detecting depression with minimal human intervention is presented: artificial intelligence mental evaluation (AiME), which consists of a short human-computer interactive evaluation that utilizes artificial intelligence, namely deep learning, and can predict whether the participant is depressed or not with satisfactory performance.
Abstract: Machine learning (ML) has been introduced into the medical field as a means to provide diagnostic tools capable of enhancing accuracy and precision while minimizing laborious tasks that require human intervention. There is mounting evidence that the technology fueled by ML has the potential to detect and substantially improve treatment of complex mental disorders such as depression. We developed a framework capable of detecting depression with minimal human intervention: artificial intelligence mental evaluation (AiME). This framework consists of a short human-computer interactive evaluation that utilizes artificial intelligence, namely deep learning, and can predict whether the participant is depressed or not with satisfactory performance. Because of its ease of use, this technology can offer a viable tool for mental health professionals to identify symptoms of depression, thus enabling a faster preventative intervention. Furthermore, it may alleviate the challenge of observing and interpreting highly nuanced physiological and behavioral biomarkers of depression by providing a more objective evaluation. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: The PSCD is a promising alternative measure for assessing early manifestation of the broader construct of psychopathy in children and its use should facilitate discussion of the conceptualization, assessment, predictive value, and clinical usefulness of the psychopathic construct as it relates to CD at early developmental stages.
Abstract: The Proposed Specifiers for Conduct Disorder (PSCD) scale (Salekin & Hare, 2016) was developed as a measure of the broader construct of psychopathy in childhood and adolescence. In addition to conduct disorder (CD) symptoms, the PSCD addresses the interpersonal (grandiose-manipulative), affective (callous-unemotional), and lifestyle (daring-impulsive) traits of psychopathic personality. The PSCD can be scored by parents and teachers. The present study is a preliminary test of the psychometric properties of the PSCD-Parent Version in a sample of 2,229 children aged 3 to 6 years. Confirmatory factor analyses supported both a 3- and 4-factor structure being invariant across gender groups. The validity of the PSCD was also supported by convergent-divergent associations with an alternative measure of psychopathic traits as well as by the expected relations with fearlessness, conduct problems, reactive and proactive aggression, attention-deficit hyperactivity disorder and oppositional defiant disorder symptoms, and social competence skills. Overall, the PSCD is a promising alternative measure for assessing early manifestation of the broader construct of psychopathy in children. Its use should facilitate discussion of the conceptualization, assessment, predictive value, and clinical usefulness of the psychopathic construct as it relates to CD at early developmental stages. (PsycINFO Database Record (c) 2019 APA, all rights reserved).

Journal ArticleDOI
TL;DR: Examination of how external validity is influenced by a trait scale’s internal characteristics, such as its length, width, and balance, finds that broad trait scales tend to have slightly stronger, and much more consistent, associations with external validity criteria than do narrow scales.
Abstract: How well can scores on a personality scale predict criteria such as behaviors and life outcomes? This question concerns external validity, which is a core aspect of personality assessment. The present research was conducted to examine how external validity is influenced by a trait scale's internal characteristics, such as its length (number of items), width (breadth of content), and balance (between positively and negatively keyed items). Participants completed the Big Five Inventory-2 (BFI-2), and were also assessed on a set of self-reported and peer-reported validity criteria. We used the BFI-2 item pool to construct multiple versions, or iterations, of each Big Five trait scale that varied in terms of length, width, and balance. We then identified systematic effects of these internal scale characteristics on external validity associations. Regarding length, we find that longer trait scales tend to have greater validity, with a scale length "sweet spot" of approximately 6 to 9 items. Regarding width, we find that broad trait scales tend to have slightly stronger, and much more consistent, associations with external validity criteria than do narrow scales; broad scales thus represent relatively safe bets for personality assessment, whereas narrow scales carry greater risks but offer potentially greater rewards. Regarding balance, we find that associations between imbalanced trait and criterion scales can be substantially inflated or suppressed by acquiescent responding; trait scales that include an equal number of positively and negatively keyed items can minimize such acquiescence bias. We conclude by translating these findings into practical advice regarding psychological assessment. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
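
The balance finding above reflects a simple mechanism: when a scale mixes positively and negatively keyed items, an individual's yea-saying tendency inflates both, so it cancels out of a reverse-scored total but can be indexed by the raw item mean. The sketch below demonstrates this with simulated responses; the keying structure and effect sizes are invented for illustration.

```python
# Illustrative simulation (not BFI-2 data): with a balanced scale, the raw mean across
# positively and negatively keyed items indexes acquiescence, while the reverse-scored
# total largely cancels it out.
import numpy as np

rng = np.random.default_rng(6)
n = 400
trait = rng.normal(size=n)
acquiescence = rng.normal(scale=0.5, size=n)                   # yea-saying tendency

pos = 3 + trait[:, None] + acquiescence[:, None] + rng.normal(size=(n, 3))   # 3 positively keyed items
neg = 3 - trait[:, None] + acquiescence[:, None] + rng.normal(size=(n, 3))   # 3 negatively keyed items

acq_index = np.column_stack([pos, neg]).mean(axis=1)           # trait cancels, acquiescence remains
balanced_score = np.column_stack([pos, 6 - neg]).mean(axis=1)  # reverse-score around the midpoint

print(f"r(acquiescence index, true acquiescence) = {np.corrcoef(acq_index, acquiescence)[0, 1]:.2f}")
print(f"r(balanced score, true trait)            = {np.corrcoef(balanced_score, trait)[0, 1]:.2f}")
print(f"r(balanced score, true acquiescence)     = {np.corrcoef(balanced_score, acquiescence)[0, 1]:.2f}")
```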

Journal ArticleDOI
TL;DR: The findings support the generalizability of the PID-5 factor structure, suggesting the replicability of Negative Affectivity, Detachment, Antagonism, Disinhibition, and Psychoticism factors across different samples, translations, age groups, and nations.
Abstract: The present study aimed at quantitatively synthesizing published studies on the replicability of the Personality Inventory for Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5; PID-5) domain factor structure in U.S. and non-U.S. cultural contexts. A literature search was conducted, and 23 studies based on 25 samples (N = 24,240) were included. Seven studies provided data on the factor replicability of the PID-5 in the U.S. and 16 studies yielded PID-5 factor replicability data in non-U.S. countries. The majority (n = 17, 68.0%) of the studies were based on community/student samples. Median congruence coefficient (CC) values ranged from .92 to .98 in U.S. studies, and from .91 to .97 in non-U.S. studies. No significant effect of sample type, translation, and geographic area on CC values was observed. Meta-analytic structural equation modeling results supported the homogeneity of the PID-5 scale correlation matrices across both U.S. studies, root mean square error of approximation (RMSEA) = .039, and non-U.S. studies, RMSEA = .045. Dimensionality analyses of the pooled correlation matrix provided evidence for a 5-factor structure of the PID-5 scales in both U.S. and non-U.S. studies; the resulting factor loading matrices were highly similar to the normative U.S. factor loading matrix. As a whole, our findings support the generalizability of the PID-5 factor structure, suggesting the replicability of Negative Affectivity, Detachment, Antagonism, Disinhibition, and Psychoticism factors across different samples, translations, age groups, and nations. Further studies on samples from countries outside Western Europe, as well as from specific populations, are needed before drawing definitive conclusions. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
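
The replicability index summarized above, Tucker's congruence coefficient, is straightforward to compute from two factor loading vectors. A minimal sketch follows; the loadings are invented, not PID-5 values, and values of roughly .95 or higher are conventionally read as indicating factor equality.

```python
# Minimal sketch: Tucker's congruence coefficient (CC) between two factor loading
# vectors. The loadings are hypothetical, not the PID-5 normative values.
import numpy as np

def congruence(x, y):
    return np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))

normative_loadings   = np.array([0.72, 0.65, 0.70, 0.58, 0.61])   # hypothetical normative loadings
replication_loadings = np.array([0.69, 0.60, 0.74, 0.55, 0.66])   # hypothetical replication loadings
print(f"CC = {congruence(normative_loadings, replication_loadings):.3f}")
```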