
Showing papers in "Educational and Psychological Measurement in 2000"


Journal ArticleDOI
TL;DR: In this article, a meta-analysis explores factors associated with higher response rates in electronic surveys reported in both published and unpublished research and concludes that response representativeness is more important than response rate in survey research.
Abstract: Response representativeness is more important than response rate in survey research. However, response rate is important if it bears on representativeness. The present meta-analysis explores factors associated with higher response rates in electronic surveys reported in both published and unpublished research. The number of contacts, personalized contacts, and precontacts are the factors most associated with higher response rates in the Web studies that are analyzed.

2,520 citations


Journal ArticleDOI
TL;DR: In this article, a 2 × 3 design in which item stem direction and item response pattern direction were crossed was used to determine effects on internal consistency reliability as measured by Cronbach's alpha.
Abstract: The controversy over using reverse or negatively worded survey stems has persisted for several decades; the practice is of questionable utility and is intended to guard against acquiescence or response set behaviors. A 2 × 3 design in which item stem direction and item response pattern direction were crossed was used to determine effects on internal consistency reliability as measured by Cronbach’s alpha. The highest alpha occurred when all directly worded stems were used with bidirectional response options; its alpha reflected at least 10%, and in one case 20%, higher internal consistency than any of the three conditions in which negatively worded stems were used. This indicates that using all directly worded stems, with half of the response options running in one direction and half in the other, may be a better way of guarding against acquiescence and response set behaviors than using items with negatively worded stems.
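
To make the alpha comparison concrete, here is a minimal sketch (not the authors' code; the data and the single reversed stem are simulated) of how Cronbach's alpha is computed and why a negatively worded item must be reverse-scored before the computation:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Simulated 5-point Likert responses to 4 directly worded items.
rng = np.random.default_rng(0)
trait = rng.normal(3, 0.8, size=(200, 1))
direct = np.clip(np.round(trait + rng.normal(0, 0.7, size=(200, 4))), 1, 5)

# Pretend item 0 had a negatively worded stem: its raw scores run opposite
# to the trait, and alpha drops sharply unless it is reverse-scored
# (6 - x on a 1-to-5 scale) before the computation.
mixed = direct.copy()
mixed[:, 0] = 6 - mixed[:, 0]
print(f"all direct:        {cronbach_alpha(direct):.2f}")
print(f"one reversed, raw: {cronbach_alpha(mixed):.2f}")
mixed[:, 0] = 6 - mixed[:, 0]   # reverse-score it back
print(f"after recoding:    {cronbach_alpha(mixed):.2f}")
```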

487 citations


Journal ArticleDOI
TL;DR: In this article, the authors note that although empowerment is a popular management practice, there has been little research identifying the empowering behaviors of leaders, and they discuss the development of an instrument designed to measure such behaviors.
Abstract: Empowerment is a popular management practice, but there has been little research to identify empowering behaviors of leaders. The present article discusses the development of an instrument designed...

379 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a manifesto regarding the nature of score reliability and what are reasonable expectations for psychometric reporting practices in substantive inquiries, and explore the consequences of misunderstandings about score reliability.
Abstract: The present article responds to selected criticisms of some EPM editorial policies and Vacha-Haase’s “reliability generalization” meta-analytic methods. However, the treatment is more broadly a manifesto regarding the nature of score reliability and what are reasonable expectations for psychometric reporting practices in substantive inquiries. The consequences of misunderstandings of score reliability are explored. It is suggested that paradigmatic misconceptions regarding psychometric issues feed into a spiral of presumptions that measurement training is unnecessary for doctoral students, which in turn further reinforces misunderstandings of score integrity issues.

339 citations


Journal ArticleDOI
TL;DR: The authors compared the original intrinsic and extrinsic subscales of the Minnesota Satisfaction Questionnaire short form to revised subscales using data from two samples, and found that revising the intrinsic and extrinsic subscales made little difference in the results obtained.
Abstract: This study compared the original intrinsic and extrinsic subscales of the Minnesota Satisfaction Questionnaire short form to revised subscales using data from two samples. The revised subscales were formed according to critiques by several researchers. Confirmatory factor analysis of the original and revised subscales supported the discriminant validity of scores on the intrinsic and extrinsic job satisfaction measures. Several hierarchical regression models were tested that included job involvement, overall job satisfaction, and volitional absence variables, in addition to the job satisfaction components. The analyses from both samples indicated that revising the intrinsic and extrinsic subscales made little difference in the results obtained.

329 citations


Journal ArticleDOI
TL;DR: In this article, the authors used Monte Carlo methods to assess the per-contrast and experimentwise Type I error rates of two post hoc tests of cellwise residuals and four post hoc tests of pairwise contrasts in 3 × 4 chi-square contingency tables.
Abstract: The authors used Monte Carlo methods to assess the per-contrast and experimentwise Type I error rates of two post hoc tests of cellwise residuals and four post hoc tests of pairwise contrasts in 3 × 4 chi-square contingency tables. The six post hoc procedures were evaluated under three sample sizes and under the null hypotheses of independence and homogeneity. Results of the study indicate that the cellwise adjusted residual method provided adequate experimentwise Type I error rate control when appropriate adjustments to the alpha level were made, and the Gardner pairwise post hoc procedure provided several advantages over the other pairwise procedures. This was true for both the independence and homogeneity models.
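
The Monte Carlo logic is straightforward to sketch. Below is a hypothetical, simplified version (uniform cell probabilities under the independence null, Haberman-style adjusted residuals, and a Bonferroni-type alpha adjustment across the 12 cells of a 3 × 4 table), not a reproduction of the authors' simulation:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
R, C, N, REPS, ALPHA = 3, 4, 300, 5000, 0.05
crit = norm.ppf(1 - ALPHA / (2 * R * C))   # Bonferroni-type cellwise adjustment

def adjusted_residuals(table):
    """Haberman adjusted residuals for a two-way contingency table."""
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / n
    return (table - expected) / np.sqrt(
        expected * (1 - row / n) * (1 - col / n))

hits = 0
for _ in range(REPS):
    # Null of independence with uniform cell probabilities.
    counts = rng.multinomial(N, np.full(R * C, 1 / (R * C))).reshape(R, C)
    if np.any(np.abs(adjusted_residuals(counts)) > crit):
        hits += 1   # at least one false rejection somewhere in this table

print(f"experimentwise Type I error rate ~ {hits / REPS:.3f}")
```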

319 citations


Journal ArticleDOI
TL;DR: This article examined the frequency of use of various types of reliability coefficients for a systematically drawn sample of 696 tests appearing in the APA-published Directory of Unpublished Experimental Mental Measures.
Abstract: This study examined the frequency of use of various types of reliability coefficients for a systematically drawn sample of 696 tests appearing in the APA-published Directory of Unpublished Experimental Mental Measures. Almost all articles included some type of reliability report for at least one test administration. Coefficient alpha was the overwhelming favorite among types of coefficients. Several measures treated almost universally in psychological-testing textbooks were rarely or never used. Problems encountered in the study included ambiguous designations of types of coefficients, reporting reliability based on a study other than the one cited, inadequate information about subscales, and simply incorrect recording of the information given in an original source.

234 citations


Journal ArticleDOI
TL;DR: In this paper, two studies were conducted to develop and provide evidence supporting the construct validity of scores on a scale to measure two aspects of workplace friendship: friendship prevalence and friendship opportunities.
Abstract: Two studies were conducted to develop and provide evidence supporting the construct validity of scores on a scale to measure two aspects of workplace friendship: friendship prevalence and friendship opportunities. In the first study, data collected from 200 part-time graduate students supported the internal consistency and proposed dimensionality of scale scores. In the second study, data were collected from a total sample of 116, which consisted of part-time graduate students and employees of three organizations. Support was provided for convergent, discriminant, and nomological validity of scale scores.

233 citations


Journal ArticleDOI
TL;DR: In this paper, a total of 848 coefficients of stability and 1,359 internal consistency reliabilities were examined across the Big Five factors of personality: Emotional Stability, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness.
Abstract: Meta-analysis was used to cumulate reliabilities of personality scale scores. A total of 848 coefficients of stability and 1,359 internal consistency reliabilities across the Big Five factors of personality were examined. The frequency-weighted mean coefficients of stability were .75 (SD = .10, K = 221), .76 (SD = .12, K = 176), .71 (SD = .13, K = 139), .69 (SD = .14, K = 119), and .72 (SD = .13, K = 193) for Emotional Stability, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness, respectively. The corresponding internal consistency reliabilities were .78 (SD = .11, K = 370), .78 (SD = .09, K = 307), .73 (SD = .12, K = 251), .75 (SD = .11, K = 123), and .78 (SD = .10, K = 307). Sample-size-weighted means also were computed. The dimension of personality being rated does not appear to strongly moderate either the internal consistency or the test-retest reliabilities. Implications for personality assessment are discussed.
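
As a small worked example of the two cumulation schemes, the sketch below uses made-up coefficients; "frequency-weighted" is read here, as one plausible interpretation, as each reported coefficient counting once, while the sample-size-weighted mean weights each coefficient by its study's N:

```python
import numpy as np

# Made-up reliability coefficients and their study sample sizes.
r = np.array([0.78, 0.74, 0.81, 0.69])
n = np.array([120, 300, 85, 410])

freq_weighted = r.mean()               # each of the K coefficients counts once
n_weighted = np.average(r, weights=n)  # large-sample studies count more
print(f"frequency-weighted mean: {freq_weighted:.3f}")
print(f"sample-size-weighted mean: {n_weighted:.3f}")
```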

222 citations


Journal ArticleDOI
TL;DR: In this paper, reliability estimates for Beck Depression Inventory (BDI) scores across studies were accumulated and summarized in a meta-analysis, indicating that the logic of “test score reliability” generally has not prevailed in clinical psychology in applications of the BDI.
Abstract: The reliability estimates for the Beck Depression Inventory (BDI) scores across studies were accumulated and summarized in a meta-analysis. Only 7.5% of the articles reviewed reported meaningful reliability estimates, indicating that the logic of “test score reliability” generally has not prevailed in clinical psychology in applications of the BDI. Analyses revealed that for the BDI, the measurement error due to time sampling, as captured by test-retest reliability estimates, is considerably larger than the measurement error due to item heterogeneity and content sampling, as captured by internal consistency reliability estimates. Also, reliability estimates involving substance addicts were consistently lower than reliability estimates involving normal subjects, possibly due to restriction of range problems. Correlation analyses revealed that standard errors of measurement (SEMs) were not correlated with reliability estimates but were substantially related to standard deviations of BDI scores, suggesting that SEM...
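
The last finding follows from the classical test theory identity SEM = SD × sqrt(1 − r_xx): over the fairly narrow range of reliabilities a published scale typically shows, sqrt(1 − r_xx) varies far less than score SDs do across samples. A tiny illustration with invented numbers:

```python
import math

def sem(sd: float, rxx: float) -> float:
    """Standard error of measurement: SEM = SD * sqrt(1 - r_xx)."""
    return sd * math.sqrt(1 - rxx)

# Invented values: reliability varies modestly, SD varies a lot.
print(sem(sd=6.0, rxx=0.86))   # ~2.25
print(sem(sd=6.0, rxx=0.78))   # ~2.81: lower reliability, modest SEM change
print(sem(sd=10.0, rxx=0.86))  # ~3.74: the larger SD dominates the SEM
```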

207 citations


Journal ArticleDOI
TL;DR: In this article, the authors assess several long-standing criticisms of the Social Readjustment Rating Scale (SRRS), which despite such criticism remains one of the most widely cited measurement instruments in the stress literature.
Abstract: Despite criticism, the Social Readjustment Rating Scale (SRRS) is one of the most widely cited measurement instruments in the stress literature. This research assesses several criticisms of the SRRS after years of widespread use. Specifically, the authors evaluate content-related criticisms, including differential prediction of desirable relative to undesirable life events, controllable relative to uncontrollable life events, and contaminated relative to uncontaminated life event items. On balance, the authors find that the SRRS is a useful tool for stress researchers and practitioners.

Journal ArticleDOI
TL;DR: In this article, the authors show that seemingly equivalent ways of specifying the four-parameter design matrix for the two-phase interrupted time-series design do not lead to the same conclusions: tests and estimates of level change differ dramatically across specifications.
Abstract: It has been recognized that the two-phase version of the interrupted time-series design can be frequently modeled using a four-parameter design matrix. There are differences across writers, however, in the details of the recommended design matrices to be used in the estimation of the four parameters of the model. Various writers imply that different methods of specifying the four-parameter design matrix all lead to the same conclusions; they do not. The tests and estimates for level change are dramatically different under the various seemingly equivalent design specifications. Examples of egregious errors of interpretation are presented and recommendations regarding the correct specification of the design matrix are made. The recommendations hold whether the model is estimated using ordinary least squares (for the case of approximately independent errors) or some more complex time-series approach (for the case of autocorrelated errors).
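
A minimal sketch of the specification issue, using simulated data (the centered design matrix below is one common choice, not necessarily the exact matrix any particular writer recommends):

```python
import numpy as np

n1, n2 = 10, 10                    # baseline and post-intervention lengths
t = np.arange(1, n1 + n2 + 1)      # time index 1..20
D = (t > n1).astype(float)         # phase indicator: 0 = baseline, 1 = post

# Four-parameter design matrix: [intercept, time, level change, slope change].
# Centering the slope-change regressor at the first post-intervention point
# makes the level-change coefficient the displacement at that point; the
# uncentered version silently redefines what "level change" estimates.
X_centered = np.column_stack([np.ones_like(t), t, D, D * (t - (n1 + 1))])
X_naive = np.column_stack([np.ones_like(t), t, D, D * t])

rng = np.random.default_rng(1)
y = 2 + 0.5 * t + 5.0 * D + 0.3 * D * (t - (n1 + 1)) + rng.normal(0, 1, t.size)

for name, X in [("centered", X_centered), ("naive", X_naive)]:
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(name, "level-change estimate:", round(b[2], 2))
```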

Journal ArticleDOI
TL;DR: In this paper, the authors investigated empirically exactly how dissimilar, in both composition and variability, samples inducting reliability coefficients from prior studies were from the cited prior samples from which the coefficients were generalized, challenging the mind-set that reliability, once proven, is immutable.
Abstract: As measurement specialists, we have done a disservice to both ourselves and our profession by habitually referring to “the reliability of the test,” or saying that “the test is reliable.” This has created a mind-set implying that reliability, once proven, is immutable. More important, practitioners and scholars need not know measurement theories if they may simply rely on the reliability purportedly intrinsic within all uses of established measures. The present study investigated empirically exactly how dissimilar in both composition and variability samples inducting reliability coefficients from prior studies were from the cited prior samples from which coefficients were generalized.

Journal ArticleDOI
TL;DR: Reliability generalization is a meta-analytic method for examining the variability in the reliability of scores by determining which sample characteristics are related to differences in score reliability; applying it to 51 samples employing the NEO personality scales, the authors found a large amount of variability in the reliability of NEO scores, both between and within personality domains.
Abstract: A reliability generalization of 51 samples employing one of the NEO personality scales was conducted. Reliability generalization is a meta-analytic method for examining the variability in the reliability of scores by determining which sample characteristics are related to differences in score reliability. It was found that there was a large amount of variability in the reliability of NEO scores, both between and within personality domains. The sample characteristics that are related to score reliability were dependent on NEO domain. Agreeableness scores appear to be the weakest of the domains assessed by the NEO scales in terms of reliability, particularly in clinical samples, for male-only samples, and when temporal consistency was the criterion for reliability. The reliability of Openness to Experience scores was low when the NEO-Five Factor Inventory was used. The advantages of conceptualizing reliability as a property of scores, and not tests, are discussed.
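
The core reliability generalization step can be sketched as a regression of reported reliability coefficients on dummy-coded sample characteristics. Everything below (coefficients, predictors, coding) is hypothetical:

```python
import numpy as np

# Hypothetical RG data set: one row per reported reliability coefficient.
# Columns: reliability, clinical sample (0/1), male-only (0/1), test-retest (0/1).
data = np.array([
    [0.86, 0, 0, 0],
    [0.71, 1, 0, 1],
    [0.79, 0, 1, 0],
    [0.68, 1, 1, 1],
    [0.83, 0, 0, 1],
    [0.75, 1, 0, 0],
])
r, predictors = data[:, 0], data[:, 1:]

# Dummy-coded regression of coefficients on sample characteristics: the
# slopes estimate how much each characteristic shifts score reliability.
X = np.column_stack([np.ones(len(r)), predictors])
b, *_ = np.linalg.lstsq(X, r, rcond=None)
print(dict(zip(["intercept", "clinical", "male_only", "retest"], b.round(3))))
```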

Journal ArticleDOI
TL;DR: In this paper, the authors report the multistage development of the Community Service Attitudes Scale (CSAS), an instrument based on Schwartz’s helping behavior model for measuring college students’ attitudes about community service; CSAS scale scores were related to gender, college major, community service experience, and intentions to engage in community service.
Abstract: This study reports the multistage development of the Community Service Attitudes Scale (CSAS), an instrument for measuring college students’ attitudes about community service. The CSAS was developed based on Schwartz’s helping behavior model. Scores on the scales of the CSAS yielded strong reliability evidence (coefficient alphas ranging from .72 to .93). Principal components analysis yielded results consistent with the Schwartz model. In addition, the CSAS scale scores were positively correlated with gender, college major, community service experience, and intentions to engage in community service. The CSAS will be useful to researchers for conducting further research on the effects of service learning and community service experiences for students.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a bootstrap-F method for one-way repeated measure ANOVA design using a Monte Carlo approach in which sample size, nonsphericity, and sample complexity were taken into account.
Abstract: The current article proposes a bootstrap-F method and a bootstrap-T2 method for use in a one-way repeated measure ANOVA design. Using a Monte Carlo approach in which sample size, nonsphericity, and...

Journal ArticleDOI
TL;DR: In this paper, the authors meta-analytically synthesized studies that have investigated the extent to which individuals can inflate their integrity test scores when coached or instructed to fake good.
Abstract: Although it has been consistently found that test takers can effectively fake good on self-report noncognitive measures when instructed to do so, not all measures are equally susceptible. The present review meta-analytically synthesized studies that have investigated the extent to which individuals can inflate their integrity test scores when coached or instructed to fake good. Both overt and personality-based integrity tests were investigated. Results indicated that the overt test was especially susceptible to both fake good (d = 0.90) and coaching (d = 1.32) instructions. Personality-based measures appeared to be more resistant to both faking good (d = 0.38) and coaching (d = 0.36). Implications of these results for integrity testing are discussed.

Journal ArticleDOI
TL;DR: This article examined the validity of scores on the Multigroup Ethnic Identity Measure (MEIM) in a group of 275 academically talented adolescents attending a summer enrichment program and found that the MEIM was a two-factor measure.
Abstract: This study examined the validity of scores on the Multigroup Ethnic Identity Measure (MEIM) in a group of 275 academically talented adolescents attending a summer enrichment program. The two-factor...


Journal ArticleDOI
TL;DR: In this paper, the authors proposed an improvement-over-chance classification (I) index, which can be used in situations that are univariate, multivariate, homogeneous, heterogeneous, or any combination thereof.
Abstract: The research content of interest herein is that of comparison of means. It is generally recognized that statistical test p values do not adequately reflect mean comparison assessments. What is desirable is some effect-size assessment. The typical effect-size indexes used in mean comparisons are restricted to the variance homogeneity condition. What is proposed here is the use of the group-overlap concept. Group overlap may be assessed via prediction of group assignment, that is, using predictive discriminant analysis. The effect-size index proposed is that of improvement-over-chance classification (I). The I index may be used in situations that are univariate, multivariate, homogeneous, heterogeneous, or any combination thereof. Some very tentative suggestions for cutoffs of I values to define index magnitude for some data situations are made.
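
A rough sketch of computing I, with hypothetical two-group data, a cross-validated hit rate from a linear classification rule, and the proportional chance criterion as the chance baseline:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(7)
# Hypothetical two-group data on two predictor variables.
g1 = rng.normal(0.0, 1.0, size=(60, 2))
g2 = rng.normal(0.8, 1.5, size=(40, 2))
X = np.vstack([g1, g2])
y = np.array([0] * 60 + [1] * 40)

# Hit rate from predictive discriminant analysis, cross-validated so the
# classification of each case does not reuse that case for estimation.
pred = cross_val_predict(LinearDiscriminantAnalysis(), X, y, cv=10)
hit = (pred == y).mean()

# Proportional chance criterion: the hit rate expected by chance alone.
e = sum(np.mean(y == g) ** 2 for g in np.unique(y))
I = (hit - e) / (1 - e)   # improvement-over-chance classification index
print(f"hit rate {hit:.3f}, chance {e:.3f}, I = {I:.3f}")
```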

Journal ArticleDOI
TL;DR: The results of the study are that a reduction of at least 22% in the mean number of items can be expected in a computerized adaptive test (CAT) compared to an existing paper-and-pencil placement test.
Abstract: The objective of this study was to explore the possibilities for using computerized adaptive testing in situations in which examinees are to be classified into one of three categories. Testing algorithms with two different statistical computation procedures are described and evaluated. The first computation procedure is based on statistical testing and the other on statistical estimation. Item selection methods based on maximum information (MI) considering content and exposure control are considered. The measurement quality of the proposed testing algorithms is reported. The results of the study are that a reduction of at least 22% in the mean number of items can be expected in a computerized adaptive test (CAT) compared to an existing paper-and-pencil placement test. Furthermore, statistical testing is a promising alternative to statistical estimation. Finally, it is concluded that imposing constraints on the MI selection strategy does not negatively affect the quality of the testing algorithms.
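
The maximum-information selection step itself is simple to sketch. Below is a hypothetical 2PL version; the ability update from observed responses, the three-category decision rule, and the content and exposure constraints the article considers are all omitted:

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL model: P(correct | theta) = 1 / (1 + exp(-a(theta - b)))."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def info(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p2pl(theta, a, b)
    return a ** 2 * p * (1 - p)

# Hypothetical item bank: discriminations a and difficulties b.
rng = np.random.default_rng(3)
a = rng.uniform(0.8, 2.0, size=50)
b = rng.normal(0.0, 1.0, size=50)

administered = set()
theta_hat = 0.0   # provisional ability estimate (updating it is omitted here)
for step in range(5):
    # MI selection: administer the unused item most informative at theta_hat.
    candidates = [i for i in range(50) if i not in administered]
    best = max(candidates, key=lambda i: info(theta_hat, a[i], b[i]))
    administered.add(best)
    print(step, best, round(float(info(theta_hat, a[best], b[best])), 3))
```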

Journal ArticleDOI
TL;DR: In this paper, the authors report the results of several psychometric analyses that were conducted to provide evidence of construct validity for scores on a measure of psychological empowerment, the Psychological Empowerment Scale (PES), for parents of children with a disability.
Abstract: This article reports the results of several psychometric analyses that were conducted to provide evidence of construct validity for scores on a measure of psychological empowerment, the Psychological Empowerment Scale (PES), for parents of children with a disability. Confirmatory factor analyses were conducted to evaluate the internal structure of the PES and the reliability of its scores. The results of the confirmatory factor analyses provided evidence of convergent and discriminant validity for the scores from the four subscales underlying the PES: (a) attitudes of control and competence, (b) cognitive appraisals of critical skills and knowledge, (c) formal participation in organizations, and (d) informal participation in social systems and relationships. Reliability coefficients for the subscale scores and total scale score ranged from .90 to .97. In addition, the PES scores were correlated with other empowerment-related measures. The results of these correlational analyses and group discrimination an...

Journal ArticleDOI
TL;DR: In this article, three methods of measuring self-efficacy were compared: traditional, Likert, and a simplified scale. Scores on the three scales had highly similar reliability and validity and were strongly related.
Abstract: Three methods of measuring self-efficacy were compared: traditional, Likert, and a simplified scale. Scores on the three scales had highly similar reliability and validity and were strongly related. The Likert and simplified scales required 50% and 70% (respectively) fewer participant responses than the traditional format, whereas the traditional and Likert formats provided more specific diagnostic information.

Journal ArticleDOI
TL;DR: In this article, the authors review issues regarding test reliability, which is psychometric terminology, and score reliability, which is score-centric terminology; these issues have arisen in part from some EPM editorial policies and Vacha-Haase’s “reliability generalization” proposal.
Abstract: The present article reviews issues regarding test reliability, which is psychometric terminology, and score reliability, which is score-centric terminology. These issues have arisen, in part, due to some EPM editorial policies and Vacha-Haase’s “reliability generalization” proposal. The article includes (a) a brief historical review of reliability terminology, (b) discussion on the emergence of datametrics (loosely defined as the application of psychometry to scores as opposed to an instrument), including a review of textbook authors’ uses of psychometric versus datametric terminology, (c) discussion of problems with datametrics, and (d) a critique of Vacha-Haase’s proposed meta-analytic reliability generalization via dummy-coded regression. The article concludes with a brief summary that presents several suggestions.

Journal ArticleDOI
TL;DR: This paper investigated whether the change of response order in a Likert-type scale altered participant responses and scale characteristics and found that response order had no substantial influence on participant responses or scale characteristics.
Abstract: The study investigated whether the change of response order in a Likert-type scale altered participant responses and scale characteristics. Response order is the order in which options of a Likert-type scale are offered. The sample included 490 college students and 368 junior high school students. Scale means with different response orders were compared. Structural equation modeling was used to test the invariance of interitem correlations, covariances, and factor structure across scale formats and educational levels. The results indicated that response order had no substantial influence on participant responses and scale characteristics. Motivating participants and avoiding ambiguous items may minimize possible effects of scale format on participant responses and scale properties.

Journal ArticleDOI
TL;DR: In this paper, a meta-analysis was conducted to determine the extent to which computer administration of a measure influences socially desirable responding, and a small but statistically significant effect was found for impression management, with impression management being lower when assessed by computer.
Abstract: A meta-analysis was conducted to determine the extent to which the computer administration of a measure influences socially desirable responding. Social desirability was defined as consisting of two components: impression management and self-deceptive enhancement. A small but statistically significant effect (d = -0.08) was found for impression management, with impression management being lower when assessed by computer. Correlational analysis revealed, however, that the strength of the effect of computer administration on impression management appeared to diminish over time such that more recent studies have found small or no effects. Consistent with its conceptualization, reports of self-deceptive enhancement did not differ by testing format. The implications of these findings are discussed in terms of how they contribute to the explication of the construct of social desirability and cross-mode equivalence.

Journal ArticleDOI
TL;DR: The historical growth in the popularity of statistical significance testing is examined using a random sample of annual data from 12 American Psychological Association (APA) journals as mentioned in this paper, and the results replicate and extend the findings of Hubbard, Parsa, and Luthy, who used data from only the Journal of Applied Psychology.
Abstract: The historical growth in the popularity of statistical significance testing is examined using a random sample of annual data from 12 American Psychological Association (APA) journals. The results replicate and extend the findings of Hubbard, Parsa, and Luthy, who used data from only the Journal of Applied Psychology. The results also confirm Gigerenzer and Murray’s allegation that an inference revolution occurred in psychology between 1940 and 1955. An assessment of the future prospects for statistical significance testing is offered. It is concluded that replication with extension research, and its connections with meta-analysis, is a better vehicle for developing a cumulative knowledge base in the discipline than statistical significance testing. It is conceded, however, that statistical significance testing is likely here to stay.

Journal ArticleDOI
TL;DR: In this paper, the authors note that despite evidence that Allen and Meyer’s scales measure three-component commitment in a reliable and valid manner, the literature contains recurring criticisms of several scale items.
Abstract: Despite evidence that Allen and Meyer’s scales measure three-component commitment in a reliable and valid manner, the literature contains recurring criticism of several scale items. Criticisms refe...

Journal ArticleDOI
TL;DR: In this article, the authors show that ANCOVA and regression both exhibit a directional bias when measuring correlates of change, a bias that confounds the comparison of changes between naturally occurring groups with large pretest differences.
Abstract: ANCOVA and regression both exhibit a directional bias when measuring correlates of change. This bias confounds the comparison of changes between naturally occurring groups with large pretest differ...

Journal ArticleDOI
TL;DR: The differential functioning of items and tests (DFIT) framework was used to examine the measurement equivalence of a Spanish translation of the Sixteen Personality Factor (16PF) Questionnaire as mentioned in this paper.
Abstract: The differential functioning of items and tests (DFIT) framework was used to examine the measurement equivalence of a Spanish translation of the Sixteen Personality Factor (16PF) Questionnaire. The questionnaire was administered in English to English-speaking Anglo-Americans and English-dominant Hispanic Americans and in Spanish to Spanish-dominant Hispanic Americans and Spanish-speaking Mexican nationals. As expected, the compensatory differential item functioning/differential test functioning (CDIF/DTF) procedure, which accounts for CDIF at the scale level, flagged fewer items as differential functioning than did the noncompensatory differential item functioning (NCDIF) procedure. Results did not support the hypothesis that DIF would be greatest in the Anglo versus Spanish-speaker comparison followed by the Hispanic versus Spanish-speaker comparison and least in the Anglo versus Hispanic comparison. Advantages of using the DFIT framework in assessing test translations, especially for test developers, ar...