
Showing papers in "Educational and Psychological Measurement in 1994"


Journal ArticleDOI
TL;DR: In this paper, two alternative methods for parceling questionnaire items for use in confirmatory analyses are presented. The first requires that parcels pass a minimum standard of reliability and provide indications of unidimensionality to be retained for analysis; the second requires that parcels be equally representative of the multiple aspects of a domain.
Abstract: Two alternative methods for parceling questionnaire items for use in confirmatory analyses are presented. The first method requires that parcels must (a) pass a minimum standard of reliability and (b) provide indications of unidimensionality to be retained for analysis. The second method requires that parcels be equally representative of the multiple aspects of a domain. The parcels may then serve as adequate indicators for the general construct. The latter method is consistent with the rationale underlying aggregation of measures, a procedure currently recommended for improving the psychometric properties of behavioral measures of personality. The two methods for parceling and a comparison are illustrated with an empirical example.

954 citations


Journal ArticleDOI
TL;DR: In this article, the development of general scales to measure self-efficacy and outcome expectancy at both the individual and group level is described, and factor analysis of an initial application of these scales is presented.
Abstract: This study describes the development of general scales to measure self-efficacy and outcome expectancy at both the individual and group level. Factor analysis of an initial application of these sca...

276 citations


Journal ArticleDOI
TL;DR: This article reports the development of a 12-item short form of the APM that demonstrates psychometric properties similar to the long form, but with a substantially shorter administration time.
Abstract: The Raven Advanced Progressive Matrices Test (APM) is a popular measure of higher order general cognitive ability (g). Its use in both basic research and applied settings is partially attributable to its apparent low level of culture loading. However, a major drawback curtailing more widespread use is its length; the APM is a 36-item power test with an administration time of 40-60 minutes. The present study reports on the development of a 12-item short form of the APM that demonstrates psychometric properties similar to the long form, but with a substantially shorter administration time. The ultimate goal is to provide researchers and practitioners with a version of the APM that can better meet their needs by providing a sound assessment of general intelligence in a shorter time frame than is available with the present form.

276 citations


Journal ArticleDOI
TL;DR: In this article, a 30-item computer self-efficacy scale is validated and used to examine the influence of computer training on computer self-efficacy, using data collected from 224 respondents.
Abstract: In this article, a 30-item computer self-efficacy scale is validated and used to examine the influence of computer training on computer self-efficacy. The scale was used to collect data from 224 un...

248 citations


Journal ArticleDOI
TL;DR: It is concluded that ComQol constitutes a unique and comprehensive measure of the quality of life construct.
Abstract: This article describes the development and validation of a new 35-item, multidimensional Comprehensive Quality of Life Scale (ComQol). Psychometric properties of the scale are described. Consistenc...

177 citations


Journal ArticleDOI
TL;DR: In this paper, the latent structure of items taken from the Multifactor Leadership Questionnaire (MLQ) was investigated, and item-level confirmatory factor analyses established the superior performance of items.
Abstract: Two studies investigate the latent structure of items taken from the Multifactor Leadership Questionnaire (MLQ). In the first study, item-level confirmatory factor analyses establish the superiorit...

175 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the relationship between cognitive test elements and the sample to determine the causes of the low alpha and found that unbundling both the test and sample provided statistically sound justification for continuing to use all research data for a highly homogeneous sample.
Abstract: The study involved 494 auditors from five Big Six accounting firms. Although two of the cognitive tests used in the research had Cronbach's alphas in the .70 to .90 range, the Defining Issues Test had an alpha of .35. This research investigated the relationship between cognitive test elements and the sample to determine the causes of the low alpha. The procedure described in the research is a method that can be used to validate research data when Cronbach's alpha is below .70. The research findings indicate that unbundling both the test and the sample provides statistically sound justification for continuing to use all research data for a highly homogeneous sample.

159 citations
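
For readers unfamiliar with the statistic at issue in the entry above, Cronbach's alpha for k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total scores. The following is a minimal sketch of that textbook formula, not the procedure used in the paper; the function name `cronbach_alpha` and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses the standard formula with sample variances (n - 1 denominator):
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")

    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the summed (total) score across respondents.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical toy data: three respondents, two items.
    toy = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(toy), 3))
```

A highly homogeneous sample, as described in the abstract, tends to shrink the total-score variance in the denominator, which is one way alpha can come out low.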


Journal ArticleDOI
TL;DR: This article introduces a graphical method to help identify the important properties of an ambivalence index. Five indexes were compared with this method; only two were found to possess satisfactory properties, although empirical differences among the indexes were relatively small.
Abstract: Ambivalence is expressed when a person endorses both positive and negative attitudinal positions. Ambivalence is commonly measured by having respondents provide separate ratings of the positive and negative components of their attitudes. A number of different equations have been proposed for combining the two ratings into a numerical index of ambivalence. However, similarities and differences among the equations are not well understood. This article introduces a graphical method to help identify the important properties of an ambivalence index. Five ambivalence indexes were compared and contrasted by using this method, and only two were found to possess satisfactory properties. The same indexes were also computed from respondents' ratings of 26 attitude topics, and it was found that empirical differences among the indexes were relatively small.

157 citations


Journal ArticleDOI
TL;DR: In this paper, ability (total Scholastic Aptitude Test score), high school achievement (average grade earned in high school), and procrastination (score on the Procrastination Assessment Scale, PASS) were tested as predictors of college performance.
Abstract: Ability, high school achievement, and procrastinatory behavior are tested as predictors of college performance in 194 women and 54 men. Ability was operationalized by total Scholastic Aptitude Test score, achievement by average grade earned in high school, and procrastination by score on the Procrastination Assessment Scale (PASS). It was hypothesized that procrastination could account for variance beyond that explained by ability and high school achievement in predicting college grade point average (GPA). Self-handicapping, a form of excuse making, was also included as a predictor. Results showed that procrastination does account for a significant portion of variance in college grades beyond that explained by ability and high school grades. For men, high school achievement was the strongest predictor of college performance; for women, ability was the strongest predictor. Self-handicapping did not account for any variance in GPA.

134 citations


Journal ArticleDOI
TL;DR: In this paper, the psychometric properties of an instrument (i.e., the Goals Inventory) that measured learning and performance goal orientations were investigated, and test-retest reliability estimates for the...
Abstract: This study investigated the psychometric properties of an instrument (i.e., the Goals Inventory) that measured learning and performance goal orientations. Test-retest reliability estimates for the ...

125 citations


Journal ArticleDOI
TL;DR: In this article, the authors describe the development of a brief, reliable, valid, and easy-to-administer measure of each of these psychological needs, and demonstrate the predictive utility for the AFS scales by showing that each scale predicts both self-report and behavioral measures of intrinsic motivation.
Abstract: According to self-determination theory, intrinsic motivation arises from three psychological needs: self-determination, competence, and relatedness. In the present article, the authors describe the development of a brief, reliable, valid, and easy-to-administer measure of each of these psychological needs. (A fourth scale to measure tension is included.) Normative data, coefficient alphas, and evidence for factorial and external validity for each of the Activity-Feeling States (AFS) scales are provided. Further, predictive utility for the AFS scales is demonstrated by showing that each scale predicts both self-report and behavioral measures of intrinsic motivation.

Journal ArticleDOI
TL;DR: In this paper, the internal consistency as well as the factorial and discriminant validity of the Maslach Burnout Inventory (MBI) in a sample of 326 Dutch secondary teachers was investigated.
Abstract: The study investigates the internal consistency as well as the factorial and discriminant validity of the Maslach Burnout Inventory (MBI) in a sample of 326 Dutch secondary teachers. Compared to ot...

Journal ArticleDOI
TL;DR: In this article, examinee responses were generated to simulate both uniform and nonuniform DIF. A standard Mantel-Haenszel (MH) procedure was used first; examinees were then split into two samples by breaking the full sample at approximately the middle of the test score distribution.
Abstract: The Mantel-Haenszel (MH) procedure has become one of the most popular procedures for detecting differential item functioning (DIF). One of the most troublesome criticisms of this procedure is that whereas detection rates for uniform DIF are very good, the procedure is not sensitive to nonuniform DIF. In this study, examinee responses were generated to simulate both uniform and nonuniform DIF. A standard MH procedure was used first. Then, examinees were split into two samples by breaking the full sample at approximately the middle of the test score distribution. The tests were then reanalyzed, first with the low-performing sample and then with the high-performing sample. This variation improved detection rates of nonuniform DIF considerably over the total sample procedure without increasing the Type I error rate. Items with the largest differences in discrimination and difficulty parameters were most likely to be identified.
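
The split-sample variation described in the abstract above can be sketched as follows. This is an illustrative sketch only: it uses the standard MH common odds ratio estimator across score strata, which is an assumption, since the abstract does not say which MH statistic was computed. The function names, the record layout `(group, total_score, item_correct)`, and the cut score are all hypothetical.

```python
from collections import defaultdict


def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio over 2x2 tables.

    Each table is (a, b, c, d): reference correct, reference incorrect,
    focal correct, focal incorrect, for one matched score level.
    """
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # empty stratum contributes nothing
        num += a * d / n
        den += b * c / n
    return num / den if den else float("nan")


def build_strata(records):
    """Group (group, total_score, item_correct) records into 2x2 tables by score."""
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        idx = (0 if correct else 1) + (0 if group == "ref" else 2)
        tables[score][idx] += 1
    return [tuple(t) for _, t in sorted(tables.items())]


def split_sample_mh(records, cut):
    """Run the MH estimate separately below and at-or-above a cut score."""
    low = [r for r in records if r[1] < cut]
    high = [r for r in records if r[1] >= cut]
    return mh_odds_ratio(build_strata(low)), mh_odds_ratio(build_strata(high))


if __name__ == "__main__":
    # Hypothetical data: no group difference at low scores, a large one at high scores.
    low = [("ref", 1, True)] * 2 + [("ref", 1, False)] * 2 \
        + [("focal", 1, True)] * 2 + [("focal", 1, False)] * 2
    high = [("ref", 5, True)] * 4 + [("ref", 5, False)] * 1 \
        + [("focal", 5, True)] * 1 + [("focal", 5, False)] * 4
    print(split_sample_mh(low + high, cut=3))
```

Pooling all strata averages the two patterns together, whereas estimating each half separately lets an effect that differs across ability levels show up, which is the intuition behind reanalyzing the low- and high-performing samples.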

Journal ArticleDOI
TL;DR: The Armed Services Vocational Aptitude Battery (ASVAB) has been used in its current item and content form for more than a decade, yet its latent structure, although explored in factor analyses, has never been confirmed; this paper reports several confirmatory factor analyses of Form 8a.
Abstract: The Armed Services Vocational Aptitude Battery (ASVAB) has been used in its current item and content form for more than a decade. Its latent structure, although explored in factor analyses, has never been confirmed. Several confirmatory factor analyses were conducted on Form 8a in a nationally representative sample. These included a g-only model, a three-factor hierarchical Vernon-like model, 2 four-factor first-order models, and 2 four-factor hierarchical models. Based on fit indexes, simple structure, and parsimony in parameter estimation, the three-factor hierarchical model was chosen to represent the data. The higher-order factor was psychometric g, and the first-order factors were interpreted as Speed, Verbal/Math, and Technical Knowledge. The latter two factors were similar to Vernon factors of Verbal/Educational and Practical.

Journal ArticleDOI
TL;DR: This paper investigated the effects of non-randomly missing data in two-predictor regression analyses and the differences in the effectiveness of five common treatments of missing data on estimates of R2 and of each of the two standardized regression weights.
Abstract: This research is an investigation of the effects of nonrandomly missing data in two-predictor regression analyses and the differences in the effectiveness of five common treatments of missing data on estimates of R2 and of each of the two standardized regression weights. Bootstrap samples of 50, 100, and 200 were drawn from three sets of actual field data. Nonrandomly missing data were created within each sample, and the parameter estimates were compared with those obtained from the same samples with no missing data. The results indicated that three imputation procedures (mean substitution, simple and multiple regression imputation) produced biased estimates of R2 and both regression weights. Two deletion procedures (listwise and pairwise) provided accurate parameter estimates with up to 30% of the data missing.

Journal ArticleDOI
TL;DR: In this article, the authors present evidence from three samples (one of graduate students at both doctoral and master's level, another entirely of first-year doctoral students, and a third solely of master's students) to assess the psychometric characteristics of a theory-driven measure of perceived stress for graduate students, the Graduate Stress Inventory-Revised (GSI-R).
Abstract: This article presents evidence from three samples, one of graduate students at both doctoral and master's level, another entirely of first-year doctoral students, and the other solely of master's students, to assess the psychometric characteristics of a theory-driven measure of perceived stress for graduate students, the Graduate Stress Inventory-Revised (GSI-R). Results of the first study allowed for the evaluation of the original scale, the Graduate Stress Inventory (GSI), and led to a few deletions and improvements in the wording of a significant number of items. It also indicated that the GSI-R possessed moderate to high internal-consistency reliability. Three-factor structures were identified. Results of the second study determined the concurrent validity of the GSI-R using Spielberger's Trait Anxiety scale. The third study showed adequate retest reliability of the GSI-R. The GSI-R is suggested for examining the role of appraised stress in the lives of graduate students.

Journal ArticleDOI
TL;DR: The use of nominal-level analysis of four primary learning styles (PLS) based on the Learning Style Inventory demonstrated their discriminant/convergent validity but not the validity of Kolb's learning style types (LST).
Abstract: The use of nominal-level analysis of four primary learning styles (PLS) (i.e., doing, thinking, watching, and feeling), based on the Learning Style Inventory, demonstrated their discriminant/convergent validity but not the validity of Kolb's learning style types (LST) (i.e., accommodator, diverger, converger, and assimilator). The LST typology is derived from the difference of two sets of ipsatively scored variables, a circumstance that contributes to its lack of validity, whereas the PLS categories are based directly on the rank ordering given by subjects. The PLS category, thinking, was associated with having higher scores on a mental ability measure, whereas doing was associated with higher levels of learning and performance on an origami paper-folding task (i.e., an archetypical doing task).

Journal ArticleDOI
TL;DR: The authors examined the predictive utility of outcome versus process self-efficacy and the differences in static versus dynamic levels of self-efficacy, personal goals, and performance in an academic achievement setting, and found that personal goals directly affect self-efficacy, whereas self-efficacy does not directly affect change in performance.
Abstract: Two unexplored issues in goal theory concern (a) the different predictive utility of outcome versus process self-efficacy and (b) the differences in static versus dynamic levels of self-efficacy, personal goals, and performance. This study examined these issues, using repeated measures of outcome versus process self-efficacy, personal goals, and performance over 3 months from 252 management students in an academic achievement setting. After establishing a baseline to replicate past research concerning self-efficacy, personal goal, and performance relationships, determinants of change in these variables were investigated. Outcome self-efficacy results in higher validity for predicting personal goals and performance than process self-efficacy; however, process self-efficacy significantly predicts outcome self-efficacy. Additionally, personal goals directly affect self-efficacy, self-efficacy does not directly affect change in performance, and self-efficacy, personal goals, and performance reciprocally affec...

Journal ArticleDOI
TL;DR: The similarities between multivariate multiple regression and canonical correlation analysis have been inconsistently acknowledged in the literature. This article shows that, although the stated objectives of these two analyses seem different, aspects of the analyses themselves are mathematically equivalent.
Abstract: The similarities between multivariate multiple regression and canonical correlation analysis have been inconsistently acknowledged in the literature. The present article shows that, although the stated objectives of these two analyses seem different, aspects of the analyses themselves are mathematically equivalent. A multivariate multiple regression analysis that incorporates discriminant analysis as part of its post hoc investigation will produce identically the same results as a canonical correlation analysis in terms of omnibus significance testing, variable weighting schemes, and dimension reduction analysis. A numerical example is provided.

Journal ArticleDOI
TL;DR: This paper found that removal of a nonfunctioning option resulted in a slight, nonsignificant overall increase in item difficulty and no significant differences in item discrimination, and that a test consisting of items with a nonfunctioning option removed was nearly equally reliable compared with a set of items having one or more dysfunctional distracters.
Abstract: This study addressed the hypothesis that, after the systematic elimination of nonfunctioning options, four-option test items would perform as well as five-option test items having one or more dysfunctional distracters. The study consisted of two investigations involving an examination administered to 700 candidates for certification in a medical specialty. In the first investigation, it was found that content experts exhibited a high degree of accuracy in identifying nonfunctioning options where the criterion was empirical item analysis data. The second phase of the study compared five-option versions of multiple-choice items with four-option versions in which a nonfunctioning option had been removed. Results indicated that (a) removal of a nonfunctioning option resulted in a slight, nonsignificant overall increase in item difficulty and no significant differences in item discrimination, (b) a test consisting of items with a nonfunctioning option removed was nearly equally reliable compared with a set of...

Journal ArticleDOI
TL;DR: The authors examined the attitudes of 191 regular classroom teachers from three states toward children who are limited in their English proficiency (LEP).
Abstract: This article examines the attitudes of 191 regular classroom teachers from three states toward children who are limited in their English proficiency (LEP). The 13-item Language Attitudes of Teacher...

Journal ArticleDOI
TL;DR: The Six-Factor Self-Concept Scale is a multidimensional measure of adult self-concept that was designed to have broad applicability across life settings, roles, and activities.
Abstract: The Six-Factor Self-Concept Scale is a multidimensional measure of adult self-concept that was designed to have broad applicability across life settings, roles, and activities. Developed through a series of exploratory factor analytic studies, the measure consists of six subscales: Likability, Morality, Task Accomplishment, Giftedness, Power, and Vulnerability. Confirmatory factor analysis revealed that the 36-item, six-factor structure provided a reasonably good fit for data derived from a sample of 365 noncollege adults. Factor structures of correlation matrixes for men and women and for undergraduates and noncollege adults were highly similar. The subscales were tested for distinctiveness, internal consistency, test-retest reliability, convergent validity, and divergent validity. Evidence for all five qualities is reported. The subscales differentially predicted childhood memories, recent behaviors and events, and ratings by knowledgeable observers.

Journal ArticleDOI
John W. Young
TL;DR: The author found that differential prediction was still evident in a recent cohort of students at a state university, consistent with the common finding that a single regression equation generally underpredicts the grades of women and overpredicts the grades of minority students.
Abstract: Differential predictive validity in forecasting academic performance in college has been observed for a number of years. A common finding is that the use of a single regression equation generally underpredicts the grades of women and overpredicts the grades of minority students. This study was undertaken to determine whether differential prediction was evident for a recent cohort of students at a state university. The results confirm that the phenomenon still exists. For women, but not for minority students, the difference in predictive validity appears to be related to the effects of course selection.

Journal ArticleDOI
TL;DR: In this paper, the authors examined relationships among two measures of creativity level, the CPI Creativity Scale (CPI-CT) and the MBTI Creativity Index (MBTI-CI), and a measure of creativity style, the Kirton Adaption-Innovation Inventory (KAI).
Abstract: Relationships were examined among two measures of creativity level, the CPI Creativity Scale (CPI-CT) and the MBTI Creativity Index (MBTI-CI), and a measure of creativity style, the Kirton Adaption-Innovation Inventory (KAI). Scores on these scales from various managerial samples were used in the analyses. With sample sizes ranging from 431 to 12,115, significant intercorrelations were found among the three measures. Contrary to expectations, KAI scores were related to creativity levels as measured by the CPI-CT and the MBTI-CI. Additionally, gender was found to account for little variance in MBTI-CI scores.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the factor structure of a revised version of a measure of organizational citizenship behavior (OCB) and assessed its validity against an objective behavioral criterion, which was whether or not employees returned an attitude survey.
Abstract: This study examined the factor structure of a revised version of a measure of organizational citizenship behavior (OCB) and assessed its validity against an objective behavioral criterion. Ratings of employees' OCB were provided by managers of 16 restaurants in the fast service industry. The objective measure of behavior was whether or not employees returned an attitude survey. Factor analysis suggested that the revised OCB instrument adequately assesses two dimensions of OCB: altruism and conscientiousness. Further, as expected, survey respondents had higher mean levels of OCB than did nonrespondents. The results suggest that managers can perceive and assess subtle components of job performance with reasonable accuracy.

Journal ArticleDOI
TL;DR: In this paper, the authors extended the validation effort by using the Learning Style Inventory (LSI) to confirm predicted learning abilities and testing Kolb's hypothesis on the relationship between learning styles and educational backgrounds.
Abstract: Although most of the validation studies on Kolb's revised Learning Style Inventory (LSI) focused on the internal consistency and construct validity of the scales, the present study extended the validation effort by (a) using the LSI to confirm predicted learning abilities and (b) testing Kolb's hypothesis on the relationship between learning styles and educational backgrounds. Based on a review of the local education system and culture, it was hypothesized that Singaporean students would score high in abstract conceptualization ability and low in concrete-experience ability. This was confirmed by the findings that also supported Kolb's hypothesis that the learning styles are associated with different educational backgrounds. The study involved 1,032 final-year students from six faculties in a local university.

Journal ArticleDOI
TL;DR: In this paper, the predictive validity of computer aptitude, as measured by four aptitude subtests on the Computer Aptitude, Literacy, and Interest Profile and two computer anxiety instruments (the Computer Anxiety Scale and the Computer Anxiety Factor), was investigated using nonprogramming computer performance as the criterion variable.
Abstract: The predictive validity of computer aptitude, as measured by four aptitude subtests on the Computer Aptitude, Literacy, and Interest Profile and two computer anxiety instruments (the Computer Anxiety Scale and the Computer Anxiety Factor), was investigated using nonprogramming computer performance as the criterion variable. The effects of computer anxiety on performance were inconsistent, and the suggestion is made that computer anxiety may be related to computer experience. The effects of computer aptitude on performance also yielded inconsistent results. The relationship between computer anxiety and computer aptitude varied. The reliability of the instruments is also reported.

Journal ArticleDOI
TL;DR: A QuickBASIC program for estimating the statistical power to detect the effects of dichotomous moderator variables using moderated multiple regression (MMR) is available. It runs on IBM and IBM-compatible personal computers and estimates power based on specific values for (a) total sample size, (b) sample sizes across the two categories of the hypothesized moderator, and (c) correlation coefficients between predictor and criterion scores for each of the two moderator-based subgroups.
Abstract: A QuickBASIC program for estimating the statistical power to detect the effects of dichotomous moderator variables using moderated multiple regression (MMR) is available. The program runs on IBM and IBM-compatible personal computers and estimates power based on specific values for (a) total sample size, (b) sample sizes across the two categories of the hypothesized moderator, and (c) correlation coefficients between predictor and criterion scores for each of the two moderator-based subgroups. The compiled run time and source code versions of the program can be obtained from the first author.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the performance of a common set of items on an examination in which the order of options for one test form was experimentally manipulated and found that reordering options can have significant but unpredictable effects on item performance.
Abstract: Many testing programs rely on equating procedures to achieve comparability of scores on alternate test forms. One commonly accepted rule for developing equated examinations using the common-items nonequivalent groups (CINEG) design is that items common to the two examinations being equated should be identical. In test construction practice, this rule has been extended to include even the order in which options appear in the two examinations. The present study examined the performance of a common set of items on an examination in which the order of options for one test form was experimentally manipulated. The study sought to determine whether reordering multiple choice item options results in any significant effect on item difficulty. It was found that reordering options can have significant but unpredictable effects on item performance. A linkage is made to previous research and cautions are suggested regarding the effects of reordering options.

Journal ArticleDOI
TL;DR: The measurement integrity of the scores produced by the Fennema-Sherman Mathematics Attitude Scales has not yet been conclusively established. The authors explored this issue using data provided by public elementary school teachers of mathematics, investigating both the measure's factor structure and its sensitivity to social desirability response set.
Abstract: The Fennema-Sherman Mathematics Attitude Scales are among the most popular measures used in studies of attitudes toward mathematics. However, the measurement integrity of the scores produced by the measure has not yet been conclusively established. The present study explored this measurement integrity issue by employing data provided by public elementary school teachers of mathematics. Both the measure's factor structure and the measure's sensitivity to social desirability response set were investigated.