Showing papers in "Educational and Psychological Measurement in 1977"


Journal Article
TL;DR: This article proposes a measure of five interpersonal conflict-handling modes (competing, collaborating, compromising, avoiding, and accommodating) that attempts to control for the social desirability response bias.
Abstract: This paper describes the rationale and development of a new measure of five interpersonal conflict-handling modes (competing, collaborating, compromising, avoiding, and accommodating), which attempts to control for the social desirability response bias. The instrument is entitled: "Management-of-Differences Exercise," or the MODE instrument. The results of this study indicate that the new instrument significantly reduces the social desirability bias for overall population tendencies in comparison to three other conflict behavior instruments, although all four instruments may still be susceptible to some individual tendencies in this response bias. This study also investigated other aspects of substantive validity and structural validity. Lastly, this paper presented emerging evidence on external validity, which, while encouraging, suggests the need for continuing research efforts to investigate this aspect of validity for the new MODE instrument.

526 citations


Journal Article
TL;DR: In this paper, several indices of item homogeneity derived from the model of common factor analysis are offered as alternatives to the coefficient alpha as a measure of internal consistency and homogeneity.
Abstract: Confusion in the literature between the concepts of internal consistency and homogeneity has led to a misuse of coefficient alpha as an index of item homogeneity. Coefficient alpha is actually a complexly determined test statistic, item homogeneity only being one influence on its magnitude. The related statistic, the average intercorrelation, has similar difficulties. Several indices of item homogeneity derived from the model of common factor analysis are offered as alternatives.
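
The abstract's point that coefficient alpha is not a pure index of item homogeneity can be illustrated with a minimal sketch. The function names below are hypothetical and the code is not taken from the paper; it uses the standard variance-based formula for alpha and the standardized form based on the average intercorrelation, which shows that alpha rises with the number of items even when the average intercorrelation is held fixed.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha from raw data.

    scores: list of respondents, each a list of k item scores.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total (summed) score.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standardized_alpha(k, mean_r):
    """Alpha for k standardized items with average intercorrelation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


if __name__ == "__main__":
    # Same average intercorrelation, different test lengths.
    for k in (5, 20):
        print(k, round(standardized_alpha(k, 0.2), 3))
```

With an average intercorrelation of 0.2, five items give an alpha of about 0.56 and twenty items about 0.83, so a high alpha alone does not establish that the items are homogeneous.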

484 citations


Journal Article
TL;DR: The authors developed a brief and efficient instrument for assessment, treatment planning, and evaluation of clients by counselors who use Rational Emotive Therapy (RET), which was used to evaluate the performance of participants in a workshop on RET.
Abstract: The purpose of the study was to develop a brief and efficient instrument for assessment, treatment planning, and evaluation of clients by counselors who use Rational Emotive Therapy (RET). Subjects were 235 undergraduate students. Eleven Guttman scales were developed following factor analysis. Each factor is measured by a Guttman scale with a coefficient of reproducibility of 0.90 or greater and with a coefficient of scalability of 0.60. Pre- and posttest scores were obtained for 40 mental health professionals attending an all-day workshop on RET. Over-all test scores were significantly different at the 0.025 level in the predicted direction. There was also a significant difference in the predicted direction between pretest scores of the professionals and the college students. In another study, 87 professionals attending a two day workshop on RET were tested before and after the workshop. There were significant differences in the predicted direction at the 0.025 level or beyond for the overall test scores...

184 citations


Journal Article
TL;DR: The Arizona Clinical Interview Rating Scale is examined for construct validity as an instrument to evaluate the interviewing techniques of medical students.
Abstract: The Arizona Clinical Interview Rating Scale is examined for construct validity as an instrument to evaluate the interviewing techniques of medical students. Evidence was gathered in the areas of co...

107 citations


Journal Article
TL;DR: Knowledge about ecology, the environment, and pollution was found to be predictive of actual behavior related to improving the environment, an outcome with new implications for environmental education and for planned attempts to modify conservation-related human behaviors.
Abstract: The purpose of the study was to administer and to validate further three of four scales from Maloney, Ward, and Braucht's revised or short-form ecology inventory. Unlike earlier findings, knowledge about ecology, the environment, and pollution was found to be predictive of actual behavior related to improving the environment, an outcome leading to new and different implications for environmental education and for planned attempts to modify conservation-related human behaviors.

104 citations


Journal Article
TL;DR: Using an absolute judgment method with the Adjective Check List items, this study replicated Williams and Bennett (1975), who had used a relative judgment method, and found that the stereotypes derived by the two methods were highly similar and that the male and female stereotypes did not differ in overall favorability ratings.
Abstract: Using the items from the Adjective Check List in a relative judgment method, Williams and Bennett (1975) found that men and women subjects were in close agreement as to the characteristics which compose the male and female sex stereotypes. The present study was designed to replicate the Williams and Bennett findings using an absolute judgment method and also to investigate the favorability of each stereotype more systematically. Results indicated that the stereotypes derived by the two methods were highly similar and that the male and female stereotypes did not differ in overall favorability ratings. Based on the data from both studies, a single sex stereotype index score was derived for each of the 300 Adjective Check List items. These scores may be used to determine the mean sex stereotype value of standard self-descriptions, as well as any other concepts (e.g., "men in general," "ideal physician," "ideal mate") described by the use of this item pool.

92 citations


Journal Article
TL;DR: Computer programs are described that compute rater agreement and rater bias statistics with qualitative data, select the most reliable from a set of raters, and identify those cases which are the most difficult for raters to classify.
Abstract: The programs described compute rater agreement and rater bias statistics with qualitative data. They also utilize techniques for (a) selecting the most reliable from a set of raters and (b) identifying those cases which are the most difficult for raters to classify.

69 citations


Journal Article
TL;DR: A simplified alternative procedure for conditional estimation, practical for 20 or 30 items and producing equivalent estimates, is developed, and a correction factor that makes the bias of the unconditional procedure negligible is identified and demonstrated.
Abstract: Two procedures for Rasch, sample-free, item calibration are reviewed and compared for accuracy. Andersen's (1972) theoretically ideal "conditional" procedure is impractical for calibrating more than 10 or 15 items. A simplified alternative procedure for conditional estimation practical for 20 or 30 items which produces equivalent estimates is developed. When more than 30 items are analyzed, recourse to Wright's (1969) widely used "unconditional" procedure is inevitable, but that procedure is biased. A correction factor which makes the bias negligible is identified and demonstrated.

64 citations


Journal Article
TL;DR: In this article, the effects of violation of the assumption of homogeneity of regression on the Type I error rate and on the power of analysis of covariance (ANCOVA) were investigated.
Abstract: The effects of violation of the assumption of homogeneity of regression on the Type I error rate and on the power of analysis of covariance (ANCOVA) were investigated. The data situations included ...

57 citations


Journal Article
TL;DR: In this article, the reliability of difference scores in more general cases where it is not assumed that error scores on distinct tests are uncorrelated is investigated, and a zero correlation between the errors can be obtained only by introducing an additional assumption of experimental independence that does not follow from the other axioms in the model.
Abstract: The usual formulas for the reliability of differences between two test scores X and Y are based on the assumption that the error scores EX and EY are uncorrelated. In modern developments of test score theory, such as that of Lord and Novick, a true score is defined as the expected value of an individual's observed score. This definition implies that true scores on any test are uncorrelated with error scores on any test, but it does not imply that error scores on distinct tests X and Y are uncorrelated. A zero correlation between the errors can be obtained only by introducing an additional assumption of "experimental independence" that does not follow from the other axioms in the model. This assumption restricts severely the class of random variables to which the usual formulas for reliability of differences will apply. The present paper investigated the reliability of difference scores in more general cases where it is not assumed that error scores on distinct tests are uncorrelated. The formulas derived ...
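
For orientation, the "usual" formula the abstract refers to can be written out for the difference D = X − Y. The block below is a sketch in standard classical test theory notation, not a reproduction of the paper's derivation (which is truncated in this listing). The second expression is only the direct consequence of allowing a nonzero error covariance, with the symbol \sigma_{E_X E_Y} introduced here as an assumption for illustration.

```latex
% Usual formula, assuming uncorrelated errors E_X and E_Y:
\rho_{DD'} =
  \frac{\sigma_X^2 \rho_{XX'} + \sigma_Y^2 \rho_{YY'} - 2\rho_{XY}\sigma_X\sigma_Y}
       {\sigma_X^2 + \sigma_Y^2 - 2\rho_{XY}\sigma_X\sigma_Y}

% Without that assumption, \sigma_{XY} = \sigma_{T_X T_Y} + \sigma_{E_X E_Y},
% so the true-score covariance in the numerator becomes
% \sigma_{XY} - \sigma_{E_X E_Y}:
\rho_{DD'} =
  \frac{\sigma_X^2 \rho_{XX'} + \sigma_Y^2 \rho_{YY'}
        - 2\left(\rho_{XY}\sigma_X\sigma_Y - \sigma_{E_X E_Y}\right)}
       {\sigma_X^2 + \sigma_Y^2 - 2\rho_{XY}\sigma_X\sigma_Y}
```

When the error covariance is zero the second expression reduces to the first, which is the restriction the abstract describes as "experimental independence."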

51 citations


Journal Article
TL;DR: The 46 alternatives in the original 23-item forced-choice format of the I-E scale were administered in a Likert agree-disagree format, and their intercorrelations were subjected to a principal components analysis.
Abstract: The 46 alternatives in the original 23-item forced-choice format of the I-E scale were administered in a Likert agree-disagree format. The intercorrelations of the 46 alternatives were subjected to a principal components analysis. A varimax rotation of four factors yielded four subscales comparable to those reported by Collins (1974): Belief in a difficult world, a just world, a politically responsive world, and a predictable world. Subjects' responses were factor scored and correlated with five measures: Skill-chance task preference, an achievement test, a political efficacy test, the Mach V scale, and a Just World scale. The pattern of correlations suggested that the difficult world factor, and to a lesser extent the predictable world factor, are relatively general in that they related to most of the five scales. The political world factor related only to (a) political efficacy and (b) Mach V (among males only), and appeared to be more specific in nature. The correlation with the Mach-V suggested that e...

Journal Article
TL;DR: The validity of the Test of English as a Foreign Language (TOEFL) was examined in relation to prediction of success of 50 Asian students who had completed master's programs in engineering, chemistry and biology.
Abstract: The validity of the Test of English as a Foreign Language (TOEFL) was examined in relation to prediction of success of 50 Asian students who had completed master's programs in engineering, chemistr...

Journal Article
TL;DR: This paper is intended to inform interested readers of the availability of a computer program capable of computing coefficients of congruence of factor structures.
Abstract: The coefficient of congruence is a quantitative measure of the similarity of factor structures for different samples of subjects. This paper is intended to inform interested readers of the availability of a computer program capable of computing coefficients of congruence of factor structures.
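
As a rough illustration of the statistic named in the abstract, the coefficient of congruence between two factors is commonly computed as the sum of the products of corresponding loadings divided by the square root of the product of their sums of squares. The sketch below assumes that standard definition; the function name is hypothetical, and the code is not the program announced in the paper.

```python
from math import sqrt


def congruence(loadings_a, loadings_b):
    """Coefficient of congruence between two columns of factor loadings.

    Both arguments list loadings for the same variables in the same order,
    one list per sample.
    """
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    ss_a = sum(a * a for a in loadings_a)
    ss_b = sum(b * b for b in loadings_b)
    return cross / sqrt(ss_a * ss_b)


if __name__ == "__main__":
    sample_1 = [0.70, 0.65, 0.10, 0.05]  # made-up loadings for illustration
    sample_2 = [0.72, 0.60, 0.15, 0.00]
    print(round(congruence(sample_1, sample_2), 3))
```

Unlike a correlation, the loadings are not centered before the products are taken, so two factors with proportional loadings give a coefficient of 1.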

Journal Article
TL;DR: In this paper, a simplex model is presented for the analysis of longitudinal academic growth variables in which only one measure is obtained at each time, and when this model fits the observed data, then reliabilities and unattenuated correlations can be estimated except for the first and last periods.
Abstract: A simplex model is presented for the analysis of longitudinal academic growth variables in which only one measure is obtained at each time. When this model fits the observed data, then reliabilities and unattenuated correlations can be estimated except for the first and last periods.

Journal Article
TL;DR: In this paper, the group peer rating technique is employed to develop highly reliable measures of 16 personality traits for 445 adult workers and 237 high school seniors, which are found to have high predictive validity for pay differentials, supervisor's ratings, and school grades.
Abstract: The group peer rating technique is employed to develop highly reliable measures of 16 personality traits for 445 adult workers and 237 high school seniors. Several of these traits are found to have high predictive validity for pay differentials, supervisor's ratings, and school grades. Dimensions derived from multidimensional scaling of the traits explain between 19 and 43 percent of the variance in criterion variables, and the corresponding validity coefficients are extremely high compared to those given in Ghiselli's (1966) comprehensive review study. Moreover, the results are robust in multivariate regressions with a series of other variables.

Journal Article
TL;DR: This article employed discriminant analysis to improve the usefulness of student-faculty ratings in detecting differences in lecturer types, using an 18-item questionnaire like those commonly used to evaluate lecturer effectiveness.
Abstract: This investigation employed discriminant analysis to improve the usefulness of student-faculty ratings in detecting differences in lecturer types. Equivalent groups of college students in each of two studies viewed lectures delivered by a Hollywood actor so as to vary in number of substantive teaching points covered (high, low) and presentation manner (enthusiastic, unenthusiastic). Students rated lecturer effectiveness using an 18-item questionnaire like those commonly used. Optimal scoring methods were derived in the first study for the purpose of differentiating among lecturer types and were cross-validated in a second study of groups of students who saw and rated the same lectures. Scoring methods derived in the first study were valid in relation to differences in lecturer enthusiasm in the first and second studies and were valid in relation to differences in information-giving in the first but not in the second study. Results were explained in terms of the "Doctor Fox Effect" and suggestions were off...

Journal Article
TL;DR: In this article, the differences between IQs obtained from WISC-R and those obtained from the original WISC were estimated using regression equations for the 1949 sample, by age level, to predict WISC IQs from the common core.
Abstract: The purpose of the study was to estimate the magnitude of the differences between IQs obtained from WISC-R and those from the original WISC. The 1949 WISC and the 1974 WISC-R were reviewed to identify a common core of items. Regression equations were developed for the 1949 sample, by age level, to predict WISC IQs from the common core. These equations were then used to estimate the WISC IQs of those in the WISC-R standardization sample. In the age range 6½ to 15½ years, the Full Scale IQs on WISC-R were 4 points lower, on the average, than WISC IQs. The difference, however, varied with the particular IQ Scale (Verbal or Performance), age, and ability level.

Journal Article
TL;DR: In this article, the intercorrelations among the 12 subtests of the WISC-R were analyzed for each of the 11 age groups in the standardization sample, using the principal-factor method.
Abstract: The intercorrelations among the 12 subtests of the WISC-R were analyzed for each of the 11 age groups in the standardization sample, using the principal-factor method. Both two- and three-factor solutions were obtained for each age group, using the maxplane method, and the stability of the two solutions from one age group to another was assessed by calculating coefficients of congruence. The two-factor solution proved somewhat more stable, but the difference was relatively small and some may actually prefer the three-factor solution.

Journal Article
TL;DR: An efficient algorithm (MSPLIT) for maximizing split-half reliability coefficients is described; its coefficients were generally larger than odd-even split-half and KR-20 coefficients and nearly as large as the largest coefficient among all possible split-half arrangements.
Abstract: An efficient algorithm (MSPLIT) for maximizing split-half reliability coefficients is described. Coefficients derived by the algorithm were found to be generally larger than odd-even split-half coefficients and KR-20 coefficients and were nearly as large as the largest of the coefficients from among every possible split-half arrangement.
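
To make the comparison in the abstract concrete, the sketch below computes a Spearman-Brown stepped-up split-half coefficient and finds the maximum by brute force over every equal-sized split. This is explicitly not the MSPLIT algorithm, whose details are not given in this listing; exhaustive search is only feasible for a handful of items, which is the reason an efficient algorithm is of interest. All function names are hypothetical, and an even number of items is assumed.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half(scores, half_a):
    """Spearman-Brown corrected split-half coefficient.

    scores: list of respondents, each a list of item scores.
    half_a: indices of the items placed in the first half.
    """
    k = len(scores[0])
    half_b = [i for i in range(k) if i not in half_a]
    a = [sum(row[i] for i in half_a) for row in scores]
    b = [sum(row[i] for i in half_b) for row in scores]
    r = pearson(a, b)
    return 2 * r / (1 + r)


def best_split(scores):
    """Exhaustive search over equal-sized splits (even item count assumed)."""
    k = len(scores[0])
    # Keep item 0 in the first half so mirror-image splits are not repeated.
    halves = [(0,) + rest for rest in combinations(range(1, k), k // 2 - 1)]
    return max((split_half(scores, h), h) for h in halves)


if __name__ == "__main__":
    data = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0]]
    print("odd-even:", split_half(data, (0, 2)))
    print("maximum: ", best_split(data))
```

Because the odd-even split is one of the candidates examined, the maximum found this way can never be smaller than the odd-even coefficient.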

Journal Article
TL;DR: In this article, a latent partition analysis of subjects' categorizations of keyed analogy items, grouped according to relational similarity, yielded eight latent types of relationships which generalized across subjects and items.
Abstract: The verbal analogy item has played a major role in the measurement of intelligence. Despite numerous correlational studies, however, neither the nature of intelligence nor the nature of the cognitive strategies and semantic structures that govern analogy test performance is well understood. These doubts arise from the apparent inappropriateness of correlational data for studying cognitive processes and structures, and the dearth of experimental research on the analogy item as a cognitive task. As a first step to investigate the analogy item as a cognitive task, the current study attempted to identify a semantic structure of relationships that individuals use to comprehend the completed analogy. A latent partition analysis of subjects' categorizations of keyed analogy items, grouped according to relational similarity, yielded eight latent types of relationships which generalized across subjects and items. Implications of the results for test development and test validity are discussed, and the content of t...

Journal Article
TL;DR: Order analysis was proposed as a method for description of formal structures in multidimensional space using a combination of psychological measurement theory, formal logic theory, information theory and graph theory concepts and suggests isomorphism between algebraic, geometric, logical, and cognitive structures.
Abstract: Order analysis was proposed as a method for description of formal structures in multidimensional space. Its algorithm was derived using a combination of psychological measurement theory, formal logic theory, information theory and graph theory concepts. It suggests isomorphism between algebraic, geometric, logical, and cognitive structures. The model also provides for adjustment of its sensitivity to random variation with an elective degree of statistical confidence.

Journal Article
TL;DR: This article developed scales measuring four facets of attitude toward mathematics (enjoyment of word problems, enjoyment of pictorial problems, appreciation of the utility of mathematics, and security with mathematics) and assessed the reliability and validity of these scales.
Abstract: The purpose of this study was to develop scales measuring four facets of attitude toward mathematics and to assess the reliability and validity of these scales. The four constructs of interest were: enjoyment of word problems; enjoyment of pictorial problems; appreciation of the utility of mathematics; and security with mathematics. The scales designed to measure these four constructs were administered to 299 seventh grade students. Split-half reliability coefficients were estimated for each scale and evidence related to the construct validity of each scale was obtained. All scales demonstrated an adequate degree of reliability and some degree of both convergent and discriminant validity.

Journal Article
TL;DR: The Academic Motivations Inventory (AMI) is a self-report measure of the academic motivations of college students that shows promise for group measurement, such as describing the predominant motivations of students in a course.
Abstract: This paper reports the early development of the Academic Motivations Inventory (AMI), a self-report measure of the academic motivations of college students. Content validation procedures and the reliability of items and scales suggest that AMI is at present a promising instrument for group measurement (e.g., to describe the predominant motivations of students in a course) and that it may become, through refinement of the scales, a useful tool for individual assessment.

Journal Article
TL;DR: The concurrent validity of a computer-assisted test construction system designed to measure information in the medical sciences was assessed, and results were interpreted to indicate that the Quarterly Profile Examination measures factual information in the basic medical sciences.
Abstract: The concurrent validity of a computer-assisted test construction system designed to measure information in the medical sciences was assessed. Scores on the Quarterly Profile Examination were correl...

Journal Article
TL;DR: Data from admission applications for first-time entering freshmen and transfer students who enrolled at the University of Northern Colorado in fall quarter 1970 were used to develop discriminant functions predicting whether a student graduated within four years, was still enrolled after four years, was not enrolled the quarter following academic probation or suspension, or left the university while in good academic standing.
Abstract: Data from readily available admission applications were obtained for first-time entering freshmen and transfer students who initially had enrolled at the University of Northern Colorado during the fall quarter 1970. These data were used to predict membership into a class of students who (a) were graduated by the end of the traditional 4-year college career, (b) were still enrolled after the 4-year period, (c) were not enrolled the quarter following academic probation or suspension, or (d) left the university while in good academic standing. This study attempted to answer the following questions: (a) Could discriminant functions be developed which would allow for the correct classification of a student into one of the four categories of interest? (b) Which ones of the variables were the best discriminators between the groups? and (c) How efficient were the discriminant functions in this classification procedure? Results indicated that discriminant functions could be developed which accurately place 33% to ...

Journal Article
TL;DR: In this paper, some predicted relationships among certain item characteristics, response processes, instability of response, and nearness of subject and item on a trait continuum are used to compare the traditional and proposed approaches.
Abstract: In their investigation of the relationship between item properties and test quality, itemmetricians rely too heavily upon inference about the nature of the subject-item interaction. Itemmetric research can be improved by employing data, specifying variables, and selecting an order of data analysis which are more directly related to this interaction. Some predicted relationships among certain item characteristics, response processes, instability of response, and nearness of subject and item on a trait continuum are used to compare the traditional and proposed approaches. Stronger relationships and clearer interpretations arise from the proposed approach.

Journal Article
TL;DR: An alternative algorithm for Thurstone's pair comparisons method is described that provides an improved capacity for handling indeterminate proportions of 1.00 and 0.00 as well as untruncated estimates of population scale values.
Abstract: Described is an alternative algorithm for Thurstone's pair comparisons method. This algorithm provides an improved capacity for handling indeterminate proportions of 1.00 and 0.00 as well as untruncated estimates of population scale values.

Journal Article
TL;DR: In this article, a study was conducted to investigate the contributions of the Tennessee Self Concept Scale to the understanding of self-concept components, and an Alpha Factor Analysis was employed to examine the inde...
Abstract: A study was conducted to investigate the contributions of the Tennessee Self Concept Scale to the understanding of self-concept components. An Alpha Factor Analysis was employed to examine the inde...

Journal Article
TL;DR: In this article, a cross-validation of a Creativity Scale for the Adjective Check List (ACL) based on adult groups is presented, where self-descriptions for 103 known independent inventors were compared with those of 74 noninventors.
Abstract: Cross-validation of a Creativity (Cr) scale for the Adjective Check List (ACL), based on adult groups, is presented. ACL self-descriptions for 103 known independent inventors were compared with those of 74 noninventors. Of all the possible ACL scales, significant differences between the two groups existed for 8 scales, including the Cr scale. Within the creative group itself, there was no significant difference between patentees and nonpatentees on the Cr scale.

Journal Article
TL;DR: A description of a FORTRAN program for linear and area transformations of test scores with optional generation of symmetric tables of areas under the standard normal curve is presented in this article.
Abstract: A description of a FORTRAN program for linear and area transformations of test scores with optional generation of symmetric tables of areas under the standard normal curve is presented. Also included is a historical note on the origin of the T scale, along with a discussion of the relative merits of area versus linear scale transformations.
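
The distinction between linear and area transformations can be sketched briefly. The code below is in Python rather than FORTRAN and is not the program described; the function names are hypothetical, and the midpoint percentile-rank convention used for the area transformation is an assumption, since the abstract does not say which convention the program uses. Both functions target the conventional T scale with mean 50 and standard deviation 10.

```python
from statistics import NormalDist, mean, pstdev


def linear_t(scores):
    """Linear T scores: rescale z-scores, preserving the distribution's shape."""
    m, s = mean(scores), pstdev(scores)
    return [50 + 10 * (x - m) / s for x in scores]


def area_t(scores):
    """Area (normalized) T scores: map percentile ranks through the normal curve.

    Assumes the midpoint convention: percentile rank is the proportion of
    scores below x plus half the proportion equal to x.
    """
    n = len(scores)
    normal = NormalDist()
    result = []
    for x in scores:
        below = sum(1 for y in scores if y < x)
        equal = sum(1 for y in scores if y == x)
        rank = (below + 0.5 * equal) / n
        result.append(50 + 10 * normal.inv_cdf(rank))
    return result


if __name__ == "__main__":
    skewed = [1, 1, 1, 2, 10]  # made-up scores for illustration
    print([round(t, 1) for t in linear_t(skewed)])
    print([round(t, 1) for t in area_t(skewed)])
```

The linear version keeps the skew of the raw scores, so the outlying score sits far above the rest, while the area version spreads scores according to rank and pulls the distribution toward normal shape.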