
Showing papers in "Educational and Psychological Measurement in 2008"


Journal ArticleDOI
TL;DR: The psychometric properties and multigroup measurement invariance of scores on the Self-Efficacy for Self-Regulated Learning Scale taken from Bandura's Children's Self-Efficacy Scale were assessed in a sample of 3,760 students from Grades 4 to 11 as discussed by the authors.
Abstract: The psychometric properties and multigroup measurement invariance of scores on the Self-Efficacy for Self-Regulated Learning Scale taken from Bandura's Children's Self-Efficacy Scale were assessed in a sample of 3,760 students from Grades 4 to 11. Latent means differences were also examined by gender and school level. Results reveal a unidimensional construct with equivalent factor pattern coefficients for boys and girls and for students in elementary, middle, and high school. Elementary school students report higher self-efficacy for self-regulated learning than do students in middle and high school. The latent factor is related to self-efficacy, self-concept, task goal orientation, apprehension, and achievement.

302 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined the relationship between the squared multiple correlation coefficients and minimum necessary sample sizes and found a definite relationship, similar to a negative exponential relationship, and provided guidelines for sample size needed for accurate predictions.
Abstract: When using multiple regression for prediction purposes, the issue of minimum required sample size often needs to be addressed. Using a Monte Carlo simulation, models with varying numbers of independent variables were examined and minimum sample sizes were determined for multiple scenarios at each number of independent variables. The scenarios arise from varying the levels of correlations between the criterion variable and predictor variables as well as among predictor variables. Two minimum sample sizes were determined for each scenario, corresponding to a good and an excellent prediction level. The relationship between the squared multiple correlation coefficient and the minimum necessary sample size was examined. A definite relationship, similar to a negative exponential relationship, was found between the squared multiple correlation coefficient and the minimum sample size. As the squared multiple correlation coefficient decreased, the sample size increased at an increasing rate. This study provides guidelines for sample size needed for accurate predictions.

272 citations
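The sample-size study above rests on a standard Monte Carlo idea: draw repeated samples of a given size from a population with a known correlation and watch how unstable the sample R-squared is. A minimal single-predictor sketch of that idea (the article examined multiple predictors; the correlation, sample sizes, and replication count below are illustrative, not the article's):

```python
# Minimal Monte Carlo sketch: stability of sample R^2 as n grows.
# All constants here are illustrative assumptions, not values from the study.
import random
import math

def sample_r_squared(n, rho, rng):
    """Draw n (x, y) pairs with population correlation rho; return sample R^2."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

rng = random.Random(1)
rho = 0.3  # modest population correlation -> true R^2 of 0.09
for n in (20, 100, 500):
    reps = [sample_r_squared(n, rho, rng) for _ in range(500)]
    mean_r2 = sum(reps) / len(reps)
    sd_r2 = (sum((r - mean_r2) ** 2 for r in reps) / len(reps)) ** 0.5
    print(f"n={n:4d}  mean R^2={mean_r2:.3f}  SD={sd_r2:.3f}")
```

Smaller samples yield upwardly biased and far more variable R-squared estimates, which is the mechanism behind the article's finding that smaller true R-squared values demand larger samples.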


Journal ArticleDOI
TL;DR: This paper provides a summary of 45 exploratory and confirmatory factor-analytic studies that examined the internal structure of scores obtained from the Maslach Burnout Inventory (MBI).
Abstract: This study provides a summary of 45 exploratory and confirmatory factor-analytic studies that examined the internal structure of scores obtained from the Maslach Burnout Inventory (MBI). It highlights characteristics of the studies that account for differences in reporting of the MBI factor structure. This approach includes an examination of the various sample characteristics, forms of the instrument, factor-analytic methods, and the reported factor structure across studies that have attempted to examine the dimensionality of the MBI. This study also investigates the dimensionality of MBI scale scores using meta-analysis. Both descriptive and empirical analysis supported a three-factor model. The pattern of reported dimensions across validation studies should enhance understanding of the structural dimensions that the MBI measures as well as provide a more meaningful interpretation of its test scores.

203 citations


Journal ArticleDOI
TL;DR: In this paper, the authors used teachers' ability to analyze teaching as a proxy for their teaching knowledge, using video clips of classroom instruction as item prompts to measure teacher knowledge of teaching mathematics.
Abstract: Responding to the scarcity of suitable measures of teacher knowledge, this article reports on a novel assessment approach to measuring teacher knowledge of teaching mathematics. The new approach uses teachers' ability to analyze teaching as a proxy for their teaching knowledge. Video clips of classroom instruction, which respondents were asked to analyze in writing, were used as item prompts. Teacher responses were scored along four dimensions: mathematical content, student thinking, alternative teaching strategies, and overall quality of interpretation. A prototype assessment was developed and its reliability and validity were examined. Respondents' scores were found to be reliable. Positive, moderate correlations between teachers' scores on the video-analysis assessment, a criterion measure of mathematical content knowledge for teaching, and expert ratings provide initial evidence for the criterion-related validity of the video-analysis assessment. Results suggest that teachers' ability to analyze teach...

197 citations


Journal ArticleDOI
TL;DR: In this article, the effects of Q-matrix misspecifications on parameter estimates and misclassification rates for the deterministic-input, noisy "and" gate (DINA) were investigated.
Abstract: This article reports a study that investigated the effects of Q-matrix misspecifications on parameter estimates and misclassification rates for the deterministic-input, noisy "and" gate (DINA) mo...

181 citations


Journal ArticleDOI
TL;DR: In this article, a meta-analysis was conducted to synthesize the administration mode effects of CBTs and paper-and-pencil tests on K-12 student reading assessments.
Abstract: In recent years, computer-based testing (CBT) has grown in popularity, is increasingly being implemented across the United States, and will likely become the primary mode for delivering tests in the future. Although CBT offers many advantages over traditional paper-and-pencil testing, assessment experts, researchers, practitioners, and users have expressed concern about the comparability of scores between the two test administration modes. To help provide an answer to this issue, a meta-analysis was conducted to synthesize the administration mode effects of CBTs and paper-and-pencil tests on K-12 student reading assessments. Findings indicate that the administration mode had no statistically significant effect on K-12 student reading achievement scores. Four moderator variables (study design, sample size, computer delivery algorithm, and computer practice) made statistically significant contributions to predicting effect size. Three moderator variables (grade level, type of test, and computer delivery method)...

157 citations
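The core computation in a meta-analysis like the one above is inverse-variance pooling of per-study effect sizes. A hedged sketch of fixed-effect pooling of standardized mean differences (the effect sizes below are invented for illustration, not taken from the article, which also modeled moderators):

```python
# Fixed-effect meta-analytic pooling via inverse-variance weights.
# Study effect sizes and variances below are hypothetical.
import math

def pool_fixed_effect(effects):
    """effects: list of (d, var_d) pairs. Returns pooled d, its SE, and z."""
    weights = [1.0 / v for _, v in effects]
    d_bar = sum(w * d for (d, _), w in zip(effects, weights)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return d_bar, se, d_bar / se

studies = [(0.05, 0.01), (-0.02, 0.02), (0.01, 0.005)]  # hypothetical CBT-vs-paper d's
d_bar, se, z = pool_fixed_effect(studies)
print(f"pooled d = {d_bar:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

With these toy inputs |z| falls well below 1.96, the kind of result that would be read as "no statistically significant administration-mode effect."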


Journal ArticleDOI
TL;DR: The authors investigated aspects of validity reflected in a large and diverse sample of published measures used in educational and psychological testing contexts and found that validity information is not routinely provided in terms of modern validity theory, some sources of validity evidence (e.g., consequential) are essentially ignored in validity reports, and the favorability of judgments about a test is more strongly related to the number of validity sources provided than to the perspective on validity taken or other factors.
Abstract: This study investigates aspects of validity reflected in a large and diverse sample of published measures used in educational and psychological testing contexts. The current edition of Mental Measurements Yearbook served as the data source for this study. The validity aspects investigated included perspective on validity represented, number and kinds of sources of validity evidence provided, overall evaluation of the favorability of the test, and whether these factors varied as a function of the type of test. Findings reveal that validity information is not routinely provided in terms of modern validity theory, some sources of validity evidence (e.g., consequential) are essentially ignored in validity reports, and the favorability of judgments about a test is more strongly related to the number of validity sources provided than to the perspective on validity taken or other factors. The article concludes with implications for extending and refining current validity theory and validation practice.

152 citations


Journal ArticleDOI
TL;DR: In this paper, the authors used the National Longitudinal Survey of Youth 1979 (NLSY79) data to test measurement invariance of the Behavior Problem Index (BPI) during middle childhood across three ethnic groups.
Abstract: Accurate measurement of behavioral functioning is a cornerstone of research on disparities in child development. This study used the National Longitudinal Survey of Youth 1979 (NLSY79) data to test measurement invariance of the Behavior Problem Index (BPI) during middle childhood across three ethnic groups. Using the internalizing and externalizing behavior problem division derived by Parcel and Menaghan (1988) and suggested for use with NLSY79 data, the configural invariance hypothesis was not supported. The BPI factor structure model was revised based on theoretical considerations using the division of items from the Child Behavior Checklist. This model demonstrated configural invariance across ethnic groups and over time. Moreover, measurement invariance of factor loadings and thresholds across ethnic groups at each time point and within each ethnic group over time was also supported. The implications of these findings for educational and cross-cultural research are outlined.

92 citations


Journal ArticleDOI
TL;DR: In this article, the authors used reliability generalization to identify the variability in reliability estimates for WOCS scores across studies, the typical score reliability for the WOCS, and the salient features across studies that relate to this variability.
Abstract: For more than 20 years, the Ways of Coping Scale (WOCS) has been used extensively to measure coping. Yet beyond the original psychometric data, few studies have reexamined its properties utilizing the enormous body of research generated on the WOCS. Reliability has been assumed to be consistent as an attribute of the test. This study used reliability generalization to identify (a) the variability in reliability estimates for the WOCS scores across studies, (b) the typical score reliability for the WOCS, and (c) the salient features across studies that relate to the variability in reliability estimate scores for the WOCS. Typical reliability across subscale scores ranged from .60 to .75 with Positive Reappraisal showing the least variability and Self-Controlling showing the most. Factors related to this variability were age and format of administration.

79 citations
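Reliability generalization studies like this one aggregate internal-consistency coefficients, most often Cronbach's alpha, across published samples. A minimal sketch of the alpha computation itself (the item scores below are invented; real WOCS subscales have more items and respondents):

```python
# Cronbach's alpha from raw item scores; data are hypothetical.
def cronbach_alpha(items):
    """items: one list of scores per item, all of equal length (one score per respondent)."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(var(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Three hypothetical 4-point items answered by five respondents:
scores = [[3, 2, 4, 1, 3],
          [3, 3, 4, 2, 2],
          [2, 2, 4, 1, 3]]
print(f"alpha = {cronbach_alpha(scores):.2f}")
```

A reliability generalization study then treats many such alphas as data points and regresses them on study features (here, age and administration format).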


Journal ArticleDOI
TL;DR: The most commonly used measures of locus of control are Rotter's Internality-Externality Scale (I-E) and Nowicki and Strickland's Internality-Externality Scale (NSIE), as discussed by the authors.
Abstract: The most commonly used measures of locus of control are Rotter's Internality-Externality Scale (I-E) and Nowicki and Strickland's Internality-Externality Scale (NSIE). A reliability generalization study is conducted to explore variability in I-E and NSIE score reliability. Studies are coded for aspects of the scales used (number of response points, number of items) and for sample demographic descriptors (percentage female, average age). Results indicate no statistically significant difference in the predicted internal consistency estimate for I-E Scale versus NSIE Scale scores. Only the percentage female variable is found to predict variation in internal consistency estimates. Testing interval length explains variability in test-retest coefficient estimates. Results and directions for future research are discussed.

73 citations


Journal ArticleDOI
TL;DR: The results from simulation studies as well as actual data suggest that IRT-based models with continuous latent traits can be developed and that, compared with the unidimensional IRT model, the proposed models better describe the actual data.
Abstract: As item response models gain increased popularity in large-scale educational and measurement testing situations, many studies have been conducted on the development and applications of unidimensional and multidimensional models. Recently, attention has been paid to IRT-based models with an overall ability dimension underlying several ability dimensions specific for individual test items, where the focus is mainly on models with dichotomous latent traits. The purpose of this study is to propose such models with continuous latent traits under the Bayesian framework. The proposed models are further compared with the conventional IRT models using Bayesian model choice techniques. The results from simulation studies as well as actual data suggest that (a) such models can be developed; (b) compared with the unidimensional IRT model, the proposed models better describe the actual data; and (c) the use of the proposed IRT models and the multiunidimensional model should be based on different beliefs about the unde...

Journal ArticleDOI
TL;DR: In this article, both desirability ratings of BSRI traits (both for a man and for a woman) and self-ratings were obtained from the same sample and factor analyzed.
Abstract: Pedhazur and Tetenbaum speculated that factor structures from self-ratings of the Bem Sex-Role Inventory (BSRI) personality traits would be different from factor structures from desirability ratings of the same traits. To explore this hypothesis, both desirability ratings of BSRI traits (both for a man and for a woman) and self-ratings were obtained from the same sample and factor analyzed. Factor analyses performed on the three sets of ratings of the 40 BSRI traits (self-ratings, desirability ratings for a man, and desirability ratings for a woman) confirmed that the factors across ratings were diverse. Thus, the underlying constructs must be studied independently. Predictive discriminant analyses replicated the finding that two traits alone (Masculine and Feminine) provided nearly all of the discrimination of males and females in the sample when self-ratings were employed. Also, predictive discriminant analyses revealed that the classification of participants into gender groups was very accurate using s...

Journal ArticleDOI
TL;DR: In this paper, a combination of two item response theory (IRT) models is used for the observed response data and one for the missing data indicator, which is modeled using a sequential model with linear restrictions on the item parameters.
Abstract: In tests with time limits, items at the end are often not reached. Usually, the pattern of missing responses depends on the ability level of the respondents; therefore, missing data are not ignorable in statistical inference. This study models data using a combination of two item response theory (IRT) models: one for the observed response data and one for the missing data indicator. The missing data indicator is modeled using a sequential model with linear restrictions on the item parameters. The models are connected by the assumption that the respondents' latent proficiency parameters have a joint multivariate normal distribution. Model parameters are estimated by maximum marginal likelihood. Simulations show that treating missing data as ignorable can lead to considerable bias in parameter estimates. Including an IRT model for the missing data indicator removes this bias. The method is illustrated with data from an intelligence test with a time limit.

Journal ArticleDOI
TL;DR: In this paper, the authors test the validity of scores on the Homework Management Scale (HMS) using 699 rural and 482 urban 8th graders and find that urban students were more likely to manage their homework than their rural counterparts in two of the five areas, namely, handling distraction and monitoring motivation.
Abstract: The purpose of this study was to test the validity of scores on the Homework Management Scale (HMS) using 699 rural and 482 urban eighth graders. The study revealed that the HMS comprised 5 separate yet related factors: arranging the environment, managing time, handling distraction, monitoring motivation, and controlling emotion. Given an adequate level of configural, factor loading, common error covariance, and intercept invariance, I further tested the difference between group means. Results revealed that urban students were more likely to manage their homework than their rural counterparts in 2 of the 5 areas, namely, handling distraction and monitoring motivation. Findings also showed that the HMS differentiated among students who were more or less likely to complete homework assignments.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the measurement invariance of a particular measure of achievement goal orientation, the modified Achievement Goal Questionnaire (AGQ-M), across African American and white university students.
Abstract: There has been growing interest in comparing achievement goal orientations across ethnic groups. Such comparisons, however, cannot be made until validity evidence has been collected to support the use of an achievement goal orientation instrument for that purpose. Therefore, this study investigates the measurement invariance of a particular measure of achievement goal orientation, the modified Achievement Goal Questionnaire (AGQ-M), across African American and White university students. Confirmatory factor analyses support measurement invariance across the two groups. These findings provide additional validity evidence for the newly conceptualized 2 × 2 framework of achievement goal orientation and for the equivalence of functioning of the AGQ-M across these distinct groups. Because this level of invariance is established, researchers can make more valid inferences about differences in the AGQ-M scores across African American and White students.

Journal ArticleDOI
TL;DR: This article examined the consistency of the metric properties of AAI scores by testing their factorial structure for invariance across gender and grade (2 [genders] × 5 [grades] = 10 groups) in a sample of 3,417 high school students.
Abstract: Motivation deficits are common in high school and constitute a significant problem for both students and teachers. The Academic Amotivation Inventory (AAI) was developed to measure the multidimensional nature of the academic amotivation construct (Legault, Green-Demers, & Pelletier, 2006). The present project further examined the consistency of the metric properties of AAI scores by testing their factorial structure for invariance across gender and grade (2 [genders] × 5 [grades] = 10 [groups]) in a sample of 3,417 high school students. Factorial invariance of latent means was also examined as a complementary substantive objective. Configural, metric, and scalar invariance were successfully substantiated across all 10 groups. Results revealed well-fitting models for each group. Moreover, constraining factor loadings and intercepts had no meaningful impact on model fit. Findings are discussed in terms of an increased conceptual and psychometric understanding of scholastic motivational problems.

Journal ArticleDOI
TL;DR: In this article, confirmatory factor analysis for ordered-categorical measures (CFA-OCM) and rating scale item response theory (IRT) analyses explore measurement bias across gender on the Children's Depression Inve...
Abstract: Confirmatory factor analysis for ordered-categorical measures (CFA-OCM) and rating scale item response theory (IRT) analyses explore measurement bias across gender on the Children's Depression Inve...

Journal ArticleDOI
TL;DR: This article examined the invariance of a measurement model underlying Wechsler Adult Intelligence Scale-Third Edition scores in the US and the Canadian standardization samples and found that the measurement model, involving four latent variables, satisfies the assumption of invariance across samples.
Abstract: A measurement model is invoked whenever a psychological interpretation is placed on test scores. When stated in detail, a measurement model provides a description of the numerical and theoretical relationship between observed scores and the corresponding latent variables or constructs. In this way, the hypothesis that similar meaning can be derived from a set of test scores can be tested by examination of a measurement model across groups. This study examines the invariance of a measurement model underlying Wechsler Adult Intelligence Scale-Third Edition scores in the US and the Canadian standardization samples. The measurement model, involving four latent variables, satisfies the assumption of invariance across samples. Subtest scores also show similar reliability in both samples. However, slightly higher latent variable means are found in the Canadian normative sample.

Journal ArticleDOI
TL;DR: This article examined the relationship between item difficulty and differential item functioning by using alternative statistical techniques based on item response theory and a different standardized test, and the results replicate previous research and provide support for the generalizability of the findings.
Abstract: Recent research examining racial differences on standardized cognitive tests has focused on the impact of test item difficulty. Studies using data from the SAT and GRE have reported a correlation between item difficulty and differential item functioning (DIF) such that minority test takers are less likely than majority test takers to respond correctly to easy test items. The statistical techniques used and the effect sizes reported in these studies have been heavily criticized. This study addresses these criticisms by examining the relationship between item difficulty and DIF by using alternative statistical techniques based on item response theory and a different standardized test. The results replicate previous research and provide support for the generalizability of the findings.

Journal ArticleDOI
TL;DR: In this paper, the authors compared two confirmatory factor analysis methods on their ability to verify whether correct assignments of items to subtests are supported by the data, and found that the confirmatory common factor (CCF) method is used most often and defines nonzero loadings so that they correspond to the assignment of items in subtests.
Abstract: This study compares two confirmatory factor analysis methods on their ability to verify whether correct assignments of items to subtests are supported by the data. The confirmatory common factor (CCF) method is used most often and defines nonzero loadings so that they correspond to the assignment of items to subtests. Another method is the oblique multiple group (OMG) method, which defines subtests as unweighted sums of the scores on all items assigned to the subtest, and (corrected) correlations are used to verify the assignment. A simulation study compares both methods, accounting for the influence of model error and the amount of unique variance. The CCF and OMG methods show similar behavior with relatively small amounts of unique variance and low interfactor correlations. However, at high amounts of unique variance and high interfactor correlations, the CCF detected correct assignments more often, whereas the OMG was better at detecting incorrect assignments.

Journal ArticleDOI
TL;DR: In this paper, the authors used the full-information item bifactor model for graded response data to test the dimensionality of an adapted version of the State Metacognitive Inventory.
Abstract: Dimensionality assessment using the full-information item bifactor model for graded response data is provided. The model applies to data in which each item relates to a general factor and one group factor. Specifically, alternative model specification within item response theory (IRT) is shown to test a scale's factor structure. For illustrative purposes, the bifactor model and competing IRT models were fit to the data of separate cohorts of incoming college students (Cohort 1, n = 1,490; Cohort 2, n = 1,533) to test the dimensionality of an adapted version of the State Metacognitive Inventory. Overall, the bifactor analysis did not strongly support distinct group factors after accounting for the general factor. Instead, results suggested conceptualizing the scale as unidimensional, indicating that scores should be based on the total scale, not subscales. Considerations related to the use of the bifactor IRT model are discussed.

Journal ArticleDOI
TL;DR: In this article, the authors compared the criterion-related validity of scores yielded by a work-non-work conflict scale and those yielded by work-family conflict scale using active-duty U.S. Army soldiers stationed in Germany and Italy with spouses and/or children and without spouses or children.
Abstract: Research examining the influence of nonwork issues on work-related outcomes has flourished. Often, however, the breadth of the interrole conflict construct varies widely between studies. To determine if the breadth of the interrole conflict measure makes a difference, the current study compares the criterion-related validity of scores yielded by a work‐nonwork conflict scale and those yielded by a work‐family conflict scale using active-duty U.S. Army soldiers stationed in Germany and Italy with spouses and/or children and without spouses or children. Results demonstrated that the two constructs are related but distinct. In addition, work‐family conflict had a stronger relationship with job satisfaction and turnover intentions for employees with a spouse and/or children than for single, childless employees, whereas work‐nonwork conflict had a stronger relationship with these outcomes for single, childless employees than for employees with a spouse and/or children.

Journal ArticleDOI
TL;DR: In this paper, a revised version of the coaching efficacy scale (CES II-HST) was developed for head coaches of high school teams, and data were collected from head coaches from 14 relevant high school sports (N = 799).
Abstract: The purpose of this validity study was to improve measurement of coaching efficacy, an important variable in models of coaching effectiveness. A revised version of the coaching efficacy scale (CES) was developed for head coaches of high school teams (CES II-HST). Data were collected from head coaches of 14 relevant high school sports (N = 799). Exploratory factor analysis (n = 250) and a conceptual understanding of the construct of interest led to the selection of 18 items. A single-group confirmatory factor analysis (CFA; n = 549) provided evidence for close model-data fit. A multigroup CFA provided evidence for factorial invariance by gender of the coach (n = 588).

Journal ArticleDOI
TL;DR: Mantel-Haenszel methods comprise a highly flexible methodology for assessing the degree of association between two categorical variables, whether they are nominal or ordinal, while controlling for... as discussed by the authors.
Abstract: Mantel-Haenszel methods comprise a highly flexible methodology for assessing the degree of association between two categorical variables, whether they are nominal or ordinal, while controlling for ...
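The central Mantel-Haenszel quantity for stratified 2x2 tables is the common odds-ratio estimate, which pools the association across strata while controlling for the stratifying variable. A hedged sketch of that estimator (the counts below are invented; the article's methodology also covers ordinal extensions and significance testing):

```python
# Mantel-Haenszel common odds-ratio estimate for stratified 2x2 tables.
# Each stratum's counts (a, b, c, d) correspond to the layout [[a, b], [c, d]];
# the example counts are hypothetical.
def mh_odds_ratio(tables):
    """tables: list of (a, b, c, d) counts per stratum. Returns the MH estimate."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

strata = [(10, 5, 4, 12), (8, 6, 3, 10)]  # two hypothetical strata
print(f"MH common OR = {mh_odds_ratio(strata):.2f}")
```

Because each stratum contributes its ad/n and bc/n terms separately, the estimate controls for the stratifying variable rather than collapsing the tables, which is what makes the method resistant to confounding.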

Journal ArticleDOI
TL;DR: In this paper, the authors extended Cheung and Chan's adjusted-individual and adjusted-weighted procedures to the case of a heterogeneous degree of dependence and found that the adjusted-weighted procedure generated slightly less biased estimates of the degree of heterogeneity than the adjusted-individual procedure across conditions.
Abstract: In meta-analysis, it is common to have dependent effect sizes, such as several effect sizes from the same sample but measured at different times. Cheung and Chan proposed the adjusted-individual and adjusted-weighted procedures to estimate the degree of dependence and incorporate this estimate in the meta-analysis. The present study extends the previous study by examining the case of heterogeneous degree of dependence. Simulation results reveal that these two procedures again generated less biased estimates of the degree of heterogeneity than the commonly used samplewise procedure and were statistically more powerful to detect true variations. In addition, the adjusted-weighted procedure generated slightly less biased estimates of the degree of heterogeneity than the adjusted-individual weighted procedure across conditions. Future directions to further refine these procedures are discussed.

Journal ArticleDOI
TL;DR: In this paper, a multilevel modeling approach is proposed to study the general and specific attitudes formed in human learning behavior based on the premises of activity theory, which conceptualizes the unit of analysis for attitude measurement as a scalable and evolving activity system rather than a single action.
Abstract: This article proposes a multilevel modeling approach to study the general and specific attitudes formed in human learning behavior. Based on the premises of activity theory, it conceptualizes the unit of analysis for attitude measurement as a scalable and evolving activity system rather than a single action. Measurement issues related to this conceptualization, including scale development and validation, are discussed with the help of facet analysis and multilevel structural equation modeling techniques. An empirical study was conducted, and the results indicate that this approach is theoretically and methodologically defensible.

Journal ArticleDOI
TL;DR: This paper investigated the use of latent class analysis for the detection of differences in item functioning on the Peabody Picture Vocabulary Test-Third Edition (PPVT-III) and proposed a two-class solution.
Abstract: This study investigated the use of latent class analysis for the detection of differences in item functioning on the Peabody Picture Vocabulary Test-Third Edition (PPVT-III). A two-class solution f...

Journal ArticleDOI
TL;DR: In this article, the authors tested the viability of the expanded nigrescence (NT-E) model as operationalized by Cross Racial Identity Scale (CRIS) scores using confirmatory factor analyses.
Abstract: In this study, the authors tested the viability of the expanded nigrescence (NT-E) model as operationalized by Cross Racial Identity Scale (CRIS) scores using confirmatory factor analyses. Participants were 594 Black college students from the Southeastern United States. Results indicated a good fit for NT-E's proposed six-factor structure. One-factor and two-factor higher-order models also yielded good fit indices, although several coefficients in the one-factor higher-order model were not salient or statistically significant. In sum, the results provide strong support for the CRIS as an operationalization of NT-E. The authors suggest that CRIS scores can be used in studies concerned with drawing inferences about the effects of racial identity attitudes.

Journal ArticleDOI
TL;DR: The authors compared student performance between paper-and-pencil testing (PPT) and computer-based testing (CBT) on a large-scale statewide end-of-course English examination.
Abstract: The current study compared student performance between paper-and-pencil testing (PPT) and computer-based testing (CBT) on a large-scale statewide end-of-course English examination. Analyses were conducted at both the item and test levels. The overall results suggest that scores obtained from PPT and CBT were comparable. However, at the content domain level, a rather large difference in the reading comprehension section suggests that the reading comprehension test may be more affected by the test administration mode. Results from the confirmatory factor analysis suggest that the administration mode did not alter the construct of the test.

Journal ArticleDOI
TL;DR: The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the expectation-maximization (EM) method of weights and multiple imputation (MI).
Abstract: For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the expectation-maximization (EM) method of weights and multiple imputation (MI). Sample data are drawn randomly from a population with known characteristics. Missing data on covariates are simulated under two conditions: missing completely at random and missing at random with different missing rates. A logistic regression model was fit to each sample using either the EM or MI approach. The performance of these two approaches is compared on four criteria: bias, efficiency, coverage, and rejection rate. Results generally favored MI over EM. Practical issues such as implementation, inclusion of continuous covariates, and interactions between covariates are discussed.
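The MI side of this comparison ends with a pooling step: after fitting the logistic regression to each of m imputed data sets, the m coefficient estimates are combined with Rubin's rules. A hedged sketch of that pooling (the per-imputation estimates and variances below are invented stand-ins for coefficients from m fitted models, not values from the article):

```python
# Rubin's rules for combining estimates across m multiply imputed data sets.
# The example betas and variances are hypothetical.
import math

def rubin_pool(estimates, variances):
    """Combine m completed-data estimates and their variances (Rubin, 1987)."""
    m = len(estimates)
    q_bar = sum(estimates) / m
    u_bar = sum(variances) / m                              # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                             # total variance
    return q_bar, math.sqrt(t)

# Five hypothetical logistic-regression coefficients from five imputed data sets:
betas = [0.52, 0.48, 0.55, 0.50, 0.47]
vars_ = [0.010, 0.011, 0.009, 0.010, 0.012]
est, se = rubin_pool(betas, vars_)
print(f"pooled beta = {est:.3f}, SE = {se:.3f}")
```

The between-imputation term inflates the pooled standard error to reflect uncertainty about the missing values themselves, which is why MI tends to give better coverage than single-imputation approaches.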