
Showing papers by "Johannes Hartig published in 2017"


Journal ArticleDOI
TL;DR: A method to quantify potential bias of relationship estimates (e.g., correlation coefficients) due to misfitting items is introduced; the potential deviation indicates whether item misfit is practically significant for the outcomes of substantive analyses.
Abstract: Testing item fit is an important step when calibrating and analyzing item response theory (IRT)-based tests, as model fit is a necessary prerequisite for drawing valid inferences from estimated parameters. Numerous item fit statistics exist in the literature, sometimes leading to contradictory conclusions about which items should be excluded from a test. Recently, researchers have argued for shifting the focus from statistical item fit analyses to evaluating the practical consequences of item misfit. This article introduces a method to quantify the potential bias of relationship estimates (e.g., correlation coefficients) due to misfitting items. The potential deviation indicates whether item misfit is practically significant for the outcomes of substantive analyses. The method is demonstrated using data from an educational test.

17 citations
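The core idea — comparing a relationship estimate computed with and without flagged misfitting items — can be illustrated with a minimal sketch. This uses simulated data and a simple sum-score analysis, not the authors' actual procedure; all item counts and effect sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated illustration: 10 dichotomous items, where items 8-9 "misfit"
# (they are only weakly related to the latent ability theta).
n_persons, n_items = 500, 10
theta = rng.normal(size=n_persons)
load = np.ones(n_items)
load[8:] = 0.2                       # misfitting items: weak relation to theta
difficulty = rng.normal(size=n_items)
prob = 1 / (1 + np.exp(-(load * theta[:, None] - difficulty)))
resp = (rng.random((n_persons, n_items)) < prob).astype(int)

# An external criterion correlated with ability (e.g., a school grade).
criterion = 0.6 * theta + rng.normal(scale=0.8, size=n_persons)

# Relationship estimate with all items vs. with the fitting items only.
r_all = np.corrcoef(resp.sum(axis=1), criterion)[0, 1]
r_fit = np.corrcoef(resp[:, :8].sum(axis=1), criterion)[0, 1]
print(f"correlation with criterion, all items:     {r_all:.3f}")
print(f"correlation with criterion, fitting items: {r_fit:.3f}")
print(f"potential deviation due to misfit:         {abs(r_all - r_fit):.3f}")
```

If the deviation is negligible relative to the research question, the misfit may be of little practical consequence even when fit statistics flag the items.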


Journal ArticleDOI
TL;DR: In this article, the authors show that valid inferences on teaching drawn from students' test scores require that tests are sensitive to the instruction students received in class and that measures of the test items' instructional...
Abstract: Valid inferences on teaching drawn from students’ test scores require that tests are sensitive to the instruction students received in class. Accordingly, measures of the test items’ instructional ...

10 citations


Journal ArticleDOI
08 Sep 2017
TL;DR: In this paper, a multilevel structural equation modeling approach is presented to analyze data from repeated cross-sections of organizations, where different individuals are sampled from the same set of organizations at each time point of measurement.
Abstract: In repeated cross-sections of organizations, different individuals are sampled from the same set of organizations at each measurement occasion. As a result, common longitudinal data analysis methods (e.g., latent growth curve models) cannot be applied in the usual way. In this contribution, a multilevel structural equation modeling approach for analyzing data from repeated cross-sections is presented. Results are reported from a simulation study that aimed at obtaining guidelines on appropriate sample sizes. We focused on a situation where linear growth occurs at the organizational level and organizational growth is predicted by a single organizational-level variable. The power to identify an effect of this organizational-level variable was moderately to strongly positively related to the number of measurement occasions, the number of groups, group size, intraclass correlation, effect size, and growth curve reliability. The Type I error rate was close to the nominal alpha level under all conditions.

4 citations
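The structure of such a power simulation can be sketched as follows. This is a deliberately simplified two-step analysis (wave means per organization, then OLS on estimated slopes) rather than the multilevel SEM the paper uses, and all sample sizes and the effect size are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(1)

# Assumed design: 50 organizations, 4 waves, 20 new individuals per wave
# (repeated cross-sections: different people are sampled at every wave).
n_orgs, n_waves, group_size = 50, 4, 20
effect = 0.3          # assumed effect of the org-level predictor on growth
n_reps = 200
rejections = 0

for _ in range(n_reps):
    z = rng.normal(size=n_orgs)                        # org-level predictor
    slopes = effect * z + rng.normal(scale=0.5, size=n_orgs)
    intercepts = rng.normal(size=n_orgs)
    est_slopes = np.empty(n_orgs)
    for g in range(n_orgs):
        wave_means = []
        for t in range(n_waves):
            # fresh sample of individuals at each measurement occasion
            y = intercepts[g] + slopes[g] * t + rng.normal(size=group_size)
            wave_means.append(y.mean())
        # step 1: estimate each organization's growth slope from wave means
        est_slopes[g] = linregress(range(n_waves), wave_means).slope
    # step 2: test the effect of the org-level predictor on growth
    if linregress(z, est_slopes).pvalue < 0.05:
        rejections += 1

print(f"estimated power: {rejections / n_reps:.2f}")
```

Varying the design parameters (number of waves, groups, group size, effect size) and re-running the loop yields the kind of power surface the simulation study reports.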


Book ChapterDOI
01 Jan 2017
TL;DR: This chapter presents results on the properties of the newly developed test of reading and listening comprehension, finding that abilities in both sub-domains are highly correlated yet empirically distinguishable, with a latent correlation of .84.
Abstract: The project “Modeling competencies with multidimensional item-response-theory models” examined different psychometric models for student performance in English as a foreign language. On the basis of re-analyses of data from completed large-scale assessments, a new test of reading and listening comprehension was constructed. The items within this test use the same text material for both reading and listening tasks, thus allowing a closer examination of the relations between the abilities required for comprehending written and spoken texts. Furthermore, item characteristics (e.g., cognitive demands and response format) were systematically varied, allowing us to disentangle the effects of these characteristics on item difficulty and dimensional structure. This chapter presents results on the properties of the newly developed test: Both reading and listening comprehension can be measured reliably (rel = .91 for reading and .86 for listening). Abilities in both sub-domains prove to be highly correlated yet empirically distinguishable, with a latent correlation of .84. Although the listening items are more difficult in terms of the absolute number of correct answers, the difficulties of the same items in the reading and listening versions are highly correlated (r = .84). Implications of the results for measuring language competencies in educational contexts are discussed.

1 citation