Journal Article

Examining the Internal Structure of the Test of English-for-Teaching ("TEFT"™). Research Report. ETS RR-15-16.

01 Jun 2015 - ETS Research Report Series (Educational Testing Service. Rosedale Road, MS19-R Princeton, NJ 08541. Tel: 609-921-9000; Fax: 609-734-5410; e-mail: RDweb@ets.org; Web site: https://www.ets.org/research/policy_research_reports/ets)
TL;DR: This article examined the internal structure of the Test of English-for-Teaching (TEFT) assessment and found that the final parcel model had a higher-order general factor and four first-order factors corresponding to reading, writing, listening and speaking.
Abstract: ELTeach is an online professional development program developed by Educational Testing Service (ETS) in collaboration with National Geographic Learning. The ELTeach program consists of two courses: English-for-Teaching and Professional Knowledge for English Language Teaching (ELT). Each course includes a coordinated assessment leading to a score report and certificate for teachers of English as a foreign language (EFL). The Test of English-for-Teaching (TEFT™), the assessment component of the English-for-Teaching course, measures EFL teachers' command of English for teaching English in classroom settings, as presented in the course. In this study, we examined the internal structure of the TEFT assessment. Results of the analyses demonstrated the role of both skill and content in representing the test's internal structure. The final parcel model had a higher-order general factor and four first-order factors corresponding to reading, writing, listening, and speaking. The findings support the current score reporting practice, that is, to report a total scaled score along with score information on language skills and on language use in specific content areas.
Citations
Journal Article
TL;DR: A systematic review of research in the area of online teacher professional development (oTPD) is presented in this paper, which aims to inform developers and facilitators on complex and unique empirical indicators that are important in designing, developing, implementing, evaluating, and researching oTPD.
Abstract: This systematic review documents the progress of research in the area of online teacher professional development (oTPD) and seeks to inform developers and facilitators on the complex and unique empirical indicators that are important in designing, developing, implementing, evaluating, and researching oTPD. The 73 studies analyzed in this review suggest that the research in oTPD is progressing toward more rigorous empirical methods and theoretically-grounded design, implementation, and evaluation. Research in oTPD is moving forward in more sophisticated ways and adding to our understanding of high-quality practices that engage teachers in meaningful teacher professional learning in online contexts.

20 citations


Cites background from "Examining the Internal Structure of..."

  • ...Other studies, like Chapman et al. (2010) and Gu et al. (2015), explore a particular aspect or question but in the context of a large-scale project....

    [...]

  • ...Researchers were also interested in how both pedagogical tools (e.g., test structure, Gu et al., 2015) and pedagogical practices (e.g., adopting problem-based learning, An, 2013) could be successfully managed and implemented in online formats....

    [...]

Journal Article
TL;DR: In this paper, the validity of Konkur, the most important Iranian high-stakes test, as a measure of participants' English language proficiency was examined through generalizability and IRT analyses.
Abstract: Over the years, various measures, including high-stakes tests, have been employed to assess the quality of education. The validity of these measures has always been a matter of discussion. The present study therefore addresses concerns about the validity of Konkur, the most important Iranian high-stakes test, as a measure of participants' English language proficiency. Using the scores of 5000 high school students who sat the Konkur examination, the dimensionality and factorial structure of the test were examined through generalizability and IRT analyses. In the first stage, a generalizability analysis was run to check the data standards along with the descriptive statistics; the reliability of the data was .92 and the generalizability coefficient was 0.9, showing high dependability for the analysis. In the second stage, the item statistics and parameter measures were analyzed and the misfitting items were investigated. Next, the data were checked for dimensionality, and three IRT models (unidimensional, bifactor, and testlet) were run and compared. In terms of fit indices (M2, AIC, BIC), the testlet model had the best fit, followed by the bifactor and unidimensional models, respectively, in accounting for the variance of the test responses, and it functions well for assessing the language proficiency of EFL learners. The findings also showed that the variance of the factor structure of language proficiency is best explained when it is assessed through several language sub-skills (i.e., testlet) rather than measured directly through a number of items (i.e., bifactor).

8 citations

References
Journal Article
TL;DR: In this article, the adequacy of the conventional cutoff criteria and several new alternatives for various fit indexes used to evaluate model fit in practice were examined, and the results suggest that, for the ML method, a cutoff value close to .95 for TLI, BL89, CFI, RNI, and G...
Abstract: This article examines the adequacy of the “rules of thumb” conventional cutoff criteria and several new alternatives for various fit indexes used to evaluate model fit in practice. Using a 2‐index presentation strategy, which includes using the maximum likelihood (ML)‐based standardized root mean squared residual (SRMR) and supplementing it with either Tucker‐Lewis Index (TLI), Bollen's (1989) Fit Index (BL89), Relative Noncentrality Index (RNI), Comparative Fit Index (CFI), Gamma Hat, McDonald's Centrality Index (Mc), or root mean squared error of approximation (RMSEA), various combinations of cutoff values from selected ranges of cutoff criteria for the ML‐based SRMR and a given supplemental fit index were used to calculate rejection rates for various types of true‐population and misspecified models; that is, models with misspecified factor covariance(s) and models with misspecified factor loading(s). The results suggest that, for the ML method, a cutoff value close to .95 for TLI, BL89, CFI, RNI, and G...

76,383 citations


"Examining the Internal Structure of..." refers background or methods in this paper

  • ...A CFI value larger than 0.9 indicates an adequate model fit (Hu & Bentler, 1999)....

    [...]

  • ...An SRMR value of 0.08 or below is commonly considered as a sign of acceptable fit (Hu & Bentler, 1999)....

    [...]

Journal Article
TL;DR: This article is concerned with measures of model fit and considers two types of error involved in fitting a model, the first being error of approximation, which involves the fit of the model.
Abstract: This article is concerned with measures of fit of a model. Two types of error involved in fitting a model are considered. The first is error of approximation which involves the fit of the model, wi...

25,611 citations


"Examining the Internal Structure of..." refers methods in this paper

  • ...RMSEA values smaller than 0.05 can be interpreted as a sign of close model fit while values between 0.05 and 0.08 indicate adequate fit (Browne & Cudeck, 1993)....

    [...]
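The cutoffs quoted in the citing passages above (CFI larger than 0.9, SRMR of 0.08 or below, RMSEA below 0.05 for close fit and 0.05 to 0.08 for adequate fit) can be sketched as a small check. This is only an illustration of those quoted rules of thumb, not the report's analysis code; the function name `interpret_fit` and the output labels are hypothetical.

```python
def interpret_fit(cfi: float, srmr: float, rmsea: float) -> dict:
    """Label each fit index using the cutoffs quoted in the citing passages.

    Hypothetical helper: the name and the label strings are assumptions
    made for illustration only.
    """
    # RMSEA: below 0.05 is close fit, 0.05 to 0.08 is adequate fit.
    if rmsea < 0.05:
        rmsea_label = "close"
    elif rmsea <= 0.08:
        rmsea_label = "adequate"
    else:
        rmsea_label = "outside quoted ranges"

    return {
        # CFI: a value larger than 0.9 indicates adequate fit.
        "cfi": "adequate" if cfi > 0.9 else "below cutoff",
        # SRMR: 0.08 or below is commonly considered acceptable.
        "srmr": "acceptable" if srmr <= 0.08 else "above cutoff",
        "rmsea": rmsea_label,
    }


if __name__ == "__main__":
    # Made-up index values, not results from the report.
    print(interpret_fit(cfi=0.95, srmr=0.04, rmsea=0.06))
```

The index values in the usage line are invented for demonstration; the report's actual fit statistics are not shown on this page.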

Journal Article
TL;DR: In this paper, the authors examined how 20 goodness-of-fit indexes (GFIs) change when cross-group invariance constraints are imposed on a measurement model, and recommended Δcomparative fit index, ΔGamma hat, and ΔMcDonald's Noncentrality Index, which were independent of both model complexity and sample size.
Abstract: Measurement invariance is usually tested using Multigroup Confirmatory Factor Analysis, which examines the change in the goodness-of-fit index (GFI) when cross-group constraints are imposed on a measurement model. Although many studies have examined the properties of GFI as indicators of overall model fit for single-group data, there have been none to date that examine how GFIs change when between-group constraints are added to a measurement model. The lack of a consensus about what constitutes significant GFI differences places limits on measurement invariance testing. We examine 20 GFIs based on the minimum fit function. A simulation under the two-group situation was used to examine changes in the GFIs (ΔGFIs) when invariance constraints were added. Based on the results, we recommend using Δcomparative fit index, ΔGamma hat, and ΔMcDonald's Noncentrality Index to evaluate measurement invariance. These three ΔGFIs are independent of both model complexity and sample size, and are not correlated with the o...

10,597 citations


"Examining the Internal Structure of..." refers background in this paper

  • ...Model equivalence is also indicated by a ΔCFI less than or equal to 0.01 (Cheung & Rensvold, 2002)....

    [...]
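The ΔCFI rule quoted above can be sketched as a one-line comparison. This is a minimal illustration under assumptions: ΔCFI is taken here as the CFI of the less constrained model minus the CFI of the more constrained model, and the name `models_equivalent` is hypothetical.

```python
def models_equivalent(cfi_less_constrained: float,
                      cfi_more_constrained: float,
                      threshold: float = 0.01) -> bool:
    """Return True when the drop in CFI is at most `threshold`.

    Assumption: delta CFI is computed as the less constrained model's CFI
    minus the more constrained model's CFI. The difference is rounded to
    four decimals so that floating-point noise does not flip a borderline
    case such as 0.95 versus 0.94.
    """
    delta_cfi = round(cfi_less_constrained - cfi_more_constrained, 4)
    return delta_cfi <= threshold


if __name__ == "__main__":
    # Made-up CFI values, not results from the report.
    print(models_equivalent(0.960, 0.955))
    print(models_equivalent(0.960, 0.940))
```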

01 Jan 1994

2,776 citations


"Examining the Internal Structure of..." refers methods in this paper

  • ...A corrected normal theory estimation method, the Satorra-Bentler estimation (Satorra & Bentler, 1994), was employed by using the MLM estimator in Mplus to correct global fit indices and standard errors for non-normality....

    [...]