
Does it Matter How Data are Collected? A Comparison of Testing Conditions and the Implications for Validity

01 Jan 2009 - Vol. 4, pp. 17-26
TL;DR: In this article, the authors examined the effect of low-stakes test conditions on student self-efficacy scores and found that the very controlled context yielded the best model-data fit.
Abstract: The effects of gathering test scores under low-stakes conditions have been a prominent domain of research in the assessment and testing literature. One important area within this larger domain concerns the implications of a test being low-stakes for test evaluation and development. The current study examined one variable, the testing context, that could impact students’ responses during low-stakes testing, and subsequently the decisions made when using the data for test refinement. Specifically, the factor structure of college self-efficacy scores was examined across three low-stakes testing contexts. Results indicated differential model-data fit across conditions (the very controlled context yielded the best fit), implying that testing conditions should be seriously considered when gathering low-stakes data used for instrument development.


Citations
Journal ArticleDOI
TL;DR: This article examined the psychometric properties of the test-related items from the Achievement Emotions Questionnaire (AEQ), including their factor structure, using a sample of 955 university students.
Abstract: Few studies have examined the psychometric properties of the test-related items from the Achievement Emotions Questionnaire (AEQ). Using a sample of 955 university students, we examined the factor ...

5 citations


Cites background from "Does it Matter How Data are Collect..."

  • ...Given concerns regarding the validity of inferences from data gathered in low-stakes testing (Wise & DeMars, 2005), trained proctors read test instructions aloud and encouraged students to give their best effort (Barry & Finney, 2009)....


Book ChapterDOI
01 Jan 2018
TL;DR: In this article, the authors chronicle the pedagogical theories and educational research into active learning that create a compelling argument for the RTTP approach, including the role of feedback in the learning process.
Abstract: The volume co-editors chronicle the pedagogical theories and educational research into active learning that create a compelling argument for the RTTP approach. Beyond the current research trends and emerging patterns of instructional innovation in American higher education, the work of Piaget, Bruner, and Vygotsky is discussed, as well as the role of feedback in the learning process. The authors chronicle relevant research on active learning, calling on the most recent meta-analyses of active learning research and the scholarship on game-based active learning. Ultimately, they examine how RTTP’s history and design capitalize on what social constructivism recommends and on what previous active learning research suggests. The fact that little research to date has specifically explored the various impact metrics fostered by RTTP establishes the need for this volume and sets the stage for the eight studies that follow.

4 citations

01 Jan 2018

3 citations


Cites background from "Does it Matter How Data are Collect..."

  • ...This occurrence is known as adverse impact, and is expected across low-stakes and high-stakes testing situations (Barry & Finney, 2009; DeMars, 2000; Wise & DeMars, 2005)....


  • ...In contrast, assessments are high-stakes when students have personal consequences for their performance (Barry & Finney, 2009; DeMars, 2000; Wise & DeMars, 2005)....


  • ...In addition, construct-irrelevant variance due to differential effort across stakes in testing threatens the validity of scores (Barry & Finney, 2009)....


01 Jan 2016

3 citations


Cites background from "Does it Matter How Data are Collect..."

  • ...…attenuate or inflate relationships with other variables, attenuate internal consistency estimates, and impact the factor structure of a measure (Barry & Finney, 2009; Conway, 2002; Huang, Liu, & Bowling, 2015; Kam & Meyer, 2015; MacKenzie & Podsakoff, 2012; Meade & Craig, 2012; Swerdzewski,…...


  • ...…there is a concern students do not put forth their best effort when responding, which is problematic given previous research has found noneffortful responding can negatively impact the validity of results (e.g., Barry & Finney, 2009; Meade & Craig, 2012; Swerdzewski, Harmes, & Finney, 2011)....


Dissertation
01 Jan 2013
TL;DR: In this paper, the authors explored students' perceptions of test stakes and test value and found that students are not motivated before the test and their motivation to work on their English after the test largely depends on their perceived usefulness of the DELTA report.
Abstract: Performance in an assessment is not the reflection of just one’s knowledge and skills; motivation also plays a part. When the stakes of the assessment are low, it is logical to assume that students will have lower motivation to perform well in it. The Diagnostic English Language Tracking Assessment (DELTA) diagnoses and tracks students’ English language progress during their years of study at three universities in Hong Kong. Although the DELTA is a low-stakes assessment, students get a report with their DELTA measure and detailed feedback on their performance. This study provides insights into test motivation, as well as how useful students find a diagnostic report to their language learning, by way of questionnaire survey and group interview, so as to explore students’ perceptions of test stakes and test value. The survey includes the Student Opinion Scale by Sundre and Moore (2002), which measures students’ motivation during the test, and a feedback usefulness scale specifically designed for this study to measure students’ perceptions of the usefulness of the diagnostic report. The results show that both scales are valid instruments to be used in this context and that students are not motivated whilst sitting the test, although they find the DELTA report quite useful. Data from the students’ interviews provide further information as to students’ motivation before and after the DELTA. In general, they are not motivated before the test, and their motivation to work on their English after the test largely depends on their perceived usefulness of the DELTA report. Lastly, as L2 motivation is a dynamic entity which will not remain constant over time, the study also demonstrates how Dornyei and Otto’s (1998) process model of L2 motivation can be adapted to explain students’ test preparation and test-taking process in low-stakes diagnostic tests.

2 citations

References
Journal ArticleDOI
TL;DR: In this paper, the author reviewed the cognitive and communicative processes underlying self-reports, focusing on issues of question comprehension, behavioral frequency reports, and the emergence of context effects in attitude measurement.
Abstract: Self-reports of behaviors and attitudes are strongly influenced by features of the research instrument, including question wording, format, and context. Recent research has addressed the underlying cognitive and communicative processes, which are systematic and increasingly well understood. I review what has been learned, focusing on issues of question comprehension, behavioral frequency reports, and the emergence of context effects in attitude measurement. The accumulating knowledge about the processes underlying self-reports promises to improve questionnaire design and data quality.

2,566 citations


"Does it Matter How Data are Collect..." refers background in this paper

  • ...We believed we were seeing an item-order effect (e.g., Schwarz, 1999; Tourangeau & Rasinski, 1988) due to low motivation....


  • ...…in the fact that these items were presented in succession, and the strong relationships may have been caused by an item-ordering effect; especially when expressing attitudes, preceding questions can influence the responses given to subsequent ones (e.g., Schwarz, 1999; Tourangeau & Rasinski, 1988)....


Book
03 Apr 2000
TL;DR: A First Course in Structural Equation Modeling, as discussed in this paper, is an excellent introductory book on structural equation modeling; its examples from EQS, LISREL, and Mplus show how to set up input files to fit the most commonly used types of structural equation models with these programs.
Abstract: In this book, authors Tenko Raykov and George A. Marcoulides introduce students to the basics of structural equation modeling (SEM) through a conceptual, nonmathematical approach. For ease of understanding, the few mathematical formulas presented are used in a conceptual or illustrative nature, rather than a computational one. Featuring examples from EQS, LISREL, and Mplus, A First Course in Structural Equation Modeling is an excellent beginner’s guide to learning how to set up input files to fit the most commonly used types of structural equation models with these programs. The basic ideas and methods for conducting SEM are independent of any particular software. Highlights of the Second Edition include:

  • Review of latent change (growth) analysis models at an introductory level
  • Coverage of the popular Mplus program
  • Updated examples of LISREL and EQS
  • Downloadable resources that contain all of the text’s LISREL, EQS, and Mplus examples.

A First Course in Structural Equation Modeling is intended as an introductory book for students and researchers in psychology, education, business, medicine, and other applied social, behavioral, and health sciences with limited or no previous exposure to SEM. A prerequisite of basic statistics through regression analysis is recommended. The book frequently draws parallels between SEM and regression, making this prior knowledge helpful.

1,548 citations


"Does it Matter How Data are Collect..." refers background in this paper

  • ...These values can be positive or negative, indicating under- or over-representation of relationships, and absolute values of three or greater have been suggested as values to indicate a poorly reproduced relationship (Raykov & Marcoulides, 2000)....

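The cutoff quoted above can be sketched in a few lines. This is a hypothetical illustration of the Raykov & Marcoulides (2000) heuristic, not code from either paper; the residual matrix and variable names below are made-up for the example.

```python
import numpy as np

def flag_poor_fit(std_residuals, cutoff=3.0):
    """Return (row, col) pairs whose standardized residual meets the cutoff.

    Following the heuristic cited above, absolute standardized residuals of
    three or greater flag relationships the model reproduces poorly.
    """
    r = np.asarray(std_residuals)
    i_idx, j_idx = np.where(np.abs(r) >= cutoff)
    # A residual covariance matrix is symmetric, so keep the lower triangle only.
    return [(int(i), int(j)) for i, j in zip(i_idx, j_idx) if i >= j]

# Made-up standardized residual matrix for a three-item example.
residuals = np.array([
    [ 0.0, 0.4, -3.2],
    [ 0.4, 0.0,  1.1],
    [-3.2, 1.1,  0.0],
])

print(flag_poor_fit(residuals))  # → [(2, 0)]: items 3 and 1 are poorly reproduced
```

A positive residual would indicate an under-represented relationship and a negative one an over-represented relationship, matching the under-/over-representation distinction in the quote.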

Journal ArticleDOI
TL;DR: Results demonstrate that over repeated samples, model modifications may be very inconsistent and cross-validation results may behave erratically, leading to skepticism about generalizability of models resulting from data-driven modifications of an initial model.
Abstract: In applications of covariance structure modeling in which an initial model does not fit sample data well, it has become common practice to modify that model to improve its fit. Because this process is data driven, it is inherently susceptible to capitalization on chance characteristics of the data, thus raising the question of whether model modifications generalize to other samples or to the population. This issue is discussed in detail and is explored empirically through sampling studies using 2 large sets of data. Results demonstrate that over repeated samples, model modifications may be very inconsistent and cross-validation results may behave erratically. These findings lead to skepticism about generalizability of models resulting from data-driven modifications of an initial model. The use of alternative a priori models is recommended as a preferred strategy.

1,492 citations

Journal ArticleDOI
TL;DR: In this article, the authors argue that an answer to an attitude question is the product of a four-stage process: respondents first interpret the question to determine what attitude it is about; they then retrieve relevant beliefs and feelings; next, they apply these beliefs and feelings in rendering the appropriate judgment; and finally they use this judgment to select a response.
Abstract: We begin this article with the assumption that attitudes are best understood as structures in longterm memory, and we look at the implications of this view for the response process in attitude surveys. More specifically, we assert that an answer to an attitude question is the product of a fourstage process. Respondents first interpret the attitude question, determining what attitude the question is about. They then retrieve relevant beliefs and feelings. Next, they apply these beliefs and feelings in rendering the appropriate judgment. Finally, they use this judgment to select a response. All four of the component processes can be affected by prior items. The prior items can provide a framework for interpreting later questions and can also make some responses appear to be redundant with earlier answers. The prior items can prime some beliefs, making them more accessible to the retrieval process. The prior items can suggest a norm or standard of comparison for making the judgment. Finally, the prior items can create consistency pressures or pressures to appear moderate. Because of the multiple processes involved, context effects are difficult to predict and sometimes difficult to replicate. We attempt to sort out when context is likely to affect later responses and include a list of the variables that affect the size and direction of the effects of context.

963 citations


"Does it Matter How Data are Collect..." refers background in this paper

  • ...We believed we were seeing an item-order effect (e.g., Schwarz, 1999; Tourangeau & Rasinski, 1988) due to low motivation....


  • ...…in the fact that these items were presented in succession, and the strong relationships may have been caused by an item-ordering effect; especially when expressing attitudes, preceding questions can influence the responses given to subsequent ones (e.g., Schwarz, 1999; Tourangeau & Rasinski, 1988)....


Journal ArticleDOI
TL;DR: In this article, a theoretical model of test-taking motivation is presented, with a synthesis of previous research indicating that low student motivation is associated with a substantial decrease in test performance.
Abstract: Student test-taking motivation in low-stakes assessment testing is examined in terms of both its relationship to test performance and the implications of low student effort for test validity. A theoretical model of test-taking motivation is presented, with a synthesis of previous research indicating that low student motivation is associated with a substantial decrease in test performance. A number of assessment practices and data analytic procedures for managing the problems posed by low student motivation are discussed.

435 citations


"Does it Matter How Data are Collect..." refers background in this paper

  • ...Thus, their scores may not serve as valid indicators of their true level of the construct of interest (Sundre, 1999; Sundre & Kitsantas, 2004; Wise & DeMars, 2005)....


  • ...However, if low motivation results in test scores that are not truly representative of the construct of interest, the scores are then ambiguous at best and misleading at worst (Wise & DeMars, 2005)....


  • ...Because there are very few, if any, consequences associated with performance and because students may perceive no personal gain from the experience, low-stakes testing often leads to low effort and motivation on the part of the test-taker (Wise & DeMars, 2005)....


Trending Questions (1)
What if a test is valid but not reliable? Can the test still be used to collect data?

The paper does not directly address the question of whether a test can still be used to collect data if it is valid but not reliable. The paper focuses on the impact of testing conditions on data collection and instrument development.