
Showing papers in "Measurement in Physical Education and Exercise Science in 1999"


Journal ArticleDOI
TL;DR: In this paper, a modified version of a procedure described by Hambleton (1980) was used to provide a systematic quantitative assessment of judges' item content-relevance ratings, and an expert panel of 38 judges rated the degree of match between 16 cognitive worry items and 4 latent worry dimensions to be measured by a newly constructed sport-specific anxiety measure.
Abstract: During the item-construction phase of scale development, expert judges frequently rate the degree of match between the content of test items and the objectives to be measured by the test. However, Messick (1989) contended that systematic attempts to document and assess the item ratings provided by expert judges are not commonplace in the literature, and this is particularly true in the field of sport psychology. In response to Messick's concerns, the purpose of this study was to illustrate in a sport psychology context how a modified version of a procedure described by Hambleton (1980) can be used to provide a systematic quantitative assessment of judges' item content-relevance ratings. An expert panel of 38 judges rated the degree of match between 16 cognitive worry items and 4 latent worry dimensions to be measured by a newly constructed sport-specific anxiety measure. Issues regarding the composition of the expert panel and methods used to assess and report the experts' ratings are discussed.

187 citations


Journal ArticleDOI
TL;DR: In this article, a 4-stage study empirically examined the multidimensionality of situational interest in physical education, using an iterative, multisample design, and revealed five dimensions of situational interest: novelty, challenge, exploration intention, instant enjoyment, and attention demand.
Abstract: Situational interest has been theoretically articulated as a multidimensional construct that derives from person-activity interaction. This 4-stage study empirically examined the multidimensionality of situational interest in physical education, using an iterative, multisample design. Middle school students (N = 674) were asked to view jogging and gymnastic stunts on video (in Stages 1, 2, and 3) and participate in basketball chest-pass and pass-shoot activities (in Stage 4). Immediately following each activity, situational interest of the activity was assessed by having the students respond to an instrument developed to measure the 7 dimensions of situational interest. Exploratory and confirmatory factor analyses were employed to examine the dimensionality of situational interest. The analyses revealed 5 dimensions of situational interest: Novelty, Challenge, Exploration Intention, Instant Enjoyment, and Attention Demand. A 24-item Situational Interest Scale was developed and revised during the 4-stage va...

136 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provided practical methods for the a priori determination of power for the repeated measures (RM) analysis of variance (ANOVA) and derived 10 power approximation equations for the main and interaction effects of the 1-way RM, 2-way mixed, and 2-way RM ANOVA designs.
Abstract: The determination of statistical power for a repeated measures experimental study is a necessary, but difficult and seldom done, calculation. The purpose of this study was to provide practical methods for the a priori determination of power for the repeated measures (RM) analysis of variance (ANOVA). Stepwise regression analysis procedures were used to derive 10 power approximation equations for the main and interaction effects of the 1-way RM, 2-way mixed, and 2-way RM ANOVA designs. The theoretically correct power values were calculated using the program DATASIM (Bradley, 1988) for various levels of effect size, number of levels of each factor, sample size, and mean correlations among RM factor levels. Potvin and Schutz's (in press) equations for estimating the error variance were utilized for computing power for the 2-way RM ANOVA. The derived equations showed a high level of precision in approximating power (R² = .97 to .99, SEE = .012 to .035) with 9 to 16 predictor variables for the 1-way RM and the...

70 citations


Journal ArticleDOI
TL;DR: The history of factor analysis dates back to the early 1900s as discussed by the authors, and the distinction between exploratory and confirmatory factor analysis has been discussed in detail in the context of construct validity.
Abstract: In this article, the separate histories of validity and factor analysis are reviewed, as well as some of the ways in which factor analysis has become such a widely used technique for an estimation of construct validity. From a historical perspective, 4 eras, in terms of definitions of validity, are identified; current views emphasize the validity of inferences drawn from scores, the conceptualization of validity as a unitary concept, and the need to include the study of the consequences of test use. The history of factor analysis dates back to the early 1900s. Important developments in its history are reviewed, including the distinction between exploratory and confirmatory factor analysis. Finally, some recommendations for the appropriate use of factor analysis are offered, as well as cautions against overreliance on factor analysis in the estimation of construct validity.

60 citations


Journal ArticleDOI
TL;DR: This study compares the data produced by the previously validated and often used System for Observing Fitness Instruction Time instrument with a computerized instrument, the Computer-SOFIT, a duration coding instrument that was written in the C computer language and pilot tested in a number of classes with different coders to make it usable in physical education.
Abstract: As the focus on health-related physical education increases, promoting physical activity leading to the development of physical fitness becomes an important component of school physical education programs. To determine the status and effect of teaching processes related to physical activity and fitness in physical education, it is necessary to have instrumentation that produces reliable, valid, and usable scores for the population in which they are used. The purpose of this study is to compare the data produced by the previously validated and often used System for Observing Fitness Instruction Time (SOFIT) instrument (McKenzie, Sallis, & Nader, 1991) with a computerized instrument, the Computer-SOFIT (C-SOFIT). The categories of the C-SOFIT instrument are exactly the same as those of the interval coding system, the SOFIT instrument. C-SOFIT is a duration coding instrument that was written in the C computer language and pilot tested in a number of classes with different coders to make it usable in physical...

36 citations


Journal ArticleDOI
TL;DR: In this paper, the norm-referenced predictive validity of maximal oxygen consumption (VO2max) estimated from the progressive aerobic cardiovascular endurance run (PACER, FITNESSGRAM®; Cooper Institute for Aerobic Research, Dallas, TX) performance was determined.
Abstract: The first purpose of this study was to determine the norm-referenced predictive validity of maximal oxygen consumption (VO2max) estimated from the progressive aerobic cardiovascular endurance run (PACER, FITNESSGRAM®; Cooper Institute for Aerobic Research, Dallas, TX) performance by 3 separate formulas: the Leger, Mercier, Gadoury, and Lambert (1988) 8- to 19-year-old equation; the Leger et al. adult equation; and the Ramsbottom, Brewer, and Williams (1988) equation. Norm-referenced intraclass stability reliability coefficients (n = 19) were determined to be .96 for PACER and estimated VO2max values. Only the VO2max values estimated from the Leger et al. adult equation (47.29 ± 7.02 vs. 50.45 ± 8.01 mL · kg-1 · min-1 measured; p < .0001) were shown to be valid (r = .82; standard error of estimate [SEE] = 4.59; Error = 5.58; percentage of participants whose measured VO2max fell within ± 4.5 mL · kg-1 · min-1 of estimated VO2max = 59.7; N = 60 female participants ± 59 male participants). The second pu...

29 citations


Journal ArticleDOI
TL;DR: LOA and LPR are suitable techniques to compare 2 measurements and, because the levels are large and the slope does not encompass 1, suggest that the knee's PTE at 2.1 rad/sec is unreliable.
Abstract: Comparative analyses of a variable measured twice or against a "gold standard" technique should explore the existence of any fixed and proportional biases between the 2 measurements. Levels of agreement (LOA) consider these biases together, and least products regression (LPR) considers their effects independently. To compare the use of LOA and LPR, the peak torque extension (PTE) of the knee at 2.1 rad/sec during isokinetic dynamometry was obtained on 2 separate days (N = 17). The mean PTE (with standard deviations in parentheses) was found to be 93.6 (13.9) N · m on Day 1 and 92.5 (11.5) N · m on Day 2. The LOA were 1.06 ± 10.80 N · m (95% confidence), and the LPR intercept (with 95% confidence intervals in parentheses) was -17.7 N · m (-37.4 to 2.03) and the slope was 1.20 (1.01 to 1.40). LOA and LPR are suitable techniques to compare 2 measurements and, because the levels are large and the slope does not encompass 1, suggest that the knee's PTE at 2.1 rad/sec is unreliable.

25 citations
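
The two quantities compared in the entry above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes the LOA take the common "mean difference ± 1.96 SD of the differences" form and that LPR is the geometric-mean (ordinary least products) fit, whose slope is the ratio of the two standard deviations. The function names and the torque values are hypothetical, and the confidence intervals reported in the abstract are not computed here.

```python
import math
import statistics


def limits_of_agreement(first, second, z=1.96):
    """Return (bias, half_width) so that the LOA are bias +/- half_width."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)                # fixed bias between the two measurements
    half_width = z * statistics.stdev(diffs)     # spread of the individual differences
    return bias, half_width


def least_products_regression(x, y):
    """Return (intercept, slope) of the geometric-mean regression of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = math.copysign(sd_y / sd_x, cross)    # sign follows the correlation
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical peak-torque values (N·m); NOT the data from the study.
    day1 = [90.0, 105.0, 78.0, 112.0, 95.0, 88.0]
    day2 = [92.0, 101.0, 80.0, 108.0, 97.0, 85.0]
    bias, half_width = limits_of_agreement(day1, day2)
    intercept, slope = least_products_regression(day1, day2)
    print(f"LOA: {bias:.2f} +/- {half_width:.2f} N·m")
    print(f"LPR: intercept {intercept:.2f}, slope {slope:.2f}")
```

Under this reading, a slope whose confidence interval excludes 1 points to proportional bias, and an intercept whose interval excludes 0 points to fixed bias, which is how the abstract interprets its own results.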


Journal ArticleDOI
TL;DR: The purpose of these studies was to develop a new questionnaire, called the Body Self-Image Questionnaire (BSIQ), to measure body image in young adults.
Abstract: The purpose of these studies was to develop a new questionnaire, called the Body Self-Image Questionnaire (BSIQ), to measure body image in young adults. During the questionnaire development process...

13 citations


Journal ArticleDOI
TL;DR: The dependability of the Profile of Mood States (POMS) is examined in several different ways and, in doing so, illustrates the flexibility of Generalizability theory as mentioned in this paper.
Abstract: The dependability of the Profile of Mood States (POMS) is examined in several different ways and, in doing so, illustrates the flexibility of Generalizability theory. One issue evaluated is the impact of changing the design by sampling more dimensions of mood disturbance, or by increasing the numbers of items per subscale. The second issue examined is the impact of considering scales as fixed effects versus random effects. Each of the POMS' 6 subscales displayed reasonably high G coefficients (Eρ² ≥ .74). The generalizability of the overall Total Mood Score (TMS) was found to be fairly high (~.90), even if scales were treated as random. In the random case, generalizability would tend to increase modestly if the number of scales were increased, and would increase very little if the number of items per scale were doubled. Treating subscales as fixed had a substantial positive impact on the dependability of the TMS, yielding a G coefficient of .96.

8 citations
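
The design questions raised in the entry above (more scales, more items per scale, scales fixed versus random) can be explored with the textbook G-coefficient formula for a balanced persons × (items nested in scales) design. The sketch below is an assumption-laden illustration: the function name and the variance components are hypothetical, the POMS subscales are treated as if they had equal numbers of items, and none of the numbers come from the study.

```python
def g_coefficient(var_p, var_ps, var_pis, n_scales, n_items, scales_fixed=False):
    """Relative G coefficient for a balanced p x (i:s) design.

    var_p   : person variance
    var_ps  : person-by-scale interaction variance
    var_pis : person-by-item-within-scale variance (confounded with residual error)
    """
    scale_term = var_ps / n_scales
    item_term = var_pis / (n_scales * n_items)
    if scales_fixed:
        # Fixed scales: the person-by-scale term becomes part of the universe score.
        universe = var_p + scale_term
        error = item_term
    else:
        # Random scales: the person-by-scale term counts as error.
        universe = var_p
        error = scale_term + item_term
    return universe / (universe + error)


if __name__ == "__main__":
    # Hypothetical variance components, chosen only to show the direction of each effect.
    base = dict(var_p=1.0, var_ps=0.5, var_pis=2.0)
    print("random scales      :", round(g_coefficient(**base, n_scales=6, n_items=5), 3))
    print("more scales        :", round(g_coefficient(**base, n_scales=9, n_items=5), 3))
    print("items doubled      :", round(g_coefficient(**base, n_scales=6, n_items=10), 3))
    print("scales fixed       :", round(g_coefficient(**base, n_scales=6, n_items=5, scales_fixed=True), 3))
```

How much each change helps depends entirely on the relative size of the variance components; the pattern reported in the abstract would correspond to a person-by-scale component that dominates the item-level residual once it is averaged over items.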


Journal ArticleDOI
TL;DR: The ability of the BIA device to categorize into normal and obese categories when compared to the skinfold technique was also impressive, but the results of the limits of agreement analysis showed that the approximate 95% CI for the differen...
Abstract: This study sought to determine whether body fat percentage estimates using an inexpensive bioelectrical impedance analysis (BIA) device (Tanita TBF-521, Tanita Corp. of America, Arlington Heights, IL) could be freely substituted for those obtained utilizing skinfolds (method equivalence). Data were analyzed from measurements on 38 children age 6 to 12 years. Random effect intraclass correlation (r = .930; 95% confidence interval [CI] = .87 to .96), fixed effect intraclass correlation (r = .932; 95% CI = .87 to .96), and least-products regression supported the equivalence of the 2 methods. The technical error between the 2 methods (3.7; coefficient of variation = 12.2%) was small and also supported the equivalence of the 2 methods. The ability of the BIA device to categorize into normal and obese categories when compared to the skinfold technique was also impressive (κ = .95; 95% CI = .73 to .99). However, the results of the limits of agreement analysis showed that the approximate 95% CI for the differen...

7 citations


Journal ArticleDOI
TL;DR: In this article, the authors examined the dependability of changes in Body Awareness Scale (BAS) scores from baseline to precompetition and concluded that the BAS in its present 14-item form had a high enough generalizability coefficient to support its intended uses.
Abstract: The Body Awareness Scale (BAS) is an as-yet unpublished measure of somatic arousal that is still in the process of being validated. The purpose of this study was to examine the dependability of changes in BAS scores from baseline to precompetition. Twenty-two male collegiate athletes completed the BAS on 2 separate occasions: baseline and precompetition. Generalizability theory was used to examine the score consistency or dependability of observed baseline scores, observed precompetition scores, and observed baseline to precompetition change scores on the BAS. Results indicated that the BAS had a relatively low generalizability at baseline for these athletes. However, the precompetition scores and the change scores both demonstrated relatively high levels of generalizability. It was concluded that the BAS in its present 14-item form had a high enough generalizability coefficient to support its intended uses.

Journal ArticleDOI
TL;DR: In this article, a simple linear regression model is proposed to deal with the tendency of change scores to be negatively correlated with initial scores, and the relationship between change scores and initial score is shown to be linear, even though the underlying model is exponential.
Abstract: Hale and Hale (1972) proposed an exponential model to deal with the tendency of change scores to be negatively correlated with initial scores. The Hale and Hale method is quite complicated and requires specification of a maximum score, which is fairly arbitrary in many cases. The model developed in this article assumes that the rate of change is inversely related to the difference between current performance and best possible performance, and therefore starts from assumptions that are similar to those of Hale and Hale. The new model implies an exponential relationship between change scores and time, but the relationship between change scores and initial score is shown to be linear, even though the underlying model is exponential. Thus, simple linear regression provides an effective basis for analysis. Applications of this model are therefore potentially much simpler than the Hale and Hale model. As an example, the model is applied to children's data on motor learning over repeated trials.
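
The claim that an exponential learning model yields a linear relation between change and initial score can be checked with a short derivation. This is a sketch under an assumed reading of the abstract, namely that the rate of change is proportional to the remaining gap between best possible performance and current performance; the symbols $y$, $M$, $k$, $a$, and $b$ are introduced here for illustration and are not taken from the article.

```latex
% Assumed model: improvement rate proportional to the remaining gap to the best score M.
\frac{dy}{dt} = k\,\bigl(M - y(t)\bigr), \qquad k > 0, \quad y(0) = y_0 .

% Solution: performance approaches M exponentially in time.
y(t) = M - (M - y_0)\,e^{-kt} .

% Change score after time t: exponential in t, linear in the initial score y_0.
\Delta(t) = y(t) - y_0 = (M - y_0)\bigl(1 - e^{-kt}\bigr) = a + b\,y_0 ,
\qquad b = -\bigl(1 - e^{-kt}\bigr), \quad a = -b\,M .
```

Because $-1 < b < 0$ for any fixed $t > 0$, change scores are negatively related to initial scores, and regressing $\Delta$ on $y_0$ estimates $a$ and $b$ directly. Under this reading the maximum is recoverable as $M = -a/b$ rather than having to be specified in advance.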

Journal ArticleDOI
TL;DR: This article investigated the validity and reliability of the 1-mile track jog test predictions of VO2max in African Americans and Whites and found that multiple trials resulted in highly and equally reliable estimates for both ethnic groups (r > .98).
Abstract: The purpose of this study was to investigate the validity and reliability of the 1-mile track jog test predictions of aerobic capacity (VO2max) in African Americans and Whites. Thirty Whites (15 men and 15 women) and 31 African Americans (15 men and 16 women), ranging in age from 18 to 38 years, voluntarily participated in this study. Each participant randomly performed the 1-mile track jog test twice on an indoor track and a maximum graded exercise test on a motor-driven treadmill. Intraclass reliability analysis indicated that multiple trials of the 1-mile track jog test resulted in highly and equally reliable VO2max estimates for both ethnic groups (r > .98). A two-way analysis of variance (ANOVA) indicated that the mean 1-mile track jog estimates of VO2max for Whites (46.25 ± 6.8 ml · kg-1 · min-1) and African Americans (44.1 ± 6.0 ml · kg-1 · min-1) were similar, even though significant differences (p < .05) existed in measured VO2max between the Whites (45.1 ± 9.3 ml · kg-1 · min-1) and African Amer...


Journal ArticleDOI
TL;DR: In this paper, the reliability of peak isometric force during curl-ups in a population of 50- to 84-year-olds was examined, and 17 participants (9 men, 8 women) completed 3 maximal curlups.
Abstract: The purpose of this study was to examine the reliability of peak isometric force during curl-ups in a population of 50- to 84-year-olds. Seventeen participants (9 men, 8 women) completed 3 maximal ...

Journal ArticleDOI
TL;DR: The authors explored the utility of ARIMA to describe the process of skill acquisition and found that it can capture the correlation of variances between prior and current performance in a modified Stroop task.
Abstract: Most skill acquisition studies examine pre-post performance test differences using t test or analysis of variance methods. These methods assume independence of variance across trials. That is, the error associated with Trial 1 is uncorrelated with Trial 2 error. This statistical approach is founded on the assumption that information from early trials does not affect later performance. In the case of within-subject performance change, this assumption may be challenged. The purpose of this review is to explore the utility of Autoregressive Integrated Moving Average (ARIMA; McCleary & Hay, 1980) modeling to describe the process of skill acquisition. ARIMA modeling assumes correlation of variances between prior and current performance. Older and younger participants who performed 300 trials of a modified Stroop task produced the data used in this assessment. Group models suggested similar performance trends; however, these group models fail to recognize significant individual variation in skill acquisition pa...
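
The independence assumption questioned in the entry above can be probed before any ARIMA model is fitted by checking whether successive trials are correlated. The sketch below computes a lag-1 autocorrelation for one participant's trial series; it is a simple diagnostic, not ARIMA modeling itself, and the function name and response times are hypothetical.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a single participant's trial-by-trial scores."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Cross-products of each trial's deviation with the previous trial's deviation.
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return num / denom


if __name__ == "__main__":
    # Hypothetical response times (ms) over successive trials; NOT data from the study.
    response_times = [910, 880, 865, 850, 842, 830, 828, 815, 810, 805]
    print(round(lag1_autocorrelation(response_times), 3))
```

A value well away from zero indicates that a trial carries information about the next one, which is the situation in which t tests or ANOVA on independent errors become questionable and a time-series model is worth considering.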