
Showing papers on "Intraclass correlation" published in 2012


Journal ArticleDOI
TL;DR: The iHOT-33 has been shown to be reliable; shows face, content, and construct validity; and is highly responsive to clinical change; and can be used as a primary outcome measure for prospective patient evaluation and randomized clinical trials.
Abstract: Purpose: The purpose of this study was to develop a self-administered evaluative tool to measure health-related quality of life in young, active patients with hip disorders. Methods: This outcome measure was developed for active patients (aged 18 to 60 years, Tegner activity level 4) presenting with a variety of symptomatic hip conditions. This multicenter study recruited patients from international hip arthroscopy and arthroplasty surgeon practices. The outcome was created using a process of item generation (51 patients), item reduction (150 patients), and pretesting (31 patients). The questionnaire was tested for test-retest reliability (123 patients); face, content, and construct validity (51 patients); and responsiveness over a 6-month period in post-arthroscopy patients (27 patients). Results: Initially, 146 items were identified. This number was reduced to 60 through item reduction, and the items were categorized into 4 domains: (1) symptoms and functional limitations; (2) sports and recreational physical activities; (3) job-related concerns; and (4) social, emotional, and lifestyle concerns. The items were then formatted using a visual analog scale. Test-retest reliability showed Pearson correlations greater than 0.80 for 33 of the 60 questions. The intraclass correlation statistic was 0.78, and the Cronbach α was .99. Face validity and content validity were ensured during development, and construct validity was shown with a correlation of 0.81 to the Non-Arthritic Hip Score. Responsiveness was shown with a paired t test (P .01), effect size of 2.0, standardized response mean of 1.7, responsiveness ratio of 6.7, and minimal clinically important difference of 6 points. Conclusions: We have developed a new quality-of-life patient-reported outcome measure, the 33-item International Hip Outcome Tool (iHOT-33).
This questionnaire uses a visual analog scale response format designed for computer self-administration by young, active patients with hip pathology. Its development has followed the most rigorous methodology involving a very large number of patients. The iHOT-33 has been shown to be reliable; shows face, content, and construct validity; and is highly responsive to clinical change. In our opinion the iHOT-33 can be used as a primary outcome measure for prospective patient evaluation and randomized clinical trials.

355 citations


Journal ArticleDOI
TL;DR: The BDI-II is reliable and valid for measuring depressive symptomatology among Portuguese-speaking Brazilian non-clinical populations and taking the SCID as the gold standard for detecting depression was the best threshold.

330 citations


Journal ArticleDOI
TL;DR: This paper presents a method that explicitly incorporates a prespecified probability of achieving the prespecified width or lower limit of a confidence interval, and the resultant closed-form formulas are shown to be very accurate.
Abstract: The number of subjects required to estimate the intraclass correlation coefficient in a reliability study has usually been determined on the basis of the expected width of a confidence interval. However, this approach fails to explicitly consider the probability of achieving the desired interval width and may thus provide sample sizes that are too small to have adequate chance of achieving the desired precision. In this paper, we present a method that explicitly incorporates a prespecified probability of achieving the prespecified width or lower limit of a confidence interval. The resultant closed-form formulas are shown to be very accurate. Copyright © 2012 John Wiley & Sons, Ltd.

296 citations


Journal ArticleDOI
TL;DR: Shear wave elastography is a reliable and reproducible noninvasive method for the assessment of liver elasticity, and the expert operator had higher reproducibility of measurements over time than the novice operator.

230 citations


Journal ArticleDOI
TL;DR: In this paper, the authors evaluated agreement among three generations of ActiGraph accelerometers in children and adolescents for vertical axis counts, vector magnitude counts, and time spent in moderate-to-vigorous physical exercise (MVPA).
Abstract: In this study, we evaluated agreement among three generations of ActiGraph™ accelerometers in children and adolescents. Twenty-nine participants (mean age = 14.2 ± 3.0 years) completed two laboratory-based activity sessions, each lasting 60 min. During each session, participants concurrently wore three different models of the ActiGraph™ accelerometers (GT1M, GT3X, GT3X+). Agreement among the three models for vertical axis counts, vector magnitude counts, and time spent in moderate-to-vigorous physical exercise (MVPA) was evaluated by calculating intraclass correlation coefficients and Bland-Altman plots. The intraclass correlation coefficient for total vertical axis counts, total vector magnitude counts, and estimated MVPA was 0.994 (95% CI = 0.989-0.996), 0.981 (95% CI = 0.969-0.989), and 0.996 (95% CI = 0.989-0.998), respectively. Inter-monitor differences for total vertical axis and vector magnitude counts ranged from 0.3% to 1.5%, while inter-monitor differences for estimated MVPA were equal to or close to zero. On the basis of these findings, we conclude that there is strong agreement between the GT1M, GT3X, and GT3X+ activity monitors, thus making it acceptable for researchers and practitioners to use different ActiGraph™ models within a given study.

193 citations
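
The ActiGraph abstract above pairs intraclass correlation coefficients with Bland-Altman plots. As a rough illustration of the quantities behind such a plot, the sketch below computes the mean difference (bias) and the conventional 95% limits of agreement for paired readings. The function name and the sample counts are hypothetical and are not taken from the study.

```python
import statistics

def bland_altman_limits(a, b):
    """Return (bias, lower, upper) for paired readings a and b.

    bias is the mean of the pairwise differences; the limits are
    bias -/+ 1.96 times the sample SD of those differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical counts from two monitors worn at the same time (not study data).
monitor_a = [100.0, 102.0, 98.0, 101.0, 99.0]
monitor_b = [99.0, 103.0, 97.0, 100.0, 101.0]
bias, lower, upper = bland_altman_limits(monitor_a, monitor_b)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits is the pattern one would expect when inter-monitor differences are small, as the abstract reports.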


Journal ArticleDOI
TL;DR: The results show that ICC estimates from different methods can be quite different, although confidence intervals generally overlap, and investigators should consider using sample size and analysis methods that allow the ICC to vary by study condition.

186 citations


Journal Article
TL;DR: High reliability and relatively moderate validity were found for the Persian translated MAQ in adults from Tehran, Iran, however, further studies with larger sample sizes are suggested to more precisely assess the validity of the MAQ.
Abstract: Background: The purpose of this study is to evaluate the validity and reliability of a Persian translation of the Modifiable Activity Questionnaire (MAQ) in a sample of adults from Tehran, Iran. Methods: There were 48 adults (53.1% males) enrolled to test the physical activity questionnaire. A sub-sample included 33 participants (45.5% males) who assessed the reliability of the physical activity questionnaire. The validity was tested in 25 individuals (48.0% males). The reliability of two MAQs was calculated by intraclass correlation coefficients. The validation study was evaluated with the Spearman correlation coefficients to compare data between the means of 2 MAQs and the means of 4 physical activity records. Results: Intraclass correlation coefficients between 2 MAQs for the previous year's leisure time was 0.94; for occupational, it was 0.98; and for total (leisure and occupational combined) physical activity, it was 0.97. The Spearman correlation coefficients between the means of the 2 MAQs and means of the 4 physical activity records was 0.39 (P = 0.05) for leisure time, 0.36 (P = 0.07) for occupational, and 0.47 (P = 0.01) for total (leisure and occupational combined) physical activities. Conclusions: High reliability and relatively moderate validity were found for the Persian translated MAQ in adults from Tehran. However, further studies with larger sample sizes are suggested to more precisely assess the validity of the MAQ.

175 citations


Journal ArticleDOI
TL;DR: The results of the study suggest that the Upper Quarter Y Balance Test is a reliable test for measuring upper extremity reach distance while in a closed-chain position and there was no significant difference in performance between genders or between sides on the test when normalized to limb length.
Abstract: The inclusion of movement tests before performance training and sport participation is gaining popularity as part of musculoskeletal screening for injury. The identification of an athlete's asymmetries and poor performance in the preseason allows coaches and sports medicine clinicians the opportunity to proactively address these deficits to reduce the potential for injury. Currently, there are no tests reported in the literature that simultaneously require shoulder and core stability while taking the subjects through a large range of motion at the end range of their stability. Thus, the purpose of this article was to describe the Upper Quarter Y Balance Test and report the gender differences in the performance of the test. Upper extremity reach distances were measured in 95 active adults using a standardized upper extremity balance-and-reach protocol. Intraclass correlation coefficients were used to assess reliability, and gender differences were analyzed using an independent samples t-test, whereas bilateral differences were analyzed using a dependent samples t-test for the normalized composite reach scores. Intraclass correlation coefficient (3,1) for test-retest reliability ranged from 0.80 to 0.99. Intraclass correlation coefficient (3,1) for interrater reliability was 1.00. Average composite scores (right/left) reported as a percentage of limb length were 81.7/82.3% for men and 80.7/80.7% for women. The results of the study suggest that the Upper Quarter Y Balance Test is a reliable test for measuring upper extremity reach distance while in a closed-chain position. It was further determined that there was no significant difference in performance between genders or between sides on the test when normalized to limb length. Coaches and sports medicine professionals may consider incorporating the Upper Quarter Y Balance Test as part of their preprogram testing to identify movement limitations and asymmetries in athletes and thereby may reduce injury.

171 citations


Journal ArticleDOI
TL;DR: The OSPAQ brief instrument measures sitting and standing at work as distinct behaviors and would be especially suitable in national health surveys, prospective cohort studies, and other studies that are limited by space constraints for questionnaire items.
Abstract: Purpose: Sitting at work is an emerging occupational health risk. Few instruments designed for use in population-based research measure occupational sitting and standing as distinct behaviors. This study aimed to develop and validate a brief measure of occupational sitting and physical activity. Methods: A convenience sample (n = 99, 61% female) was recruited from two medium-sized workplaces and by word-of-mouth in Sydney, Australia. Participants completed the newly developed Occupational Sitting and Physical Activity Questionnaire (OSPAQ) and a modified version of the MONICA Optional Study on Physical Activity Questionnaire (modified MOSPA-Q) twice, 1 wk apart. Participants also wore an ActiGraph accelerometer for the 7 d in between the test and retest. Analyses determined test–retest reliability with intraclass correlation coefficients and assessed criterion validity against accelerometers using the Spearman ρ. Results: The test–retest intraclass correlation coefficients for occupational sitting, standing, and walking for OSPAQ ranged from 0.73 to 0.90, while that for the modified MOSPA-Q ranged from 0.54 to 0.89. Comparison of sitting measures with accelerometers showed higher Spearman correlations for the OSPAQ (r = 0.65) than for the modified MOSPA-Q (r = 0.52). Criterion validity correlations for occupational standing and walking measures were comparable for both instruments with accelerometers (standing: r = 0.49; walking: r = 0.27–0.29). Conclusions: The OSPAQ has excellent test–retest reliability and moderate validity for estimating time spent sitting and standing at work and is comparable to existing occupational physical activity measures for assessing time spent walking at work. The OSPAQ brief instrument measures sitting and standing at work as distinct behaviors and would be especially suitable in national health surveys, prospective cohort studies, and other studies that are limited by space constraints for questionnaire items.

169 citations


Journal ArticleDOI
TL;DR: In this article, the validity, reliability, and responsiveness of the Patient-Specific Functional Scale (PSFS) in patients with musculoskeletal upper extremity problems being treated in physical therapy were examined.
Abstract: Study Design Clinical measurement, longitudinal; multicenter prospective cohort study. Objectives To examine the validity, reliability, and responsiveness of the Patient-Specific Functional Scale (PSFS) in patients with musculoskeletal upper extremity problems being treated in physical therapy. Background The clinimetric properties of the PSFS have not been established nor compared with region-specific outcome measures in patients with upper extremity problems. Methods Patients completed the PSFS, Upper Extremity Functional Index (UEFI), and numeric pain rating scale (NPRS) at baseline and follow-up, and were categorized as improved, stable, or worsened, using the global rating of change. Construct validity was assessed by comparing the change scores of the stable and improved groups, using independent-samples t tests. Reliability was evaluated using intraclass correlation coefficient (ICC2,1) with 95% confidence intervals. Bland-Altman plots determined limits of agreement. Responsiveness and minimal impo...

167 citations


Journal ArticleDOI
TL;DR: The FMS includes 7 tests: deep squat (DS), hurdle step (HS), in-line lunge (IL), shoulder mobility (SM), active straight leg raise (ASLR), trunk stability push-up (TSPU), and rotary stability (RS). With the exception of HS, all tasks displayed moderate to high intersession reliability and good to high interrater reliability.
Abstract: The purpose of this study was to examine the real-time intersession and interrater reliability of the functional movement screen (FMS). The overall study consisted of 19 volunteer civilians (12 male, 7 female). The intersession reliability consisted of 12 men and 7 women, whereas 10 men and 6 women participated in the interrater reliability test session. Two raters (A and B) were involved in the interrater reliability aspect of this study. The FMS includes 7 tests: deep squat (DS), hurdle step (HS), in-line lunge (IL), shoulder mobility (SM), active straight leg raise (ASLR), trunk stability push-up (TSPU), and rotary stability (RS). Researchers analyzed the data via intraclass correlation (ICC). To determine the reliability of the intersession scoring of the FMS and the intrasession interrater scoring of the FMS, a 2-way mixed effects model intraclass correlation coefficient (ICC3,1) was used for the continuous data, whereas a weighted Cohen's kappa (κ) was used for the categorical data. The dependent variables were FMS total score (0–21 scale) and associated tests were DS, HS, IL, SM, ASLR, TSPU, and RS. Intersession reliability (ICC, SEM) and κ were as follows: FMS total score (0.92, 0.51), DS (κ = 0.69), HS (κ = 0.16), IL (κ = 0.69), SM (κ = 0.84), ASLR (κ = 0.69), TSPU (κ = 0.77), and RS (no covariance). Interrater reliability (ICC, SEM) and κ were as follows: FMS total score (0.98, 0.25), DS (κ = 1.0), HS (κ = 0.33), IL (κ = 0.88), SM (κ = 0.90), ASLR (κ = 0.88), TSPU (κ = 0.75), and RS (no covariance). The FMS total scores displayed high intersession and interrater reliabilities. Finally, with the exception of HS, all tasks displayed moderate to high intersession reliability and good to high interrater reliability.
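
The FMS abstract above names a 2-way mixed effects model ICC(3,1). A minimal sketch of how that coefficient can be computed from a subjects-by-raters table follows, using the standard Shrout and Fleiss mean-square form ICC(3,1) = (MSR - MSE) / (MSR + (k - 1) MSE). The function name and the example scores are hypothetical, and this is not the authors' own analysis code.

```python
def icc_3_1(ratings):
    """ICC(3,1) for a table with one row per subject and k scores per row.

    MSR is the between-subjects mean square and MSE the residual mean
    square of a two-way ANOVA without replication.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse)

# Hypothetical scores: rater B is always exactly one point above rater A.
offset_scores = [[1, 2], [2, 3], [3, 4], [4, 5]]
print(icc_3_1(offset_scores))
```

In this hypothetical table the coefficient evaluates to 1, because ICC(3,1) measures consistency and treats a constant offset between raters as a rater effect rather than as error.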

Journal ArticleDOI
TL;DR: To examine the internal consistency, test–retest reliability, and responsiveness of the Movement Assessment Battery for Children–Second Edition (MABC‐2) Test for children with developmental coordination disorder (DCD).
Abstract: Aim To examine the internal consistency, test–retest reliability, and responsiveness of the Movement Assessment Battery for Children–Second Edition (MABC-2) Test for children with developmental coordination disorder (DCD). Method One hundred and forty-four Taiwanese children with DCD aged 6 to 12 years (87 males, 57 females) were tested on three separate occasions: two baseline measurements with a 20-day interval before the intervention, and a follow-up measurement after 6 months of rehabilitation. The therapists rated the performance of children in school-related physical tasks at baseline and after intervention. Results Internal consistency for the MABC-2 Test was α = 0.90. Test–retest reliability for the total score was excellent, with an intraclass correlation coefficient of 0.97. A small to medium magnitude of treatment effect was captured by the MABC-2 Test. The minimal detectable change (MDC) was 0.28 points whereas the minimal important difference (MID) values were from 2.36 to 2.50. All subscales except balance showed acceptable validity in differentiating groups of children whose physical performance had improved or remained stable. Interpretation The MABC-2 Test is a reliable and valid measure to assess motor competence in children with DCD. The MID and MDC scores provide the reference point for clinical decision-making in managing the individual child.

Journal ArticleDOI
TL;DR: In this paper, a modified nontechnical skills (NOTECHS) scale for trauma was developed to teach and assess teamwork skills of multidisciplinary trauma resuscitation teams, which was evaluated for reliability and correlation with clinical performance.
Abstract: Background A modified nontechnical skills (NOTECHS) scale for trauma (T-NOTECHS) was developed to teach and assess teamwork skills of multidisciplinary trauma resuscitation teams. In this study, T-NOTECHS was evaluated for reliability and correlation with clinical performance. Methods Interrater reliability (intraclass correlation coefficient) and correlation with the speed and completeness of resuscitation tasks were assessed during simulation-based teamwork training and during actual trauma resuscitations. Results For T-NOTECHS ratings done in real time, intraclass correlation coefficients were .44 for simulated and .48 for actual resuscitations. Reliability was higher (intraclass correlation coefficient=.71) for video review of resuscitations. Better T-NOTECHS scores were correlated with better performance during simulations, evidenced by a greater number of completed resuscitation tasks ( r = .50, P r = −.38, P P r = −.13, P r = −.16, P Conclusions Improvement in T-NOTECHS scores after teamwork training, and correlation with clinical parameters in simulated and actual trauma resuscitations, suggest its clinical relevance. Further evaluation, aiming to improve reliability, may be warranted.

Journal ArticleDOI
TL;DR: The Brief-BESTest as discussed by the authors is an alternative version of the Balance Evaluation Systems Test that is valid, reliable, time efficient, and founded upon the same theoretical underpinnings as the original test.
Abstract: Background The Balance Evaluation Systems Test (BESTest) and Mini-BESTest are clinical examinations of balance impairment, but the tests are lengthy and the Mini-BESTest is theoretically inconsistent with the BESTest. Objective The purpose of this study was to generate an alternative version of the BESTest that is valid, reliable, time efficient, and founded upon the same theoretical underpinnings as the original test. Design This was a cross-sectional study. Methods Three raters evaluated 20 people with and without a neurological diagnosis. Test items with the highest item-section correlations defined the new Brief-BESTest. The validity of the BESTest, the Mini-BESTest, and the new Brief-BESTest to identify people with or without a neurological diagnosis was compared. Interrater reliability of the test versions was evaluated by intraclass correlation coefficients. Validity was further investigated by determining the ability of each version of the examination to identify the fall status of a second cohort of 26 people with and without multiple sclerosis. Results Items of hip abductor strength, functional reach, one-leg stance, lateral push-and-release, standing on foam with eyes closed, and the Timed “Up & Go” Test defined the Brief-BESTest. Intraclass correlation coefficients for all examination versions were greater than .98. The accuracy of identifying people from the first cohort with or without a neurological diagnosis was 78% for the BESTest versus 72% for the Mini-BESTest or Brief-BESTest. The sensitivity to fallers from the second cohort was 100% for the Brief-BESTest, 71% for the Mini-BESTest, and 86% for the BESTest, and all versions exhibited specificity of 95% to 100% to identify nonfallers. Limitations Further testing is needed to improve the generalizability of findings. 
Conclusions Although preliminary, the Brief-BESTest demonstrated reliability comparable to that of the Mini-BESTest and potentially superior sensitivity while requiring half the items of the Mini-BESTest and representing all theoretically based sections of the original BESTest.

Journal ArticleDOI
TL;DR: Investigation of the use of ICCs in the orthopedic literature found that researchers clarify ICC models used and ICC values are interpreted in the context of measurement, but care should be taken when interpreting the absolute ICC values.
Abstract: Background: Intra-class correlation coefficients (ICCs) provide a statistical means of testing the reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use. Methods: First, orthopedic articles that used ICCs were retrieved from the PubMed database, and journal demography, ICC models and concurrent statistics used were evaluated. Second, reliability test was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods testing reliability. Third, the factors affecting the ICC values were examined by simulating the data sets based on the physical examination data where the ranges, slopes, and interobserver variability were modified. Results: Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described all models, types, and measures. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, the ICC of popliteal angle was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measures used. In simulated data sets, the ICC showed higher values when the range of data sets were larger, the slopes of the data sets were parallel, and the interobserver variability was smaller. Conclusions: Care should be taken when interpreting the absolute ICC values, i.e., a higher ICC does not necessarily mean less variability because the ICC values can also be affected by various factors.
The authors recommend that researchers clarify ICC models used and ICC values are interpreted in the context of measurement.
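
The dependence of the ICC on data range described above can be illustrated with the basic variance-ratio definition, ICC = between-subject variance / (between-subject variance + error variance). The sketch below holds a hypothetical interobserver SD fixed and changes only the spread of the subjects; the function name and all numbers are illustrative assumptions, not values from the study.

```python
def icc_from_variances(between_var, error_var):
    """ICC as the share of total variance due to true subject differences."""
    return between_var / (between_var + error_var)

observer_sd = 5.0  # hypothetical interobserver SD, held fixed in both cases
narrow = icc_from_variances(5.0 ** 2, observer_sd ** 2)   # homogeneous sample
wide = icc_from_variances(20.0 ** 2, observer_sd ** 2)    # heterogeneous sample
print(f"narrow range: {narrow:.2f}, wide range: {wide:.2f}")
# prints "narrow range: 0.50, wide range: 0.94"
```

The observers disagree by the same amount in both cases, yet the coefficient rises with the subject spread, which is why a higher ICC does not by itself indicate smaller measurement variability.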

Journal ArticleDOI
TL;DR: The three ladders showed good stability in the test-retest, except the community ladder that showed moderate stability and future qualitative and longitudinal studies are needed to confirm and understand the construct underlying the MacArthur Scale in the country.
Abstract: The MacArthur Scale of Subjective Social Status intends to measure subjective social status using a numbered stepladder image. This study investigated the reliability of the MacArthur scale in a subsample of the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil). Three scales were employed using different references: 1) the overall socioeconomic position; 2) the socioeconomic situation of the participant’s closer community; 3) the workplace as a whole. A total of 245 of the ELSA participants from six states were involved. They were interviewed twice by the same person within an interval of seven to fourteen days. The reliability of the scale was assessed with weighted Kappa statistics and intraclass correlation coefficient (ICC), with their respective 95% confidence interval (CI). Kappa values were 0.62 (0.58 to 0.64) for the society ladder; 0.58 (0.56 to 0.61) for the community-related ladder; and 0.67 (0.66 to 0.72) for the work-related ladder. The ICC ranged from 0.75 for the work ladder to 0.64 for the community ladder. These values differed slightly according to the participants’ age, sex and education category. The three ladders showed good stability in the test-retest, except the community ladder that showed moderate stability. Because the social structure in Brazil is rapidly changing, future qualitative and longitudinal studies are needed to confirm and understand the construct underlying the MacArthur Scale in the country.

Journal ArticleDOI
TL;DR: The Chinese version of the EQ-5D demonstrated acceptable construct validity and fair to moderate levels of test–retest reliability in an urban general population in China.
Abstract: To evaluate the reliability and validity of the EQ-5D in a general population sample in urban China. One thousand eight hundred respondents in 18 communities of Hangzhou, China were recruited by multi-stage stratified random sampling. Respondents self-administered a questionnaire including the EQ-5D, the SF-36, and demographic questions. Test–retest reliability at 2-week intervals was evaluated using the Kappa coefficient and the intraclass correlation coefficient. The standard error of measurement (SEM) was used to indicate the absolute measurement error. Construct validity was established using convergent, discriminant, and known groups analyses. Complete data for all EQ-5D dimensions were available for 1,747 respondents (97%). Kappa values were from 0.35 to 1.0. The ICCs of test–retest reliability were 0.53 for the EQ-5D index score and 0.87 for the EQ VAS score. The SEM values were 0.13 (9.22% range) and 4.20 (4.20% range) for the EQ-5D index and EQ VAS scores, respectively. The Pearson’s correlation coefficients between the EQ-5D and the SF-36 were stronger between comparable dimensions than those between less comparable dimensions, demonstrating convergent and discriminant evidence of construct validity. The Chinese EQ-5D distinguished well between known groups: respondents who reported poor general health and chronic diseases had worse HRQoL than those without. Older people, females, people widowed or divorced, and those with a lower socioeconomic status reported poorer HRQoL. Respondents reporting no problems on any EQ-5D dimension had better scores on the SF-36 summary scores than those reporting problems. The Chinese version of the EQ-5D demonstrated acceptable construct validity and fair to moderate levels of test–retest reliability in an urban general population in China.

Journal ArticleDOI
TL;DR: The reliability and validity of the Dyskinesia Impairment Scale (DIS) was examined, which measures both phenomena in dyskinetic cerebral palsy.
Abstract: Aim The aim of this study was to examine the reliability and validity of the Dyskinesia Impairment Scale (DIS). The DIS consists of two subscales: dystonia and choreoathetosis. It measures both phenomena in dyskinetic cerebral palsy (CP). Method Twenty-five participants with dyskinetic CP (17 males; eight females; age range 5-22y; mean age 13y 6mo; SD 5y 4mo), recruited from special schools for children with motor disorders, were included. Exclusion criteria were changes in muscle relaxant medication within the previous 3 months, orthopaedic or neurosurgical interventions within the previous year, and spinal fusion. Interrater reliability was verified by two independent raters. For interrater reliability, intraclass correlation coefficients were assessed. Standard error of measurement, the minimal detectable difference, and Cronbach's alpha for internal consistency were determined. For concurrent validity of the DIS dystonia subscale, the Barry-Albright Dystonia Scale was administered. Results The intraclass correlation coefficient for the total DIS score and the two subscales ranged between 0.91 and 0.98 for interrater reliability. The reliability of the choreoathetosis subscale was found to be higher than that of the dystonia subscale. The standard error of the measurement and minimal detectable difference values were adequate. Cronbach's alpha values ranged from 0.89 to 0.93. Pearson's correlation between the dystonia subscale and Barry-Albright Dystonia Scale was 0.84 (p

Journal ArticleDOI
TL;DR: The Norwegian PCS total score showed acceptable psychometric properties in terms of comprehensibility, consistency, construct validity, and reproducibility when applied to patients with subacute or chronic LBP from different clinical settings.
Abstract: Pain catastrophizing has been found to be an important predictor of disability and days lost from work in patients with low back pain. The most commonly used outcome measure to identify pain catastrophizing is the Pain Catastrophizing Scale (PCS). To enable the use of the PCS in clinical settings and research in Norwegian speaking patients, the PCS had to be translated. The purpose of this study was therefore to translate and cross-culturally adapt the PCS into Norwegian and to test internal consistency, construct validity and reproducibility of the PCS. The PCS was translated before it was tested for psychometric properties. Patients with subacute or chronic non-specific low back pain aged 18 years or more were recruited from primary and secondary care. Validity of the PCS was assessed by evaluating data quality (missing, floor and ceiling effects), principal components analysis, internal consistency (Cronbach’s alpha), and construct validity (Spearman’s rho). Reproducibility analyses included standard error of measurement, minimum detectable change, limits of agreement, and intraclass correlation coefficients. A total of 38 men and 52 women (n = 90), with a mean (SD) age of 47.6 (11.7) years, were included for baseline testing. A subgroup of 61 patients was included for test-retest assessments. The Norwegian PCS was easy-to-comprehend. The principal components analysis supported a three-factor structure, internal consistency was satisfactory for the PCS total score (α 0.90) and the subscales rumination (α 0.83) and helplessness (α 0.86), but not for the subscale magnification (α 0.53). In total, 86% of the correlation analyses were in accordance with predefined hypothesis. The reliability analyses showed intraclass correlation coefficients of 0.74 − 0.87 for the PCS total score and subscales. The PCS total score (range 0–52 points) showed a standard error of measurement of 4.6 points and a 95% minimum detectable change estimate of 12.8 points. 
The Norwegian PCS total score showed acceptable psychometric properties in terms of comprehensibility, consistency, construct validity, and reproducibility when applied to patients with subacute or chronic LBP from different clinical settings. Our study supports the use of the PCS total score for clinical or research purposes identifying or evaluating pain catastrophizing.
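
The PCS abstract reports a standard error of measurement of 4.6 points and a 95% minimum detectable change of 12.8 points. The two figures are consistent with the commonly used relation MDC95 = 1.96 × √2 × SEM. The abstract does not state which formula the authors applied, so the sketch below is an assumption that happens to reproduce the reported value; the function name is hypothetical.

```python
import math

def mdc95(sem):
    """95% minimum detectable change from a standard error of measurement.

    The factor sqrt(2) reflects that a change score combines the error
    of two measurements.
    """
    return 1.96 * math.sqrt(2) * sem

print(round(mdc95(4.6), 1))  # 12.8, matching the value reported in the abstract
```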

Journal ArticleDOI
TL;DR: The modified Gait Efficacy Scale (mGES) is a reliable and valid measure of confidence in walking among community-dwelling older adults.
Abstract: Background: Perceived ability or confidence plays an important role in determining function and behavior. The modified Gait Efficacy Scale (mGES) is a 10-item self-report measure used to assess walking confidence under challenging everyday circumstances. Objective: The purpose of this study was to determine the reliability, internal consistency, and validity of the mGES as a measure of gait in older adults. Design: This was a cross-sectional study. Methods: Participants were 102 community-dwelling older adults (mean [±SD] age = 78.6 ± 6.1 years) who were independent in ambulation with or without an assistive device. Participants were assessed using the mGES and measures of confidence and fear, measures of function and disability, and performance-based measures of mobility. In a subsample (n = 26), the mGES was administered twice within a 1-month period to establish test-retest reliability through the intraclass correlation coefficient (ICC [2,1]). The standard error of measure (SEM) was determined from the ICC and standard deviation. The Cronbach α value was calculated to determine internal consistency. To establish the validity of the mGES, the Spearman rank order correlation coefficient was used to examine the association with measures of confidence, fear, gait, and physical function and disability. Results: The mGES demonstrated test-retest reliability within the 1-month period (ICC = .93, 95% confidence interval = .85, .97). The SEM of the mGES was 5.23. The mGES was internally consistent across the 10 items (Cronbach α = .94). The mGES was related to measures of confidence and fear (r = .54–.88), function and disability (Late-Life Function and Disability Instrument, r = .32–.88), and performance-based mobility (r = .38–.64). Limitations: This study examined only community-dwelling older adults. The results, therefore, should not be generalized to other patient populations.
Conclusion: The mGES is a reliable and valid measure of confidence in walking among community-dwelling older adults.
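As a rough illustration of the two quantities named in the methods, the sketch below computes ICC(2,1) from a subjects-by-sessions table using the standard two-way random-effects mean squares, then derives an SEM as SD × √(1 − ICC). The SEM formula is the usual convention and is assumed here, since the abstract only says the SEM was determined from the ICC and standard deviation. The ratings table is the small worked example commonly attributed to Shrout and Fleiss, not data from this study, and the SD of 20.0 is a hypothetical value.

```python
import math


def icc_2_1(table):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `table` is a list of rows, one row per subject, one column per session/rater.
    """
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)            # between-subjects mean square
    ms_cols = ss_cols / (k - 1)            # between-sessions mean square
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def sem_from_icc(sd: float, icc: float) -> float:
    """SEM under the assumed convention SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


# Illustrative ratings: 6 subjects rated by 4 judges (not study data).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(ratings)
print(round(icc, 2))  # 0.29

# Hypothetical SD of 20.0 combined with the ICC of .93 reported above.
sem = sem_from_icc(20.0, 0.93)
```

Because ICC(2,1) penalises systematic differences between sessions, it is generally lower than a consistency-type ICC computed on the same table.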

Journal ArticleDOI
TL;DR: The enhanced UAS has adequate measurement properties to support its use in clinical research; its minimal important difference was estimated through distribution- and anchor-based approaches.
Abstract: Background: The Urticaria Activity Score (UAS) is a widely used patient-reported outcome measure for patients with chronic idiopathic urticaria (CIU) that includes 2 items: intensity of pruritus and number of hives. Items are scored individually, and the UAS7 is calculated as the sum of pruritus and number of hives over 1 week. Recently, its instructions were enhanced. Objective: To assess the measurement properties of the enhanced UAS. Methods: Seventy-three subjects with CIU completed the UAS with enhanced instructions, other measures of disease activity including the size of the largest hive, and collateral measures during a multicenter, randomized, double-blind, placebo-controlled study of omalizumab for the treatment of CIU. The minimal important difference (MID) was estimated through distribution- and anchor-based approaches. Test-retest reliability was assessed with the intraclass correlation coefficient (ICC); internal consistency reliability was evaluated with Cronbach's alpha; 3 responsiveness coefficients were calculated; known groups validity was assessed based on physician in-clinic UAS scores; and construct validity was assessed through Spearman correlation coefficients with collateral measures. Results: The MID ranged from 9.5 to 10.5 for the UAS7, 5.0 to 5.5 for number of hives (weekly average), and 4.5 to 5.0 for pruritus and size of largest hive (weekly average). Internal consistency was supported by alpha coefficients greater than 0.80. The ICC values for test-retest reliability ranged from 0.602 to 0.884. For subjects on active treatment, responsiveness coefficients were greater than 0.80. Known-groups validity was supported for most UAS scores; and construct validity was demonstrated by relationships with collateral measures. Conclusions: The enhanced UAS has adequate measurement properties to support its use in clinical research.

Journal ArticleDOI
TL;DR: SEFAS is a self-reported foot and ankle score with good validity, reliability and responsiveness, indicating that the score can be used to evaluate patients with osteoarthritis or inflammatory arthritis of the ankle and outcome of surgery.
Abstract: Background and purpose: A questionnaire was introduced by the New Zealand Arthroplasty Registry for use when evaluating the outcome of total ankle replacement surgery. We evaluated the reliability, validity, and responsiveness of the modified Swedish version of the questionnaire (SEFAS) in patients with osteoarthritis or inflammatory arthritis before and/or after their ankle was replaced or fused. Patients and methods: The questionnaire was translated into Swedish and cross-culturally adapted according to a standardized procedure. It was sent to 135 patients with ankle arthritis who were scheduled for or had undergone surgery, together with the foot and ankle outcome score (FAOS), the short form 36 (SF-36) score, and the EuroQol (EQ-5D) score. Construct validity was evaluated with Spearman’s correlation coefficient when comparing SEFAS with FAOS, SF-36, and EQ-5D, content validity by calculating floor and ceiling effects, test-retest reliability with intraclass correlation coefficient (ICC), internal consistency with Cronbach’s alpha (n = 62), agreement by Bland-Altman plot, and responsiveness by effect size and standardized response mean (n = 37). Results: For construct validity, we correlated SEFAS with the other scores and 70% or more of our predefined hypotheses concerning correlations could be confirmed. There were no floor or ceiling effects. ICC was 0.92 (95% CI: 0.88–0.95), Cronbach’s alpha 0.96, effect size was 1.44, and the standardized response mean was 1.00. Interpretation: SEFAS is a self-reported foot and ankle score with good validity, reliability and responsiveness, indicating that the score can be used to evaluate patients with osteoarthritis or inflammatory arthritis of the ankle and outcome of surgery.

Journal ArticleDOI
TL;DR: 1RM power clean testing has a high degree of reproducibility in trained male adolescent athletes when standardized testing procedures are followed and qualified instruction is present; a change of at least 8.0 kg between tests is needed to indicate a real change in lifting performance in young lifters.
Abstract: Although the power clean test is routinely used to assess strength and power performance in adult athletes, the reliability of this measure in younger populations has not been examined. Therefore, the purpose of this study was to determine the reliability of the 1-repetition maximum (1RM) power clean in adolescent athletes. Thirty-six male athletes (age 15.9 ± 1.1 years, body mass 79.1 ± 20.3 kg, height 175.1 ± 7.4 cm) who had >1 year of training experience in weightlifting exercises performed a 1RM power clean on 2 nonconsecutive days in the afternoon following standardized procedures. All test procedures were supervised by a senior level weightlifting coach and consisted of a systematic progression in test load until the maximum resistance that could be lifted for 1 repetition using proper exercise technique was determined. Data were analyzed using an intraclass correlation coefficient (ICC[2,k]), Pearson correlation coefficient (r), repeated measures analysis of variance, Bland-Altman plot, and typical error analyses. Analysis of the data revealed that the test measures were highly reliable, demonstrating a test-retest ICC of 0.98 (95% confidence interval = 0.96–0.99). Testing also demonstrated a strong relationship between 1RM measures in trials 1 and 2 (r = 0.98, p < 0.0001) with no significant difference in power clean performance between trials (70.6 ± 19.8 vs. 69.8 ± 19.8 kg). Bland-Altman plots confirmed no systematic shift in 1RM between trials 1 and 2. The typical error to be expected between 1RM power clean trials is 2.9 kg, and a change of at least 8.0 kg is indicated to determine a real change in lifting performance between tests in young lifters. No injuries occurred during the study period, and the testing protocol was well tolerated by all the subjects.
These findings indicate that 1RM power clean testing has a high degree of reproducibility in trained male adolescent athletes when standardized testing procedures are followed and qualified instruction is present.
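The Bland-Altman and typical error analyses mentioned in this abstract can be sketched on paired trial data as below. The conventions used are assumptions, not details given in the abstract: limits of agreement are taken as bias ± 1.96 × SD of the paired differences, and typical error as SD of the differences divided by √2. The trial loads are hypothetical values chosen for illustration.

```python
import math
import statistics


def paired_differences(trial1, trial2):
    """Trial 2 minus trial 1 for each athlete."""
    return [b - a for a, b in zip(trial1, trial2)]


def bland_altman(trial1, trial2):
    """Return (bias, lower limit, upper limit) as bias +/- 1.96 * SD of differences."""
    diffs = paired_differences(trial1, trial2)
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def typical_error(trial1, trial2):
    """Typical error under the assumed convention SD(differences) / sqrt(2)."""
    return statistics.stdev(paired_differences(trial1, trial2)) / math.sqrt(2)


# Hypothetical 1RM loads in kg for five lifters on two test days.
day1 = [60.0, 72.5, 85.0, 50.0, 95.0]
day2 = [62.5, 70.0, 85.0, 52.5, 92.5]

bias, lower, upper = bland_altman(day1, day2)
te = typical_error(day1, day2)
print(f"bias={bias:.2f} kg, limits=({lower:.2f}, {upper:.2f}) kg, TE={te:.2f} kg")
```

A bias near zero with symmetric limits corresponds to the "no systematic shift" finding described above; a nonzero bias would instead suggest a learning or fatigue effect between trials.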

Journal ArticleDOI
TL;DR: PASE-C is a reliable and valid instrument for assessing the physical activity level of elderly in Chinese population and demonstrated good test-retest reliability.
Abstract: Objectives: The Physical Activity Scale for the Elderly (PASE) is a widely used questionnaire in epidemiological studies for assessing the physical activity level of the elderly. This study aims to translate and validate the PASE in a Chinese population. Design: Cross-sectional study. Subjects: Chinese elderly aged 65 or above. Methods: The original English version of the PASE was translated into Chinese (PASE-C) following standardized translation procedures. Ninety Chinese elderly aged 65 or above were recruited in the community. Test-retest reliability was determined by comparing the scores obtained from two separate administrations by the intraclass correlation coefficient. Validity was evaluated by Spearman’s rank correlation coefficients between the PASE and the Medical Outcome Survey 36-Item Short Form Health Survey (SF-36), grip strength, single-leg-stance, 5 times sit-to-stand and 10-m walk. Results: PASE-C demonstrated good test-retest reliability (intraclass correlation coefficient = 0.81). Fair to moderate associations were found between PASE-C and most of the subscales of the SF-36 (r_s = 0.285 to 0.578, p < 0.01), grip strength (r_s = 0.405 to 0.426, p < 0.001), single-leg-stance (r_s = 0.470 to 0.548, p < 0.001), 5 times sit-to-stand (r_s = –0.33, p = 0.001) and 10-m walk (r_s = –0.281, p = 0.007). Conclusion: PASE-C is a reliable and valid instrument for assessing the physical activity level of the elderly in the Chinese population.

Journal ArticleDOI
TL;DR: The KOOS outcome measure was successfully translated into Italian, and proved to have good psychometric properties that replicated the results of existing versions and is recommended for clinical and research purposes in patients with knee injuries.

Journal ArticleDOI
TL;DR: The five-repetition sit-to-stand test was a reliable and valid test to measure functional muscle strength in children with spastic diplegia in clinics.
Abstract: Objective: To investigate the psychometric properties of the five-repetition sit-to-stand test, a functional strength test, in children with spastic diplegia. Design: Methodology study. Settings: Hospital, laboratory or home. Participants: In total, 108 children with spastic diplegia and 62 with typical development aged from five to 12 years were tested. For test-retest reliability, 22 children with spastic diplegia were tested twice within one week. Interventions: Not applicable. Main measures: The five-repetition sit-to-stand test measures time needed to complete five consecutive sit-to-stand cycles as quickly as possible. The higher the rate of five-repetition sit-to-stand (repetitions per second), the more strength a person has. Results: The intraclass correlation coefficients of intra-session reliability and test-retest reliability were 0.95 and 0.99 respectively. The minimal detectable difference was 0.06 rep/sec. The convergent validity of the five-repetition sit-to-stand test was supported by significant...

Journal ArticleDOI
TL;DR: The results of this study demonstrate that the behavior of respiratory muscle strength in healthy preschool and school children can be explained by age, height and weight.

Journal ArticleDOI
TL;DR: A threshold value of 73 in Dolphin 3D software was the most accurate for measuring airway volume, but threshold values of 70, 71, 72, 74, and 75 showed no statistically significant differences from the gold standard, indicating that they are also reliable.

Journal ArticleDOI
TL;DR: The OHIP-EDENT-J, a questionnaire on oral health-related QOL comprising 19 items, showed good reliability and validity for edentulous patients.
Abstract: doi: 10.1111/j.1741-2358.2011.00606.x. Reliability and validity of a Japanese version of the Oral Health Impact Profile for edentulous subjects. Objective: To evaluate the reliability and validity of the Japanese version of the Oral Health Impact Profile for edentulous (OHIP-EDENT-J) patients. Background: The Oral Health Impact Profile for edentulous patients is an appropriate instrument for assessing quality of life (QOL) in edentulous patients. However, the reliability and validity of the Japanese version had not been evaluated. Methods: The study was conducted on 116 edentulous patients (Group A, requiring new dentures, n = 61; Group B, already having dentures, n = 55). Cronbach’s alpha (α) was used to measure internal consistency of the summary scores for OHIP-EDENT-J and various subscales in Groups A and B. The intraclass correlation coefficient (ICC) and 95% confidence interval of the summary scores for OHIP-EDENT-J and subscales were calculated. The summary scores for OHIP-EDENT-J in Groups A and B were compared to evaluate content validity. The Spearman’s correlation coefficient between the summary scores for OHIP-EDENT-J and the satisfaction with dentures (100 mm VAS) was calculated for Groups A and B to evaluate concurrent validity. Results: The reliability of the summary scores for OHIP-EDENT-J was good (α = 0.93). The ICC of the summary scores for OHIP-EDENT-J was 0.85. Summary scores for OHIP-EDENT-J were significantly different (p = 0.027) between Group A and Group B, with Group A having the higher value. The Spearman’s correlation coefficient for the degree of satisfaction with dentures and the summary scores for OHIP-EDENT-J, calculated for Groups A and B (n = 107), was −0.609. Conclusion: The OHIP-EDENT-J, a questionnaire on oral health-related QOL comprising 19 items, showed good reliability and validity for edentulous patients.
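For readers unfamiliar with the internal consistency statistic used here, a minimal sketch of Cronbach's alpha follows. It uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores) with sample variances; the response matrix is hypothetical and far smaller than the 19-item questionnaire described above.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [
        statistics.variance([resp[j] for resp in responses]) for j in range(k)
    ]
    # Sample variance of each respondent's summed score.
    total_var = statistics.variance([sum(resp) for resp in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 4 respondents answering 3 items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as evidence that the items measure a common construct.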

Journal ArticleDOI
TL;DR: The Turkish version of the E&R GMFCS is shown to be reliable and valid for assessment of Turkish CP children and high test-retest reliability was found.
Abstract: Purpose: Cerebral palsy (CP) is the most common disability in childhood. The gross motor function classification system (GMFCS) has become an important tool to assess motor function in CP patients. In 2007, the expanded and revised (E&R) version of the GMFCS, which includes an age band for youth 12–18 years of age, was developed. The aim of this study was to evaluate the reliability of the Turkish version of the expanded and revised GMFCS. Methods: We assessed interobserver reliability between two physical medicine and rehabilitation specialists in 136 children with CP and test-retest reliability within a subgroup of 48 patients. Percent agreement, intraclass correlation coefficient (ICC) and κ statistics were used to evaluate reliability. Result: The ICC between two physicians was 0.97 and the total agreement was 89%. This result indicates excellent agreement. The overall weighted κ was 0.86. High test-retest reliability was found (ICC: 0.94 95% confidence interval) and the total agreement was 75% for test-retest reliability. ...