
Showing papers on "Intraclass correlation" published in 2016


Journal ArticleDOI
TL;DR: A practical guideline for clinical researchers to choose the correct form of ICC is provided and the best practice of reporting ICC parameters in scientific publications is suggested.

12,717 citations


Journal ArticleDOI
TL;DR: In this article, the authors review the concepts of agreement and correlation and discuss differences in the application of several commonly used measures, such as the Pearson correlation and the intraclass correlation, noting that the intraclass correlation may not provide sufficient information for investigators if the nature of poor agreement is of interest.
Abstract: Agreement and correlation are widely-used concepts that assess the association between variables. Although similar and related, they represent completely different notions of association. Assessing agreement between variables assumes that the variables measure the same construct, while correlation of variables can be assessed for variables that measure completely different constructs. This conceptual difference requires the use of different statistical methods, and when assessing agreement or correlation, the statistical method may vary depending on the distribution of the data and the interest of the investigator. For example, the Pearson correlation, a popular measure of correlation between continuous variables, is only informative when applied to variables that have linear relationships; it may be non-informative or even misleading when applied to variables that are not linearly related. Likewise, the intraclass correlation, a popular measure of agreement between continuous variables, may not provide sufficient information for investigators if the nature of poor agreement is of interest. This report reviews the concepts of agreement and correlation and discusses differences in the application of several commonly used measures.
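The distinction between correlation and agreement can be illustrated with a small numerical sketch. The data below are hypothetical, and the function names are illustrative only. The agreement measure is a single-measure, absolute-agreement ICC computed from two-way ANOVA mean squares, which is one common ICC form and is not necessarily the form the authors discuss.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])

def icc_absolute_agreement(x, y):
    """Single-measure, absolute-agreement ICC from a two-way ANOVA layout.

    Rows are subjects; the two columns are the two measurements.
    """
    data = np.column_stack([x, y]).astype(float)
    n, k = data.shape
    grand = data.mean()
    ss_rows = k * np.sum((data.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((data.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((data - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return float((ms_rows - ms_err) /
                 (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n))

# Hypothetical data: method B reads exactly 5 units higher than method A.
method_a = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
method_b = method_a + 5.0

r = pearson_r(method_a, method_b)
icc = icc_absolute_agreement(method_a, method_b)
print(f"Pearson r = {r:.3f}, ICC = {icc:.3f}")
```

With a constant offset between the two methods, the Pearson correlation is perfect while the absolute-agreement ICC is well below 1, because the systematic difference counts against agreement.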

129 citations


Journal ArticleDOI
TL;DR: The study documents the reproducibility, positive criterion validity, and convergent validity of the 6MWT against established clinical assessments and reaffirms its value as a pivotal outcome measure in SMA clinical trials.
Abstract: Introduction The Six-Minute Walk Test (6MWT) was adopted as a clinical outcome measure for ambulatory spinal muscular atrophy (SMA). However, a systematic review of measurement properties reported significant variation among chronic pediatric conditions. Our purpose was to assess the reliability/validity of the 6MWT in SMA. Methods Thirty participants performed assessments, including the 6MWT, strength, and function. Reproducibility was evaluated by intraclass correlation coefficients. Criterion/convergent validity were determined using Pearson correlation coefficients. Results Test-retest reliability was excellent. The 6MWT was associated positively with peak oxygen uptake, Hammersmith Functional Motor Scale Expanded (HFMSE), lower extremity manual muscle testing, knee flexion hand-held dynamometry, and inversely with 10-m walk/run. The 6MWT discriminates between disease severity, unlike the HFMSE. Conclusions This study documents measurement properties of reproducibility, positive criterion validity, and convergent validity with established clinical assessments and reaffirms the value of the 6MWT as a pivotal outcome measure in SMA clinical trials. Muscle Nerve 54: 836-842, 2016.

92 citations


Journal ArticleDOI
TL;DR: The FJS appears to be a promising tool for evaluation of small differences in knee performance in groups of patients with good clinical results after TKA and showed good construct validity and test-retest reliability.
Abstract: Background and purpose: When evaluating the outcome after total knee arthroplasty (TKA), increasing emphasis has been put on patient satisfaction and ability to perform activities of daily living. To address this, the forgotten joint score (FJS) for assessment of knee awareness has been developed. We investigated the validity and reliability of the FJS. Patients and methods: A Danish version of the FJS questionnaire was created according to internationally accepted standards. 360 participants who underwent primary TKA were invited to participate in the study. Of these, 315 were included in a validity study and 150 in a reliability study. Correlation between the Oxford knee score (OKS) and the FJS was examined and test-retest evaluation was performed. A ceiling effect was defined as participants reaching a score within 15% of the maximum achievable score. Results: The validity study revealed a strong correlation between the FJS and the OKS (intraclass correlation coefficient (ICC) = 0.81, 95% CI: 0.77–0.8...

88 citations


Journal ArticleDOI
TL;DR: Evidence is provided that the VES can be proposed as a promising tool to measure the vitiligo extent in clinical trials and in daily practice and allows us to monitor accurately and easily the affected body surface area in a standardized way.

87 citations


Journal ArticleDOI
TL;DR: The TUG test is a quick, easy-to-use, valid, and reliable tool to evaluate objective functional impairment in patients with lumbar degenerative disc disease; in the clinical setting, patients with a TUG test time of over 12 seconds can be considered to have functional impairment.
Abstract: Background There are few objective measures of functional impairment to support clinical decision making in lumbar degenerative disc disease (DDD). Objective We present the validation (and reliability measures) of the Timed Up and Go (TUG) test. Methods In a prospective, 2-center study, 253 consecutive patients were assessed using the TUG test. A representative cohort of 110 volunteers served as control subjects. The TUG test values were assessed for validity and reliability. Results The TUG test had excellent intra- (intraclass correlation coefficient: 0.97) and interrater reliability (intraclass correlation coefficient: 0.99), with a standard error of measurement of 0.21 and 0.23 seconds, respectively. The validity of the TUG test was demonstrated by a good correlation with the Visual Analog Scale (VAS) back (Pearson's correlation coefficient [PCC]: 0.25) and VAS (PCC: 0.29) leg pain, functional impairment (Roland-Morris Disability Index [PCC: 0.38] and Oswestry Disability Index [PCC: 0.34]), as well as with health-related quality of life (Short Form-12 Mental Component Summary score [PCC: -0.25], Short Form-12 Physical Component Summary score [PCC: -0.32], and EQ-5D [PCC: -0.28]). The upper limit of "normal" was 11.52 seconds. Mild (lower than the 33rd percentile), moderate (33rd to 66th percentiles), and severe objective functional impairment (higher than the 66th percentile) as determined by the TUG test was 18.4 seconds, respectively. Conclusion The TUG test is a quick, easy-to-use, valid, and reliable tool to evaluate objective functional impairment in patients with lumbar degenerative disc disease. In the clinical setting, patients scoring a TUG test time of over 12 seconds can be considered to have functional impairment. 
Abbreviations: BMI, body mass index; DDD, degenerative disc disease; HRQOL, health-related quality of life; ICC, intraclass correlation; LDH, lumbar disc herniation; LSS, lumbar spinal stenosis; ODI, Oswestry Disability Index; OFI, objective functional impairment; PCC, Pearson's correlation coefficient; PCS, Physical Component Summary; RMDI, Roland-Morris Disability Index; SF, Short Form; VAS, visual analog scale.

75 citations


Journal ArticleDOI
TL;DR: All balance tests are reliable, valid, and able to identify fall status in older people living in the community, and the choice of which test to use will depend on the level of balance impairment, purpose, and time availability.

73 citations


Journal ArticleDOI
TL;DR: The T-EAT-10 is a reliable and valid symptom-specific outcome tool for dysphagia in adult Turkish patients and can be used in clinical practice and research.
Abstract: The purpose of this study was to test the reliability and validity of the Turkish Eating Assessment Tool (T-EAT-10) among patients with swallowing disorders. One hundred and five patients completed the T-EAT-10 and Functional Oral Intake Scale (FOIS). The internal consistency, test–retest reliability, and criterion validity of T-EAT-10 were investigated. The internal consistency was assessed using Cronbach’s alpha. Intraclass correlation coefficient (ICC) value with 95 % confidence intervals was calculated for test–retest reliability. The criterion validity of the T-EAT-10 was determined by assessing the correlation between T-EAT-10 and FOIS. All the patients in the study completed the T-EAT-10 without assistance. The mean time to complete the instrument was 1.8 ± 0.9 min. The internal consistency of the T-EAT-10 was found to be high with 0.90 Cronbach’s alpha for test and 0.91 Cronbach’s alpha for retest reproducibility. No difference between the test and retest scores of the T-EAT-10 was found (p = 0.14). A negative, moderate correlation between T-EAT-10 and FOIS was detected (r = −0.365, p < 0.001). The T-EAT-10 is a reliable and valid symptom-specific outcome tool for dysphagia in adult Turkish patients. It can be used in clinical practice and research.
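As a rough illustration of the internal-consistency statistic mentioned above, Cronbach's alpha can be computed from a respondents-by-items score matrix using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The scores below are hypothetical and are not T-EAT-10 data; the function name is illustrative.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed score
    return float(k / (k - 1) * (1.0 - item_vars.sum() / total_var))

# Hypothetical responses: 5 respondents, 3 items.
responses = np.array([
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
])

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Items that move together across respondents push the total-score variance well above the sum of item variances, which drives alpha toward 1.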

72 citations


Journal ArticleDOI
TL;DR: Extant results suggest that a basic percentile bootstrap will perform reasonably well for comparing dependent robust correlation coefficients; this paper reports simulation results indicating the extent to which this is true when using Spearman's rho, a Winsorized correlation, or a skipped correlation.
Abstract: Let r1 and r2 be two dependent estimates of Pearson's correlation. There is a substantial literature on testing H0 : ρ1 = ρ2 , the hypothesis that the population correlation coefficients are equal. However, it is well known that Pearson's correlation is not robust. Even a single outlier can have a substantial impact on Pearson's correlation, resulting in a misleading understanding about the strength of the association among the bulk of the points. A way of mitigating this concern is to use a correlation coefficient that guards against outliers, many of which have been proposed. But apparently there are no results on how to compare dependent robust correlation coefficients when there is heteroscedasticity. Extant results suggest that a basic percentile bootstrap will perform reasonably well. This paper reports simulation results indicating the extent to which this is true when using Spearman's rho, a Winsorized correlation or a skipped correlation.
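A basic percentile bootstrap for two dependent correlations can be sketched as follows. This is a minimal illustration assuming the overlapping case (both correlations share the variable `y`), Spearman's rho as the robust measure, and simulated data; it is not the paper's exact procedure, and all function and variable names are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)                 # highest rank in each tie group
    mean_rank = last_rank - (counts - 1) / 2.0    # average rank in each tie group
    return mean_rank[inverse]

def spearman_rho(a, b):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])

def bootstrap_ci_rho_difference(x1, x2, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for rho(x1, y) - rho(x2, y).

    Whole rows are resampled together so the dependence between the
    two correlation estimates is preserved in every bootstrap sample.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        diffs[b] = spearman_rho(x1[idx], y[idx]) - spearman_rho(x2[idx], y[idx])
    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Simulated data: x1 is strongly related to y, x2 is unrelated to y.
rng = np.random.default_rng(42)
y = rng.normal(size=80)
x1 = y + 0.3 * rng.normal(size=80)
x2 = rng.normal(size=80)

ci_low, ci_high = bootstrap_ci_rho_difference(x1, x2, y)
print(f"95% CI for the difference: ({ci_low:.2f}, {ci_high:.2f})")
```

If the interval excludes zero, the hypothesis of equal correlations would be rejected at the chosen level under this approach.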

67 citations


Journal ArticleDOI
TL;DR: The Chinese version of the QoR-15 (QoR-15C), developed according to the methods adopted by the International Quality of Life Assessment project, shows satisfactory psychometric properties and may be a more valid, reliable, and easy-to-use scale than the PQRS.
Abstract: The Quality of Recovery-15 scale (QoR-15) is an easy-to-use score for assessing the quality of post-operative recovery. The primary aim of the present study was to translate the QoR-15 into the Chinese language and validate it. The secondary aim was to compare it with the Post-operative Quality Recovery Scale (PQRS). The Chinese version of the QoR-15 (QoR-15C) was developed according to the methods adopted by the International Quality of Life Assessment project. A total of 470 patients undergoing surgery and general anesthesia completed the QoR-15C and the PQRS before or on the day of surgery, and on post-operative days (POD)-1, -3, and -30. To validate the QoR-15C, we assessed validity, reliability, responsiveness, and clinical feasibility and compared them with those of the PQRS. Convergent validity showed the Pearson’s r coefficient of the QoR-15C with visual analog scale and the PQRS to be 0.63 and 0.10, respectively. Predictive validity showed it had significant correlations with duration of anesthesia, duration of operation, time in post-anesthesia care unit, time in intensive care unit, and length of hospital stay. Discriminant validity showed it differed between patients who had a good or poor recovery, and decreased with increasing grades (indicating difficulty and complexity) of surgery. The intraclass correlation coefficient, split-half coefficient, and Cronbach’s α were 0.99, 0.70, and 0.76, respectively. The standardized effect size ranged from 0.85 to 1.20, and the standardized response mean ranged from 0.93 to 1.27. Compared with the QoR-15C, the PQRS may have inferior convergent validity (0.36 vs. 0.63), and split-half reliability (0.63 vs. 0.70). Furthermore, the PQRS took longer to complete: 4.20 (standard deviation 0.79) versus 1.57 (standard deviation 0.65) min. Similar to the original English version, the QoR-15C reveals satisfactory psychometric properties. 
Furthermore, it may be a more valid, reliable, and easy-to-use scale than the PQRS.

66 citations


Journal ArticleDOI
TL;DR: The six-minute step test is an exercise test that is easy to conduct, more tolerable than a graded exercise test, requires less equipment and space, and permits better monitoring of the participants.
Abstract: OBJECTIVE To determine the 6-minute step test's (6MST) reliability and validity and to establish reference performance values of this test. DESIGN Prospective observational cross-sectional study. SETTING Spirometry and Respiratory Physiotherapy Laboratory, Federal University of Sao Carlos (institutional). PARTICIPANTS Ninety-one individuals [42 men and 49 women, mean age = 39 years (SD, 17 years)] without any diagnosed diseases and with normal exercise capacity [6-minute walk test (6MWT) >75% of the predicted normal]. INDEPENDENT VARIABLES Participants underwent two 6MST on 1 day and two 6MWT on another day in randomized order. Furthermore, age, gender, height, weight, lower limb length, abdominal circumference, percentage of body fat, and fat-free mass were obtained. MAIN OUTCOME MEASURES Test-retest reliability was assessed by comparing the findings of the two 6MST using the intraclass correlation coefficient (ICC) and Bland-Altman plot. Validity was assessed by comparing outcomes of the 6MST to outcomes of 6MWT using the Pearson correlation coefficient. A multiple regression analysis was conducted using the stepwise method to develop an equation to predict reference values. RESULTS The performance (mean steps ± SD) in the first and second test was 149 ± 34 and 149 ± 36 steps, respectively, which was correlated to distance (in meters) in 6MWT (r = 0.72; P < 0.05). Six-minute step test performance was reliable (ICC = 0.9; 95% confidence interval: 0.85-0.93). The equation to predict reference values for the first 6MST was significant (P < 0.001 and R = 0.48): Performance (steps) = 174 − 1.05 × Age (years) for women and Performance (steps) = 209 − 1.05 × Age (years) for men. CONCLUSIONS Six-minute step test is a reliable and valid test. Moreover, the number of steps may be predicted by demographic and anthropometric variables with moderate strength of prediction.
CLINICAL RELEVANCE The six-minute step test is an exercise test that is easy to conduct, more tolerable than a graded exercise test, requires less equipment and space, and permits better monitoring of the participants. The assessment of the reliability, validity, and reference values will provide better interpretability for clinicians who use it, especially in primary care.
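The reference equation in the Results can be written as a small function. This sketch assumes the age term is subtracted from the sex-specific intercept (performance declining with age), which is an interpretation of the equation as printed in this listing; the function name and the `sex` labels are hypothetical.

```python
def predicted_6mst_steps(age_years, sex):
    """Predicted 6MST performance (steps) from age and sex.

    Assumes the form intercept - 1.05 * age, with intercepts of 174
    (women) and 209 (men) taken from the abstract's Results section.
    """
    intercepts = {"female": 174.0, "male": 209.0}
    if sex not in intercepts:
        raise ValueError("sex must be 'female' or 'male'")
    return intercepts[sex] - 1.05 * age_years

# Example: predicted reference values for a 40-year-old woman and man.
steps_woman_40 = predicted_6mst_steps(40, "female")
steps_man_40 = predicted_6mst_steps(40, "male")
print(steps_woman_40, steps_man_40)
```

Given the moderate strength of prediction reported, such values are best read as rough reference points rather than precise expectations for an individual.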

Journal ArticleDOI
TL;DR: Analytical derivations and numerical examinations are presented to assess the bias and mean square error of the alternative estimators and suggest that more advantageous indices can be recommended over ICC(2) for their theoretical implication and computational ease.
Abstract: The intraclass correlation coefficient (ICC)(2) index from a one-way random effects model is widely used to describe the reliability of mean ratings in behavioral, educational, and psychological research. Despite its apparent utility, the essential property of ICC(2) as a point estimator of the average score intraclass correlation coefficient is seldom mentioned. This article considers several potential measures and compares their performance with ICC(2). Analytical derivations and numerical examinations are presented to assess the bias and mean square error of the alternative estimators. The results suggest that more advantageous indices can be recommended over ICC(2) for their theoretical implication and computational ease.
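For orientation, the classical one-way random-effects estimators can be sketched from the between-target and within-target mean squares: ICC(1) = (MSB − MSW) / (MSB + (k − 1) MSW) for a single rating and ICC(2) = (MSB − MSW) / MSB for the mean of k ratings. The sketch assumes a balanced design and uses hypothetical ratings; the alternative estimators studied in the article are not reproduced here.

```python
import numpy as np

def one_way_icc(ratings):
    """ICC(1) and ICC(2) from a balanced (targets x ratings) matrix."""
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Between-target and within-target mean squares of the one-way ANOVA.
    msb = k * np.sum((row_means - grand) ** 2) / (n - 1)
    msw = np.sum((ratings - row_means[:, None]) ** 2) / (n * (k - 1))
    icc1 = (msb - msw) / (msb + (k - 1) * msw)
    icc2 = (msb - msw) / msb
    return float(icc1), float(icc2)

# Hypothetical ratings: 5 targets, each rated 3 times.
ratings = np.array([
    [7, 8, 7],
    [4, 5, 5],
    [9, 9, 8],
    [3, 4, 2],
    [6, 6, 7],
])

icc1, icc2 = one_way_icc(ratings)
print(f"ICC(1) = {icc1:.3f}, ICC(2) = {icc2:.3f}")
```

The two values are linked by the Spearman-Brown relation ICC(2) = k·ICC(1) / (1 + (k − 1)·ICC(1)), so the mean of several ratings is more reliable than a single rating whenever ICC(1) is positive.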

Journal ArticleDOI
TL;DR: The Chinese version of the Physical Activity Questionnaire for Older Children (PAQ-C), which has been identified as a potentially valid instrument to assess moderate-to-vigorous physical activity (MVPA) in children among diverse racial groups, is validated.

Journal ArticleDOI
TL;DR: A single-page physical activity questionnaire for estimating energy expenditure in Greek adults was developed, and its reliability and validity were assessed using Cronbach's alpha, intraclass correlation coefficients (ICC), and Bland-Altman analysis.

Journal ArticleDOI
TL;DR: The iPadVAS provides a convenient, user-friendly, and efficient way of collecting data from participants in measuring their current pain levels and has potential use in documentation management and may encourage participatory healthcare.
Abstract: Background: New technology for clinical data collection is rapidly evolving and may be useful for both researchers and clinicians; however, this new technology has not been tested for accuracy, reliability, or validity. Objective: This study aims to test the accuracy of visual analog scale (VAS) for pain on a newly designed application on the iPad (iPadVAS) and measure the reliability and validity of iPadVAS compared to a paper copy (paperVAS). Methods: Accuracy was determined by physically measuring an iPad scale on screen and comparing it to the results from the program, with a researcher collecting 101 data points. A total of 22 healthy community dwelling older adults were then recruited to test reliability and validity. Each participant completed 8 VAS (4 using each tool) in a randomized order. Reliability was measured using intraclass correlation coefficient (ICC) and validity measured using Bland-Altman graphs and correlations. Results: Of the measurements for accuracy, 64 results were identical, 2 results were manually measured as being 1 mm higher than the program, and 35 as 1 mm lower. Reliability for the iPadVAS was excellent with individual ICC 0.90 (95% CI 0.82-0.95) and averaged ICC 0.97 (95% CI 0.95-1.0) observed. Linear regression demonstrated a strong relationship with a small negative bias towards the iPad (−2.6, SD 5.0) with limits of agreement from −12.4 to 7.1. Conclusions: The iPadVAS provides a convenient, user-friendly, and efficient way of collecting data from participants in measuring their current pain levels. It has potential use in documentation management and may encourage participatory healthcare. Trial Registration: Australia New Zealand Clinical Trials Registry (ANZCTR): 367297; https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=367297
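The limits of agreement quoted in the Results follow the usual Bland-Altman construction, bias ± 1.96 × SD of the paired differences. The sketch below assumes that construction was used (the abstract does not state it explicitly); the paired readings are hypothetical and the function names are illustrative.

```python
import numpy as np

def limits_from_summary(bias, sd_diff, z=1.96):
    """95% limits of agreement from the mean and SD of paired differences."""
    return bias - z * sd_diff, bias + z * sd_diff

def bland_altman(method_a, method_b):
    """Bias and limits of agreement for paired measurements (a minus b)."""
    diffs = np.asarray(method_a, dtype=float) - np.asarray(method_b, dtype=float)
    bias = float(diffs.mean())
    sd_diff = float(diffs.std(ddof=1))
    lower, upper = limits_from_summary(bias, sd_diff)
    return bias, lower, upper

# Rounded summary values reported in the abstract: bias -2.6, SD 5.0.
reported_lower, reported_upper = limits_from_summary(-2.6, 5.0)
print(f"{reported_lower:.1f} to {reported_upper:.1f}")

# Hypothetical paired readings (mm) from two tools.
tablet = [32, 45, 51, 60, 28, 39]
paper = [35, 44, 56, 63, 30, 45]
bias, lower, upper = bland_altman(tablet, paper)
```

Using the rounded summary values gives limits of about −12.4 to 7.2, in line with the reported −12.4 to 7.1 up to rounding of the inputs.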

Journal ArticleDOI
TL;DR: Clinicians should be cautious when ImPACT is used as a criterion for medical clearance to return to play after concussion, and the Pearson r correlation coefficient and average measures intraclass correlation coefficient may be inappropriately utilized to examine the reliability of ImPACT scores.
Abstract: OBJECTIVE: To review the literature on the reliability of the Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT). DESIGN: Systematic review of the relevant literature in PubMed, CINAHL, and PSYCHINFO. Studies were evaluated using the STROBE instrument and custom developed items. RESULTS: Search yielded 5 943 articles. Ten studies met the inclusion criteria and were reviewed. With the exception of processing speed, all composite scores consistently exhibited poor to moderate reliability (ie, intraclass correlation coefficient CONCLUSIONS: The Pearson r correlation coefficient and average measures intraclass correlation coefficient may be inappropriately utilized to examine the reliability of ImPACT scores. Given the poor to moderate reliability of most ImPACT scores, clinicians should be cautious when ImPACT is used as a criterion for medical clearance to return to play after concussion. Because of its widespread use in concussion-related clinical research, researchers must exercise due diligence when utilizing ImPACT to evaluate outcomes after concussion or to validate other outcome measures.

Journal ArticleDOI
TL;DR: Neuro‐QoL enables brief yet precise assessment and both PD‐specific and cross‐disease comparisons; results of its clinical validation in a sample of PD patients are presented.
Abstract: Introduction Neuro-QoL is a multidimensional patient-reported outcome measurement system assessing aspects of physical, mental, and social health identified by neurology patients and caregivers as important. One of the first neurology-specific patient-reported outcome measure systems created using modern test development methods, Neuro-Qol enables brief, yet precise, assessment and the ability to conduct both PD-specific and cross-disease comparisons. We present results of Neuro-QoL clinical validation using a sample of PD patients. Methods A total of 120 PD patients recruited from academic medical centers were assessed at baseline, 1 week, and 6 months. Assessments included Neuro-QoL and general and PD-specific validity measures. Results Participants were 62% male and 95% white (average age = 66); H & Y stages were 1 (16%), 2 (61%), 3 (18%), and 4 (5%). Internal consistency and test-retest reliability of Neuro-QoL ranged from Cronbach's alphas = 0.81 to 0.94 with intraclass correlation coefficients = 0.66 to 0.80. Pearson's correlations between Neuro-QoL and legacy measures were generally moderate and in expected directions. UPDRS Part 2 was moderately correlated with Neuro-QoL Upper Extremity and Mobility, respectively (r's = −0.44; −0.59). Parkinson's Disease Questionnaire-39 and Neuro-QoL measures of similar constructs showed strong-to-moderate correlations (r's = 0.70–0.44). Neuro-QoL measures of fatigue, mobility, positive emotion, and emotional/behavioral control showed responsiveness to self-reported change. Conclusions Neuro-QoL is valid for use in PD clinical research. Reliability for all but two measures is sufficient for group comparisons, with some evidence supporting responsiveness to change. Neuro-QoL possesses characteristics, such as brevity, flexibility in administration, and suitability, for cross-disease comparisons that may be advantageous to users in a variety of settings. © 2016 Movement Disorder Society

Journal ArticleDOI
TL;DR: The Iranian HPLP-II scale is an appropriate tool for assessing HPBs of the Iranian elderly; its validity was examined through content and construct validity, and its reliability through test-retest analysis and Cronbach's alpha.
Abstract: Background: With increasing age, the prevalence of chronic diseases increases. Since health-promoting behaviors (HPB) are considered a basic way of preventing diseases, especially chronic diseases, it is important to assess HPB. This study examines the validity and reliability of the Health Promoting Lifestyle Profile II (HPLP-II). Methods: This is a cross-sectional study conducted on 502 elderly individuals aged 60 and over in Tehran, Iran. In order to determine the validity, content and construct validity were used. The content validity index (CVI) was used to assess the content validity, and to assess construct validity, confirmatory factor analysis (CFA) and item-total correlations were employed. For reliability, test-retest analysis was used, and the internal consistency of the HPLP-II was confirmed by Cronbach's alpha. For data analysis, SPSS-18 and Amos-7 software was used. Results: The mean age of the subjects was 66.3 ± 5.3 years. The CVI for the revised HPLP-II and all its subscales was higher than 0.82. The CFA confirmed a six-factor model aligned with the original HPLP-II. Pearson correlation coefficients between the revised HPLP-II and its items were in the range of 0.27-0.65. Cronbach's alpha of the revised HPLP-II was 0.78, and for its subscales it was in the range of 0.67-0.84. The intraclass correlation coefficient was 0.79 (95% confidence interval: 0.59-0.86, P Conclusions: The Iranian HPLP-II scale is an appropriate tool for assessing HPBs of the Iranian elderly.

Journal ArticleDOI
TL;DR: The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991; while the method may have sources of error, scoring reliability is not one of them.
Abstract: Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method.
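The difference between the "absolute" and "consistency" coefficients can be sketched with the average-measures forms of the two-way ICC: consistency = (MSR − MSE) / MSR and absolute agreement = (MSR − MSE) / (MSR + (MSC − MSE) / n), where MSR, MSC, and MSE are the row (subject), column (rater), and residual mean squares. The scores below are hypothetical, the function name is illustrative, and the study's own SPSS computation is not reproduced here.

```python
import numpy as np

def two_way_average_icc(scores):
    """Average-measures ICCs (consistency, absolute agreement).

    `scores` is a (subjects x raters) matrix with no missing values.
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    ss_rows = k * np.sum((scores.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((scores.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((scores - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    consistency = (msr - mse) / msr
    agreement = (msr - mse) / (msr + (msc - mse) / n)
    return float(consistency), float(agreement)

# Hypothetical scores: the third rater is always 2 points higher.
scores = np.array([
    [1, 1, 3],
    [3, 3, 5],
    [5, 5, 7],
    [7, 7, 9],
])

icc_consistency, icc_agreement = two_way_average_icc(scores)
print(f"consistency = {icc_consistency:.4f}, agreement = {icc_agreement:.4f}")
```

A rater who is systematically higher leaves the consistency coefficient untouched but lowers the absolute-agreement coefficient, so near-identical values for the two, as reported above, suggest little systematic offset between observers.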

Journal ArticleDOI
TL;DR: Examination of concurrent validity and 3-day test-retest reliability of Balance Tracking System (BTrackS) in community-dwelling older adults indicated that BTrackS has the potential to identify meaningful changes in balance that may warrant intervention.
Abstract: Background and purpose Falls are the leading cause of disability, injury, hospital admission, and injury-related death among older adults. Balance limitations have consistently been identified as predictors of falls and increased fall risk. Field measures of balance are limited by issues of subjectivity, ceiling effects, and low sensitivity to change. The gold standard for measuring balance is the force plate; however, its field use is untenable due to high cost and lack of portability. Thus, a critical need is observed for valid objective field measures of balance to accurately assess balance and identify limitations over time. The purpose of this study was to examine the concurrent validity and 3-day test-retest reliability of Balance Tracking System (BTrackS) in community-dwelling older adults. Minimal detectable change values were also calculated to reflect changes in balance beyond measurement error. Methods Postural sway data were collected from community-dwelling older adults (N = 49, mean [SD] age = 71.3 [7.3] years) with a force plate and BTrackS in multitrial eyes open (EO) and eyes closed (EC) static balance conditions. Force sensors transmitted BTrackS data via a USB to a computer running custom software. Three approaches to concurrent validity were taken including calculation of Pearson product moment correlation coefficients, repeated-measures ANOVAs, and Bland-Altman plots. Three-day test-retest reliability of BTrackS was examined in a second sample of 47 community-dwelling older adults (mean [SD] age = 75.8 [7.7] years) using intraclass correlation coefficients and MDC values at 95% CI (MDC95) were calculated. Results BTrackS demonstrated good validity using Pearson product moment correlations (r > 0.90). Repeated-measures ANOVA and Bland-Altman plots indicated some BTrackS bias with center of pressure (COP) values higher than FP COP values in the EO (mean [SD] bias = 4.0 [6.8]) and EC (mean [SD] bias = 9.6 [12.3]) conditions. 
Test-retest reliability using intraclass correlation coefficients (ICC2,1) was excellent (0.83), and the calculated MDC95 values for EO (9.6 cm) and EC (19.4 cm) suggested that postural sway changes of these amounts are meaningful. Discussion BTrackS showed some bias with values exceeding force plate values in both EO and EC conditions. Excellent test-retest reliability and resulting MDC95 values indicated that BTrackS has the potential to identify meaningful changes in balance that may warrant intervention. Conclusion BTrackS is an objective measure of balance that can be used to monitor balance in community-dwelling older adults over time. It can reliably identify changes that may require further attention (eg, fall-prevention strategies, declines in physical function) and shows promise for assessing intervention efficacy in this growing segment of the population.
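Minimal detectable change values are commonly derived from the ICC through the standard error of measurement: SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The abstract does not state which formula was used, so the sketch below shows this common convention only, with a hypothetical between-subject SD; the function names are illustrative.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a test-retest ICC."""
    return sd * math.sqrt(1.0 - icc)

def minimal_detectable_change_95(sd, icc):
    """MDC at the 95% confidence level for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)

# Hypothetical inputs: SD of 10 units and an ICC of 0.75.
sem = standard_error_of_measurement(10.0, 0.75)
mdc95 = minimal_detectable_change_95(10.0, 0.75)
print(f"SEM = {sem:.2f}, MDC95 = {mdc95:.2f}")
```

The sqrt(2) factor reflects that a change score combines the measurement error of two sessions, so a higher ICC shrinks both the SEM and the change needed to exceed measurement error.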

Journal ArticleDOI
TL;DR: The App is a reliable tool for use in acute orthopedic care and offers better intra‐ and interobserver correlation scores for a single active measurement than the standard goniometer.
Abstract: The standard goniometer (SG) is the most commonly used tool to assess range of motion (ROM) in patients with knee restrictions. Several medical applications have been designed to measure joint ROM. Little data are available on their reliability in the postoperative clinical setting. The purpose of this study was to assess whether a smartphone accelerometer-based knee goniometer application (App) is as reliable as the SG to measure knee ROM in clinical settings. A total of 60 subjects were included in this cross-sectional reliability trial. Overall, 20 healthy subjects (HS) and 20 acute postoperative patients (PO) underwent three active and three passive measurements in knee flexion and extension, using the SG and the smartphone knee goniometer App. To determine the fatigability of postoperative patients, a third group of 20 patients underwent a single active measurement in knee flexion and extension (PO1). Measurements were performed by three clinicians. For intraobserver reliability, mean intraclass correlation coefficient (ICC) values were higher for the App in all circumstances (overall mean SG 0.85, App 0.91), indicating an excellent correlation. For interobserver reliability, the highest ICC scores were in the PO1 group, with the App more consistent than the SG in all movements. Interobserver reliability was lower in the PO group versus PO1. Interobserver reliability was better for active ROM than for passive measurements. The overall concordance coefficient was very good to excellent with active measurements (range, 0.60-0.97). In conclusion, the App is a reliable tool for use in acute orthopedic care and offers better intra- and interobserver correlation scores for a single active measurement.

Journal ArticleDOI
TL;DR: Findings indicate good test–retest reliability for the JTTHF total score to measure hand function in typically developing children aged 6 to 10 years.
Abstract: Aims: The aim of this pilot study was to evaluate reproducibility of the Jebsen Taylor Test of Hand Function (JTTHF) in children. Methods: Eighty-seven typically developing children 5 to 10 years old were included from five Outside School Hours Care centers in the Greater Brisbane Region, Australia. Hand function was assessed on two occasions with a modified JTTHF, then reproducibility was assessed using Intraclass Correlation Coefficient (ICC [3,1]) and the Standard Error of Measurement (SEM). Results: Total scores for male and female children were not significantly different. Five-year-old children were significantly different to all other age groups and were excluded from further analysis. Results for 71 children, 6 to 10 years old were analyzed (mean age 8.31 years (SD 1.32); 33 males). Test–retest reliability for total scores on the dominant and nondominant hands were ICC 0.74 (95% CI 0.61, 0.83) and ICC 0.72 (95% CI 0.59, 0.82), respectively. ‘Writing’ and ‘Simulated Feeding’ subtests demons...

Journal ArticleDOI
TL;DR: To investigate the validity of the internet‐based version of the Children's Hand‐use Experience Questionnaire (CHEQ) by testing the new four‐category rating scale, internal structure, and test–retest reliability.
Abstract: Aim: To investigate the validity of the internet-based version of the Children's Hand-use Experience Questionnaire (CHEQ) by testing the new four-category rating scale, internal structure, and test-retest reliability. Method: Data were collected for 242 children with unilateral cerebral palsy (CP) (137 males and 105 females; mean age 9y 10mo, SD 3y 5mo, range 6-18y). Twenty children from the study sample (mean age 11y 8mo, SD 3y 10mo) participated in a retest within 7 to 14 days. Validity was tested by Rasch analysis based on a rating scale model and test-retest reliability by Kappa analysis and intraclass correlation coefficient (ICC). Results: The four-category rating scale was within recommended criteria for rating scale structure. One item was removed because of misfit. CHEQ showed good scale structure according to the criteria. The effective operational range was >90% for two of the CHEQ scales. Test-retest reliability for the three CHEQ scales was: grasp efficacy, ICC=0.91; time taken, ICC=0.88; and feeling bothered, ICC=0.91. Interpretation: The internet-based CHEQ with a four-category rating scale is valid and reliable for use in children with unilateral CP. Further studies are needed to investigate the validity of the internet-based version of CHEQ for children with upper limb reduction deficiency or obstetric brachial plexus palsy and the validity of the recommended improvements to the current version.

Journal ArticleDOI
TL;DR: The Arabic version of the FSS demonstrated acceptable psychometric properties and was able to differentiate between patients with SLE or MS, and healthy subjects.
Abstract: Objectives: To develop and test the psychometric properties of an Arabic version of Fatigue Severity Scale (FSS-Ar) that can be used to measure fatigue in Arabic patients with disorders where fatigue is a major symptom. Methods: Forward and backward translations of FSS were undertaken to develop an Arabic version. The validity and reliability of the FSS-Ar was then tested on 28 patients with systemic lupus erythematosus (SLE), 24 patients with multiple sclerosis (MS), and 31 healthy subjects. Exploratory factor analysis and hypothesis testing methods were used to examine construct validity. The correlation between FSS-Ar and the vitality domain of the RAND 36-Item Health was examined to test construct validity. The study was conducted at the King Khalid University Hospital, Riyadh, Kingdom of Saudi Arabia between February and June 2012. Results: Using a score of ≥4.05 to define fatigue, 39 of 52 (75%) participants were fatigued compared with 10 out of 31 (32%) healthy participants. The correlation between the FSS-Ar and the vitality domain of the RAND-36 was acceptable (r = -0.46). Factor analysis showed that items of the FSS-Ar measured one underlying construct, namely, fatigue. Test-retest reliability and internal consistency of the FSS-Ar was acceptable (intraclass correlation coefficient model 2,1 = 0.80; Cronbach’s alpha = 0.84). Conclusion: The Arabic version of the FSS demonstrated acceptable psychometric properties and was able to differentiate between patients with SLE or MS, and healthy subjects. Saudi Med J 2016; Vol. 37 (1): 73-78. doi: 10.15537/smj.2016.1.13055. How to cite this article: Al-Sobayel HI, Al-Hugail HA, AlSaif RM, Albawardi NM, Alnahdi AH, Daif AM, et al. Validation of an Arabic version of Fatigue Severity Scale. Saudi Med J 2016; 37: 73-8.

Journal ArticleDOI
TL;DR: The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process and is reliable in an outpatient setting and has demonstrated construct validity.
Abstract: Objective: We aimed to develop a measure of myasthenia gravis impairment using a previously developed framework and to evaluate reliability and validity, specifically face, content, and construct validity. Methods: The first draft of the Myasthenia Gravis Impairment Index (MGII) included examination items from available measures enriched with newly developed, patient-reported items, modified after patient input. International neuromuscular specialists evaluated face and content validity via an e-mail survey. Test–retest reliability was assessed in stable patients at a 3-week interval and interrater reliability was evaluated in the same day. Construct validity was assessed through correlations between the MGII and other measures and by comparing scores in different patient groups. Results: The first draft was assessed by 18 patients, and 72 specialists answered the survey. The second draft had 7 examination and 22 patient-reported items. Field testing included 200 patients, with 54 patients completing the reliability studies. Test–retest reliability of the total score was good (intraclass correlation coefficient 0.92; 95% confidence interval 0.79–0.94), as was interrater reliability of the examination component (intraclass correlation coefficient 0.81; 95% confidence interval 0.79–0.94). The MGII correlated well with comparison measures, with higher correlations with the MG–activities of daily living (r = 0.91) and MG-specific quality of life 15-item scale (r = 0.78). When assessing different patient groups, the scores followed expected patterns. Conclusions: The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process. It is reliable in an outpatient setting and has demonstrated construct validity. Responsiveness studies are under way.

Journal ArticleDOI
TL;DR: The PPAQ-C is reliable and moderately accurate for measuring physical activity in pregnant Chinese women; reliability was assessed with intraclass correlation coefficients and validity with Spearman correlation coefficients against accelerometer data.
Abstract: Objectives: The objectives of the present study were to translate the English version of the Pregnancy Physical Activity Questionnaire into Chinese (PPAQ-C) and to determine its reliability and validity for use by pregnant Chinese women. Methods: The study included 224 pregnant women during their first, second, or third trimesters of pregnancy who completed the PPAQ-C on their first visit and wore a uniaxial accelerometer (Lifecorder; Suzuken Co. Ltd) for 7 days. One week after the first visit, we collected the data from the uniaxial accelerometer records, and the women were asked to complete the PPAQ-C again. Results: We used intraclass correlation coefficients to determine the reliability of the PPAQ-C. The intraclass correlation coefficients were 0.77 for total activity (light and above), 0.76 for sedentary activity, 0.75 for light activity, 0.59 for moderate activity, and 0.28 for vigorous activity. The intraclass correlation coefficients were 0.74 for “household and caregiving”, 0.75 for “occupational” activities, and 0.34 for “sports/exercise”. Validity between the PPAQ-C and accelerometer data was determined by Spearman correlation coefficients. Although there were no significant correlations for moderate activity (r = 0.19, P > 0.05) or vigorous activity (r = 0.15, P > 0.05), there were significant correlations for total activity (light and above; r = 0.35, P < 0.01) and for light activity (r = 0.33, P < 0.01). Conclusions for Practice: The PPAQ-C is reliable and moderately accurate for measuring physical activity in pregnant Chinese women.

Journal ArticleDOI
TL;DR: This work investigated construct validity and intra‐ and interrater reliability of the Selective Control Assessment of the Lower Extremity (SCALE) in children with spastic cerebral palsy and found it to be a valid and reliable tool.
Abstract: AIM: Assessing impaired selective voluntary movement control in children with cerebral palsy (CP) has gained increasing interest. We investigated construct validity and intra- and interrater reliability of the Selective Control Assessment of the Lower Extremity (SCALE). METHOD: Thirty-nine children (21 males, 18 females) with spastic CP, mean age 12 years 6 months [range 6y 11mo-19y 9mo], Gross Motor Function Classification System (GMFCS) levels I to IV, participated. Differences in SCALE scores were determined on joint levels and between patients categorized according to their limb distribution and GMFCS levels. SCALE scores were correlated with the Fugl-Meyer Assessment, Manual Muscle Test, and Modified Ashworth Scale. To determine reliability, the SCALE was applied once and recorded on video. RESULTS: SCALE scores differed significantly between the less and more affected leg (p<0.001) and between most leg joints. Total SCALE scores differed significantly between GMFCS levels I and II. Correlations with Fugl-Meyer Assessment, Manual Muscle Test, and Modified Ashworth Scale were 0.88, 0.88, and -0.55 respectively. Intraclass correlation coefficients were all above 0.9, with the minimal detectable change below 2 points. INTERPRETATION: The SCALE appears to be a valid and reliable tool to assess selective voluntary movement control of the legs in children with spastic CP.
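
The abstract reports a minimal detectable change (MDC) without giving its formula. As a hedged illustration, the sketch below uses one widely used convention, MDC95 = 1.96 × sqrt(2) × SEM with SEM = SD × sqrt(1 − ICC); whether the authors used exactly this form is an assumption. The function name and the input values are hypothetical and are not taken from the study.

```python
from math import sqrt


def mdc95(sd, icc):
    """Minimal detectable change at the 95% confidence level.

    sd:  standard deviation of the scores in the sample
    icc: reliability coefficient (e.g. an intraclass correlation)
    """
    sem = sd * sqrt(1 - icc)        # standard error of measurement
    return 1.96 * sqrt(2) * sem     # sqrt(2): error enters both measurements


# Hypothetical inputs: SD of 2.0 points and an ICC of 0.95
print(round(mdc95(sd=2.0, icc=0.95), 2))
```

Under this convention the MDC shrinks as the ICC approaches 1, which is why high reliability coefficients and small MDC values tend to be reported together.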

Journal ArticleDOI
TL;DR: Clinicians can utilize the BESTest and its short versions to evaluate balance problems in community-dwelling older cancer survivors and apply the established MDC to assess the intervention outcomes.
Abstract: Background: Cancer is primarily a disease of older adults. About 77% of all cancers are diagnosed in persons aged 55 years and older. Cancer and its treatment can cause diverse sequelae impacting body systems underlying balance control. No study has examined the psychometric properties of balance assessment tools in older cancer survivors, presenting a significant challenge in the selection of outcome measures for clinicians treating this fast-growing population. Purpose: This study aimed to determine the reliability, validity, and minimal detectable change (MDC) of the Balance Evaluation System Test (BESTest), Mini-Balance Evaluation Systems Test (Mini-BESTest), and Brief-Balance Evaluation Systems Test (Brief-BESTest) in community-dwelling older cancer survivors. Methods: This study was a cross-sectional design. Twenty breast and 8 prostate cancer survivors participated [age (SD) = 68.4 (8.13) years]. The BESTest and Activity-specific Balance Confidence (ABC) Scale were administered during the first session. Scores of Mini-BESTest and Brief-BESTest were extracted on the basis of the scores of BESTest. The BESTest was repeated within 1 to 2 weeks by the same rater to determine the test-retest reliability. For the analysis of the interrater reliability, 21 participants were randomly selected to be evaluated by 2 raters. A primary rater administered the test. The 2 raters independently and concurrently scored the performance of the participants. Each rater recorded the ratings separately on the scoring sheet. No discussion among the raters was allowed throughout the testing. Intraclass correlation coefficients (ICCs), standard error of measurement, minimal detectable change (MDC), and Bland-Altman plots were

Journal ArticleDOI
TL;DR: The developed C-LKS questionnaire is reliable, valid, and responsive for the evaluation of Chinese-speaking patients with ACL injuries and would be an effective instrument.
Abstract: The Lysholm Knee Score (LKS) is widely used and is one of the most effective questionnaires employed to assess knee injuries. Although LKS has been translated into multiple languages, there is no Chinese version even though China has the largest population of patients with knee-joint injuries. The objective of our study was to develop the Chinese version of LKS (C-LKS) and assess its reliability, validity and responsiveness in Chinese patients with anterior cruciate ligament (ACL) injuries. Study participants were mainly recruited among patients with ACL injuries scheduled for arthroscopic ACL reconstruction at our hospital. First, we developed the C-LKS in a five-step translation and cross-cultural adaptation procedure. Next, we calculated the Cronbach’s alpha, intraclass correlation coefficient (ICC), Pearson’s correlation coefficient (r), effect size (ES), and standardized response mean (SRM) to evaluate the reliability, validity, and responsiveness of C-LKS respectively. Overall, 126 patients with ACL injuries successfully completed the questionnaires. Acceptable internal consistency (Cronbach’s alpha = 0.726) as well as excellent test-retest reliability (ICC = 0.935) was found for C-LKS. Good or moderate correlation (r = 0.514–0.837) was determined among C-LKS and International Knee Documentation Committee Subjective Knee Form (IKDC), Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), physical subscales of SF-36; C-LKS also had fair or moderate correlation (r = 0.207–0.462) with the other subscales of SF-36, indicating good validity of the C-LKS. In addition, good responsiveness was also observed in C-LKS (ES = 1.36, SRM = 1.26). We have shown that our developed C-LKS questionnaire is reliable, valid, and responsive for the evaluation of Chinese-speaking patients with ACL injuries and it would be an effective instrument.
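
Three of the statistics listed in this abstract (Cronbach's alpha, effect size, and standardized response mean) have short textbook definitions, sketched below. The sketch assumes the usual conventions: ES is mean change divided by the baseline SD, SRM is mean change divided by the SD of the change scores, and alpha is computed from item variances and the variance of the total score. The abstract does not spell out the authors' exact formulas, and all function names and example numbers here are hypothetical.

```python
from statistics import mean, stdev, variance


def effect_size(pre, post):
    """ES: mean change divided by the standard deviation at baseline."""
    change = [b - a for a, b in zip(pre, post)]
    return mean(change) / stdev(pre)


def standardized_response_mean(pre, post):
    """SRM: mean change divided by the standard deviation of the change."""
    change = [b - a for a, b in zip(pre, post)]
    return mean(change) / stdev(change)


def cronbach_alpha(responses):
    """Cronbach's alpha for internal consistency.

    responses: list of rows, one per respondent, one score per item.
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical questionnaire scores before and after treatment
pre = [50, 60, 70]
post = [62, 68, 84]
print(effect_size(pre, post))
print(standardized_response_mean(pre, post))

# Hypothetical item-level responses (3 respondents, 2 items)
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

ES and SRM share a numerator and differ only in the denominator, so they diverge when the change scores vary much more or much less than the baseline scores.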

Journal ArticleDOI
TL;DR: The cultural adaptation of the SF-36v2 was successful and has sufficient reliability and validity to measure a variety of musculoskeletal pathologies for Turkish-speaking individuals.