Showing papers on "Intraclass correlation" published in 2006


Journal Article
01 Nov 2006-Obesity
TL;DR: The purpose of this study was to calibrate and validate the ActiGraph accelerometer for use with 3‐ to 5‐year‐old children.
Abstract: Objective: Obesity rates in young children are increasing, and decreased physical activity is likely to be a major contributor to this trend. Studies of physical activity in young children are limited by the lack of valid and acceptable measures. The purpose of this study was to calibrate and validate the ActiGraph accelerometer for use with 3- to 5-year-old children. Research Methods and Procedures: Thirty preschool children wore an ActiGraph accelerometer (ActiGraph, Fort Walton Beach, FL) and a Cosmed portable metabolic system (Cosmed, Rome, Italy) during a period of rest and while performing three structured physical activities in a laboratory setting. Expired respiratory gases were collected, and oxygen consumption was measured on a breath-by-breath basis. Accelerometer data were collected at 15-second intervals. For cross-validation, the same children wore the same instruments while participating in unstructured indoor and outdoor activities for 20 minutes each at their preschool. Results: In calibrating the accelerometer, the correlation between Vo2 (ml/kg per min) and counts was r = 0.82 across all activities. The only significant variable in the prediction equation was accelerometer counts (R2 = 0.90, standard error of the estimate = 4.70). In the cross-validation, the intraclass correlation coefficient between measured and predicted Vo2 was R = 0.57 and the Spearman correlation coefficient was R = 0.66 (p < 0.001). Cut-off points for moderate- and vigorous-intensity physical activity were identified at 420 counts/15 s (Vo2 = 20 mL/kg per min) and 842 counts/15 s (Vo2 = 30 mL/kg per min), respectively. When these cutpoints were applied to the cross-validation data, percentage agreement, kappa, and modified kappa for moderate activity were 0.69, 0.36, and 0.38, respectively. For vigorous activity, the same measures were 0.81, 0.13, and 0.62. Discussion: Accelerometer counts were highly correlated with Vo2 in young children. 
Accelerometers can be appropriately used as a measure of physical activity in this population.

611 citations
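
The cutpoints reported in the ActiGraph abstract above (420 counts/15 s for moderate and 842 counts/15 s for vigorous activity) can be applied to epoch data roughly as follows. This is only a sketch: the function name, the category labels, and the choice to treat each cutpoint as an inclusive lower bound are assumptions, not details given in the abstract.

```python
# Cutpoints as reported in the abstract (counts per 15-second epoch).
MODERATE_CUTPOINT = 420
VIGOROUS_CUTPOINT = 842


def classify_epoch(counts_per_15s):
    """Label one 15-second epoch by intensity.

    Assumption: a count equal to a cutpoint falls in the higher category.
    """
    if counts_per_15s >= VIGOROUS_CUTPOINT:
        return "vigorous"
    if counts_per_15s >= MODERATE_CUTPOINT:
        return "moderate"
    return "below moderate"


# Hypothetical epoch counts, for illustration only.
epochs = [120, 430, 900, 60]
labels = [classify_epoch(c) for c in epochs]
print(labels)
```

Keeping the cutpoints as named constants makes it easy to substitute values calibrated for a different age group or epoch length.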


Journal Article
TL;DR: CAIT is a simple, valid, and reliable tool to measure severity of functional ankle instability and can be used as a comparison tool for assessing global perception of ankle instability.

527 citations


Journal Article
TL;DR: The AD8 is a brief, sensitive measure that validly and reliably differentiates between nondemented and demented individuals and can be used as a general screening device to detect cognitive change regardless of etiology and with different types of informants.
Abstract: Objective: To establish the validity, reliability, and discriminative properties of the AD8, a brief informant interview to detect dementia, in a clinic sample. Methods: We evaluated 255 patient–informant dyads. We compared the number of endorsed AD8 items with an independently derived Clinical Dementia Rating (CDR) and with performance on neuropsychological tests. Construct and concurrent validity, test–retest, interrater and intermodal reliability, and internal consistency of the AD8 were determined. Receiver operator characteristic curves were used to assess the discriminative properties of the AD8. Results: Concurrent validity was strong with AD8 scores correlating with the CDR ( r = 0.75, 95% CI 0.63 to 0.88). Construct validity testing showed strong correlation between AD8 scores, CDR domains, and performance on neuropsychological tests. The Cronbach alpha of the AD8 was 0.84 (95% CI 0.80 to 0.87), suggesting excellent internal consistency. The AD8 demonstrated good intrarater reliability and stability (weighted kappa = 0.67, 95% CI 0.59 to 0.75). Both in-person and phone administration showed equal reliability (weighted kappa = 0.65, 95% CI 0.57 to 0.73). Interrater reliability was very good (Intraclass correlation coefficient = 0.80, 95% CI 0.55 to 0.92). The area under the curve was 0.92 (95% CI 0.88 to 0.95), suggesting excellent discrimination between nondemented individuals and those with cognitive impairment regardless of etiology. Conclusion: The AD8 is a brief, sensitive measure that validly and reliably differentiates between nondemented and demented individuals. It can be used as a general screening device to detect cognitive change regardless of etiology and with different types of informants.

344 citations


Journal Article
TL;DR: All short forms demonstrated excellent criterion validity and good construct validity, and all short-form questionnaires were positively correlated with the ratings of oral health and overall well-being, with the correlation coefficient being higher for the latter.
Abstract: The Child Perceptions Questionnaire for children aged 11 to 14 years (CPQ11–14) is a 37-item measure of oral-health-related quality of life (OHRQoL) encompassing four domains: oral symptoms, functional limitations, emotional and social well-being. To facilitate its use in clinical settings and population-based health surveys, it was shortened to 16 and 8 items. Item impact and stepwise regression methods were used to produce each version. This paper describes the developmental process, compares the discriminative properties of the resulting four short-forms and evaluates their precision relative to the original CPQ11–14. The item impact method used data from the CPQ11–14 item reduction study to select the questions with the highest impact scores in each domain. The regression method, where the dependent variable was the overall CPQ11–14 score and the independent variables its individual questions, was applied to the data collected in the validity study for the CPQ11–14. The measurement properties (i.e. criterion validity, construct validity, internal consistency reliability and test-retest reliability) of all 4 short-forms were evaluated using the data from the validity and reliability studies for the CPQ11–14. All short forms detected substantial variability in children's OHRQoL. The mean scores on the two 16-item questionnaires were almost identical, while on the two 8-item questionnaires they differed by only one score point. The mean scores standardized to 0–100 were higher on the short forms than the original CPQ11–14 (p < 0.001). There were strong significant correlations between all short-form scores and CPQ11–14 scores (0.87–0.98; p < 0.001). 
Hypotheses concerning construct validity were confirmed: the short-forms' scores were highest in the oro-facial, lower in the orthodontic and lowest in the paediatric dentistry group; all short-form questionnaires were positively correlated with the ratings of oral health and overall well-being, with the correlation coefficient being higher for the latter. The relative validity coefficients were 0.85 to 1.18. Cronbach's alpha and intraclass correlation coefficients ranged 0.71–0.83 and 0.71–0.77, respectively. All short forms demonstrated excellent criterion validity and good construct validity. The reliability coefficients exceeded standards for group-level comparisons. However, these are preliminary findings based on the convenience sampling and further testing in replicated studies involving clinical and general samples of children in various settings is necessary to establish measurement sensitivity and discriminative properties of these questionnaires.

226 citations


Journal Article
TL;DR: The Spastic Paraplegia Rating Scale (SPRS) is a reliable and valid measure of disease severity; construct validity was shown by high correlation of the SPRS with the Barthel Index and the International Cooperative Ataxia Rating Scale and low correlation with the Mini-Mental Status Examination.
Abstract: Objective: To develop and evaluate a clinical Spastic Paraplegia Rating Scale (SPRS) to measure disease severity and progression. Methods: A 13-item scale was designed to rate functional impairment occurring in pure forms of spastic paraplegia (SP). Additional symptoms constituting a complicated form of SP are recorded in an inventory. Two independent patient cohorts were evaluated in a two-step validation procedure. Results: Application of SPRS requires less than 15 minutes and does not require any special equipment, so it is suitable for an outpatient setting. Interrater agreement of SPRS was high (intraclass correlation coefficient = 0.99). Reliability was further supported by high internal consistency (Cronbach α = 0.91). SPRS values were almost normally distributed without apparent floor or ceiling effect. Construct validity was shown by high correlation of SPRS to Barthel Index and the International Cooperative Ataxia Rating Scale (convergent validity) and low correlation to Mini-Mental Status Examination (discriminant validity). Conclusion: The Spastic Paraplegia Rating Scale is a reliable and valid measure of disease severity.

215 citations


Journal Article
TL;DR: In general, the PYTPAQ has acceptable reliability and validity for measurement of past-year physical activity that is comparable to that of similar questionnaires.
Abstract: The authors determined the validity and reliability of their Past Year Total Physical Activity Questionnaire (PYTPAQ), which assesses the frequency, duration, and intensity of occupational, household, and recreational activities performed over the past year. The PYTPAQ was completed twice at baseline, 9 weeks apart (on average), by 154 healthy Canadian men and women aged 35-65 years for assessment of reliability. The PYTPAQ was completed again 1 year later as a self-administered questionnaire. Four times during the year, participants wore an accelerometer for 7 days and completed 7-day physical activity logs. The authors assessed validity by comparing PYTPAQ summary values with 1-year averages of the physical activity logs and accelerometer data and with physical fitness and anthropometric data measured at baseline and 1 year. Spearman correlations for reliability (metabolic equivalent-hours/week) were 0.64 for total activity, 0.70 for occupational activity, 0.73 for recreational activity, and 0.65 for household activity. For total activity, the intraclass correlation coefficient for correlation between the PYTPAQ and the 7-day physical activity logs was 0.42 (95% confidence interval: 0.28, 0.54), and for the accelerometer data it was 0.18 (95% confidence interval: 0.03, 0.32). Spearman correlations between PYTPAQ hours/week of vigorous activity and maximal oxygen uptake were 0.37 and 0.32 at baseline and follow-up, respectively. In general, the PYTPAQ has acceptable reliability and validity for measurement of past-year physical activity that is comparable to that of similar questionnaires.

193 citations


Journal Article
TL;DR: The purpose of this study was to develop a reliable and valid instrument for measuring perceived self‐efficacy in patients with an ACL injury.
Abstract: It has been suggested that self-efficacy belief is of major importance for rehabilitation outcome after sports-related injuries. No instruments are, however, available to evaluate perceived self-efficacy for prognostic and outcome expectations in patients with an anterior cruciate ligament (ACL) injury. Perceived self-efficacy is defined as a judgment of one's potential ability to carry out a task, rather than a measure of whether or not one actually can or does perform the task. The purpose of this study was to develop a reliable and valid instrument for measuring perceived self-efficacy in patients with an ACL injury. A total of 210 male and female patients with an ACL injury were included in this study. The items were generated by health professionals with long clinical experience of patients with an ACL injury and by discussions with patients. After item analysis and item reduction, based on the results from 88 patients, the final 22-item version of the Knee Self-Efficacy Scale (K-SES) was evaluated in 18 patients for test-retest reliability and in 104 patients for internal consistency and validity. The K-SES was compared with the Multidimensional Health Locus of Control (MHLC), Coping Strategies Questionnaire (CSQ), SF-36 and Knee Injury and Osteoarthritis Outcome Score (KOOS) instruments. A factor analysis was also performed on the K-SES. The test-retest revealed a correlation of r(s)=0.73 between test-days and an intraclass correlation coefficient of 0.75. No significant difference between test-days was found. The internal consistency was 0.94, as calculated with Cronbach's alpha. There were low correlations between the K-SES and MHLC and the K-SES and CSQ, respectively. A strong correlation was found between the K-SES and physical functioning, as measured by the SF-36 (r(s)=0.8). All the sub-scales in the KOOS correlated moderately to strongly (r(s)=0.4-0.7) to the K-SES. The factor analysis produced two factors of importance. 
Factor one was related to how patients perceived their present physical performance/function, while factor two was related to how patients perceived the future physical performance/prognosis of their knee. Good reliability and good face, content, construct and convergent validity were demonstrated for this new instrument (K-SES) for measuring perceived self-efficacy in patients with an ACL injury. The K-SES is recommended for studies designed to evaluate prognostic and outcome expectations of perceived self-efficacy in patients with an ACL-insufficient knee.

172 citations


Journal Article
TL;DR: To assess the performance of self‐assessment scales in severely demented hospitalized patients and to compare it with observational data.
Abstract: OBJECTIVES: To assess the performance of self-assessment scales in severely demented hospitalized patients and to compare it with observational data. DESIGN: Prospective clinical study. SETTING: Geriatrics hospital and a geriatric psychiatry service. PARTICIPANTS: All patients who met Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, criteria for dementia, with a Mini-Mental State Examination score less than 11 and a Clinical Dementia Rating score of 3. MEASUREMENTS: Three self-assessment tools—the verbal, horizontal visual, and faces pain scales—were administered in randomized order. A nursing team independently completed an observational pain rating scale. Main outcomes were comprehension (ability to explain scale use and correctly indicate positions for no pain and extreme pain, on two separate occasions), inter- and intrarater reliability, and comparison of pain intensities measured by the different scales. RESULTS: Sixty-one percent of 129 severely demented patients (mean age 83.7, 69% women) demonstrated comprehension of at least one scale. Comprehension rates were significantly better for the verbal and the faces pain scales. For patients who demonstrated good comprehension, the inter- and intrarater reliability of the three self-assessment scales was high (intraclass correlation coefficient=0.88–0.98). Correlation between the three self-assessment scales was moderate to strong (Spearman correlation coefficient (r)=0.45–0.94; P<.001). Observational rating correlated at least moderately with self-assessment (r=0.25–0.63), although for patients reporting pain, the observational rating scale underestimated severity compared with all three self-assessment scales. CONCLUSION: Clinicians should not apply observational scales routinely in severely demented patients, because many are capable of reliably reporting their own pain.

172 citations


Journal Article
TL;DR: The Brazilian Portuguese version of the Fugl-Meyer Assessment Scale did not show any conflicts of interpretation, thereby allowing this version to be used as an instrument for clinical evaluation and research in Brazil.
Abstract: Reliability Study on the Application of the Fugl-Meyer Scale in Brazil Objective: The aim of this study was to produce a Brazilian version of the original Fugl-Meyer Assessment Scale and to verify the intrarater and interrater reliability in chronic post-stroke patients. Method: Fifty hemiparetic patients participated in this study. The Fugl-Meyer assessment was applied to them twice (intrarater reliability) by three physiotherapists (interrater reliability), from three rehabilitation centers. Results: The results showed that the whole Fugl-Meyer scale demonstrated high interrater and intrarater reliability (intraclass correlation coefficient = 0.99 and 0.98, respectively), and high reliability for each subscale (intraclass interrater = 0.99 to 0.94; intraclass intrarater = 0.98 to 0.87). Conclusion: It was concluded that the Brazilian Portuguese version of the Fugl-Meyer Assessment Scale did not show any conflicts of interpretation. High intrarater and interrater reliability rates were obtained, thereby allowing this version to be used as instrument for clinical evaluation and research in Brazil.

164 citations


Journal Article
TL;DR: The five-item CYBOCS-PDD is reliable, distinct from other measures of repetitive behavior, and sensitive to change.
Abstract: Objective To examine the psychometric properties of the Children's Yale-Brown Obsessive Compulsive Scales (CYBOCS) modified for pervasive developmental disorders (PDDs). Method Raters from five Research Units on Pediatric Psychopharmacology (RUPP) Autism Network were trained to reliability. The modified scale (CYBOCS-PDD), which contains only the five Compulsion severity items (range 0-20), was administered to 172 medication-free children (mean 8.2 ± 2.6 years) with PDD (autistic disorder, n = 152; Asperger's disorder, n = 6; PDD not otherwise specified, n = 14) participating in RUPP clinical trials. Reliability was assessed by intraclass correlation coefficient (ICC) and internal consistency by Cronbach's α coefficient. Correlations with ratings of repetitive behavior and disruptive behavior were examined for validity. Results Eleven raters showed excellent reliability (ICC = 0.97). The mean CYBOCS score was 14.4 (± 3.86) with excellent internal consistency (α = .85). Correlations with other measures of repetitive behavior ranged from r = 0.11 to r = 0.28 and were similar to correlations with measures of irritability ( r = 0.24) and hyperactivity ( r = 0.25). Children with higher scores on the CYBOCS-PDD had higher levels of maladaptive behaviors and lower adaptive functioning. Conclusions The five-item CYBOCS-PDD is reliable, distinct from other measures of repetitive behavior, and sensitive to change.

159 citations


Journal Article
TL;DR: The physical activity computer questionnaire variables provide reliable information, and sport participation and the frequencies of moderate and hard activity provide valid data about adolescents' usual-week physical activity, based on comparison with the CSA accelerometer.
Abstract: The reliability and validity of a physical activity computer questionnaire of a usual week were studied in 33 adolescents between 12 and 18 years of age. Intraclass correlation coefficients and Kappa values were calculated to verify test-retest reliability. Validity was investigated by calculating Pearson correlation coefficients between the questionnaire and the Computer Science and Applications uniaxial accelerometer (CSA). Accelerometer data were obtained during seven successive days (sum and mean counts, estimated MET). Intraclass coefficients generally exceeded 0.70 and all Kappa values but one varied between 0.44 and 1.00. Transport variables (active transport from and to school, and during leisure time) showed no relationship with CSA. Sport participation during leisure time, sport participation summed with total transport, and the frequencies of moderate and hard activity were significantly correlated with CSA (r between 0.48 and 0.78). These data indicate that the physical activity computer variables provide reliable information. Moreover, sport participation (and summed with total transport) and the frequencies of moderate and hard activity provide valid data about adolescents' usual week physical activity, based on CSA comparison.

Journal Article
TL;DR: Resistance to passive movement in children with spastic cerebral palsy was assessed by two raters using the Modified Ashworth Scale and the Modified Tardieu Scale; the intraclass correlation coefficients of both scales were low.
Abstract: Resistance to passive movement in children with spastic cerebral palsy was assessed by two raters using the Modified Ashworth Scale and the Modified Tardieu Scale. Four muscle groups in the lower limbs were tested using a standardized procedure. Interrater reliability of the scales was evaluated by the intraclass correlation coefficient. Seventeen children, with a mean age of 7 years 9 months, were included. Two children were rated twice. The intraclass correlation coefficients of both scales were low and did not reach the acceptable limit of 0.75. Caution should be used when these scales are applied.
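
As an illustration of how an interrater intraclass correlation coefficient like the one discussed above can be computed, here is a minimal sketch of the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)). The abstract does not say which ICC form the authors used, so the choice of form, the function name, and the example ratings are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater.

    Uses the standard two-way ANOVA mean squares; this form is an assumed
    choice, not necessarily the one used in the study.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores from two raters on five children, for illustration only.
example = [[1, 2], [2, 2], [3, 1], [0, 1], [2, 3]]
icc = icc_2_1(example)
ACCEPTABLE_LIMIT = 0.75  # the acceptable limit quoted in the abstract
print(round(icc, 2), icc >= ACCEPTABLE_LIMIT)
```

Because the absolute-agreement form penalises systematic differences between raters, it tends to give lower values than a consistency-only ICC computed on the same data.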

Journal Article
01 Jan 2006-Spine
TL;DR: Examination of the reliability of examination items and a classification decision-making algorithm using physical therapists with varying levels of experience may improve the reproducibility of classification methods.
Abstract: Study design Test-retest design to examine interrater reliability. Objective Examine the interrater reliability of individual examination items and a classification decision-making algorithm using physical therapists with varying levels of experience. Summary of background data Classifying patients based on clusters of examination findings has shown promise for improving outcomes. Examining the reliability of examination items and the classification decision-making algorithm may improve the reproducibility of classification methods. Methods Patients with low back pain less than 90 days in duration participating in a randomized trial were examined on separate days by different examiners. Interrater reliability of individual examination items important for classification was examined in clinically stable patients using kappa coefficients and intraclass correlation coefficients. The findings from the first examination were used to classify each patient using the decision-making algorithm by clinicians with varying amounts of experience. The reliability of the classification algorithm was examined with kappa coefficients. Results A total of 123 patients participated (mean age 37.7 [+/-10.7] years, 44% female), 60 (49%) remained stable between examinations. Reliability of range of motion, centralization/peripheralization judgments with flexion and extension, and the instability test were moderate to excellent. Reliability of centralization/peripheralization judgments with repeated or sustained extension or aberrant movement judgments were fair to poor. Overall agreement on classification decisions was 76% (kappa = 0.60, 95% confidence interval 0.56, 0.64), with no significant differences based on level of experience. Conclusion Reliability of the classification algorithm was good. Further research is needed to identify sources of disagreements and improve reproducibility.
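
The abstract above reports both percentage agreement and a kappa coefficient for classification decisions. A minimal sketch of unweighted Cohen's kappa for two raters is shown below; it assumes the unweighted two-rater form (the abstract does not specify the variant), and the function name and category labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of four patients by two examiners.
first = ["x", "x", "y", "y"]
second = ["x", "y", "y", "y"]
kappa = cohen_kappa(first, second)
print(kappa)
```

Kappa discounts the agreement two raters would reach by chance, which is why it is lower than the raw percentage agreement.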

Journal Article
TL;DR: There is substantial interobserver variability and moderate intraobserver variability among embryologists, which could alter both the expected quality of embryos transferred and the number transferred, both of which directly impact IVF program success.

Journal Article
TL;DR: The ULFI demonstrated sound psychometric properties, practical characteristics, and clinical utility thereby making it a viable clinical outcome tool for the determination of upper limb status and impairment.

Journal Article
01 Aug 2006-Stroke
TL;DR: Agreement between patient and proxy HRQL domain scores is modest at best and is affected by patient depression and proxy perception of burden, which may impact the outcome assessment in stroke clinical trials.
Abstract: Background and Purpose: Proxy respondents are often needed to report outcomes in stroke survivors, but they typically systematically rate impairments worse than patients themselves. The magnitude of this difference, the degree of agreement between patients and proxies, and the factors influencing agreement are not well known. Methods: We compared patient and family proxy health-related quality of life (HRQL) responses in 225 patient–proxy pairs enrolled in a clinical trial for poststroke depression. We used paired t-tests and the intraclass correlation (ICC) statistic to evaluate the agreement between patient and proxy domain scores and the overall Stroke-specific Quality of Life (SS-QOL) score. We used multivariate linear regression to model patient- and proxy-reported SS-QOL scores. Results: Patients were older (63 versus 55 years) and less often female (48% versus 74%) than proxies. Proxies rated all domains of SS-QOL slightly worse than patients. The Mood, Energy, and Thinking domains had the greates...

Journal Article
TL;DR: Intratester reliability did not always ensure acceptable intertester reliability or measurement precision, suggesting more training may be required to achieve acceptable measurement reliability and precision between multiple testers.
Abstract: Objective: To determine whether multiple examiners could be trained to measure lower extremity anatomic characteristics with acceptable reliability and precision, both within (intratester) and between (intertester) testers. We also determined whether testers trained 18 months apart could perform these measurements with good agreement. Setting: University's Applied Neuromechanics Research Laboratory. Participants: Sixteen healthy participants (7 men, 9 women). Assessment of Risk Factors: Six investigators measured 12 anatomic characteristics on the right lower extremity in the Fall of 2004. Four testers underwent training immediately preceding the study, and measured subjects on 2 separate days to examine intratester reliability. Two testers trained 18 months before the study (Spring 2002) measured each subject on day 1 to examine the consistency of intertester reliability when testers are trained at different times. Main Outcome Measurements: Knee laxity, genu recurvatum, quadriceps angle, tibial torsion, tibiofemoral angle, hamstring extensibility, pelvic angle, navicular drop, femur length, tibial length, and hip anteversion. Results: With few exceptions, all testers consistently measured each variable between test days (intraclass correlation coefficient ≥ 0.80). Intraclass correlation coefficient values were lower for intertester reliability (0.48 to 0.97), and improved from day 1 to day 2. Intertester reliability was similar when comparing testers trained 18 months before those trained immediately before the study. Absolute measurement error varied considerably across individual testers. Conclusions: Multiple investigators can be trained at different times to measure anatomic characteristics with good to excellent intratester reliability. Intratester reliability did not always ensure acceptable intertester reliability or measurement precision, suggesting more training (or more experience) may be required to achieve acceptable measurement reliability and precision between multiple testers.

Journal Article
TL;DR: These findings support the use of the PQ-LES-Q as an additional measure of current clinical status and outcome because it taps dimensions that are not covered by the commonly used global severity of illness or symptomatic measures.
Abstract: Objective The pediatric version of the Short Form of the Quality of Life Enjoyment and Satisfaction Questionnaire (PQ-LES-Q) was developed to aid in the assessment of an important aspect of life experience in children and adolescents. Method The reliability and validity of the PQ-LES-Q was tested using data from a sample of 376 outpatient children (6-11 years old) and adolescents (12-17 years old) with major depressive disorder. Results The internal consistency coefficients at screening, baseline, and endpoint were high (0.87, 0.90, 0.89, respectively) as was the 1-week test-retest intraclass correlation coefficient of reliability (0.78). The correlations of the PQ-LES-Q total score with concurrent measures of severity of illness were in the moderate range (e.g., Global Clinical Impression of Severity, −0.40; Children's Global Assessment Scale, 0.36; Children's Depression Rating Scale total score, −0.45), as were the correlations with measures of change between baseline and endpoint (e.g., Clinical Global Impression of Severity, −0.34; Children's Global Assessment Scale, 0.33; Children's Depression Rating Scale total score, −0.45). Conclusions These findings support the use of the PQ-LES-Q as an additional measure of current clinical status and outcome because it taps dimensions that are not covered by the commonly used global severity of illness or symptomatic measures.

Journal Article
TL;DR: A patient-derived questionnaire can provide a high level of agreement with surgeon assessments of outcome following shoulder surgery and should continue to be evaluated as a means of assessment of these patients.
Abstract: Background: We found no information in the literature regarding the relationship between patient and physician-derived outcome assessments with a shoulder questionnaire. In this study, we examined a group of patients who were assessed with patient and physician-administered questionnaires following shoulder arthroplasty. Methods: From August 2003 to February 2004, sixty-seven consecutive patients who had been followed for a minimum of six months after shoulder arthroplasty were evaluated with a self-administered and an identical physician-directed shoulder questionnaire that assessed clinical and functional outcomes at the time of routine follow-up. An assessment of the agreement between physicians and patients as well as the factors that affected agreement was performed. Results: The intraclass correlation indicated almost perfect physician-patient agreement (>0.80) on items related to overall pain, pain at night, pain with activity, stability, and active elevation and substantial agreement (intraclass correlation, 0.66 and 0.69) between the physician and patient assessments of pain without activity and strength. While the differences were small, on the average physician ratings for pain were lower (indicating less pain) than patient ratings for pain, physicians rated stability and strength as being closer to normal, and they reported less active elevation. There was substantial agreement between the physician and patient assessments of outcome with the modified Neer system (intraclass correlation = 0.75), with 87% agreement if excellent and satisfactory outcomes were combined. Conclusions: A patient-derived questionnaire can provide a high level of agreement with surgeon assessments of outcome following shoulder surgery. Patient-administered methods should continue to be evaluated as a means of assessment of these patients.

Journal Article
TL;DR: The findings of this study provide further evidence of the utility of the ALBPSQ in clinical studies and in primary care settings to help identify patients at risk of developing chronic LBP and disability.
Abstract: Objectives: The purpose of this study was to evaluate the reliability and construct and predictive validity of the Norwegian version of the Acute Low Back Pain Screening Questionnaire (ALBPSQ). Methods: A prospective study with a 12-month follow-up was conducted on 123 patients with acute low back pain (LBP) seeking help in primary health care for the first time and 50 patients with chronic LBP for more than 3 months. Results: Test-retest reliability was high with intraclass correlation coefficients of 0.90, minimal detectable change of 12 points (of a total score of 210), and coefficient of variation of 4%. Internal consistency was 0.95. Principal-components analysis revealed 3 factors explaining 49% of the variance. The ALBPSQ score correlated highly (rZ0.60) with disability variables, moderately (0.30

Journal ArticleDOI
TL;DR: The BILAG-2004 index is a reliable tool to assess SLE activity and the use of a well-defined glossary and training of raters are essential to ensure the optimal performance of the index.
Abstract: OBJECTIVE: To test the interrater reliability of the revised British Isles Lupus Assessment Group 2004 (BILAG-2004) index for the assessment of systemic lupus erythematosus (SLE) activity. METHODS: Patients with SLE were recruited from 11 centers. Two physician raters separately assessed the patients' disease activity using the BILAG-2004 index in routine clinical practice. Scores ranged from A (for very active disease) to E (for inactivity). Two reliability exercises were performed. Changes were made to the index after the first exercise (E1), and additional training was provided to the raters before the second exercise (E2). E1 and E2 involved 12 and 14 raters, respectively. Interrater reliability was assessed using kappa statistics and intraclass correlation coefficients. Levels of agreement and the extent of major disagreement were also examined. Major disagreement was defined as a score difference between raters of A versus C, D, or E or B versus D or E. RESULTS: For each exercise, 97 patients were recruited. In E1, the mean age of the patients was 42.3 years (range 18.5-82.2 years), 89.7% were women, and 74.2% were white, 8.2% were Afro-Caribbean, and 13.4% were South Asian, and in E2, the mean age was 43.7 years (range 17.7-75 years), 90.7% were women, and 68% were white, 15.5% were Afro-Caribbean, and 11.3% were South Asian. The mean disease duration was 9.4 years (range 0-32.1 years) for patients in E1 and 10 years (range 0-34.8 years) in E2. There was improvement in the interrater reliability and the level of agreement from E1 to E2. Further improvement was achieved after removal of poorly performing items. CONCLUSION: The BILAG-2004 index is a reliable tool to assess SLE activity. The use of a well-defined glossary and training of raters are essential to ensure the optimal performance of the index.
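
The "major disagreement" rule defined in the abstract above (A versus C, D, or E; B versus D or E) can be written as a small check. This is only an illustrative sketch: the function name `is_major_disagreement` and the sample grade pairs are hypothetical and do not come from the study.

```python
# Illustrative sketch only: the function name and sample data are hypothetical.
# The rule itself follows the abstract: A vs C/D/E, or B vs D/E.
VALID_GRADES = set("ABCDE")

def is_major_disagreement(grade_1: str, grade_2: str) -> bool:
    """Return True if two raters' BILAG-2004 grades differ by a 'major' margin."""
    g1, g2 = grade_1.upper(), grade_2.upper()
    if g1 not in VALID_GRADES or g2 not in VALID_GRADES:
        raise ValueError("grades must be one of A, B, C, D, E")
    # Alphabetical order matches activity order (A = very active, E = inactive),
    # so the first sorted element is the more active grade of the pair.
    more_active, less_active = sorted((g1, g2))
    if more_active == "A":
        return less_active in {"C", "D", "E"}
    if more_active == "B":
        return less_active in {"D", "E"}
    return False

# Hypothetical rater pairs, not data from the paper.
pairs = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("E", "A")]
flags = [is_major_disagreement(a, b) for a, b in pairs]
```

The check is symmetric in its arguments, so it does not matter which rater is listed first.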

Journal ArticleDOI
TL;DR: The 7-point Global Overall Symptom scale was validated in two randomized controlled trials and found to be a simple, valid outcome measure for dyspepsia treatment trials.
Abstract: Background: Currently there is no consensus on the optimal method to measure the severity of dyspepsia symptoms in clinical trials. Aim: To validate the 7-point Global Overall Symptom scale. Methods: The Global Overall Symptom scale uses a 7-point Likert scale ranging from 1 = no problem to 7 = a very severe problem. Validation was performed in two randomized-controlled trials (n = 1121 and 512). Construct validity: Global Overall Symptom was compared with the Quality of Life in Reflux And Dyspepsia, Gastrointestinal Symptom Rating Scale, Reflux Disease Questionnaire and 10 specific symptoms using Spearman correlation coefficients. Test–retest reliability: The Intraclass Correlation Coefficient was calculated for patients with stable dyspepsia defined by no change in Overall Treatment Effect score over two visits. Responsiveness: effect size and standardized response mean were also calculated. Results: Construct validity: Change in Global Overall Symptom score correlated significantly with Quality of Life in Reflux And Dyspepsia, Gastrointestinal Symptom Rating Scale, Reflux Disease Questionnaire and specific symptoms (all P < 0.0002). Reliability: The Intraclass Correlation Coefficient was 0.62 (n = 205) and 0.42 (n = 270). Responsiveness: There was a positive correlation between change in Global Overall Symptom and change in symptom severity. The effect size and standardized response mean were 1.1 and 2.1, respectively. Conclusion: The Global Overall Symptom scale is a simple, valid outcome measure for dyspepsia treatment trials.
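
The two responsiveness statistics named in the abstract above are commonly defined as the mean change divided by the baseline standard deviation (effect size) and the mean change divided by the standard deviation of the change scores (standardized response mean). The sketch below assumes those conventional definitions; the abstract does not state which formulas the authors used, and the scores are invented for illustration.

```python
from statistics import mean, stdev

def effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores (assumed definition)."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)

def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores (assumed definition)."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)

# Hypothetical 7-point symptom scores for six patients (not data from the trials).
baseline = [6, 5, 6, 4, 5, 6]
follow_up = [3, 2, 4, 2, 2, 3]

es = effect_size(baseline, follow_up)
srm = standardized_response_mean(baseline, follow_up)
```

When patients improve by similar amounts, the change scores vary less than the baseline scores, so the standardized response mean exceeds the effect size, which is the same ordering the abstract reports.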

Journal ArticleDOI
TL;DR: The modified WMFT is reliable and valid as an outcome measure for people with chronic moderate and mild UE hemiparesis and is stable, but 1 repeat testing is recommended when practical.

Journal ArticleDOI
TL;DR: The authors developed a physical activity questionnaire for Brazilian adolescents and assessed its validity and reproducibility; it comprised 17 questions on habitual physical activity in the last 12 months (15 questions on sports and physical exercise and two on transportation physical activity).
Abstract: OBJECTIVE: To develop a physical activity questionnaire aimed at Brazilian adolescents and to assess its validity and reproducibility. METHODS: A total of 94 adolescents (30 males and 64 females) aged 11-16 years were included in the study, which was conducted in 2004. The questionnaire comprised 17 questions on habitual physical activity in the last 12 months (15 questions on sports and physical exercise and two on transportation physical activity), and was standardized to yield final scores for weekly and yearly activity. As a reference, we used the multistage 20-meter shuttle run test, measuring variables maximum time in minutes, maximum speed, maximum oxygen uptake and maximum heart rate. For validity analysis, we used the Spearman coefficient and age-adjusted correlation. For reproducibility analysis, we repeated evaluations after 15 days and measured the intraclass correlation coefficient. RESULTS: For the weekly score, the highest correlations were obtained for maximum time for the entire sample (r=0.19), maximum speed for males (r=0.20), and both maximum oxygen uptake and maximum time for females (r=0.17). For the yearly score, the highest correlations were obtained for maximum time for the entire sample (r=0.30), maximum heart rate for males (r=0.22), and maximum time for females (r=0.23). In reproducibility analyses, correlations were 0.61 for weekly score and 0.68 for yearly score. CONCLUSIONS: The questionnaire was valid and reproducible. Its use is recommended for the evaluation of physical activity in epidemiological studies with adolescents.

Journal ArticleDOI
TL;DR: Based on the natural variability of the tasks, the 5-min walking and stair-climbing task, and to a lesser degree the 50-ft (15 m) walking, sit-to-stand and loaded forward reach, seem clinically useful.
Abstract: Objective: To examine the influence of task experience on the difference between test and retest and to assess test-retest reliability and limits of agreement of six performance tasks in chronic low back pain patients. These measures will be used to define the clinical usability. Design: Test-retest of six performance tasks in a group of patients with no experience and a group of patients after previous experience with these tasks. Setting: Three rehabilitation centres. Subjects: Fifty-three patients with non-specific chronic low back pain. Main measures: Five-minute walking, 50-ft (15 m) fast walking, sit-to-stand, loaded forward reach, 1-min stair-climbing and Progressive Isoinertial Lifting Evaluation (PILE). To assess the influence of task experience, differences between test and retest between both groups were tested using Mann-Whitney test. For both groups together, intraclass correlation coefficients (ICCs) and the limits of agreement using Bland and Altman plots were calculated. Results: Thirty patient...

Journal ArticleDOI
TL;DR: The Stoma Quality of Life Scale demonstrates reasonable psychometric properties for measuring quality of life in patients with stomas, and is capable of discriminating between patients with better and worse quality of life after stoma formation.
Abstract: Few studies have evaluated the impact of a stoma on patient quality of life because of a lack of specific validated measures. This study documents the development and initial application of a Stoma Quality of Life Scale. Content experts generated initial questions. Patient focus groups were conducted to ensure that the questions addressed all stoma-related issues considered important by patients. Responses from pilot groups allowed refinement to produce the final measure, the Stoma Quality of Life Scale, a 21-item questionnaire. Three scales are featured: Work/Social Function (6 items), Sexuality/Body Image (5 items), and Stoma Function (6 items). In addition, one item (scored separately) measures financial impact, one measures skin irritation, and two measure overall satisfaction. This questionnaire was administered to 100 consecutive ostomy patients, and readministered three weeks later. Reliability was assessed by using coefficient alpha for internal consistency and intraclass correlation coefficient for test-retest reproducibility. To test validity in extreme groups, scores were compared for patients with improved quality of life vs. those whose stoma worsened their quality of life. To evaluate convergent validity, we analyzed correlation of instrument scales with the SF-12. The Stoma Quality of Life scales demonstrated adequate test-retest reproducibility (intraclass correlation coefficient >0.8) and acceptable internal consistency (coefficient alpha approximately 0.8). The scales were capable of discriminating between patients with better and worse quality of life after stoma formation (P < 0.02 for all scales). The Stoma Quality of Life scales significantly correlated (range, 0.12–0.75) with the Physical and Mental Health Composite Scale Scores of the SF-12. The Stoma Quality of Life Scale demonstrates reasonable psychometric properties for measuring quality of life in patients with stomas. Further studies are needed to refine the instrument.

Journal ArticleDOI
TL;DR: The SAOM was found to be highly reliable, to have very strong validity, and to be sensitive to clinical change when used to monitor substance abuse treatment among patients with a primary substance use diagnosis.
Abstract: Objective: The study sought to determine the validity and reliability of the Substance Abuse Outcomes Module (SAOM), a self-report tool designed to assess patient characteristics, process of care, and outcomes of care, using a minimum amount of information, in order to improve treatment. Methods: A longitudinal field test (baseline and three-month follow-up) compared the SAOM to seven other research instruments in the assessment of 100 substance-abusing patients who were entering a new treatment episode. Quota samples of patients were drawn from two private inpatient substance abuse treatment facilities and an outpatient methadone clinic. The study's primary outcome measures were diagnostic accuracy, internal and test-retest reliability of key constructs, concurrent and predictive validity, and sensitivity to change. Cronbach's alpha coefficients were calculated to examine internal consistency and reliability. Intraclass correlation coefficients and kappa coefficients were used to examine test-retest reliability. Concurrent validity of outcomes measures was examined with Pearson or Spearman correlation coefficients and chi square and kappa statistics. Changes between baseline and follow-up were examined as a function of case-mix measures with ordinary least-squares multiple regression. Sensitivity to change was examined by calculating effect size scores. Results: The SAOM had high internal consistency and a high level of agreement with research diagnoses at baseline and follow-up. The SAOM was found to be highly reliable, to have very strong validity, and to be sensitive to clinical change. Conclusions: The SAOM appears to be a reasonably reliable and valid self-report instrument when used to monitor substance abuse treatment among patients with a primary substance use diagnosis.

Journal ArticleDOI
TL;DR: Findings suggest this questionnaire is reliable and acceptable to children for assessing environmental perceptions relevant to physical activity among 11-year-old children.
Abstract: Background: Environmental factors are increasingly being implicated as key influences on children's physical activity. Few studies have comprehensively examined children's perceptions of their environment, and there is a paucity of literature on acceptable and reliable scales for measuring these. This study aimed to develop and test the acceptability and reliability of a scale which examined a broad range of environmental perceptions among children. Methods: Based on constructs from ecological models, a survey incorporating items on children's perceptions of the physical and social environment at home and in the neighbourhood was developed. This was administered on two occasions, nine days apart, to a sample of 39 children aged 11 years (54% boys), attending a metropolitan Australian elementary school. The acceptability of the survey was determined by the proportion of missing responses to each item. The test-retest reliability of individual items, scores and scales were determined using Kappa statistics and percent agreement for categorical variables, and intraclass correlation coefficients (ICC) for continuous variables. Results: There were few missing responses to each question, with only 4% of all responses missing. Although some Kappa values were low, all categorical variables showed acceptable reliability when examined for percent agreement between test and retest (range 68%–100% agreement). Continuous variables all showed moderate to good ICC values (range 0.72–0.92). Conclusion: Findings suggest this questionnaire is reliable and acceptable to children for assessing environmental perceptions relevant to physical activity among 11-year-old children.
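
The abstract above reports low kappa values alongside high percent agreement. That pattern typically arises when almost all respondents give the same answer, which makes chance-expected agreement very high. The following sketch computes both statistics on invented yes/no answers to show the effect; the function names and data are hypothetical, and the kappa formula is the standard Cohen's kappa, which the abstract does not explicitly confirm as the variant used.

```python
from collections import Counter

def percent_agreement(test, retest):
    """Share of items answered identically at test and retest."""
    return sum(a == b for a, b in zip(test, retest)) / len(test)

def cohen_kappa(test, retest):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement).

    Undefined (division by zero) if expected agreement equals 1.
    """
    n = len(test)
    p_observed = percent_agreement(test, retest)
    counts_test, counts_retest = Counter(test), Counter(retest)
    categories = set(counts_test) | set(counts_retest)
    # Chance agreement from the marginal proportions of each category.
    p_expected = sum(counts_test[c] * counts_retest[c] for c in categories) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical answers from 20 children: 18 say "yes" both times,
# one switches yes -> no and one switches no -> yes.
test_answers = ["yes"] * 18 + ["yes", "no"]
retest_answers = ["yes"] * 18 + ["no", "yes"]

agreement = percent_agreement(test_answers, retest_answers)
kappa = cohen_kappa(test_answers, retest_answers)
```

With these invented answers, 90% of responses agree, yet kappa is slightly negative because 90.5% agreement would be expected by chance alone given how skewed the responses are.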

Journal ArticleDOI
TL;DR: This study evaluated the reliability and sensitivity to change over time of a newly developed self-administered version of the ALS functional rating scale-revised (ALSFRS-R) in 60 consecutive patients from an ALS clinic.
Abstract: We evaluated the reliability and sensitivity to change over time of a newly developed self-administered version of the ALS functional rating scale-revised (ALSFRS-R) in 60 consecutive patients from an ALS clinic. The self-administered ALSFRS-R showed excellent reliability (intraclass correlation = 0.93, 95% CI: 0.88 to 0.96) and similar sensitivity to change over time vs the standard evaluator-administered ALSFRS-R.

Journal ArticleDOI
TL;DR: The FI-2 is a valid and reliable outcome measure of impairment for patients with polymyositis or dermatomyositis. It is well tolerated, and the unilateral FI-2 requires a maximum of 20 minutes to perform.
Abstract: Objective: To revise the content of the Functional Index in myositis (FI) and to evaluate measurement properties of a revised FI. Methods: Previously performed FI (n = 287) were analyzed for internal redundancy and consistency, and ceiling and floor effects. Content was evaluated and a preliminary revised FI was developed. To evaluate the construct validity of the preliminary revised FI, it was compared with isokinetic measurements of muscular strength and endurance, the Myositis Activities Profile, disease impact on general wellbeing, and creatine phosphokinase levels. Minor adjustments were made and the revised FI was investigated for interrater reliability and intrarater reliability over a 1-week period. After this, some minor, additional adjustments were made leading to the final version, FI-2. Results: Five tasks were removed from the original FI due to ceiling effects. Performance pace and number of repetitions were modified for the remaining tasks. A moderate correlation (rs = 0.58) was found between the shoulder flexion task of the preliminary revised FI and isokinetic measurements of shoulder flexion endurance. Intraclass correlation coefficient (ICC) for interrater reliability of the revised FI varied from 0.86–0.99 with no systematic differences. ICC for intrarater reliability varied from 0.56–0.99 with systematic differences (P < 0.05) between test and retest in 3 of the tasks. The sit-up task was excluded due to low intrarater reliability resulting in the final 7-item FI-2. There was a good correlation between tasks on the right and left side suggesting that the FI-2 could be performed unilaterally. Conclusion: The FI-2 is a valid and reliable outcome measure of impairment for patients with polymyositis or dermatomyositis. It is well tolerated and the unilateral FI-2 requires a maximum of 20 minutes to perform. Further evaluation of sensitivity to change and testing in healthy individuals needs to be conducted.
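
The abstract above reports ICCs together with a note on systematic differences between sessions. One way to see why both matter is to compare two single-measure ICC forms from the Shrout and Fleiss two-way ANOVA framework: an absolute-agreement form, which is lowered by a systematic offset between raters or sessions, and a consistency form, which ignores it. The sketch below is an assumption-laden illustration: the abstract does not say which ICC form the authors used, the function name `icc_single` is hypothetical, and the scores are invented.

```python
def icc_single(scores):
    """Single-measure ICCs from a two-way ANOVA decomposition.

    scores: one row per subject, one column per rater (or session).
    Returns (absolute_agreement, consistency), i.e. the forms often
    labelled ICC(2,1) and ICC(3,1) in the Shrout and Fleiss scheme.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    agreement = (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )
    consistency = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
    return agreement, consistency

# Hypothetical repetition counts for five patients; the second column runs
# about two points higher than the first (a systematic difference).
scores = [[10, 12], [14, 16], [18, 21], [22, 23], [26, 28]]
icc_agreement, icc_consistency = icc_single(scores)
```

On these invented scores the consistency form is close to 1 while the absolute-agreement form is noticeably lower, which is the kind of gap a systematic test-retest difference would produce.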