Showing papers on "Intraclass correlation" published in 1997


Journal ArticleDOI
TL;DR: This paper notes that intraclass correlation refers to a group of coefficients rather than a single one, and that Lin's concordance correlation coefficient, proposed as an alternative for evaluating the reproducibility of measurements between two trials of an assay or instrument, is nearly identical to a subset of that group.
Abstract: Lin (1989, Biometrics 45, 255-268) objected to the use of the intraclass correlation coefficient as a way to evaluate the reproducibility of measurements between two trials of an assay or instrument and developed an alternative called the concordance correlation coefficient. It is noted that intraclass correlation refers not to a single coefficient but to a group of coefficients and that Lin's alternative is nearly identical to a subset of the coefficients in this group.

525 citations
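
The comparison in the entry above can be made concrete with a small sketch. It computes Lin's concordance correlation coefficient and a one-way random-effects ICC for two trials per subject. The function names, the choice of the one-way ICC form, the 1/n moments in the concordance estimate, and the data are all assumptions of this sketch, not taken from the paper.

```python
from statistics import fmean


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    # 1/n (population-style) moments; an assumption of this sketch
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The squared mean difference penalises systematic shifts between trials
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def icc_oneway_two_trials(x, y):
    """One-way random-effects ICC for k = 2 trials per subject."""
    n = len(x)
    grand = fmean(list(x) + list(y))
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    # Between-subject and within-subject mean squares from one-way ANOVA
    msb = 2 * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(x, y, subject_means)) / n
    return (msb - msw) / (msb + msw)


# Hypothetical data: trial 2 is trial 1 shifted upward by one unit
trial_1 = [1, 2, 3, 4, 5]
trial_2 = [2, 3, 4, 5, 6]
ccc = concordance_cc(trial_1, trial_2)
icc = icc_oneway_two_trials(trial_1, trial_2)
```

On this toy data the two coefficients come out close to each other (about 0.80 and 0.82), which is in line with the abstract's remark, although a single example proves nothing general.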


Journal ArticleDOI
TL;DR: The results suggest that the SCID-II 2.0 has adequate interrater and internal consistency reliability.
Abstract: Interrater reliability and internal consistency of the SCID-II 2.0 were assessed in a sample of 231 consecutively admitted in- and outpatients using a pairwise interview design, with randomized rater pairing and blind interview assessment. Interrater reliability coefficients ranged from .48 to .98 for categorical diagnosis (Cohen kappa), and from .90 to .98 for dimensional judgements (Intraclass correlation coefficient). Internal consistency coefficients were satisfactory (.71-.94). The results suggest that the SCID-II 2.0 has adequate interrater and internal consistency reliability.

489 citations
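
For readers unfamiliar with the Cohen kappa used for the categorical diagnoses in the entry above, a minimal sketch follows. The function name and the ratings are hypothetical; the study's own data and software are not described in the abstract.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical labels."""
    n = len(rater_a)
    # Proportion of cases on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses (present / absent) from two interviewers
ratings_a = ["yes", "yes", "no", "no"]
ratings_b = ["yes", "no", "no", "no"]
kappa = cohen_kappa(ratings_a, ratings_b)
```

Here the raters agree on three of four cases (0.75), chance agreement is 0.50, and kappa is therefore 0.50.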


Journal ArticleDOI
TL;DR: Using actual elbow flexor make and break strength measurements, this article illustrates a method for estimating a confidence interval for the SEM, shows how an a priori specification of confidence interval width can be used to estimate sample size, and provides several approaches for comparing error variances.
Abstract: The intraclass correlation coefficient (ICC) and the standard error of measurement (SEM) are two reliability coefficients that are reported frequently. Both measures are related; however, they define distinctly different properties. The magnitude of the ICC defines a measure's ability to discriminate among subjects, and the SEM quantifies error in the same units as the original measurement. Most of the statistical methodology addressing reliability presented in the physical therapy literature (eg, point and interval estimations, sample size calculations) focuses on the ICC. Using actual elbow flexor make and break strength measurements, this article illustrates a method for estimating a confidence interval for the SEM, shows how an a priori specification of confidence interval width can be used to estimate sample size, and provides several approaches for comparing error variances (and square root of the error variance, or the SEM).

385 citations
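
The distinction drawn in the entry above, between the ICC as a unitless index of discrimination among subjects and the SEM as error in the measurement's own units, can be illustrated with a short sketch. It assumes a one-way random-effects ANOVA model and takes the SEM as the square root of the within-subject mean square; the article may use a different ANOVA model, and the data below are invented, not the elbow flexor measurements.

```python
from math import sqrt
from statistics import fmean


def oneway_reliability(scores):
    """Return (ICC, SEM) from a list of per-subject repeated measurements.

    Every subject must have the same number k of measurements.
    """
    n, k = len(scores), len(scores[0])
    grand = fmean(v for row in scores for v in row)
    means = [fmean(row) for row in scores]
    # Between-subject mean square
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # Within-subject (error) mean square
    msw = sum((v - m) ** 2
              for row, m in zip(scores, means) for v in row) / (n * (k - 1))
    icc = (msb - msw) / (msb + (k - 1) * msw)
    sem = sqrt(msw)  # square root of the error variance, in original units
    return icc, sem


# Hypothetical strength scores: three subjects, two trials each
strength = [[10, 12], [20, 22], [30, 32]]
icc, sem = oneway_reliability(strength)
```

With widely spread subjects the ICC is high (about 0.98) even though the SEM is about 1.41 units; a more homogeneous sample with the same error would give a lower ICC but the same SEM.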


Journal ArticleDOI
TL;DR: NEMS is a suitable therapeutic index to measure nursing workload at the ICU level and is indicated for multicentre ICU studies and management purposes in the general (macro) evaluation and comparison of workload at this level.
Abstract: Objectives: To develop a simplified Therapeutic Intervention Scoring System (TISS) based on the TISS-28 items and to validate the new score in an independent database. Design: Retrospective statistical analysis of a database and a prospective multicentre study. Setting: Development in the database of the Foundation for Research on Intensive Care in Europe with external validation in 64 intensive care units (ICUs) of 11 European countries. Measurements and results: Development of NEMS on a random sample of TISS-28 items, cross validation on another random sample of TISS-28, and external validation of NEMS in comparison with TISS-28 scored by two independent raters on the day of the visit to the ICUs participating in an international study. Multivariable regression techniques, Pearson's correlation, and paired sample t-tests were used (significance at p < 0.05 level). Intraclass correlation, rate of agreement, and kappa statistics were used for interrater reliability tests. The TISS-28 items were reduced to NEMS (9 items) in a random sample of 2000 records; the means of the two scores were no different: TISS-28 26.23 ± 10.38, NEMS 26.19 ± 9.12, NS. Cross-validation in a random sample of 996 records; mean TISS-28 26.13 ± 10.38, NEMS 26.17 ± 9.38, NS; R² = 0.76. External validation on 369 pairs of TISS-28 and NEMS has shown that the means of the two scores were no different: TISS-28 27.56 ± 11.03, NEMS 27.02 ± 8.98, NS; R² = 0.59. Reliability tests have shown an “almost perfect” interrater correlation. Similar to studies correlating TISS with Simplified Acute Physiology Score (SAPS)-I and/or Acute Physiology and Chronic Health Evaluation II scores, the value of NEMS scored on the first day accounts for 30.4% of the variation of SAPS-II score. Conclusions: NEMS is a suitable therapeutic index to measure nursing workload at the ICU level.
The use of NEMS is indicated for: (a) multicentre ICU studies; (b) management purposes in the general (macro) evaluation and comparison of workload at the ICU level; (c) the prediction of workload and planning of nursing staff allocation at the individual patient level.

341 citations


Journal ArticleDOI
TL;DR: The results of this investigation demonstrate that clinicians can use functional performance testing to obtain reliable measures of lower extremity performance when using a standardized protocol.
Abstract: Clinicians routinely have used functional performance tests as an evaluation tool in deciding when an athlete can safely return to unrestricted sporting activities. These practitioners assumed that these tests provide a reliable measure of lower extremity performance; however, little research has been reported on the reliability of these measures. The purpose of this investigation was to determine the reliability of lower extremity functional performance tests. Five male and 15 female volunteers were evaluated using the single hop for distance, triple hop for distance, 6-m timed hop, and cross-over hop for distance as described by Noyes (10). One clinician measured each subject's performance using a standardized protocol and retested subjects in the same manner approximately 48 hours later. The order of testing was randomly determined. Subjects' average and individual scores on each functional performance test were used for statistical analysis. Intraclass correlation coefficients (ICCs) and standard erro...

315 citations


Journal ArticleDOI
TL;DR: The results of this study provide further evidence supporting the reliability, validity, and efficiency of the Patient-Specific Functional Scale when applied to patients with knee dysfunction.
Abstract: Background and Purpose. Assessing disability is important, and numerous interviewer-assisted and self-report questionnaires are used to accomplish this task. These questionnaires can be classified as being generic, condition or disease specific, or patient specific. The purpose of this study was to determine test-retest reliability, construct validity, and sensitivity to change of the Patient-Specific Functional Scale (PSFS) when applied to patients with knee dysfunction. Subjects. Subjects were 38 physician-referred patients with knee dysfunction. Methods. The PSFS and the Medical Outcomes Study 36-Item Short-Form Health Survey were administered at a patient's initial visit and following 2 to 3 weeks of treatment. An assessment of global change was also made by the patient and clinician at follow-up. These measures allowed the assessment of construct validity and sensitivity to change. To obtain an estimate of reliability, the PSFS was also administered within 72 hours of the initial assessment. Results. Test-retest reliability and sensitivity to change were excellent (intraclass correlation coefficient [type 2,1] R=.84 and Pearson's r =.78, respectively). Validity was also confirmed. Conclusion and Discussion. Previous investigation on persons with low back pain suggested that the PSFS has promising measurement properties. The results of this study provide further evidence supporting the reliability, validity, and efficiency of the PSFS. Further investigation is needed to determine the extent to which the PSFS can be applied across a variety of conditions and age groups.

286 citations


Journal ArticleDOI
TL;DR: The MFA can be used to assess the health status of patients who have a musculoskeletal disorder, and it was more responsive than the SF-36 and was more efficient than the SIP in measuring changes in function.
Abstract: We compared the reliability, validity, and responsiveness of the Musculoskeletal Function Assessment (MFA) questionnaire with those of three commonly used health-status measures: the Short Form-36 (SF-36), the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), and the Sickness Impact Profile (SIP). The MFA, like the other health-status measures, demonstrated good reliability (intraclass correlation coefficients of more than 0.70), good sensitivity and specificity (more than 70 per cent), good criterion validity that correlated with physicians' ratings (p < 0.01), and good construct validity that correlated with the characteristics of the patients (p < 0.01). It also demonstrated better content validity than the other questionnaires, with no ceiling or floor effects for the total score. In addition, it was more responsive than the SF-36; for eight of the eleven comparisons, it was more efficient (relative efficiency of more than 2.00) in measuring changes in function between the baseline values and the values determined at the latest follow-up evaluation. These findings suggest that the MFA can be used to assess the health status of patients who have a musculoskeletal disorder.

229 citations


Journal ArticleDOI
TL;DR: It is concluded that the Patient-Specific Index is reliable, valid, and responsive, and the additive versions were the most responsive and are recommended for future applications.
Abstract: The Patient-Specific Index is used to assess the outcome of total hip arthroplasty by evaluating the preferences of the individual patient. The purpose of this study was to determine the reliability, validity, and responsiveness of this index and to compare different methods of combining patients' ratings of the severity and importance of their complaints, to obtain Patient-Specific Index summary scores. All patients who were scheduled to have a total hip arthroplasty performed by one surgeon at a single institution were eligible for the study. The patients completed the Harris hip score form, the McMaster-Toronto Arthritis (MACTAR) Patient Preference Disability Questionnaire, the Short Form-36, the Western Ontario and McMaster University Osteoarthritis Index (WOMAC), and the Patient-Specific Index. With use of the Patient-Specific Index, patients rated the severity and importance of each complaint. These ratings were summed in four different ways to derive severity-importance scores. The questionnaires were completed twice (two weeks apart) before the total hip arthroplasty and twice (two weeks apart) six months after the total hip arthroplasty by a subset of the patients. The seventy-eight participating patients had a mean age of 62.2 years (range, twenty-five to eighty-seven years) at the time of the operation. Forty-three patients (55 per cent) were men, and sixty-three (81 per cent) had osteoarthrosis. The inter-rater and intra-rater test-retest random-effects intraclass correlation coefficients of the Patient-Specific Index were 0.77 or greater (greater than 0.75 is considered excellent). Construct validity was shown by correlations of the Patient-Specific Index with other scales. The additive versions of the Patient-Specific Index (with a responsiveness statistic of 3.3 or greater and a standardized response mean of 1.6 or greater) were more responsive than the other scales. 
We concluded that the Patient-Specific Index is reliable, valid, and responsive. The additive versions were the most responsive and are recommended for future applications. Such indices need to be tested in studies of patients who have osteoarthrosis of the hip and other musculoskeletal diseases, to ensure generalizability of the results.

172 citations


Journal ArticleDOI
TL;DR: The results suggest that in the early and middle stages of PD, many of the measures of impairment and physical performance are relatively stable.
Abstract: Background and Purpose. Parkinson's disease (PD) is characterized by rigidity, postural instability, bradykinesia, and tremor, as well as other musculoskeletal impairments and functional limitations. The purpose of this investigation was to determine the reliability and stability of measures of impairments and physical performance for people in the early and middle stages of PD. Subjects. Thirteen men and 2 women in Hoehn and Yahr stages 2 and 3 of PD participated. Their mean age was 74.5 years (SD=5.7, range=64-84). Methods. Thirteen impairment-level variables and 8 physical performance variables were measured. Measurements were taken on two consecutive days and again a week later on the corresponding two consecutive days. Reliability and stability were assessed using analysis of variance and intraclass correlation coefficients (ICCs). Results. Test-retest reliability (ICCs) of variables ranged from .69 (hamstring muscle length) to .97 (lumbar flexion). Intraclass correlation coefficients were .85 or greater for 10 of the variables. Conclusions and Discussion. The results suggest that in the early and middle stages of PD, many of the measures of impairment and physical performance are relatively stable. [Schenkman M, Cutson T, Kuchibhatla M, et al. Reliability of impairment and physical performance measures for persons with Parkinson's disease. Phys Ther. 1997;77: 19-27.]

142 citations


Journal ArticleDOI
TL;DR: The AIMS2-SF is a shorter version of the AIMS2 (i.e., available in 2-page format), developed with content validity as the priority criterion, and has psychometric properties similar to those of the original questionnaire.
Abstract: Objective. To develop a short form of the Arthritis Impact Measurement Scales 2 (AIMS2) questionnaire, preserving content validity as the priority criterion. Methods. A 2-step reduction procedure was used: 1) Delphi technique, with 1 panel of patients and 1 panel of experts each selecting 1 set of items independently; and 2) nominal group technique, where members of both panels reached consensus on the final selection of items, using information derived from item analysis. Psychometric properties of the AIMS2-Short Form (AIMS2-SF) and AIMS2 were compared using data from a cohort of 127 rheumatoid arthritis patients who completed the AIMS2 twice prior to the initiation of methotrexate (MTX) treatment and 3 months post-initiation of MTX treatment. Results. The 2 panels reached consensus on a 26-item AIMS2-SF (54.4% reduction from the AIMS2). Factor analysis showed preservation of the 5-component structure. Convergent validity (Physical and Symptom components with clinical variables: r = 0.24-0.59), test-retest reproducibility (intraclass correlation coefficient >0.7), and sensitivity to change at 3 months (standardized response mean 0.36-0.8, except Social Interaction component [0.08]) were very close to the values for the original AIMS2. Conclusion. The AIMS2-SF is a shorter version of the AIMS2 (i.e., available in 2-page format) and has psychometric properties similar to those of the AIMS2.

139 citations


Journal ArticleDOI
TL;DR: The AIMS2-SF is a shorter version of the Arthritis Impact Measurement Scales 2 (i.e., available in 2-page format) and has psychometric properties similar to those of the AIMS2.
Abstract: Objective. To develop a short form of the Arthritis Impact Measurement Scales 2 (AIMS2) questionnaire, preserving content validity as the priority criterion. Methods. A 2-step reduction procedure was used: 1) Delphi technique, with 1 panel of patients and 1 panel of experts each selecting 1 set of items independently; and 2) nominal group technique, where members of both panels reached consensus on the final selection of items, using information derived from item analysis. Psychometric properties of the AIMS2-Short Form (AIMS2-SF) and AIMS2 were compared using data from a cohort of 127 rheumatoid arthritis patients who completed the AIMS2 twice prior to the initiation of methotrexate (MTX) treatment and 3 months post-initiation of MTX treatment. Results. The 2 panels reached consensus on a 26-item AIMS2-SF (54.4% reduction from the AIMS2). Factor analysis showed preservation of the 5-component structure. Convergent validity (Physical and Symptom components with clinical variables: r = 0.24-0.59), test-retest reproducibility (intraclass correlation coefficient >0.7), and sensitivity to change at 3 months (standardized response mean 0.36-0.8, except Social Interaction component [0.08]) were very close to the values for the original AIMS2. Conclusion. The AIMS2-SF is a shorter version of the AIMS2 (i.e., available in 2-page format) and has psychometric properties similar to those of the AIMS2.

Journal ArticleDOI
15 Apr 1997-Spine
TL;DR: The classification system for degenerative disc disease proposed by Kellgren et al and the method of measurement of sagittal curves from C2 to C7 demonstrated an acceptable level of reliability and can be used in outcomes research.
Abstract: STUDY DESIGN: Interexaminer reliability study. OBJECTIVES: To determine the reliability of grading apophysial joint and disc degenerative changes and the reliability of measuring sagittal curves on lateral cervical spine radiographs. SUMMARY OF BACKGROUND DATA: Several authors have proposed that the presence of degenerative changes and the absence of lordosis in the cervical spine are indicators of poor recovery from neck injuries caused by motor vehicle collisions. The validity of those conclusions is questionable because the reliability of the methods used in their studies to measure the presence of degenerative changes and the absence of lordosis has not been determined. METHODS: Kellgren's classification system for apophysial joint and disc degeneration, as well as the pattern and magnitude of the sagittal curve on 30 lateral cervical spine radiographs, were assessed independently by three examiners. RESULTS: Moderate reliability was demonstrated for classifying apophysial joint degeneration with an intraclass correlation coefficient of 0.45 (95% confidence interval, 0.09-0.71). Classifying degenerative disc disease had substantial reliability, with an intraclass correlation coefficient of 0.71 (95% confidence interval, 0.23-0.88). Measuring the magnitude of the sagittal curve from C2 to C7 had excellent interexaminer agreement, with an intraclass correlation coefficient of 0.96 (95% confidence interval, 0.88-0.98) and an interexaminer error of 8.3 degrees. CONCLUSIONS: The classification system for degenerative disc disease proposed by Kellgren et al and the method of measurement of sagittal curves from C2 to C7 demonstrated an acceptable level of reliability and can be used in outcomes research.

01 Jan 1997
TL;DR: The newly developed comorbidity measures are reliable and valid for use in stroke outcome research.
Abstract: Objective: To develop standardized comorbidity measures for use in stroke outcome research. Design: Retrospective review of medical records to analyze comorbidities and to study reliability and validity of the newly developed measures, comorbidity index (CI), and weighted comorbidity index (w-CI). Setting: Tertiary rehabilitation center in Japan. Patients: 106 stroke patients, age 56.5 ± 13.2 yr, admitted and discharged during the year from May 1994 to December 1995. The median days of duration of stroke, onset to admission, and length of stay (LOS) were 199, 83, and 105.5, respectively. The median admission and discharge Functional Independence Measure (FIM) raw scores were 85 and 110, respectively. Main Outcome Measures: Assessment of interrater reliability with intraclass correlation coefficient (ICC) for total scores and weighted kappa for subscores; assessment of concurrent validity by relating the measures to Charlson's comorbidity index, total numbers of medications, laboratory studies, therapeutic interventions, consultations, and days of interruption (Spearman's rank correlation method); study of predictive validity with discharge FIM score and LOS as dependent variables. Results: The ICCs were .896 for CI and .997 for w-CI, and weighted kappa ranged from .615 to 1.00. CI and w-CI correlated significantly with Charlson index and the above indices of validity. They also correlated negatively with discharge FIM scores and positively with LOS. With stepwise multiple regression analysis, 79.8% of the variance of discharge FIM scores could be explained by w-CI, days from onset to admission, admission FIM score, and deviation in tape bisection task. Conclusion: The newly developed comorbidity measures are reliable and valid for use in stroke outcome research.

Journal ArticleDOI
TL;DR: In this paper, the authors developed standardized comorbidity measures for use in stroke outcome research and evaluated the reliability and validity of the newly developed measures, the comorbidity index (CI) and the weighted comorbidity index (w-CI).

Journal ArticleDOI
01 Sep 1997-Chest
TL;DR: The Seattle Obstructive Lung Disease Questionnaire is a reliable, valid, and responsive measure of physical and emotional function, coping skills, and treatment satisfaction, useful in monitoring long-term outcomes among large groups of COPD patients.

Journal ArticleDOI
TL;DR: The modified MAI is a reliable instrument for evaluating medication appropriateness in a non-Veterans Affairs, ambulatory, elderly population and may provide pharmacists with a practical and standard method to evaluate patients' drug regimens and identify some potential drug-related problems.
Abstract: OBJECTIVE: To evaluate the reliability of a medication appropriateness index (MAI) modified for elderly outpatients in a non-Veterans Affairs setting. DESIGN: Reliability study. SETTING: General community. PARTICIPANTS: Ten community-dwelling elderly (> 65 y) taking five or more regularly scheduled medications and participating in a university-based health service intervention study. MAIN OUTCOME MEASURES: Interrater reliability of MAI ratings of 65 medications made by two clinical pharmacists for individual items and for an overall summed score was calculated by use of κ statistics and intraclass correlation coefficient. RESULTS: The interrater agreement for each of the individual MAI items was high for both appropriate and inappropriate ratings and ranged from 80% to 100% (overall κ = 0.64). Overall agreement for the summed score was good (intraclass correlation = 0.80). CONCLUSIONS: The modified MAI is a reliable instrument for evaluation of medication appropriateness in a non-Veterans Affairs, ambulatory, elderly ...

Journal ArticleDOI
TL;DR: Clinical reliability was demonstrated for each technique; however, validity compared with the radiographic measurement could not be established and future research is necessary to establish interrater reliability and assess each technique's ability to detect postural changes over time.
Abstract: Clinicians often rely on visual inspection and descriptive terms to document a patient's forward shoulder posture. The purpose of this study was to assess the validity and intrarater reliability of four objective techniques to measure forward shoulder posture. Subjects were 25 males and 24 females. Subjects had a lateral cervical spine radiograph taken, from which the horizontal distance from the C7 spinous process to the anterior tip of the left anterior acromion process was measured. Subjects then proceeded twice through a random order of four measurements: the Baylor square, the double square, the Sahrmann technique, and scapular position. These results were then used to determine the intrarater reliability of each technique. Multiple regression analyses were performed on each measure's mean scores to determine both the correlation with and the predictive value for the radiographic measurement. The intraclass correlation coefficients for intrarater reliability ranged from .89 to .91. The correlation co...

Journal ArticleDOI
TL;DR: The results of this study support the reliability and validity of a new health-related quality-of-life measure containing global and obesity- specific domains and an obesity-specific health state preference (HSP) assessment, and further testing is needed to evaluate the responsiveness of both assessments in a weight-stable group.
Abstract: The objective of this study was to assess the reliability, validity and responsiveness of a new health-related quality-of-life (HRQOL) measure containing global and obesity-specific domains and an obesity-specific health state preference (HSP) assessment. A total of 417 obese and ‘normal’ weight individuals completed these assessments. Internal consistency and test-retest reliability were demonstrated, with Cronbach's α, intraclass correlation coefficient and κ values well above the acceptable level for most scales. Construct validity hypotheses were confirmed by examining scale correlations. The normal weight individuals reported statistically significantly better functioning and well-being on the majority of the HRQOL scales and HSP than obese individuals. Guyatt's statistic of responsiveness was moderate to high for all the scales and items in the weight-loss and weight-gain groups; however, many of the scales and items in the weight-stable group also displayed responsiveness. The results of this study support the reliability and validity of these assessments. However, further testing is needed to evaluate the responsiveness of both assessments in a weight-stable group.

Journal ArticleDOI
01 Oct 1997-Stroke
TL;DR: A substantial decrement in functioning in stroke patients is indicated and data suggest that family caregivers can complete the Health Utilities Index reliably when patients are unable to do so.
Abstract: Background and Purpose. Few studies currently assess the health-related quality of life of individuals following a stroke. One of the major challenges of assessing quality of life is the high likelihood that after a stroke a patient will not be able to complete such an assessment. One practical solution is to have a family caregiver complete the assessment on behalf of these individuals. This current pilot study examined the interrater reliability of having family caregivers complete the Health Utilities Index (HUI) on behalf of stroke patients. Methods. A total of 74 patients who experienced an ischemic stroke and 37 family caregivers completed the interviewer-administered HUI (data were available for 33 pairs). The HUI is designed to produce a single summary measure of health-related quality of life, the global multiattribute utility score, as well as descriptive information on each of its attributes. Interrater reliability was measured by evaluating the percent agreement, Cohen’s κ statistics, intraclass correlation coefficients (ICCs), Pearson’s R correlations, and paired t tests between the patient and caregiver responses. Results. In most instances interrater reliability was acceptable, with values suggesting moderate to high agreement. The mean global multiattribute utility scores for the HUI 2 were identical for patients and caregivers (0.64±0.29), with an ICC of .72. A preponderance of patients reported decrements in several attributes of the HUI. Conclusions. These data indicate a substantial decrement in functioning in stroke patients and suggest that family caregivers can complete the HUI reliably when patients are unable to do so.

Journal ArticleDOI
01 Nov 1997-Pain
TL;DR: The data suggest that there is no significant genetic contribution to the strong correlation in PPT that is observed in twin pairs, and reinforce the view that learned patterns of behaviour within families are an important determinant of perceived sensitivity to pain.
Abstract: The objective of this study was to examine the relative contribution of genetic and environmental factors in determining pain perception in a classical twin study. Dolorimeter measurements of pressure pain threshold (PPT) were recorded in 609 healthy female-female twin pairs of whom 269 pairs were monozygotic (MZ) and 340 were dizygotic (DZ). There was a strong correlation (R) in PPT in both MZ and DZ pairs (R(MZ) = 0.57, 95% confidence interval (CI): [0.49, 0.65]; R(DZ) = 0.51, 95% CI: [0.42, 0.59]). The slight excess in intraclass correlation observed in MZ when compared with DZ twins corresponds to a heritability for PPT of only 10% and is not statistically significant. Neither estimate of intraclass correlation was substantially altered after adjusting for a range of potential confounding variables including age, current tobacco and alcohol use, current analgesic use, psychological status assessed by the general health questionnaire, and social class. The dolorimeter measurements were shown to be reliable (between observer agreement R = 0.66; within observer agreement R = 0.70-0.76) and stable over time. In conclusion, these data suggest that there is no significant genetic contribution to the strong correlation in PPT that is observed in twin pairs. These findings reinforce the view that learned patterns of behaviour within families are an important determinant of perceived sensitivity to pain.
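
The abstract above reports a heritability of about 10% but does not say how it was derived. As a rough cross-check only, the sketch below applies Falconer's classical approximation, twice the difference between the MZ and DZ intraclass correlations, to the reported values. Using this formula is an assumption of the sketch; the study may well have used model fitting instead.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ)."""
    return 2 * (r_mz - r_dz)


# Intraclass correlations reported in the abstract
h2 = falconer_heritability(0.57, 0.51)
print(round(h2, 2))  # 0.12
```

The result, about 0.12, is of the same order as the reported 10%, which is consistent with the abstract's conclusion that the genetic contribution is small.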

Journal ArticleDOI
TL;DR: The contention that some physical performance measures can be used to test individuals in the later stages of Alzheimer's disease given appropriate modification is supported.
Abstract: BACKGROUND: Investigation of the effects of exercise on frail, institutionalized individuals with dementia has been impeded by concerns about the reliability of physical performance measures when used in this population. METHODS: The physical performance of 33 institutionalized subjects with Alzheimer's disease was measured during both the morning and afternoon of day 1 by rater 1 and during both the morning and afternoon of day 2, one week later, by rater 1 and rater 2. Intraclass correlation coefficients (ICCs) were calculated to examine the inter- and intrarater reliability of "sit to stand," "25-foot walk," and "the distance walked in 6 minutes" and walking speed over 25 feet and for 6 minutes. An analysis of variance was performed to determine the components of variance for each test. RESULTS: ICCs for "distance walked in 6 minutes" ranged from .80 to .99 with 77% of the variance explained by inter-subject difference. The ICCs for "time to walk 25 feet" ranged from .57 to .97 with 25% of the variance explained by inter-subject differences. In contrast, the "sit to stand" measure produced ICCs ranging from -.07 to .85 with only 7% of the variance explained by inter-subject differences in this impaired population. CONCLUSION: Our results support the contention that some physical performance measures can be used to test individuals in the later stages of Alzheimer's disease given appropriate modification. Although subjects with Alzheimer's disease may have difficulty following commands and/or require physical assistance, this does not prohibit the reliable assessment of physical performance if measurements are made over longer (6-minute walk) rather than shorter periods (25-foot walk).

Journal ArticleDOI
TL;DR: TISS-28 can replace TISS-76 for the measurement of the nursing workload in Portuguese ICUs, and was validated on this independent population.
Abstract: Objective: To evaluate the performance of the Simplified Therapeutic Intervention Scoring System on an independent database and determine its relation with the Therapeutic Intervention Scoring System in the quantification of nursing workload in intensive care. Design: Analysis of the database of a multicenter prospective Portuguese study. Setting: 19 intensive care units (ICUs) in Portugal. Patients: Data on 1094 patients consecutively admitted to the ICUs were collected during a period of 3 months. Methods: Collection of the data necessary for the calculation of the Therapeutic Intervention Scoring System (TISS-76) and the Simplified Therapeutic Intervention Scoring System (TISS-28) during the first 24 h in the ICU. Basic demographic statistics and all the variables necessary for the computation of the Simplified Acute Physiology Score II were also collected. Vital status at discharge from the hospital was registered. Regression techniques, Pearson's correlation and paired sample t-test were used. Results are presented as mean ± standard deviation except when stated otherwise. Reliability was evaluated by the use of intraclass correlation coefficients in a 5% random sample. Measurements and results: After exclusion of all the patients with missing data, 1080 patients were analysed. The overall mean TISS-28 (29.82 ± 10.64) was significantly lower than the mean TISS-76 (31.14 ± 11.95). Both systems showed very significant differences between ICUs (p < 0.001). The correlation between the two was good, with TISS-28 explaining 72% of the variation of TISS-76 (r = 0.85, r² = 0.72). The relation between the two systems was TISS-28 = 6.22 + 0.85 TISS-76. In this cohort, reliability of data collection was very high, with intraclass correlation coefficients greater than 0.90 for both systems. Conclusions: TISS-28 was validated on this independent population.
The results indicate that TISS-28 can replace TISS-76 for the measurement of the nursing workload in Portuguese ICUs.

Journal ArticleDOI
TL;DR: Results show that both the intraclass correlation (ρ) and the residual variance can be reduced, by an average of 20 and 11%, respectively, offering greater efficiency for investigators who plan future studies and who are able to measure those covariates in their studies.

Journal ArticleDOI
TL;DR: This study demonstrates that instrumental measures for the assessment of dyskinesia are reliable and can be implemented in multi-center studies with minimal training.
Abstract: Nine VA Medical Centers are participating in a 2-year double-blind placebo controlled study of antioxidant treatment for tardive dyskinesia (TD) conducted by the Department of Veterans Affairs Cooperative Studies Program. One of the principal outcome measures of this study is the score derived from the instrumental assessment of upper extremity dyskinesia. Dyskinetic hand movements are quantified by assessing the variability associated with steady-state isometric force generated by the patient. In the present report, we describe the training procedures and results of a multi-center reliability assessment of this procedure. Data from nine study centers comprising 45 individual patients with six trials each (three from left hand and three from right hand) were reanalyzed by an independent investigator and the results were subjected to reliability assessment. For the statistic of interest (the coefficient of variation averaged over trials 2 and 3 for each hand, taking the larger of the two hand values), we found very high intraclass correlation coefficients for reliability over all patients across sites (ICC = 0.995). We also calculated the reliability of the measures across trials within patient for each combination of hand (right, left, dominant), rater group (site, control), and trials set (all three, trials 2 and 3). For a given hand and trial set, the reliability of the site raters was similar to that of the control. This study demonstrates that instrumental measures for the assessment of dyskinesia are reliable and can be implemented in multi-center studies with minimal training.
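The statistic described in this abstract (coefficient of variation averaged over trials 2 and 3 for each hand, then the larger of the two hand values) can be sketched as follows. This is only an illustrative reading of the abstract, not the study's software: the function names, the use of the sample standard deviation, and the force values are all assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(force_samples):
    """Sample SD divided by the mean of one steady-state force trial.

    Assumption: the abstract does not say whether a sample or
    population SD was used; the sample SD is used here.
    """
    return stdev(force_samples) / mean(force_samples)


def dyskinesia_score(trials_by_hand):
    """Average the CV over trials 2 and 3 per hand, return the larger value.

    trials_by_hand maps a hand label to a list of three trials,
    each trial being a list of force samples.
    """
    per_hand = []
    for trials in trials_by_hand.values():
        # Trials 2 and 3 are at list indices 1 and 2; trial 1 is skipped.
        cvs = [coefficient_of_variation(trial) for trial in trials[1:3]]
        per_hand.append(mean(cvs))
    return max(per_hand)


# Hypothetical force traces (arbitrary units), not data from the study.
example = {
    "left": [[10, 12, 9], [9, 10, 11], [8, 10, 12]],
    "right": [[10, 11, 10], [10, 10, 10], [9, 10, 11]],
}
score = dyskinesia_score(example)
```

Taking the larger of the two hand values means a patient is scored by the more dyskinetic hand, which is one plausible reason for the design described in the abstract.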

Journal Article
TL;DR: Establishment of high repeatability of DI measurements suggests that the stress-radiographic method may be used by multiple examiners with the expectation of comparable and consistent results.
Abstract: OBJECTIVE To evaluate in vivo repeatability of the distraction index method of evaluating hip joint laxity in dogs. ANIMALS 31 two-year-old Labrador Retrievers. PROCEDURE Each dog was anesthetized and radiographically evaluated for hip joint laxity 4 times: twice by an experienced examiner and twice by an examiner who had no previous knowledge of or training in the technique prior to the first day of testing. Distraction indices (DI) were determined from the radiographs and intraclass correlation coefficients were calculated to evaluate the repeatability of DI measurements between and within examiners. RESULTS Intraclass correlation coefficients were high (range, 0.85 to 0.94). Lower limits of the 95% confidence intervals for the intraclass correlation coefficients ranged from 0.75 to 0.89. CONCLUSIONS Between- and within-examiner repeatabilities of DI measurements were high, suggesting that the technique is clinically reliable. CLINICAL RELEVANCE Distraction index is a reliable measure of hip joint laxity and a good predictor of the risk of development of degenerative joint disease associated with hip dysplasia in dogs. Establishment of high repeatability of DI measurements suggests that the stress-radiographic method may be used by multiple examiners with the expectation of comparable and consistent results.

Journal ArticleDOI
TL;DR: The results support the use of each of the 15-sec lateral step-up tests as reliable, stable measures of lower extremity performance, but caution should be used when interpreting the results of either of the 50-repetition lateral step-up tests if used as demonstrated in this study.
Abstract: The lateral step-up test is often utilized by clinicians to assess lower extremity performance capabilities. Reliability of the lateral step-up test, however, is not available. Therefore, the purpose of this study was to determine the test-retest reliability of a 15-sec and a 50-repetition lateral step-up test on a .15-m (6-inch) and .2-m (8-inch) step. For each of the 15-sec lateral step-up tests, subjects were asked to perform as many repetitions as possible during the 15-sec time frame, while for each of the 50-repetition lateral step-up tests, subjects were asked to perform 50 repetitions as quickly as possible. Eighteen healthy subjects were studied. Data were analyzed through a repeated measures analysis of variance, intraclass correlation coefficients (ICC) (2, 1), and standard errors of measurement. The ICC values were .90 and .94 for the .15-m and .2-m 15-sec lateral step-up tests and .91 and .96 for the .15-m and .2-m 50-repetition lateral step-up tests, respectively, revealing test-retest reliability to be high for each of the tests. Significant differences, however, were noted between the testing days for each of the 50-repetition lateral step-up tests, indicating that the measures may not be stable. No significant differences were seen between testing days for either of the 15-sec lateral step-up tests. While the results support the use of each of the 15-sec lateral step-up tests as reliable, stable measures of lower extremity performance, caution should be used when interpreting the results of either of the 50-repetition lateral step-up tests if used as demonstrated in this study.
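For readers unfamiliar with the ICC (2, 1) and standard error of measurement named in this abstract, a minimal sketch follows. It uses the standard two-way random-effects, single-measurement formula built from ANOVA mean squares. The SEM is computed as SD × sqrt(1 − ICC), which is one common definition and an assumption here, since the abstract does not state which formula was used. The score table is illustrative and is not data from the study.

```python
from math import sqrt
from statistics import stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores is a list of rows (one per subject), each row holding
    one measurement per session or rater.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                # between-subjects mean square
    jms = ss_cols / (k - 1)                # between-sessions mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def standard_error_of_measurement(scores, icc):
    """SEM as pooled SD times sqrt(1 - ICC) (assumed definition)."""
    pooled = [x for row in scores for x in row]
    return stdev(pooled) * sqrt(1 - icc)


# Illustrative 6 subjects x 4 sessions table (hypothetical values).
table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(table)
sem = standard_error_of_measurement(table, icc)
```

Because ICC (2, 1) includes the between-sessions mean square in its denominator, a systematic shift between testing days lowers the coefficient, which is why a repeated measures ANOVA is often reported alongside it, as in this abstract.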

Journal ArticleDOI
TL;DR: The authors assessed the limits of reliable history-taking in depressed elderly patients with some cognitive impairment and found the aggregate kappa statistic can provide a clinically meaningful way of assessing interrater reliability of psychopathological constructs for which several definitions are used.
Abstract: The authors assessed the limits of reliable history-taking in depressed elderly patients (N = 20) with some cognitive impairment. Each subject and an informant were interviewed with structured instruments by two trained raters. An expert panel formed consensus judgments after reviewing information reported by the patients, the informants, and each of the clinical raters. Intraclass correlation between the two raters was 0.99 for the duration of depressive episodes and 0.88 for age at onset. The raters agreed on the duration of major depressive episodes in 85% of cases and on age at onset in 80% of cases. The duration of previous depressive episodes and age at depression onset cannot always be determined reliably even when informants and structured interviews are used. Greater difficulties may be encountered in patients with minor depression or chronic intermittent depression and early-onset depression. Clinicians should obtain history from as many reliable sources as possible and critically evaluate this information while considering the entire clinical picture. The aggregate kappa statistic can provide a clinically meaningful way of assessing interrater reliability of psychopathological constructs for which several definitions are used.

Journal ArticleDOI
TL;DR: Computerised FHR baseline estimation shows improvements in the accuracy and efficiency of the current baseline estimation methods, particularly when dealing with high levels of uncertainty in the response of the immune system.

Journal ArticleDOI
TL;DR: The results support the use of phone-administered mood ratings as a reliable and convenient method to monitor patients with RCBD.
Abstract: We examined the reliability and level of agreement between the telephone and face-to-face administration of two mood-rating scales (HIGH-SAD and SIGH-SAD) in patients with rapid cycling bipolar disorder (RCBD). Two clinicians administered the HIGH-SAD and SIGH-SAD to 14 outpatients with RCBD. Patients received consecutive phone and face-to-face mood ratings in a randomized order. Using a paired t-test, no significant differences were found when comparing HIGH-SAD and SIGH-SAD scores administered face-to-face and over the phone. There was a high correlation between the face-to-face and phone administration of both scales as measured by intraclass correlation (r = 0.94 for SIGH-SAD; r = 0.85 for HIGH-SAD). Our results support the use of phone-administered mood ratings as a reliable and convenient method to monitor patients with RCBD.

01 Jan 1997
TL;DR: Based on these findings, a mixed strategy, the telephone mode for patients capable of responding to the FONE FIM and in-home assessments for those who are incapable, is recommended.
Abstract: The "motor" (activities of daily living) component of the FONE FIM, the telephone version of the Functional Independence Measure (FIM), was evaluated in a cohort of 132 patients who had been discharged to home from a geriatric inpatient assessment and rehabilitation program. In the current study, Rasch person ability measures were derived from telephone assessments 5 weeks after discharge and in-home assessments 1 week later. Concordance between the modes was shown to be satisfactory for the Rasch measures based on intraclass correlation coefficients. However, the telephone mode consistently generated lower estimates than did the observational mode. This occurred because the telephone mode underestimated motor function for the majority of patients who were at higher levels of cognition and motor function, but overestimated it for patients who were at lower levels of cognition and motor function. At the item level, concordance, as determined by Kappa statistics, was better when the FONE FIM responses came from the patient rather than proxy respondents, and when the assessments were done by more experienced rather than less experienced raters. Based on these findings, a mixed strategy, the telephone mode for patients capable of responding to the FONE FIM and in-home assessments for those who are incapable, is recommended.