
Showing papers on "Intraclass correlation" published in 2000


Journal ArticleDOI
TL;DR: The GMFM-66 has good psychometric properties and can provide a better understanding of motor development for children with CP than the 88-item GMFM and can improve the scoring and interpretation of data obtained with the GMFM.
Abstract: Background and purpose This study examined the reliability, validity, and responsiveness to change of measurements obtained with a 66-item version of the Gross Motor Function Measure (GMFM-66) developed using Rasch analysis. Subjects and methods The validity of measurements obtained with the GMFM-66 was assessed by examining the hierarchy of items and the GMFM-66 scores for different groups of children from a stratified random community-based sample of 537 children with cerebral palsy (CP). A subset of 228 children who had been reassessed at 12 months was used to test the hypothesis that children who are young ( Results The overall changes in GMFM-66 scores over 12 months and a time × severity × age interaction supported our hypotheses. Test-retest reliability was high (intraclass correlation coefficient=.99). Conclusion and discussion This study demonstrated that the GMFM-66 has good psychometric properties. By providing a hierarchical structure and interval scaling, the GMFM-66 can provide a better understanding of motor development for children with CP than the 88-item GMFM and can improve the scoring and interpretation of data obtained with the GMFM.

586 citations


Journal ArticleDOI
TL;DR: The Swedish version of the DASH is a reliable and valid instrument that can provide a standardized measure of patient-centered outcomes in upper-extremity musculoskeletal conditions.
Abstract: The disabilities of the arm, shoulder and hand (DASH) questionnaire is a self-administered region-specific outcome instrument developed to measure upper-extremity disability and symptoms. The DASH consists mainly of a 30-item disability/symptom scale. We performed cross-cultural adaptation of the DASH to Swedish, using a process that included double forward and backward translations, expert and lay review, as well as field-testing to achieve linguistic and conceptual equivalence. The Swedish version's reliability and validity were then evaluated in 176 patients with upper-extremity conditions. The patients completed the DASH and SF-12 generic health questionnaire before elective surgery or physical therapy. Internal consistency of the DASH was high (Cronbach alpha 0.96). Test-retest reliability, evaluated in a subgroup of 67 patients who completed the DASH on two occasions, with a median interval of 7 days, was excellent (intraclass correlation coefficient 0.92). Construct validity was shown by a positive correlation of DASH scores with the SF-12 scores (worse upper-extremity disability correlating with worse general health), stronger correlation with the SF-12 physical than with the mental health component, correlation of worse DASH scores with worse self-rated global health, and ability to discriminate among conditions known to differ in severity. The Swedish version of the DASH is a reliable and valid instrument that can provide a standardized measure of patient-centered outcomes in upper-extremity musculoskeletal conditions.

417 citations


Journal Article
TL;DR: The conclusion is that the authors need to be especially careful when using existing correlation methods and a new correlation method needs to be developed in the future.
Abstract: In this note, we first review some recent developments about measures of agreement, which are often required in medicine and other sciences, with focus on differences between these methods. In the last part, we mention five important concerns when using a newly developed concordance correlation coefficient. Our conclusion is that we need to be especially careful when using existing correlation methods and a new correlation method needs to be developed in the future.

374 citations


Journal ArticleDOI
01 May 2000-Chest
TL;DR: Preliminary data is provided suggesting that a triaxial movement sensor is a reliable, valid, and stable measure of walking and daily physical activity in COPD patients, and measures an important dimension of functional status not previously well-described.

303 citations


Journal ArticleDOI
01 Jun 2000-Diabetes
TL;DR: The results suggest that the degree of clustering of risk variables of Syndrome X varies with age from childhood to adulthood and is likely influenced by the age-related changes in obesity and the attendant insulin resistance.
Abstract: The age-related patterns of clustering of cardiovascular risk variables of Syndrome X from childhood to adulthood were examined in a community-based sample of black and white children (aged 5-10 years, n = 2,389), adolescents (aged 11-17 years, n = 3,371), and young adults (aged 18-37 years, n = 2,115). In the analysis of clustering, insulin resistance index, BMI, triglycerides/ HDL cholesterol ratio, and mean arterial pressure were used either as categorical variables (age-, race- and sex-specific values >75th percentiles) to calculate risk ratios (observed frequency/expected frequency) or as continuous variables (normal scores based on ranks) to compute intraclass correlations. In the total sample, the risk ratio for clustering of adverse levels of all 4 variables was 9.8 for whites (P < 0.01) versus 7.4 for blacks (P < 0.01); the intraclass correlation was 0.33 for whites (P < 0.001) versus 0.26 for blacks (P < 0.001). Both the risk ratio and intraclass correlation were significantly higher in whites than in blacks in the total sample. The intraclass correlations of the 4 variables were significant (P < 0.001) in all race and age-groups, and they were higher during preadolescence and adulthood than during adolescence. Furthermore, unlike risk ratios, intraclass correlations showed a continuous increase with age during adulthood. When BMI was adjusted, the intraclass correlations involving the other 3 variables were reduced by approximately 50%, and the age-related pattern was no longer evident. These results suggest that the degree of clustering of risk variables of Syndrome X varies with age from childhood to adulthood and is likely influenced by the age-related changes in obesity and the attendant insulin resistance.
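The risk ratio used in this abstract (observed frequency divided by expected frequency) can be sketched as below. This is only an illustration: it assumes the expected frequency is the one implied by independence, so that each of the 4 variables exceeds its 75th percentile with probability 0.25. The function name `clustering_risk_ratio` and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def clustering_risk_ratio(observed_count, sample_size,
                          n_variables=4, tail_probability=0.25):
    """Observed/expected frequency of adverse levels on all variables at once.

    Assumption (not stated in the abstract): the expected frequency is the
    product of the marginal tail probabilities, i.e. the variables are
    treated as independent when computing the expectation.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    observed_frequency = observed_count / sample_size
    expected_frequency = tail_probability ** n_variables
    return observed_frequency / expected_frequency


# Hypothetical counts: 50 of 2,560 people are above the 75th percentile
# on all 4 variables; independence would predict about 10 such people.
ratio = clustering_risk_ratio(observed_count=50, sample_size=2560)
print(f"risk ratio = {ratio:.1f}")
```

A ratio above 1 means the adverse levels co-occur more often than independence would predict, which is the sense of "clustering" in the abstract.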

227 citations


Journal ArticleDOI
TL;DR: The reliability of the IEQ in five languages varies across sites, but is sufficiently high in at least four out of five.
Abstract: Background In international research on the consequences of psychiatric illnesses for relatives of patients, the need for an internationally standardised measure has been identified. Aims To test the internal consistency and the test-retest reliability of the Involvement Evaluation Questionnaire (IEQ) in five European countries. Method The IEQ was administered twice to a sample of relatives or friends of patients with an ICD-10 diagnosis of schizophrenia. Reliability was tested using Cronbach's α, intraclass correlation coefficients and standard error of measurement. Reliability estimates were tested between sites. Results Test sample sizes ranged from 30 to 90 across sites, and retest sample sizes ranged from 21 to 77. Cronbach's α values of IEQ sub-scales and sumscore were substantial at most sites; but at two, α values were moderate. Intraclass correlation coefficients were substantial to high at all sites. The standard errors of measurement differed across sites, indicating differences in performance. Conclusion The reliability of the IEQ in five languages varies across sites, but is sufficiently high in at least four out of five.
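The abstract reports standard errors of measurement without giving a formula. A minimal sketch of one common definition, SEM = SD × sqrt(1 − reliability), follows. Whether the study used exactly this form is an assumption, and the function name and input values are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - reliability), a widely used textbook definition.

    Assumption: `reliability` is a coefficient in [0, 1], for example a
    test-retest intraclass correlation coefficient.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical values: a score SD of 10 points and an ICC of 0.84.
sem = standard_error_of_measurement(score_sd=10.0, reliability=0.84)
print(f"SEM = {sem:.2f} points")
```

Because the SEM is expressed in the units of the scale, two sites with similar ICCs can still differ in SEM when their score spreads differ.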

171 citations


Journal ArticleDOI
TL;DR: Information involving the development of the DSM-IV version of the Children's PTSD Inventory is described, and independent ratings by highly experienced judges denote that the instrument encompassed the universe of definition that it was intended to measure.
Abstract: Information involving the development of the DSM-IV version of the Children's PTSD Inventory is described. Independent ratings by highly experienced judges denote that the instrument encompassed the universe of definition that it was intended to measure (i.e., the DSM-IV criteria for PTSD). The instrument was administered to 82 traumatized and 22 nontraumatized youths at Bellevue Hospital. Moderate to high Cronbach alphas (.53-.89) were evident at the subtest level. An alpha of .95 was evident at the diagnostic level. In terms of inter-rater reliability, 98.1% agreement was evident at the diagnostic level. Inter-rater intraclass correlation coefficients (ICCs) ranged from .88 to .96 at the subtest level and .98 at the diagnostic level. Good to excellent kappas (.66-1.00) were reported for inter-rater reliability at the subtest level. An inter-rater reliability kappa of .96 was evident at the diagnostic level. In terms of test-retest reliability, 97.6% agreement was evident at the diagnostic level. Good to excellent test-retest kappas (.66-1.00) and ICCs (.66-.94) were observed. A test-retest kappa of .91 and an ICC of .88 were observed at the diagnostic level.
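Percent agreement and kappa are both reported above for diagnostic-level ratings. The sketch below shows how the two relate for a pair of raters, using Cohen's kappa. Which kappa variant the study computed is not stated here, so that choice is an assumption, and the ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of cases on which the two raters give the same category."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the product of each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses (1 = PTSD present, 0 = absent) from two raters.
rater_a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
rater_b = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(f"agreement = {percent_agreement(rater_a, rater_b):.2f}")
print(f"kappa     = {cohen_kappa(rater_a, rater_b):.2f}")
```

Kappa is lower than raw agreement because part of the observed agreement would be expected by chance alone.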

141 citations


Journal ArticleDOI
TL;DR: The results indicate that the CKC Upper Extremity Stability Test is a reliable evaluation tool.
Abstract: Context: Functional testing of patients is essential to clinicians because it provides objective data for documentation that can be used for serial reassessment and progression through a rehabilitation program. Furthermore, new tests should require minimal time, space, and money to implement. Purpose: To determine the test-retest reliability of the Closed Kinetic Chain (CKC) Upper Extremity Stability Test. Participants: Twenty-four male college students. Methods: Each subject was tested initially and again 7 days later. Each subject performed 1 submaximal test followed by 3 maximal efforts. A 45-second rest was given after each 15-second test. The 2 maximal-test scores were averaged and compared with those from the retest. Results: The intraclass correlation coefficient was .922 for test-retest reliability. A paired-samples t test (.927) was conducted, and the coefficient of stability was .859. The results indicate that the CKC Upper Extremity Stability Test is a reliable evaluation tool.

110 citations


Journal ArticleDOI
TL;DR: The symptom score is a reproducible, valid, and responsive instrument for assessing symptoms caused by GERD and demonstrates longitudinal validity.
Abstract: The purpose of this study was to establish the reproducibility, validity, and responsiveness of a symptom questionnaire to assess patients with gastroesophageal reflux disease (GERD). A total of 300 patients with GERD completed questionnaires before and 6 months after laparoscopic Nissen fundoplication. Forty-six GERD patients who continued on omeprazole served as controls. Lower esophageal sphincter pressure, 24-h pH, and quality of life (SF36) were measured at baseline and follow-up. Reproducibility was calculated as an intraclass correlation coefficient (ICC) from a repeated-measures analysis of variance on symptom scores (SS) on two consecutive days. Validity was established by correlating SS with 24-h pH and SF36 scores. Responsiveness was calculated as the ratio of the mean paired difference in score in the surgical group to the within-subject variability in control subjects. Reproducibility was very high, as revealed by an ICC of 0.92. Strong correlations between SS and SF36 scores at baseline and after surgery demonstrated high cross-sectional validity. Correlation between change in SS and change in pH, SF36 pain, general health, and physical health scores demonstrated longitudinal validity. The mean (95% confidence interval) paired differences in SS were 25.6 (23.7, 27.5) in the study and 2.0 (-3.2, 7.3) in the control groups, and the responsive index was 1.0. The estimated minimally important clinical difference was 7. We conclude that the symptom score is a reproducible, valid, and responsive instrument for assessing symptoms caused by GERD.

107 citations


Journal ArticleDOI
TL;DR: Reliability data on the Catatonia Rating Scale showed that all were frequently endorsed and occurred across a wide range of severity, suggesting that the CRS is a reliable rating scale for the diagnosis of catatonia.

104 citations


Journal ArticleDOI
TL;DR: The OSD-6 is a reliable, responsive, easily administered instrument that is valid for detecting change after adenotonsillectomy in children with OSDs and its large responsiveness to clinical change is demonstrated.
Abstract: Objective To validate a disease-specific health-related quality of life (HRQOL) instrument for children with obstructive sleep disorders (OSDs). Design Prospective cohort study using a 6-item health-related instrument (OSD-6). Subjects One hundred caregivers of patients with OSDs secondary to adenotonsillar hypertrophy (age range, 2-12 years) from 2 tertiary care, pediatric otolaryngology practices. Intervention The OSD-6 was administered on initial presentation and 4 to 5 weeks after adenotonsillectomy. A subset of patients repeated the OSD-6 within 3 weeks after presentation to assess test-retest reliability. Main Outcome Measures Test-retest reliability, internal consistency, construct validity, and responsiveness to clinical change of the OSD-6 score. Results Test-retest reliability was good (intraclass correlation coefficient = 0.74). Median OSD-6 score was 4.5 (0- to 6-point scale) with higher scores indicating poorer quality of life (QOL). Construct validity was demonstrated by the moderate correlation between OSD-6 score and global adenoid and tonsil-related QOL ( R = −0.62), strong correlation between the OSD-6 change score and change in global adenoid and tonsil-related QOL ( R = −0.63), and the moderate correlation between the change score and parent estimate of clinical change ( R = 0.40). The mean change in OSD-6 score after adenotonsillectomy was 3.0 (95% confidence interval, 2.7-3.4). The mean standardized response was 2.3 (95% confidence interval, 1.9-2.7) indicating the instrument's large responsiveness to clinical change. The change score was very reliable ( R = 0.85). Conclusions The OSD-6 is a reliable, responsive, easily administered instrument. It is valid for detecting change after adenotonsillectomy in children with OSDs.
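The "mean standardized response" reported above is commonly computed as the mean change score divided by the standard deviation of the change scores. The sketch below assumes that definition and takes change as baseline minus follow-up, because higher OSD-6 scores indicate poorer quality of life. The scores shown are hypothetical.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the sample SD of the change scores.

    Assumption: improvement is a drop in score, so change is computed as
    baseline minus follow-up for each respondent.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least two respondents")
    changes = [before - after for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical scores before and after surgery for five children.
before = [5, 4, 6, 5, 4]
after = [3, 1, 2, 2, 1]
print(f"SRM = {standardized_response_mean(before, after):.2f}")
```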

Journal ArticleDOI
TL;DR: The PPT-8 was more responsive than the 6-minute walk test to change in performance expected with this functional training intervention, and measured to be highly reliable.
Abstract: Background and Purpose. The reliability and responsiveness of 2 physical performance measures were assessed in this nonrandomized, controlled pilot exercise intervention. Subjects. Forty-five older individuals with mobility impairment (mean age=77.9 years, SD=5.9, range=70–92) were sequentially assigned to participate in an exercise program (intervention group) or to a control group. Methods. The intervention group performed exercise 3 times a week for 12 weeks that targeted muscle force, endurance, balance, and flexibility. Outcome measures were the 8-item Physical Performance Test (PPT-8) and the 6-minute walk test. Test-retest reliability and responsiveness indexes were determined for both tests; interrater reliability was measured for the PPT-8. Results. The intraclass correlation coefficient for interrater reliability for the PPT-8 was .96. Intraclass correlation coefficients for test-retest reliability were .88 for the PPT-8 and .93 for the 6-minute walk test. The intervention group improved 2.4 points and the control group improved 0.7 point on the PPT-8, as compared with baseline measurements. There was no change in 6-minute walk test distance in the intervention group when compared with the control group. The responsiveness index was .8 for the PPT-8 and .6 for the 6-minute walk test. Conclusion and Discussion. Measurements for both the PPT-8 and the 6-minute walk test appeared to be highly reliable. The PPT-8 was more responsive than the 6-minute walk test to change in performance expected with this functional training intervention.

Journal ArticleDOI
TL;DR: For many types of items, participant-proxy reliability is sufficient to merit the use of proxies in TBI outcome research when the participants are allowed to select their own proxy.
Abstract: Objective:To assess reliability between persons with Traumatic Brain Injury (TBI) and their self-selected proxies.Design:Intraclass Correlation Coefficients were used to assess participant-proxy reliability on the Craig Handicap Assessment and Reporting Technique (CHART), the Community Integration Q

Journal ArticleDOI
TL;DR: Comparing the absolute values for, and the day-to-day reliability of, measures of cervical spinal mobility made with two computerised motion analysis devices found each device is highly reliable in itself and can be used with confidence in longitudinal studies.
Abstract: Range of motion tests are often employed in the quantification of musculoskeletal impairment and in the assessment of the efficacy of therapeutic interventions. The aim of the present study was to compare the absolute values for, and the day-to-day reliability of, measures of cervical spinal mobility made with two computerised motion analysis devices. The ranges of cervical flexion, extension, lateral bending, axial rotation, and axial rotation in flexion and extension were determined for 19 volunteers using both the CA6000 Spine Motion Analyser and the Zebris CMS system; all measures were repeated on a second occasion 1–3 days later. The test-retest reliability was good for each instrument: there was no significant difference between the mean values derived on the two separate days (P>0.05), and the corresponding intraclass correlation coefficients were 0.75–0.93 for all primary movements and 0.57–0.93 for axial rotation in flexion or in extension. For each primary movement, a small but significant difference (1–10%; P<0.05) between the values derived from the two instruments was observed, the systematic nature of which was revealed by the excellent correlation coefficients between them. For the measures of axial rotation in flexion or in extension, however, there was not only a poor correlation between the data obtained from the two devices, but the mean values also differed significantly. Each device is highly reliable in itself and can be used with confidence in longitudinal studies. The establishment of ‘normal’ values for the primary motions should take account of the slight differences observed between devices. Normal values for rotation in flexion or extension cannot be established until the source of the device-dependent difference is identified.

Journal ArticleDOI
TL;DR: In 22 patients, the validity and interobserver reliability of two scoring methods commonly used as main endpoints in clinical trials, i.e., the Myasthenic Muscle Score (MMS) and the Quantified Myasthenia Gravis Strength Score (QMGSS), were compared.
Abstract: Valid and reliable measurements of muscle impairment are needed to assess therapeutic efficacy in patients with generalized myasthenia gravis (MG). In 22 patients we compared the validity and interobserver reliability of two scoring methods commonly used as main endpoints in clinical trials, i.e., the Myasthenic Muscle Score (MMS) ranging from 0 to 100 (normal) and the Quantified Myasthenia Gravis Strength Score (QMGSS) ranging from 0 (normal) to 39. Each score is correlated more with functional scale and less with the patient's self-evaluation. Using intraclass correlation we found strong agreement between observers for both the MMS (r = 0.906) and the QMGSS (r = 0.905). The correlation between MMS and QMGSS was high (r = 0.87). The reliability of neither score depended on any specific item, since the removal of individual items did not significantly alter the intraclass correlation coefficient (ranging from 0.86 to 0.93).

Journal ArticleDOI
TL;DR: Comparing the reliability estimates produced by three techniques (ICC, Pearson correlation, and paired t test), the paper concludes that the ICC is the preferred technique to measure reliability and that the limitations associated with its use can be overcome.
Abstract: Data obtained with any research tool must be reproducible, a concept referred to as reliability. Three techniques are often used to evaluate reliability of tools using continuous data in aging research: intraclass correlation coefficients (ICC), Pearson correlations, and paired t tests. These are often construed as equivalent when applied to reliability. This is not correct, and may lead researchers to select instruments based on statistics that may not reflect actual reliability. The purpose of this paper is to compare the reliability estimates produced by these three techniques and determine the preferable technique. A hypothetical dataset was produced to evaluate the reliability estimates obtained with ICC, Pearson correlations, and paired t tests in three different situations. For each situation two sets of 20 observations were created to simulate an intrarater or inter-rater paradigm, based on 20 participants with two observations per participant. Situations were designed to demonstrate good agreement, systematic bias, or substantial random measurement error. In the situation demonstrating good agreement, all three techniques supported the conclusion that the data were reliable. In the situation demonstrating systematic bias, the ICC and t test suggested the data were not reliable, whereas the Pearson correlation suggested high reliability despite the systematic discrepancy. In the situation representing substantial random measurement error where low reliability was expected, the ICC and Pearson coefficient accurately illustrated this. The t test suggested the data were reliable. The ICC is the preferred technique to measure reliability. Although there are some limitations associated with the use of this technique, they can be overcome.
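The systematic-bias situation described in this abstract can be reproduced with a small sketch. It assumes the two-way random-effects, absolute-agreement, single-measure form of the ICC (often written ICC(2,1)); the abstract does not say which ICC form the paper used. The data are hypothetical: the second observation is the first plus a constant 5 points.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation, which ignores constant shifts between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_2_1(ratings):
    """ICC(2,1) from two-way ANOVA mean squares; rows are subjects."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ms_rows = ss_rows / (n - 1)                      # between subjects
    ms_cols = ss_cols / (k - 1)                      # between occasions/raters
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical data: occasion 2 reads a constant 5 points higher.
first = [10, 12, 14, 16, 18, 20, 22, 24]
second = [v + 5 for v in first]
print(f"Pearson r = {pearson_r(first, second):.3f}")
print(f"ICC(2,1)  = {icc_2_1(list(zip(first, second))):.3f}")
```

With these numbers the Pearson correlation stays at its maximum while the absolute-agreement ICC drops well below it, which mirrors the abstract's point that Pearson correlation can suggest high reliability despite a systematic discrepancy.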

Journal ArticleDOI
TL;DR: Some important properties of the NRS-R are described and, through an understanding of its underlying structure and relationships with the patients' clinical characteristics, contribute to the conceptual framework of neuropsychologic impairments after TBI.

Journal ArticleDOI
TL;DR: Although gait analysis data are themselves objective, this study demonstrates some subjectivity in their interpretation, similar to that reported for established classification systems of various orthopedic conditions.
Abstract: The purpose of this study was to assess the reliability of interpretation of gait analysis data between physicians and institutions. Gait analysis data from seven patients were reviewed by 12 experienced gait laboratory physicians from six institutions. Reviewers identified problems and made treatment recommendations based on the data provided. Agreement among physicians for the most commonly diagnosed problems was slight to moderate (kappa range, 0.14-0.46). Physicians agreed on identification of soft tissue more than bony problems (intraclass correlation, 0.56 vs. 0.37). Variability regarding surgical recommendations for soft-tissue procedures (kappa range, 0.20-0.64) was similar to that for diagnosis of both soft-tissue and bone problems, although recommendation for hamstring lengthening showed substantial agreement (kappa = 0.64). There was less agreement in recommendation of osteotomies (kappa range, 0.13-0.22). Physicians agreed more on the number of soft-tissue procedures than bone procedures recommended (intraclass correlation, 0.65 vs. 0.19). There was an interinstitutional difference in the frequency of soft-tissue (p = 0.0152) and osseous problem identification (p = 0.0002), as well as in the frequency of recommendations for soft-tissue surgery (p = 0.0004) and osteotomies (p < 0.0001). Although gait analysis data are themselves objective, this study demonstrates some subjectivity in their interpretation. The interobserver variability reported here is similar to that reported for established classification systems of various orthopedic conditions.

Journal ArticleDOI
TL;DR: The behavioral parameters of the elevated plus-maze and the behavioral despair are not stable and are therefore possibly more related to state than to trait characteristics, so they do not appear appropriate for evaluating trait characteristics that are supposed to be stable over time without treatment.
Abstract: 1. The use of animal models in certain types of psychobiological studies (for instance, the relationship between anxiety and depression) requires that the behavior measured is stable over time. 2. The test-retest reliability of the elevated plus-maze indexes of anxiety and the immobility time in the behavioral despair were evaluated. 3. The behavior of two groups of drug naive mice was measured on two occasions on the same test, 1 week apart, on the elevated plus-maze or on the behavioral despair and then the intraclass correlation coefficient and kappa were calculated. 4. These behaviors showed a very low intraclass correlation coefficient (0.02 - 0.05) and low kappa (-0.08 - 0.21) in the test-retest design, which suggest a poor reliability of these measures. 5. These results suggest that the behavioral parameters of the elevated plus-maze and the behavioral despair are not stable and therefore they are possibly more related to state than trait characteristics. Therefore they appear to be not appropriate to evaluate trait characteristics which are supposed to be stable over time without treatment.

Journal ArticleDOI
TL;DR: A generalized estimating equation approach is developed with two sets of equations that models the marginal distribution of categorical ratings and the pairwise association of ratings with the kappa coefficient (kappa) as a metric.
Abstract: SUMMARY A method for analysing dependent agreement data with categorical responses is proposed. A generalized estimating equation approach is developed with two sets of equations. The first set models the marginal distribution of categorical ratings, and the second set models the pairwise association of ratings with the kappa coefficient (κ) as a metric. Covariates can be incorporated into both sets of equations. This approach is compared with a latent variable model that assumes an underlying multivariate normal distribution in which the intraclass correlation coefficient is used as a measure of association. Examples are from a cervical ectopy study and the National Heart, Lung, and Blood Institute Veteran Twin Study.

Journal ArticleDOI
TL;DR: The TFGS presently appears to be the best option in those situations in which accurate and precise documentation of facial function is required and is superior to other scales by virtue of its sensitivity, comprehensiveness, ease of use, and interobserver reliability.
Abstract: The Toronto Facial Grading System (TFGS) is an observer scale for rating facial nerve dysfunction. The TFGS scores aspects of resting symmetry, symmetry of voluntary movement, and synkinesis for each division of the face (subscores) and then provides calculated total scores and an overall composite score of facial function. The developers of the scale have validated its sensitivity for identifying small changes in facial dysfunction and the independence of the different components measured. Herein we report our results in a study of interobserver reliability using the TFGS. Twenty-five patients from the Massachusetts Eye and Ear Infirmary Facial Nerve Center with varying degrees of facial paresis, paralysis, and synkinesis were videotaped, and the video recordings were scored by 5 independent observers using the TFGS. Intraclass correlation coefficients (kappa) and 95% confidence intervals were calculated for subscores and for each total and composite score. Intraclass correlation coefficients ranged from 0.59 to 0.85, all considered substantial to near-perfect agreement between observers. We believe the TFGS is superior to other scales by virtue of its sensitivity, comprehensiveness, ease of use, and interobserver reliability. The TFGS presently appears to be the best option in those situations in which accurate and precise documentation of facial function is required.

Journal ArticleDOI
TL;DR: This study demonstrated high reproducibility of lower limb multi-joint testing for peak torque, average power, and total work in healthy subjects, then used the protocol to demonstrate similarly high reliability in a patient group, showing that multi-joint testing can be used safely and reliably in patients with patellofemoral pain syndrome.

Journal ArticleDOI
TL;DR: Comparison of performance ranks and linear regression analyses indicated that the short fast walks and seated step test may not be suitable substitutes for treadmill or long self-paced corridor walks, and more development is needed for comprehensive assessment of exercise tolerance in older adults.
Abstract: This study examined the reproducibility and comparability of five measures of function and exercise tolerance. The test battery and questionnaire on function and physical activity were administered twice, 7-10 days apart to 38 men and 12 women aged 54-80 years at the Baltimore Veterans Affairs Medical Center. Tests included fast pace 4 and 20-meter walks, 6-minute and graded treadmill walks, and a seated step test. All tests demonstrated good reproducibility with Pearson and intraclass correlation coefficients ranging from 0.84 to 0.98, and percent differences on retest ranging from 4 to 11%. Although correlations between different tests were all significant (range 0.34-0.89), comparison of performance ranks and linear regression analyses indicated that the short fast walks and seated step test may not be suitable substitutes for treadmill or long self-paced corridor walks. Only 28% had the same quintile performance ranking on the step test as on the treadmill walk, and 36% had rankings 2 or more points apart. The fast 20m walk shows the most promise as a low-level alternative to the 6-minute walk; performances had a correlation of 0.73, 82% of ranks were within one point, and 20m speed explained 42% of the variance in distance covered. More development is needed for comprehensive assessment of exercise tolerance in older adults; the 6-minute walk did not adequately discriminate fitness level in persons who walk regularly, and the treadmill posed problems for those with walking difficulty.

Journal ArticleDOI
TL;DR: Long-term follow-up and objective measurements performed in patients with biopsy-proven lesions show that the natural course of FNH is variable, in particular, lesion regression is not rare.
Abstract: PURPOSE: The purpose of this work was to assess the natural course of biopsy-proven focal nodular hyperplasia (FNH). METHOD: Eighteen biopsy-proven FNHs in 14 patients (12 women and 2 men) who were followed for at least 6 months with CT and/or MRI were included in the study. The volume of the lesions was calculated twice by two observers using the summation of areas method. Intra- and interobserver variability was assessed by intraclass correlation coefficients. Longitudinal data analysis was performed with generalized estimating equations. RESULTS: The volume of FNH was stable in 6 cases, decreased in 10 cases, and increased in 2 cases. Intra- and interobserver variability in size measurements was 5-10%. Intraclass correlation coefficients were >0.992. Longitudinal data analysis showed that there was a general trend of lesion regression. CONCLUSION: Long-term follow-up and objective measurements performed in patients with biopsy-proven lesions show that the natural course of FNH is variable. In particular, lesion regression is not rare.

Journal ArticleDOI
TL;DR: Substantial familial influence on age of onset, depression and agitation suggests that genotype does influence phenotype in Alzheimer's disease, and establishing the molecular basis for this phenotypic variation may prove relevant to other neuropsychiatric disorders.
Abstract: BACKGROUND: Alzheimer's disease manifests considerable heterogeneity, the cause of which is unknown. AIMS: To determine the familial (genotypic) influence on phenomenology (phenotype) in Alzheimer's disease. METHOD: Affected sibling pairs with Alzheimer's disease were assessed for a range of cognitive and non-cognitive symptoms. Resemblance for phenotypic characteristics was estimated using intraclass correlations for continuous traits and by pairwise concordance for dichotomous traits. The relationship between age of onset and APOE genotype was examined using linear regression analysis. RESULTS: Significant familial effects on age of onset (intraclass correlation 0.41) and mood state (intraclass correlation 0.26), and a relatively high pairwise concordance for agitation (excess concordance 0.1) were found. The APOE locus was found to account for 4% of the variance in age of onset. CONCLUSIONS: Substantial familial influence on age of onset, depression and agitation suggests that genotype does influence phenotype in Alzheimer's disease. Establishing the molecular basis for this phenotypic variation may prove relevant to other neuropsychiatric disorders.

Journal ArticleDOI
TL;DR: It is concluded that the method is reliable and can be used in certain clinical settings; the intraclass correlation coefficient values for position vectors were lower, probably due to the lack of variance between subjects.

Journal ArticleDOI
TL;DR: A self-reported version of the Patient-Specific Index, which focuses on the concerns of individuals, is reliable and has criterion validity compared with an interviewer-administered version.
Abstract: Background: The Patient-Specific Index is unique in that it reflects how individual patients weigh concerns in rating the outcome of total hip arthroplasty. The Patient-Specific Index was originally administered by an interviewer, which is not always feasible and can be costly. The purposes of the present study were (1) to create a self-reported version of the Patient-Specific Index, (2) to determine the reliability of this new self-reported version, and (3) to determine the relationship between the scores on the new self-reported version and those on the original interviewer-administered version. Methods: A self-reported version of the Patient-Specific Index was developed, and a pilot test was performed on ten patients. Patients who were scheduled for a total hip arthroplasty or who had recently had a total hip arthroplasty were eligible for the reliability and validity testing. A copy of the new self-reported Patient-Specific Index was mailed to the patients, and they completed it independently. The patients' ratings of the importance and severity of twenty-four concerns prior to total hip arthroplasty were added together to create a summary Patient-Specific Index score. To determine test-retest reliability, patients completed the self-reported Patient-Specific Index a second time, two weeks later. To determine criterion validity, participants also completed the interviewer-administered Patient-Specific Index. Results: Fifty-five patients completed the study. The random-effects intraclass correlation test-retest coefficient was 0.79 (greater than 0.75 represents excellent reliability). The mean Patient-Specific Index scores on the self-reported version and on the interviewer-administered version were 173 and 165 points, respectively (Student t test, p = 0.45). The self-reported Patient-Specific Index was concordant with the interviewer-administered Patient-Specific Index (intraclass correlation coefficient, 0.78). Conclusions: We concluded that a self-reported version of the Patient-Specific Index, which focuses on the concerns of individuals, is reliable and has criterion validity compared with an interviewer-administered version.
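
For readers unfamiliar with how a test-retest intraclass correlation of this kind is obtained, the following is a minimal sketch. It assumes a one-way random-effects model, often written ICC(1,1); the abstract does not state which ICC form the authors used, so this is an illustration rather than their method. The function name `icc_oneway` and the sample scores are hypothetical.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    scores: list of per-subject lists, each holding the same number k
    of repeated measurements (e.g. test and retest, k = 2).
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # measurements per subject
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]

    # Between-subject and within-subject sums of squares
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    )

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    # ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW)
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test-retest scores for five subjects (not data from the study)
example = [[170.0, 165.0], [120.0, 128.0], [200.0, 195.0],
           [150.0, 149.0], [180.0, 190.0]]
icc = icc_oneway(example)
```

A value near 1 means that most of the total variance comes from differences between subjects rather than from disagreement between a subject's repeated measurements.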

Journal ArticleDOI
TL;DR: This study determined which of three generic QoL instruments was most suitable for use in an 8-year nutritional primary prevention trial; the SF36 was chosen for its high responsiveness and the Duke Health Profile for its practicality and favorable psychometric properties.

Journal ArticleDOI
TL;DR: The results from this study suggest that the AHFT is a reliable and valid test to measure hand function in persons with systemic sclerosis.
Abstract: Objective: To determine the interrater and test–retest reliability and validity of the Arthritis Hand Function Test (AHFT) in persons with systemic sclerosis. Methods: Interrater reliability of the AHFT was established by two raters independently scoring the performances of 20 women with systemic sclerosis. The same group of subjects was tested again 7–10 days later to determine test–retest reliability. Concurrent validity was established by the subjects' self-reports of their abilities to perform activities of daily living as measured by the Health Assessment Questionnaire and the Arthritis Impact Measurement Scales 2 (AIMS2). Results: All of the items had excellent interrater intraclass correlation coefficients (ICC = 0.99–1.00). The ICCs for test–retest reliability were in the excellent (ICC = 0.80–0.97) range for most of the items and moderate (ICC = 0.57–0.73) for the others. Most of the items were moderately correlated with items on the AIMS2 (r = 0.45–0.69). Conclusion: The results from this study suggest that the AHFT is a reliable and valid test to measure hand function in persons with systemic sclerosis.

Journal Article
TL;DR: The AIMS2-SF is amenable for use in large surveys with a modification of one item in the symptom scale, and is similar to other instruments within the same domains, showing similar construct validity.
Abstract: Objective: To examine the agreement between and compare the sensitivity to change of the Arthritis Impact Measurement Scale (AIMS2) and AIMS2 Short Form (AIMS2-SF) in a large sample of rheumatoid arthritis (RA) patients examined within the framework of a longitudinal observational study. Methods: Data were collected from patients in a community based RA register by a postal survey in April 1994 (1,030 respondents) and again in 1996 (1,153 respondents), comprising AIMS2, Modified Health Assessment Questionnaire (MHAQ), Medical Outcome Survey SF-36, and other commonly used health status measures. The degree of agreement was examined by plotting differences between AIMS2 and AIMS2-SF against the mean of the 2 scores for the 5 main components. The upper and lower limits of agreement (mean diff. ± 1.96 SD) were calculated and plotted. The intraclass correlation coefficients were computed by repeated measurement ANOVA. Validity was assessed on the basis of external indicators of health status, and responsiveness on the basis of standardized response means. Results: The AIMS2 and AIMS2-SF showed substantial to near-perfect agreement. Best agreement was seen for the physical and affect components. Better agreement for the symptom component was obtained when replacing item 42 with item 38. Internal consistency was high in all components. The 2 forms correlated similarly with scores from other instruments within the same domains, showing similar construct validity. There was no difference in responsiveness between the 2 forms when using changes in patient assessed global disease activity as external indicator of change in health status, and responsiveness for the physical and symptom dimension was similar to other instruments (SF-36, MHAQ). Conclusion: The AIMS2-SF is amenable for use in large surveys with a modification of one item in the symptom scale.
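
The agreement analysis described in this abstract (differences plotted against the mean of the two scores, with limits of agreement at the mean difference ± 1.96 SD) can be sketched as below. This is an illustrative sketch only: the function name `limits_of_agreement` and the sample scores are hypothetical, and using the sample standard deviation of the differences is an assumption, since the abstract does not specify it.

```python
from statistics import mean, stdev


def limits_of_agreement(full_form, short_form):
    """Limits of agreement between two paired sets of scores.

    Returns the per-subject means and differences (the x and y values
    of the agreement plot) plus the lower and upper limits of agreement.
    """
    diffs = [a - b for a, b in zip(full_form, short_form)]
    means = [(a + b) / 2 for a, b in zip(full_form, short_form)]

    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation (assumption)

    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    return means, diffs, lower, upper


# Hypothetical paired component scores (not data from the study)
full_scores = [1.0, 2.0, 3.0, 4.0]
short_scores = [1.0, 1.0, 3.0, 3.0]
means, diffs, lower, upper = limits_of_agreement(full_scores, short_scores)
```

If the differences are roughly normally distributed, about 95% of them are expected to fall between `lower` and `upper`, which is why narrow limits indicate that the two forms can be used interchangeably.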