Topic

Intra-rater reliability

About: Intra-rater reliability is a research topic. Over its lifetime, 2,073 publications have been published within this topic, receiving 140,968 citations.


Papers
Journal Article
TL;DR: Among orthopaedic surgeons in their final training year, the Graf classification showed moderate interrater and intrarater reliability whether the method was learned by instructed teaching or by self-study; self-study is almost, but not quite, as effective as a structured program.
Abstract: We sought to establish the levels of interrater reliability and intrarater reliability of the Graf classification among orthopaedic surgeons in their final training year who learned the method by instructed teaching or self-study. Using standard teaching material developed by Graf, two groups of senior orthopaedic residents at the same training level received structured teaching sessions (Group A, n = 2) or performed self-study (Group B, n = 2). Interrater reliability and intrarater reliability were determined (Cohen's weighted kappa). Proportions of correctly rated sonograms were compared between groups, implications of misclassifications were analyzed, and sensitivity analyses were performed. Interrater reliability was 0.59 (95% CI = 0.32-0.85) for Group A and 0.47 (95% CI = 0.14-0.79) for Group B. Intrarater reliability showed an overall kappa of 0.57 (95% CI = 0.35-0.78) in Group A and 0.47 (95% CI = 0.19-0.75) in Group B. The proportion of correctly rated sonograms between groups was similar in the original dataset and in the sensitivity analysis. Misclassifications influencing treatment were infrequent; one patient would have received unwarranted treatment and three patients would not have received warranted treatment. The Graf classification showed moderate reliability. Using self-study, it can be learned almost, but not quite, as effectively as through a structured program.
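
As an illustration of the statistic this study uses, the sketch below computes Cohen's weighted kappa for two raters with scikit-learn. The ratings are hypothetical, not the study's data; Graf types are ordinal, so a linear weighting penalizes adjacent-type disagreements less than distant ones.

```python
# Minimal sketch: Cohen's weighted kappa between two raters (hypothetical data).
from sklearn.metrics import cohen_kappa_score

# Hypothetical Graf types for the same 10 hip sonograms, encoded ordinally:
# 0=I, 1=IIa, 2=IIb, 3=IIc, 4=D, 5=III, 6=IV.
rater_a = [0, 1, 1, 2, 3, 0, 1, 4, 2, 1]
rater_b = [0, 1, 2, 2, 3, 1, 1, 3, 2, 0]

# Linear weights treat a IIa-vs-IIb disagreement as milder than IIa-vs-D.
kappa = cohen_kappa_score(rater_a, rater_b, weights="linear")
print(f"Weighted kappa: {kappa:.2f}")
```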

52 citations

Journal Article
01 Jul 1997 · Sleep
TL;DR: The study tests the hypothesis that infant sleep parameter (SP) and sleep state (SS) ratings can be reliably scored at substantial levels of agreement, and supports the conclusion that the IPSG is a reliable source of clinical and research data when backed by significant kappas and CIs.
Abstract: Infant polysomnography (IPSG) is an increasingly important procedure for studying infants with sleep and breathing disorders. Since analyses of these IPSG data are subjective, an equally important issue is the reliability, or strength of agreement, among scorers (especially among experienced clinicians) of sleep parameters (SP) and sleep states (SS). One basic aspect of this problem was examined by proposing and testing the hypothesis that infant SP and SS ratings can be reliably scored at substantial levels of agreement, that is, kappa (κ) ≥ 0.61. In light of the importance of IPSG reliability in the collaborative home infant monitoring evaluation (CHIME) study, a reliability training and evaluation process was developed and implemented. The bases for training on SP and SS scoring were CHIME criteria that were modifications of, and supplements to, Anders, Emde, and Parmelee (10). The kappa statistic was adopted as the method for evaluating reliability between and among scorers. Scorers were three experienced investigators and four trainees. Inter- and intrarater reliabilities for SP codes and SSs were calculated for 408 randomly selected 30-second epochs of nocturnal IPSG recorded at five CHIME clinical sites from enrolled subjects who were healthy full term (n = 5), preterm (n = 4), diagnosed with apnea of infancy (n = 2), or siblings of sudden infant death syndrome (SIDS) victims (n = 4). IPSG data set 1 was scored by both experienced investigators and trained scorers and was used to assess initial interrater reliability. IPSG data set 2 was scored twice by the trained scorers and was used to reassess interrater reliability and to assess intrarater reliability. The kappas for SS ranged from 0.45 to 0.58 for data set 1, representing a moderate level of agreement. Therefore, rater disagreements were reviewed, and the scoring criteria were modified to clarify ambiguities. The kappas and confidence intervals (CIs) computed for data set 2 yielded substantial interrater and intrarater agreement for the four trained scorers; for SS, kappa = 0.68, and for SP the kappas ranged from 0.62 to 0.76. Acceptance of the hypothesis supports the conclusion that the IPSG is a reliable source of clinical and research data when supported by significant kappas and CIs. Reliability can be maximized with strictly detailed scoring guidelines and training.
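
The "substantial" threshold this study adopts (κ ≥ 0.61) matches the widely used Landis and Koch benchmarks. A minimal sketch of those bands, using the study's own reported values as examples:

```python
def agreement_level(kappa: float) -> str:
    """Map a kappa value to the Landis & Koch agreement category."""
    if kappa < 0.00:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Data set 1 sleep-state kappas (0.45-0.58) fall in "moderate";
# data set 2 (kappa = 0.68) reaches "substantial".
print(agreement_level(0.58), agreement_level(0.68))
```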

52 citations

Journal Article
TL;DR: This study investigated the intra-rater reliability of dial calipers for measuring rectus abdominis diastasis (RAD) using a repeated-measures design, demonstrating high reliability for both resting and active RAD measurements.
Abstract: To date, physiotherapists have relied upon the use of finger widths for measurement of rectus abdominis diastasis (RAD). This method has been proven unreliable, due to variations in finger widths. The intra-rater reliability of dial calipers for measurement of RAD was investigated by this study, using a repeated-measures design. Measurements were taken at rest and during contraction on three occasions in 30 postpartum subjects. High reliability was demonstrated for resting and active RAD measurements (ICC = 0.93 and 0.95, respectively). In conclusion, dial calipers are a reliable measuring device when used by a single clinician. Further testing is required to determine inter-rater reliability.
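
A minimal sketch of how an intra-rater ICC of this kind can be computed, assuming hypothetical long-format data (each subject measured by the same clinician on three occasions) and the pingouin package, which the paper does not name:

```python
import pandas as pd
import pingouin as pg

# Hypothetical resting RAD measurements (mm): 5 subjects x 3 occasions.
data = pd.DataFrame({
    "subject":  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    "occasion": [1, 2, 3] * 5,
    "rad_mm":   [22, 23, 22, 35, 34, 36, 18, 18, 19, 27, 28, 27, 30, 29, 30],
})

# Occasions play the role of "raters"; ICC3 (two-way mixed, consistency)
# is a common choice for a single fixed rater measuring across sessions.
icc = pg.intraclass_corr(data=data, targets="subject",
                         raters="occasion", ratings="rad_mm")
print(icc.set_index("Type").loc["ICC3", "ICC"])
```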

51 citations

Journal Article
TL;DR: The findings suggest that the AKE test shows excellent interrater and intrarater reliability for assessing hamstring flexibility in healthy adults.
Abstract: [Purpose] The purpose of this study was to determine the reliability of the active knee extension (AKE) test among healthy adults. [Subjects] Fourteen healthy participants (10 men and 4 women) volunteered and gave informed consent. [Methods] Two raters conducted AKE tests independently with the aid of a simple and inexpensive stabilizing apparatus. Each knee was measured twice, and the AKE test was repeated one week later. [Results] The interrater reliability intraclass correlation coefficients (ICC(2,1)) were 0.87 for the dominant knee and 0.81 for the nondominant knee. In addition, the intrarater (test-retest) reliability ICC(3,1) values ranged from 0.78 to 0.97 and from 0.75 to 0.84 for raters 1 and 2, respectively. The percentages of agreement within 10° for AKE measurements were 93% for the dominant knee and 79% for the nondominant knee. [Conclusion] The findings suggest that the AKE test shows excellent interrater and intrarater reliability for assessing hamstring flexibility in healthy adults.
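
The two ICC forms reported here, ICC(2,1) and ICC(3,1), can be computed directly from two-way ANOVA mean squares (Shrout and Fleiss). A minimal sketch with hypothetical AKE angles, not the study's data:

```python
import numpy as np

def icc_forms(x: np.ndarray) -> tuple[float, float]:
    """Return (ICC(2,1), ICC(3,1)) for an n_subjects x k_raters matrix."""
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()    # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()    # between raters
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    msr, msc = ss_rows / (n - 1), ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    icc21 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc31 = (msr - mse) / (msr + (k - 1) * mse)
    return icc21, icc31

# Hypothetical AKE angles (degrees): 6 subjects measured by 2 raters.
angles = np.array([[35., 38.], [52., 50.], [41., 43.],
                   [60., 57.], [47., 49.], [33., 35.]])
icc21, icc31 = icc_forms(angles)
print(f"ICC(2,1) = {icc21:.2f}, ICC(3,1) = {icc31:.2f}")
```

ICC(2,1) treats raters as a random sample and charges systematic rater differences against agreement; ICC(3,1) treats raters as fixed, which suits a test-retest design with a single rater.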

51 citations

Journal Article
TL;DR: It is concluded that most of the quantitative measurements are reliable for the study of non-specific low back pain; however, the CPT should be applied with care, as it varies greatly among individuals and carries a potential for measurement error.
Abstract: Purpose: This preliminary study aimed to determine the intrarater reliability of quantitative tests for the study of non-specific low back pain. Methods: Test-retest reliability of the ratio-scale measurements was determined by the intraclass correlation coefficient (ICC), standard error of measurement (SEM), coefficient of variation (CV), and one-way repeated-measures ANOVA, using values collected from 13 young individuals (25.8 ± 6.2 years) with chronic non-specific low back pain on two occasions separated by 2 days. Percent agreement of the ordinal data was also determined by Cohen's kappa statistic (κ). The measures consisted of tissue blood flow (BF), average pain visual analog scale (VAS), pressure pain threshold (PPT), cold pain threshold (CPT), heat pain threshold (HPT), and the lumbo-pelvic stability test (LPST). Acceptable reliability was defined as ICC values greater than 0.85, SEM less than 5%, CV less than 15%, kappa scores greater than 80%, and no evidence of systematic error (ANOVA, P > 0.05). Results: The ICCs of all measures in the lumbo-sacral area were greater than 0.87. Kappa was also greater than 83%. Most measures demonstrated minimal measurement error and little potential for systematic error. Only the SEM and CV of the CPT exceeded the acceptable level. Conclusions: Most of the quantitative measurements are reliable for the study of non-specific low back pain; however, the CPT should be applied with care, as it varies greatly among individuals and carries a potential for measurement error.
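
A minimal sketch of two of the study's acceptability criteria, SEM (expressed as a percentage of the mean) and CV, using hypothetical pressure-pain-threshold values and an assumed test-retest ICC; the abstract does not state which SD the authors used, so the between-subject SD is assumed here:

```python
import numpy as np

def sem_percent(scores: np.ndarray, icc: float) -> float:
    """Standard error of measurement, SD * sqrt(1 - ICC), as % of the mean."""
    sem = scores.std(ddof=1) * np.sqrt(1.0 - icc)
    return 100.0 * sem / scores.mean()

def cv_percent(scores: np.ndarray) -> float:
    """Coefficient of variation: SD as % of the mean."""
    return 100.0 * scores.std(ddof=1) / scores.mean()

ppt = np.array([310.0, 295.0, 330.0, 305.0, 320.0])  # hypothetical kPa values
print(f"SEM% = {sem_percent(ppt, icc=0.90):.1f} (acceptable if < 5%)")
print(f"CV%  = {cv_percent(ppt):.1f} (acceptable if < 15%)")
```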

51 citations


Network Information
Related Topics (5)

Rehabilitation · 46.2K papers, 776.3K citations · 69% related
Ankle · 30.4K papers, 687.4K citations · 68% related
Systematic review · 33.3K papers, 1.6M citations · 68% related
Activities of daily living · 18.2K papers, 592.8K citations · 68% related
Validity · 13.8K papers, 776K citations · 67% related
Performance Metrics

No. of papers in the topic in previous years:

Year · Papers
2023 · 42
2022 · 78
2021 · 86
2020 · 83
2019 · 86
2018 · 67