
Showing papers on "Intra-rater reliability" published in 2003


Journal ArticleDOI
TL;DR: Practical Reliability Engineering provides a nice overview for a student of reliability (or an engineer transferred into it) who wants to visit the entire waterfront and includes practical, numerical examples that illustrate some of the methods and problems encountered.
Abstract: This book was written to provide an introduction to and an overview of reliability engineering and management for students and practicing engineers. In its 500+ pages, the book touches most aspects of reliability engineering. The book comprises 15 chapters. Chapter 1 gives basic definitions and concepts. Chapter 2 quickly overviews most of the basic statistical methods used in reliability. Chapter 3 discusses probability plotting and gives many detailed examples. Chapter 4 covers load-strength interference. Chapter 5 quickly reviews parametric, nonparametric, and Taguchi methods of experimental design. Chapter 6 covers reliability prediction and modeling methods, including Markov, simulation, availability, redundancy, fault, and event trees. Chapter 7 reviews reliability design, including FMECA. Chapter 8 covers reliability of mechanical components and systems, including stress, strength, fatigue, fracture, and wear. Chapter 9 reviews electrical systems; Chapter 10, software reliability; Chapter 11, reliability testing methods; Chapter 12, analysis of reliability data, including accelerated testing and reliability growth; and Chapter 13, methods that increase reliability in manufacturing, including process capability, quality control charts, and acceptance sampling. Chapter 14 presents an extensive discussion on maintainability. Finally, Chapter 15 discusses management issues in reliability. Some statistical tables are included at the end of the book. Most chapters include practical, numerical examples that briefly illustrate some of the methods and problems encountered, along with tables of useful formulas and of work forms. At the end of each chapter is a short bibliography with suggestions for further reading, as well as some problems and questions (whose solutions are given in a separate Instructor’s Manual). As a reliability statistician and college instructor, I have some comments and observations.
First, the book covers so much that it is necessarily thin in several areas. The statistics chapter (Chap. 2, “Reliability Mathematics”) is an example of this. O’Connor discusses t tests, hypothesis testing, nonparametrics, and related topics in only a few paragraphs and skips other aspects (e.g., test size, power). However, this book is not about statistics, but about reliability engineering (which uses lots of statistics). And, to be fair, this chapter may serve the initiated as a quick refresher. On the other hand, O’Connor does an excellent job on the plotting chapter (Chap. 3). If he had done likewise with all of the other subjects he covers, then he probably would have ended with a book of several volumes. This seems to be a catch-22 situation. Some of the most useful features of the book are the long digressions on reliability topics. Reading these is like having the opportunity to talk with an “old hand” and listen to his experiences. This is very valuable. Finally, to make this book usable as a textbook, the author may want to include the solutions for some (say, the odd-numbered) exercises. Relegating all of the solutions to the Instructor’s Manual is a problem in a textbook and does not help the student. My overall assessment is that Practical Reliability Engineering provides a nice overview for a student of reliability (or an engineer transferred into it) who wants to visit the entire waterfront. It also serves as a reference where the reader can find most formulas and short, worked-out illustrative examples (which he or she probably has to fill in with personal experience, many times).

264 citations


Book ChapterDOI
01 Jan 2003
TL;DR: HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not, for teaching and research institutions in France or abroad.
Abstract: HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. Software Reliability Modeling, James Ledoux.

131 citations


Journal ArticleDOI
15 Aug 2003-Spine
TL;DR: Intrarater reliability of the visual assessment of cervical and lumbar lordosis was statistically fair, whereas interrater reliability was poor.
Abstract: Study design: Blinded test-retest design. Objective: To measure the intrarater and interrater reliability of the visual assessment of cervical and lumbar lordosis. Summary of background data: Cervical and lumbar lordoses are frequently evaluated using visual assessment, but little attempt has previously been made to measure the reliability of visual assessment. Methods: Twenty-eight chiropractors, physical therapists, physiatrists, rheumatologists, and orthopedic surgeons were recruited to evaluate the posture of photographed subjects (with and without back pain). Each clinician rated the lordosis of the cervical and lumbar spines as normal, increased, or decreased. Kappa coefficients (kappa) were calculated to determine intrarater and interrater reliability. Results: Twenty-eight clinicians evaluated photographs of 36 individuals (17 with back pain, 19 without). Mean intrarater reliability was kappa = 0.50 (95% confidence interval 0.02-0.98) and mean interrater reliability was kappa = 0.16 (95% confidence interval 0.00-0.48). No statistically significant difference existed among the five groups of clinicians or between the evaluation of the subjects with and without back pain. Conclusion: Intrarater reliability of the visual assessment of cervical and lumbar lordosis was statistically fair, whereas interrater reliability was poor.

128 citations
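
The kappa coefficient used in the lordosis study above corrects raw agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for two sets of categorical ratings follows; the function name `cohen_kappa` and the example ratings are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same category twice.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical ratings by one clinician on two occasions (intrarater case).
first_pass = ["normal", "increased", "decreased", "normal", "normal", "increased"]
second_pass = ["normal", "increased", "normal", "normal", "decreased", "increased"]
kappa = cohen_kappa(first_pass, second_pass)
print(round(kappa, 3))
```

The same function applies to the interrater case by passing the ratings of two different clinicians on the same subjects.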


Journal ArticleDOI
TL;DR: The OGS was found to have acceptable interrater and intrarater reliability for knee and foot position in mid-stance, initial foot contact, and heel rise with weighted kappas and comparison with 3-DGA suggests that these sections might also have high validity.
Abstract: The aim of this study was to establish the reliability and validity of visual gait assessment in children with spastic diplegia, who were community or household ambulators, using a modified version of the Physicians Rating Scale, known as the Observational Gait Scale (OGS). Two clinicians viewed edited split-screen video recordings of 20 children/adolescents (11 males, 9 females; mean age 12 years, range 6 to 21 years) made at the time of three-dimensional gait analysis (3-DGA). Walking ability in each child was scored at initial assessment and reassessed from the same videos three months later using the first seven sections of the OGS. Validity of the OGS score was determined by comparison with 3-DGA. The OGS was found to have acceptable interrater and intrarater reliability for knee and foot position in mid-stance, initial foot contact, and heel rise with weighted kappas (wk) ranging from 0.53 to 0.91 (intrarater) and 0.43 to 0.86 (interrater). Comparison with 3-DGA suggests that these sections might also have high validity (wk range 0.38-0.94). Base of support and hind foot position had lower interrater and intrarater reliabilities (wk 0.29 to 0.71 and wk 0.30 to 0.78 respectively) and were not easily validated by 3-DGA.

122 citations


Journal ArticleDOI
TL;DR: Leonard et al. as discussed by the authors assessed the intra- and interrater reliabilities of the Myotonometer®, a hand-held, computerized, electronic device that quantifies muscle stiffness (tone/compliance).

110 citations


Journal ArticleDOI
TL;DR: Average levels of interrater and intrarater reliability for job analysis data were investigated using meta-analysis and scales of frequency and importance were the most reliable.
Abstract: Average levels of interrater and intrarater reliability for job analysis data were investigated using meta-analysis. Forty-six studies and 299 estimates of reliability were cumulated. Data were categorized by specificity (generalized work activity or task data), source (incumbents, analysts, or technical experts), and descriptive scale (frequency, importance, difficulty, time-spent, and the Position Analysis Questionnaire). Task data initially produced higher estimates of interrater reliability than generalized work activity data and lower estimates of intrarater reliability. When estimates were corrected for scale length and number of raters by using the Spearman-Brown formula, task data had higher interrater and intrarater reliabilities. Incumbents displayed the lowest reliabilities. Scales of frequency and importance were the most reliable. Implications of these reliability levels for job analysis practice are discussed.

101 citations
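
The Spearman-Brown correction mentioned in the job analysis meta-analysis above predicts how reliability changes when a scale is lengthened or when ratings from several raters are pooled. A minimal sketch follows, assuming the standard prophecy formula r_k = k·r / (1 + (k − 1)·r); the function name and the input values are hypothetical and do not come from the study.

```python
def spearman_brown(reliability, factor):
    """Predicted reliability after changing test length (or rater count) by `factor`.

    reliability: reliability of the original measure (single rater or original length)
    factor:      ratio of new length to original length, e.g. 2 for doubling
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    denominator = 1.0 + (factor - 1.0) * reliability
    if denominator == 0:
        raise ValueError("prediction undefined for these inputs")
    return factor * reliability / denominator


# Hypothetical example: a single-rater reliability of 0.50 pooled over two raters.
pooled = spearman_brown(0.50, 2)
print(round(pooled, 3))
```

Setting `factor` below 1 gives the predicted reliability of a shortened scale, which is how estimates from scales of different lengths can be put on a common footing.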


Journal ArticleDOI
01 Jul 2003-Stroke
TL;DR: Results from the present study suggest that quantifying mismatch by the human eye is reproducible but not reliable among observers, which raises doubts about using mismatch for clinical decision making and clinical trial enrollment.
Abstract: Background and Purpose— Emergent neurovascular imaging holds promise in identifying new and optimum target populations for thrombolysis in stroke. Recent research has focused on patients with diffusion-weighted MRI (DWI)-perfusion-weighted MRI (PWI) mismatch as a marker of tissue at risk of infarction and a means to select the most suitable candidates for thrombolysis. The present study sought to estimate the reliability of assessing the percentage of DWI-PWI mismatch. Methods— Thirteen patients with acute strokes had DWI and PWI within 7 hours of symptom onset. Six raters independently created relative mean transit time (rMTT) maps and then compared them with DWI images to assess the percentage of mismatch (PWI>DWI) in 10% increments. The MR scans were reassessed by 4 raters, tracing around the lesions to calculate the volume percentage of mismatch. Results— Visual assessment had an interrater reliability of 0.68 (95% CI, 0.52 to 1.0; SEM=21.6%) and an intrarater reliability of 0.80 (95% CI, 0.47 to 1.0;...

101 citations


01 Jan 2003

80 citations


Journal ArticleDOI
TL;DR: Coefficients of repeatability and reproducibility can be guides in differentiating between real changes and measurement error in stroke patients, to assess relationship between grip force of the hands and between sustained and peak grip force.
Abstract: OBJECTIVE: Coefficients of repeatability and reproducibility can be guides in differentiating between real changes and measurement error. The aim was to evaluate test-retest intra-rater reliability ...

69 citations


Journal ArticleDOI
TL;DR: Commercially available dynamometers can be used to quantify hip abduction strength with good to excellent reliability and a previously undescribed method of quantifying hip abductor strength in a clinical setting using readily available instrumentation is presented.
Abstract: Background: Reliable quantification of hip abductor strength in a clinical setting is challenging. Objectives: To examine the intrarater and interrater reliability of three commonly used commercial dynamometers in the measurement of hip abduction. Methods: Supine gravity minimised measures of unilateral hip abduction strength were recorded in 10 women (mean (SD) age 23.5 (1.9) years) using three different commercially available dynamometers. Measurements were repeated over a three day period with a different device used on each day. Results: Intrarater reliability ranged from 0.880 to 0.958 across the three devices, and measures of interrater reliability ranged from 0.899 to 0.948. Conclusion: Commercially available dynamometers can be used to quantify hip abduction strength with good to excellent reliability. A previously undescribed method of quantifying hip abduction strength in a clinical setting using readily available instrumentation is presented.

68 citations


01 Jan 2003
TL;DR: The relationship between engineering quality and reliability is discussed and the role of statistics and statisticians in the field of reliability is outlined and some predictions for the future of statistics in engineering reliability are made.

Journal ArticleDOI
TL;DR: The authors evaluated the intrarater reliability of two functional behavior assessment rating scales: the Motivation Assessment Scale and the Problem Behavior Questionnaire, and found variable and inconsistent ratings across administrations and rating scales.
Abstract: This study evaluated the intrarater reliability of two functional behavior assessment rating scales: the Motivation Assessment Scale and the Problem Behavior Questionnaire. Teachers rated 30 students from 10 self-contained classrooms for students with emotional or behavioral disorders on three separate occasions using both rating scales. Pearson correlation coefficients and exact and adjacent agreement percentages indicated variable and inconsistent ratings across administrations and rating scales. The authors discuss possible reasons for inconsistencies, as well as implications for practice and future research.



Journal ArticleDOI
TL;DR: Lateral shift judgements have only moderate reliability, even when trained raters judge stable stimuli, and it is proposed that the photo model employed can be used to explore the source of error in this process.

Journal ArticleDOI
TL;DR: Baer et al. as discussed by the authors established the reliability of mobility milestones as an outcome measure for stroke by using video data of patients with stroke, including sitting balance, standing balance, and walking ability.

Journal ArticleDOI
TL;DR: In this paper, the authors used a hand-held goniometer within a team of therapists and found that therapists were more consistent in their measurements when placing the goniometers that most were using within their clinical practice.



Journal Article
TL;DR: The total histologic score is a reliable and valid end point for judging the efficacy of agents in skin cancer chemoprevention studies and additional interrater reliability tests utilizing larger test sets and a rigorous statistical design should be undertaken to establish its portability.
Abstract: OBJECTIVE: To develop a reliable and valid scoring system for grading skin biopsies from actinic keratosis (AK) and sun-damaged skin for use in evaluating the efficacy of skin cancer chemopreventive agents. STUDY DESIGN: A panel of dermatopathologists developed histologic criteria and diagnostic definitions for the progression of lesions from early AK to AK. The criteria were then applied to a sample of 335 histologic slides from an ongoing chemoprevention study. A 10% sample of 35 slides was reread in order to assess intrarater reliability. RESULTS: Six of the 7 criteria demonstrated high reliability (>85%). The total histologic score, calculated using the 6 criteria, was found to significantly differentiate between (blinded) biopsy location (normal, pre-AK, AK and adjacent to squamous cell carcinoma) and histologic diagnosis (normal, pre- or early AK, AK and squamous cell carcinoma). CONCLUSION: The total histologic score, having demonstrated reliability on repeated readings and validity in its association with biopsy location and histologic score, is a reliable and valid end point for judging the efficacy of agents in skin cancer chemoprevention studies. Additional interrater reliability tests utilizing larger test sets and a rigorous statistical design should be undertaken to establish its portability.



Journal ArticleDOI
TL;DR: The instrument used to assess practice in the present study is highly internally consistent and there is evidence to support intra-rater reliability, however, further development and testing of the instrument is required.



Journal ArticleDOI
TL;DR: In conclusion, canonical correlation based on parallel tests split into subsets gives information on consistency, i.e., reliability, of test batteries in addition to conventional correlation methods.
Abstract: Test results (raw scores) are composed of an unknown true score and an error term. The error term can be estimated by means of test reliability, which is defined as the ratio of true variance to obtained variance. Different estimates of reliability are available, based either on single measurements (e.g., Cronbach's coefficient, split-half reliability, Kuder-Richardson method) or on two measurements (test/retest, inter- or intrarater reliability). Parallel test reliability depends on the correlation of two different tests obtained in one session. Canonical correlation methods allow an extension of the parallel test situation and the split-half technique. Two or more tests are performed in a sample of subjects. Randomized subsets are correlated using the canonical correlation technique. The objective of this study is to estimate the homogeneity of test batteries. Ninety-four patients (64 female, 30 male; age 54-89 years) with suspected dementia were tested using the clock test (CT, scores: 1-5), MMSE (Mini Mental State Examination) and SKT (Syndrom Kurztest). Four (i, j: 1-4) subsets of 20 patients each were selected at random and the following characteristics were calculated: empirical correlation coefficient for n = 94 (R), canonical correlation coefficient (Rcan), eigenvalues (EV) and redundancy (Rnd) of corresponding variable sets. The results of the canonical analysis showed canonical correlation coefficients on the order of 0.8 to 0.9 (p-values < 0.001). This high internal consistency can be interpreted as a measure of reliability of the test batteries. In conclusion, canonical correlation based on parallel tests split into subsets gives information on consistency, i.e., reliability, of test batteries in addition to conventional correlation methods.
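
The definition of reliability given in the abstract above can be written out under the usual classical test theory model. The symbols below ($X$, $T$, $E$, $\rho_{XX'}$) are notation introduced here for illustration, and the assumption that true score and error are uncorrelated is the standard one, not something stated in the abstract.

```latex
% Observed (raw) score = true score + error term
X = T + E, \qquad \operatorname{Cov}(T, E) = 0
% so the obtained variance splits into true and error variance
\sigma_X^2 = \sigma_T^2 + \sigma_E^2
% reliability = ratio of true variance to obtained variance
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
% which gives the error standard deviation in terms of the reliability
\sigma_E = \sigma_X \sqrt{1 - \rho_{XX'}}
```

The last line is why an estimate of reliability is enough to estimate the size of the error term from the observed score spread.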

Journal Article
TL;DR: In this article, the general status of communication product reliability is presented, based on an analysis of the reliability targets for several types of communication equipment.
Abstract: In this paper, the general status of communication product reliability is presented based on the analysis of the reliability targets for several types of communication equipment. Is an extortionate reliability target reliable? We should pay attention to it.


01 Aug 2003
TL;DR: Examination of the reliability of an ultrasound-based Zebris motion analysis system with head and shoulder attachments for measurement of dynamic cervical motions in healthy adults found that excellent reproducibility is obtained when cervical range of motion in three cardinal planes is measured by a physical therapist on the same day.
Abstract: Purposes: Measurement of cervical motions is essential for patients with cervical dysfunction, either for determining the severity of the problem or for assessing the progress of treatment. Although several methods with good reliability have been used clinically, most of them are static in nature. The reliability of dynamic measurement for cervical motions has seldom been reported. This study examined the reliability of an ultrasound-based Zebris motion analysis system with head and shoulder attachments for measurement of dynamic cervical motions in healthy adults. Method: To test the intra-session reliability of the measurement, 20 healthy young adults were tested for the range of cervical motions in the principal planes twice on the same day at an interval of 5 to 10 minutes using an ultrasound-based Zebris motion analysis system. For testing inter-session reliability, another 28 healthy young individuals were tested twice within one to two weeks. The cervical motion was defined as the relative motion of the head with respect to the right shoulder. The test protocol consisted of performing cervical movements at a self-determined speed in six directions in three cardinal planes, i.e. flexion/extension, left/rightward rotation, and left/rightward side-bending. Intra-class correlation coefficients (ICC 3,1) and standard errors of measurement (SEM) were calculated. Results: The ICC values of the intra-session reliability of the six principal cervical motions ranged between 0.85 and 0.95. The SEM of the intra-session testing ranged from 2.2° to 4.2°. The inter-session reliability ranged from 0.75 to 0.88 for cervical extension, left/rightward rotation, and rightward side-bending, and was 0.64 for cervical flexion and 0.58 for leftward side-bending. The SEM for the inter-session testing ranged from 4.1° to 15.9°.
Conclusion: Excellent reproducibility is obtained when cervical range of motion in three cardinal planes is measured by a physical therapist on the same day using an ultrasound-based Zebris motion analysis system while using the shoulder attachment as the reference plane. The inter-session reliabilities in all directions except for cervical flexion and leftward side-bending were excellent.
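
The ICC(3,1) and SEM statistics reported in the cervical motion study above can be sketched as follows. This is an illustration under stated assumptions, not the study's own computation: ICC(3,1) is taken in its usual two-way mixed, single-measurement, consistency form, (MS_subjects − MS_error) / (MS_subjects + (k − 1)·MS_error), and SEM is taken as SD·sqrt(1 − ICC). The function names and the range-of-motion values are hypothetical.

```python
import math


def icc_3_1(scores):
    """ICC(3,1) from a table with one row per subject and one column per session."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of repeated measurements per subject
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 complete measurements each")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_sessions = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


def standard_error_of_measurement(sd, icc):
    """SEM in the units of the measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


# Hypothetical range-of-motion readings (degrees): three subjects, two sessions.
readings = [[10, 11], [20, 18], [30, 31]]
icc = icc_3_1(readings)
print(round(icc, 3))
print(round(standard_error_of_measurement(10.0, 0.91), 2))
```

Because the session effect is removed from the error term, a constant offset between sessions does not lower this form of the ICC; an agreement-type ICC would be the choice if such offsets should count as error.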