
Showing papers on "Reliability (statistics)" published in 1984


Journal ArticleDOI
TL;DR: In this paper, the authors present methods for assessing agreement among the judgments made by a single group of judges on a single variable in regard to a single target, such as a manuscript, a lower-level manager, or a team.
Abstract: This article presents methods for assessing agreement among the judgments made by a single group of judges on a single variable in regard to a single target. For example, the group of judges could be editorial consultants, members of an assessment center, or members of a team. The single target could be a manuscript, a lower-level manager, or a team. The variable on which the target is judged could be overall publishability in the case of the manuscript, managerial potential for the lower-level manager, or team cooperativeness for the team. The methods presented are based on new procedures for estimating interrater reliability. For situations such as the above, these procedures are shown to furnish more accurate and interpretable estimates of agreement than estimates provided by procedures commonly used to estimate agreement, consistency, or interrater reliability. In addition, the proposed methods include processes for controlling for the spurious influences of response biases (e.g., positive leniency, social desirability) on estimates of interrater reliability.
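The abstract does not give a formula, so the following is only a minimal sketch of one common way to quantify agreement among judges rating a single target on a single item. It assumes a variance-ratio index of the form 1 - (observed variance / variance expected under random responding), with a discrete uniform distribution over the response options as the default "no agreement" null. The function name and the `null_variance` parameter are hypothetical; passing a different null variance is one way to represent a response bias such as leniency.

```python
def agreement_single_item(ratings, n_options, null_variance=None):
    """Variance-ratio agreement index for one target rated on one item.

    ratings       : numeric ratings, one per judge (at least two judges)
    n_options     : number of response options on the rating scale
    null_variance : variance expected when judges respond at random;
                    defaults to the discrete uniform value (A^2 - 1) / 12
                    (an assumption of this sketch)
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two judges")
    mean = sum(ratings) / n
    # Unbiased sample variance of the judges' ratings.
    observed = sum((r - mean) ** 2 for r in ratings) / (n - 1)
    if null_variance is None:
        null_variance = (n_options ** 2 - 1) / 12.0
    return 1.0 - observed / null_variance


if __name__ == "__main__":
    # Four judges rate one manuscript on a 5-point publishability scale.
    print(agreement_single_item([3, 4, 4, 5], n_options=5))
```

The index equals 1 when all judges give the same rating and can fall below 0 when the observed variance exceeds the assumed null variance, so values outside the unit interval need separate handling.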

4,460 citations


Journal ArticleDOI
TL;DR: Probability and Statistics with Reliability, Queuing and Computer Science Applications, Second Edition, offers a comprehensive introduction to probability, stochastic processes, and statistics for students of computer science, electrical and computer engineering, and applied mathematics.
Abstract: Probability and Statistics with Reliability, Queuing and Computer Science Applications, Second Edition, offers a comprehensive introduction to probability, stochastic processes, and statistics for students of computer science, electrical and computer engineering, and applied mathematics. Its wealth of practical examples and up-to-date information makes it an excellent resource for practitioners as well.

2,738 citations


Journal ArticleDOI
TL;DR: Oncologists may train themselves to use the Karnofsky Performance Status in a standard way, which should increase reliability and validity of the KPS and has implications for patients and research studies that use KPS as a stratifying variable.
Abstract: Little research has been conducted documenting the reliability and validity of the Karnofsky Performance Status (KPS) scale, and guidelines based on empirical data do not exist to govern its use. Two hundred ninety-three cancer patients completed a questionnaire that assesses their physical and psychosocial difficulties. Physicians rated patients on the KPS and a subsample of 75 patients was used to evaluate interrater reliability. Analyses were conducted to evaluate the interrater reliability and construct validity of the KPS. The KPS was shown to have good reliability and validity. Detailed examination of the reliability data suggested areas in which physicians err in their judgments. Multiple regression techniques were used to empirically identify seven behaviorally based questions that would be helpful in predicting KPS scores. The seven variables included weight loss, weight gain, reduced energy, difficulty walking, driving, grooming, and working part time. An interview approach with behaviorally based guidelines is presented using these variables to obtain relevant data and make more accurate KPS ratings. With the approach suggested and the guidelines presented, oncologists may train themselves to use the KPS in a standard way, which should increase reliability and validity of the KPS and has implications for patients and research studies that use KPS as a stratifying variable.

1,304 citations


Journal ArticleDOI
Mitsuru Ohba
TL;DR: Improvements to conventional software reliability analysis models by making the assumptions on which they are based more realistic are discussed, including the delayed S-shaped growth model, the inflection S-shaped model, and the hyperexponential model.
Abstract: This paper discusses improvements to conventional software reliability analysis models by making the assumptions on which they are based more realistic. In an actual project environment, sometimes no more information is available than reliability data obtained from a test report. The models described here are designed to resolve the problems caused by this constraint on the availability of reliability data. By utilizing the technical knowledge about a program, a test, and test data, we can select an appropriate software reliability analysis model for accurate quality assessment. The delayed S-shaped growth model, the inflection S-shaped model, and the hyperexponential model are proposed.
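The abstract names the three models without stating them. As a hedged illustration, the sketch below uses the mean value functions (expected cumulative number of detected faults by test time t) under which these models are commonly presented in the software reliability literature. The parameter names `a`, `b`, `c`, `weights`, and `rates` are assumptions of this sketch, not notation taken from the paper.

```python
import math


def delayed_s_shaped(t, a, b):
    """Delayed S-shaped growth: m(t) = a * (1 - (1 + b*t) * exp(-b*t))."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def inflection_s_shaped(t, a, b, c):
    """Inflection S-shaped growth: m(t) = a * (1 - exp(-b*t)) / (1 + c * exp(-b*t)).

    With c = 0 this reduces to the plain exponential growth curve.
    """
    e = math.exp(-b * t)
    return a * (1.0 - e) / (1.0 + c * e)


def hyperexponential(t, a, weights, rates):
    """Hyperexponential growth: m(t) = a * sum_i w_i * (1 - exp(-b_i * t)).

    weights are assumed to sum to 1 (one weight per program cluster).
    """
    return a * sum(w * (1.0 - math.exp(-b * t)) for w, b in zip(weights, rates))


if __name__ == "__main__":
    # a = 100 expected total faults; compare the curves over test time.
    for t in (0.0, 1.0, 5.0, 20.0):
        print(t,
              round(delayed_s_shaped(t, 100, 0.5), 2),
              round(inflection_s_shaped(t, 100, 0.5, 4.0), 2),
              round(hyperexponential(t, 100, [0.7, 0.3], [0.5, 0.05]), 2))
```

All three curves start at 0 and approach `a` as t grows; they differ in how quickly detection ramps up early in testing, which is the property one would use to choose among them for a given test report.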

596 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a bibliography of papers on the subject of power system reliability evaluation, which includes material which has become available since the publication of the four previous papers.
Abstract: This paper presents a bibliography of papers on the subject of power system reliability evaluation. Papers in such areas as probabilistic load flow, probabilistic production costing, probabilistic transient stability evaluation, etc. have not been included except where they specifically address power system reliability evaluation. It includes material which has become available since the publication of the four previous papers, 'Bibliography on the Application of Probability Methods in Power System Reliability Evaluation', IEEE Trans. on Power Apparatus and Systems PAS-91, 1972, p. 649-660; PAS-97, 1978, p. 2235-2242; PAS-103, 1984, p. 275-282; and IEEE Trans. on Power Systems, vol. 3, no. 4, p. 1555-1564, 1984. The authors have endeavored to include papers which are readily archival on an international basis. Consequently, the proceedings of such conferences as CIGRE, Inter-RAM, PMAPS, etc. are regretfully not included. Due to space constraints, only papers written in English were considered for inclusion in this bibliography.

534 citations


Proceedings ArticleDOI
26 Mar 1984
TL;DR: A new software reliability model is developed that predicts expected failures as well or better than existing software reliability models, and is simpler than any of the models that approach it in predictive validity.
Abstract: A new software reliability model is developed that predicts expected failures (and hence related reliability quantities) as well or better than existing software reliability models, and is simpler than any of the models that approach it in predictive validity. The model incorporates both execution time and calendar time components, each of which is derived. The model is evaluated, using actual data, and compared with other models.

484 citations




Journal ArticleDOI
TL;DR: The authors describe the development of multiple-item measures to capture the construct of channel member satisfaction, which is found to be multidimensional, involving satisfaction with products, financial considerations, social interaction, cooperative advertising programs, and other promotional assistances.
Abstract: The authors describe the development of multiple-item measures to capture the construct of channel member satisfaction. Two measures are developed that are found to have high levels of reliability ...

323 citations



Journal ArticleDOI
TL;DR: The problem of determining the optimum time when testing can stop and the system can be considered ready for operational use is considered, and an optimum release policy is derived based on the cost criterion.

Journal ArticleDOI
TL;DR: The results of a four-year project, sponsored by the American Petroleum Institute, to investigate the fatigue design process in the welded joints of steel offshore structures are summarized in this article, where a simple analytical expression for damage based on the rainflow method of cycle counting in a wide band process was constructed.
Abstract: The results of a four‐year project, sponsored by the American Petroleum Institute, to investigate the fatigue design process in the welded joints of steel offshore structures are summarized. Fatigue damage expressions were formulated. A simple analytical expression for damage based on the rainflow method of cycle counting in a wide band process was constructed. Because fatigue design factors have significant uncertainty, a reliability approach, using the lognormal format, was developed. The performance of Miner's rule was characterized statistically, as was modeling error associated with the process of computing fatigue stresses in a joint from oceanographic data. Procedures of S‐N data analysis were developed, and characteristic S‐N data sets were presented. The reliability format was employed to construct a design rule for low period structures. It is argued that the new design rule is more discriminating in that it can account for service life, wave spectra, water depth, platform dynamics, as well as t...

Journal Article
TL;DR: The author analyzes the implicit requirements for achieving reliable results from holistic ratings and argues that these conditions bring the validity of the ratings into doubt, and suggests that even in carefully supervised rating sessions, holistic ratings may be unduly influenced by superficial features of the writing samples.
Abstract: Teachers, administrators, testing agencies, and researchers all need a valid, reliable method of assessing writing ability. Each group has turned to holistic ratings of writing samples as a reliable qualitative procedure for responding to the essential features of writing. Yet the validity of holistic ratings has never been convincingly demonstrated. This paper analyzes the implicit requirements for achieving reliable results from holistic ratings and argues that these conditions bring the validity of the ratings into doubt. The research available suggests that even in carefully supervised rating sessions, holistic ratings may be unduly influenced by superficial features of the writing samples. Those who use holistic ratings to evaluate writing ability need to give more serious attention to the validity of the scores that result.

Journal ArticleDOI
Abstract: This article discusses several philosophical aspects concerning power-system reliability. It puts the reliability aspects in perspective, describes a hierarchical framework of analysis and discusses how the economics of reliability should be compared.

Journal ArticleDOI
TL;DR: The results of this study indicate that for the knee and elbow joints, goniometric measurements performed in a clinical setting can be highly reliable.
Abstract: Reliability of goniometric measurements has been examined only under standardized conditions and usually with healthy subjects. The purpose of this study was to assess goniometric reliability in a clinical setting. The reliability of goniometric measurements of passive elbow and knee positions was assessed using patients as subjects. The effect of using the means of repeated measurements and the interdevice reliability of three common goniometers were also examined. Results showed that intratester reliability for flexion and extension of the knee and the elbow joints was high (r = .91 to .99). Intertester reliability was also high (r = .88 to .97) for these measurements except for measurements of knee extension (r = .63 to .70). Although previous investigators have suggested that using the means of multiple measurements improves reliability, our data indicate that this procedure never improves the correlation coefficient more than .12. The reliability was similar for all three devices. The results of this study indicate that for the knee and elbow joints, goniometric measurements performed in a clinical setting can be highly reliable. The method described in this study provides a simple protocol that can be used clinically to investigate goniometric reliability.

Journal ArticleDOI
TL;DR: A linear-time algorithm and its short computer program in BASIC for k-out-of-n:G system reliability computation is presented.
Abstract: A linear-time algorithm and its short computer program in BASIC for k-out-of-n:G system reliability computation is presented.
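The listing does not reproduce the algorithm, so the following is only an illustrative sketch of the quantity being computed: the probability that at least k of n independent components work. It uses a straightforward dynamic program over components that takes O(n·k) time, which is not claimed to be the paper's linear-time method, and it is written in Python rather than BASIC. The function name is hypothetical.

```python
def k_out_of_n_reliability(k, p):
    """Reliability of a k-out-of-n:G system with independent components.

    k : minimum number of working components for the system to work
    p : sequence of component reliabilities (each in [0, 1]); n = len(p)
    """
    n = len(p)
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # dist[j] for j < k: probability that exactly j components seen so far work.
    # dist[k]: probability that at least k components seen so far work.
    dist = [1.0] + [0.0] * k
    for pi in p:
        # Update from the top down so each step reads the previous stage's values.
        dist[k] += dist[k - 1] * pi
        for j in range(k - 1, 0, -1):
            dist[j] = dist[j] * (1.0 - pi) + dist[j - 1] * pi
        dist[0] *= (1.0 - pi)
    return dist[k]


if __name__ == "__main__":
    # 2-out-of-3 system with identical components of reliability 0.9.
    print(k_out_of_n_reliability(2, [0.9, 0.9, 0.9]))
```

Capping the state at k keeps memory at O(k) and lets components have different reliabilities; for identical components the result agrees with the binomial tail sum.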

Journal ArticleDOI
TL;DR: Although the use of conjoint analysis has increased substantially in recent years, relatively little empirical evidence has been published on its reliability and validity.
Abstract: Though the use of conjoint analysis has increased substantially in recent years, relatively little empirical evidence has been published on how its reliability and validity compares with that of al...

Journal Article
TL;DR: This analysis indicated that, regardless of whether difference scores are derived from within or between observers, repeated measurements under controlled conditions can confidently be expected to fall within approximately four angular degrees of each other.

Journal ArticleDOI
TL;DR: Reliability for an evaluation method that will provide an objective foundation on which to claim a drug or therapeutic procedure does or does not have an effect in treating Duchenne muscular dystrophy is demonstrated.
Abstract: A multiclinic, collaborative study has been designed to assess the natural progression and efficacy of treatment of Duchenne muscular dystrophy. This article describes the protocol for the evaluation technique and the method used to establish within (intraobserver) and between (interobserver) reliability of the protocol evaluation procedures. Standardized patient evaluations were used, and consistency of evaluation was monitored by a computer. The reliability of the measures was analyzed 1) within observers by comparing the results of each of the first three tests done by each evaluator for all patients and 2) between observers by comparing, at multicenter group meetings, the results of each of the four evaluators' tests of the same patient. We have demonstrated reliability for an evaluation method that will provide an objective foundation on which to claim a drug or therapeutic procedure does or does not have an effect in treating Duchenne muscular dystrophy.




Journal Article
TL;DR: Concepts that are fundamental to proper use of norm-referenced tests in clinical assessment are discussed, common errors in the use of such tests are considered, and alternatives to norm-referenced testing for certain assessment purposes are suggested.
Abstract: The purposes of this paper are to discuss concepts that are fundamental to proper use of norm-referenced tests in clinical assessment, to consider common errors in the use of such tests, and to suggest alternatives to norm-referenced testing for certain assessment purposes. A hypothetical client is used to illustrate the following errors: the use of age-equivalent scores as the sole summary of test results, the use of individual items to formulate therapy objectives, and the failure to consider the possible effects of measurement error when difference scores are used to assess progress or to examine patterns of impairment.


Journal ArticleDOI
TL;DR: This review covers asymptotic methods in reliability. For nonrenewable systems, the assumption that element failure rates are low makes it possible to obtain an expression for the main term in the asymptotic representation of the system reliability function; for renewable systems, the main index of interest is the time to the first system failure.
Abstract: Section 1 of this paper reviews some works related to reliability evaluation of nonrenewable systems. The assumption that element failure rates are low makes it possible to obtain an expression for the main term in the asymptotic representation of the system reliability function. Section 2 is devoted to renewable systems. The main index of interest in reliability is the time to the first system failure. A typical situation in reliability is that the repair time is much smaller than the element lifetime. This fast repair property leads to an interesting phenomenon: for many renewable systems, the time to system failure converges in probability, under appropriate norming, to an exponential distribution. Some basic theorems explaining this fact are presented and a series of typical examples is considered. Special attention is paid to reviewing the works describing the exponentiality phenomenon in birth-and-death processes. Some important aspects of computing the normalizing constants are considered, among them the role played by the so-called main event. Section 2 also contains a review of various bounds on the deviation from exponentiality. Sections 3 and 4 describe some additional aspects of asymptotics in reliability. It is typical of the probabilistic models considered in these sections that a small parameter is introduced in an explicit form into the characteristics of the random processes. A considerable part of this review is based on sources which were originally published in Russian and are available in English translation.

01 Sep 1984
TL;DR: The Performance Evaluation Tests for Environmental Research (PETER) Program identified a set of measures of human cognitive, perceptual, and motor capabilities for use in the study of environmental and other time-course effects.
Abstract: The goal of the Performance Evaluation Tests for Environmental Research (PETER) Program was to identify a set of measures of human cognitive, perceptual, and motor capabilities for use in the study of environmental and other time-course effects. Tasks were evaluated as suitable for repeated measures applications when their intertrial means, variances, and correlations were well-behaved under constant baseline conditions. The results of this program are documented in more than 90 reports. This report provides an evaluation of 112 measures studied in the PETER Program. They are categorized into four groups based upon consideration of task stability and task definition. The Recommended category contained 30 measures that clearly obtained total stabilization and had an acceptable level of reliability efficiency (i.e., rxx greater than .50, normalized to a three-minute administration). The Acceptable-But-Redundant category contained 15 measures that met the same requirements as those in the Recommended category but were found redundant. The 35 measures in the Marginal category usually had desirable features which were outweighed by faults. The 32 measures in the Unacceptable category were characterized by either differential instability or weak reliability efficiency (rxx less than .15). Originator-supplied keywords include: Behavioral science, Psychology, Human factors engineering, Human performance, Repeated measures, Psychological tests, Cognitive tests, Psychomotor tests, Perceptual tests, Test construction, Environmental tests, Test batteries.

Journal ArticleDOI
01 Apr 1984-The Auk
TL;DR: An attempt is made to estimate the observer-related methodological bias when population density varies.
Abstract: An attempt is made to estimate the observer-related methodological bias when population density varies.

Journal ArticleDOI
TL;DR: A new procedure is described which employs a simple modification to a standard video camera to produce an image which appears from 20% thinner to 40% fatter than the actual person, without other distortion of the image.
Abstract: Previously employed techniques for the measurement of body image are briefly described, with a short consideration of methodological or procedural limitations associated with each technique. A new procedure is described which employs a simple modification to a standard video camera to produce an image which appears from 20% thinner to 40% fatter than the actual person, without other distortion of the image. Reliability and preliminary validity data for the new procedure are presented.

Journal ArticleDOI
TL;DR: A set of criteria is proposed for the comparison of software reliability models to provide a logically organized basis for determining the superior models and for the presentation of model characteristics.
Abstract: A set of criteria is proposed for the comparison of software reliability models. The intention is to provide a logically organized basis for determining the superior models and for the presentation of model characteristics. It is hoped that in the future, a software manager will be able to more easily select the model most suitable for his/her requirements from among the preferred ones.

Journal ArticleDOI
TL;DR: It is hoped that discussions such as this will promote increased attention to validity and reliability concerns in qualitative evaluations and thus help improve the quality of those evaluations.
Abstract: Interest in and use of qualitative methodological strategies in evaluating research have increased considerably in the last few years. Many of the recent evaluation frameworks or models are entirely or partly oriented toward use of qualitative methods. A number of methodological issues and concerns have been raised, including the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations becoming more common in health and other fields. In this article, the views of prominent qualitative methodologists on this topic are briefly summarized; a case is made for the relevance of validity and reliability estimation; definitions of validity and reliability for qualitative measurement are presented; and appropriate estimation techniques are suggested. It is hoped that discussions such as this will promote increased attention to validity and reliability concerns in qualitative evaluations and thus help improve the quality of those evaluations.