
Showing papers on "Reliability (statistics) published in 1976"



Journal ArticleDOI
TL;DR: A brief review notice of the book Survival Distributions: Reliability Applications in the Biomedical Sciences.
Abstract: (1976). Survival Distributions: Reliability Applications in the Biomedical Sciences. Technometrics: Vol. 18, No. 4, pp. 501-501.

513 citations


Journal ArticleDOI
TL;DR: Methods for statistical analysis of reliability and life data are presented for the first time in a systematic fashion.
Abstract: (1976). Methods for Statistical Analysis of Reliability and Life Data. Journal of the Operational Research Society: Vol. 27, No. 2, pp. 401-403.

511 citations


Journal ArticleDOI
TL;DR: The test-retest reliability of the SIP was investigated using different interviewers, forms, administration procedures, and a variety of subjects who differed in terms of type and severity of dysfunction.
Abstract: This report describes the results of research conducted on the reliability of the Sickness Impact Profile (SIP). The SIP is a questionnaire instrument designed to measure sickness-related behavioral dysfunction and is being developed for use as an outcome measure in the evaluation of health care. The test-retest reliability of the SIP in terms of several reliability measures was investigated using different interviewers, forms, administration procedures, and a variety of subjects who differed in terms of type and severity of dysfunction. The results provided evidence for the feasibility of collecting reliable data using the SIP under these various conditions. In addition, subject variability in relation to reliability is discussed.

283 citations


Journal ArticleDOI
TL;DR: A computer simulation of the capture–recapture process was used to examine the influence of five population parameters on the reliability of enumeration, and it was concluded that enumerated populations are at least 10–20% smaller than the actual populations.
Abstract: Some studies of populations in small rodents rely on enumeration of individuals to provide an estimate of population size. A computer simulation of the capture–recapture process was used to examine...

238 citations
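The abstract cuts off before the simulation details, but the underlying process is easy to mimic. Below is a minimal, hypothetical sketch (all parameter values invented, not taken from the paper) of enumeration by repeated trapping with an imperfect capture probability; animals never caught are missed, which is why enumerated populations run smaller than the actual ones.

```python
import random

random.seed(42)

def enumerate_population(true_n, p_capture, sessions):
    """Count distinct individuals ever caught across trapping sessions;
    the enumeration misses animals that are never captured."""
    seen = set()
    for _ in range(sessions):
        for animal in range(true_n):
            if random.random() < p_capture:
                seen.add(animal)
    return len(seen)

# Invented parameters: 200 animals, 30% capture probability, 5 sessions.
counts = [enumerate_population(200, 0.30, 5) for _ in range(100)]
print(sum(counts) / len(counts))  # averages well below the true 200 (~17% low here)
```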


Journal ArticleDOI
TL;DR: In this article, the optimum sample size is defined as the smallest sample size that would assure the desired reliability of the estimate, and the formula giving it depends on the way we choose to define reliability.
Abstract: Estimation of field population parameters is an indispensable part of many ecological and various pest management projects. Obviously, the greater the sample size, the more reliable the estimates are. However, the cost per unit sampled is often substantial, which makes the collection of unnecessarily large samples unwise. Thus, we must determine in advance the smallest sample size that would assure us the desired reliability of the estimate. This is called the optimum sample size. The formula giving it depends on the way we choose to define “reliability.”

184 citations
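The abstract leaves the definition of “reliability” open. As an illustrative sketch only, assume reliability is defined as achieving a given relative error of the mean at roughly 95% confidence, a common convention in field sampling; the function name and pilot numbers below are invented, not taken from the paper.

```python
import math

def optimum_sample_size(sample_mean, sample_sd, rel_error=0.10, z=1.96):
    """Smallest n such that the half-width of an approximate z-based
    confidence interval is within rel_error of the mean.

    Solves z * sd / sqrt(n) <= rel_error * mean for n.
    """
    n = (z * sample_sd / (rel_error * sample_mean)) ** 2
    return math.ceil(n)

# Pilot sample: mean density 12.4 insects per quadrat, SD 8.1;
# want the estimate within 10% of the mean at ~95% confidence.
print(optimum_sample_size(12.4, 8.1))  # -> 164
```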


Journal ArticleDOI
TL;DR: In this article, the reliability of criterion-referenced tests is viewed as the consistency of mastery/nonmastery decisions across repeated testings, and a way of estimating the kappa index of decision consistency from a single test administration is developed.
Abstract: Several writers (Carver, 1970; Hambleton & Novick, 1973) indicate that the reliability of criterion-referenced tests might most appropriately be viewed from the standpoint of consistency of decisions across repeated testings. The proposal is anchored in the presumption that a primary purpose of criterion-referenced testing is to classify examinees into achievement statuses which are usually taken as mastery and nonmastery (Millman, 1974). Swaminathan, Hambleton, and Algina (1974), in turn, suggest the coefficient kappa (K) (Cohen, 1960) as an index of reliability for criterion-referenced tests. Operationally, this index is defined as the ratio K = (p − pc)/(1 − pc), where p is the proportion of consistent decisions based on test and retest data and pc is the proportion of consistent decisions expected by chance alone (see Figure 1 for a visual display of these two proportions). Under the mild assumption of exchangeability, Huynh (1976b) proves that K varies from 0 to 1 inclusive. The lower limit is realized when test information does not contribute to the accuracy of the decision-making process. On the other hand, the upper bound is reached when equivalent data produce exactly the same classifications. Since kappa is defined by means of a retest (with the same test form or an equivalent one), it cannot be readily obtained from a single test administration. Because practical situations frequently make repeated testing impossible, it would be desirable to have ways to estimate the reliability of decisions on the basis of a single testing. This study deals mainly with tests measuring a single trait. In addition, it is assumed that a probability sampling procedure is used to select items from a well-defined universe. Such tests are referred to as domain-referenced (Millman, 1974).

146 citations
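The kappa index described in the abstract can be computed directly from a test-retest decision table. The sketch below uses hypothetical counts (not from the study) for a two-state mastery/nonmastery classification.

```python
def kappa_2x2(table):
    """Decision-consistency kappa, K = (p - pc) / (1 - pc).

    table[i][j] = number of examinees classified in state i on the test
    and state j on the retest (0 = nonmastery, 1 = mastery).
    """
    n = sum(sum(row) for row in table)
    p = (table[0][0] + table[1][1]) / n                      # observed consistent decisions
    row = [sum(table[i]) / n for i in (0, 1)]                # marginal proportions, test
    col = [(table[0][j] + table[1][j]) / n for j in (0, 1)]  # marginal proportions, retest
    pc = row[0] * col[0] + row[1] * col[1]                   # consistency expected by chance
    return (p - pc) / (1 - pc)

# Hypothetical data: 40 consistent nonmasters, 35 consistent masters,
# 25 examinees switching classification between administrations.
print(round(kappa_2x2([[40, 15], [10, 35]]), 3))  # -> 0.5
```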


Journal ArticleDOI
TL;DR: In this paper, a generalized mathematical model for forecasting technological substitution under a wide variety of circumstances is presented, and methods are also suggested for improving the reliability of the model by taking corrective measures on the available data and following a step-wise forecasting procedure.

143 citations


Journal ArticleDOI
TL;DR: A review of Survival Distributions: Reliability Applications in the Biomedical Sciences, by Alan J. Gross and Virginia A. Clark (Wiley Series in Probability and Mathematical Statistics).
Abstract: Survival Distributions: Reliability Applications in the Biomedical Sciences. By Alan J. Gross and Virginia A. Clark. New York and London, Wiley, 1976. xv, 331 p. 9¼″. £11·50. (Wiley Series in Probability and Mathematical Statistics.)

136 citations


Journal ArticleDOI
TL;DR: The new method makes no attempt to generate mutually exclusive events from the set of paths or cutsets but uses a technique to reduce greatly the number of terms in the reliability expression.
Abstract: This paper presents a new algorithm for symbolic system reliability analysis. The method is applicable to system graphs with unreliable branches or nodes. Each branch is directed or undirected. Element probabilities need not be equal, but their failures are assumed to be s-independent. The new method makes no attempt to generate mutually exclusive events from the set of paths or cutsets but uses a technique to reduce greatly the number of terms in the reliability expression. Actual programming results show that the new method can efficiently handle systems having fewer than 20 paths or cutsets between the input-output node pair.
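The abstract does not reproduce the algorithm itself. For contrast, here is the naive baseline the authors improve on: inclusion-exclusion over minimal path sets, which generates exponentially many terms. The bridge network and probabilities below are invented for illustration.

```python
from itertools import combinations

def reliability_from_paths(paths, p):
    """Two-terminal reliability by inclusion-exclusion over minimal path
    sets, assuming s-independent element failures.

    paths: list of sets of element names (minimal path sets)
    p:     dict mapping element name -> success probability
    """
    total = 0.0
    for k in range(1, len(paths) + 1):
        sign = (-1) ** (k + 1)
        for combo in combinations(paths, k):
            union = set().union(*combo)  # elements that must all work
            term = 1.0
            for e in union:
                term *= p[e]
            total += sign * term
    return total

# Invented bridge network: minimal paths from source to sink.
paths = [{"a", "b"}, {"c", "d"}, {"a", "e", "d"}, {"c", "e", "b"}]
p = {e: 0.9 for e in "abcde"}
print(round(reliability_from_paths(paths, p), 4))  # -> 0.9785
```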

Journal ArticleDOI
TL;DR: A test in which the range of possible scores is partitioned into k nonoverlapping intervals that define different levels of student mastery of a well-specified content domain is generally classified as criterion-referenced (Millman, 1974).
Abstract: A test in which the range of possible scores is partitioned into k non-overlapping intervals that define different levels of student mastery of a well-specified content domain is generally classified as criterion-referenced (Millman, 1974). An example is the familiar pass-fail test with a criterion of, say, 75 percent correct. Since such tests are often used in conjunction with instructional programs that attempt to maximize the number of students attaining the highest mastery states and thereby minimize the variability of test scores, the classical correlation between scores on parallel tests (or equivalently, the ratio of true to observed variance) may be attenuated by lack of variability and is unsatisfactory as an indicator of reliability (Popham & Husek, 1969). In addition, Hambleton and Novick (1973) note that the classical approach to reliability is based on a definition of error that is not entirely appropriate for such tests.



Journal ArticleDOI
TL;DR: The paper describes an efficient algorithm for evaluating the minimal cut sets of any general network based on Boolean algebra and set theory, and contains many important improvements.
Abstract: The paper describes an efficient algorithm for evaluating the minimal cut sets of any general network. The algorithm is based on Boolean algebra and set theory, and contains many important improvements. The four most important features are 1. only one set of topological input data is required to evaluate the minimal cuts and reliability indices of every output node; 2. a mix of unidirectional, bidirectional and multi-ended components can be included very simply; 3. any number of input nodes may be specified; 4. a new concept of overall system reliability permits different, large, and complex systems to be compared. The computational efficiency of the algorithm is clearly indicated by the fact that the time required to analyse Example 1 on a CDC7600 was 0.7 sec. The storage required with the appropriate arrays dimensioned for a system having 100 components and up to 125 minimal cut sets per output node is 15 k-words. These times and storage include the overall system reliability analysis.
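The published algorithm is not given in the abstract; as a point of reference, the sketch below enumerates minimal s-t cut sets of a small undirected network by exhaustive search, which is feasible only for tiny examples. The network is invented.

```python
from itertools import combinations

def connected(nodes, edges, removed, s, t):
    """True if s and t remain connected after removing the given edges."""
    adj = {n: set() for n in nodes}
    for name, (u, v) in edges.items():
        if name not in removed:
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return t in seen

def minimal_cut_sets(nodes, edges, s, t):
    """Enumerate minimal s-t cut sets by brute force over edge subsets."""
    cuts = []
    names = list(edges)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            c = set(combo)
            if any(known <= c for known in cuts):
                continue  # superset of a known cut, hence not minimal
            if not connected(nodes, edges, c, s, t):
                cuts.append(c)
    return cuts

# Invented 4-node bridge network.
edges = {"a": (1, 2), "b": (2, 4), "c": (1, 3), "d": (3, 4), "e": (2, 3)}
print(minimal_cut_sets({1, 2, 3, 4}, edges, 1, 4))
# -> [{'a', 'c'}, {'b', 'd'}, {'a', 'd', 'e'}, {'b', 'c', 'e'}]
```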


Journal ArticleDOI
TL;DR: In this paper, the inverse of the standard deviation of point-to-point travel times was used to measure urban bus transit reliability, and the selected measure was found to be a useful and easily collected indicator of service reliability, which was shown to be significantly degraded by increasing route length, increasing intensity of intersection control, increasing traffic volumes, and, with less certainty, increasing bus passenger loadings.
Abstract: Attitudinal surveys have indicated that travel time reliability is an important attribute of urban transportation services affecting choice of mode of travel. Yet there is no generally accepted measure of transport reliability, nor is much known about factors affecting this parameter. This study was conducted to test a particular measure of reliability, the inverse of the standard deviation of point-to-point travel times, using data from bus services in the Chicago area. The selected measure was found to be a useful and easily collected indicator of service reliability. Reliability measured in this form was found to be significantly degraded by increasing route length, increasing intensity of intersection control (particularly traffic signal density), increasing traffic volumes, and, with less certainty, increasing bus passenger loadings. Several strategies for increasing urban bus transit reliability are suggested.
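The measure under test is straightforward to compute; a small sketch with invented running-time observations:

```python
import statistics

def travel_time_reliability(times):
    """Reliability as the inverse of the standard deviation of
    point-to-point travel times (higher = more reliable)."""
    return 1.0 / statistics.stdev(times)

# Invented point-to-point running times (minutes) for one bus route segment.
times = [22.5, 24.0, 21.8, 26.3, 23.1, 25.2]
print(round(travel_time_reliability(times), 3))
```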



Journal ArticleDOI
TL;DR: In this article, two standard setting procedures were employed by two groups of judges to set pass-fail levels for comparable samples of a nationally administered examination, and these procedures were both designed to...
Abstract: Two standard setting procedures were employed by two groups of judges to set pass-fail levels for comparable samples of a nationally administered examination. These procedures were both designed to...

Journal Article
TL;DR: Basic measurement theory and appropriate procedures for estimating validity and reliability within an occupational therapy framework are presented and special considerations with regard to measurement of motor behavior are emphasized.
Abstract: Occupational therapists use many forms of measurement tools to assess the existing and potential functions of their clients. Too often the principles of measurement theory have not been applied in the development of such instruments and the resulting assessments have no established validity or reliability. This article presents basic measurement theory and appropriate procedures for estimating validity and reliability within an occupational therapy framework. Special considerations with regard to measurement of motor behavior are emphasized. An understanding of these assessment principles can enable a therapist to construct measurement scales that are valid, reliable, and yield data of scientific value.

Journal ArticleDOI
TL;DR: In this article, a theoretical analysis of the random effect in the determination of speech discrimination, demonstrates that binomial distribution can be used to assess the reliability of the result, which depends on the discrimination of the patient, length of word list and upon the character of the word material.
Abstract: A theoretical statistical analysis, which takes into consideration the random effect in the determination of speech discrimination, demonstrates that the binomial distribution can be used to assess the reliability of the result. A clinical study supports the theory, which shows that reliability depends on the discrimination of the patient, the length of the word list and the character of the word material. The standard deviation of the discrimination value at repeated measurements on the same patient is, in theory, inversely proportional to the square root of the number of words in the list. Curves with confidence intervals for the discrimination values obtained are presented and also diagrams indicating when two different discrimination results deviate significantly from each other. Curtailment of a 50-word list to a 25-word list is recommended only in those cases where the patient has at maximum one mistake in these 25 words. In order that reliability can be estimated, the length of the word list used always s...
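Under the binomial model in the abstract, the standard deviation of a discrimination score follows directly from the score and the list length. A sketch (score and list sizes invented) showing the 1/sqrt(n) behaviour and an approximate 95% confidence interval:

```python
import math

def discrimination_ci(score, n_words, z=1.96):
    """Approximate 95% CI for a speech discrimination score under the
    binomial model: SD = sqrt(p * (1 - p) / n), inversely proportional
    to the square root of the list length.
    """
    p = score
    se = math.sqrt(p * (1 - p) / n_words)
    return max(0.0, p - z * se), min(1.0, p + z * se)

# 80% correct on a 50-word list versus the same score on a 25-word list:
print(discrimination_ci(0.80, 50))  # narrower interval
print(discrimination_ci(0.80, 25))  # halving the list widens the CI by sqrt(2)
```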


Journal ArticleDOI
TL;DR: In this paper, a method for computing exactly the frequency and duration of load loss events as measures of generating system reliability is presented, which utilizes an exact state capacity model together with a cumulative state load model which requires no idealization.
Abstract: The paper presents a method for computing exactly the frequency and duration of load loss events as measures of generating system reliability. The method utilizes an exact state capacity model together with a cumulative state load model which requires no idealization. Efficient computer algorithms for building the exact state capacity model and computing the system reliability indices are presented.
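The paper's exact-state models are not spelled out in the abstract. The sketch below shows the standard recursive convolution for building an exact capacity-outage probability table from two-state units, which is the usual starting point for such reliability indices; the unit data are invented.

```python
def capacity_outage_table(units):
    """Build an exact capacity-outage probability table by convolving
    two-state generating units (capacity in MW, forced outage rate q).

    Returns {outage_MW: probability}.
    """
    table = {0: 1.0}
    for cap, q in units:
        new = {}
        for out, prob in table.items():
            new[out] = new.get(out, 0.0) + prob * (1 - q)        # unit available
            new[out + cap] = new.get(out + cap, 0.0) + prob * q  # unit on outage
        table = new
    return table

# Invented system: three units.
units = [(100, 0.02), (100, 0.02), (50, 0.01)]
for outage, prob in sorted(capacity_outage_table(units).items()):
    print(f"{outage:4d} MW out  p = {prob:.6f}")
```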

Journal ArticleDOI
TL;DR: This work presents a Bayesian procedure that allows using test data gathered both at the component level of a multicomponent system and at the system level.
Abstract: On occasion, reliability analysts have test data gathered both at the component level of a multicomponent system and at the system level. When the system test data provide no information on component performance, classical statistical techniques in all but trivial cases do not allow using both sets of data. We present a Bayesian procedure that does allow using both sets. The procedure for attribute data makes use of a lemma that relates the moments of the prior and posterior distributions of reliability to the test data. The procedure for variables data assumes the time to failure distribution of each component is exponential.
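The authors' moment-based lemma is not reproduced in the abstract. As a rough illustration of the same idea for attribute data, the Monte Carlo sketch below assumes a two-component series system, uniform beta priors, and invented test counts; system-level test data re-weight component-level posterior draws through the binomial likelihood. This is importance sampling, not the authors' procedure.

```python
import random
from math import comb

random.seed(1)

# Invented attribute test data: (successes, trials) per component,
# with uniform Beta(1, 1) priors, and one system-level test series.
component_data = [(48, 50), (29, 30)]
sys_success, sys_trials = 18, 20

draws, weights = [], []
for _ in range(100_000):
    # Draw each component reliability from its beta posterior.
    r = [random.betavariate(1 + s, 1 + n - s) for s, n in component_data]
    R = r[0] * r[1]  # series system reliability
    # Weight by the likelihood of the system-level test outcome.
    w = comb(sys_trials, sys_success) * R**sys_success * (1 - R)**(sys_trials - sys_success)
    draws.append(R)
    weights.append(w)

posterior_mean = sum(R * w for R, w in zip(draws, weights)) / sum(weights)
print(round(posterior_mean, 4))
```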

Journal ArticleDOI
TL;DR: The writers raise several conceptual issues concerning direct similarity judgments and provide data on their reliability and validity and the results present a discouraging view of these judgments as measures of individual perceptions.
Abstract: The writers raise several conceptual issues concerning direct similarity judgments and provide data on their reliability and validity. The results present a discouraging view of the reliability and...

Journal ArticleDOI
TL;DR: The variance components approach proposed by McGaw, Wardrop and Bunda enables the researcher to pinpoint multiple sources of error and to compute a number of different reliability coefficients for different purposes.
Abstract: Researchers in the area of classroom observation have been greatly troubled by questions concerning the reliability of the measures that they obtain. Until recently, this concern was frequently assuaged by the routine computation of one or more coefficients of observer agreement (see Frick & Semmel, 1974). However, the work of Medley and Mitzel (1963) and McGaw, Wardrop and Bunda (1972) has clearly established the inadequacy of observer agreements alone as indices of reliability. The variance components approach which they propose enables the researcher to pinpoint multiple sources of error, and to compute a number of different reliability coefficients for different purposes. Unfortunately, the literature does not indicate that these methods have gained wide acceptance, at least not in practice. The most likely reason for this would appear to be the inference from both papers that the estimation of reliability properly requires a fully-fledged reliability study, using multiple observers fully crossed with classrooms, and (following McGaw et al., 1972) crossed also with situations. The magnitude of such a study is far beyond the resources of most researchers, nor does such an undertaking relate very closely to the purposes of their own studies (typically, to make some statement about teacher or pupil behavior, and possibly its relationship with educational outcomes). Consequently, it has been common to avoid the question of reliability altogether, or else to report a coefficient of observer agreement, knowing full well its inadequacy for that purpose. It has been urged (Herbert & Attridge, 1975) that users and developers of observation systems ought to provide data pertaining to reliability, and, as well, “... a discussion of which reliability measures were selected, and why (p. 14).” A full-scale reliability study, along the lines of McGaw et al. (1972) or Medley and Mitzel (1963) can be, and probably ought to be, demanded of the developer of an
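As a minimal sketch of the variance-components idea, the code below estimates components for a fully crossed classrooms x observers design with one score per cell, using the standard random-effects ANOVA estimators (the data and the choice of a two-facet design are illustrative, not taken from the cited papers).

```python
def variance_components(scores):
    """Random-effects variance components for a fully crossed
    classrooms x observers design, one score per cell.

    scores[i][j] = score given to classroom i by observer j.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_r)
    row = [sum(r) / n_r for r in scores]  # classroom means
    col = [sum(scores[i][j] for i in range(n_p)) / n_p for j in range(n_r)]
    ms_p = n_r * sum((m - grand) ** 2 for m in row) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in col) / (n_r - 1)
    ss_e = sum((scores[i][j] - row[i] - col[j] + grand) ** 2
               for i in range(n_p) for j in range(n_r))
    ms_e = ss_e / ((n_p - 1) * (n_r - 1))
    var_p = max(0.0, (ms_p - ms_e) / n_r)  # classrooms component
    var_r = max(0.0, (ms_r - ms_e) / n_p)  # observers component
    g = var_p / (var_p + ms_e / n_r)       # generalizability over n_r observers
    return var_p, var_r, ms_e, g

# Invented ratings: 5 classrooms x 3 observers.
scores = [[7, 8, 7], [5, 5, 6], [9, 9, 8], [4, 5, 4], [6, 7, 7]]
print(variance_components(scores))
```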

Journal ArticleDOI
TL;DR: The purpose of this study was to determine the short-term test-retest reliability of three alcohol consumption scores, all derived from the same interview: the absolute alcohol score, from Jessor et al., the QFV index, and the volume variability index, both developed by Cahalan et al.
Abstract: Since the early 1950s, when Maxwell1 and Straus and Bacon2 published a quantity-frequency approach to estimating alcohol consumption, this procedure, with various modifications, has been widely used, especially for studying drinking habits in large populations. Because many inferences are made about the actual drinking habits of persons based on self-reports of their drinking practices, the stability of the self-report of drinking behavior over time is a relevant issue. The question with which we are concerned is: If a person is classified as a “heavy drinker” or a “light drinker” on the basis of an interview given today, will he still be classified as a “heavy drinker” or a “light drinker” when given the same interview in about a week? This paper discusses the reliability of the classification of subjects’ alcohol consumption, based on their self-reports of drinking behavior obtained through a quantity-frequency-variability (QFV) interview. Only Edwards et al.3 have reported on the test-retest reliability of self-reports of alcohol consumption. Because their procedure for obtaining and scoring the data differed considerably from the procedures commonly used in this country, the present study was designed. The purpose of this study was to determine the short-term test-retest reliability of three alcohol consumption scores, all derived from the same interview: the absolute alcohol (AA) score, from Jessor et al.,4 the QFV index, and the volume variability (VV) index, both developed by Cahalan et al.5

Proceedings ArticleDOI
13 Oct 1976
TL;DR: This paper defines a practical procedure for assessing the reliability of programs written using structured programming techniques, treating the degree of verification attained with a given set of tests, measured by the number of logic paths actually traversed, as an assessment of program reliability.
Abstract: This paper deals with the problem of assessing the reliability of programs written using structured programming techniques and having undergone a certain amount of testing. A program is said to be verified if, for a given set of tests it can be shown that every case of interest has been tested. As this end is, however, unattainable, we will consider, in the following, that a program is verified if one can prove that all the logic paths in the program flow graph have been traversed. Therefore, we will consider that a certain degree of verification is attained with a given set of tests, according to the number of paths actually traversed. This degree of verification, which is a non-decreasing function of the number of tests can be considered as an assessment of program reliability. The degree of verification attained through experiments can then be deduced from the images of experiments in the program flow graph. This paper defines a practical procedure to perform such an evaluation.
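A toy version of the proposed measure: enumerate the logic paths of an acyclic flow graph and report the fraction exercised by a test set. The graph and traversed paths are invented, and real programs would need the loop treatment this sketch omits.

```python
def all_paths(graph, node, goal):
    """All paths from node to goal in an acyclic flow graph."""
    if node == goal:
        return [(node,)]
    return [(node,) + rest
            for succ in graph.get(node, [])
            for rest in all_paths(graph, succ, goal)]

def degree_of_verification(graph, entry, exit_, traversed):
    """Fraction of logic paths exercised by the tests run so far."""
    paths = set(all_paths(graph, entry, exit_))
    return len(paths & set(traversed)) / len(paths)

# Invented flow graph of a structured program (two successive branches).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"],
         "D": ["E", "F"], "E": ["G"], "F": ["G"]}
tests = [("A", "B", "D", "E", "G"), ("A", "C", "D", "E", "G")]
print(degree_of_verification(graph, "A", "G", tests))  # -> 0.5 (2 of 4 paths)
```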