
Showing papers in "IEEE Transactions on Reliability in 1988"


Journal Article•DOI•
TL;DR: Improvement factors in hazard rate and age for a sequential preventive maintenance (PM) policy are introduced, and the optimal policies that minimize the mean cost rates are discussed.
Abstract: Improvement factors in hazard rate and age for a sequential preventive maintenance (PM) policy are introduced. Two imperfect PM models are analyzed: (1) PM reduces the hazard rate, which nevertheless increases with the number of PMs, and (2) PM reduces the effective age. The PM is done at intervals x_k (k = 1, 2, ..., N) and is imperfect. The optimal policies that minimize the mean cost rates are discussed. The optimal PM sequences (x_k) are computed for a Weibull distribution.
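The age-reduction variant (model 2) is easy to prototype. A minimal sketch, assuming a Weibull cumulative hazard, minimal repair at failures, and illustrative cost parameters; this is not the paper's exact formulation:

```python
def weibull_cum_hazard(t, beta, eta):
    """Cumulative hazard H(t) = (t/eta)**beta of a Weibull distribution."""
    return (t / eta) ** beta

def mean_cost_rate(x, beta, eta, c_min, c_pm, c_rep, b=0.5):
    """Mean cost rate of an imperfect-PM policy under age reduction.

    x: PM intervals x_1..x_N; the item is replaced after the last one.
    b: fraction of effective age surviving each PM (b=0 is perfect PM,
       b=1 means PM has no rejuvenating effect).
    c_min, c_pm, c_rep: illustrative costs of one minimal repair, one PM,
    and the final replacement.  Failures are minimally repaired, so the
    expected number of repairs over an interval is the hazard accumulated
    over that interval.
    """
    age = 0.0
    repairs = 0.0
    for k, xk in enumerate(x):
        repairs += (weibull_cum_hazard(age + xk, beta, eta)
                    - weibull_cum_hazard(age, beta, eta))
        if k < len(x) - 1:
            age = b * (age + xk)   # imperfect PM shrinks the effective age
    total = c_min * repairs + c_pm * (len(x) - 1) + c_rep
    return total / sum(x)
```

Scanning candidate interval sequences (x_k) against `mean_cost_rate` reproduces the flavor of the optimization, though the paper derives the optimal sequence analytically.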

344 citations


Journal Article•DOI•
TL;DR: The Dotson algorithm is suited not only to numerical reliability evaluation but also to obtaining a symbolic expression for the terminal-pair reliability with no additional effort.
Abstract: Four algorithms for the terminal-pair-reliability problem are compared. The Nelson (1970), Lin (1976), Shooman (1968), and Dotson (1979) algorithms are used in this study. It is shown that the Dotson algorithm is the fastest among the terminal-pair reliability algorithms analyzed. The Dotson algorithm is suited not only to numerical reliability evaluation but also to obtaining a symbolic expression for the terminal-pair reliability with no additional effort. By modifying the Dotson algorithm the efficiency can be further improved. The modifications to this algorithm are listed and the reliability of the modified Dotson algorithm is computed.
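For very small networks, the quantity all four algorithms compute can be checked by brute-force state enumeration; a sketch, exponential in the number of edges, which is exactly why the faster specialized algorithms matter:

```python
from itertools import product

def terminal_pair_reliability(n, edges, s, t):
    """Exact s-t reliability of an undirected graph by state enumeration.

    n: number of vertices, labeled 0..n-1.
    edges: list of (u, v, p) with p the edge's working probability.
    Enumerates all 2**len(edges) edge states, so use only on toy networks.
    """
    total = 0.0
    for state in product([0, 1], repeat=len(edges)):
        prob = 1.0
        adj = {v: [] for v in range(n)}
        for works, (u, v, p) in zip(state, edges):
            prob *= p if works else 1.0 - p
            if works:
                adj[u].append(v)
                adj[v].append(u)
        # depth-first search over working edges from s
        seen, stack = {s}, [s]
        while stack:
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if t in seen:
            total += prob
    return total
```

Two parallel 0.9-reliable links give 1 - 0.1**2 = 0.99; two in series give 0.81.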

138 citations


Journal Article•DOI•
TL;DR: In this paper, a series system of three components, each with a constant failure rate, is considered, and a single numerical procedure for obtaining maximum-likelihood estimates (MLEs) in the general case is proposed.
Abstract: Life data from multicomponent systems are often analyzed to estimate the reliability of each system component. Due to cost and diagnostic constraints, however, the exact cause of system failure might be unknown. Referring to such situations as being masked, the authors use a likelihood approach to exploit all the available information. They focus on a series system of three components, each with a constant failure rate, and propose a single numerical procedure for obtaining maximum-likelihood estimates (MLEs) in the general case. It is shown that, under certain assumptions, closed-form solutions for the MLEs can be obtained. The authors consider that the cause of system failure can be isolated to some subset of components, which allows them to consider the full range of possible information on the cause of system failure. The likelihood, while presented for complete data, can be extended to censoring. The general likelihood expressions can be used with various component life distributions, e.g., Weibull, lognormal. However, closed-form MLEs would most certainly be intractable and numerical methods would be required.

124 citations


Journal Article•DOI•
TL;DR: An implementation of an algorithm which uses the factoring theorem, in conjunction with degree-1 and degree-2 vertex reductions, to determine the reliability of a network is presented.
Abstract: The factoring theorem is a simple tool for determining the K-terminal reliability of a network, i.e. the probability that a given set K of terminals in the network are connected to each other by a path of working edges. An implementation of an algorithm which uses the factoring theorem, in conjunction with degree-1 and degree-2 vertex reductions, to determine the reliability of a network is presented. Networks treated have completely reliable nodes and edges which fail statistically independently with known probabilities. The reliability problem is to determine the probability that all nodes in a designated set of nodes can communicate with each other. Such an implementation of the factoring theorem can be incorporated in a small, stand-alone program of about 500 lines of code. A program of this type requires little computer memory and is ideally suited for microcomputer use.
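The factoring recursion at the heart of such a program is compact. A bare-bones two-terminal sketch, without the degree-1 and degree-2 reductions that give the paper's implementation its efficiency:

```python
def st_reliability(edges, s, t):
    """s-t reliability by the factoring theorem:
    R(G) = p * R(G with e contracted) + (1 - p) * R(G with e deleted).

    edges: list of (u, v, p); vertices may be any hashable labels.
    """
    if s == t:
        return 1.0
    # drop self-loops created by earlier contractions
    edges = [e for e in edges if e[0] != e[1]]
    if not edges:
        return 0.0
    (u, v, p), rest = edges[0], edges[1:]
    # contract e: merge v into u, relabelling the terminals if needed
    merged = [(u if a == v else a, u if b == v else b, q) for a, b, q in rest]
    s2 = u if s == v else s
    t2 = u if t == v else t
    return p * st_reliability(merged, s2, t2) + (1 - p) * st_reliability(rest, s, t)
```

On the classic five-edge bridge network with all edge reliabilities 0.5, the s-t reliability is exactly 0.5.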

115 citations


Journal Article•DOI•
S. H. Sim1, J. Endrenyi1•
TL;DR: In this paper, a minimal preventive-maintenance model is developed for repairable, continuously operating devices whose condition deteriorates with time in service; the times to preventive maintenance have an Erlang distribution and can be, in a limiting case, deterministic.
Abstract: A minimal preventive-maintenance model is developed for repairable, continuously-operating devices whose condition deteriorates with the time in service. The times to preventive maintenance have an Erlang distribution and can be, in a limiting case, deterministic. The optimal value of the mean time to preventive maintenance is determined by minimizing the unavailability of the device due to preventive maintenance, to Poisson-distributed failures, and to deterioration failures. The model is useful for many devices, including electric power-system components such as coal pulverizers, circuit breakers, and transformers.

113 citations


Journal Article•DOI•
Kyung S. Park1•
TL;DR: In this paper, the optimal wear limit for preventive replacement that minimizes the long-run total average cost rate is derived, where the item is preventively replaced as the wear at periodic inspections exceeds a certain wear limit; on failure, it is replaced immediately.
Abstract: An item breaks down when it wears continuously beyond a certain threshold. The item is preventively replaced as the wear at periodic inspections exceeds a certain wear limit; on failure, it is replaced immediately. The optimal wear limit for preventive replacement that minimizes the long-run total average-cost rate is derived. A numerical example demonstrates its computability.
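The policy is straightforward to simulate. A hedged sketch assuming exponentially distributed wear increments between equally spaced inspections, with failures also detected at inspection (a simplification of the paper's immediate replacement on failure); all parameters are illustrative:

```python
import random

def average_cost_rate(wear_limit, threshold=10.0, inspect_every=1.0,
                      mean_increment=0.5, c_prevent=1.0, c_fail=5.0,
                      n_cycles=20000, seed=7):
    """Simulated long-run cost rate of a wear-limit replacement policy.

    Wear accumulates by exponential increments between periodic
    inspections; the item is preventively replaced (cost c_prevent) when
    inspected wear exceeds wear_limit, and correctively (higher cost
    c_fail) if wear has passed the failure threshold.
    """
    rng = random.Random(seed)
    total_cost = total_time = 0.0
    for _ in range(n_cycles):
        wear, t = 0.0, 0.0
        while True:
            wear += rng.expovariate(1.0 / mean_increment)
            t += inspect_every
            if wear >= threshold:          # failure occurred this period
                total_cost += c_fail
                break
            if wear >= wear_limit:         # preventive replacement
                total_cost += c_prevent
                break
        total_time += t
    return total_cost / total_time
```

Evaluating `average_cost_rate` over a grid of wear limits locates the cost-minimizing limit numerically, mirroring the computability the numerical example demonstrates.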

92 citations


Journal Article•DOI•
TL;DR: In this paper, the reliability of an m-out-of-n:G system is obtained under the assumption that the failure of a component changes the failure rate of the surviving components.
Abstract: An m-out-of-n:G system is composed of statistically independent and identically distributed components with exponential lifetimes. The reliability of such a system is obtained under the assumption that the failure of a component changes the failure rate of the surviving components.
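With exponential lifetimes, the time to system failure is hypoexponential: while j components survive, the next failure arrives at total rate j times the current per-component rate. A sketch under the assumption that all stage rates are distinct (the standard partial-fraction formula):

```python
import math

def load_sharing_reliability(t, n, m, rates):
    """Reliability at time t of an m-out-of-n:G load-sharing system.

    rates[j] is the per-component failure rate after j components have
    already failed (j = 0 .. n-m); the system fails at the (n-m+1)-th
    component failure.  Stage j ends at total rate (n-j)*rates[j], so the
    time to system failure is a hypoexponential sum; this formula assumes
    the stage rates are all distinct.
    """
    mu = [(n - j) * rates[j] for j in range(n - m + 1)]
    surv = 0.0
    for i, mi in enumerate(mu):
        coeff = 1.0
        for j, mj in enumerate(mu):
            if j != i:
                coeff *= mj / (mj - mi)
        surv += coeff * math.exp(-mi * t)
    return surv
```

With constant rates (no load-sharing effect) and m=1, n=2 this reduces to the familiar parallel-system result 2e^{-t} - e^{-2t}.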

89 citations


Journal Article•DOI•
TL;DR: In this paper, a method for online hazard aversion and fault diagnosis in chemical processes is developed using a directed graph model of process operation and control, which combines real-time data and prior rates of equipment malfunctions and process disturbances.
Abstract: A method for online hazard aversion and fault diagnosis in chemical processes is developed. The method uses a directed graph model of process operation and control. Fault trees developed from the directed graphs are combined with real-time data to provide online diagnosis for hazard aversion and fault detection. Both hardwired control and manual control are modeled. A single control loop illustrates the modeling technique and the diagnosis method. The method provides an advance alert to process problems and an identification of the problems' causes, based on the available real-time data and prior rates of equipment malfunctions and process disturbances.

88 citations


Journal Article•DOI•
S.H. Ahmad1•
TL;DR: Two methods are given that use combinations of nodes to enumerate all minimal cutsets of a general acyclic directed graph and four rules are given for deletion of those combinations that yield redundant and nonminimal subsets.
Abstract: Two methods are given that use combinations of nodes to enumerate all minimal cutsets. One simply enumerates all combinations of orders 1 to N-3 of the nodes 2 to N-1, where N is the total number of nodes. By collecting only those link symbols, from the first row of the adjacency matrix and from the rows listed in a combination, that do not appear in the columns of the combination, a cutset of an acyclic directed graph having all adjacent nodes is obtained. To obtain the cutsets of a general acyclic directed graph, four rules are given for deleting those combinations that yield redundant and nonminimal subsets. The rules provide a reduced set of combinations, which then gives rise to the minimal cutsets of a general graph. Three examples illustrate the algorithms.
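The idea of generating cutsets from node combinations can be illustrated by a brute-force cousin of the method: every node subset containing the source but not the sink induces the cutset of edges leaving it, and non-minimal cutsets are filtered out afterward (the paper's four rules avoid generating them in the first place):

```python
from itertools import combinations

def minimal_cutsets(n, edges, s=0, t=None):
    """Enumerate minimal s-t edge cutsets of a directed acyclic graph.

    n: number of nodes, labeled 0..n-1; edges: list of directed (u, v).
    Every subset S with s in S and t not in S induces the cut of edges
    leaving S; any s-t path must cross that cut.  Exponential in n, so
    for illustration only.
    """
    if t is None:
        t = n - 1
    others = [v for v in range(n) if v not in (s, t)]
    cuts = []
    for r in range(len(others) + 1):
        for combo in combinations(others, r):
            S = {s, *combo}
            cuts.append(frozenset((u, v) for u, v in edges
                                  if u in S and v not in S))
    # keep only the minimal cutsets (no proper subset is also a cut)
    return sorted({c for c in cuts if not any(o < c for o in cuts)},
                  key=lambda c: (len(c), sorted(c)))
```

For the graph 0→1→2 with a shortcut 0→2, the two minimal cutsets are {(0,1), (0,2)} and {(1,2), (0,2)}.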

84 citations

Journal Article•DOI•
TL;DR: The use of Petri nets to represent fault trees is discussed and a more general and useful method to study the dynamic behavior of the model at various levels of abstraction is examined.
Abstract: The use of Petri nets to represent fault trees is discussed. Using reachability and other analytic properties of Petri nets, a more general and useful method to study the dynamic behavior of the model at various levels of abstraction is examined. The problems of fault-detection and propagation are discussed. For simplicity, only coherent fault trees are considered. However, the representation and analysis techniques are general and can be used for noncoherent fault trees.

Journal Article•DOI•
TL;DR: The body of literature addressing human errors and their effect on system performance is listed and categorized by applicability, by the system under consideration, and by the type of task being performed.
Abstract: The body of literature addressing human errors and their effect on system performance is listed and categorized. The following factors were considered in classifying the literature: (1) applicability: human-performance prediction, performance analysis of man-machine systems, man-machine reliability allocation, human-error data collection, or human-error overview; (2) system under consideration: human component only, or both human and hardware; and (3) type of task being performed: operational or maintenance work, continuous or discrete tasks, and full range of human behavior or single functions such as decision-making or signal detection.

Journal Article•DOI•
TL;DR: The authors present a model for the behavior of software failures that fits into the general framework of empirical Bayes problems; however, they take a proper Bayes approach for inference by viewing the situation as a Bayes empirical-Bayes problem.
Abstract: The authors present a model for the behavior of software failures. Their model fits into the general framework of empirical Bayes problems; however, they take a proper Bayes approach for inference by viewing the situation as a Bayes empirical-Bayes problem. An approximation due to D.V. Lindley (1980) plays a central role in the analysis. They show that the Littlewood-Verrall model (1973) is an empirical Bayes model and discuss a fully Bayes analysis of it using the Bayes empirical-Bayes setup. Finally, they apply both models to some actual software failure data and compare their predictive performance.

Journal Article•DOI•
TL;DR: This paper seeks a checkpoint strategy which maximizes the probability of critical-task completion on a system with limited repairs, and extends the model to include a constraint which enforces timely completion of the critical task.
Abstract: The selection of an optimal checkpointing strategy has most often been considered in the transaction processing environment where systems are allowed unlimited repairs. In this environment an optimal strategy maximizes the time spent in the normal operating state and consequently the rate of transaction processing. This paper seeks a checkpoint strategy which maximizes the probability of critical-task completion on a system with limited repairs. These systems can undergo failure and repair only until a repair time exceeds a specified threshold, at which time the system is deemed to have failed completely. For such systems, a model is derived which yields the probability of completing the critical task when each checkpoint operation has fixed cost. The optimal number of checkpoints can increase as system reliability improves. The model is extended to include a constraint which enforces timely completion of the critical task.

Journal Article•DOI•
TL;DR: In this paper, a method is presented for deriving a formula for the MTBF of a k-out-of-n:G parallel system that can be reproduced quickly by remembering a few simple concepts.
Abstract: It is often necessary to calculate the MTBF (mean time between failures) quickly in order to make timely design decisions. An important system for which such calculations must be made is a k-out-of-n:G parallel system with unlimited repair and exponential interfailure and repair times at the unit level. Although a general formula is known, it is not easily remembered or derived. A method for deriving a formula for the MTBF in this situation, one that can be reproduced quickly by remembering a few simple concepts, is presented.
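The formula can also be checked numerically from the underlying birth-death chain on the number of failed units. A sketch computing the mean time to first system failure from the all-good state (one common reading of the MTBF for such systems); lam and mu are the per-unit exponential failure and repair rates:

```python
def mttf_k_out_of_n(n, k, lam, mu):
    """Mean time to first failure of a k-out-of-n:G system with unlimited
    repair, from the birth-death chain on the number of failed units.

    State i (i failed units) has failure rate (n-i)*lam and repair rate
    i*mu; the system fails on reaching state n-k+1.  T[i] below is the
    mean time to go from state i to state i+1, built by the standard
    birth-death first-passage recursion.
    """
    T = []
    for i in range(n - k + 1):
        b = (n - i) * lam          # failure (birth) rate in state i
        d = i * mu                 # repair (death) rate in state i
        prev = T[-1] if T else 0.0
        T.append((1.0 + d * prev) / b)
    return sum(T)
```

For a two-unit parallel system (k=1, n=2) this reproduces the textbook result (3*lam + mu) / (2*lam**2).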

Journal Article•DOI•
TL;DR: In this article, the effect of nonnormality on X-bar and R charts is examined by comparing the probabilities that X-bar and R lie outside their three-standard-deviation and two-standard-deviation control limits.
Abstract: The effect of nonnormality on X-bar and R charts is reported. The effect of departure from normality can be examined by comparing the probabilities that X-bar and R lie outside their three-standard-deviation and two-standard-deviation control limits. Tukey's lambda-family of symmetric distributions is used because it contains a wide spectrum of distributions with a variety of tail areas. The constants required to construct X-bar and R charts for the lambda-family are computed. Control charts based on the assumption of normality give inaccurate results when the tails of the underlying distribution are thin or thick. The validity of the normality assumption is examined by using a numerical example.
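The distortion is easy to demonstrate by simulation. A sketch that sets 3-sigma limits for the subgroup mean as if the data were normal and measures the actual exceedance rate (an illustration of the effect, not the paper's lambda-family computation):

```python
import random

def out_of_limit_rate(sampler, n=5, n_subgroups=20000, seed=3):
    """Fraction of subgroup means outside 3-sigma X-bar limits that were
    set assuming normality.

    sampler(rng) must draw one observation with mean 0 and variance 1.
    For truly normal data the rate is about 0.0027; heavier or thinner
    tails shift it, which is the distortion the paper quantifies.
    """
    rng = random.Random(seed)
    limit = 3.0 / n ** 0.5       # 3 standard deviations of the subgroup mean
    out = 0
    for _ in range(n_subgroups):
        xbar = sum(sampler(rng) for _ in range(n)) / n
        if abs(xbar) > limit:
            out += 1
    return out / n_subgroups
```

Comparing a normal sampler against a unit-variance Laplace sampler (heavier tails) shows the nominal 0.27% false-alarm rate is not maintained off-normality.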

Journal Article•DOI•
TL;DR: In this article, the authors obtained Bayes estimates of the parameters and reliability function of a 3-parameter Weibull distribution and compared posterior standard-deviation estimates with the corresponding asymptotic standard deviation estimates of their maximum likelihood counterparts.
Abstract: The authors obtain Bayes estimates of the parameters and reliability function of a 3-parameter Weibull distribution and compare posterior standard-deviation estimates with the corresponding asymptotic standard-deviation estimates of their maximum likelihood counterparts. Numerical examples are given.

Journal Article•DOI•
TL;DR: In this paper, a performance index for a telecommunication network is defined as a composite index integrating the important measures of both reliability and capacity, and a fast method is proposed for deriving the symbolic expression of the performance index in a compact form.
Abstract: A performance index for a telecommunication network is defined as a composite index integrating the important measures of both reliability and capacity. In this index the network states that permit a flow capability less than the maximum are not totally ignored but are assigned a normalized weight less than one. A fast method is proposed for deriving the symbolic expression of the performance index in a compact form. Because the capacities of several subnetworks must be computed, an efficient procedure for capacity determination is also suggested. An example illustrates the procedure and points out the simplicity of the resulting expression for the performance index.
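The weighting idea can be illustrated on the simplest possible topology. A sketch restricted to parallel links between two nodes, so each state's capacity is just a sum (a general network would need a max-flow computation per state, which is what the paper's capacity-determination procedure supplies):

```python
from itertools import product

def mean_normalized_capacity(edges):
    """Mean normalized s-t capacity of a two-node network of parallel links.

    edges: list of (capacity, p_work).  Each state contributes its working
    capacity, weighted by the state probability and normalized by the
    all-up capacity.  States with partial capacity get a weight between
    0 and 1 instead of being ignored, unlike a plain reliability index.
    """
    full = sum(c for c, _ in edges)
    index = 0.0
    for state in product([0, 1], repeat=len(edges)):
        prob, cap = 1.0, 0.0
        for up, (c, p) in zip(state, edges):
            prob *= p if up else 1.0 - p
            cap += c * up
        index += prob * cap / full
    return index
```

Two equal links, each working with probability 0.5, give an index of 0.5, even though the probability of full capacity is only 0.25.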

Journal Article•DOI•
G.W. Cran1•
TL;DR: Weibull moments are defined generally and then calculated for the 3-parameter Weibull distribution with non-negative location parameter, and sample estimates for these moments are given and used to estimate the parameters.
Abstract: Weibull moments are defined generally and then calculated for the 3-parameter Weibull distribution with non-negative location parameter. Sample estimates for these moments are given and used to estimate the parameters. The results of a simulation investigation of the properties of the parameter estimates are discussed briefly. A simple method of deciding whether the location parameter can be considered zero is described.

Journal Article•DOI•
TL;DR: In this article, the conditional expectation is used to characterize gamma and negative binomial distributions, and a necessary and sufficient condition is given in terms of the failure rate for each distribution.
Abstract: Characterizations of gamma and negative binomial distributions are presented by using the conditional expectation. A necessary and sufficient condition is given in terms of the failure rate for each distribution. To illustrate the usefulness of the results, the authors discuss the mean residual life of the gamma distribution.

Journal Article•DOI•
TL;DR: In this paper, three manual techniques of conventional reliability analysis are adapted for the computation of the proposed performance indexes: a map procedure, reduction rules, and a generalized cutset procedure.
Abstract: Two recently proposed performance indexes for telecommunication networks are shown to be the s-t and overall versions of the same measure, namely, the mean normalized network capacity. The network capacity is a pseudo-switching function of the branch successes, and hence its mean value is readily obtainable from its sum-of-products expression. Three manual techniques of conventional reliability analysis are adapted for the computation of the proposed performance indexes: a map procedure, reduction rules, and a generalized cutset procedure. Four tutorial examples illustrate these techniques and demonstrate their computational advantages over the state-enumeration technique.

Journal Article•DOI•
TL;DR: In this paper, the optimal wear-limit for preventive replacement for an item with wear-dependent failure rate is derived by minimizing the long-run total mean cost rate, where the generic term wear connotes any type of degradation that accumulates through use and is observed continuously in time.
Abstract: The optimal wear-limit for preventive replacement for an item with wear-dependent failure rate is derived by minimizing the long-run total mean cost rate. The generic term wear connotes any type of degradation that accumulates through use and is observed continuously in time. The optimal strategy has the same form as the age replacement policy.

Journal Article•DOI•
TL;DR: A combined performance and reliability (performability) measure for gracefully degradable fault-tolerant systems is introduced and a closed-form, analytic solution is provided for computing the performability of a class of unrepairable systems which can be modeled by general acyclic Markov processes.
Abstract: A combined performance and reliability (performability) measure for gracefully degradable fault-tolerant systems is introduced and a closed-form, analytic solution is provided for computing the performability of a class of unrepairable systems which can be modeled by general acyclic Markov processes. This allows the study of models which consider the degradation of more than one type of system component, e.g. processors, memories, buses. An efficient evaluation algorithm is provided, with an extensive analysis of its time and space complexity. A numerical example is provided which shows how the combined performance/reliability measure provides for a complete evaluation of the relative merits of different multiprocessor structures.

Journal Article•DOI•
R.K. Sah1, J.E. Stiglitz•
TL;DR: In this article, the authors derive several properties of optimal k-out-of-n systems in which the i.i.d. components can be, with a prespecified frequency, in one of two possible modes; components are subject to failures in each of the two modes; and the costs of the two kinds of system failure are not necessarily the same.
Abstract: Several properties are derived for optimal k-out-of-n systems in which: (1) the i.i.d. components can be, with a prespecified frequency, in one of two possible modes; (2) components are subject to failures in each of the two modes; and (3) the costs of the two kinds of system failure are not necessarily the same. A characterization of the optimal k that maximizes the mean system profit is obtained. It is shown how one can predict directly from the parameters of the system whether the optimal k is smaller or larger than n/2. The directions of change in the optimal k resulting from changes in system parameters are ascertained. A subclass of the formulation and results corresponds to the case examined in the literature in which the optimal k is chosen to maximize the system's reliability.

Journal Article•DOI•
TL;DR: In this paper, the problem of designing the most reliable configuration of a given number of s-identical components that can experience both failure modes is addressed, and two simple algorithms for designing an optimal configuration are presented.
Abstract: A 3-state component is one which can fail in two different modes: an open mode and a shorted mode. Systems built from such components can also experience either of these two failure modes. For a given number of s-identical s-independent components, a pure 'series' or pure 'parallel' configuration would be most reliable if only one of the two failure modes were possible. This paper treats the problem of designing the most reliable configuration of a given number of s-identical components that can experience both failure modes. Two simple algorithms for designing an optimal configuration are presented, and analyses of 6-, 8-, and 20-component systems illustrate the extent to which other configurations can be more reliable than 'series-parallel' or 'parallel-series' arrays.
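The standard reliability formulas for the pure configurations the paper starts from are short. A sketch with q_open and q_short denoting the per-component open- and short-failure probabilities:

```python
def series_reliability(n, q_open, q_short):
    """Series system of n 3-state components: it fails open if any
    component fails open, and fails short only if all n fail short."""
    return (1.0 - q_open) ** n - q_short ** n

def parallel_reliability(n, q_open, q_short):
    """Parallel system: it fails short if any component fails short,
    and fails open only if all n fail open."""
    return (1.0 - q_short) ** n - q_open ** n
```

Each pure configuration is hurt by one mode and helped by the other, which is why, with both modes present, neither is optimal in general and mixed arrays must be searched, as the paper's algorithms do.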

Journal Article•DOI•
TL;DR: An algorithm to compute variance importance, a measure of uncertainty importance for system components, is presented, which overcomes NP-difficulty (non-polynomial difficulty) which exists in earlier methods for computing uncertainty importance, and is simpler, more accurate, and more practical.
Abstract: The authors present an algorithm to compute variance importance, a measure of uncertainty importance for system components. A simple equation has been derived for the measure, and Monte Carlo simulation is used to obtain numerical estimates. The algorithm overcomes NP-difficulty (non-polynomial difficulty) which exists in earlier methods for computing uncertainty importance, and is simpler, more accurate, and more practical. Moreover, it shows the direct relationship between probabilistic importance and uncertainty importance. An example illustrates the evaluation of Monte Carlo variance importance for a sample system.
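A main-effect (variance) importance can be estimated with a double Monte Carlo loop. A sketch for an illustrative 2-out-of-3:G system with uncertain component reliabilities; the sampling scheme here is generic, not the paper's algorithm:

```python
import random

def system_reliability(p):
    """Reliability of a 2-out-of-3:G system given component reliabilities."""
    p1, p2, p3 = p
    return (p1 * p2 * (1 - p3) + p1 * (1 - p2) * p3
            + (1 - p1) * p2 * p3 + p1 * p2 * p3)

def variance_importance(i, sample_pi, sample_rest,
                        n_outer=200, n_inner=200, seed=1):
    """Monte Carlo main-effect importance of component i:
    Var over p_i of E[R | p_i], estimated by a double sampling loop.

    sample_pi(rng) draws component i's uncertain reliability;
    sample_rest(rng) draws a fresh list of all component reliabilities
    (entry i is then overwritten).  Both distributions are illustrative.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_outer):
        pi = sample_pi(rng)
        acc = 0.0
        for _ in range(n_inner):
            p = sample_rest(rng)
            p[i] = pi
            acc += system_reliability(p)
        means.append(acc / n_inner)   # estimate of E[R | p_i]
    m = sum(means) / len(means)
    return sum((x - m) ** 2 for x in means) / (len(means) - 1)
```

A component whose reliability is known exactly gets zero variance importance; uncertainty in a component's reliability that the system output actually responds to gets a positive value.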

Journal Article•DOI•
TL;DR: In this paper, an approach to the analysis of failure data from a Weibull distribution is presented, which features incorporation of expert opinion and of the author's opinion on the expertise of the experts in the analysis.
Abstract: An approach to the analysis of failure data from a Weibull distribution is presented. The approach features incorporation of expert opinion and of the author's opinion on the expertise of the experts in the analysis. The use of Laplace approximation results in formulas which are easy to compute.

Journal Article•DOI•
TL;DR: In this paper, the key elements that make the hazards and operability (HAZOP) technique effective for identifying chemical process hazards are outlined and six categories of problems that can sometimes reduce the effectiveness of HAZOP and even prevent it from discovering some major hazards are explained.
Abstract: The key elements that make the hazards and operability (HAZOP) technique effective for identifying chemical process hazards are outlined. Six categories of problems that can sometimes reduce the effectiveness of HAZOP and even prevent it from discovering some major hazards are explained. Several examples are included to show how lack of experience, failure to communicate, management shortcomings, complacency and poor loss-prevention practices, a shortage of technical information, and other limitations each contribute to the problem. Practical solutions are recommended for countering the difficulties and for making HAZOP a more effective risk-management tool.

Journal Article•DOI•
TL;DR: In this paper, the reliability of three printed wiring boards representative of those used for avionic applications was evaluated using the Mil-Hdbk-217E model. But the results suggest that reliability can be predicted only when the layout of the components and exact thermal mapping are known.
Abstract: Some limitations on the use of Mil-Hdbk-217E models within the design process are discussed. Reliability was predicted for three printed wiring boards representative of those used for avionic applications in order to evaluate the inherent variability. Parts count and parts stress analyses were conducted in three environments using Mil-Hdbk-217E models. In addition, parts stress analyses were conducted at various temperatures, assuming that components were thermally isolated and that thermal interactions resulted from the characteristics of the cooling system. The results suggest that reliability can be predicted only when the layout of the components and exact thermal mapping are known. In practice this is not acceptable, since some measure of reliability prediction is necessary in determining electrical, thermal, and mechanical design tradeoffs early in the design process.

Journal Article•DOI•
TL;DR: An algorithm is proposed for generating spanning trees (termed appended spanning trees) that are mutually disjoint; it gives the global reliability of a network directly.
Abstract: Global reliability of a network is defined and then evaluated using spanning trees of the network graph. An algorithm for generating spanning trees (termed appended spanning trees) that are mutually disjoint is proposed. Each appended spanning tree represents a probability term in the final global reliability expression. The algorithm gives the global reliability of a network directly. It is illustrated with an example. The algorithm is fast, requires very little memory, is adaptable to multiprocessors, and can be terminated at an appropriate stage for an approximate value of global reliability.