
Showing papers in "IEEE Transactions on Reliability in 2004"


Journal ArticleDOI
TL;DR: An ant colony meta-heuristic optimization method is used to solve the redundancy allocation problem (RAP), and it is shown that the ant colony method performs with little variability over problem instance or random number seed.
Abstract: This paper uses an ant colony meta-heuristic optimization method to solve the redundancy allocation problem (RAP). The RAP is a well known NP-hard problem which has been the subject of much prior work, generally in a restricted form where each subsystem must consist of identical components in parallel to make computations tractable. Meta-heuristic methods overcome this limitation, and offer a practical way to solve large instances of the relaxed RAP where different components can be placed in parallel. The ant colony method has not yet been used in reliability design, yet it is a method that is expressly designed for combinatorial problems with a neighborhood structure, as in the case of the RAP. An ant colony optimization algorithm for the RAP is devised & tested on a well-known suite of problems from the literature. It is shown that the ant colony method performs with little variability over problem instance or random number seed. It is competitive with the best-known heuristics for redundancy allocation.

347 citations
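
As a rough illustration of the ant colony idea described above (not the authors' exact algorithm), the sketch below applies an ant colony heuristic to a toy redundancy allocation problem for a series-parallel system; the candidate components, costs, budget, and parameter values are all invented for illustration.

```python
import random

# Hypothetical RAP instance: 3 subsystems in series; each subsystem may mix
# different component types in parallel. (reliability, cost) pairs are invented.
CANDIDATES = [
    [(0.90, 3.0), (0.93, 4.0), (0.95, 5.0)],   # subsystem 0
    [(0.85, 2.0), (0.92, 4.5)],                # subsystem 1
    [(0.88, 2.5), (0.94, 5.0), (0.97, 7.0)],   # subsystem 2
]
BUDGET = 40.0
MAX_PARALLEL = 4   # at most 4 components per subsystem

def system_reliability(design):
    """design[s] is the list of component-type indices placed in parallel in subsystem s."""
    rel = 1.0
    for s, comps in enumerate(design):
        unrel = 1.0
        for c in comps:
            unrel *= 1.0 - CANDIDATES[s][c][0]
        rel *= 1.0 - unrel
    return rel

def system_cost(design):
    return sum(CANDIDATES[s][c][1] for s, comps in enumerate(design) for c in comps)

def build_design(pher):
    """One ant builds a design; component types are drawn with probability
    proportional to pheromone (a full ACO would also use heuristic visibility)."""
    design = []
    for s in range(len(CANDIDATES)):
        n_parallel = random.randint(1, MAX_PARALLEL)
        choices = random.choices(range(len(CANDIDATES[s])), weights=pher[s], k=n_parallel)
        design.append(choices)
    return design

def aco_rap(n_ants=30, n_iter=200, rho=0.1):
    pher = [[1.0] * len(types) for types in CANDIDATES]   # pheromone per (subsystem, type)
    best, best_rel = None, -1.0
    for _ in range(n_iter):
        for _ in range(n_ants):
            d = build_design(pher)
            if system_cost(d) > BUDGET:
                continue                                   # infeasible design: discard
            rel = system_reliability(d)
            if rel > best_rel:
                best, best_rel = d, rel
        # evaporate pheromone, then reinforce the choices used by the best design so far
        for s in range(len(pher)):
            pher[s] = [(1.0 - rho) * p for p in pher[s]]
            for c in (best[s] if best else []):
                pher[s][c] += best_rel
    return best, best_rel

if __name__ == "__main__":
    random.seed(1)
    design, rel = aco_rap()
    print("best design:", design, "reliability: %.4f" % rel, "cost:", system_cost(design))
```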


Journal ArticleDOI
TL;DR: A photovoltaic degradation and failure assessment procedure is established and results obtained indicate that the thin-film modules degrade by up to 50% in performance after an initial outdoor exposure of 130 kWh/m².
Abstract: Photovoltaic (PV) modules are renowned for their reliability. However, some modules degrade or even fail when operating outdoors for extended periods. To reduce the degradation, and the number of failures, extensive research is needed on the performance of PV modules. The aim of this study was to establish a photovoltaic degradation and failure assessment procedure. This procedure should assess all parameters of PV modules to completely analyze any observed degradation or failure. In this paper some degradation modes of PV modules are discussed and a procedure used to assess these degradation modes is then presented. Results obtained by subjecting Copper Indium Diselenide (CIS), single and triple junction amorphous silicon (a-Si and a-SiGe), Edge-defined Film-fed Growth (EFG) silicon and mono-crystalline silicon (mono-Si) modules to the assessment procedure are presented and discussed. Results obtained indicate that the thin-film modules degrade by up to 50% in performance after an initial outdoor exposure of 130 kWh/m². Visual inspection revealed that both crystalline modules had cracked cells. The mismatch due to the cracked cell in the EFG-Si module, however, was limited by the interconnect busbars. This paper accentuates the importance of characterizing all module performance parameters in order to analyze observed degradation and failure modes.

338 citations


Journal ArticleDOI
TL;DR: A reliability model and a reliability analysis technique for component-based software, using scenarios of component interactions, are introduced, and a probabilistic model named Component-Dependency Graph (CDG) is constructed.
Abstract: This paper introduces a reliability model, and a reliability analysis technique for component-based software. The technique is named Scenario-Based Reliability Analysis (SBRA). Using scenarios of component interactions, we construct a probabilistic model named Component-Dependency Graph (CDG). Based on CDG, a reliability analysis algorithm is developed to analyze the reliability of the system as a function of reliabilities of its architectural constituents. An extension of the proposed model and algorithm is also developed for distributed software systems. The proposed approach has the following benefits: 1) It is used to analyze the impact of variations and uncertainties in the reliability of individual components, subsystems, and links between components on the overall reliability estimate of the software system. This is particularly useful when the system is built partially or fully from existing off-the-shelf components; 2) It is suitable for analyzing the reliability of distributed software systems because it incorporates link and delivery channel reliabilities; 3) The technique is used to identify critical components, interfaces, and subsystems; and to investigate the sensitivity of the application reliability to these elements; 4) The approach is applicable early in the development lifecycle, at the architecture level. Early detection of critical architecture elements, those that affect the overall reliability of the system the most, is useful in delegating resources in later development phases.

209 citations
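
The scenario-based idea can be illustrated with a toy example that combines component, link, and scenario data into a system reliability estimate. This is only a hypothetical sketch in the spirit of the abstract above, not the authors' CDG algorithm, and all names and numbers are invented.

```python
# Toy scenario-based reliability estimate. Each scenario is a sequence of
# component calls; consecutive components communicate over a link.
# System reliability = sum over scenarios of
#   P(scenario) * prod(component reliabilities) * prod(link reliabilities).

component_rel = {"UI": 0.999, "Parser": 0.995, "DB": 0.990, "Logger": 0.998}
link_rel = {("UI", "Parser"): 0.999, ("Parser", "DB"): 0.997, ("Parser", "Logger"): 0.999}

# (probability of the scenario, ordered list of components exercised)
scenarios = [
    (0.7, ["UI", "Parser", "DB"]),
    (0.3, ["UI", "Parser", "Logger"]),
]

def scenario_reliability(path):
    rel = 1.0
    for comp in path:
        rel *= component_rel[comp]
    for a, b in zip(path, path[1:]):
        rel *= link_rel[(a, b)]
    return rel

system_rel = sum(p * scenario_reliability(path) for p, path in scenarios)
print("estimated system reliability: %.5f" % system_rel)
```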


Journal ArticleDOI
TL;DR: This study provides some support for the idea that the Markov-chain technique might not be as robust as the other intrusion-detection methods such as the chi-square distance test technique, although it can produce better performance when the noise level of the data is low,such as the Mill & Pascal data in this study.
Abstract: Cyber-attack detection is used to identify cyber-attacks while they are acting on a computer and network system to compromise the security (e.g., availability, integrity, and confidentiality) of the system. This paper presents a cyber-attack detection technique through anomaly-detection, and discusses the robustness of the modeling technique employed. In this technique, a Markov-chain model represents a profile of computer-event transitions in a normal/usual operating condition of a computer and network system (a norm profile). The Markov-chain model of the norm profile is generated from historic data of the system's normal activities. The observed activities of the system are analyzed to infer the probability that the Markov-chain model of the norm profile supports the observed activities. The lower the probability the observed activities receive from the Markov-chain model of the norm profile, the more likely it is that the observed activities are anomalies resulting from cyber-attacks, and vice versa. This paper presents the learning and inference algorithms of this anomaly-detection technique based on the Markov-chain model of a norm profile, and examines its performance using the audit data of UNIX-based host machines with the Solaris operating system. The robustness of the Markov-chain model for cyber-attack detection is presented through discussions & applications. To apply the Markov-chain technique and other stochastic process techniques to model the sequential ordering of events, the quality of activity-data plays an important role in the performance of intrusion detection. The Markov-chain technique is not robust to noise in the data (the mixture level of normal activities and intrusive activities). The Markov-chain technique produces desirable performance only at a low noise level. This study also shows that the performance of the Markov-chain technique is not always robust to the window size: as the window size increases, the amount of noise in the window also generally increases. Overall, this study provides some support for the idea that the Markov-chain technique might not be as robust as other intrusion-detection methods such as the chi-square distance test technique, although it can produce better performance than the chi-square distance test technique when the noise level of the data is low, such as the Mill & Pascal data in this study.

206 citations
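
A minimal sketch of the general approach described above: learn a first-order transition matrix from normal event sequences, then score new windows by how strongly the norm profile supports them. The event names, smoothing, and scoring details are illustrative assumptions, not the paper's exact algorithms.

```python
from collections import defaultdict
import math

def learn_markov(sequences):
    """Estimate first-order transition probabilities from normal event sequences,
    with add-one smoothing so unseen transitions keep a small nonzero probability."""
    counts = defaultdict(lambda: defaultdict(int))
    events = set()
    for seq in sequences:
        events.update(seq)
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    trans = {}
    for a in events:
        total = sum(counts[a].values()) + len(events)
        trans[a] = {b: (counts[a][b] + 1) / total for b in events}
    return trans

def window_score(trans, window):
    """Average log-probability of the observed transitions under the norm profile.
    Lower scores mean the window is less supported, i.e. more anomalous."""
    logp, n = 0.0, 0
    for a, b in zip(window, window[1:]):
        p = trans.get(a, {}).get(b, 1e-6)   # unknown events get a tiny probability
        logp += math.log(p)
        n += 1
    return logp / max(n, 1)

# Illustrative audit-event data (event names invented).
normal = [["open", "read", "read", "close"] * 20,
          ["open", "read", "write", "close"] * 20]
trans = learn_markov(normal)

print("normal window :", window_score(trans, ["open", "read", "read", "close"]))
print("suspect window:", window_score(trans, ["close", "write", "open", "write"]))
```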


Journal ArticleDOI
Dilip Roy
TL;DR: Using a general approach for discretization of continuous life distributions in the univariate & bivariate situations, a discrete Rayleigh distribution is proposed, which can replace the much relied upon simulation method.
Abstract: Using a general approach for discretization of continuous life distributions in the univariate & bivariate situations, we have proposed a discrete Rayleigh distribution. This distribution has been examined in detail with respect to two measures of failure rate. Characterization results have also been studied to establish a direct link between the discrete Rayleigh distribution, and its continuous counterpart. This discretization approach not only expands the scope of reliability modeling, but also provides a method for approximating probability integrals arising out of a continuous setting. As an example, the reliability value of a complex system has been approximated. This discrete approximation in a nonnormal setting can be of practical use & importance, as it can replace the much relied upon simulation method. While the replication required is minimal, the degree of accuracy remains reasonable for our suggested method when compared with the simulation method.

176 citations
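
The discretization idea is easy to state concretely: assign to each nonnegative integer k the probability S(k) - S(k+1), where S is the continuous survival function, here the Rayleigh one. The sketch below is only a numerical illustration (the sigma value is arbitrary), not the paper's full characterization results.

```python
import math

def rayleigh_survival(x, sigma):
    """Continuous Rayleigh survival function S(x) = exp(-x^2 / (2 sigma^2))."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def discrete_rayleigh_pmf(k, sigma):
    """Discretize the continuous life distribution by assigning to integer k
    the probability mass S(k) - S(k+1); with S Rayleigh this gives a discrete
    Rayleigh distribution."""
    return rayleigh_survival(k, sigma) - rayleigh_survival(k + 1, sigma)

sigma = 3.0
pmf = [discrete_rayleigh_pmf(k, sigma) for k in range(15)]
print("pmf (k = 0..14):", ["%.4f" % p for p in pmf])
print("total mass up to k = 14: %.4f" % sum(pmf))

# Discrete failure rate r(k) = P(X = k) / P(X >= k); under this construction the
# telescoping sum gives P(X >= k) = S(k), so the failure rate is easy to evaluate.
hazard = [discrete_rayleigh_pmf(k, sigma) / rayleigh_survival(k, sigma) for k in range(15)]
print("failure rate is increasing:", all(h2 >= h1 for h1, h2 in zip(hazard, hazard[1:])))
```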


Journal ArticleDOI
TL;DR: A functional classification is used in this paper to provide a taxonomy of those voting algorithms used in safety-critical applications, and voters are classified into three categories: generic, hybrid, and purpose-built voters.
Abstract: Voting algorithms are used to provide an error masking capability in a wide range of highly dependable commercial & research applications. These applications include N-Modular Redundant hardware systems and diversely designed software systems based on N-Version Programming. The most sophisticated & complex algorithms can even tolerate malicious (or Byzantine) subsystem errors. The algorithms can be implemented in hardware or software depending on the characteristics of the application, and the type of voter selected. Many voting algorithms have been defined in the literature, each with particular strengths and weaknesses. Having surveyed more than 70 references from the literature, we use a functional classification in this paper to provide a taxonomy of the voting algorithms used in safety-critical applications. We classify voters into three categories: generic, hybrid, and purpose-built voters. Selected algorithms of each category are described, for illustrative purposes, and application areas proposed. Approaches to the comparison of algorithm behavior are also surveyed. These approaches compare the acceptability of voter behavior based on either statistical considerations (e.g., number of successes, number of benign or catastrophic results), or probabilistic computations (e.g., probability of choosing the correct value in each voting cycle or average mean square error) during q voting cycles.

150 citations
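
Two of the simplest generic voters that appear in such taxonomies, an exact majority voter and a median voter for numeric outputs, can be sketched in a few lines; the example inputs are illustrative.

```python
from collections import Counter
import statistics

def majority_voter(outputs):
    """Exact majority voter: return the value produced by more than half of the
    redundant modules, or None if no such value exists (no majority)."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

def median_voter(outputs):
    """Median voter for numeric outputs: robust to a minority of arbitrarily
    wrong (even malicious) values as long as a majority is roughly correct."""
    return statistics.median(outputs)

# Triple-modular-redundant examples (one faulty module in each case).
print(majority_voter([42, 42, 17]))        # -> 42
print(majority_voter([1, 2, 3]))           # -> None (disagreement, no majority)
print(median_voter([0.501, 0.499, 9.9]))   # -> 0.501 (outlier masked)
```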


Journal ArticleDOI
TL;DR: This work investigates in detail the case of progressively Type-I right censored data with a single stress variable in a k-step-stress accelerated test with equal duration steps τ.
Abstract: We consider in this work a k-step-stress accelerated test with equal duration steps τ. Censoring is allowed at each stress change point iτ, i = 1, ..., k. The problem of choosing the optimal τ is addressed using variance-optimality as well as determinant-optimality criteria. We investigate in detail the case of progressively Type-I right censored data with a single stress variable.

139 citations


Journal ArticleDOI
TL;DR: This paper addresses system reliability optimization when component reliability estimates are treated as random variables with estimation uncertainty, and Pareto optimality is an attractive alternative for this type of problem.
Abstract: Summary & Conclusions: This paper addresses system reliability optimization when component reliability estimates are treated as random variables with estimation uncertainty. System reliability optimization algorithms generally assume that component reliability values are known exactly, i.e., they are deterministic. In practice, that is rarely the case. For risk-averse system design, the estimation uncertainty, propagated from the component estimates, may result in unacceptable estimation uncertainty at the system level. The system design problem is thus formulated with multiple objectives: (1) to maximize the system reliability estimate, and (2) to minimize its associated variance. This formulation of the reliability optimization is new, and the resulting solutions offer a unique perspective on system design. Once formulated in this manner, standard multiple objective concepts, including Pareto optimality, were used to determine solutions. Pareto optimality is an attractive alternative for this type of problem. It provides decision-makers the flexibility to choose the best-compromise solution. Pareto optimal solutions were found by solving a series of weighted objective problems with incrementally varied weights. Several sample systems are solved to demonstrate the approach presented in this paper. The first example is a hypothetical series-parallel system, and the second example is the fault tolerant distributed system architecture for a voice recognition system. The results indicate that significantly different designs are obtained when the formulation incorporates estimation uncertainty. If decision-makers are risk averse, and wish to consider estimation uncertainty, previously available methodologies are likely to be inadequate.

124 citations
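
A minimal sketch of the weighted-objective idea on a toy series-parallel design problem: component reliability estimates carry variances, the system-level mean and a first-order propagated variance are computed, and a sweep over weights collects candidate best-compromise designs. The component data, budget, propagation approximation, and weight grid are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical component reliability estimates (mean, variance) per subsystem,
# with identical redundant copies allowed in parallel; cost per copy is given.
SUBSYS = [
    {"mean": 0.90, "var": 0.0009, "cost": 2.0},
    {"mean": 0.85, "var": 0.0016, "cost": 1.5},
]
BUDGET = 15.0
MAX_N = 5

def rel_and_var(ns):
    """System reliability estimate and first-order propagated variance for a
    series arrangement of parallel subsystems with ns[i] copies in subsystem i."""
    subs = [1.0 - (1.0 - s["mean"]) ** n for s, n in zip(SUBSYS, ns)]
    R = 1.0
    for r in subs:
        R *= r
    var = 0.0
    for i, (s, n) in enumerate(zip(SUBSYS, ns)):
        others = 1.0
        for j, r in enumerate(subs):
            if j != i:
                others *= r
        dRdp = others * n * (1.0 - s["mean"]) ** (n - 1)   # dR/dp_i
        var += dRdp ** 2 * s["var"]
    return R, var

def cost(ns):
    return sum(s["cost"] * n for s, n in zip(SUBSYS, ns))

# Sweep weights of the two objectives (maximize R, minimize Var); for each weight
# keep the best feasible design. Duplicate designs collapse into one entry.
pareto = {}
for w in [i / 10 for i in range(11)]:
    best = None
    for ns in product(range(1, MAX_N + 1), repeat=len(SUBSYS)):
        if cost(ns) > BUDGET:
            continue
        R, var = rel_and_var(ns)
        score = w * R - (1.0 - w) * var
        if best is None or score > best[0]:
            best = (score, ns, R, var)
    pareto[best[1]] = (best[2], best[3])

for ns, (R, var) in sorted(pareto.items()):
    print("design", ns, "R = %.5f" % R, "Var = %.2e" % var)
```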


Journal ArticleDOI
Yi-Kuei Lin
TL;DR: This paper concentrates on a stochastic-flow network in which nodes as well as branches have several possible capacities, and can fail, and proposes an efficient algorithm to generate all lower boundary points for (d, C).
Abstract: System reliability evaluation for flow networks is an important issue in quality management. This paper concentrates on a stochastic-flow network in which nodes as well as branches have several possible capacities, and can fail. The possibility is evaluated that a given number of messages can be transmitted through the stochastic-flow network under the budget constraint. Such a possibility, the system reliability, is a performance index for a stochastic-flow network. A minimal path, an ordered sequence of nodes & branches from the source to the sink without cycles, is used to assign the flow to each component (branch or node). A lower boundary point for (d, C) is a minimal capacity vector which enables the system to transmit d messages under the budget C. Based on minimal paths, an efficient algorithm is proposed to generate all lower boundary points for (d, C). The system reliability can then be calculated in terms of all lower boundary points for (d, C) by applying the inclusion-exclusion rule. Simulation shows that the implicit enumeration (step 1) of the proposed algorithm can be executed efficiently.

112 citations
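
Once the lower boundary points for (d, C) are known, the inclusion-exclusion step is mechanical; the sketch below shows that step for invented capacity distributions and boundary points, assuming s-independent components. Generating the boundary points themselves (the core of the proposed algorithm) is not shown.

```python
from itertools import combinations

# Discrete capacity distributions (invented): cap_dist[i][x] = P(component i has capacity x).
cap_dist = [
    {0: 0.05, 1: 0.15, 2: 0.80},
    {0: 0.10, 2: 0.90},
    {0: 0.05, 1: 0.25, 3: 0.70},
]

def p_at_least(i, c):
    return sum(p for x, p in cap_dist[i].items() if x >= c)

def p_vector_at_least(v):
    """P(X >= v) componentwise, assuming statistically independent components."""
    prob = 1.0
    for i, c in enumerate(v):
        prob *= p_at_least(i, c)
    return prob

def reliability(lower_boundary_points):
    """P(X >= some lower boundary point) by inclusion-exclusion; an intersection of
    events {X >= v} corresponds to the componentwise maximum of the vectors."""
    pts = list(lower_boundary_points)
    total = 0.0
    for k in range(1, len(pts) + 1):
        for subset in combinations(pts, k):
            vmax = tuple(max(col) for col in zip(*subset))
            total += (-1) ** (k + 1) * p_vector_at_least(vmax)
    return total

# Invented lower boundary points for some demand level d under budget C.
lbps = [(1, 2, 0), (2, 0, 1), (0, 2, 3)]
print("system reliability: %.5f" % reliability(lbps))
```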


Journal ArticleDOI
TL;DR: Two heuristic algorithms are introduced, called Iterative Two-Step-Approach (ITSA) & Maximum Likelihood Relaxation (MLR), which aim to find near-optimal solutions with less computation time; the performance of the proposed schemes is then evaluated.
Abstract: This paper proposes a suite of approaches to solve the survivable routing problem with shared protection. We first define mathematically the maximum extent of resource sharing for a protection path given the corresponding working path according to the current network link-state. Then the problem of solving the least-cost working & protection path-pair (in terms of the sum of the cost) is formulated into an Integer Linear Programming process. Due to the dependency of the protection path on its working path, however, the formulation is not scalable with the network size, and takes extra effort to solve. Therefore, we introduce two heuristic algorithms, called Iterative Two-Step-Approach (ITSA) & Maximum Likelihood Relaxation (MLR), which aim to find near-optimal solutions with less computation time. We evaluate the performance of the proposed schemes, and make a comparison with some reported counterparts. The simulation results show that the ITSA scheme, with a properly defined tolerance to optimality, can achieve the best performance at the expense of more computation time. On the other hand, MLR delivers a compromise between computation efficiency & performance.

97 citations


Journal ArticleDOI
TL;DR: A feedback control approach is proposed which enables tradeoffs between the failure cost of a compromised information system and the maintenance cost of ongoing defensive countermeasures, and applies to other types of computer attacks, network-level security and other domains which could benefit from automatic decision-making based on a sequence of sensor measurements.
Abstract: We address the problem of information system survivability, or dynamically preserving intended functionality & computational performance, in the face of malicious intrusive activity. A feedback control approach is proposed which enables tradeoffs between the failure cost of a compromised information system and the maintenance cost of ongoing defensive countermeasures. Online implementation features an inexpensive computation architecture consisting of a sensor-driven recursive estimator followed by an estimate-driven response selector. Offline design features a systematic empirical procedure utilizing a suite of mathematical modeling and numerical optimization tools. The engineering challenge is to generate domain models and decision strategies offline via tractable methods, while achieving online effectiveness. We illustrate the approach with experimentation results for a prototype autonomic defense system which protects its host, a Linux-based web-server, against an automated Internet worm attack. The overall approach applies to other types of computer attacks, network-level security and other domains which could benefit from automatic decision-making based on a sequence of sensor measurements.
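
The "sensor-driven recursive estimator followed by an estimate-driven response selector" can be caricatured with a two-state Bayesian filter and a cost-threshold response rule. The transition and observation probabilities, costs, and alert stream below are invented, and this is only a loose sketch of the architecture, not the paper's actual models or strategies.

```python
# Two states: 0 = healthy, 1 = compromised. A noisy binary sensor raises alerts.
P_COMPROMISE = 0.02      # per-step probability of becoming compromised (assumed)
P_ALERT_GIVEN_BAD = 0.8  # sensor true-positive rate (assumed)
P_ALERT_GIVEN_OK = 0.05  # sensor false-alarm rate (assumed)
FAILURE_COST = 100.0     # cost of leaving a compromised system running (assumed)
RESPONSE_COST = 5.0      # cost of invoking the defensive countermeasure (assumed)

def update_belief(belief, alert):
    """Recursive Bayesian update of P(compromised) given one sensor reading."""
    # predict: the system may become compromised between observations
    prior = belief + (1.0 - belief) * P_COMPROMISE
    # correct: condition on whether the sensor raised an alert
    like_bad = P_ALERT_GIVEN_BAD if alert else 1.0 - P_ALERT_GIVEN_BAD
    like_ok = P_ALERT_GIVEN_OK if alert else 1.0 - P_ALERT_GIVEN_OK
    return like_bad * prior / (like_bad * prior + like_ok * (1.0 - prior))

def select_response(belief):
    """Respond when the expected failure cost exceeds the response cost."""
    return belief * FAILURE_COST > RESPONSE_COST

belief = 0.0
for t, alert in enumerate([0, 0, 1, 0, 1, 1, 1]):   # invented alert stream
    belief = update_belief(belief, alert)
    print(f"t={t} alert={alert} P(compromised)={belief:.3f} respond={select_response(belief)}")
```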

Journal ArticleDOI
TL;DR: DEEM is able to deal with all the scenarios of MPS which have been analytically treated in the literature, at a cost which is comparable with that of the cheapest ones, completely solving the issues posed by the phased-behavior of MPS.
Abstract: Multiple-Phased Systems (MPS), i.e., systems whose operational life can be partitioned in a set of disjoint periods, called "phases", include several classes of systems such as Phased Mission Systems and Scheduled Maintenance Systems. Because of their deployment in critical applications, the dependability modeling and analysis of Multiple-Phased Systems is a task of primary relevance. The phased behavior makes the analysis of Multiple-Phased Systems extremely complex. This paper describes the modeling methodology and the solution procedure implemented in DEEM, a dependability modeling and evaluation tool specifically tailored for Multiple Phased Systems. It also describes its use for the solution of representative MPS problems. DEEM relies upon Deterministic and Stochastic Petri Nets as the modeling formalism, and on Markov Regenerative Processes for the model solution. When compared to existing general-purpose tools based on similar formalisms, DEEM offers advantages on both the modeling side (sub-models neatly model the phase-dependent behaviors of MPS), and on the evaluation side (a specialized algorithm allows a considerable reduction of the solution cost and time). Thus, DEEM is able to deal with all the scenarios of MPS which have been analytically treated in the literature, at a cost which is comparable with that of the cheapest ones, completely solving the issues posed by the phased-behavior of MPS.

Journal ArticleDOI
TL;DR: The modular approach divides the multi-phase system into its static and dynamic subsystems, and solves them independently; and then combines the results for the solution of the entire system using the module joint probability method.
Abstract: Binary Decision Diagram (BDD)-based solution approaches and Markov chain based approaches are commonly used for the reliability analysis of multi-phase systems. These approaches either assume that every phase is static, and thus can be solved with combinatorial methods, or assume that every phase must be modeled via Markov methods. If every phase is indeed static, then the combinatorial approach is much more efficient than the Markov chain approach. But in a multi-phased system, using currently available techniques, if the failure criteria in even one phase is dynamic, then a Markov approach must be used for every phase. The problem with Markov chain based approaches is that the size of the Markov model can expand exponentially with an increase in the size of the system, and therefore becomes computationally intensive to solve. Two new concepts, phase module and module joint probability, are introduced in this paper to deal with the s-dependency among phases. We also present a new modular solution to nonrepairable dynamic multi-phase systems, which provides a combination of BDD solution techniques for static modules, and Markov chain solution techniques for dynamic modules. Our modular approach divides the multi-phase system into its static and dynamic subsystems, and solves them independently; and then combines the results for the solution of the entire system using the module joint probability method. A hypothetical example multi-phase system is given to demonstrate the modular approach.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the allocation of one active redundancy when it differs depending on the component with which it is to be allocated and compared it with two active redundancies in two different ways.
Abstract: An effective way of improving the reliability of a system is the allocation of active redundancies. Let X1, X2 be s-independent lifetimes of the components C1 and C2, respectively, which form a series system. Let us denote U1 = min(max(X1, X), X2) and U2 = min(X1, max(X2, X)), where X is the lifetime of a redundancy (say R) s-independent of X1 and X2. That is, U1 (U2) denotes the lifetime of a system obtained by allocating R to C1 (C2) as an active redundancy. Singh and Misra (1994) considered the criterion where C1 is preferred to C2 for the allocation of R as active redundancy if P(U1 > U2) ≥ P(U2 > U1). In this paper, we use the same criterion of Singh and Misra (1994). We investigate the allocation of one active redundancy when it differs depending on the component with which it is to be allocated. We also compare the allocation of two active redundancies (say R1 and R2) in two different ways; that is, R1 with C1 & R2 with C2, and vice versa. For this case, the hazard rate order plays an important role. We furthermore consider the allocation of active redundancy to k-out-of-n:G systems.
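
The comparison criterion P(U1 > U2) versus P(U2 > U1) is easy to estimate by simulation for specific lifetime distributions; the exponential rates below are arbitrary, and the simulation is only a numerical illustration of the criterion, not the paper's analytical results.

```python
import random

def simulate(rate1, rate2, rate_r, n=200_000, seed=7):
    """Estimate P(U1 > U2) and P(U2 > U1) for a two-component series system,
    where U1 allocates the redundancy R to component C1 and U2 allocates it to C2:
        U1 = min(max(X1, X), X2),   U2 = min(X1, max(X2, X)).
    Lifetimes are s-independent exponentials with the given (illustrative) rates."""
    rng = random.Random(seed)
    u1_wins = u2_wins = 0
    for _ in range(n):
        x1 = rng.expovariate(rate1)
        x2 = rng.expovariate(rate2)
        x = rng.expovariate(rate_r)
        u1 = min(max(x1, x), x2)
        u2 = min(x1, max(x2, x))
        if u1 > u2:
            u1_wins += 1
        elif u2 > u1:
            u2_wins += 1
    return u1_wins / n, u2_wins / n

# In this illustrative run, C1 is the weaker component (higher failure rate).
p1, p2 = simulate(rate1=2.0, rate2=1.0, rate_r=1.5)
print("P(U1 > U2) = %.3f,  P(U2 > U1) = %.3f" % (p1, p2))
```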

Journal ArticleDOI
TL;DR: It is shown that if the subsystem being analyzed is in series with the rest of the system, then the optimal n which maximizes subsystem reliability can also maximize the system reliability, and the computational procedure of the proposed algorithms is illustrated through the examples.
Abstract: Systems subjected to imperfect fault-coverage may fail even prior to the exhaustion of spares due to uncovered component failures. This paper presents optimal cost-effective design policies for k-out-of-n:G subsystems subjected to imperfect fault-coverage. It is assumed that there exists a k-out-of-n:G subsystem in a nonseries-parallel system and, except for this subsystem, the redundancy configurations of all other subsystems are fixed. This paper also presents optimal design polices which maximize overall system reliability. As a special case, results are presented for k-out-of-n:G systems subjected to imperfect fault-coverage. Examples then demonstrate how to apply the main results of this paper to find the optimal configurations of all subsystems simultaneously. In this paper, we show that the optimal n which maximizes system reliability is always less than or equal to the n which maximizes the reliability of the subsystem itself. Similarly, if the failure cost is the same, then the optimal n which minimizes the average system cost is always less than or equal to the n which minimizes the average cost of the subsystem. It is also shown that if the subsystem being analyzed is in series with the rest of the system, then the optimal n which maximizes subsystem reliability can also maximize the system reliability. The computational procedure of the proposed algorithms is illustrated through the examples.
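
A simple way to see why an optimal n exists under imperfect fault-coverage is the textbook single-fault-coverage model: each component failure is covered with probability c, and any uncovered failure brings the subsystem down regardless of remaining redundancy. This is a simplified assumption used only for illustration (not necessarily the paper's exact coverage model), with invented parameter values.

```python
from math import comb

def k_out_of_n_coverage(n, k, p, c):
    """Reliability of a k-out-of-n:G subsystem when each component works with
    probability p and each component failure is covered with probability c
    (an uncovered failure fails the whole subsystem). Simplified illustrative model."""
    q = 1.0 - p
    return sum(comb(n, f) * p ** (n - f) * (q * c) ** f for f in range(0, n - k + 1))

k, p, c = 2, 0.9, 0.95
for n in range(k, k + 8):
    print(f"n = {n}: subsystem reliability = {k_out_of_n_coverage(n, k, p, c):.5f}")
# With imperfect coverage, reliability rises and then falls as n grows for these
# values, so a finite optimal number of components exists (the point made above).
```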

Journal ArticleDOI
TL;DR: A detailed error analysis, based on an alternative configuration of the same data, reveals a serious flaw in this type of data which hinders masquerade detection and indicates some steps that need to be taken to improve future results.
Abstract: A masquerade attack, in which one user impersonates another, may be one of the most serious forms of computer abuse. Automatic discovery of masqueraders is sometimes undertaken by detecting significant departures from normal user behavior, as represented by a user profile formed from system audit data. A major obstacle for this type of research is the difficulty in obtaining such system audit data, largely due to privacy concerns. An immense contribution in this regard has been made by Schonlau et al., who have made available UNIX command-line data from 50+ users collected over a number of months. Most of the research in this area has made use of this dataset, so this paper takes as its point of departure the Schonlau et al. dataset and a recent series of experiments with this data framed by the same researchers. In extending that work with a new classification algorithm, a 56% improvement in masquerade detection was achieved at a corresponding false-alarm rate of 1.3%. In addition, encouraging results were obtained at a more realistic sequence length of 10 commands (as opposed to sequences of 100 commands used by Schonlau et al.). A detailed error analysis, based on an alternative configuration of the same data, reveals a serious flaw in this type of data which hinders masquerade detection and indicates some steps that need to be taken to improve future results. The error analysis also demonstrates the insights that can be gained by inspecting decision errors, instead of concentrating only on decision successes.

Journal ArticleDOI
TL;DR: A multi-objective optimization approach, based on genetic algorithms, which transparently and explicitly accounts for the uncertainties in the parameters is proposed, providing the decision maker with a useful tool for determining those solutions which allow a high degree of assurance in the actual system performance.
Abstract: The determination of optimal Surveillance Test Intervals (STI) is a matter of great importance in risk-informed applications, due both to the implications that maintenance actions have on the safety of risky plants such as nuclear ones, and to the important amount of resources invested in maintenance operations by industrial organizations. The common approach to determining the optimal STI uses a simplified system-availability model relying on a set of parameters at the component level (failure rate, repair rate, frequency of failure on demand, human error rate, inspection duration, etc.), whose values are typically estimated on the basis of few, sparse data, and can suffer from appreciable uncertainties. Thus, the prediction of the system behavior on the basis of the parameters' best estimates is scarcely significant, if not accompanied by some measure of the associated uncertainty, such as the variance. This paper proposes a multi-objective optimization approach, based on genetic algorithms, which transparently and explicitly accounts for the uncertainties in the parameters. The objectives considered (the inverse of the s-expected system failure probability and the inverse of its variance) are such as to drive the genetic search toward solutions which are guaranteed to give optimal performance with high assurance. For validation purposes, a simple case study regarding the optimization of the layout of a pipeline is first presented. The procedure is then applied to a more complex system taken from the literature, the Residual Heat Removal safety system of a Boiling Water Reactor, for determining the optimal STI of the system components. The approach provides the decision maker with a useful tool for determining those solutions which, besides being optimal with respect to the s-expected safety behavior, allow a high degree of assurance in the actual system performance.

Journal ArticleDOI
TL;DR: Four models for optimizing the reliability of embedded systems considering both software and hardware reliability under cost constraints, and one model to optimize system cost under multiple reliability constraints are presented.
Abstract: Summary and Conclusions: This paper presents four models for optimizing the reliability of embedded systems considering both software and hardware reliability under cost constraints, and one model to optimize system cost under multiple reliability constraints. Previously, most optimization models have been developed for hardware-only or software-only systems by assuming the hardware, if any, has perfect reliability. In addition, they assume that failures for each hardware or software unit are statistically independent. In other words, none of the existing optimization models were developed for embedded systems (hardware and software) with failure dependencies. For our work, each of our models is suitable for a distinct set of conditions or situations. The first four models maximize reliability while meeting cost constraints, and the fifth model minimizes system cost under multiple reliability constraints. This is the first time that optimization of these kinds of models has been performed on this type of system. We demonstrate and validate our models for an embedded system with multiple applications sharing multiple resources. We use a Simulated Annealing optimization algorithm to demonstrate our system reliability optimization techniques for distributed systems, because of its flexibility for various problem types with various constraints. It is efficient, and provides satisfactory optimization results while meeting difficult-to-satisfy constraints.
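
A generic simulated annealing loop of the kind mentioned above can be sketched for a toy redundancy-versus-cost problem; the design space, penalty handling, and annealing schedule are illustrative assumptions, not the authors' embedded-system models.

```python
import math
import random

# Toy design space: choose a redundancy level (1..4) for each of three units.
REL = [0.90, 0.85, 0.95]        # single-copy reliabilities (illustrative)
COST = [2.0, 3.0, 4.0]          # cost per copy (illustrative)
BUDGET = 20.0

def reliability(x):
    r = 1.0
    for p, n in zip(REL, x):
        r *= 1.0 - (1.0 - p) ** n
    return r

def cost(x):
    return sum(c * n for c, n in zip(COST, x))

def objective(x):
    # heavy penalty for violating the budget keeps the search in feasible space
    return reliability(x) - (100.0 if cost(x) > BUDGET else 0.0)

def neighbor(x, rng):
    y = list(x)
    i = rng.randrange(len(y))
    y[i] = min(4, max(1, y[i] + rng.choice([-1, 1])))
    return y

def simulated_annealing(t0=1.0, alpha=0.995, steps=5000, seed=3):
    rng = random.Random(seed)
    x = best = [1, 1, 1]
    t = t0
    for _ in range(steps):
        y = neighbor(x, rng)
        delta = objective(y) - objective(x)
        if delta >= 0 or rng.random() < math.exp(delta / t):
            x = y                       # accept improving or occasionally worsening moves
        if objective(x) > objective(best):
            best = x
        t *= alpha                      # geometric cooling
    return best

x = simulated_annealing()
print("design:", x, "reliability: %.4f" % reliability(x), "cost:", cost(x))
```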

Journal ArticleDOI
TL;DR: In this paper, an optimal allocation problem is studied for a monotonic failure rate repairable system under some resource constraints and the optimal allocation algorithm is derived.
Abstract: The effect of a repair of a complex system can usually be approximated by the following two types: minimal repair for which the system is restored to its functioning state with minimum effort, or perfect repair for which the system is replaced or repaired to a good-as-new state. When both types of repair are possible, an important problem is to determine the repair policy; that is, the type of repair which should be carried out after a failure. In this paper, an optimal allocation problem is studied for a monotonic failure rate repairable system under some resource constraints. In the first model, the numbers of minimal & perfect repairs are fixed, and the optimal repair policy maximizing the expected system lifetime is studied. In the second model, the total amount of repair resource is fixed, and the costs of each minimal & perfect repair are assumed to be known. The optimal allocation algorithm is derived in this case. Two numerical examples are shown to illustrate the procedures.

Journal ArticleDOI
TL;DR: A generalized stress-strength interference (SSI) reliability model to consider stochastic loading and strength aging degradation is presented and a numerical recurrence formula is presented based on the Gauss-Legendre quadrature formula to calculate multiple integrations of a random variable vector.
Abstract: A generalized stress-strength interference (SSI) reliability model to consider stochastic loading and strength aging degradation is presented in this paper. This model conforms to previous models for special cases, but also demonstrates the weakness of those models when multiple stochastic elements exist. It can be used for any nonhomogeneous Poisson loading process, and any kind of strength aging degradation model. To solve the SSI reliability equation, a numerical recurrence formula is presented based on the Gauss-Legendre quadrature formula to calculate multiple integrations of a random variable vector. Numerical analysis of three examples shows this SSI reliability model provides accurate results for both homogeneous & nonhomogeneous Poisson loading processes.
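
The quadrature idea can be illustrated on the simplest static stress-strength interference integral, R = ∫ f_stress(x) P(strength > x) dx, evaluated with Gauss-Legendre nodes and checked against the closed form for two s-independent normal variables. This is a much-reduced illustration; the paper's model with stochastic loading and strength degradation involves multiple integrations.

```python
import math
import numpy as np

# Static stress-strength interference with normal stress and strength
# (distributions and parameter values assumed purely for illustration).
MU_STRESS, SD_STRESS = 50.0, 5.0
MU_STRENGTH, SD_STRENGTH = 70.0, 8.0

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normal_sf(x, mu, sd):
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2.0)))

def ssi_reliability(n_nodes=40):
    """Gauss-Legendre quadrature of R = integral f_stress(x) * P(strength > x) dx
    over a finite window covering essentially all of the stress distribution."""
    a, b = MU_STRESS - 8 * SD_STRESS, MU_STRESS + 8 * SD_STRESS
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    x = 0.5 * (b - a) * nodes + 0.5 * (b + a)      # map [-1, 1] -> [a, b]
    w = 0.5 * (b - a) * weights
    vals = np.array([normal_pdf(xi, MU_STRESS, SD_STRESS) *
                     normal_sf(xi, MU_STRENGTH, SD_STRENGTH) for xi in x])
    return float(np.sum(w * vals))

# Closed form for two s-independent normals: R = Phi((mu_strength - mu_stress) / sqrt(sd1^2 + sd2^2)).
exact = 0.5 * math.erfc(-(MU_STRENGTH - MU_STRESS) /
                        (math.sqrt(SD_STRENGTH ** 2 + SD_STRESS ** 2) * math.sqrt(2.0)))
print("quadrature: %.6f   closed form: %.6f" % (ssi_reliability(), exact))
```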

Journal ArticleDOI
TL;DR: The results indicate that the Chi square distance measure with the EWMA forecasting provides better performance in intrusion detection than that with the average-based forecasting method.
Abstract: Intrusions into computer systems have caused many quality/reliability problems. Detecting intrusions is an important part of assuring the quality/reliability of computer systems by quickly detecting intrusions and associated quality/reliability problems in order to take corrective actions. In this paper, we present and compare two methods of forecasting normal activities in computer systems for intrusion detection. One forecasting method uses the average of long-term normal activities as the forecast. Another forecasting method uses the EWMA (exponentially weighted moving average) one-step-ahead forecast. We use a Markov chain model to learn and predict normal activities used in the EWMA forecasting method. A forecast of normal activities is used to detect a large deviation of the observed activities from the forecast as a possible intrusion into computer systems. A Chi square distance metric is used to measure the deviation of the observed activities from the forecast of normal activities. The two forecasting methods are tested on computer audit data of normal and intrusive activities for intrusion detection. The results indicate that the Chi square distance measure with the EWMA forecasting provides better performance in intrusion detection than that with the average-based forecasting method.
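
A minimal sketch of the EWMA-forecast-plus-chi-square-distance scheme on invented windows of event counts; the smoothing constant, event categories, and alarm threshold are illustrative assumptions.

```python
# EWMA forecast of event-type frequencies plus a chi-square distance detector.
LAMBDA = 0.2        # EWMA smoothing constant (assumed)
THRESHOLD = 10.0    # alarm threshold on the chi-square distance (assumed)
EPS = 1e-6          # avoid division by zero for rare categories

def ewma_update(forecast, observed):
    """One-step-ahead EWMA forecast of the per-category event frequencies."""
    return [LAMBDA * o + (1.0 - LAMBDA) * f for f, o in zip(forecast, observed)]

def chi_square_distance(observed, expected):
    return sum((o - e) ** 2 / (e + EPS) for o, e in zip(observed, expected))

# Windows of event counts over 4 invented audit-event categories.
windows = [
    [30, 10, 5, 1],     # normal
    [28, 12, 4, 2],     # normal
    [31,  9, 6, 1],     # normal
    [ 2,  1, 3, 60],    # anomalous burst in the last category
]

forecast = windows[0][:]            # initialize the norm profile
for t, obs in enumerate(windows[1:], start=1):
    d = chi_square_distance(obs, forecast)
    print(f"window {t}: chi-square distance = {d:.2f}  alarm = {d > THRESHOLD}")
    forecast = ewma_update(forecast, obs)
```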

Journal ArticleDOI
TL;DR: The proposed statistical methodology is especially useful when intermittent inspection is the only feasible way of checking the status of test units during a step-stress test.
Abstract: This paper studies statistical analysis of grouped and censored data obtained from a step-stress accelerated life test. We assume that the stress change times in the step-stress life test are fixed, and the lifetimes observed are type I censored. Maximum likelihood estimates and asymptotic confidence intervals for model parameters are obtained. We provide an asymptotic statistical test for the cumulative exposure model based on the grouped and type I censored data. We also present the optimum test plan for a simple step-stress test when the lifetime under constant stress is assumed exponential. Finally, we give an application of our methods by applying our analysis process to a real life data set. The proposed statistical methodology is especially useful when intermittent inspection is the only feasible way of checking the status of test units during a step-stress test.

Journal ArticleDOI
TL;DR: A goodness-of-fit test for the location-scale family based on progressively Type-II censored data is proposed, which has good power properties in detecting departures from the s-normal and Gumbel distributions.
Abstract: There has been extensive research on goodness-of-fit procedures for testing whether or not a sample comes from a specified distribution. These goodness-of-fit tests range from graphical techniques, to tests which exploit characterization results for the specified underlying model. In this article, we propose a goodness-of-fit test for the location-scale family based on progressively Type-II censored data. The test statistic is based on sample spacings, and generalizes a test procedure proposed by Tiku. The null distribution of the test statistic is shown to be approximated closely by an s-normal distribution. However, in certain situations it would be better to use simulated critical values instead of the s-normal approximation. We examine the performance of this test for the s-normal and extreme-value (Gumbel) models against different alternatives through Monte Carlo simulations. We also discuss two methods of power approximation based on s-normality, and compare the results with those obtained by simulation. Results of the simulation study for a wide range of sample sizes, censoring schemes, and different alternatives reveal that the proposed test has good power properties in detecting departures from the s-normal and Gumbel distributions. Finally, we illustrate the method proposed here using real data from a life-testing experiment. It is important to mention here that this test can be extended to multi-sample situations in a manner similar to that of Balakrishnan et al.

Journal ArticleDOI
TL;DR: Type I progressively censored variable-sampling plans for Weibull lifetime distributions under competing causes of failure are discussed and the proposed procedure is attractive as it yields useful degradation-related information for improving product quality.
Abstract: For many high reliability products where very few items are expected to fail during the test period, testing under normal conditions is not feasible. Further, the requirement for high reliability increases the need for test procedures which yield valuable degradation and other useful information for improving product reliability. Thus in some manufacturing and other experiments, various types of failure censored and accelerated life tests are commonly employed for life testing. In this paper we discuss Type I progressively censored variable-sampling plans for Weibull lifetime distributions under competing causes of failure. The proposed procedure is attractive as it yields useful degradation-related information for improving product quality. In addition, the procedure is useful when a test is conducted under severe time constraint and/or when the experimenter wishes to save costly specimens or scarce test facilities for other use.

Journal ArticleDOI
TL;DR: A technique designated as the Path Tracing Algorithm is presented, which can handle both simple and complex networks, and considers both unidirectional and bi-directional branches, and does not require limits on the size of the network.
Abstract: Network reliability analysis is usually based on minimal path or cut enumeration from which the associated reliability expressions are deduced. The cut-set method is a popular approach in the reliability analysis of many systems from simple to complex configurations. The computational requirements necessary to determine the minimal cut-sets of a network depend mainly on the number of minimal paths between the source and the sink. A technique designated as the "Path Tracing Algorithm" is presented in this paper, which can handle both simple and complex networks, and considers both unidirectional and bi-directional branches. A step by step procedure is explained using a bridge-network. The algorithm is easy to program, and does not require limits on the size of the network. The applicability of the proposed technique is illustrated by application to a more complicated system.
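
Minimal path enumeration on the bridge network mentioned above can be done with a simple depth-first search over simple paths; this generic enumeration is shown only to make the idea concrete and is not necessarily the paper's Path Tracing Algorithm. Edge data are illustrative, and a directed branch can be modeled by setting its bidirectional flag to False.

```python
# Minimal path enumeration by depth-first search on the classic bridge network.
# Edges are (name, endpoint_a, endpoint_b, bidirectional); data are illustrative.
EDGES = [
    ("e1", "S", "A", True),
    ("e2", "S", "B", True),
    ("e3", "A", "B", True),   # the bridge branch
    ("e4", "A", "T", True),
    ("e5", "B", "T", True),
]

def minimal_paths(source, sink):
    """Enumerate all simple paths (no repeated nodes) from source to sink,
    returned as lists of edge names; for a network these are the minimal paths."""
    adj = {}
    for name, a, b, bidir in EDGES:
        adj.setdefault(a, []).append((name, b))
        if bidir:
            adj.setdefault(b, []).append((name, a))
    paths = []

    def dfs(node, visited, used_edges):
        if node == sink:
            paths.append(list(used_edges))
            return
        for name, nxt in adj.get(node, []):
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, used_edges + [name])

    dfs(source, {source}, [])
    return paths

for p in minimal_paths("S", "T"):
    print(p)
# The bridge network yields four minimal paths (in some order):
# {e1, e4}, {e2, e5}, {e1, e3, e5}, {e2, e3, e4}
```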

Journal ArticleDOI
TL;DR: A new model which generalizes the linear consecutive k-out-of-r-from-n system to the case of multiple failure criteria is proposed and an algorithm for system reliability evaluation is suggested which is based on an extended universal moment generating function.
Abstract: This paper proposes a new model which generalizes the linear consecutive k-out-of-r-from-n system to the case of multiple failure criteria. In this model, the system consists of n linearly ordered elements, and fails iff in any group of r1, ..., rH consecutive elements fewer than k1, ..., kH elements, respectively, are in a working state. An algorithm for system reliability evaluation is suggested which is based on an extended universal moment generating function. Examples of system reliability evaluation are presented.
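
For small n, the multiple-failure-criteria rule can be checked directly by brute-force enumeration, which is useful for validating results on toy cases; the element reliabilities and (r_h, k_h) pairs below are invented, and the paper's universal moment generating function algorithm is what makes larger systems tractable.

```python
from itertools import product

# Linear consecutive system with multiple failure criteria (illustrative data):
# the system fails iff, for some criterion h, some window of R[h] consecutive
# elements contains fewer than K[h] working elements.
P = [0.9, 0.8, 0.95, 0.85, 0.9, 0.8]   # element reliabilities (invented)
R = [3, 4]                              # window lengths r_1, r_2
K = [2, 3]                              # required working counts k_1, k_2

def system_works(state):
    n = len(state)
    for r, k in zip(R, K):
        for start in range(n - r + 1):
            if sum(state[start:start + r]) < k:
                return False
    return True

def reliability_brute_force():
    """Exact reliability by enumerating all 2^n element states (small n only)."""
    rel = 0.0
    for state in product((0, 1), repeat=len(P)):
        prob = 1.0
        for works, p in zip(state, P):
            prob *= p if works else 1.0 - p
        if system_works(state):
            rel += prob
    return rel

print("system reliability: %.5f" % reliability_brute_force())
```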

Journal ArticleDOI
Jun Bai, Hoang Pham
TL;DR: The approach incorporates the information of system structure, the value of time and the impact of repair actions, which are of great importance to warranty cost prediction and evaluation, but have not been sufficiently studied in the literature of warranty analysis.
Abstract: Many factors should be considered in modeling DWC (discounted warranty cost) of repairable systems or products including system structure, components' failure processes, methods of discounting as well as the warranty policy itself. In this paper, we present DWC models for repairable series systems. In particular, a free repair warranty policy and a pro-rata warranty policy are studied. The impact of repair actions on components' failure times is assumed to be minimal; hence nonhomogeneous Poisson processes (NHPPs) are used to describe the failure processes. Two types of discounting methods are considered in this paper: a continuous discount function and a discrete discount function. Expressions for both the expected value and variance of DWC are derived. The applications of our findings can be seen in warranty design, warranty reserve determination and risk analysis. Our approach incorporates the information of system structure, the value of time and the impact of repair actions, which are of great importance to warranty cost prediction and evaluation, but have not been sufficiently studied in the literature on warranty analysis.
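
For the free-repair (minimal-repair) policy, a standard property of Poisson processes gives the expected discounted warranty cost as a discounted integral of the failure intensities; the sketch below evaluates that integral numerically for a two-component series system with power-law intensities. All parameter values are invented, and the paper's variance expressions and pro-rata policy are not reproduced.

```python
import math

# Expected discounted warranty cost (DWC) under a free-repair (minimal-repair)
# policy for a series system. With minimal repair, component j's failures form
# an NHPP with intensity lam_j(t); with continuous discounting at rate delta,
#   E[DWC] = sum_j c_j * integral_0^W exp(-delta * t) * lam_j(t) dt.
# Power-law intensities and all numbers below are illustrative assumptions.
COMPONENTS = [
    {"beta": 1.5, "eta": 4.0, "repair_cost": 20.0},
    {"beta": 2.0, "eta": 6.0, "repair_cost": 35.0},
]
WARRANTY = 2.0     # warranty length W (years)
DELTA = 0.08       # continuous discount rate

def intensity(comp, t):
    """Power-law (Weibull-type) NHPP intensity: lam(t) = (beta/eta) * (t/eta)^(beta-1)."""
    return comp["beta"] / comp["eta"] * (t / comp["eta"]) ** (comp["beta"] - 1.0)

def expected_dwc(n_steps=2000):
    """Midpoint-rule numerical integration of the discounted failure intensities."""
    dt = WARRANTY / n_steps
    total = 0.0
    for comp in COMPONENTS:
        acc = 0.0
        for i in range(n_steps):
            t = (i + 0.5) * dt
            acc += math.exp(-DELTA * t) * intensity(comp, t) * dt
        total += comp["repair_cost"] * acc
    return total

print("expected discounted warranty cost: %.2f" % expected_dwc())
```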

Journal ArticleDOI
TL;DR: It is shown that the presence of crosstalk faults at voter inputs may impair both the voting and the diagnosis mechanisms; this problem is quantified by applying a probabilistic model of crosstalk fault effects on voting and diagnosis to a set of benchmark circuits.
Abstract: In high reliability systems, the effectiveness of fault tolerant techniques, such as Triple-Modular-Redundancy (TMR), must be validated with respect to the faults that are likely in the current technology. In today's Integrated Circuits (IC), this is the case of crosstalk faults, whose importance is growing because of device & interconnect scaling. This paper analyzes the problem of crosstalk faults at the inputs of voters in TMR systems. The problem is analyzed from a probabilistic point of view: a probabilistic model of the voting & diagnosis operations in the presence of crosstalk has been developed, and it has been used to estimate the probability of voting & diagnosis failures in a set of TMR systems obtained by using combinational benchmarks as functional modules. We have shown that the presence of crosstalk faults at voter inputs may impair both the voting and the diagnosis mechanisms, and we have quantified this problem by applying the probabilistic model of crosstalk fault effects on voting and diagnosis to a set of benchmark circuits. Results show that crosstalk may create a reliability problem for TMR systems. Such a problem can be solved by using on-line testing or design for testability providing additional controllability & observability to the replicated functional units.

Journal ArticleDOI
TL;DR: Under the proposed definition, multi-state systems are divided into two groups without reference to component relevancy conditions: dominant systems, and nondominant systems.
Abstract: In this paper, we propose a definition of the dominant multi-state system. Under the proposed definition, multi-state systems are divided into two groups without reference to component relevancy conditions: dominant systems, and nondominant systems. Dominant systems can be further divided into two groups: with binary image, and without binary image. A multi-state system with binary image implies that its structure function can be expressed in terms of binary structure functions such that it can be treated as a binary system structure, and existing algorithms for reliability evaluation of binary systems can be applied for system performance evaluation. A technique is provided for establishing the bounds of performance distribution of dominant systems without binary image. The properties of dominant systems are studied. Examples are given to illustrate applications of the proposed definitions and methods.

Journal ArticleDOI
TL;DR: The TDD-based separable approach is presented, and compared with existing methods for analyzing the GPMS reliability, and an example generalized phased-mission system is analyzed to illustrate the advantages of the approach.
Abstract: This paper considers the reliability analysis of a generalized phased-mission system (GPMS) with two-level modular imperfect coverage. Due to the dynamic behavior & the statistical dependencies, generalized phased-mission systems offer big challenges in reliability modeling & analysis. A new family of decision diagrams called ternary decision diagrams (TDD) is proposed for use in the resulting separable approach to the GPMS reliability evaluation. Compared with existing methods, the accuracy of our solution increases due to the consideration of modular imperfect coverage; the computational complexity decreases due to the nature of the TDD, and the separation of mission imperfect coverage from the solution combinatorics. In this paper, the TDD-based separable approach is presented, and compared with existing methods for analyzing the GPMS reliability. An example generalized phased-mission system is analyzed to illustrate the advantages of our approach.