Proceedings Article

A decision support model for determining the applicability of prognostic health management (PHM) approaches to electronic systems

04 Apr 2005-pp 422-427
TL;DR: A model that enables determining on an application-specific basis when the reliability of electronics has become predictable enough to warrant the application of PHM-based scheduled maintenance concepts and the determination of optimal safety margins on life consumption monitoring predictions and prognostic distances for health monitoring is presented.
Abstract: This paper presents a model that enables the determination of when scheduled maintenance makes sense, and how to optimally interpret prognostic health management (PHM) results for electronic systems. In this context, optimal interpretation of PHM results means translating PHM information into maintenance policies that minimize life cycle costs. The electronics PHM problem is characterized by imperfect and partial monitoring, and a significant random/overstress failure component must be considered in the decision process. Specifically, the model enables determining on an application-specific basis when the reliability of electronics has become predictable enough to warrant the application of PHM-based scheduled maintenance concepts. Given that the forecasting ability of PHM (whether health monitoring or life consumption monitoring based) is fraught with uncertainties in the sensor data collected, the models applied, the material parameters assumed in the models, etc., the model in this paper addresses how PHM results can be interpreted so as to provide value to the system. The result of the model is the determination of optimal safety margins on life consumption monitoring predictions and prognostic distances for health monitoring. The model provides the type of information needed to construct a business case showing the application-specific usefulness of health monitoring and/or life consumption monitoring for electronic systems.

Summary (3 min read)

1. INTRODUCTION

  • Prognostics is the estimation of remaining life in terms that are useful to the maintenance decision process.
  • Most approaches to PHM are focused on monitoring failure precursor indications (i.e., health monitoring), which does not require system failures to be deterministic in nature, but does require that the failure precursor have a deterministic link to the actual system failure.
  • Modeling to determine the optimum schedule for performing maintenance for systems is not a new concept.
  • Maintenance modeling has not been widely applied to electronic systems where presumed random electronics failure is usually modeled as an unscheduled maintenance activity, and wear-out is beyond the end of the system’s life.
  • This boils down to determining optimal safety margins on life consumption monitoring predictions and prognostic distances for health monitoring.

2. MODEL FORMULATION

  • The following model accommodates variable time-to-failure and LCM forecast distributions.
  • The model considers only one LRU (Line Replaceable Unit) within a larger system.
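  • To assess PHM, relevant failure mechanisms must be segregated into two types: failure mechanisms that are random from the viewpoint of the PHM methodology, and failure mechanisms that are predictable to some degree from the viewpoint of the PHM methodology.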
  • Note, the model formulation is presented based on “time” to failure measured in operational hours, however, the relevant quantity could be a non-time measure such as thermal cycles.
  • Example results generated using all the approaches discussed in this section are presented in Section 3.

2.1 Fixed Scheduled Maintenance Interval

  • This case is well understood, but included herein because it serves to define the general approach that will be used for assessing health monitoring and life consumption monitoring.
  • In this case a fixed scheduled maintenance interval is selected that is kept constant for all instances of the LRU throughout the system life cycle.
  • The following algorithm is used to accumulate life cycle costs (C) based on time stepping through the lifetimes of a statistically relevant set of LRUs, where T is time: (1) define Time-to-Failure (TTF) distributions (subscript R = random, subscript P = predictable); (2) sample the TTF distributions to get TTF_R and TTF_P.
  • Repeat steps 2-5 until T exceeds the operation and support life of the system.
  • To model the random failure rate, a uniform distribution with a height equal to the average random failure rate per year and a width equal to the inverse of the average random failure rate is created and sampled to get TTFR.

2.2 Life Consumption Monitoring (LCM)

  • Life Consumption Monitoring is defined in this paper as the process by which a history of environmental stresses (e.g., thermal, vibration) is used in conjunction with physics of failure models to compute damage accumulated and thereby forecast life remaining.
  • For this example, the LCM forecast was modeled as a symmetric triangular distribution with a most likely value set to the time-to-failure of the LRU instance and a fixed width measured in operational hours, (Fig. 2).
  • The LCM distribution is then sampled and if the LCM sample minus the safety margin is less than the actual time-to-failure of the LRU instance then LCM was successful (failure avoided).
  • If successful, a scheduled maintenance activity is performed and the timeline is incremented by the LCM sampled time-to-failure minus the safety margin.
  • The histograms allow us to choose safety margins that minimize mean life cycle cost or alternatively minimize the cumulative life cycle cost of all units sampled.

2.3 Health Monitoring (HM)

  • Health monitoring is defined in this paper as monitoring for failure precursors.
  • If the health monitoring distribution sample is greater than the actual time-to-failure of the LRU instance then health monitoring was unsuccessful.
  • If successful, a scheduled maintenance activity is performed and the timeline is incremented by the health monitoring sampled time-to-failure.
  • For both the LCM and HM approaches, the performance of the PHM methodology is modeled as a probability distribution taking into account uncertainties embedded in the methodology, sensors, models, etc.

3. MODEL RESULTS

  • All of the variable inputs to the model can be treated as probability distributions or as fixed values; however, for example purposes, only the time to failure and performance of the PHM methodologies have been characterized by probability distributions.
  • For long scheduled maintenance intervals virtually every instance of the LRU fails prior to the scheduled maintenance activity and the life cycle cost per unit becomes equivalent to a simple unscheduled maintenance model.
  • The authors have also found that the effective life cycle cost is very sensitive to the width of the LCM forecasted time-to-failure distribution for the specific LRU (one case with a 2000 hour width is shown in Fig. 5).
  • Figure 6 shows example results from a health monitoring solution for the example described in Table I. Figure 6 indicates that the health monitoring based methodology yields lower life cycle costs than either unscheduled maintenance or scheduled maintenance with a fixed interval.
  • As the safety margin or prognostic distance increases (or the fixed scheduled maintenance interval decreases), the failures avoided approach 100% in all cases (with and without random failures included).

4. DISCUSSION

  • Previous PHM work on electronic systems has demonstrated life consumption monitoring for electronic systems, [4].
  • Such an analysis becomes non-trivial when one considers the accuracy that life consumption monitoring results are likely to have (imperfect and partial monitoring conditions).
  • The 1990’s perfected obtaining and storing large amounts of information, and as a result, the world is wading in a lot more information than it knows how to use.
  • The single LRU model presented in this paper is being extended to treat multiple LRUs.
  • Realistic PHM solutions for electronic systems will probably be mixtures of LCM, HM and scheduled maintenance.
  • Treatment of redundancy.
  • Second order uncertainty (uncertainty about uncertainty) may be a real issue in the treatment of this problem.


A Decision Support Model for Determining the Applicability of
Prognostic Health Management (PHM) Approaches to Electronic
Systems
Peter Sandborn, CALCE, Dept. of Mechanical Eng., University of Maryland
Key Words: prognostic health management (PHM), decision support, cost modeling, imperfect monitoring
SUMMARY & CONCLUSIONS
This paper presents a model that enables the
determination of when scheduled maintenance makes sense,
and how to optimally interpret Prognostic Health Management
(PHM) results for electronic systems. In this context, optimal
interpretation of PHM results means translating PHM
information into maintenance policies that minimize life cycle
costs. The electronics PHM problem is characterized by
imperfect and partial monitoring, and a significant
random/overstress failure component must be considered in
the decision process. Specifically, the model enables
determining on an application-specific basis when the
reliability of electronics has become predictable enough to
warrant the application of PHM-based scheduled maintenance
concepts. Given that the forecasting ability of PHM (whether
health monitoring or life consumption monitoring based) is
fraught with uncertainties in the sensor data collected, the
models applied, the material parameters assumed in the
models, etc., the model in this paper addresses how PHM
results can be interpreted so as to provide value to the system.
The result of the model is the determination of optimal safety
margins on life consumption monitoring predictions and
prognostic distances for health monitoring.
The model provides the type of information needed to
construct a business case showing the application-specific
usefulness of health monitoring and/or life consumption
monitoring for electronic systems.
1. INTRODUCTION
Prognostics is the estimation of remaining life in terms
that are useful to the maintenance decision process. All PHM
approaches are essentially the extrapolation of trends based on
recent observations to estimate remaining life, [1].
Unfortunately, this calculation alone does not provide
sufficient information to form a decision or to determine
corrective action. Without comprehending the corresponding
measures of the uncertainty associated with the calculation,
remaining life projections have little practical value, [1]. It is
the comprehension of the corresponding uncertainties
(decision making under uncertainty) that is at the heart of
being able to develop a business case that addresses prognostic
requirements.
Electronic systems have not traditionally been subject to
Prognostic Health Management (PHM) because their time to
wear-out was assumed to be much longer than the system life
cycle or technology refresh period (non-life limited). Most
approaches to PHM are focused on monitoring failure
precursor indications (i.e., health monitoring), which does not
require system failures to be deterministic in nature, but does
require that the failure precursor have a deterministic link to
the actual system failure. While there is considerable existing
work on failure precursors for mechanical systems, only a few
attempts have been made to apply health monitoring to
electronics, [2-3]. Alternatively, Life Consumption
Monitoring (LCM), which is another approach to PHM,
depends on the deterministic nature of system failures. In
LCM, a history of environmental stresses (e.g., thermal,
vibration) is used in conjunction with physics of failure
models to compute damage accumulated and thereby forecast
life remaining, [4]. Electronic systems have not traditionally
been subject to LCM because the distribution of failures prior
to wear-out was considered to be random (non-deterministic).
With the transition from military-specification parts to
commercial-off-the-shelf (COTS) parts, many of which are
now targeted for lifetimes in the 5 to 7 year range, wear-out of
electronics parts is becoming a relevant concern for long field
life systems, [5]. In addition, physics of failure approaches to
modeling electronic system reliability have shown that time-
to-failure prior to wear-out for electronic parts can be
predicted within quantifiable bounds of uncertainty, [6].
Modeling to determine the optimum schedule for
performing maintenance for systems is not a new concept.
Examples of traditional applications of maintenance modeling
include production equipment [7] and the hardware portions of
engines and other propulsion systems [8]. However,
maintenance modeling has not been widely applied to
electronic systems where presumed random electronics failure
is usually modeled as an unscheduled maintenance activity,
and wear-out is beyond the end of the system’s life.
Although many applicable models for single and multi-
unit maintenance planning have appeared [9,10], the majority
of the models assume that monitoring information is perfect
(without uncertainty) and complete (all units are monitored the
same), i.e., maintenance planning can be performed with
perfect knowledge as to the state of each unit. For many types
of systems, and especially electronic systems these are not
good assumptions and maintenance planning, if possible at all,
becomes an exercise in decision making under uncertainty
with sparse data. The perfect monitoring assumption is
especially problematic when the PHM approach is Life
Proc. Reliability and Maintainability Symposium (RAMS), Arlington, VA, Jan. 2005

Consumption Monitoring (LCM) because LCM does not
depend on detecting precursors to failure. When managing
electronic systems, system-level and major component
failures are caused by a mixture of failure mechanisms. These
failure mechanisms result from defects, wearout, overstress
conditions, and random system interactions. These types of
failures are a mixture of predictable and partially predictable
events. Thus, for electronics, LCM PHM processes do not
deliver any measures that correspond exactly to the state of the
system. Previous work that treats imperfect monitoring
includes [11] and [12]. Perfect but partial monitoring has
been treated in [13].
This paper presents a model which determines when
scheduled maintenance makes good business sense. The
model shows how to optimally interpret damage accumulation
or failure precursor monitoring data. This will apply to failure
events that appear to be random or appear to be clearly caused
by defects, wearout, or overstress conditions. This optimal
interpretation of the data means that we can optimize system
availability and minimize life cycle cost. Specifically the
model is targeted at addressing the following questions:
  • How do we determine on an application-specific basis
    when the reliability of electronics has become predictable
    enough to warrant the application of PHM-based
    scheduled maintenance concepts? Note, we do not mean
    to imply that predictability in isolation is the criterion for
    PHM vs. non-PHM solutions, e.g., if the system reliability
    is predictable and very reliable, it would not make sense
    to implement a PHM solution.
  • Given that the forecasting ability of PHM (health
    monitoring or life consumption monitoring based) is
    fraught with uncertainties in the sensor data collected, the
    data reduction methods, the models applied, the material
    parameters assumed in the models, etc., how can PHM
    results be interpreted so as to provide value? This boils
    down to determining optimal safety margins on life
    consumption monitoring predictions and prognostic
    distances¹ for health monitoring.
  • How can a business case be constructed to show the
    usefulness of health monitoring and/or life consumption
    monitoring for electronic systems?
This paper describes the formulation of a single LRU
(Line Replaceable Unit) stochastic optimization model.
2. MODEL FORMULATION
The following model accommodates variable time-to-
failure and LCM forecast distributions. The model considers
only one LRU (Line Replaceable Unit) within a larger system.
The model treats all inputs as probability distributions, i.e., a
stochastic analysis is used (Monte Carlo). Various
maintenance interval and PHM approaches are distinguished
by how sampled time-to-failure values are used to model PHM
forecasting distributions. The metrics computed are: life cycle
cost, failures avoided, and operational availability. Nothing
about the modeling is electronic system specific.
¹ The duration (e.g., measured in time or cycles) between the actual failure
and the point where the prognostic structure fails or indicates failure.
To assess PHM, relevant failure mechanisms must be
segregated into two types:
  • Failure mechanisms that are random from the viewpoint
    of the PHM methodology. These are failure mechanisms
    that the PHM methodology is not collecting any
    information about (non-detection events). These failure
    mechanisms may be predictable but are outside the scope
    of the PHM methods applied.
  • Failure mechanisms that are predictable to some degree
    from the viewpoint of the PHM methodology, i.e., for
    which a probability distribution can be assigned.
Several cases are considered in the model that follows: 1)
a fixed scheduled maintenance interval that is kept constant
for all instances of the LRU throughout the system life cycle;
where an “instance” is one particular fielded LRU; 2) a
variable maintenance schedule for the LRU that is based on
inputs from a Life Consumption Monitoring (LCM)
methodology; and 3) a variable maintenance schedule for the
LRU that is based on a Health Monitoring (HM) methodology.
Note that the model formulation is presented based on “time” to
failure measured in operational hours; however, the relevant
quantity could be a non-time measure such as thermal cycles.
Example results generated using all the approaches
discussed in this section are presented in Section 3.
2.1 Fixed Scheduled Maintenance Interval
This case is well understood, but included herein because
it serves to define the general approach that will be used for
assessing health monitoring and life consumption monitoring.
In this case a fixed scheduled maintenance interval is selected
that is kept constant for all instances of the LRU throughout
the system life cycle. In this case the LRU is replaced on a
fixed interval (measured in operational hours), i.e., time-based
prognostics. Consider the simply represented time to failure
distribution shown in Fig. 1.
Fig. 1. Symmetric triangular time to failure distribution
(probability versus time-to-failure (TTF), showing the most
likely value and various widths). Note, the model is not
constrained in any way to working with either symmetric or
triangular distributions; other distributions can be used.

The following algorithm is used to accumulate life cycle
costs (C) based on time stepping through the lifetimes of a
statistically relevant set of LRUs, where T is time:
1. Define Time-to-Failure (TTF) distributions (subscript R
= random, subscript P = predictable)
2. Sample the TTF distributions to get $TTF_R$ and $TTF_P$
3. Compare $TTF_S = \min(TTF_R, TTF_P)$ to a defined fixed
maintenance interval (MI)
4. If $TTF_S \le MI$ then $T = T + TTF_S$ and $C = C + C_{us}$
(subscript us = unscheduled)
5. If $TTF_S > MI$ then $T = T + MI$ and $C = C + C_s$ (subscript s
= scheduled); in this case a failure was avoided
6. Repeat steps 2-5 until T > operation and support life of
the system
7. Repeat steps 1-6 a statistically relevant number of times
in order to build histograms of life cycle costs,
availability, and failures avoided.
The random times-to-failure, $TTF_R$, are characterized by
an average random failure rate per operational period (e.g., 1
year) expressed as a fraction of the total units fielded. To
model the random failure rate, a uniform distribution with a
height equal to the average random failure rate per year and a
width equal to the inverse of the average random failure rate is
created and sampled to get $TTF_R$.
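The fixed-interval algorithm and the uniform random-failure model of this section can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the function names, the event costs `c_us` and `c_s` passed in as plain numbers, and the treatment of a zero random failure rate as "no random failures" are all hypothetical choices made for illustration.

```python
import random


def sample_ttf_p(rng, mode, width):
    """Sample the predictable TTF from a symmetric triangular distribution."""
    return rng.triangular(mode - width / 2.0, mode + width / 2.0, mode)


def sample_ttf_r(rng, rate_per_year, hours_per_year):
    """Sample the random TTF (operational hours) from a uniform distribution
    whose width is the inverse of the average random failure rate per year."""
    if rate_per_year <= 0.0:
        return float("inf")  # assumption: zero rate means no random failures
    return rng.uniform(0.0, 1.0 / rate_per_year) * hours_per_year


def simulate_fixed_interval(rng, mi, ttf_mode, ttf_width, life_hours,
                            c_us, c_s, rate_per_year=0.0,
                            hours_per_year=2500.0):
    """Steps 2-6 for one LRU; returns (cost, failures, failures_avoided)."""
    if mi <= 0.0:
        raise ValueError("maintenance interval must be positive")
    t, cost, failures, avoided = 0.0, 0.0, 0, 0
    while t <= life_hours:                        # step 6
        ttf_p = sample_ttf_p(rng, ttf_mode, ttf_width)             # step 2
        ttf_r = sample_ttf_r(rng, rate_per_year, hours_per_year)
        ttf_s = min(ttf_r, ttf_p)                 # step 3
        if ttf_s <= mi:                           # step 4: unscheduled event
            t += ttf_s
            cost += c_us
            failures += 1
        else:                                     # step 5: scheduled event
            t += mi
            cost += c_s
            avoided += 1
    return cost, failures, avoided


def mean_life_cycle_cost(n_lrus, seed=0, **kwargs):
    """Step 7: average the life cycle cost over many simulated LRUs."""
    rng = random.Random(seed)
    total = sum(simulate_fixed_interval(rng, **kwargs)[0]
                for _ in range(n_lrus))
    return total / n_lrus
```

Calling `mean_life_cycle_cost` for a range of `mi` values and picking the interval with the lowest mean is one way to reproduce the kind of sweep the section describes; availability is not tracked in this sketch.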
2.2 Life Consumption Monitoring (LCM)
Life Consumption Monitoring is defined in this paper as
the process by which a history of environmental stresses (e.g.,
thermal, vibration) is used in conjunction with physics of
failure models to compute damage accumulated and thereby
forecast life remaining.
The LCM methodology forecasts a unique time-to-failure
distribution for each instance of the LRU based on its unique
environmental stress history. For this example, the LCM
forecast was modeled as a symmetric triangular distribution
with a most likely value set to the time-to-failure of the LRU
instance and a fixed width measured in operational hours,
(Fig. 2). The shape and width of LCM distribution depends on
the uncertainties associated with the sensing technologies and
uncertainties in the prediction of the damage accumulated
(data and model uncertainty). The variable to be optimized in
this case is the safety margin assumed on the LCM forecasted
time-to-failure, i.e., the length of time (e.g., in operation
hours) before the LCM forecasted failure the unit should be
replaced. The model proceeds in the following way: for each
time-to-failure distribution sample, an LCM distribution is
created that is centered on the time-to-failure. The LCM
distribution is then sampled and if the LCM sample minus the
safety margin is less than the actual time-to-failure of the LRU
instance then LCM was successful (failure avoided). If the
LCM distribution sample minus the safety margin is greater
than the actual time-to-failure of the LRU instance then LCM
was unsuccessful. If successful, a scheduled maintenance
activity is performed and the timeline is incremented by the
LCM sampled time-to-failure minus the safety margin. If
unsuccessful, an unscheduled maintenance activity is
performed and the timeline is incremented by the actual time-
to-failure of the LRU instance. As with the fixed scheduled
maintenance interval approach, the LCM model is
implemented as a stochastic simulation. In this simulation, a
statistically relevant number of LRUs are considered in order
to construct histograms. The histograms allow us to choose
safety margins that minimize mean life cycle cost or
alternatively minimize the cumulative life cycle cost of all
units sampled. Also similar to the fixed interval case, a
random failure component is superimposed.
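The LCM procedure described in this section can be sketched in Python along the following lines. This is an illustrative sketch only: the function name, the event costs `c_us` and `c_s`, and the `min_step` floor (which keeps the timeline advancing if a forecast minus the safety margin is not positive) are assumptions that do not come from the paper, and the superimposed random failure component is left out for brevity.

```python
import random


def simulate_lcm(rng, safety_margin, lcm_width, ttf_mode, ttf_width,
                 life_hours, c_us, c_s, min_step=1.0):
    """Simulate one LRU under LCM-based maintenance.

    Returns (cost, failures, failures_avoided).
    """
    if min_step <= 0.0:
        raise ValueError("min_step must be positive")
    t, cost, failures, avoided = 0.0, 0.0, 0, 0
    while t <= life_hours:
        # Actual time-to-failure of this LRU instance.
        ttf = rng.triangular(ttf_mode - ttf_width / 2.0,
                             ttf_mode + ttf_width / 2.0, ttf_mode)
        # LCM forecast: symmetric triangular centered on the actual TTF.
        forecast = rng.triangular(ttf - lcm_width / 2.0,
                                  ttf + lcm_width / 2.0, ttf)
        # Planned replacement time (hypothetical floor at min_step).
        planned = max(forecast - safety_margin, min_step)
        if planned < ttf:
            # LCM successful: scheduled maintenance, failure avoided.
            t += planned
            cost += c_s
            avoided += 1
        else:
            # LCM unsuccessful: the unit fails first, unscheduled maintenance.
            t += ttf
            cost += c_us
            failures += 1
    return cost, failures, avoided
```

Averaging the returned cost over many simulated LRUs for each candidate `safety_margin` gives the mean life cycle cost curve from which an optimal margin can be read off.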
2.3 Health Monitoring (HM)
Health monitoring is defined in this paper as monitoring
for failure precursors. For health monitoring, the methodology
presented in this paper can be used to determine the prognostic
distance, where the prognostic distance is a measure of how
long before system failure the prognostic structure or
prognostic cell is expected to fail or indicate failure (in
operational hours, for example). The health monitoring
methodology forecasts a unique time-to-failure distribution for
each instance of the LRU based on its time-to-failure. For this
example, the health monitoring forecast was modeled as a
symmetric triangular distribution with a most likely value set
to the time-to-failure of the LRU instance minus the
prognostic distance (Fig. 3). The health monitoring distribution
has a fixed width measured in the relevant environmental
stress units (e.g., operational hours in our example)
representing the probability of the prognostic structure
indicating the precursor to a failure. As a simple example, if
the prognostic structure was a fuse that is designed to fail at
some prognostic distance earlier than the system it protects,
then for this example the distribution on the right side of Fig. 3
represents the distribution of fuse failures. The variable to be
optimized in the HM case is the prognostic distance assumed
on the health monitoring forecasted time-to-failure. The
model proceeds in the following way: for each time-to-failure
distribution sample, a health monitoring distribution is created
that is centered on the time-to-failure minus the prognostic
distance. The health monitoring distribution is then sampled
and if the health monitoring sample is less than the actual
time-to-failure of the LRU instance then health monitoring
was successful. If the health monitoring distribution sample is
greater than the actual time-to-failure of the LRU instance
then health monitoring was unsuccessful. If successful, a
scheduled maintenance activity is performed and the timeline
is incremented by the health monitoring sampled time-to-
failure. If unsuccessful, an unscheduled maintenance activity
is performed and the timeline is incremented by the actual
time-to-failure of the LRU instance.

Fig. 2. Life consumption monitoring modeling approach.
Symmetric triangular distributions are shown for
simplicity. [Figure: a sample is drawn from the LRU TTF
distribution; the LCM forecasted time-to-failure distribution
(LCM width) is built around that sample; the sampled LCM TTF
forecast minus the safety margin gives the maintenance
interval for the sample.]

Fig. 3. Health monitoring modeling approach. Symmetric
triangular distributions are shown for simplicity. [Figure:
a sample is drawn from the LRU TTF distribution; the health
monitoring forecasted time-to-failure distribution (HM width)
is offset from that sample by the prognostic distance; the
sampled HM TTF forecast is the maintenance interval for the
sample.]

For both the LCM and HM approaches, the performance
of the PHM methodology is modeled as a probability
distribution taking into account uncertainties embedded in the
methodology, sensors, models, etc. Note that in both cases the
PHM probability distributions are coupled to the actual failure
of an instance of the LRU.
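The health monitoring procedure of this section can be sketched in Python in the same style. As before, this is a sketch under assumptions rather than the author's code: the function name, the event costs `c_us` and `c_s`, and the `min_step` floor applied when a sampled forecast is not positive are hypothetical, and random failures are omitted.

```python
import random


def simulate_hm(rng, prognostic_distance, hm_width, ttf_mode, ttf_width,
                life_hours, c_us, c_s, min_step=1.0):
    """Simulate one LRU under health-monitoring-based maintenance.

    Returns (cost, failures, failures_avoided).
    """
    if min_step <= 0.0:
        raise ValueError("min_step must be positive")
    t, cost, failures, avoided = 0.0, 0.0, 0, 0
    while t <= life_hours:
        # Actual time-to-failure of this LRU instance.
        ttf = rng.triangular(ttf_mode - ttf_width / 2.0,
                             ttf_mode + ttf_width / 2.0, ttf_mode)
        # HM forecast: centered on the TTF minus the prognostic distance.
        center = ttf - prognostic_distance
        forecast = rng.triangular(center - hm_width / 2.0,
                                  center + hm_width / 2.0, center)
        forecast = max(forecast, min_step)  # hypothetical floor
        if forecast < ttf:
            # Precursor indicated before failure: scheduled maintenance.
            t += forecast
            cost += c_s
            avoided += 1
        else:
            # Precursor indicated too late: unscheduled maintenance.
            t += ttf
            cost += c_us
            failures += 1
    return cost, failures, avoided
```

The only structural difference from the LCM sketch is where the offset enters: the prognostic distance shifts the center of the forecast distribution, whereas the LCM safety margin is subtracted from the sampled forecast.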
3. MODEL RESULTS
Variables in the current model are shown in Table I.
Table I. Data assumptions for cases presented in this paper.

Variable in the model              | Value used for example analysis
Production cost (per unit)         | $10,000
Time to failure                    | 5000 operational hours = the most likely value (symmetric triangular distribution with variable distribution width)
Operational hours per year         | 2500
Sustainment life                   | 25 years

Variable in the model              | Unscheduled | Scheduled
Value of each hour out of service  | $10,000     | $500
Time to repair                     | 6 hours     | 4 hours
Time to replace                    | 1 hour      | 0.7 hours
Cost of repair (materials cost)    | $500        | $350
Fraction of repairs requiring replacement of the LRU (as opposed to repair of the LRU) | 1.0 | 0.7

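The text does not spell out how the Table I entries combine into a cost per maintenance event, so the following Python sketch shows only one plausible reading. The combination rule is an assumption, not the paper's formula: a replacement is assumed to cost the production cost plus the downtime for the time to replace, a repair is assumed to cost the materials plus the downtime for the time to repair, and the two are weighted by the fraction of repairs requiring replacement. The function and constant names are likewise hypothetical.

```python
def event_cost(production_cost, downtime_value_per_hour, repair_hours,
               replace_hours, repair_materials_cost, replace_fraction):
    """Expected cost of one maintenance event (assumed combination rule)."""
    replace_cost = production_cost + replace_hours * downtime_value_per_hour
    repair_cost = repair_materials_cost + repair_hours * downtime_value_per_hour
    return (replace_fraction * replace_cost
            + (1.0 - replace_fraction) * repair_cost)


# Table I values, unscheduled column.
C_US = event_cost(production_cost=10_000.0, downtime_value_per_hour=10_000.0,
                  repair_hours=6.0, replace_hours=1.0,
                  repair_materials_cost=500.0, replace_fraction=1.0)

# Table I values, scheduled column.
C_S = event_cost(production_cost=10_000.0, downtime_value_per_hour=500.0,
                 repair_hours=4.0, replace_hours=0.7,
                 repair_materials_cost=350.0, replace_fraction=0.7)

# Sustainment life expressed in operational hours: 25 years at 2500 h/year.
LIFE_HOURS = 25 * 2500
```

Under this assumed rule, the gap between `C_US` and `C_S` is driven mainly by the value of each hour out of service, which is twenty times higher for unscheduled events in Table I.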
All of the variable inputs to the model can be treated as
probability distributions or as fixed values; however, for
example purposes, only the time to failure and performance of
the PHM methodologies have been characterized by
probability distributions. Note, all the life cycle cost results
provided in the remainder of this paper are the mean life cycle
cost from a probability distribution of life cycle costs
generated by the model.
Figure 4 shows the fixed scheduled maintenance interval
results. 10,000 LRUs were simulated in a Monte Carlo
analysis and the mean life cycle costs are plotted in Fig. 4.
The general characteristics in Fig. 4 are intuitive: For short
scheduled maintenance intervals, virtually no expensive
unscheduled maintenance occurs, but the life cycle cost per
unit is high because large amounts of remaining life in the
LRU are thrown away. For long scheduled maintenance
intervals virtually every instance of the LRU fails prior to the
scheduled maintenance activity and the life cycle cost per unit
becomes equivalent to a simple unscheduled maintenance
model. For some scheduled maintenance interval between the
extremes, the life cycle cost per unit is minimized. If the time-
to-failure distribution had a width of zero, then the optimum
fixed scheduled maintenance interval would be exactly equal
to the time-to-failure. As the time-to-failure distribution
becomes wider (i.e., the time-to-failure is less well defined), a
practical fixed scheduled maintenance interval becomes more
difficult to find and the best solution approaches an
unscheduled maintenance model.
Fig. 4. Variation of the effective life cycle cost per unit
with the fixed scheduled maintenance interval (10,000
Monte Carlo samples). Top: no random failures assumed;
Bottom: 10% random failures per year included, variation
in failures avoided also shown. [Plots: effective life cycle
cost (per unit) versus fixed scheduled maintenance interval
(operational hours). Top panel: curves for time-to-failure
distribution widths of 1000, 2000, 4000, 6000, 8000, and
10000 hours, plus 4000 hours (alt sched maint). Bottom panel:
time-to-failure distribution width of 6000 operational hours,
with failures avoided (%) on a second axis.]

Figure 5 shows example results from life consumption
monitoring. Several general trends are apparent. First, the
width of the time-to-failure distribution has little effect on the
LCM-based results. It is important to note that in this model,
the characteristics of the actual time-to-failure distribution are
uncoupled from the characteristics of the LCM forecasted
time-to-failure distribution, i.e., they are assumed to have
different widths. This is reasonable because the actual time-
to-failure distribution is a distribution of the time-to-failures of
a large population of LRUs, and the LCM forecasted time-to-
failure distribution is the prediction of the time-to-failure of a
specific LRU given uncertainties in its environmental stress
history, damage accumulated prior to the start of life
consumption monitoring, and variations in its materials and
manufacturing. The second trend is that for all but the
narrowest distribution on the LRU time-to-failure, the LCM-
based methodology yields lower life cycle costs than either
unscheduled maintenance or scheduled maintenance with a
fixed interval. However, for a very narrow time-to-failure
distribution, a fixed scheduled maintenance interval may yield
a lower life cycle cost solution than an LCM-based
methodology. We have also found that the effective life cycle
cost is very sensitive to the width of the LCM forecasted time-
to-failure distribution for the specific LRU (one case with a
2000 hour width is shown in Fig. 5).
Figure 6 shows example results from a health monitoring
solution for the example described in Table I. Figure 6
indicates that the health monitoring based methodology yields
lower life cycle costs than either unscheduled maintenance or
scheduled maintenance with a fixed interval. However, life
consumption monitoring results in lower life cycle costs (at
least for the numbers assumed in this example). Figure 6 also
shows that the effective life cycle cost is very sensitive to the
width of the health monitoring forecasted time-to-failure
distribution for the specific LRU.
Figures 4-6 include plots of the failures avoided for
several of the cases considered, with and without random
failures included in the analysis. In all cases, when the safety
margin or prognostic distance is small (or the fixed scheduled
maintenance interval is large), the failures avoided are lower
when random failures are included than when they are not.
As the safety margin or prognostic distance increases (or the
fixed scheduled maintenance interval decreases), the failures
avoided approach 100% in all cases (with and without
random failures included). However, for the example data
used in this paper, safety margins or prognostic distances must
be increased substantially beyond the range plotted in Figs. 4-6
for the cases with random failures to approach 100%.
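The effect of random failures on the failures avoided can be sketched as follows. This is an illustrative simplification rather than the analysis behind Figs. 4-6: it assumes a constant random failure rate per operating hour (the value `1e-5` is hypothetical and is not derived from the 10% per year figure), symmetric triangular distributions, and a forecast centred on the sampled wear-out time; the function name `failures_avoided` is likewise made up.

```python
import math
import random


def failures_avoided(safety_margin, random_rate_per_hour,
                     ttf_nominal=10000.0, ttf_width=4000.0,
                     lcm_width=1000.0, n_samples=10000, seed=0):
    """Fraction of units replaced before any failure (wear-out or random).

    All parameter values are illustrative assumptions, not data from the paper.
    """
    rng = random.Random(seed)
    avoided = 0
    for _ in range(n_samples):
        # Wear-out time-to-failure and its forecast (both assumed triangular).
        wearout = rng.triangular(ttf_nominal - ttf_width / 2,
                                 ttf_nominal + ttf_width / 2)
        forecast = rng.triangular(wearout - lcm_width / 2,
                                  wearout + lcm_width / 2)
        replace_at = max(forecast - safety_margin, 0.0)
        # Always draw u so runs with and without random failures share samples.
        u = rng.random()
        if random_rate_per_hour > 0:
            # Exponential random/overstress failure time by inverse transform.
            random_fail = -math.log(1.0 - u) / random_rate_per_hour
        else:
            random_fail = math.inf
        if replace_at < min(wearout, random_fail):
            avoided += 1
    return avoided / n_samples


if __name__ == "__main__":
    for margin in (0, 500, 1000, 5000, 9000):
        without = failures_avoided(margin, 0.0)
        with_random = failures_avoided(margin, 1e-5)
        print(f"margin={margin:5d} h  no random={without:.1%}  "
              f"with random={with_random:.1%}")
```

In this sketch the failures avoided without random failures reach 100% once the margin covers the forecast uncertainty, whereas with random failures the fraction stays below 100% until the replacement time becomes very short, because a random failure can occur at any point before the scheduled replacement.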
[Figure 5 plots omitted. Horizontal axis: LCM safety margin (operating hours), 0 to 1000. Vertical axis: effective life cycle cost (per unit), 100,000 to 300,000. Top panel: curves labeled by the width of the time-to-failure distribution (1000, 4000, 8000, and 10000 hours with a 1000 hour LCM width; 4000 hours with a 2000 hour LCM width; 4000 hours with a 1000 hour LCM width, alt sched maint), with the optimal LCM safety margins marked. Bottom panel: width of the time-to-failure distribution = 4000 operational hours, 1000 hour LCM width; life cycle cost and failures avoided (%, 0 to 100) shown for no random failures and for 10% random failures per year.]
Fig. 5. Variation of the effective life cycle cost per unit
with the safety margin for an LCM-based maintenance-
planning scheme (10,000 Monte Carlo samples). Top: no
random failures assumed; Bottom: 10% random failures
per year included, variation in failures avoided also shown.
[Figure 6 plots omitted. Horizontal axis: prognostic distance (operational hours), 0 to 1600. Vertical axis: effective life cycle cost (per unit), 100,000 to 300,000. Top panel: 6000 hour TTF width; curves for 6000, 4000, and 2000 hour HM widths and for a 2000 hour HM width (alt sched maint), with the optimal prognostic distances marked. Bottom panel: 6000 hour TTF width, 4000 hour HM width; life cycle cost and failures avoided (%, 0 to 100) shown for no random failures and for 10% random failures per year.]
Fig. 6. Variation of the effective life cycle cost per unit
with the prognostic distance for a health monitoring based
maintenance planning scheme (10,000 Monte Carlo
samples). Top: no random failures assumed; Bottom:
10% random failures per year included, variation in
failures avoided also shown.
