
A Bayesian Semiparametric Analysis of the Reliability and Maintenance of Machine Tools

01 Feb 2003-Technometrics (Taylor & Francis)-Vol. 45, Iss: 1, pp 58-69
TL;DR: A Bayesian semiparametric proportional hazards model is presented to describe the failure behavior of machine tools, and the development of optimal replacement strategies is discussed.
Abstract: A Bayesian semiparametric proportional hazards model is presented to describe the failure behavior of machine tools. The semiparametric setup is introduced using a mixture of Dirichlet processes prior. A Bayesian analysis is performed on real machine tool failure data using the semiparametric setup, and the development of optimal replacement strategies is discussed. The results of the semiparametric analysis and the replacement policies are compared with those under a parametric model.

1. INTRODUCTION

• The useful life of a machine tool is the duration of time that the tool maintains an acceptable quality of performance.
• Because of the variation in individual characteristics of machine tools and the possible omission of relevant variables (describing the operational environment) from the model, a fully parametric PHM for tool life may not be adequate.
• Neither group considered the effect of covariate information, however.
• Neither a nonparametric nor semiparametric approach nor the use of covariates has been considered from a Bayesian point of view for developing optimal replacement strategies.

2. A BAYESIAN SEMIPARAMETRIC PROPORTIONAL HAZARDS MODEL

• This section discusses the use of a PHM to analyze machine tool failures under varying operating conditions.
• Unlike the parametric analysis of Mazzuchi and Soyer (1989), the authors develop a semiparametric inference using an MDP approach, with inference furnished using the efficient algorithms of MacEachern (1994).
• This development is, therefore, a nontrivial combination of previously published techniques that allows a full analysis in this particular application.
• This section outlines the necessary methodology for this combination of techniques.

2.1 A Proportional Hazards Model for Machine Tool Failure

• The model has been widely applied in survival and reliability analysis.
• The parameters of the model are suppressed in the notation $\lambda_i(t; Z_i)$.
• An alternative analysis of the parametric model was developed by Dellaportas and Smith (1993) using Markov chain Monte Carlo techniques, although an analysis of the machine tool failure data using these methods has not yet been published.
• Approaches for modeling the baseline cumulative failure rate function include the gamma process proposed by Kalbfleisch (1978) and criticized by Hjort (1990), the extended gamma process presented by Laud, Damien, and Smith (1996), and the beta process presented by Hjort (1990), with a computational model developed by Laud et al. (1998).
• (For a full review of other semiparametric approaches to inference on regression models, see Gelfand 1999.).

2.2 A Mixture of Dirichlet Processes Prior for the Proportional Hazards Model

• Under the MDP approach, the baseline failure rate is assumed to be some continuous function $\lambda_0(t; \theta_i)$, where $\theta_i$ is the vector of unknown parameters specific to the ith machine tool.
• One way to model this uncertainty is to follow the development of MacEachern (1994) and West, Muller, and Escobar (1994) and describe uncertainty about $G$ by a Dirichlet process prior denoted by $G \sim DP(G_0, M)$, where $G_0$ is the baseline prior and $M$ is the strength of belief parameter.
• (See Ferguson 1973 for a discussion of Dirichlet process priors.)
• Specification of the semiparametric PHM is completed by specifying a parametric prior for the covariate effects $\beta$, denoted by $\pi(\beta)$, which is independent of the $\theta_i$'s.
• In addition to its flexibility and ability to capture individual characteristics of the machine tools, the proposed semiparametric PHM also provides an assessment of the completeness of the set of covariates included in the analysis.

2.3 Posterior Inference and Prediction

• Instead of attempting to perform inference on the mixing distribution $G$ directly, one can perform simple inference using the Markov chain Monte Carlo methods in algorithm 1 of Escobar and West (1995) to obtain a sample from the posterior distribution of $\Theta$ and $\beta$ given the data $D$.
• For their problem, the attractive feature of this approach is that computation of $\pi(\Theta, \beta \mid D)$ based on the Gibbs sampler can be achieved without sampling from the posterior distribution of $(G \mid \beta, D)$, thus reducing the problem to n dimensions.
• Samples from this distribution can be obtained using the methods discussed by Dellaportas and Smith (1993) for the parametric model, because given $\Theta$, a conditionally parametric model is specified by (1).
• The authors' approach follows that of Escobar and West (1995), assuming a priori that $M$ follows an arbitrary prior $\pi(M)$.

3. ANALYSIS OF MACHINE TOOL FAILURE DATA USING PARAMETRIC AND SEMIPARAMETRIC INFERENCE

• The data used in this analysis, given in Table 1, were first presented by Taraman (1974).
• Each experimental run used a workpiece material of SAE 1018 cold-rolled steel, 4 inches in diameter and 2 feet long.
• The 24 machine tools used for the cutting were tungsten carbide disposable inserts mounted in a tool holder.
• The cutting operations were performed without using cutting fluids.
• The semiparametric inference follows the methods developed in Section 2.

3.1 Comparison of the Posterior Distributions of the Model Parameters

• For comparison of the semiparametric inference method to the parametric method proposed by Mazzuchi and Soyer (1989), the conditional baseline failure rate in (2) is assumed to be a Weibull density with scale parameter $\lambda_i$ and shape parameter $\gamma$.
• Figure 1 shows marked differences between the posterior distributions of the scale parameters.
• (See Kass and Raftery 1995 for further discussion of Bayes factors and their approximation.).
• Comparison of the posterior distributions of the covariate effect parameters, $\beta_1$, $\beta_2$, and $\beta_3$, and the shape parameter, $\gamma$, is shown in Figures 3–6 graphically and given in Table 2 numerically.
• Under the semiparametric model, using Jeffreys’s scale there is still decisive evidence that cutting velocity has an effect, but there is also substantial evidence that feed rate and depth of cut affect the lifetime of machine tools.

3.2 Comparison of the Predicted Reliability

• The distribution of the predictive reliability of a given machine tool can be found using the sample approximations given in (6).
• The variance in the predicted mission time reliabilities is greater under the semiparametric model.
• Figure 9 shows the medians of the posterior predictive distributions of the reliabilities of the same machine tool at different mission times, with the dotted line indicating the parametric model and the solid line indicating the semiparametric model.
• The question now becomes which of these two predictions is better.

3.3 Comparison of the Predictive Ability

• The parametric and semiparametric models can be compared using posterior predictive densities, as discussed by Gelfand (1996), and the DIC, as described by Spiegelhalter et al. (2002).
• Thus in their analysis 100 partitions were selected at random, and the posterior predictive densities were calculated under each model.
• For the semiparametric model, the effective number of parameters is about 12, although the model includes 3 covariate effect parameters, a shape parameter, and potentially 24 scale parameters.
• In conclusion, each predictive comparison criterion shows strong evidence of the superior predictive ability of the semiparametric model over the parametric model.

4. SEMIPARAMETRIC OPTIMAL REPLACEMENT STRATEGIES FOR MACHINE TOOLS

• As in the original article by Taraman (1974) and the later work by Balakrishnan and DeVries (1985), the aim of machine tool life modeling is to aid decisions concerning the operation of machine tools.
• In this section the authors demonstrate how the MDP setup can lead to markedly different recommendations when compared with the parametric model.
• Under the age-replacement protocol, a planned replacement is made at age tA if the item survives until then, or an in-service replacement is made whenever the item fails.
• This protocol can be applied to individual machine tools.
• The machine tools are nonrepairable and thus operate under good-as-new (GN) replacement.

4.1 Replacing Individual Machine Tools

• The first term represents the cost per unit time of in-service failures, and the second term represents the cost per unit time of planned replacements.
• $\ldots \mid D)\, d\beta\, d\Theta$, where $D$ represents the observed failure and covariate data.
• This shows that some individual variation not explained by the three covariates can result in higher predicted machine tool reliability under the semiparametric model.
• The cost per unit time of the age-replacement protocol is predicted to be lower under the semiparametric model compared with the parametric model, again because the machine tool is predicted to be more reliable.
• For other covariate combinations this situation may be reversed.
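
To make the two cost terms concrete, the sketch below evaluates the textbook renewal-reward form of the age-replacement cost per unit time, $[c_F(1 - R(t_A)) + c_P R(t_A)] / \int_0^{t_A} R(t)\,dt$, for a single fixed reliability function. This is an assumption about the general shape of the criterion, not the article's exact expression, which also averages over the posterior of $\Theta$ and $\beta$. The function names, the cost values, and the Weibull parameters are hypothetical.

```python
import math

def age_replacement_cost(t_a, reliability, c_fail, c_plan, steps=2000):
    """Expected cost per unit time when replacing at age t_a or at failure."""
    h = t_a / steps
    # Trapezoid rule for the expected cycle length, the integral of R(t) on [0, t_a].
    area = 0.5 * (reliability(0.0) + reliability(t_a))
    area += sum(reliability(k * h) for k in range(1, steps))
    area *= h
    r = reliability(t_a)
    # First term: in-service failures; second term: planned replacements.
    return (c_fail * (1.0 - r) + c_plan * r) / area

def best_age(candidates, reliability, c_fail, c_plan):
    """Pick the candidate replacement age with the lowest cost rate."""
    return min(candidates,
               key=lambda t: age_replacement_cost(t, reliability, c_fail, c_plan))

# Hypothetical Weibull reliability with an increasing failure rate.
weibull = lambda t: math.exp(-1e-4 * t ** 2.5)
ages = [5.0 * k for k in range(1, 21)]
print(best_age(ages, weibull, c_fail=10.0, c_plan=1.0))
```

With an increasing failure rate and a failure cost well above the planned cost, the minimizing age is finite; with a constant failure rate, planned replacement brings no benefit.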

4.2 Replacing a Group of Machine Tools

• Whereas the age-replacement policies are adequate for maintenance of a single machine tool, when several machine tools are to be maintained it may be more cost-effective to replace them as a group rather than individually.
• An alternative approach is to simulate the renewal process and approximate the conditional renewal function $M_i(t_B \mid \theta_i, \beta, Z_i)$ (see, e.g., Ross 1989).
• The covariate values of the first four machine tools listed in Table 1 are chosen to illustrate block replacement.
• The three tools have significantly lower scale parameter values than the other tools (see Fig. 1).
• The predicted optimal replacement intervals are now 27 for the parametric model and 33 for the semiparametric model—a larger difference, with the expected cost again lower for the semiparametric model due to the lower scale parameter values.
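
The simulation route mentioned above can be sketched as follows, assuming a Weibull proportional hazards lifetime with reliability $\exp\{-\lambda t^{\gamma} e^{\beta^T Z}\}$ and the standard block-replacement cost rate $[c_P + c_F M(t_B)] / t_B$ for one tool. The cost form, the function names, and all numeric values are assumptions for illustration and are not taken from the article.

```python
import math
import random

def draw_weibull_phm_lifetime(rng, lam, gamma, lin):
    """Inverse-CDF draw from R(t) = exp(-lam * t**gamma * exp(lin))."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return (-math.log(u) / (lam * math.exp(lin))) ** (1.0 / gamma)

def simulate_renewal_function(t_b, draw_lifetime, rng, reps=5000):
    """Monte Carlo estimate of the expected number of failures in [0, t_b]."""
    total = 0
    for _ in range(reps):
        clock = draw_lifetime(rng)
        while clock <= t_b:
            total += 1
            clock += draw_lifetime(rng)  # failed tool replaced good-as-new
    return total / reps

def block_replacement_cost(t_b, draw_lifetime, c_fail, c_plan, rng, reps=5000):
    """Cost per unit time: one planned replacement plus in-service failures."""
    m = simulate_renewal_function(t_b, draw_lifetime, rng, reps)
    return (c_plan + c_fail * m) / t_b

rng = random.Random(0)
draw = lambda r: draw_weibull_phm_lifetime(r, lam=1e-4, gamma=2.5, lin=0.0)
print(block_replacement_cost(30.0, draw, c_fail=10.0, c_plan=1.0, rng=rng))
```

For a group of tools, one would sum the per-tool renewal estimates before dividing by the common interval.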

5. CONCLUSIONS

• In this article the authors have presented a semiparametric model developed for the analysis of machine tool failure data.
• The difference in the two models was apparent, with the semiparametric results favored because of its superior predictive ability.
• Such an analysis would require the use of time-dependent covariates or dynamic environment modeling techniques different from those used here.


A Bayesian Semiparametric Analysis
of the Reliability and Maintenance
of Machine Tools
Jason R. W. Merrick
Department of Statistical Sciences and
Operations Research
Virginia Commonwealth University
Richmond, VA 23284
(jrmerric@vcu.edu)
Refik Soyer
Department of Management Science
The George Washington University
Washington, DC 20052
(soyer@gwu.edu)
Thomas A. Mazzuchi
Department of Engineering Management & Systems Engineering
The George Washington University, Washington, DC 20052
(mazzuchi@seas.gwu.edu)
A Bayesian semiparametric proportional hazards model is presented to describe the failure behavior of
machine tools. The semiparametric setup is introduced using a mixture of Dirichlet processes prior. A
Bayesian analysis is performed on real machine tool failure data using the semiparametric setup, and
the development of optimal replacement strategies is discussed. The results of the semiparametric analysis
and the replacement policies are compared with those under a parametric model.
KEY WORDS: Bayesian; Markov chain Monte Carlo; Mixtures of Dirichlet processes prior;
Proportional hazards model; Semiparametric inference.
1. INTRODUCTION
The useful life of a machine tool is the duration of time that
the tool maintains an acceptable quality of performance. As an
alternative to replacing tools on observing unacceptable quality
of performance, using planned replacements can reduce
costs associated with in-service failures of machine tools,
thereby resulting in increased productivity due to decreased
downtime and scrapping of material and decreased inventory
costs due to improved planning. Key to the determination
of an optimal replacement strategy is the development of
an adequate statistical model for tool life. The development
of such a model is complicated because of the fact that the
conditions under which a tool operates (called the operational
environment) vary even for tools on the same shop floor.
In addition, commercial engineering materials require variability
within specified ranges of chemistry and mechanical
properties for their economical production, corresponding to
variability in the properties that degrade the tool during its
operational life. Variabilities can also be expected in machine
tool dynamics and materials-handling performance of the
processing systems. Thus, each machine tool may exhibit
inherent variabilities in its useful life.
In the early literature, the lack of a universally acceptable
physical theory of tool failure led to the use of an empirical
model describing the relationship between tool life and operational
variables, including cutting speed, feed rate, and depth
of cut. Taraman (1974) performed an experiment designed
to estimate the parameters of this empirical model. Balakrishnan
and DeVries (1985) extended this analysis to allow
sequential updating of parameter estimates and inclusion of
prior information in the estimation procedure. Mazzuchi and
Soyer (1989) noted that the empirical model proposed by Taraman
(1974) accounted for the effect of the machine operating
environment but failed to account for aging (or wear out)
characteristics of the tool. To account for both aging and the
characteristics of the machine operating environment, a proportional
hazards model (PHM) was proposed to assess tool
life. In specifying the PHM, a Weibull model was assumed for
the baseline failure rate to incorporate aging of the tool, and
the effect of machining environment, as specified by Taraman
(1974), was used to modulate the baseline failure rate.
Because of the variation in individual characteristics of
machine tools and the possible omission of relevant variables
(describing the operational environment) from the model,
a fully parametric PHM for tool life may not be adequate.
This may result in suboptimal replacement strategies. Thus
a more general model that relaxes some of these parametric
assumptions may be more desirable for modeling machine
tool life. As pointed out by Gelfand (1999), in a situation
where the parametric assumptions may be too restrictive, a
semiparametric model can be developed by a nonparametric
specification of some portions of the model.
We present a semiparametric model for
machine tool life that treats the baseline failure rate function
in a nonparametric manner while treating the effect of the
covariates parametrically, as implied by Taraman (1974),
using the PHM approach. A mixture of Dirichlet processes
© 2003 American Statistical Association and
the American Society for Quality
TECHNOMETRICS, FEBRUARY 2003, VOL. 45, NO. 1
DOI 10.1198/004017002188618707
(MDP) prior, proposed by Antoniak (1974), is assumed for
the part of the model that defines the underlying failure
distribution. Whereas the foregoing setup leads to a complex
analysis, inference is developed using the framework of
Escobar and West (1995) and the efficient sampling algorithms
of MacEachern (1994). The MDP approach reduces the
restriction of the parametric baseline failure rate and allows
assessment of differences between the failure characteristics
of tools that cannot be described by the covariates in the
model of Taraman (1974). An additional benefit of the MDP
approach for the PHM is the ability to compare the parameters
of the baseline failure rate corresponding to each individual
machine tool, thus addressing the question of whether all
necessary covariates are included in the model.
The development of optimal replacement strategies using a
nonparametric form for the failure model has been considered
from a sampling theory perspective by Frees and Ruppert
(1985) and Aras and Whitaker (1991). Neither group considered
the effect of covariate information, however. Neither
a nonparametric nor semiparametric approach nor the use of
covariates has been considered from a Bayesian point of view
for developing optimal replacement strategies. The Bayesian
decision theoretic framework for replacement policies for
machine tools presented here represents an extension of
the work of Mazzuchi and Soyer (1995, 1996) to include
semiparametric models and covariate effects.
While building on existing analytical techniques proposed
in the literature for semiparametric models, the analysis herein
has four basic aims:
1. To develop a model describing the failure behavior of
machine tools accounting for the operational environment
effect as well as aging, thereby capturing a wide class of tool
failure behavior
2. To illustrate the use of this model as a diagnostic tool for
assessing model adequacy in describing the differences among
the machine tools
3. To illustrate the use of this model in predicting the useful
life of a machine tool operating in a particular operational
environment
4. To illustrate the use of this model in developing optimal
replacement strategies for single machine tools and groups of
tools.
The PHM for machine tool life is discussed in Section 2,
along with the relaxation of parametric assumptions. In
Section 3 the semiparametric model is used to analyze the
machine tool failure data given by Taraman (1974). The
posterior distributions of the model parameters are
compared with those under a parametric model, and the predictive
ability of the two approaches is compared using both posterior
predictive densities and the deviance information criterion (DIC)
of Spiegelhalter, Best, Carlin, and van der Linde (2002).
The model is used to develop Bayesian semiparametric
replacement strategies in Section 4. Conclusions are presented
in Section 5.
2. A BAYESIAN SEMIPARAMETRIC PROPORTIONAL HAZARDS MODEL
This section discusses the use of a PHM to analyze machine
tool failures under varying operating conditions. Unlike the
parametric analysis of Mazzuchi and Soyer (1989), we
develop a semiparametric inference using an MDP approach,
with inference furnished using the efficient algorithms of
MacEachern (1994). This development is, therefore, a nontrivial
combination of previously published techniques that allows
a full analysis in this particular application. This section
outlines the necessary methodology for this combination of
techniques.
2.1 A Proportional Hazards Model for Machine Tool Failure
The PHM was proposed by Cox (1972) to incorporate
covariate information in a survival or time to failure model.
The model has been widely applied in survival and reliability
analysis. The PHM is defined using the concept of the failure
rate. Let $T_i$ be the life length of machine tool $i$. Assuming that
$T_i$ is continuous, the failure rate function of the distribution
of $T_i$ is defined as
$$\lambda_i(t) = \frac{f_i(t)}{R_i(t)},$$
where $f_i(t)$ is the probability density function of $T_i$ and
$$R_i(t) = P(T_i \ge t) = \exp\{-\Lambda_i(t)\}$$
is the reliability of machine tool $i$ at time $t$, with
$\Lambda_i(t) = \int_0^t \lambda_i(s)\,ds$ the cumulative failure rate.
Let $Z_i$ be a vector of $p$ measured covariates describing the
operational environments of machine tool $i$. The covariates
available for the machine tool analysis are constant with
respect to time and are known before any failure data are
observed. Cox (1972) proposed that the distribution of $T_i$
could be made dependent on $Z_i$ via the failure rate by
assuming that the failure rate of the ith item is a product of
a common base failure rate function and a function of the
covariates, explicitly
$$\lambda_i(t; Z_i) = \lambda_0(t)\, e^{\beta^T Z_i},$$
where $\beta$ is a vector of $p$ regression parameters and $\lambda_0(t)$ is
a baseline failure rate function. The parameters of the model
are suppressed in the notation $\lambda_i(t; Z_i)$.
Often a parametric form is assumed for the baseline
failure rate. This is equivalent to choosing a common family
of distributions for the life lengths of the machine tools.
Mazzuchi and Soyer (1989) performed an analysis of the
Weibull parametric model for the machine tool failure data.
Their analysis used integral approximation techniques to find
the marginal posterior distributions of the model parameters.
An alternative analysis of the parametric model was developed
by Dellaportas and Smith (1993) using Markov chain Monte
Carlo techniques, although an analysis of the machine tool
failure data using these methods has not yet been published.
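
As a small illustration of the definitions in this section, the sketch below evaluates the PHM failure rate and reliability under a Weibull baseline, $\lambda_0(t) = \lambda\gamma t^{\gamma-1}$ with $\Lambda_0(t) = \lambda t^{\gamma}$. The function names and the numeric values are hypothetical, and the covariates are passed in whatever transformed form the model uses.

```python
import math

def linear_predictor(beta, z):
    """beta' Z for regression parameters beta and covariates z."""
    return sum(b * x for b, x in zip(beta, z))

def phm_failure_rate(t, z, beta, lam, gamma):
    """lambda_i(t; Z_i) = lambda_0(t) * exp(beta' Z_i), Weibull baseline."""
    baseline = lam * gamma * t ** (gamma - 1.0)
    return baseline * math.exp(linear_predictor(beta, z))

def phm_reliability(t, z, beta, lam, gamma):
    """R_i(t) = exp(-Lambda_0(t) * exp(beta' Z_i)) with Lambda_0(t) = lam * t**gamma."""
    return math.exp(-lam * t ** gamma * math.exp(linear_predictor(beta, z)))

# Hypothetical parameter values, for illustration only.
beta = [0.8, 0.3, 0.2]
z_low, z_high = [0.0, 0.0, 0.0], [1.0, 0.5, 0.5]
for t in (10.0, 30.0):
    ratio = (phm_failure_rate(t, z_high, beta, 1e-3, 2.0)
             / phm_failure_rate(t, z_low, beta, 1e-3, 2.0))
    print(t, ratio, phm_reliability(t, z_high, beta, 1e-3, 2.0))
```

The printed ratio of failure rates does not change with $t$, which is the proportional hazards property: the covariates scale the baseline rate by a time-constant factor.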
Approaches for modeling the baseline cumulative failure
rate function include the gamma process proposed
by Kalbfleisch (1978) and criticized by Hjort (1990), the
extended gamma process presented by Laud, Damien, and
Smith (1996), and the beta process presented by Hjort (1990),
with a computational model developed by Laud et al. (1998).
(For a full review of other semiparametric approaches to
inference on regression models, see Gelfand 1999.) In the
next section we present an MDP prior, as defined by Antoniak
(1974), for the baseline failure rate of the PHM to analyze the
machine tool problem. This prior distribution allows a large
family of continuous failure time distributions, thus relaxing
the full parametric assumption. Our setup is similar to the
semiparametric accelerated failure time model proposed by
Kuo and Mallick (1997), but applied to the PHM.
2.2 A Mixture of Dirichlet Processes Prior for the Proportional Hazards Model
Under the MDP approach, the baseline failure rate is
assumed to be some continuous function $\lambda_0(t; \theta_i)$, where $\theta_i$ is
the vector of unknown parameters specific to the ith machine
tool. Uncertainty about the $\theta_i$'s is described by specifying
a prior distribution $G$. If the form of $G$ is known but the
hyperparameters are unknown, then this class of problems is
referred to as hierarchical Bayes problems in the sense of
Lindley and Smith (1972). If the form of $G$ is unknown, then
uncertainty about $G$ must be modeled. One way to model
this uncertainty is to follow the development of MacEachern
(1994) and West, Muller, and Escobar (1994) and describe
uncertainty about $G$ by a Dirichlet process prior denoted by
$$G \sim DP(G_0, M),$$
where $G_0$ is the baseline prior and $M$ is the strength of belief
parameter. (See Ferguson 1973 for a discussion of Dirichlet
process priors.)
By specifying a form for $\lambda_0(t_i; \theta_i)$ conditional on $\theta_i$, we
specify a conditional parametric model for $T_i$ whose density
$f(t_i \mid \theta_i, \beta, Z_i)$ is given by
$$\lambda_0(t_i; \theta_i)\, e^{\beta^T Z_i} \exp\{-\Lambda_0(t_i; \theta_i)\, e^{\beta^T Z_i}\}, \tag{1}$$
where $\Lambda_0(t_i; \theta_i) = \int_0^{t_i} \lambda_0(s; \theta_i)\,ds$. Specification of the
semiparametric PHM is completed by specifying a parametric prior
for the covariate effects $\beta$, denoted by $\pi(\beta)$, which is
independent of the $\theta_i$'s. The nonparametric nature of the model
arises because the distribution of $T_i$, unconditional on $\theta_i$, is an
unknown mixture of $f(t_i \mid \theta_i, \beta, Z_i)$ given by
$$f(t_i \mid G, \beta, Z_i) = \int f(t_i \mid \theta_i, \beta, Z_i)\, dG(\theta_i),$$
where the distribution of $T_i$ results from mixing with respect
to $G$. These models were termed Dirichlet process mixed
models by Mukhopadhyay and Gelfand (1997), because $G$ is
assumed a priori to be a Dirichlet process.
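The semiparametric PHM using an MDP approach can be
summarized as
$$\begin{aligned}
(T_i \mid \theta_i, \beta, Z_i) &\sim f(t_i \mid \theta_i, \beta, Z_i), \\
(\theta_i \mid G) &\sim G, \\
(G) &\sim DP(G_0, M), \\
(\beta) &\sim \pi(\beta).
\end{aligned} \tag{2}$$
It is also assumed a priori that $\beta$ and $\Theta = (\theta_1, \ldots, \theta_n)$ are
independent of each other.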
In addition to its flexibility and ability to capture individual
characteristics of the machine tools, the proposed semiparametric
PHM also provides an assessment of the completeness
of the set of covariates included in the analysis. In the classical
literature, a residual analysis is performed to determine
whether differences in the failure characteristics among the
machine tools remain after the effect of the covariates has
been removed. In the proposed model, differences between
the individual machine tools can be assessed by differences
between the distributions of the $\theta_i$'s.
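
One property behind this diagnostic is that parameters drawn under a Dirichlet process prior share values with positive probability, so tools fall into clusters. The sketch below simulates this through the Polya urn representation of $G \sim DP(G_0, M)$; the gamma form and parameters of $G_0$, the function names, and the values of $M$ are assumptions chosen for illustration.

```python
import random

def polya_urn_sample(n, M, draw_from_g0, rng):
    """Draw theta_1..theta_n from the Polya urn implied by G ~ DP(G0, M), M > 0."""
    thetas = []
    for i in range(n):
        # A fresh value from G0 has probability M / (M + i); otherwise an
        # earlier value is copied, each one with probability 1 / (M + i).
        if rng.random() < M / (M + i):
            thetas.append(draw_from_g0(rng))
        else:
            thetas.append(rng.choice(thetas))
    return thetas

def gamma_g0(rng, shape=2.0, rate=1.0):
    # Hypothetical baseline prior G0; gammavariate expects a scale, not a rate.
    return rng.gammavariate(shape, 1.0 / rate)

rng = random.Random(0)
for M in (0.1, 1.0, 10.0):
    draws = polya_urn_sample(24, M, gamma_g0, rng)
    print(M, len(set(draws)))  # number of distinct values among 24 tools
```

Small $M$ tends to produce few distinct values (tools look alike once covariates are accounted for), while large $M$ tends to give every tool its own value.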
2.3 Posterior Inference and Prediction
Given failure and covariate data
$D = \{T_1 = t_1, \ldots, T_n = t_n, Z_1, \ldots, Z_n\}$ on $n$ machine tools,
the likelihood function of $G$ and $\beta$ given the data $D$ is obtained as
$$L(G, \beta \mid D) = \prod_{i=1}^{n} \int f(t_i \mid \theta_i, \beta, Z_i)\, dG(\theta_i).$$
Given an arbitrary prior on $\beta$, say $\pi(\beta)$, which is independent
of $\Theta$ and $G$, following Kuo and Mallick (1997), the posterior
distribution of $G$ given $\beta$ and $D$ can be obtained as a mixture
of Dirichlet processes,
$$(G \mid \beta, D) \sim \int DP\Big(M G_0 + \sum_{j=1}^{n} \delta_{\theta_j}\Big)\, d\Pi(\Theta \mid \beta, D), \tag{3}$$
where $\delta_{\theta_j}$ denotes a point mass distribution concentrated at $\theta_j$
and $d\Pi(\Theta \mid \beta, D)$ is proportional to
$$\prod_{i=1}^{n} f(t_i \mid \theta_i, \beta, Z_i) \Big(M G_0 + \sum_{j=1}^{i-1} \delta_{\theta_j}\Big)(d\theta_i).$$
It is difficult to sample from the distribution $(G \mid \beta, D)$ given in
(3), because $G$ is effectively an infinite-dimensional parameter
(see, e.g., Kuo 1986).
Instead of attempting to perform inference on the mixing
distribution $G$ directly, one can perform simple inference using
the Markov chain Monte Carlo methods in algorithm 1 of
Escobar and West (1995) to obtain a sample from the posterior
distribution of $\Theta$ and $\beta$ given the data $D$. For our problem,
the attractive feature of this approach is that computation of
$\pi(\Theta, \beta \mid D)$ based on the Gibbs sampler can be achieved
without sampling from the posterior distribution of $(G \mid \beta, D)$, thus
reducing the problem to $n$ dimensions.
Following the derivations of Escobar and West (1995), it
can be shown that the full conditional for each $\theta_i$ is
$$(\theta_i \mid \theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_n, \beta, D)
\sim q_{i,0}\, G_b(\theta_i \mid t_i, \beta, Z_i) + \sum_{j \ne i} q_{i,j}\, \delta_{\theta_j}(\theta_i),$$
where $\delta_{\theta_j}(\theta_i)$ equals 1 if $\theta_i = \theta_j$ and 0 otherwise. The term
$G_b(\theta_i \mid t_i, \beta, Z_i)$ is the baseline posterior distribution
$$dG_b(\theta_i \mid t_i, \beta, Z_i) \propto f(t_i \mid \theta_i, \beta, Z_i)\, dG_0(\theta_i).$$
The terms $q_{i,j}$, for $j \ne i$, represent the positive probability that
some of the $\theta_i$'s will take the same values due to the discreteness
of $G$ (as a result of the Dirichlet process prior). These
are given by
$$q_{i,0} \propto M \int f(t_i \mid \theta_i, \beta, Z_i)\, dG_0(\theta_i)$$
and
$$q_{i,j} \propto f(t_i \mid \theta_j, \beta, Z_i),$$
where $f(t_i \mid \theta_j, \beta, Z_i)$ is the density of $T_i$ when $\theta_i = \theta_j$ and
$$q_{i,0} + \sum_{j \ne i} q_{i,j} = 1.$$
An algorithm proposed by MacEachern (1994) exploits this
fact to increase the efficiency of the sampling process by
updating the $\theta_i$'s in groups or clusters. Using this algorithm,
we may draw a sample from the conditional distribution of
$\Theta$ given $\beta$ and the data. The other distribution needed to
implement the Gibbs sampler for this model is the conditional
distribution of $\beta$ given $\Theta$ and the data. Samples from
this distribution can be obtained using the methods discussed
by Dellaportas and Smith (1993) for the parametric model,
because given $\Theta$, a conditionally parametric model is specified
by (1).
In implementing the Gibbs sampler, given a current $\Theta$ and
$\beta$, the general steps are as follows:
1. Generate a new $\Theta$ conditional on $\beta$ using the efficient
MDP algorithm of MacEachern (1994).
2. Generate a new $\beta$ conditional on $\Theta$ using the methods
discussed by Dellaportas and Smith (1993).
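
The full conditional for a single $\theta_i$ can be sketched in code for one special case. The block below assumes a Weibull conditional density with a tool-specific scale $\lambda_i$, a fixed common shape $\gamma$, and a gamma baseline prior $G_0$ with shape $a$ and rate $b$, so that both $q_{i,0}$ and the baseline posterior $G_b$ are available in closed form. It performs the one-at-a-time update, not the cluster-based algorithm of MacEachern (1994), and every function name and numeric value is hypothetical.

```python
import math
import random

def weibull_phm_density(t, lam, gamma, lin):
    """f(t | lam, gamma, beta, Z) with lin = beta' Z."""
    c = math.exp(lin)
    return lam * gamma * t ** (gamma - 1.0) * c * math.exp(-lam * t ** gamma * c)

def marginal_density(t, gamma, lin, a, b):
    """Integral of f(t | lam, ...) over lam ~ Gamma(shape=a, rate=b)."""
    c = math.exp(lin)
    w = t ** gamma * c
    return gamma * t ** (gamma - 1.0) * c * a * b ** a / (b + w) ** (a + 1.0)

def update_scale(i, lams, t, lin, gamma, M, a, b, rng):
    """One draw of lam_i from its full conditional given the other scales."""
    others = [lams[j] for j in range(len(lams)) if j != i]
    # q_{i,0} is proportional to M times the marginal density;
    # q_{i,j} is proportional to the density evaluated at lam_j.
    weights = [M * marginal_density(t[i], gamma, lin[i], a, b)]
    weights += [weibull_phm_density(t[i], lj, gamma, lin[i]) for lj in others]
    k = rng.choices(range(len(weights)), weights=weights)[0]
    if k == 0:
        # Baseline posterior G_b: Gamma(shape a + 1, rate b + t^gamma e^{lin}).
        w = t[i] ** gamma * math.exp(lin[i])
        return rng.gammavariate(a + 1.0, 1.0 / (b + w))
    return others[k - 1]

# Hypothetical data and starting values, for illustration only.
rng = random.Random(0)
t = [70.0, 29.0, 60.0, 28.0]
lin = [0.0, 0.9, 0.2, 1.1]
lams = [1e-4] * 4
for i in range(len(lams)):
    lams[i] = update_scale(i, lams, t, lin, 2.0, 1.0, 1.0, 1e4, rng)
print(lams)
```

A complete sampler would alternate sweeps like this with updates of $\beta$ and $\gamma$, which this sketch leaves out.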
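An attractive feature of the MDP setup and the proposed
algorithm is that posterior predictive densities and reliability
functions can be easily evaluated once the posterior sample
from $\Theta$ and $\beta$ is available. For example, in predicting $T_{n+1}$,
the posterior predictive density $f(t_{n+1} \mid D, Z_{n+1})$ is
$$\int_{\beta, \Theta} f(t_{n+1} \mid \Theta, \beta, Z_{n+1})\, \pi(\Theta, \beta \mid D)\, d\Theta\, d\beta,$$
where $f(t_{n+1} \mid \Theta, \beta, Z_{n+1})$ is
$$\int_{\theta_{n+1}} f(t_{n+1} \mid \theta_{n+1}, \beta, Z_{n+1})\, \pi(\theta_{n+1} \mid \Theta)\, d\theta_{n+1}$$
and $\pi(\theta_{n+1} \mid \Theta)$ is
$$\frac{M}{M + n}\, G_0(\theta_{n+1}) + \frac{1}{M + n} \sum_{i=1}^{n} \delta_{\theta_i}(\theta_{n+1}). \tag{4}$$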
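Thus the posterior predictive density $f(t_{n+1} \mid D, Z_{n+1})$ can be
written as
$$\int_{\beta, \Theta, \theta_{n+1}} f(t_{n+1} \mid \theta_{n+1}, \beta, Z_{n+1})\, \pi(\theta_{n+1} \mid \Theta)\,
\pi(\Theta, \beta \mid D)\, d\theta_{n+1}\, d\Theta\, d\beta, \tag{5}$$
and using the posterior sample from $\pi(\Theta, \beta \mid D)$, denoted
by $(\theta_{1,l}, \ldots, \theta_{n,l}, \beta_l)$ for $l = 1, \ldots, S$, and draws from (4),
denoted by $\theta_{n+1,l}$, $f(t_{n+1} \mid D, Z_{n+1})$ can be approximated as
$$\frac{1}{S} \sum_{l=1}^{S} f(t_{n+1} \mid \theta_{n+1,l}, \beta_l, Z_{n+1}). \tag{6}$$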
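
The approximation in (6) amounts to one draw from (4) per posterior sample followed by an average. A minimal sketch follows, assuming the posterior draws are stored as tuples `(thetas, beta, M)` and that the conditional density and the sampler for $G_0$ are supplied by the caller; these names and the storage layout are hypothetical, and the toy density in the usage lines stands in for the PHM density only to keep the example short.

```python
import math
import random

def draw_theta_new(thetas, M, draw_from_g0, rng):
    """Draw theta_{n+1} from (4): G0 w.p. M/(M+n), else one of the theta_i."""
    n = len(thetas)
    if rng.random() < M / (M + n):
        return draw_from_g0(rng)
    return rng.choice(thetas)

def predictive_density(t_new, z_new, posterior_draws, density, draw_from_g0, rng):
    """Monte Carlo average (6) over S posterior draws (thetas_l, beta_l, M_l)."""
    total = 0.0
    for thetas, beta, M in posterior_draws:
        theta_new = draw_theta_new(thetas, M, draw_from_g0, rng)
        total += density(t_new, theta_new, beta, z_new)
    return total / len(posterior_draws)

# Toy usage: exponential density with rate theta * exp(beta * z).
def toy_density(t, theta, beta, z):
    rate = theta * math.exp(beta * z)
    return rate * math.exp(-rate * t)

rng = random.Random(0)
draws = [([0.02, 0.02, 0.05], 0.1, 1.0), ([0.03, 0.04, 0.04], 0.2, 0.5)]
g0 = lambda r: r.gammavariate(2.0, 0.02)
print(predictive_density(30.0, 1.0, draws, toy_density, g0, rng))
```

Predictive reliabilities follow the same pattern with the reliability function in place of the density.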
Because the results under the MDP setup may be sensitive
to the choice of $M$, we use a further extension by incorporating
$M$ into the Gibbs sampler. Our approach follows that of
Escobar and West (1995), assuming a priori that $M$ follows
an arbitrary prior $\pi(M)$. In their development, Escobar and
West defined $K$ as the number of unique values of $\theta_1, \ldots, \theta_n$,
also referred to as the number of cliques by MacEachern
(1998). Conditioned on $K$, the full conditional distribution of
$M$ is independent of all other parameters with density proportional to
$$M^{K-1} (M + n)\, B(M + 1, n)\, \pi(M),$$
where $B(\cdot)$ is the standard beta function. Escobar and West
(1995) offered a simple two-step process for sampling from
this distribution if $\pi(M)$ is assumed to be a gamma distribution.
The approach involves using a data-augmentation step at
each iteration of the Gibbs sampler. Thus $K$ is also recorded
in the Gibbs sample, because the distribution of the number
of cliques is of interest in the reliability analysis in Section 3.
We note that at each iteration of the Gibbs sampler, once $M$
is drawn, the rest of the quantities are sampled as discussed
in steps 1 and 2.
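
The two-step process can be sketched as below, assuming a gamma prior on $M$ with shape $a$ and rate $b$. The steps follow the data-augmentation scheme as described by Escobar and West (1995): draw an auxiliary $\eta \sim \text{Beta}(M + 1, n)$, then draw $M$ from a two-component gamma mixture with rate $b - \ln\eta$. The function name and the hyperparameter values are hypothetical.

```python
import math
import random

def sample_strength(M, K, n, a, b, rng):
    """One Gibbs update of M given K distinct values among n parameters."""
    # Step 1: auxiliary variable eta | M ~ Beta(M + 1, n).
    eta = rng.betavariate(M + 1.0, n)
    rate = b - math.log(eta)
    # Step 2: M | eta, K is a mixture of Gamma(a + K, rate) and
    # Gamma(a + K - 1, rate), with odds (a + K - 1) / (n * rate).
    odds = (a + K - 1.0) / (n * rate)
    shape = a + K if rng.random() < odds / (1.0 + odds) else a + K - 1.0
    return rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale

rng = random.Random(0)
M = 1.0
for _ in range(5):
    M = sample_strength(M, K=12, n=24, a=2.0, b=1.0, rng=rng)
    print(M)
```

In a full sampler, $K$ would be recomputed from the current $\theta_1, \ldots, \theta_n$ before each call.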
3. ANALYSIS OF MACHINE TOOL FAILURE DATA USING PARAMETRIC AND SEMIPARAMETRIC INFERENCE
The data used in this analysis, given in Table 1, were first
presented by Taraman (1974). They consist of the failure times
of 24 machine tools and their corresponding cutting speed,
feed rate, and depth of cut. Each experimental run used a
workpiece material of SAE 1018 cold-rolled steel, 4 inches
in diameter and 2 feet long. The 24 machine tools used for
the cutting were tungsten carbide disposable inserts mounted
in a tool holder. A 7.5-horsepower engine lathe equipped with
a three-jaw universal chuck and a live center mounted in the
tailstock was used to perform the cutting operation. The cutting
operations were performed without using cutting fluids.
In this section we use both parametric and semiparametric
inference methods to assess the effect of the covariates on the
useful life of these machine tools. The parametric analysis mirrors
the development of Mazzuchi and Soyer (1989) by specifying
$$\lambda_i(t; Z_i) = \lambda_0(t) \exp\{\beta_1 \ln Z_{i,1} + \beta_2 \ln Z_{i,2} + \beta_3 \ln Z_{i,3}\},$$
Table 1. The Machine Tool Failure Data

Machine tool   Speed (fpm)   Feed (ipr)   Depth of cut (inches)   Tool life (min.)
 1             340           .00630       .02100                  70.0
 2             570           .00630       .02100                  29.0
 3             340           .01410       .02100                  60.0
 4             570           .01416       .02100                  28.0
 5             340           .00630       .02100                  64.0
 6             570           .00630       .04000                  32.0
 7             340           .01416       .04000                  44.0
 8             570           .01416       .04000                  24.0
 9             440           .00905       .02900                  35.0
10             440           .00905       .02900                  31.0
11             440           .00905       .02900                  38.0
12             440           .00905       .02900                  35.0
13             305           .00905       .02900                  52.0
14             635           .00905       .02900                  23.0
15             440           .00472       .02900                  40.0
16             440           .01732       .02900                  28.0
17             440           .00905       .01350                  46.0
18             440           .00905       .04550                  33.0
19             305           .00905       .02900                  46.0
20             635           .00905       .02900                  27.0
21             440           .00472       .02900                  37.0
22             440           .01732       .02900                  34.0
23             440           .00905       .01350                  41.0
24             440           .00905       .04550                  28.0
where $Z_{i,1}$ is cutting speed, $Z_{i,2}$ is feed rate, $Z_{i,3}$ is depth of
cut, and $\lambda_0(t) = \alpha \gamma t^{\gamma - 1}$. For computational efficiency, we use
the techniques developed by Dellaportas and Smith (1993)
for inference in the parametric model. The semiparametric
inference follows the methods developed in Section 2. The two
approaches are compared in several ways, using the posterior
distributions of the model parameters, posterior distributions
of the predicted reliabilities, and posterior predictive densities,
as discussed by Gelfand (1996), and the deviance information
criterion (DIC), as described by Spiegelhalter et al. (2002).
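As a rough illustration of the proportional hazards form above, the following Python sketch evaluates the hazard for two tools from Table 1, assuming a Weibull baseline with scale `alpha` and shape `gamma`. The function name and all parameter values are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def weibull_ph_hazard(t, alpha, gamma, beta, z):
    """Hazard lambda_0(t) * exp(sum_j beta_j * ln z_j), assuming a Weibull baseline."""
    baseline = alpha * gamma * t ** (gamma - 1.0)
    linear_predictor = sum(b * math.log(zj) for b, zj in zip(beta, z))
    return baseline * math.exp(linear_predictor)


# Hypothetical parameter values (not taken from the paper).
alpha, gamma = 1.0, 2.0
beta = (1.5, 0.5, 0.2)

# Covariates (speed, feed, depth of cut) of tools 1 and 2 in Table 1.
tool_1 = (340.0, 0.00630, 0.02100)
tool_2 = (570.0, 0.00630, 0.02100)

h1 = weibull_ph_hazard(30.0, alpha, gamma, beta, tool_1)
h2 = weibull_ph_hazard(30.0, alpha, gamma, beta, tool_2)

# The two tools differ only in speed, so the hazard ratio is
# (570 / 340) ** beta[0] at every time t: the baseline cancels.
hazard_ratio = h2 / h1
```

Because the covariates enter through their logarithms, each coefficient acts as a power on the corresponding covariate ratio, which is why the ratio above does not depend on `t`, `alpha`, or `gamma`.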
3.1 Comparison of the Posterior Distributions
of the Model Parameters
For comparison of the semiparametric inference method
to the parametric method proposed by Mazzuchi and
Soyer (1989), the conditional baseline failure rate in (2)
is assumed to be a Weibull density with scale parameter
$\alpha_i$ and shape parameter $\gamma$. Thus, under the notation of
Section 2, $\theta_i = (\alpha_i, \gamma)$, with $n = 24$. Note that each item
is assumed to wear at the same rate; thus $\gamma$ is common
to each item, whereas the scale parameter $\alpha_i$ is allowed
to vary from one machine tool to the next. The conditional
parametric density in (2) is then
$$f(t_i \mid \alpha_i, \gamma, \beta, Z_i^*) = \alpha_i \gamma t_i^{\gamma - 1} \exp\{\beta^T Z_i^*\} \exp\{-\alpha_i t_i^{\gamma} \exp\{\beta^T Z_i^*\}\},$$
where $Z_i^* = (\ln Z_{i,1}, \ln Z_{i,2}, \ln Z_{i,3})^T$. For the prior best guess of the
mixing distribution of the $\alpha_i$s, $G_0$ in (2), a gamma distribution
is chosen.
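The conditional Weibull density described above can be checked numerically. The sketch below, with hypothetical parameter values and hypothetical rescaled covariates, codes the density and the matching survival function and confirms that the density integrates to one minus the survival probability over a finite interval. It is a sanity check on the formula, not the authors' implementation.

```python
import math


def linear_predictor(beta, z):
    """beta^T Z*, where Z* holds the natural logs of the covariates."""
    return sum(b * math.log(zj) for b, zj in zip(beta, z))


def density(t, alpha_i, gamma, beta, z):
    """Conditional Weibull proportional hazards density f(t | alpha_i, gamma, beta, Z*)."""
    scale = math.exp(linear_predictor(beta, z))
    hazard = alpha_i * gamma * t ** (gamma - 1.0) * scale
    return hazard * math.exp(-alpha_i * t ** gamma * scale)


def survival(t, alpha_i, gamma, beta, z):
    """Probability that the tool survives beyond time t."""
    scale = math.exp(linear_predictor(beta, z))
    return math.exp(-alpha_i * t ** gamma * scale)


# Hypothetical values for illustration only.
alpha_i, gamma = 0.01, 2.0
beta = (0.5, 0.2, 0.1)
z = (3.4, 6.3, 2.1)  # hypothetical rescaled speed, feed, depth of cut

# Midpoint-rule integral of the density over [0, upper].
upper, steps = 10.0, 10_000
width = upper / steps
integral = width * sum(
    density((k + 0.5) * width, alpha_i, gamma, beta, z) for k in range(steps)
)
failure_probability = 1.0 - survival(upper, alpha_i, gamma, beta, z)
```

Here `integral` and `failure_probability` should agree to within the quadrature error, which is what one expects if the density and survival function describe the same distribution.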
The prior assumptions of Mazzuchi and Soyer (1989) had
low variances and seemingly specific values for the means of
each parameter. However, no motivation was given for these
prior assumptions. Thus our prior distributions are noninformative,
with large variances assumed on each parameter. A priori,
$\gamma$, $\beta_1$, $\beta_2$, and $\beta_3$ are independent of each other and
of $\alpha_1, \ldots, \alpha_n$. A normal prior, with mean 0 and variance 20,
was assumed for each of the covariate effect parameters,
$\beta_1$, $\beta_2$, and $\beta_3$, where the covariate values had been scaled so that
they were of the same order of magnitude. This reflects our
lack of knowledge of whether the covariates would increase
or decrease failure time. The prior distribution of the shape
parameter, $\gamma$, was assumed to be a normal distribution truncated
at 0 with mean 1 and variance 10, reflecting no strong
prior belief concerning the failure rate behavior (whether it is
increasing or decreasing). The best-guess prior distribution $G_0$
for the scale parameters was assumed to be a gamma distribution
with mean 1 and variance 10. A priori, $M$ is assumed
to follow an uninformative gamma distribution with mean 24
and standard deviation 100.
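The gamma priors above are stated through their means and variances. A minimal Python sketch of the moment matching follows, assuming the shape and rate parameterization (mean = shape / rate, variance = shape / rate^2). It also draws once from each prior, assuming that the stated mean and variance of the truncated normal refer to the normal distribution before truncation. The helper names are hypothetical.

```python
import math
import random


def gamma_shape_rate(mean, variance):
    """Shape and rate of a gamma distribution with the given mean and variance."""
    return mean ** 2 / variance, mean / variance


def truncated_normal_draw(rng, mean, variance, lower=0.0):
    """Rejection sampler: redraw from the normal until the value exceeds `lower`."""
    sd = math.sqrt(variance)
    while True:
        x = rng.gauss(mean, sd)
        if x > lower:
            return x


# Best-guess prior G_0 for the scale parameters: mean 1, variance 10.
g0_shape, g0_rate = gamma_shape_rate(1.0, 10.0)

# Prior for M: mean 24, standard deviation 100.
m_shape, m_rate = gamma_shape_rate(24.0, 100.0 ** 2)

rng = random.Random(2003)  # arbitrary seed
beta_draw = [rng.gauss(0.0, math.sqrt(20.0)) for _ in range(3)]
gamma_draw = truncated_normal_draw(rng, 1.0, 10.0)
# random.gammavariate takes shape and scale, so the scale is 1 / rate.
alpha_draw = rng.gammavariate(g0_shape, 1.0 / g0_rate)
```

Under this parameterization `G_0` has shape 0.1 and rate 0.1, and the prior for `M` has shape 0.0576 and rate 0.0024, so both place substantial mass near zero while keeping long right tails.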
A single-chain Gibbs sampler was run to obtain 2,500 samples
with a warmup of 5,000 and a lag of 25 between successive
samples. Boxplots of the marginal posterior distributions
of the natural log of the scale parameters for the tools in the
data obtained under the semiparametric model are shown in
Figure 1. A second Gibbs sampler was run for the parametric
model using the methods of Dellaportas and Smith (1993).
The same prior distributions were assumed, except that the
prior distribution of the scale parameter was assumed to be the
best-guess prior, $G_0$, in the semiparametric model. The boxplot
of the distribution obtained under the parametric model
is also shown in Figure 1, denoted by a P on the x-axis.
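The sampling schedule described above implies a fixed chain length. The following sketch lists which iterations would be retained, assuming that the lag counts iterations between successive retained draws and that the first retained draw falls one lag after the warmup ends. The function name is hypothetical and the sampler itself is not shown.

```python
def retained_iterations(n_samples, warmup, lag):
    """1-based iteration indices kept after discarding warmup and thinning by `lag`."""
    return [warmup + lag * k for k in range(1, n_samples + 1)]


kept = retained_iterations(n_samples=2500, warmup=5000, lag=25)

# Under these assumptions the chain runs for 5000 + 25 * 2500 iterations.
total_iterations = kept[-1]
```

With these settings the chain runs for 67,500 iterations in total, of which 2,500 are kept.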
Figure 1. Boxplots of the Marginal Posterior Distributions of the Log
Scale Parameters of the Machine Tools Under the Parametric and
Semiparametric Models.
Figure 1 shows marked differences between the posterior
distributions of the scale parameters. The posterior distribution
of the scale parameter under the parametric model has a
smaller variance and obviously cannot represent the variability
among the tools demonstrated by the individual scale parameters
under the MDP setup. In the semiparametric model, the
individual scale parameters express the differences between
the machine tools that are not explained by the covariates.
We can examine such differences among the machine tools
through the marginal posterior distribution of $K$, the number
of groups of $\alpha_i$s in the MDP setup, shown in Figure 2.
Figure 2 indicates that there is little support for 1 group, as in
Figure 2. Marginal Posterior Distribution of the Number of Groups
of $\alpha_i$'s in the Semiparametric Model.

References
Book ChapterDOI

TL;DR: The analysis of censored failure times is considered in this paper, where the hazard function is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time.
Abstract: The analysis of censored failure times is considered. It is assumed that on each individual are available values of one or more explanatory variables. The hazard function (age-specific failure rate) is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time. A conditional likelihood is obtained, leading to inferences about the unknown regression coefficients. Some generalizations are outlined.

28,225 citations

"A Bayesian Semiparametric Analysis ..." refers background or methods in this paper

Journal ArticleDOI

TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined and derive a measure pD for the effective number in a model as the difference between the posterior mean of the deviances and the deviance at the posterior means of the parameters of interest, which is related to other information criteria and has an approximate decision theoretic justification.
Abstract: Summary. We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.

10,825 citations

BookDOI

TL;DR: The Markov Chain Monte Carlo Implementation Results Summary and Discussion MEDICAL MONITORING Introduction Modelling Medical Monitoring Computing Posterior Distributions Forecasting Model Criticism Illustrative Application Discussion MCMC for NONLINEAR HIERARCHICAL MODELS.
Abstract: INTRODUCING MARKOV CHAIN MONTE CARLO Introduction The Problem Markov Chain Monte Carlo Implementation Discussion HEPATITIS B: A CASE STUDY IN MCMC METHODS Introduction Hepatitis B Immunization Modelling Fitting a Model Using Gibbs Sampling Model Elaboration Conclusion MARKOV CHAIN CONCEPTS RELATED TO SAMPLING ALGORITHMS Markov Chains Rates of Convergence Estimation The Gibbs Sampler and Metropolis-Hastings Algorithm INTRODUCTION TO GENERAL STATE-SPACE MARKOV CHAIN THEORY Introduction Notation and Definitions Irreducibility, Recurrence, and Convergence Harris Recurrence Mixing Rates and Central Limit Theorems Regeneration Discussion FULL CONDITIONAL DISTRIBUTIONS Introduction Deriving Full Conditional Distributions Sampling from Full Conditional Distributions Discussion STRATEGIES FOR IMPROVING MCMC Introduction Reparameterization Random and Adaptive Direction Sampling Modifying the Stationary Distribution Methods Based on Continuous-Time Processes Discussion IMPLEMENTING MCMC Introduction Determining the Number of Iterations Software and Implementation Output Analysis Generic Metropolis Algorithms Discussion INFERENCE AND MONITORING CONVERGENCE Difficulties in Inference from Markov Chain Simulation The Risk of Undiagnosed Slow Convergence Multiple Sequences and Overdispersed Starting Points Monitoring Convergence Using Simulation Output Output Analysis for Inference Output Analysis for Improving Efficiency MODEL DETERMINATION USING SAMPLING-BASED METHODS Introduction Classical Approaches The Bayesian Perspective and the Bayes Factor Alternative Predictive Distributions How to Use Predictive Distributions Computational Issues An Example Discussion HYPOTHESIS TESTING AND MODEL SELECTION Introduction Uses of Bayes Factors Marginal Likelihood Estimation by Importance Sampling Marginal Likelihood Estimation Using Maximum Likelihood Application: How Many Components in a Mixture? 
Discussion Appendix: S-PLUS Code for the Laplace-Metropolis Estimator MODEL CHECKING AND MODEL IMPROVEMENT Introduction Model Checking Using Posterior Predictive Simulation Model Improvement via Expansion Example: Hierarchical Mixture Modelling of Reaction Times STOCHASTIC SEARCH VARIABLE SELECTION Introduction A Hierarchical Bayesian Model for Variable Selection Searching the Posterior by Gibbs Sampling Extensions Constructing Stock Portfolios With SSVS Discussion BAYESIAN MODEL COMPARISON VIA JUMP DIFFUSIONS Introduction Model Choice Jump-Diffusion Sampling Mixture Deconvolution Object Recognition Variable Selection Change-Point Identification Conclusions ESTIMATION AND OPTIMIZATION OF FUNCTIONS Non-Bayesian Applications of MCMC Monte Carlo Optimization Monte Carlo Likelihood Analysis Normalizing-Constant Families Missing Data Decision Theory Which Sampling Distribution? Importance Sampling Discussion STOCHASTIC EM: METHOD AND APPLICATION Introduction The EM Algorithm The Stochastic EM Algorithm Examples GENERALIZED LINEAR MIXED MODELS Introduction Generalized Linear Models (GLMs) Bayesian Estimation of GLMs Gibbs Sampling for GLMs Generalized Linear Mixed Models (GLMMs) Specification of Random-Effect Distributions Hyperpriors and the Estimation of Hyperparameters Some Examples Discussion HIERARCHICAL LONGITUDINAL MODELLING Introduction Clinical Background Model Detail and MCMC Implementation Results Summary and Discussion MEDICAL MONITORING Introduction Modelling Medical Monitoring Computing Posterior Distributions Forecasting Model Criticism Illustrative Application Discussion MCMC FOR NONLINEAR HIERARCHICAL MODELS Introduction Implementing MCMC Comparison of Strategies A Case Study from Pharmacokinetics-Pharmacodynamics Extensions and Discussion BAYESIAN MAPPING OF DISEASE Introduction Hypotheses and Notation Maximum Likelihood Estimation of Relative Risks Hierarchical Bayesian Model of Relative Risks Empirical Bayes Estimation of Relative Risks Fully Bayesian 
Estimation of Relative Risks Discussion MCMC IN IMAGE ANALYSIS Introduction The Relevance of MCMC to Image Analysis Image Models at Different Levels Methodological Innovations in MCMC Stimulated by Imaging Discussion MEASUREMENT ERROR Introduction Conditional-Independence Modelling Illustrative examples Discussion GIBBS SAMPLING METHODS IN GENETICS Introduction Standard Methods in Genetics Gibbs Sampling Approaches MCMC Maximum Likelihood Application to a Family Study of Breast Cancer Conclusions MIXTURES OF DISTRIBUTIONS: INFERENCE AND ESTIMATION Introduction The Missing Data Structure Gibbs Sampling Implementation Convergence of the Algorithm Testing for Mixtures Infinite Mixtures and Other Extensions AN ARCHAEOLOGICAL EXAMPLE: RADIOCARBON DATING Introduction Background to Radiocarbon Dating Archaeological Problems and Questions Illustrative Examples Discussion Index

7,284 citations

Journal ArticleDOI

TL;DR: In this article, a class of prior distributions, called Dirichlet process priors, is proposed for nonparametric problems, for which treatment of many non-parametric statistical problems may be carried out, yielding results that are comparable to the classical theory.
Abstract: The Bayesian approach to statistical problems, though fruitful in many ways, has been rather unsuccessful in treating nonparametric problems. This is due primarily to the difficulty in finding workable prior distributions on the parameter space, which in nonparametric ploblems is taken to be a set of probability distributions on a given sample space. There are two desirable properties of a prior distribution for nonparametric problems. (I) The support of the prior distribution should be large--with respect to some suitable topology on the space of probability distributions on the sample space. (II) Posterior distributions given a sample of observations from the true probability distribution should be manageable analytically. These properties are antagonistic in the sense that one may be obtained at the expense of the other. This paper presents a class of prior distributions, called Dirichlet process priors, broad in the sense of (I), for which (II) is realized, and for which treatment of many nonparametric statistical problems may be carried out, yielding results that are comparable to the classical theory. In Section 2, we review the properties of the Dirichlet distribution needed for the description of the Dirichlet process given in Section 3. Briefly, this process may be described as follows. Let $\mathscr{X}$ be a space and $\mathscr{A}$ a $\sigma$-field of subsets, and let $\alpha$ be a finite non-null measure on $(\mathscr{X}, \mathscr{A})$. Then a stochastic process $P$ indexed by elements $A$ of $\mathscr{A}$, is said to be a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha$ if for any measurable partition $(A_1, \cdots, A_k)$ of $\mathscr{X}$, the random vector $(P(A_1), \cdots, P(A_k))$ has a Dirichlet distribution with parameter $(\alpha(A_1), \cdots, \alpha(A_k)). 
P$ may be considered a random probability measure on $(\mathscr{X}, \mathscr{A})$, The main theorem states that if $P$ is a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with parameter $\alpha$, and if $X_1, \cdots, X_n$ is a sample from $P$, then the posterior distribution of $P$ given $X_1, \cdots, X_n$ is also a Dirichlet process on $(\mathscr{X}, \mathscr{A})$ with a parameter $\alpha + \sum^n_1 \delta_{x_i}$, where $\delta_x$ denotes the measure giving mass one to the point $x$. In Section 4, an alternative definition of the Dirichlet process is given. This definition exhibits a version of the Dirichlet process that gives probability one to the set of discrete probability measures on $(\mathscr{X}, \mathscr{A})$. This is in contrast to Dubins and Freedman [2], whose methods for choosing a distribution function on the interval [0, 1] lead with probability one to singular continuous distributions. Methods of choosing a distribution function on [0, 1] that with probability one is absolutely continuous have been described by Kraft [7]. The general method of choosing a distribution function on [0, 1], described in Section 2 of Kraft and van Eeden [10], can of course be used to define the Dirichlet process on [0, 1]. Special mention must be made of the papers of Freedman and Fabius. Freedman [5] defines a notion of tailfree for a distribution on the set of all probability measures on a countable space $\mathscr{X}$. For a tailfree prior, posterior distribution given a sample from the true probability measure may be fairly easily computed. Fabius [3] extends the notion of tailfree to the case where $\mathscr{X}$ is the unit interval [0, 1], but it is clear his extension may be made to cover quite general $\mathscr{X}$. With such an extension, the Dirichlet process would be a special case of a tailfree distribution for which the posterior distribution has a particularly simple form. 
There are disadvantages to the fact that $P$ chosen by a Dirichlet process is discrete with probability one. These appear mainly because in sampling from a $P$ chosen by a Dirichlet process, we expect eventually to see one observation exactly equal to another. For example, consider the goodness-of-fit problem of testing the hypothesis $H_0$ that a distribution on the interval [0, 1] is uniform. If on the alternative hypothesis we place a Dirichlet process prior with parameter $\alpha$ itself a uniform measure on [0, 1], and if we are given a sample of size $n \geqq 2$, the only nontrivial nonrandomized Bayes rule is to reject $H_0$ if and only if two or more of the observations are exactly equal. This is really a test of the hypothesis that a distribution is continuous against the hypothesis that it is discrete. Thus, there is still a need for a prior that chooses a continuous distribution with probability one and yet satisfies properties (I) and (II). Some applications in which the possible doubling up of the values of the observations plays no essential role are presented in Section 5. These include the estimation of a distribution function, of a mean, of quantiles, of a variance and of a covariance. A two-sample problem is considered in which the Mann-Whitney statistic, equivalent to the rank-sum statistic, appears naturally. A decision theoretic upper tolerance limit for a quantile is also treated. Finally, a hypothesis testing problem concerning a quantile is shown to yield the sign test. In each of these problems, useful ways of combining prior information with the statistical observations appear. Other applications exist. In his Ph. D. dissertation [1], Charles Antoniak finds a need to consider mixtures of Dirichlet processes. He treats several problems, including the estimation of a mixing distribution, bio-assay, empirical Bayes problems, and discrimination problems.

4,678 citations

"A Bayesian Semiparametric Analysis ..." refers background in this paper

Journal ArticleDOI

TL;DR: In this article, the authors describe and illustrate Bayesian inference in models for density estimation using mixtures of Dirichlet processes and show convergence results for a general class of normal mixture models.
Abstract: We describe and illustrate Bayesian inference in models for density estimation using mixtures of Dirichlet processes. These models provide natural settings for density estimation and are exemplified by special cases where data are modeled as a sample from mixtures of normal distributions. Efficient simulation methods are used to approximate various prior, posterior, and predictive distributions. This allows for direct inference on a variety of practical issues, including problems of local versus global smoothing, uncertainty about density estimates, assessment of modality, and the inference on the numbers of components. Also, convergence results are established for a general class of normal mixture models.

2,323 citations
