
Bayesian analysis and model selection for interval-censored survival data.

Abstract
Summary. Interval-censored data occur in survival analysis when the survival time of each patient is only known to be within an interval and these censoring intervals differ from patient to patient. For such data, we present some Bayesian discretized semiparametric models, incorporating proportional and nonproportional hazards structures, along with associated statistical analyses and tools for model selection using sampling-based methods. The scope of these methodologies is illustrated through a reanalysis of a breast cancer data set (Finkelstein, 1986, Biometrics 42, 845–854) to test whether the effect of a covariate on survival changes over time.

BAYESIAN ANALYSIS AND MODEL SELECTION FOR INTERVAL-CENSORED SURVIVAL DATA
by Debajyoti Sinha, Ming-Hui Chen, and Sujit K. Ghosh
Institute of Statistics Mimeograph Series No. 2298
May 1997
NORTH CAROLINA STATE UNIVERSITY
Raleigh, North Carolina


Bayesian Analysis and Model Selection for Interval-censored Survival Data
Debajyoti Sinha,* Ming-Hui Chen† and Sujit K. Ghosh‡
April 10, 1997
Abstract
Interval-censored data occur in survival analysis when the survival time of each patient is only known to be within an interval and these censoring intervals differ from patient to patient. This kind of data poses some challenges to semiparametric analysis and model diagnostics. For such data, we present some Bayesian discretized semiparametric models, incorporating the proportional and non-proportional hazards structures, along with the associated statistical analyses and tools for model selection using sampling-based methods. The scope of these methodologies is illustrated through a re-analysis of the historical data set from Finkelstein (1986).
Key Words: CPO, Gibbs sampler, Prior process.
1 Introduction
Many clinical trials and medical studies use periodic scheduled follow-ups of each patient to monitor the time to an event of interest or disease (i.e., the survival time T of the patient) whose occurrence is not apparent from outside. The occurrence of such an event can be detected only through some invasive procedure (such as testing blood or tissue samples) performed during these clinic visits. Medical researchers often come across interval censoring in such studies when patients miss some of the scheduled appointments for reasons not related to the survival times, and the observed censoring intervals containing their survival times frequently overlap with each other. Interval-censored survival data have recently received much attention in the biostatistical and statistical literature due to diseases such as AIDS and some forms of cancer. For recent reviews, see Satten (1996) and Frydman (1995).

* Department of Mathematics, University of New Hampshire, Durham, NH 03824-3591. Dr. Sinha's research was supported by the grant R29-CA69222-02 from NCI.
† Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, MA 01609-2276.
‡ Department of Statistics, North Carolina State University, Raleigh, NC 27695-8203.

The data set in Table 3 of Finkelstein and Wolfe (1985) is a historical set of interval-censored data. In this data set, 46 early breast cancer patients receiving only radiotherapy (covariate value z = 0) and 48 patients receiving radio-chemotherapy (z = 1) were monitored for cosmetic change during weekly clinic visits. However, some patients missed some of their weekly visits, so the data on survival time are typically recorded as, for example, (7, 18]: at the 7th-week clinic visit the patient had shown no change, and at the next clinic visit, in the 18th week, the patient's tissue showed that the change had already occurred. Since the clinic visits of different patients occurred at different times, the censoring intervals in the data set often overlap.
We are interested in the effect of the covariate z associated with the patient on the survival time T. A popular semiparametric approach to modeling survival time in the presence of covariate effects is Cox's (1972) proportional hazards model, given by $\lambda(t \mid z) = \lambda_0(t) e^{\beta z}$. Here $\lambda(t \mid z) = -\frac{d}{dt} \log P(T > t \mid z)$ is the hazard function of T given z, $\beta$ is the time-independent regression coefficient for the covariate z, and $\lambda_0(t)$ is the baseline hazard function. Finkelstein (1986) and Satten (1996) analyzed interval-censored data under the assumption of the Cox model. However, such an assumption of a time-independent regression coefficient may not always be valid.
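As a small illustration of what "time-independent regression coefficient" means in the Cox model, the following Python sketch evaluates the hazard for z = 0 and z = 1 and confirms that their ratio is exp(beta) regardless of t. The constant baseline hazard and the value of beta are hypothetical numbers chosen only for illustration; they are not taken from the paper or the data.

```python
import math

def cox_hazard(baseline_hazard, beta, z, t):
    """Cox hazard: lambda(t | z) = lambda_0(t) * exp(beta * z)."""
    return baseline_hazard(t) * math.exp(beta * z)

# Hypothetical baseline hazard and coefficient (illustrative assumptions).
def baseline(t):
    return 0.05

beta = 0.7

# Hazard ratio between z = 1 and z = 0 at several time points.
ratios = [
    cox_hazard(baseline, beta, 1, t) / cox_hazard(baseline, beta, 0, t)
    for t in (1.0, 10.0, 40.0)
]
hazard_ratio = ratios[0]  # the same value at every t under proportional hazards
```

A model with a time-varying coefficient would let this ratio differ across time points, which is the kind of departure from the Cox assumption that the paragraph above has in mind.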
The major contribution of the present paper is twofold. First, with the advancement of sampling-based computational tools, it is now feasible to consider more general models which incorporate time-varying coefficients. Second, while powerful computational tools enable us to fit remarkably complex models, we should not lose sight of the need to make suitably parsimonious choices. So, we develop some Bayesian tools for model selection and model validation. So far, to our knowledge, there is no formal statistical method to select among the models we propose or to check any modeling assumption, such as a time-independent coefficient, for interval-censored data. In addition, the Bayesian method enables us to obtain exact small-sample inference on the parameter of interest (i.e., the regression coefficient) from a moderately sized data set, even with a high-dimensional nuisance parameter (i.e., the baseline hazard).
In Section 2, we propose a Bayesian version of the discretized Cox model and a model with time-varying coefficients. In Section 3, we describe model fitting using a sampling-based method. In Section 4, we present some Bayesian model selection and model checking methods. In Section 5, we illustrate the proposed methods by reanalyzing the breast cancer data of Finkelstein and Wolfe (1985). Section 6 concludes with some remarks.

2 Models
We take the hazard to be a piecewise constant function with $\lambda(t \mid z) = \lambda_k \theta_k^z$ for $t \in I_k$, where $\theta_k = e^{\beta_k}$, $I_k = (a_{k-1}, a_k]$ for $k = 1, 2, \ldots, g$, $0 = a_0 < a_1 < \cdots < a_g = \infty$, and g is the total number of grid intervals. The length of each grid interval can be taken to be sufficiently small to approximate any hazard function for all practical purposes. Now, we present two Bayesian semiparametric discretized models, viz. a discretized version of the Cox model (which we call $M_0$) and a discretized hazard model with a time-dependent regression coefficient (which we call $M_1$). More precisely, these features are captured through their prior specifications as follows:

$M_0$: (1) $\lambda_k \overset{\text{indep}}{\sim} \text{Gamma}(\eta_k, \gamma_k)$ for $k = 1, \ldots, g$; (2) the coefficient is common to all intervals, $\beta_k = \beta$, with $\beta \sim N(\beta_0, w_0^2)$, and $\beta$ is a priori independent of $\lambda = (\lambda_1, \ldots, \lambda_g)$.

$M_1$: (1) $\lambda$ has the same prior as in $M_0$; (2) $\beta_{k+1} \mid \beta_1, \ldots, \beta_k \sim N(\beta_k, w_k^2)$ for $k = 0, \ldots, g-1$, and the $\beta_k$'s are a priori independent of $\lambda$.
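To make the piecewise constant hazard concrete, here is a minimal Python sketch that computes the survival function $P(T > t \mid z) = \exp\{-\int_0^t \lambda(s \mid z)\,ds\}$ on a grid and the probability that T falls in a censoring interval (lo, hi]. The difference of survival functions used for the interval probability is the standard form for interval-censored observations and is an assumption here, since the likelihood is not written out in this section. The grid, the $\lambda_k$ values and the $\beta_k$ values are hypothetical.

```python
import math

def cumulative_hazard(t, cuts, lam, beta, z):
    """Integrate lambda_k * exp(beta_k * z) over (0, t].

    cuts holds the finite left endpoints a_0 = 0 < a_1 < ... < a_{g-1};
    the last interval (a_{g-1}, infinity) is open-ended.
    """
    g = len(lam)
    total = 0.0
    for k in range(g):
        left = cuts[k]
        right = cuts[k + 1] if k + 1 < g else math.inf
        if t <= left:
            break
        total += lam[k] * math.exp(beta[k] * z) * (min(t, right) - left)
    return total

def survival(t, cuts, lam, beta, z):
    """P(T > t | z) under the piecewise constant hazard."""
    return math.exp(-cumulative_hazard(t, cuts, lam, beta, z))

def interval_prob(lo, hi, cuts, lam, beta, z):
    """P(lo < T <= hi | z); hi may be math.inf for a right-censored case."""
    return survival(lo, cuts, lam, beta, z) - survival(hi, cuts, lam, beta, z)

# Hypothetical grid with g = 3 intervals: (0, 10], (10, 20], (20, inf).
cuts = [0.0, 10.0, 20.0]
lam = [0.01, 0.02, 0.03]
beta_m0 = [0.5, 0.5, 0.5]   # M_0: one common coefficient
beta_m1 = [0.2, 0.5, 0.9]   # M_1: coefficient changes across intervals

p_z0 = interval_prob(7.0, 18.0, cuts, lam, beta_m0, z=0)
p_z1_m0 = interval_prob(7.0, 18.0, cuts, lam, beta_m0, z=1)
p_z1_m1 = interval_prob(7.0, 18.0, cuts, lam, beta_m1, z=1)
```

Under $M_0$ the list of coefficients repeats a single value, so the only difference between the two models in this sketch is whether `beta` is constant across the grid.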
In the above, we assume that the hyperparameters of these models, viz. the $\eta_k$'s, $\gamma_k$'s, $w_k$'s and $\beta_0$, are known in advance. $M_0$ is a discretized version of the Cox model with a discretized version of the gamma process prior (Kalbfleisch 1978) for the baseline hazard $\lambda_0(\cdot)$, where $\eta_k/\gamma_k$ is the prior mean and $\eta_k/\gamma_k^2$ is the prior variance of $\lambda_k$. When the grid intervals are sufficiently small, this discretized version will be indistinguishable from the actual time-continuous gamma process.
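The stated prior mean $\eta_k/\gamma_k$ and variance $\eta_k/\gamma_k^2$ correspond to a gamma distribution with shape $\eta_k$ and rate $\gamma_k$. The sketch below draws from that prior with Python's standard library and compares empirical moments with the formulas. The hyperparameter values are hypothetical and are not the ones used for the breast cancer data.

```python
import random
import statistics

# Hypothetical hyperparameters for g = 3 grid intervals (illustrative only).
eta = [2.0, 2.0, 2.0]       # shape parameters eta_k
gamma = [40.0, 20.0, 10.0]  # rate parameters gamma_k

def sample_baseline(rng, eta, gamma):
    """Draw (lambda_1, ..., lambda_g) with independent Gamma(eta_k, gamma_k) priors.

    random.gammavariate takes (shape, scale), so the rate gamma_k is inverted.
    """
    return [rng.gammavariate(e, 1.0 / g) for e, g in zip(eta, gamma)]

rng = random.Random(0)
draws = [sample_baseline(rng, eta, gamma) for _ in range(20000)]

prior_mean = [e / g for e, g in zip(eta, gamma)]
prior_var = [e / g**2 for e, g in zip(eta, gamma)]
emp_mean = [statistics.fmean(d[k] for d in draws) for k in range(len(eta))]
emp_var = [statistics.pvariance([d[k] for d in draws]) for k in range(len(eta))]
```

Comparing `emp_mean` with `prior_mean` and `emp_var` with `prior_var` is a quick check that the shape and rate convention has been matched correctly, which is a common source of error when a library parameterizes the gamma distribution by scale.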
The discretized autocorrelated prior process for the $\beta_k$'s in $M_1$ allows the covariate effect to change over time, but also incorporates the prior information that the values of the coefficient $\beta$ in adjacent intervals are expected to be somewhat close and that the dependence among the $\beta$'s decreases as the intervals become further apart. This assumption seems to be in complete accordance with some studies where the covariate effect may change over time, but is not expected to change too wildly over time.
The parameters $w_k$ can be used as a tuning device to express our prior opinion about the possible change in the magnitude of $\beta$ over time. For example, a priori we expect $\beta_{k+1}$ to be within approximately $1.96 w_k$ of $\beta_k$ with 95% probability. The $w_k$'s should depend on the lengths of the $I_k$'s, allowing the coefficient to change more for bigger grid intervals.
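The role of the $w_k$'s as a tuning device can be checked by simulating the prior $\beta_{k+1} \mid \beta_k \sim N(\beta_k, w_k^2)$ directly. The Python sketch below draws coefficient paths and estimates how often one step stays within $1.96 w_k$ of the previous value. The values of $\beta_0$ and the $w_k$'s are hypothetical choices for illustration.

```python
import random

def sample_beta_path(beta0, w, rng):
    """Draw (beta_1, ..., beta_g) from beta_{k+1} | beta_k ~ N(beta_k, w_k^2).

    w = [w_0, ..., w_{g-1}] are prior standard deviations; beta0 is the known
    starting value beta_0.
    """
    path, current = [], beta0
    for w_k in w:
        current = rng.gauss(current, w_k)
        path.append(current)
    return path

rng = random.Random(1)
beta0 = 0.0
w = [0.5, 0.5, 1.0]  # hypothetical w_0, w_1, w_2; a larger w_2 for a longer interval

paths = [sample_beta_path(beta0, w, rng) for _ in range(20000)]

# Fraction of prior draws with |beta_2 - beta_1| <= 1.96 * w_1.
within = sum(abs(p[1] - p[0]) <= 1.96 * w[1] for p in paths) / len(paths)
```

Smaller $w_k$ values pull the simulated paths toward a constant coefficient, so in this sketch the Cox-type behaviour of $M_0$ appears as the limiting case of very small step standard deviations.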
It is also possible to use an autocorrelated prior process for the baseline hazard. For details on the use and properties of an autocorrelated process, see Sinha and Dey (1997) and Sargent (1996).
Our major interest is to compare the Cox model ($M_0$) with the time-varying coefficient model ($M_1$). For the example of the breast cancer data, we consider the following values of the hyperparameters.

References
Book ChapterDOI

Regression Models and Life-Tables

TL;DR: The analysis of censored failure times is considered in this paper, where the hazard function is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time.
Journal ArticleDOI

Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review

TL;DR: All of the methods in this work can fail to detect the sorts of convergence failure that they were designed to identify, so a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence are recommended.
Journal ArticleDOI

Adaptive Rejection Sampling for Gibbs Sampling

TL;DR: In this paper, the authors proposed a method for rejection sampling from any univariate log-concave probability density function, which is adaptive: as sampling proceeds, the rejection envelope and the squeezing function converge to the density function.