A Nonparametric Multidimensional Latent Class IRT Model in a Bayesian Framework.

Abstract
We propose a nonparametric item response theory model for dichotomously-scored items in a Bayesian framework. The model is based on a latent class (LC) formulation, and it is multidimensional, with dimensions corresponding to a partition of the items in homogenous groups that are specified on the basis of inequality constraints among the conditional success probabilities given the latent class. Moreover, an innovative system of prior distributions is proposed following the encompassing approach, in which the largest model is the unconstrained LC model. A reversible-jump type algorithm is described for sampling from the joint posterior distribution of the model parameters of the encompassing model. By suitably post-processing its output, we then make inference on the number of dimensions (i.e., number of groups of items measuring the same latent trait) and we cluster items according to the dimensions when unidimensionality is violated. The approach is illustrated by two examples on simulated data and two applications based on educational and quality-of-life data.


A nonparametric multidimensional latent class
IRT model in a Bayesian framework
Francesco Bartolucci, Alessio Farcomeni and Luisa Scaccia
Abstract We propose a nonparametric Item Response Theory model for dichotomously scored items in a Bayesian framework. Partitions of the items are defined on the basis of inequality constraints among the latent class success probabilities. A Reversible Jump type algorithm is described for sampling from the posterior distribution. A consequence is the possibility to make inference on the number of dimensions (i.e., the number of groups of items measuring the same latent trait) and to cluster items when unidimensionality is violated.
Key words: Item response theory, unidimensionality, stochastic partition.
1 Introduction
Educational and psychological tests are often based on a set of items which measure
a unidimensional latent trait, that is, a single personal aspect which is not directly
observable (e.g., ability in a certain subject, tendency toward a certain behavior).
When the test is unidimensional, the responses to the items may be validly sum-
marized by a single indicator (e.g., the sum of the correct responses at individual
level) and respondents may be globally ranked according to such an indicator and
the distance between any two respondents in terms of the single latent trait may be
Francesco Bartolucci
Dipartimento di Economia, Finanza e Statistica, Università di Perugia, Via A. Pascoli 20, 06123 Perugia, Italy, e-mail: bart@stat.unipg.it
Alessio Farcomeni
Dipartimento di Sanità Pubblica e Malattie Infettive, Sapienza - Università di Roma, Piazzale Aldo Moro, 5, 00186 Roma, Italy, e-mail: alessio.farcomeni@uniroma1.it
Luisa Scaccia
Dipartimento di Economia e Diritto, Università di Macerata, Via Crescimbeni 20, 62100 Macerata, Italy, e-mail: scaccia@unimc.it

simply measured. An important consequent issue is how to test the unidimensionality assumption and, in case it is violated, how to group items in a sensible way
so that items in the same group measure the same latent trait. Bartolucci (2007)
introduced a multidimensional parametric Item Response Theory (IRT) model for
dichotomously-scored items, which is based on the assumption that respondents are
grouped into k latent classes of ability, and found the number of dimensions, s, and
clusters of items through a hierarchical agglomerative clustering algorithm based
on the model likelihood. However, this approach is based on certain parametric assumptions which may affect the selected number of dimensions.
In this work, we propose to select s relying on a completely nonparametric model
formulated along the lines of Forcina and Bartolucci (2004). This formulation is
based on a set of inequalities on the conditional probabilities of success in each
item given the level of the ability. The distribution of the ability is still assumed to
be discrete, therefore having k latent classes. Consequently, two items measure the
same dimension if their success probabilities have the same ordering with respect to
the latent classes. Any specific model depends on the number of latent classes and
the set of inequalities on success probabilities, which, in turn, determines a certain
partition of the items into s groups. Inference on the proposed nonparametric IRT models is based on the Bayesian paradigm, allowing us to work with unknown k and s. Relying on the encompassing approach of Klugkist et al (2005), we formulate the priors on the parameters of a model that includes any other model of interest; see also Bartolucci et al (2012). Such an encompassing model is the latent class model
(Lazarsfeld and Henry, 1968) with k classes. This automatically defines the priors
on any nested model. For estimation purposes, we use the Reversible Jump (RJ)
algorithm (Green, 1995; Green and Richardson, 2001) applied to the latent class
model. The output is then suitably post-processed to estimate the posterior probability of any nonparametric IRT model. An alternative algorithm, expected to be
more efficient, is also outlined.
The paper is organized as follows. Section 2 formalizes the nonparametric
IRT model and deals with Bayesian estimation. Section 3 illustrates the approach
through an application on the Mathematics test data used in Bartolucci (2007).
2 Model Formulation and Bayesian Inference
Let $Y_{ij}$, $i = 1,\dots,n$, $j = 1,\dots,r$, denote the binary outcome measured on the $i$-th subject for the $j$-th item. We assume that the sample of respondents is drawn from a population divided into $k$ latent classes, with individuals in the same class sharing the same ability level. Thus the ability is represented by a discrete latent variable $C$ having $k$ support points denoted, without loss of generality, by $1,\dots,k$. Let $\pi_1,\dots,\pi_k$ be the class weights and $\lambda_{cj} = p(Y_{ij} = 1 \mid C = c)$ denote the probability of success at the $j$-th item for any subject $i$ in class $c$. Given two items, $j_1$ and $j_2$ say, these are said to measure the same dimension if there exists a permutation of $1,\dots,k$, denoted by $c_1,\dots,c_k$, such that

$$\lambda_{c_1 j} \le \cdots \le \lambda_{c_k j}, \qquad j = j_1, j_2. \qquad (1)$$
In other words, the success probabilities of the two items are ordered in the same way. Such a characterization of items measuring the same dimension is completely nonparametric, in contrast with the one in Bartolucci (2007), which is based on a parametric formulation of $\lambda_{cj}$. For the full set of items, the nonparametric IRT model is specified by fixing $k$ and a permutation $c^{(j)}_1,\dots,c^{(j)}_k$ of the type (1) for every item $j = 1,\dots,r$. If there are $s$ different permutations, there are $s$ groups of items measuring distinct dimensions, which are denoted by $\mathcal{J}_1,\dots,\mathcal{J}_s$ and collected in $\mathcal{J}$.
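To make the grouping rule concrete, the partition induced by (1) can be computed from a candidate matrix of success probabilities by grouping items whose columns share the same sorting permutation. This is a minimal sketch under our own naming (`item_partition` is not from the paper):

```python
import numpy as np

def item_partition(Lam):
    """Group items into dimensions: by (1), two items measure the same
    dimension when their success probabilities share the same ordering
    across the k latent classes. Lam is the k x r matrix of lambda_cj."""
    groups = {}
    for j in range(Lam.shape[1]):
        perm = tuple(np.argsort(Lam[:, j], kind="stable"))
        groups.setdefault(perm, []).append(j)
    # Order groups as in the paper: the first group contains the first item,
    # the next group the smallest-index item not yet assigned, and so on.
    return sorted(groups.values(), key=lambda g: g[0])

# Items 0 and 1 are ordered the same way over the classes; item 2 is reversed.
Lam = np.array([[0.2, 0.1, 0.9],
                [0.5, 0.4, 0.6],
                [0.8, 0.7, 0.3]])
print(item_partition(Lam))  # [[0, 1], [2]]
```

Here $s = 2$: the first two items form one dimension and the third item forms another.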
The observed log-likelihood of the model defined above may be easily computed as
$$\ell(\Lambda,\pi) = \sum_i \log\Big[\sum_c \pi_c \prod_j \lambda_{cj}^{y_{ij}} (1-\lambda_{cj})^{1-y_{ij}}\Big], \qquad (2)$$
where $\Lambda$ is the $k \times r$ matrix of probabilities $\lambda_{cj}$, $\pi$ is the vector of class weights $\pi_c$, and $y_{ij}$ is the observed value of $Y_{ij}$. To make estimation easier, it is convenient to introduce the latent class indicators $z_{ic}$, $i = 1,\dots,n$, $c = 1,\dots,k$, where $z_{ic} = 1$ if the $i$-th subject is in latent class $c$; see for instance Diebolt and Robert (1994). The complete, or augmented, data log-likelihood, after augmenting the data with the $z_{ic}$, is then
$$\ell_c(\Lambda,\pi) = \sum_c \sum_i z_{ic} \log(\pi_c) + \sum_c \sum_i \sum_j z_{ic}\big[y_{ij}\log(\lambda_{cj}) + (1-y_{ij})\log(1-\lambda_{cj})\big]. \qquad (3)$$
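As an illustration, the observed log-likelihood (2) can be evaluated with a numerically stable log-sum-exp over the classes. The helper below is a sketch under our own naming (`loglik`), not the authors' code:

```python
import numpy as np

def loglik(Lam, pi, Y):
    """Observed log-likelihood (2) of the latent class model.
    Lam: (k, r) success probabilities; pi: (k,) class weights; Y: (n, r) 0/1."""
    # log p(y_i | C = c) = sum_j [ y_ij log lam_cj + (1 - y_ij) log(1 - lam_cj) ]
    logf = Y @ np.log(Lam.T) + (1 - Y) @ np.log(1 - Lam.T)   # (n, k)
    a = logf + np.log(pi)                                    # add log pi_c
    m = a.max(axis=1, keepdims=True)                         # log-sum-exp over c
    return float(np.sum(m[:, 0] + np.log(np.exp(a - m).sum(axis=1))))

# Two classes, one item: p(y = 1) = 0.5 * 0.9 + 0.5 * 0.1 = 0.5
print(loglik(np.array([[0.9], [0.1]]), np.array([0.5, 0.5]), np.array([[1]])))  # log(0.5)
```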
2.1 Prior Distributions
Any model of the type above is nested in a latent class model in which the probabilities $\lambda_{cj}$ are left unconstrained (Lazarsfeld and Henry, 1968). Then, once the priors have been specified for this model, we can automatically specify those of any nested model by the encompassing approach (Klugkist et al, 2005): prior distributions for nested models are automatically derived by truncating the parameter space according to the constraints of interest.
For the encompassing model we adopt Bayes-Laplace priors for the success probabilities and class weights (Tuyl et al, 2009). This choice reduces to an (unconditional) uniform prior for $\lambda_{cj}$, $c = 1,\dots,k$. For the class weights, it corresponds to a Dirichlet distribution whose parameter vector has all elements equal to 1. Finally, we use a uniform prior for $k$ on the discrete set $1,\dots,k_{\max}$.
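A draw from this system of priors for the encompassing model can be sketched as follows (the function name is ours; `k_max` stands for whatever maximum number of classes is chosen):

```python
import numpy as np

def sample_encompassing_prior(k_max, r, rng):
    """One draw from the priors of the encompassing LC model:
    k uniform on {1, ..., k_max}, pi ~ Dirichlet(1, ..., 1),
    and each lambda_cj ~ U(0, 1) (the Bayes-Laplace choice)."""
    k = int(rng.integers(1, k_max + 1))      # uniform over {1, ..., k_max}
    pi = rng.dirichlet(np.ones(k))           # flat Dirichlet class weights
    Lam = rng.uniform(size=(k, r))           # unconstrained success probabilities
    return k, pi, Lam

rng = np.random.default_rng(0)
k, pi, Lam = sample_encompassing_prior(k_max=5, r=4, rng=rng)
```

Priors for any nested model follow by truncation: a draw is retained only if $\Lambda$ satisfies the inequality constraints of that model.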

2.2 Estimation strategy based on the Reversible Jump algorithm
Our estimation strategy makes use of the RJ algorithm, which samples from the posterior distribution of all the parameters of the latent class model, including $k$. The RJ output is then post-processed for identifiability (Frühwirth-Schnatter, 2001) and to deliver all the different partitions of items visited by the algorithm.
The algorithm performs the following steps:
1. Sample the latent class indicators $z_{ic}$ from their full conditional distribution:
$$\Pr(z_{ic} = 1 \mid Y, \lambda, \pi) = \frac{\pi_c \prod_j \lambda_{cj}^{y_{ij}} (1-\lambda_{cj})^{1-y_{ij}}}{\sum_h \pi_h \prod_j \lambda_{hj}^{y_{ij}} (1-\lambda_{hj})^{1-y_{ij}}}.$$
2. Update $\lambda_{cj}$. For each $j = 1,\dots,r$, we propose simultaneous independent zero-centered normal increments of the current $\mathrm{logit}(\lambda_j)$, where $\lambda_j = (\lambda_{1j},\dots,\lambda_{kj})$. The candidate $\lambda^\star_j$ is accepted with probability $\min(1, p_{\lambda^\star_j})$, where
$$\log(p_{\lambda^\star_j}) = \sum_c \sum_i z_{ic}\big\{y_{ij}\log(\lambda^\star_{cj}/\lambda_{cj}) + (1-y_{ij})\log[(1-\lambda^\star_{cj})/(1-\lambda_{cj})]\big\} + \sum_c \big[\log(\lambda^\star_{cj}) + \log(1-\lambda^\star_{cj}) - \log(\lambda_{cj}) - \log(1-\lambda_{cj})\big]. \qquad (4)$$
The first line on the right side is the log-likelihood ratio. The ratio between the prior densities cancels out when using uniform priors for $\lambda_{cj}$, as suggested. The ratio between the proposal densities also cancels out, apart from the logarithm of the Jacobian of the logit transformation, given in the second line of (4).
3. Sample the weights $\pi_1,\dots,\pi_k$ from their full conditional distribution, which is a Dirichlet with parameters $(1 + \sum_i z_{i1}, \dots, 1 + \sum_i z_{ik})$.
4. Update $k$. We follow the approach consisting of a random choice between splitting an existing latent class into two and merging two existing classes into one. The probabilities of these alternatives are $b_k$ and $1 - b_k$, respectively. Of course $b_1 = 1$ and $b_{k_{\max}} = 0$, and otherwise we choose $b_k = 0.5$ for $k = 2,\dots,k_{\max}-1$.
For the combine proposal we randomly choose a pair of classes $(c_1, c_2)$, with $\pi_{c_1} < \pi_{c_2}$, not necessarily adjacent in terms of the current value of their weights. These two classes are merged into a new one, labeled $c^\star = c_2 - 1$, reducing $k$ by 1. We then reallocate all those observations $y_{ij}$, $j = 1,\dots,r$, with $z_{ic_1} = 1$ or $z_{ic_2} = 1$ to the new class $c^\star$ and create values for $\lambda_{c^\star j}$ and $\pi_{c^\star}$ in such a way that $\lambda_{c^\star j} = \lambda_{c_2 j}$ and $\pi_{c^\star} = \pi_{c_1} + \pi_{c_2}$.
In the split proposal, a class $c^\star$ is chosen at random and split into two new ones, labeled $c_1$ and $c_2$, augmenting $k$ by 1. The place assigned to the class $c_1$ is randomly chosen between 1 and $c^\star$, while the class $c_2$ takes the place $c^\star + 1$. Values for $\pi_{c_1}, \pi_{c_2}, \lambda_{c_1 j}, \lambda_{c_2 j}$, for $j = 1,\dots,r$, are created by generating a scalar $u_1$ and a vector $u_2 = (u_{2j})_{j=1}^r$, with $u_1 \sim U[0, 0.5]$ and $u_{2j} \sim U[0, 1]$, and setting
$$\pi_{c_1} = u_1 \pi_{c^\star}, \quad \pi_{c_2} = (1 - u_1)\pi_{c^\star}, \quad \lambda_{c_1 j} = u_{2j}, \quad \lambda_{c_2 j} = \lambda_{c^\star j}, \qquad j = 1,\dots,r. \qquad (5)$$
Finally we reallocate all those observations $y_{ij}$, $j = 1,\dots,r$, with $z_{ic^\star} = 1$ between the two new classes, in a way analogous to the standard Gibbs allocation move used in step 1. We accept the split move with probability $\min(1, p_k)$, where
$$p_k = \text{(likelihood ratio)} \times \frac{\Pr(k+1)}{\Pr(k)} \times \frac{D(\pi_1,\dots,\pi_{k+1})}{D(\pi^\star_1,\dots,\pi^\star_k)} \times \frac{(\pi_{c_1})^{\sum_i z_{ic_1}} (\pi_{c_2})^{\sum_i z_{ic_2}}}{(\pi^\star_{c^\star})^{\sum_i z^\star_{ic^\star}}} \times \frac{2(1-b_{k+1})}{b_k P_{\mathrm{alloc}}} \times \pi^\star_{c^\star}, \qquad (6)$$
where $P_{\mathrm{alloc}}$ is the probability of this particular allocation and $D$ is the Dirichlet density with all parameters equal to 1. The first four terms in the product are the ratio of the likelihood and the priors for the new parameter set to those for the old one. The fifth term is the proposal ratio. The last term is the Jacobian of the transformation from $(\pi_{c^\star}, \lambda_{c^\star 1},\dots,\lambda_{c^\star r}, u_1, u_{21},\dots,u_{2r})$ to $(\pi_{c_1}, \lambda_{c_1 1},\dots,\lambda_{c_1 r}, \pi_{c_2}, \lambda_{c_2 1},\dots,\lambda_{c_2 r})$. The combine move is accepted with probability $\min(1, p_k^{-1})$, with some obvious substitutions in the expression for $p_k$.
From the RJ output, we estimate the posterior probability of any nonparametric IRT model visited at least once and the posterior distribution of its parameters. Let $k^{(t)}$ be the number of classes of the model visited at sweep $t$ of the algorithm and $\Lambda^{(t)}$ and $\pi^{(t)}$ be the parameters of this model, with $t = 1,\dots,T$. Then, we examine every matrix $\Lambda^{(t)}$ and, for $j = 1,\dots,r$, we obtain the permutations $c^{(j)}_1,\dots,c^{(j)}_{k^{(t)}}$ such that the probabilities in the $j$-th column of this matrix satisfy inequality (1). As clarified before, these permutations define a partition of the items into groups corresponding to different dimensions. In particular, the partition at sweep $t$ is denoted by $\mathcal{J}^{(t)}_1,\dots,\mathcal{J}^{(t)}_{s^{(t)}}$, where $s^{(t)}$ is the number of dimensions that is found. To avoid a sort of label-switching problem, the groups are ordered so that $\mathcal{J}^{(t)}_1$ includes the first item, $\mathcal{J}^{(t)}_2$ includes the item with the smallest index among those excluded from $\mathcal{J}^{(t)}_1$, and so on. Finally, the posterior probability of the model with a certain $k$ and a certain partition of items $\mathcal{J}_1,\dots,\mathcal{J}_s$ based on $s$ dimensions is estimated as
$$\widehat{\Pr}(k, \mathcal{J}_1,\dots,\mathcal{J}_s) = \frac{1}{T} \sum_{t:\, s^{(t)} = s} I\big\{\mathcal{J}^{(t)}_1 = \mathcal{J}_1,\dots,\mathcal{J}^{(t)}_s = \mathcal{J}_s\big\}, \qquad (7)$$
where the sum is over all sweeps for which $s^{(t)} = s$ and $I\{\cdot\}$ is the indicator function.
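Operationally, the estimate in (7) amounts to tallying, over the saved sweeps, how often each pair (number of classes, item partition) occurs. A sketch under our own naming (`partition_of`, `posterior_model_probs`):

```python
from collections import Counter

import numpy as np

def partition_of(Lam):
    """Item partition induced by a (k, r) matrix: items whose columns share
    the same ordering fall in the same group, labelled as in the text."""
    groups = {}
    for j in range(Lam.shape[1]):
        perm = tuple(np.argsort(Lam[:, j], kind="stable"))
        groups.setdefault(perm, []).append(j)
    return tuple(tuple(g) for g in sorted(groups.values(), key=lambda g: g[0]))

def posterior_model_probs(Lam_draws):
    """Estimate (7): relative frequency of each (k, partition) over T sweeps."""
    T = len(Lam_draws)
    counts = Counter((L.shape[0], partition_of(L)) for L in Lam_draws)
    return {model: c / T for model, c in counts.items()}

draws = [np.array([[0.2, 0.1], [0.8, 0.9]]),   # both items ordered the same way
         np.array([[0.2, 0.1], [0.8, 0.9]]),
         np.array([[0.2, 0.9], [0.8, 0.1]])]   # items ordered oppositely
print(posterior_model_probs(draws))
```

With these three draws, the model with $k = 2$ and both items in one dimension gets estimated probability 2/3, and the two-dimension model gets 1/3.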
On the basis of the posterior probabilities in (7), different strategies may be adopted for model selection. We suggest selecting first the value of $k$ on the basis of the largest number of visits. Then, conditionally on the value of $k$, we take the partition with the highest value of the probability in (7). This strategy is similar to that in Bartolucci (2007). Alternatively, $k$ and the partition $\mathcal{J}_1,\dots,\mathcal{J}_s$ can be chosen jointly as those with the highest posterior probability in (7). This method may lead

References
Akaike, H. Information theory and an extension of the maximum likelihood principle.
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination.
Hurvich, C. M. and Tsai, C.-L. Regression and time series model selection in small samples.
Schwarz, G. Estimating the dimension of a model.
Zigmond, A. S. and Snaith, R. P. The Hospital Anxiety and Depression Scale.