Bounded-Influence Robust Estimation in Generalized Linear Latent Variable Models

Irini Moustaki and Maria-Pia Victoria-Feser
Journal of the American Statistical Association, 01 Jun 2006, Vol. 101, Iss. 474, pp. 644-653
Abstract
Latent variable models are used for analyzing multivariate data. Recently, generalized linear latent variable models for categorical, metric, and mixed-type responses estimated via maximum likelihood (ML) have been proposed. Model deviations, such as data contamination, are shown analytically, using the influence function and through a simulation study, to seriously affect ML estimation. This article proposes a robust estimator that is made consistent using the basic principle of indirect inference and can be easily numerically implemented. The performance of the robust estimator is significantly better than that of the ML estimators in terms of both bias and variance. A real example from a consumption survey is used to highlight the consequences in practice of the choice of the estimator.


Article
Reference
Bounded-Influence Robust Estimation in Generalized Linear Latent
Variable Models
MOUSTAKI, Irini, VICTORIA-FESER, Maria-Pia
MOUSTAKI, Irini, VICTORIA-FESER, Maria-Pia. Bounded-Influence Robust Estimation in
Generalized Linear Latent Variable Models. Journal of the American Statistical Association
, 2006, vol. 101, no. 474, p. 644-653
DOI : 10.1198/016214505000001320
Available at:
http://archive-ouverte.unige.ch/unige:6460
Disclaimer: layout of this document may differ from the published version.

JASA jasa v.2004/12/09 Prn:16/12/2005; 14:49 F:jasatm04107r2.tex; (Daiva) p. 1
Bounded-Influence Robust Estimation in Generalized
Linear Latent Variable Models
Irini MOUSTAKI and Maria-Pia VICTORIA-FESER
Latent variable models are used for analyzing multivariate data. Recently, generalized linear latent variable models for categorical, metric,
and mixed-type responses estimated via maximum likelihood (ML) have been proposed. Model deviations, such as data contamination, are
shown analytically, using the influence function and through a simulation study, to seriously affect ML estimation. This article proposes
a robust estimator that is made consistent using the basic principle of indirect inference and can be easily numerically implemented. The
performance of the robust estimator is significantly better than that of the ML estimators in terms of both bias and variance. A real example
from a consumption survey is used to highlight the consequences in practice of the choice of the estimator.
KEY WORDS: Indirect inference; Influence function; Latent variable models; Mixed variables; Robust estimation.
1. INTRODUCTION
Latent variable models are widely used in social sciences
for studying the interrelationships among observed variables.
More specifically, latent variable models are used for reduc-
ing the dimensionality of multivariate data, for assigning scores
to sample members on the latent dimensions identified by the
model, and for constructing measurement scales (e.g., in edu-
cational testing and psychometrics). Moustaki and Knott (2000)
proposed a generalized linear latent variable model (GLLVM)
framework for any type of observed data (metric or categorical)
in the exponential family. They extended the work of Moustaki
(1996) and Sammel, Ryan, and Legler (1997) for mixed binary
and metric variables (the latter with covariate effects as well)
and Bartholomew and Knott (1999) for categorical variables.
A similar framework, which includes multilevel models (random-effects models) as a special case, was also discussed by Skrondal and Rabe-Hesketh (2004).
In the literature, the parameters of GLLVM are estimated us-
ing a classical maximum likelihood (ML) approach. However,
ML estimation is based on the fundamental assumptions that
the data are generated exactly from the model and, in partic-
ular, that there are no gross errors in the set of responses. For
example, in the case of normal variables, a subject with a re-
sponse more than 3 standard deviations away from the mean
is considered an unexpected response under the normal model,
which can be either an error (e.g., recording error) or just an
unusual subject not representative of the sampled population.
For binary variables, an unexpected response occurs when the
associated probability is low under the true model. To illustrate
the proposed methodology, in Section 5 we attempt to construct
a measurement scale for the construct “wealth” using five indicators (data collected from Swiss households). Two of these indicators are binary, recording the possession of a dishwasher and a car, and three are continuous, measuring expenditures on food, clothing, and housing. When a standard maximum likelihood estimator (MLE) is used, three of the five indicators are found to have significant loadings on the construct “wealth.”
When the robust estimator is used that accounts for the presence
Irini Moustaki is Associate Professor, Department of Statistics, Athens
University of Economics and Business, 104-34 Athens, Greece (E-mail:
moustaki@aueb.gr). Maria-Pia Victoria-Feser is Professor, Faculty of Eco-
nomics and Social Sciences (HEC), University of Geneva 40, Geneva 4,
Switzerland (E-mail: Maria-Pia.VictoriaFeser@hec.unige.ch). This work was
supported in part by the Swiss National Science Foundation (grants 610-
057883.99 and PP001-106465). The authors thank the two anonymous referees
and the associate editor for their comments, which helped improve the quality
of the article.
of a few extreme observations in the data, one more indicator is
added to the “wealth” scale.
In this article we investigate the effect of an unexpected (un-
usual) set of responses on the MLE with respect to bias and ef-
ficiency. We show theoretically and through a simulation study
that MLEs may change significantly if subjects that do not “fit
the model” are present in the sample. That makes the MLE less
stable (not robust), and therefore, in principle, one subject may
change the conclusions drawn from the data analysis. This is a
rather undesirable property of the estimation procedure. In that case, a robust estimator built to be resistant to model deviations should be developed and used in practice. The aim of this article is
therefore twofold: to investigate the robustness properties of the
MLE both theoretically and through a simulation for GLLVM,
and to propose a robust estimator.
General robustness theory has been given by Huber (1981)
and Hampel, Ronchetti, Rousseeuw, and Stahel (1986), who set
the foundations. To assess some of the robustness properties
of any statistic, such as an estimator or a test statistic, one can
use the influence function (IF) (Hampel 1968, 1974). Hampel
et al. (1986) showed that the asymptotic bias of an estimator is
proportional to its IF. To build a robust estimator, one can con-
sider a general class of estimators, such as M-estimators (Huber
1964), and choose one that has a bounded IF. The optimal bias robust estimator (OBRE), defined by Hampel et al. (1986) for general parametric models, is the most efficient M-estimator with bounded IF. But the OBRE is very hard to compute
when the models are complicated, like GLLVM. Other robust
estimators like those based on weighted score functions, such
as weighted MLEs (WMLEs) (see Dupuis and Morgenthaler
2002), can be used, but if the model is not based on symmetric distributions (such as the normal), then care must be taken to avoid inconsistent estimators. To correct WMLEs for bias,
Dupuis and Morgenthaler (2002) proposed a first-order approx-
imation correction term. In this article we propose a simple
M-estimator based on weighted score functions (i.e., a WMLE),
and adapt indirect estimation (Gouriéroux, Monfort, and
Renault 1993; Gallant and Tauchen 1996; Genton and Ronchetti
2003) to make the resulting estimator consistent.
The article is organized as follows. The GLLVM and the
MLE of the model parameters are presented in Section 2.1.
A robust estimator is presented in Sections 2.2 and 2.3, and
its robustness, efficiency, and consistency properties are studied

in Section 3 along with the robustness properties of the MLE.
In Section 4 the behavior of the MLE and the robust estimator under model contamination is studied through a simulation study, and in Section 5 the consumption dataset is analyzed using both methods.
2. ESTIMATION OF GENERALIZED LINEAR
LATENT VARIABLE MODELS
2.1 Approximate Maximum Likelihood Estimator
The basic idea of latent variable analysis is as follows. For a given set of response variables $x_1, \dots, x_p$, one wants to find a set of latent variables or factors $z_1, \dots, z_q$, that are fewer in number than the observed variables, but that contain essentially the same information. The factors are supposed to account for the dependencies among the response variables in the sense that if the factors are held fixed, then the observed variables are independent. This is known as the assumption of conditional or local independence.
The conditional distribution of $x_m \mid \mathbf{z}$ (where $\mathbf{z} = [z_1, \dots, z_q]$) is taken from the exponential family (with canonical link functions)
$$g_m(x_m \mid \mathbf{z}, \boldsymbol{\theta}_m) = \exp\left\{ \frac{x_m \boldsymbol{\alpha}_m \mathbf{z}^{*} - b(\boldsymbol{\alpha}_m \mathbf{z}^{*})}{\phi_m} + c(x_m, \phi_m) \right\}, \qquad m = 1, \dots, p,$$
with $\boldsymbol{\alpha}_m = [\alpha_{m0}, \dots, \alpha_{mq}]$, $\mathbf{z}^{*} = [1, z_1, \dots, z_q]^T$, and $\boldsymbol{\theta}_m = (\boldsymbol{\alpha}_m^T, \phi_m)^T$. The functions $b(\boldsymbol{\alpha}_m \mathbf{z}^{*})$ and $c(x_m, \phi_m)$ take different forms depending on the distribution of the response variable $x_m$ (see McCullagh and Nelder 1989). Under the assumption of conditional independence, the joint marginal distribution of the manifest variables is
$$f(\mathbf{x}; \boldsymbol{\theta}) = \int \cdots \int \prod_{m=1}^{p} g_m(x_m \mid \mathbf{z}, \boldsymbol{\theta}_m)\, \varphi(\mathbf{z})\, d\mathbf{z}, \tag{1}$$
with $\mathbf{x} = [x_1, \dots, x_p]$, $\boldsymbol{\theta} = [\boldsymbol{\theta}_1^T, \dots, \boldsymbol{\theta}_p^T]^T$, and where $\mathbf{z}$ is multivariate standard normal, that is, $\varphi(\mathbf{z}) = \prod_{j=1}^{q} \varphi(z_j)$. The independence assumption for the latent variables can be relaxed. Moreover, Bartholomew (1988) showed that the choice of the latent variable distribution has a negligible effect on the interpretation of the results. He suggested using the normal distribution because it has rotational advantages when it comes to more than one latent variable.
For a sample of size $n$, the log-likelihood is then $L(\boldsymbol{\theta}) = \frac{1}{n} \sum_{i=1}^{n} \log f(\mathbf{x}_i; \boldsymbol{\theta})$, with partial derivatives
$$\frac{\partial L(\boldsymbol{\theta})}{\partial \boldsymbol{\alpha}_m^T} = \frac{1}{n} \sum_{i=1}^{n} s_m^{(1)}(\mathbf{x}_i; \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{f(\mathbf{x}_i; \boldsymbol{\theta})} \int \cdots \int g(\mathbf{x}_i \mid \mathbf{z}, \boldsymbol{\theta})\, \frac{x_{mi} - b'(\boldsymbol{\alpha}_m \mathbf{z}^{*})}{\phi_m}\, \mathbf{z}^{*}\, \varphi(\mathbf{z})\, d\mathbf{z}, \tag{2}$$
where $b'(x) = \frac{\partial}{\partial x} b(x)$. Note that $b'(\boldsymbol{\alpha}_m \mathbf{z}^{*}) = E[x_m \mid \mathbf{z}]$ and that $b''(\boldsymbol{\alpha}_m \mathbf{z}^{*}) \phi_m = \operatorname{var}[x_m \mid \mathbf{z}]$. The roots of (2) define the MLE $\widehat{\boldsymbol{\alpha}}_m$, $\forall m$. Differentiating the log-likelihood with respect to the scale parameter leads to
$$\frac{\partial L(\boldsymbol{\theta})}{\partial \phi_m} = \frac{1}{n} \sum_{i=1}^{n} s_m^{(2)}(\mathbf{x}_i; \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{f(\mathbf{x}_i; \boldsymbol{\theta})} \int \cdots \int g(\mathbf{x}_i \mid \mathbf{z}, \boldsymbol{\theta}) \left[ -\frac{x_{mi} \boldsymbol{\alpha}_m \mathbf{z}^{*} - b(\boldsymbol{\alpha}_m \mathbf{z}^{*})}{\phi_m^2} + \frac{\partial}{\partial \phi_m} c(\phi_m, x_{mi}) \right] \varphi(\mathbf{z})\, d\mathbf{z}. \tag{3}$$
The scale parameter $\phi$ for the case of binomial, multinomial, and Poisson distributed variables is 1. For the normal distribution, we have
$$\frac{\partial}{\partial \phi_m} c(\phi_m, x_{mi}) = .5 \left( \frac{x_{mi}^2}{\phi_m^2} - \frac{1}{\phi_m} \right).$$
To compute the MLE, one must solve (2) and (3). The inte-
grals in (2) and (3) can be approximated using fixed Gauss–
Hermite quadrature (see, e.g., Bock and Lieberman 1970),
adaptive quadrature points (see, e.g., Bock and Schilling 1997;
Schilling and Bock 2005), Monte Carlo approximations (see,
e.g., Sammel et al. 1997), or Laplace approximation (see, e.g.,
Huber, Ronchetti, and Victoria-Feser 2004). All of these ap-
proximations lead to approximate MLEs. The models that we
consider here are one-factor models, and even though it is known that the Gauss–Hermite rule can give biased estimators in some situations, we nevertheless use it to compute the integrals.
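As a concrete illustration of the quadrature step, the following sketch approximates the marginal density (1) for a hypothetical one-factor model with three binary items and logit (canonical) links, using fixed Gauss–Hermite nodes; the intercepts and loadings are made up for illustration and are not estimates from the paper.

```python
import numpy as np
from itertools import product

def marginal_density(x, alpha0, alpha1, n_nodes=30):
    """Approximate f(x; theta) = int prod_m g_m(x_m | z) phi(z) dz for binary
    items with logit links, via fixed Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    z = np.sqrt(2.0) * nodes          # change of variable: N(0,1) instead of exp(-x^2)
    w = weights / np.sqrt(np.pi)
    # response probability of item m given the latent factor z
    eta = alpha0[:, None] + alpha1[:, None] * z[None, :]   # shape (p, n_nodes)
    p_mz = 1.0 / (1.0 + np.exp(-eta))
    # conditional likelihood g_m(x_m | z) at every node, multiplied over items
    g = np.where(np.asarray(x)[:, None] == 1, p_mz, 1.0 - p_mz)
    return float(np.sum(w * np.prod(g, axis=0)))

alpha0 = np.array([0.2, -0.5, 1.0])   # illustrative intercepts
alpha1 = np.array([0.8, 1.2, 0.5])    # illustrative loadings

# sanity check: the 2^p response-pattern probabilities must sum to 1
total = sum(marginal_density(x, alpha0, alpha1)
            for x in product([0, 1], repeat=3))
```

Because each pattern probability is a quadrature sum against the same weights, summing over all patterns recovers the integral of the standard normal density, which the rule handles exactly.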
2.2 Robust M-Estimator
As we show in Section 3, the MLE is not robust to small model deviations (e.g., presence of outliers). Therefore, here we propose a robust estimator that belongs to the class of M-estimators (see Huber 1981), which has well-known properties. Given a relatively general function $\psi$ (see Huber 1981), an M-estimator is defined implicitly as the solution in $\boldsymbol{\theta}$ of
$$\sum_{i=1}^{n} \psi(\mathbf{x}_i; \boldsymbol{\theta}) = 0. \tag{4}$$
It is known that choosing a bounded $\psi$ or controlling the bound on $\psi$ defines a robust estimator. A simple choice for $\psi$ is given by a weighted score function leading to a WMLE, with smaller weights when the score function becomes too large, that is,
$$\frac{1}{n} \sum_{i=1}^{n} \psi_c(\mathbf{x}_i; \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=1}^{n} s(\mathbf{x}_i; \boldsymbol{\theta})\, w(\mathbf{x}_i, c) = 0. \tag{5}$$
The weight function $w$ can be defined through the Huber function with parameter $c$ given by
$$w(\mathbf{x}; c) = \min\left\{ 1; \frac{c}{\| s(\mathbf{x}; \boldsymbol{\theta}) \|} \right\}, \tag{6}$$
where $\| \cdot \|$ denotes the Euclidean norm. Such weights guarantee a bounded IF for the corresponding estimator. For the GLLVM, we have $s(\mathbf{x}_i; \boldsymbol{\theta}) = [s_m(\mathbf{x}; \boldsymbol{\theta})^T]^T_{m=1,\dots,p}$ with $s_m(\mathbf{x}; \boldsymbol{\theta}) = [s_m^{(1)}(\mathbf{x}; \boldsymbol{\theta})^T, s_m^{(2)}(\mathbf{x}; \boldsymbol{\theta})^T]^T$, and the score functions $s_m^{(1)}$
and $s_m^{(2)}$ are given in (2) and (3). We call the resulting estimator a globally weighted robust (GWR) estimator.
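The weighting scheme in (5) and (6) is simple to implement once the score vector of an observation is available; the sketch below uses a generic numeric score vector as a stand-in for the GLLVM score $s(\mathbf{x}_i; \boldsymbol{\theta})$.

```python
import numpy as np

def huber_weight(score, c):
    """w(x; c) = min(1, c / ||s(x; theta)||): observations whose score norm
    exceeds c are downweighted, which bounds the influence function."""
    norm = np.linalg.norm(score)
    return 1.0 if norm <= c else c / norm

# an ordinary observation keeps full weight ...
w_ok = huber_weight(np.array([0.3, -0.2]), c=2.0)
# ... while an extreme one is shrunk so that ||w * s|| = c exactly
s_out = np.array([30.0, -40.0])        # Euclidean norm 50
w_out = huber_weight(s_out, c=2.0)
```

The weighted score $w \cdot s$ is then what enters the estimating equation (5) in place of the raw score.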
When an M-estimator defined generally through a $\psi$-function given in (4) is not consistent, one can make the M-estimator Fisher-consistent by adding a proper quantity in its definition, that is, $\frac{1}{n} \sum_{i=1}^{n} \psi(\mathbf{x}_i; \boldsymbol{\theta}) - a(\boldsymbol{\theta}) = 0$ such that $a(\boldsymbol{\theta}) = \int \cdots \int \psi(\mathbf{x}; \boldsymbol{\theta}) f(\mathbf{x}; \boldsymbol{\theta})\, d\mathbf{x}$. For the GWR estimator, we have that
$$a(\boldsymbol{\theta}) = \int \cdots \int \int g(\mathbf{x} \mid \mathbf{z}, \boldsymbol{\theta}) \left[ \frac{x_m - b'(\boldsymbol{\alpha}_m \mathbf{z}^{*})}{\phi_m} \mathbf{z}^{*T},\; -\frac{x_m \boldsymbol{\alpha}_m \mathbf{z}^{*} - b(\boldsymbol{\alpha}_m \mathbf{z}^{*})}{\phi_m^2} + c'(\phi_m, x_m) \right]^T_{m=1,\dots,p} w(\mathbf{x}, c)\, d\mathbf{x}\, \varphi(\mathbf{z})\, d\mathbf{z}.$$
The double integration with respect to $\mathbf{z}$ and $\mathbf{x}$ makes $a(\boldsymbol{\theta})$ hard to compute, and so we adopt a different approach that makes the GWR consistent, namely indirect inference.
2.3 Robust Indirect Estimator
Indirect estimation (see Gouriéroux et al. 1993; Gallant and Tauchen 1996) has been proposed as an estimation procedure for a complex model $F_{\boldsymbol{\theta}}$ with intractable likelihood functions. It involves the computation of an estimator $\pi(F_n)$ (where $F_n$ is the empirical distribution) for the parameters of an auxiliary model $\tilde{F}_{\pi}$ that does not provide a consistent estimator of $\boldsymbol{\theta}$. In particular, let $\pi(F_n)$ be an M-estimator defined implicitly by
$$\frac{1}{n} \sum_{i=1}^{n} \psi(\mathbf{x}_i; \pi(F_n)) = 0 \tag{7}$$
for a sample $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ supposedly generated from $F_{\boldsymbol{\theta}}$. Here $\pi(F_n)$ is a Fisher-consistent estimator of $\pi$ in that the $\psi$-function satisfies $\int \psi(\mathbf{x}; \pi)\, d\tilde{F}_{\pi}(\mathbf{x}) = 0$. When $\pi \neq \boldsymbol{\theta}$, we suppose that a locally injective binding function $h$ exists such that $\boldsymbol{\theta} = h^{-1}(\pi)$ and such that a consistent estimator of $\boldsymbol{\theta}$ is given by $\widehat{\boldsymbol{\theta}}(F_n) = h^{-1}(\pi(F_n))$ (see also Genton and de Luna 2000), with $\pi(F_n)$ given by (7). The latter is obtained implicitly by the solution in $\boldsymbol{\theta}$ of
$$\int \psi(\mathbf{x}; \pi(F_n))\, dF_{\boldsymbol{\theta}}(\mathbf{x}) = 0. \tag{8}$$
When $F_n \to F_{\boldsymbol{\theta}}$, we have
$$\int \psi(\mathbf{x}; h(\boldsymbol{\theta}))\, dF_{\boldsymbol{\theta}}(\mathbf{x}) = 0. \tag{9}$$
In other words, to make the estimator of $\boldsymbol{\theta}$ Fisher-consistent, indirect estimation implicitly allows us to transform $\psi(\mathbf{x}; \boldsymbol{\theta})$ into $\tilde{\psi}(\mathbf{x}; \boldsymbol{\theta}) = \psi(\mathbf{x}; h(\boldsymbol{\theta}))$.
The indirect estimator given in (8) results as a particular case of the general minimization problem defining indirect estimators, that is,
$$\widehat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \left[ \pi(F_n) - h(\boldsymbol{\theta}) \right]^T \Omega \left[ \pi(F_n) - h(\boldsymbol{\theta}) \right], \tag{10}$$
in which $\Omega = I$. First-order conditions imply that
$$\frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T \left[ \pi(F_n) - h(\boldsymbol{\theta}) \right] = 0, \tag{11}$$
and hence the solution is $h^{-1}(\pi(F_n))$, if $\frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T$ is invertible, and it is obtained using (8).
For the model considered in this article, $\psi$ is given in (5) and (6), $\pi(F_n)$ is the GWR estimator [i.e., the solution of (5) with $\boldsymbol{\theta}$ replaced by $\pi$], and $F_{\boldsymbol{\theta}}$ is the GLLVM with density given in (1) and with unknown $\boldsymbol{\theta}$. Note that if the weights in (6) are all equal to 1 [i.e., $c := \infty$ in (6)], then the solution of (8) is $\widehat{\boldsymbol{\theta}}(F_n) = \pi(F_n)$. Indeed, in that case $\psi(\mathbf{x}; \pi) = s(\mathbf{x}; \pi)$ and $\pi(F_n)$ is the MLE, which is unbiased, and therefore no correction is required.
One can estimate the integral in (8) by simulating $n^{*}$ observations $\mathbf{x}_i^{*}(\boldsymbol{\theta})$ from $F_{\boldsymbol{\theta}}$ for a given $\boldsymbol{\theta}$. To solve (8) with respect to $\boldsymbol{\theta}$, we can use a Newton step given by
$$\widehat{\boldsymbol{\theta}}^{(k+1)} = \widehat{\boldsymbol{\theta}}^{(k)} - S^{-1}\left( \widehat{\pi}, \widehat{\boldsymbol{\theta}}^{(k)} \right) \sum_{i=1}^{n^{*}} \psi\left( \mathbf{x}_i^{*}\left( \widehat{\boldsymbol{\theta}}^{(k)} \right); \widehat{\pi} \right), \tag{12}$$
with $\widehat{\boldsymbol{\theta}} := \widehat{\boldsymbol{\theta}}(F_n)$ and $\widehat{\pi} := \pi(F_n)$, and where
$$S(\widehat{\pi}, \widehat{\boldsymbol{\theta}}) = \sum_{i=1}^{n^{*}} \psi\left( \mathbf{x}_i^{*}(\widehat{\boldsymbol{\theta}}); \widehat{\pi} \right) s^T\left( \mathbf{x}_i^{*}(\widehat{\boldsymbol{\theta}}); \widehat{\boldsymbol{\theta}} \right) \tag{13}$$
is the sample approximation of the derivative of (8) with respect to $\boldsymbol{\theta}$. Note that we can take $\widehat{\boldsymbol{\theta}}^{(1)} = \widehat{\pi}$, and that we should set the seed parameter to a fixed value for all values of $\boldsymbol{\theta}$ to ensure successful optimization. For efficiency reasons, $n^{*}$ should be chosen as large as possible (see, e.g., Genton and Ronchetti 2003) and can be set equal to $n^{*} = n \cdot l$, where $l$ is chosen a priori.
Based on the foregoing, for the GLLVM, we propose as a robust estimator the converged $\widehat{\boldsymbol{\theta}}$ given in (12) [which is the solution of (8)], with $\widehat{\pi}$ and $\psi$-function defined in (7) and (5). We call that estimator an indirect globally weighted robust (IGWR) estimator. In the Appendix we present an iterative procedure to compute it.
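To make the indirect-correction mechanics concrete, here is a sketch on a deliberately simple model (an exponential distribution parameterized by its mean) rather than the GLLVM: a Huber-weighted score gives a biased auxiliary estimator, and solving the sample version of (8) on data simulated with common random numbers corrects it. The $\psi$-function, constants, and bracketing intervals are all illustrative.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)

def psi(x, theta, c=1.0):
    """Huber-weighted score for the mean of an exponential distribution:
    s(x; theta) = (x - theta) / theta**2, capped at +/- c."""
    s = (x - theta) / theta**2
    return np.clip(s, -c, c)

# observed data from Exp(mean = 1)
x = rng.exponential(scale=1.0, size=20000)

# step 1: auxiliary (GWR-type) estimator pi_hat, root of (1/n) sum psi = 0.
# It is biased downward here: capping hits the long right tail asymmetrically.
pi_hat = brentq(lambda p: psi(x, p).mean(), 0.2, 5.0)

# step 2: indirect correction, as in (8): find theta such that the same psi,
# evaluated at pi_hat, has mean zero under data simulated from F_theta.
# A fixed base sample (common random numbers) keeps the objective smooth.
base = np.random.default_rng(1).exponential(scale=1.0, size=200000)
theta_hat = brentq(lambda t: psi(t * base, pi_hat).mean(), pi_hat, 5.0)
```

With clean data the corrected `theta_hat` lands near the true mean, while the uncorrected `pi_hat` does not; the same two-step logic is what (12) iterates for the GLLVM.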
We note that an alternative asymptotically equivalent estimator to (10) is defined as
$$\widehat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \left[ \sum_{i=1}^{n^{*}} \psi(\mathbf{x}_i^{*}(\boldsymbol{\theta}); \widehat{\pi}) \right]^T \Omega \left[ \sum_{i=1}^{n^{*}} \psi(\mathbf{x}_i^{*}(\boldsymbol{\theta}); \widehat{\pi}) \right]$$
(see Gouriéroux et al. 1993; Gallant and Tauchen 1996). When $\Omega = I$, this quadratic form admits a unique solution at $\widehat{\boldsymbol{\theta}}$ given implicitly as the solution in $\boldsymbol{\theta}$ of $\sum_{i=1}^{n^{*}} \psi(\mathbf{x}_i^{*}(\boldsymbol{\theta}); \widehat{\pi}) = 0$. The Newton iterative procedure described in (12) produces this solution. We also note that Cabrera and Fernholz (1999) called the indirect estimator given in (8) a target estimator and studied it for M-estimators of location and scale.
Finally, it should be stressed that the IGWR estimator is general because it can be used in principle for any parametric model $F_{\boldsymbol{\theta}}$ and can be extended to other types of $\psi$-functions.
3. STATISTICAL PROPERTIES OF THE ESTIMATORS
In this section we investigate the robustness properties of the
MLE and the IGWR estimator for the GLLVM by means of the
IF. We then look into the efficiency properties of the IGWR to
develop guidelines for choosing the constant c in (6).

3.1 Robustness Properties
The robustness properties of the MLE and IGWR estimator are investigated using the IF. For a multidimensional functional $\widehat{\boldsymbol{\theta}}$ at a model $F_{\boldsymbol{\theta}}$, the IF is defined by
$$IF(\mathbf{y}, \widehat{\boldsymbol{\theta}}, F_{\boldsymbol{\theta}}) = \lim_{\varepsilon \to 0} \frac{\widehat{\boldsymbol{\theta}}(F_{\varepsilon}) - \widehat{\boldsymbol{\theta}}(F_{\boldsymbol{\theta}})}{\varepsilon},$$
where $F_{\varepsilon} = (1 - \varepsilon) F_{\boldsymbol{\theta}} + \varepsilon \Delta_{\mathbf{y}}$, with $\Delta_{\mathbf{y}}$ the probability distribution with point mass of 1 at an arbitrary location $\mathbf{y}$. When it exists, we can use $IF(\mathbf{y}, \widehat{\boldsymbol{\theta}}, F_{\boldsymbol{\theta}}) = \frac{\partial}{\partial \varepsilon} \widehat{\boldsymbol{\theta}}(F_{\varepsilon}) \big|_{\varepsilon=0}$. For MLEs $\widehat{\boldsymbol{\theta}}_{ML}$, the IF is proportional to the score function (see Hampel et al. 1986). For the GLLVM, the score function is given in (2) and (3) and is affected by the point of contamination $\mathbf{y}$ through the quantities $f(\mathbf{y}; \boldsymbol{\theta})$, $g(\mathbf{y} \mid \mathbf{z}, \boldsymbol{\theta}) = \prod_{m=1}^{p} g_m(y_m \mid \mathbf{z}, \boldsymbol{\theta}_m)$, and $y_m$. Therefore, an extreme value in the $m$th manifest variable has an influence not only on the MLE of $\boldsymbol{\alpha}_m$ and $\phi_m$, corresponding to the contaminated manifest variable, but also on the other estimates of the model. Actually, in principle, the MLE of all model parameters can be influenced by extreme data. What is not clear is the size of the IF for different types of variables. Indeed, the quantity $(y_m - b'(\boldsymbol{\alpha}_m \mathbf{z}^{*})) / \phi_m$ can be very large if $y_m$ is far away from its expectation, but at the same time its conditional density $g_m(y_m \mid \mathbf{z}, \boldsymbol{\theta}_m)$ becomes very small, and the behavior of $g(\mathbf{y} \mid \mathbf{z}, \boldsymbol{\theta}) / f(\mathbf{y}; \boldsymbol{\theta})$ is not straightforward to study. Moustaki and Victoria-Feser (2004) studied the behavior of the IF by means of a numerical example.
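A finite-sample analogue of the IF, the sensitivity curve, makes the boundedness contrast easy to see on a toy location problem (not the GLLVM): the sample mean reacts linearly to an outlier $y$, whereas a Huber M-estimator caps the outlier's contribution. The constants below are illustrative.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(42)
x = rng.normal(size=99)            # n - 1 clean observations

def mean_est(sample):
    return float(np.mean(sample))

def huber_est(sample, c=1.345):
    """Huber M-estimator of location: root of sum(clip(x_i - t, -c, c)) = 0."""
    return brentq(lambda t: np.clip(sample - t, -c, c).sum(), -100.0, 100.0)

def sensitivity_curve(est, y):
    """SC(y) = n * (T(x_1,...,x_{n-1}, y) - T(x_1,...,x_{n-1})):
    an empirical version of the IF at contamination point y."""
    n = len(x) + 1
    return n * (est(np.append(x, y)) - est(x))

sc_mean_far = sensitivity_curve(mean_est, 1000.0)   # grows with y, unbounded
sc_hub_near = sensitivity_curve(huber_est, 100.0)   # capped
sc_hub_far  = sensitivity_curve(huber_est, 1000.0)  # same cap: y no longer matters
```

Once the outlier is past the clipping point, moving it further has no effect at all on the Huber estimate, which is exactly the bounded-IF behavior the GWR weights (6) are designed to produce.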
The following proposition provides the IF of the IGWR estimator $\widehat{\boldsymbol{\theta}}_{IGWR}$.
Proposition 1. Let $\widehat{\boldsymbol{\theta}}_{IGWR}(F_n) := \widehat{\boldsymbol{\theta}}_{IGWR}$ be the IGWR, defined implicitly as the solution of (11) based on the GWR $\pi(F_n)$, a consistent estimator of $\pi$ of the auxiliary model $\tilde{F}_{\pi}$, and the $\psi$-function defined in (5) and (6). Suppose that the binding function $h$ exists and that it is locally injective and such that $\widehat{\boldsymbol{\theta}}(F_n) = h^{-1}(\pi(F_n))$ is a consistent estimator of $\boldsymbol{\theta}$. The IF of $\widehat{\boldsymbol{\theta}}_{IGWR}$ is then
$$IF(\mathbf{y}, \widehat{\boldsymbol{\theta}}_{IGWR}, F_{\boldsymbol{\theta}}) = S^{-1}(\pi, \boldsymbol{\theta})\, \psi(\mathbf{y}; \pi), \tag{14}$$
with $S(\pi, \boldsymbol{\theta}) = \int \psi(\mathbf{x}; \pi) s^T(\mathbf{x}; \boldsymbol{\theta})\, dF_{\boldsymbol{\theta}}(\mathbf{x})$.
Proof. Writing $\widehat{\boldsymbol{\theta}}_{IGWR}$ as a functional $\widehat{\boldsymbol{\theta}}_{IGWR}(F_{\varepsilon})$ of $F_{\varepsilon}$, (11) becomes
$$\frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T \Big|_{\boldsymbol{\theta} = \widehat{\boldsymbol{\theta}}_{IGWR}(F_{\varepsilon})} \left[ \pi(F_{\varepsilon}) - h\left( \widehat{\boldsymbol{\theta}}_{IGWR}(F_{\varepsilon}) \right) \right] = 0.$$
Taking derivatives with respect to $\varepsilon$ at $\varepsilon = 0$, we get (see also Genton and de Luna 2000)
$$IF(\mathbf{y}, \widehat{\boldsymbol{\theta}}_{IGWR}, F_{\boldsymbol{\theta}}) = \left[ \frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T \frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta}) \right]^{-1} \frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T\, IF(\mathbf{y}, \pi, F_{\boldsymbol{\theta}}) = -B \left[ \frac{\partial}{\partial \pi} \int \psi(\mathbf{x}; \pi)\, dF_{\boldsymbol{\theta}}(\mathbf{x}) \right]^{-1} \psi(\mathbf{y}; \pi), \tag{15}$$
with $B = [\frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T \frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})]^{-1} \frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})^T$. Moreover, we can deduce $\frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta})$ from (9) by taking derivatives with respect to $\boldsymbol{\theta}$ [i.e., $\frac{\partial}{\partial \pi} \int \psi(\mathbf{x}; \pi)\, dF_{\boldsymbol{\theta}}(\mathbf{x}) \frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta}) + \int \psi(\mathbf{x}; \pi) s^T(\mathbf{x}; \boldsymbol{\theta})\, dF_{\boldsymbol{\theta}}(\mathbf{x}) = 0$], so that $\frac{\partial}{\partial \boldsymbol{\theta}} h(\boldsymbol{\theta}) = -M^{-1}(\pi, \boldsymbol{\theta}) S(\pi, \boldsymbol{\theta})$ with $M(\pi, \boldsymbol{\theta}) = \frac{\partial}{\partial \pi} \int \psi(\mathbf{x}; \pi)\, dF_{\boldsymbol{\theta}}(\mathbf{x})$. Then
$$B = -\left[ S(\pi, \boldsymbol{\theta})^T M^{-T}(\pi, \boldsymbol{\theta}) M^{-1}(\pi, \boldsymbol{\theta}) S(\pi, \boldsymbol{\theta}) \right]^{-1} S(\pi, \boldsymbol{\theta})^T M^{-T}(\pi, \boldsymbol{\theta}), \tag{16}$$
and by replacing in (15), we get (14).
Because $\psi$ given in (5) is bounded, the IF of $\widehat{\boldsymbol{\theta}}_{IGWR}$ is also bounded. Although in our context the $\psi$ defines the GWR, the result of the proposition applies to other M-estimators as well.
The IF measures directional effects of model deviations on the estimator. A more global measure is the self-standardized sensitivity (Hampel et al. 1986), which is taken as the supremum in $\mathbf{y}$ of a function of the IF. It is a measure of the asymptotic bias of the estimator $\widehat{\boldsymbol{\theta}}$ due to small model deviations (see Hampel et al. 1986, p. 175). Genton and Ronchetti (2003, prop. 1) showed that the indirect estimator has a self-standardized sensitivity smaller than or equal to the self-standardized sensitivity of the estimator of the auxiliary model. Because the IF of the latter is based on the $\psi$-function (5), which is bounded, the self-standardized sensitivity of the IGWR is then also bounded, and so is the asymptotic bias of the IGWR estimator under small model deviations. That is not the case with the MLE, however. To make this point even more strongly, we perform simulation studies in Section 4 and compare the performance in terms of bias of the MLE to that of the IGWR estimator under small model deviations or data contamination.
3.2 Efficiency
To compute the IGWR estimator, we must choose the bound $c$ in the weight function (6). Obviously, the smaller its value, the more robust (but also the less efficient) the estimator. A strategy commonly used for choosing an appropriate value for $c$ is to fix a degree of efficiency loss for the robust estimator compared to the MLE and choose $c$ accordingly.
From the work of Genton and Ronchetti (2003), we can deduce the asymptotic covariance matrix of the IGWR estimator $\widehat{\boldsymbol{\theta}}$ and obtain $V_{\widehat{\boldsymbol{\theta}}} = B V_{\pi} B^T + \frac{1}{l} B V_{\pi(F_{\boldsymbol{\theta}})} B^T$. Note that for an M-estimator as in (7), we have $V_{\pi} = M^{-1}(\pi, \boldsymbol{\theta}) Q(\pi, \boldsymbol{\theta}) M^{-T}(\pi, \boldsymbol{\theta})$, with $Q(\pi, \boldsymbol{\theta}) = \int \psi(\mathbf{x}; \pi) \psi^T(\mathbf{x}; \pi)\, dF_{\boldsymbol{\theta}}(\mathbf{x})$. When $l$ is sufficiently large and using (16), we get $V_{\widehat{\boldsymbol{\theta}}} = S^{-1}(\pi, \boldsymbol{\theta}) Q(\pi, \boldsymbol{\theta}) S^{-T}(\pi, \boldsymbol{\theta})$, which can be estimated by
$$\widehat{V}_{\widehat{\boldsymbol{\theta}}} = S^{-1}(\widehat{\pi}, \widehat{\boldsymbol{\theta}})\, \widehat{Q}(\widehat{\pi}, \widehat{\boldsymbol{\theta}})\, S^{-T}(\widehat{\pi}, \widehat{\boldsymbol{\theta}}), \tag{17}$$
where $\widehat{Q}(\widehat{\pi}, \widehat{\boldsymbol{\theta}})$ is computed as in (13).
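Given the simulated $\psi$- and score-evaluations, (17) is a plain sandwich form; the sketch below uses random stand-in arrays with the right shapes rather than actual GLLVM quantities.

```python
import numpy as np

rng = np.random.default_rng(7)
n_star, k = 5000, 4                    # simulated sample size, dim(theta)
# stand-ins: psi(x_i*(theta_hat); pi_hat) and s(x_i*(theta_hat); theta_hat),
# one row per simulated observation
Psi = rng.normal(size=(n_star, k))
Scr = Psi + 0.1 * rng.normal(size=(n_star, k))

S = Psi.T @ Scr                        # sample version of S, as in (13)
Q = Psi.T @ Psi                        # sample version of Q
S_inv = np.linalg.inv(S)
V_hat = S_inv @ Q @ S_inv.T            # the sandwich estimate (17)
```

Because $Q$ is built as a cross-product it is positive definite, so the resulting covariance estimate is symmetric and positive definite as a covariance matrix must be.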
For a fixed value of $\boldsymbol{\theta}$, we can use (17) to compute the efficiency of the IGWR estimator (vs. the MLE) as a function of the bounding constant $c$. Taking the same model and parameter values as in the simulation study (see Sec. 4), we simulated an (uncontaminated) sample of 1,000 observations and calculated the relationship between the efficiency of the IGWR estimator and the bounding constant $c$. This relationship is illustrated in Figure 1. In particular, for an efficiency ratio of 95%, we can use a bounding constant of approximately $c = 3.5$, whereas a bounding constant of $c = 2$ leads to an efficiency ratio of approximately 82%. It should be noted that in principle, efficiency depends on the parameter values. A strategy that is often

References
Huber, P. J. (1964), "Robust Estimation of a Location Parameter," The Annals of Mathematical Statistics, 35, 73–101.
McCullagh, P., and Nelder, J. A. (1989), Generalized Linear Models (2nd ed.), London: Chapman & Hall.