What is the main point of Stromberg and Ruppert's discussion of this model?

The main point of Stromberg and Ruppert (1992) to discuss this model is that if outliers are such that the estimator for K diverges while that for remains constant, the estimator is broken in the Donoho-Huber sense.

What is the constraint for a simulated data set?

For a simulated data set, the authors contaminate the 3 observations most to the right by moving them in parallel to the ray X. Using (16), the authors are looking for a (or ) such that the squared vertical discrepancies between the observations and the pictured line segments are minimal.

What is the way to get a constant boundary set?

If the authors take the badness function to be the bias, the only way to get a constant boundary set is to let the estimator diverge to plus or minus infinity.

What is the breakdown-point of the LMS estimator?

Using their definition of breakdown, it is clear that the breakdown-point of the (highly robust) LMS estimator in a time-series context is far below 0.5, and even far below 0.5/(p+1), with p the order of the autoregression.

What is the breakdown point of the estimator?

Definition 1 The breakdown-point of the estimator ̂ of is given by" lim !0 minm 1nlim !1 Rn( Y n ;Z n;m) \\lim !1 Rn( Y n ;Z n;m+1) 6= ; 8 Y n : The definition looks for the smallest fraction of extreme outliers for which the boundary of the set of possible badness values does not expand any more in all directions if an additional outlier is added to the sample.

What is the boundary badness for extreme outliers?

So with m outliers, the boundary badness set for extreme outliers and given X 2 [0; 3] is given by either f X; ̂mXg or f Xg, where ̂m can still vary for increasing m.

What is the breakdown point for a variogram?

Now consider a highly robust variogram estimator $\hat{\gamma}_{HR}(h; Y_n) = S^2(Y_{i+h} - Y_i)$ (e.g. Genton, 1998a), where $S^2$ is a highly robust estimator of the variance of the process $Y_{i+h} - Y_i$. Typically, $S^2$ has breakdown-point $\lfloor (n-h)/2 - 1 \rfloor / (n-h)$, where $\lfloor \cdot \rfloor$ denotes the integer part.
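As a small numerical illustration, the sketch below evaluates the breakdown-point expression quoted above, read here as floor((n - h)/2 - 1) / (n - h), for a few lags. The function name `variogram_breakdown_point` and the sample sizes are hypothetical choices for illustration, and the reading of the formula is an assumption about the extracted text.

```python
import math

def variogram_breakdown_point(n, h):
    """Evaluate floor((n - h)/2 - 1) / (n - h) for sample size n and lag h.

    Assumption: this is the intended reading of the expression in the text,
    where n - h is the number of differences Y_{i+h} - Y_i entering S^2.
    """
    m = n - h  # number of available differences at lag h
    if m < 2:
        raise ValueError("need at least two differences (n - h >= 2)")
    return math.floor(m / 2 - 1) / m

# Illustrative values only: the fraction stays a little below 0.5
# and is computed from fewer differences as the lag h grows.
for lag in (1, 10, 50):
    print(lag, variogram_breakdown_point(100, lag))
```

Under this reading, the fraction refers to contaminated differences at lag h, not to contaminated original observations.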

What is the breakdown-point of the estimator?

The breakdown-point "(̂; Y ; Z ) of the estimator ̂ at the (uncontaminated) process Y for the set of allowable outlier configurations Z , is given by "(̂; Y ; Z ) = inf9 > 0 : lim !1 R( Y ;Z ) \\lim !1 R( Y ;Z + )

(Open Access) Comprehensive definitions of breakdown points for independent and dependent observations (2003) | Marc G. Genton

Q: What have the authors contributed in "Comprehensive definitions of breakdown-points for independent and dependent observations" ?

The authors provide a new definition of breakdown in finite samples with an extension to asymptotic breakdown.

TI 2000-40/2

Tinbergen Institute Discussion Paper

Comprehensive Definitions of Breakdown-Points for Independent and Dependent Observations

Marc G. Genton

André Lucas

Tinbergen Institute

The Tinbergen Institute is the institute for economic research of the Erasmus Universiteit Rotterdam, Universiteit van Amsterdam and Vrije Universiteit Amsterdam.


Comprehensive Denitions of

Breakdown-Points for

Indep endent and Dep endent Observations

Marc G. Genton and Andre Lucas



May 3, 2000

Abstract

We provide a new denition of breakdown in nite samples with

an extension to asymptotic breakdown. Previous denitions center

around dening a critical region for either the parameter or the ob-

jective function. If for a particular outlier constellation the critical

region is entered, breakdown is said to o ccur. In contrast to the tradi-

tional approach, we leave the denition of the critical region implicit.

Our denition encompasses all previous denitions of breakdown in

b oth linear and non-linear regression settings. In some cases, it leads

to a dierent notion of breakdown than other pro cedures available.

An advantage is that our new denition also applies to mo dels for

dep endent observations (time-series, spatial statistics) where current

breakdown denitions typically fail. We illustrate our p oints using

examples from linear and non-linear regression as well as time-series

and spatial statistics.

Key words: Bias curve; Linear regression; Non-linear regression; Outliers; Spatial statistics; Statistical robustness; Time series.



Marc G. Genton is Lecturer, Department of Mathematics, 2-390, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139-4307, genton@math.mit.edu. André Lucas is Associate Professor, Department of Finance, ECO/FIN, Vrije Universiteit, De Boelelaan 1105, 1081HV Amsterdam, the Netherlands, alucas@econ.vu.nl. André Lucas thanks the Dutch Organization for Scientific Research (N.W.O.) for financial support.

1 Introduction

The issue of qualitative robustness and especially the definition of breakdown has made considerable progress over the last three decades. Hampel (1971) defined breakdown as the fraction of contamination (or outliers) that suffices to drive the estimator beyond all bounds. Since the original introduction of the concepts of breakdown and the breakdown-point by Hampel (1971), the breakdown-point has been extended to finite samples (Donoho and Huber, 1983), bounded parameter spaces, dependent observations (Martin and De Jong, 1977; Martin, 1980), test statistics (He et al., 1990; He, 1991), and non-linear regression models (Stromberg and Ruppert, 1992; Sakata and White, 1995, 1998). Especially Stromberg and Ruppert (1992) and Sakata and White (1995) convincingly argue that the bias in the parameter estimates is not always a good criterion to assess breakdown of an estimator. Instead, Stromberg and Ruppert propose to consider the fraction of contamination that drives at least one of the fitted values to its supremum or infimum. Sakata and White argue that the fitted value may sometimes not be a satisfactory criterion either, and therefore propose several alternative criterion functions to assess breakdown.
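The classical bias-based notion described above (the fraction of outliers that suffices to drive the estimator beyond all bounds) can be sketched numerically for the sample mean and the sample median. This is only an illustration under stated assumptions: the helper `smallest_breaking_fraction`, the stand-in outlier value `big`, and the threshold `bound` are hypothetical, and replacing observations by one large value approximates "beyond all bounds" without searching over all outlier configurations.

```python
import random
import statistics

def smallest_breaking_fraction(estimator, sample, big=1e12, bound=1e6):
    """Smallest fraction m/n of replaced points that moves the estimate by more than `bound`.

    Assumption: a single huge value `big` stands in for an outlier sent to infinity.
    """
    n = len(sample)
    clean_value = estimator(sample)
    for m in range(1, n + 1):
        # Replace the first m observations by the extreme value.
        contaminated = [big] * m + list(sample[m:])
        if abs(estimator(contaminated) - clean_value) > bound:
            return m / n
    return None  # not reached for the two estimators used below

random.seed(0)
y = [random.gauss(0.0, 1.0) for _ in range(21)]

mean_fraction = smallest_breaking_fraction(statistics.mean, y)
median_fraction = smallest_breaking_fraction(statistics.median, y)
print(mean_fraction, median_fraction)
```

With n = 21 the mean is carried away by a single replaced observation, while the median only follows the outliers once 11 of the 21 observations are replaced.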

Though these alternative denitions cover a wide range of mo dels and

estimators, one can easily construct examples that are not covered by the

available denitions. A very simple example is given by the autoregressive

time-series mo del of order 1,

Y



;

(1)

with



(



;

1) and

an i.i.d. innovation. Supp ose

is observed with error

, where



when

for a single

;:::;n



, and

=0 otherwise. Then the OLS estimator of



based on the contaminated

sample

;:::;

, is given by









(





Y



(2)

Clearly, as





0. So the OLS estimator in this simple time-series

mo del breaks with one outlier to zero, which is at the center of the parameter

space. This form of breakdown typically rules out the classical denition of

Hamp el, b ecause the estimator does not diverge. Moreover, it also violates

the straightforward extension of Hamp el's denition to compact parameter

spaces. In that denition, breakdown o ccurs if the estimator is pushed to the

edge of the parameter space. Here, however, the estimator do es not go to the

edge, but rather to the center of the parameter space. Also note that this

simple example do es not t the more recent denitions of breakdown either.

In particular, following the denition of He and Simpson (1992, 1993), break-

down o ccurs if the supremum bias is reached. This, however, need not b e the

case if



is negative or p ositive, in which case the sup bias is reached upon

breakdown to plus one or minus one instead of zero, respectively. Alterna-

tively, Stromb erg and Rupp ert and also Sakata and White dene breakdown

as the p oint where the mo del's t (

Y



) or some other criterion function

tends to either its supremum or its inmum for some observation in the sam-

ple. Clearly, this would again induce breakdown to either plus or minus one

given the restricted parameter space, and

not

breakdown to zero.
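The breakdown to zero in the AR(1) example can be checked with a short simulation. The sketch below is only an illustration under stated assumptions: the autoregressive parameter 0.8, the sample size, the outlier position, and the helper names `simulate_ar1`, `contaminate` and `ols_ar1` are hypothetical choices, and a single additive outlier of growing size stands in for the limit of an infinitely large outlier.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate an AR(1) path with standard normal innovations."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def contaminate(y, s, zeta):
    """Add a single additive outlier of size zeta at position s."""
    out = list(y)
    out[s] += zeta
    return out

def ols_ar1(y):
    """OLS estimate of the AR(1) coefficient: sum(y_t * y_{t-1}) / sum(y_{t-1}^2)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

y = simulate_ar1(phi=0.8, n=200)
clean_estimate = ols_ar1(y)

# One outlier in the middle of the sample, with increasing magnitude.
estimates = [(zeta, ols_ar1(contaminate(y, s=100, zeta=zeta)))
             for zeta in (1e1, 1e3, 1e6)]

print("clean:", clean_estimate)
for zeta, estimate in estimates:
    print(zeta, estimate)
```

The outlier enters the numerator linearly but the denominator quadratically, so the estimate shrinks toward zero as the outlier grows, whatever the sign of the true coefficient.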

Given the drawbacks of the previous definitions available, we introduce a new concept of breakdown. All previous definitions make explicit use of a criterion function combined with a critical region. For example, Hampel's original definition uses the absolute bias as the criterion function and infinity as the critical region. If the criterion function enters the critical region for a certain fraction of outliers/contamination, breakdown is said to have occurred. Following Sakata and White (1995), we consider a specific model badness measure as our criterion function. This encompasses the definitions of Hampel (badness is bias) as well as Stromberg and Ruppert (badness is model fit). In contrast to previous work, however, we leave the definition of the critical region implicit. In particular, we look for the fraction of contamination such that the set of possible badness values under extreme outlier configurations does not expand any more if additional outliers are added. In this way, we are able to accommodate most of the earlier definitions of breakdown. In addition, we also cover situations of breakdown that are not covered by the earlier definitions. We illustrate the main issues with examples from linear and non-linear regression as well as time-series and spatial statistics.

In some cases, our denition of breakdown gives a dierent breakdown

p oint than available denitions. We provide a typical example in the non-

linear regression context, confronting our breakdown p oint with that of Stromb erg

and Rupp ert. The new notion of breakdown checks whether the non-contaminated

sample information still has some inuence on the estimator. If this is no

longer the case, the estimator is said to have broken down. This may happ en

even in case the mo del's t over a pre-specied domain of interest remains

b ounded.

The remainder of the paper is set up as follows. In Section 2 we introduce the basic notation and our new definition of breakdown for finite samples. The definition is related to alternative ones in Section 3. Some illustrative examples are given in Section 4. Section 5 extends the definition of the breakdown-point to the asymptotic case and provides some illustrations. Section 6 concludes.


References

Robust Regression and Outlier Detection

Robust statistics: the approach based on influence functions

A General Qualitative Definition of Robustness

High breakdown point conditional dispersion estimation with application to s&p 500 daily returns volatility

Highly Robust Estimation of the Autocovariance Function
