Semi-Parametric Probability-Weighted Moments Estimation Revisited

doi:10.1007/S11009-012-9295-6


Frederico Caeiro

Universidade Nova de Lisboa, FCT and CMA

M. Ivette Gomes

Universidade de Lisboa, DEIO, CEAUL and FCUL

and

Björn Vandewalle

Universidade Nova de Lisboa, ISEGI, and CEAUL

January 19, 2012

Abstract

In this paper, for heavy-tailed models and through the use of probability weighted moments based on the largest observations, we deal essentially with the semi-parametric estimation of the Value-at-Risk at a level p, the size of the loss that occurs with a small probability p, as well as with the dual problem of estimating the probability of exceedance of a high level x. These estimation procedures depend crucially on the estimation of the extreme value index, the primary parameter in Statistics of Extremes, which is also carried out on the basis of the same weighted moments. Under regular variation conditions on the right tail of the underlying distribution function F, we prove the consistency and asymptotic normality of the estimators under consideration, through the usual link between their asymptotic behaviour and that of the extreme value index estimator they are based on. The finite-sample performance of these estimators is illustrated through Monte Carlo simulations. An adaptive choice of thresholds is put forward. Applications to a real data set in the field of insurance as well as to simulated data are also provided.

AMS 2000 subject classification. Primary 62G32, 62E20; Secondary 65C05.

Keywords and phrases. Heavy tails, value-at-risk or high quantiles, probability of exceedance of a high level, semi-parametric estimation.

∗ Research partially supported by National Funds through FCT (Fundação para a Ciência e a Tecnologia), projects PEst-OE/MAT/UI0006/2011 and PEst-OE/MAT/UI0297/2011, and PTDC/FEDER.


1 Introduction, preliminaries and scope of the article

Let $X_1, X_2, \ldots, X_n$ be a set of $n$ independent and identically distributed (i.i.d.), or even possibly weakly dependent and stationary, random variables (r.v.'s), from a population with distribution function (d.f.) $F$. Let us arrange them in ascending order, to get the order statistics (o.s.'s) $X_{1:n} \le \cdots \le X_{n:n}$. Suppose that we are interested in the estimation of a high quantile of probability $1-p$ or, equivalently, in the estimation of the Value-at-Risk (VaR) at a level $p$, the size of the loss that occurs with a small probability $p$, given by

$$\mathrm{VaR}_p \equiv \chi_{1-p} := F^{\leftarrow}(1-p) = \inf\{x : F(x) \ge 1-p\}, \tag{1.1}$$

with the notation $F^{\leftarrow}$ standing for the generalized inverse function of $F$. Moreover, we are also interested in the estimation of the probability of exceedance of a high level $x = x_n$,

$$p = p_x := 1 - F(x) =: \overline{F}(x). \tag{1.2}$$

Extreme Value Theory (EVT) provides a great variety of results that enable us to deal with alternative approaches in the statistical analysis of extreme events. Those approaches are essentially based on the well-established limiting results described in the following.
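As a point of reference for definitions (1.1) and (1.2), the following minimal sketch evaluates their purely empirical counterparts, with the empirical d.f. in place of $F$. The function names are hypothetical and not taken from the paper. The sketch illustrates why extrapolation beyond the sample is needed: for $p < 1/n$ the empirical quantile can never exceed the sample maximum, and the empirical exceedance probability of any level above the maximum is zero.

```python
def empirical_quantile(sample, p):
    """Generalized inverse of the empirical d.f. at 1 - p,
    i.e. inf{x : F_n(x) >= 1 - p}, mirroring definition (1.1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    xs = sorted(sample)
    n = len(xs)
    # F_n(X_{j:n}) >= j/n, so the answer is the first order statistic with j/n >= 1 - p.
    for j, x in enumerate(xs, start=1):
        if j / n >= 1.0 - p:
            return x
    return xs[-1]


def empirical_exceedance(sample, x):
    """Empirical counterpart of (1.2): fraction of observations strictly above x."""
    return sum(1 for v in sample if v > x) / len(sample)


if __name__ == "__main__":
    data = [5.0, 1.0, 3.0, 2.0, 4.0]
    print(empirical_quantile(data, 0.5))     # a central quantile
    print(empirical_quantile(data, 0.01))    # p < 1/n: stuck at the sample maximum
    print(empirical_exceedance(data, 10.0))  # level above the maximum
```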

1.1 Main limiting results in EVT

The main limiting result in EVT can be attributed to Gnedenko (1943), who fully characterized the possible non-degenerate limiting distribution of the linearly normalised maximum, $(X_{n:n} - b_n)/a_n$, $a_n > 0$, $b_n \in \mathbb{R}$. Such a limit is of the type of the general extreme value distribution (EVD),

$$EV_\gamma(x) := \begin{cases} \exp\big(-(1+\gamma x)^{-1/\gamma}\big), \quad 1+\gamma x > 0 & \text{if } \gamma \ne 0 \\ \exp(-\exp(-x)), \quad x \in \mathbb{R} & \text{if } \gamma = 0. \end{cases} \tag{1.3}$$

When such a non-degenerate limit exists, we say that $F$ belongs to the max-domain of attraction of $EV_\gamma$ and denote this by $F \in \mathcal{D}_M(EV_\gamma)$. The shape parameter $\gamma$ is related to the heaviness of the right tail $\overline{F} = 1 - F$ and is often called the extreme value index (EVI).

Another seminal result in the field of EVT is due to Balkema and de Haan (1974) and Pickands (1975). If we properly scale the excesses over a high threshold $u$, the limit distribution of those scaled excesses is the Generalized Pareto distribution (GPD), closely related to the d.f. $EV_\gamma(x)$, in (1.3), and defined by

$$GP_\gamma(x) := 1 + \ln EV_\gamma(x) = \begin{cases} 1 - (1+\gamma x)^{-1/\gamma}, \quad 1+\gamma x > 0,\ x > 0 & \text{if } \gamma \ne 0 \\ 1 - \exp(-x), \quad x > 0 & \text{if } \gamma = 0 \end{cases} \tag{1.4}$$

(see, for instance, Embrechts et al., 1997, Section 3.4, and Reiss and Thomas, 2007, Section 1.4, for more details).
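The two d.f.'s in (1.3) and (1.4), and the relation $GP_\gamma(x) = 1 + \ln EV_\gamma(x)$ for $x > 0$, can be checked numerically. The sketch below is only an illustration in plain Python; the function names `ev_cdf` and `gp_cdf` are hypothetical, and the values returned outside the support are the natural conventions (0 below the left endpoint, 1 above the right endpoint), which is an assumption of this sketch.

```python
import math


def ev_cdf(x, gamma):
    """EV_gamma(x) as in (1.3)."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0.0:
        # Outside the support: left of it when gamma > 0, right of it when gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def gp_cdf(x, gamma):
    """GP_gamma(x) as in (1.4); equal to 0 for x <= 0."""
    if x <= 0.0:
        return 0.0
    if gamma == 0.0:
        return 1.0 - math.exp(-x)
    t = 1.0 + gamma * x
    if t <= 0.0:
        return 1.0  # gamma < 0 and x beyond the right endpoint -1/gamma
    return 1.0 - t ** (-1.0 / gamma)


if __name__ == "__main__":
    for g in (0.5, 0.0, -0.25):
        x = 1.5
        print(g, gp_cdf(x, g), 1.0 + math.log(ev_cdf(x, g)))
```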


1.2 Most relevant approaches in the field of Statistics of Univariate Extremes

We shall briefly refer to the three most important approaches in the area of Statistics of Univariate Extremes: the block maxima (BM) method, the peaks-over-threshold (POT) or even the peaks-over-random-threshold (PORT) methods, and the largest observations (LOB) method. For a more detailed review, with extensive associated references, see Gomes et al. (2008) and Beirlant et al. (2012).

• The first method, the BM method, is of a parametric nature: we work with a sample of maxima of adequate blocks of observations, and estimate the parameters $(\lambda, \delta, \gamma)$ of the EVD, $EV_\gamma((x - \lambda)/\delta)$, $\lambda \in \mathbb{R}$, $\delta > 0$, $\gamma \in \mathbb{R}$, with $EV_\gamma(x)$ given in (1.3). This method is known to be possibly inefficient, due to the fact that the loss of information in each block can be catastrophic.

• In the second approach, the POT method, inference is performed through the use of the sample of excesses over a high deterministic threshold $u$. The limiting d.f. of these excesses is, up to a scale factor, the distribution $GP_\gamma(x)$, in (1.4), and the method can be of a parametric or a semi-parametric nature. Note that the high threshold can also be a random value, leading to the PORT methodology, a terminology recently introduced in Araújo Santos et al. (2006).

• The third approach, the LOB method, is the one we shall consider in this paper. It uses the largest $k$ observations to make inference about the right tail $\overline{F} = 1 - F$, assuming only that $F$ belongs to a wide sub-domain of $\mathcal{D}_M(EV_\gamma)$.

1.3 Estimators under study

Under the largest observations framework, and whenever dealing with heavy-tailed models, the classical semi-parametric EVI and VaR-estimators are the Hill (Hill, 1975) and Weissman-Hill estimators (Weissman, 1978), with functional expressions

$$\hat\gamma^{H}_{k,n} := \frac{1}{k} \sum_{i=1}^{k} \big(\ln X_{n-i+1:n} - \ln X_{n-k:n}\big) \tag{1.5}$$

and

$$\hat{Q}^{H}_{k,n}(p) := X_{n-k:n}\, c_k^{\hat\gamma^{H}_{k,n}}, \quad c_k \equiv c_k(p) := \frac{k}{np}, \quad k = 1, 2, \ldots, n-1, \tag{1.6}$$

respectively, which are pseudo-maximum likelihood estimators, consistent in the whole $\mathcal{D}_M^{+} := \mathcal{D}_M(EV_\gamma)_{\gamma > 0}$, provided that $k$ is intermediate, i.e. if

$$k = k_n \to \infty \quad \text{and} \quad k/n \to 0, \quad \text{as } n \to \infty. \tag{1.7}$$

In a way dual to (1.6), and given a high level $x = x_n$, the probability $p = p_x$ of exceedance of such a level can be estimated by

$$\hat{p}^{H}_{k,n}(x) := \left(\frac{k}{n}\right) \tilde{C}_k^{-1/\hat\gamma^{H}_{k,n}}, \quad \tilde{C}_k \equiv \tilde{C}_k(x) := \frac{x}{X_{n-k:n}}, \quad k = 1, 2, \ldots, n-1. \tag{1.8}$$
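The estimators (1.5), (1.6) and (1.8) translate almost literally into code. The following is a minimal sketch in plain Python, assuming a sample of positive observations; the function names are hypothetical, and the strict Pareto data generated with `random.paretovariate` (tail index 2, so that $\gamma = 0.5$) are only an illustrative assumption, not the models studied in the paper.

```python
import math
import random


def _threshold_and_top(sample, k):
    """Return (n, X_{n-k:n}, [X_{n:n}, ..., X_{n-k+1:n}]) for 1 <= k <= n - 1."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    if xs[n - k - 1] <= 0.0:
        raise ValueError("the threshold X_{n-k:n} must be positive")
    return n, xs[n - k - 1], xs[n - k:][::-1]


def hill_evi(sample, k):
    """Hill estimator (1.5): mean log-excess of the top k o.s.'s over X_{n-k:n}."""
    _, threshold, top = _threshold_and_top(sample, k)
    return sum(math.log(v) - math.log(threshold) for v in top) / k


def weissman_quantile(sample, k, p):
    """Weissman-Hill VaR estimator (1.6): X_{n-k:n} * (k / (n p)) ** gamma_hat."""
    n, threshold, _ = _threshold_and_top(sample, k)
    return threshold * (k / (n * p)) ** hill_evi(sample, k)


def hill_exceedance_prob(sample, k, x):
    """Estimator (1.8): (k / n) * (x / X_{n-k:n}) ** (-1 / gamma_hat)."""
    n, threshold, _ = _threshold_and_top(sample, k)
    return (k / n) * (x / threshold) ** (-1.0 / hill_evi(sample, k))


if __name__ == "__main__":
    random.seed(1)
    data = [random.paretovariate(2.0) for _ in range(5000)]  # assumed model, gamma = 0.5
    k = 500
    q = weissman_quantile(data, k, p=0.001)
    print(hill_evi(data, k), q, hill_exceedance_prob(data, k, q))
```

By construction the two estimators are dual: plugging the estimated quantile for a given $p$ into (1.8), with the same $k$, returns $p$ again.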


Under further adequate restrictions on $k$, we can guarantee the asymptotic normality of the estimators $\hat\gamma^{H}_{k,n}$, $\hat{Q}^{H}_{k,n}(p)$ and $\hat{p}^{H}_{k,n}(x)$, in (1.5), (1.6) and (1.8), respectively. But most of the time these estimators exhibit a large variance for small $k$, a strong bias for moderate $k$, sample paths with very short stability regions around the target value and a very peaked mean square error (MSE) structure, as a function of $k$. This has led researchers to search for alternative estimators with a smaller MSE.

Since heavy-tailed models have a finite mean value only if $\gamma < 1$, methods based on sample moments have rarely been considered for this type of distribution. But in many practical fields, such as finance or insurance, we usually have a positive EVI smaller than one, and even smaller than 1/2. In this article, and for the estimation of the above-mentioned parameters of extreme events, we revisit the use of a probability weighted moments (PWM) method based on the largest observations, developed in Caeiro and Gomes (2011) for the EVI.

The PWM method is a generalization of the method of moments. It also consists in equating sample moments with their corresponding theoretical moments, and then solving those equations in order to obtain estimates of the different parameters in play. The PWM of a r.v. $X$ are defined by

$$M_{p,r,s} := E\big(X^{p} (F(X))^{r} (1 - F(X))^{s}\big),$$

where $p$, $r$ and $s$ are any real numbers (Greenwood et al., 1979). When $r = s = 0$, $M_{p,0,0}$ are the usual noncentral moments of order $p$. Hosking et al. (1985) advise the use of $M_{1,r,s}$, because then the relations between parameters and moments usually have a much simpler form. Also, when $r$ and $s$ are integers, $F^{r}(1 - F)^{s}$ can be written as a linear combination of powers of $F$ or $1 - F$. So it is usual to work with the particular case

$$a_r := M_{1,0,r} = E\big(X (1 - F(X))^{r}\big), \quad r \ge 0,$$

and the associated estimator,

$$\hat{a}_r = \frac{1}{n} \sum_{i=1}^{n-r} \frac{(n-1-r)!\,(n-i)!}{(n-1)!\,(n-i-r)!}\, X_{i:n} = \frac{1}{n} \sum_{i=1}^{n-r} \frac{(n-i)(n-i-1) \cdots (n-i-r+1)}{(n-1)(n-2) \cdots (n-r)}\, X_{i:n}. \tag{1.9}$$
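The two expressions in (1.9) can be compared directly on a small sample. The sketch below is an illustration only, with hypothetical function names; it assumes a non-negative integer $r < n$. For large $n$ the product form is the more practical one, since the factorial form manipulates very large integers.

```python
import math


def pwm_factorial(sample, r):
    """Estimator of a_r, first expression in (1.9) (ratio of factorials)."""
    xs = sorted(sample)
    n = len(xs)
    if not 0 <= r < n:
        raise ValueError("r must be an integer with 0 <= r < n")
    total = 0.0
    for i in range(1, n - r + 1):  # i = 1, ..., n - r
        num = math.factorial(n - 1 - r) * math.factorial(n - i)
        den = math.factorial(n - 1) * math.factorial(n - i - r)
        total += (num / den) * xs[i - 1]  # xs[i - 1] is X_{i:n}
    return total / n


def pwm_product(sample, r):
    """Estimator of a_r, second expression in (1.9) (ratio of falling products)."""
    xs = sorted(sample)
    n = len(xs)
    if not 0 <= r < n:
        raise ValueError("r must be an integer with 0 <= r < n")
    total = 0.0
    for i in range(1, n - r + 1):
        weight = 1.0
        for j in range(r):  # factors (n - i - j) / (n - 1 - j), j = 0, ..., r - 1
            weight *= (n - i - j) / (n - 1 - j)
        total += weight * xs[i - 1]
    return total / n


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    for r in (0, 1, 2):
        print(r, pwm_factorial(data, r), pwm_product(data, r))
```

For $r = 0$ both expressions reduce to the sample mean, in agreement with $a_0 = E(X)$.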

For $\gamma < 1$ and for d.f.'s like the EVD, $EV_\gamma((x - \lambda)/\delta)$, with $EV_\gamma(x)$ given in (1.3), the Pareto d.f.,

$$P_\gamma(x; \delta) = 1 - (x/\delta)^{-1/\gamma}, \quad x > \delta, \tag{1.10}$$

and the GPD, $GP_\gamma(x/\delta)$, with $GP_\gamma(x)$ defined in (1.4), the PWM have simple expressions, which allow a straightforward estimation of the EVI, $\gamma$. For the EVD, see Hosking et al. (1985) and the improved versions in Diebolt et al. (2007, 2008). As an example, the Pareto PWM (PPWM) and the generalized Pareto PWM (GPPWM) estimators of $\gamma$ are valid for $\gamma < 1$, and given by

$$\hat\gamma^{PPWM} = 1 - \frac{\hat{a}_1}{\hat{a}_0 - \hat{a}_1} \quad \text{and} \quad \hat\gamma^{GPPWM} = 1 - \frac{2\hat{a}_1}{\hat{a}_0 - 2\hat{a}_1}, \tag{1.11}$$

respectively, where $\hat{a}_0$ and $\hat{a}_1$ are given in (1.9). The estimator $\hat\gamma^{GPPWM}$, in (1.11), was introduced and studied in Hosking and Wallis (1987).
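The expressions in (1.11) can be recovered by a short computation, sketched here under the assumption that $F$ is continuous, so that $U := 1 - F(X)$ is uniform on $(0,1)$ and $X = F^{\leftarrow}(1 - U)$. The same computation also shows that, for the Pareto d.f. (1.10), the scale $\delta$ equals $a_0 a_1/(a_0 - a_1)$.

```latex
\begin{align*}
% Pareto d.f. (1.10): X = \delta U^{-\gamma}
a_r &= E\big(X (1 - F(X))^{r}\big) = \delta\, E\big(U^{\,r-\gamma}\big)
     = \frac{\delta}{r + 1 - \gamma}, \qquad \gamma < 1,\ r \ge 0, \\
a_0 - a_1 &= \frac{\delta}{1-\gamma} - \frac{\delta}{2-\gamma}
     = \frac{\delta}{(1-\gamma)(2-\gamma)}
     \;\Longrightarrow\;
     \frac{a_1}{a_0 - a_1} = 1 - \gamma, \qquad \frac{a_0\, a_1}{a_0 - a_1} = \delta, \\[1ex]
% GPD, GP_\gamma(x/\delta) with \gamma \ne 0: X = \delta (U^{-\gamma} - 1)/\gamma
a_r &= \frac{\delta}{\gamma}\left(\frac{1}{r+1-\gamma} - \frac{1}{r+1}\right)
     = \frac{\delta}{(r+1)(r+1-\gamma)}, \\
a_0 - 2a_1 &= \frac{\delta}{1-\gamma} - \frac{\delta}{2-\gamma}
     = \frac{\delta}{(1-\gamma)(2-\gamma)}
     \;\Longrightarrow\;
     \frac{2 a_1}{a_0 - 2 a_1} = 1 - \gamma .
\end{align*}
```

Replacing $a_0$ and $a_1$ by their sample counterparts then gives the two estimators of $\gamma$.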

We shall consider in this paper the PPWM estimators of $\mathrm{VaR}_p$ and $p_x$, the parameters respectively defined in (1.1) and (1.2), associated with the PPWM EVI-estimators studied in Caeiro and Gomes (2011). Those estimators are semi-parametric in nature and, for comparison with the equivalent estimators based on the Hill EVI-estimator, in (1.5), are based on the top $k+1$ largest o.s.'s, $X_{n-k:n} \le X_{n-k+1:n} \le \cdots \le X_{n:n}$. Under such a framework, the estimators $\hat{a}_0$ and $\hat{a}_1$, in (1.9), should be replaced by

$$\hat{a}_0(k) := \frac{1}{k+1} \sum_{i=1}^{k+1} X_{n-i+1:n} \quad \text{and} \quad \hat{a}_1(k) := \frac{1}{k+1} \sum_{i=1}^{k+1} \frac{i}{k+1}\, X_{n-i+1:n},$$

respectively. The PPWM EVI, VaR and p-estimators, based on the largest values, are

$$\hat\gamma^{PPWM}_{k,n} := 1 - \frac{\hat{a}_1(k)}{\hat{a}_0(k) - \hat{a}_1(k)}, \tag{1.12}$$

$$\hat{Q}^{PPWM}_{k,n}(p) := \frac{\hat{a}_0(k)\, \hat{a}_1(k)}{\hat{a}_0(k) - \hat{a}_1(k)} \left(\frac{k}{np}\right)^{\hat\gamma^{PPWM}_{k,n}} \tag{1.13}$$

and

$$\hat{p}^{PPWM}_{k,n}(x) := \left(\frac{k}{n}\right) \left(\frac{x\,(\hat{a}_0(k) - \hat{a}_1(k))}{\hat{a}_0(k)\, \hat{a}_1(k)}\right)^{-1/\hat\gamma^{PPWM}_{k,n}}, \tag{1.14}$$

respectively, with $k = 1, 2, \ldots, n-1$, and are consistent whenever $\gamma < 1$.
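A minimal plain-Python sketch of (1.12), (1.13) and (1.14) follows. It assumes positive observations; the function names are hypothetical, and the strict Pareto sample used in the usage lines (tail index 4, so that $\gamma = 0.25$) is only an illustrative assumption.

```python
import random


def ppwm_moments(sample, k):
    """Return (a0_hat(k), a1_hat(k)), computed from the top k + 1 order statistics."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    top = [xs[n - i] for i in range(1, k + 2)]  # X_{n-i+1:n}, i = 1, ..., k + 1
    a0 = sum(top) / (k + 1)
    a1 = sum(i / (k + 1) * v for i, v in enumerate(top, start=1)) / (k + 1)
    return a0, a1


def ppwm_evi(sample, k):
    """PPWM EVI-estimator (1.12)."""
    a0, a1 = ppwm_moments(sample, k)
    return 1.0 - a1 / (a0 - a1)


def ppwm_quantile(sample, k, p):
    """PPWM VaR-estimator (1.13)."""
    a0, a1 = ppwm_moments(sample, k)
    scale = a0 * a1 / (a0 - a1)
    return scale * (k / (len(sample) * p)) ** ppwm_evi(sample, k)


def ppwm_exceedance_prob(sample, k, x):
    """PPWM estimator (1.14) of the probability of exceedance of the level x."""
    a0, a1 = ppwm_moments(sample, k)
    scale = a0 * a1 / (a0 - a1)
    return (k / len(sample)) * (x / scale) ** (-1.0 / ppwm_evi(sample, k))


if __name__ == "__main__":
    random.seed(2)
    data = [random.paretovariate(4.0) for _ in range(10000)]  # assumed model, gamma = 0.25
    k = 1000
    q = ppwm_quantile(data, k, p=0.0005)
    print(ppwm_evi(data, k), q, ppwm_exceedance_prob(data, k, q))
```

As with the Hill-based pair (1.6) and (1.8), evaluating (1.14) at the quantile estimated by (1.13), with the same $k$, returns the original $p$.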

De Haan and Ferreira (2006) considered, also for $\gamma < 1$, the semi-parametric GPPWM EVI-estimator, based on the sample of excesses over the high random level $X_{n-k:n}$, i.e.,

$$\hat\gamma^{GPPWM}_{k,n} := 1 - \frac{2\hat{a}^{\star}_1(k)}{\hat{a}^{\star}_0(k) - 2\hat{a}^{\star}_1(k)}, \tag{1.15}$$

with $k = 1, 2, \ldots, n-1$, and $\hat{a}^{\star}_s(k) := \sum_{i=1}^{k} (i/k)^{s} (X_{n-i+1:n} - X_{n-k:n})/k$, $s = 0, 1$. For a finite-sample comparison between the PPWM EVI-estimators in (1.12) and the GPPWM EVI-estimators in (1.15), see Caeiro and Gomes (2011).
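For completeness, the GPPWM EVI-estimator (1.15) can be sketched in the same way. The function name is hypothetical and the code is only an illustration in plain Python. Because the estimator depends on the data only through the excesses over $X_{n-k:n}$, it is invariant to shifts of the sample, which the usage lines check on an assumed strict Pareto sample with $\gamma = 0.25$.

```python
import random


def gppwm_evi(sample, k):
    """GPPWM EVI-estimator (1.15), based on the k excesses over X_{n-k:n}."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    threshold = xs[n - k - 1]  # X_{n-k:n}
    excesses = [xs[n - i] - threshold for i in range(1, k + 1)]  # i = 1 is the largest
    a0_star = sum(excesses) / k
    a1_star = sum((i / k) * e for i, e in enumerate(excesses, start=1)) / k
    return 1.0 - 2.0 * a1_star / (a0_star - 2.0 * a1_star)


if __name__ == "__main__":
    random.seed(3)
    data = [random.paretovariate(4.0) for _ in range(20000)]  # assumed model, gamma = 0.25
    k = 2000
    print(gppwm_evi(data, k))
    print(gppwm_evi([v + 10.0 for v in data], k))  # shifted sample, same excesses
```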

1.4 Scope of the article

In Section 2, after reviewing a few results already available in the literature, we state a lemma and a theorem related to the asymptotic properties of the PPWM-estimators, defined in (1.13) and (1.14), of the above-mentioned parameters of extreme events, the Value-at-Risk at the level $p$ and the probability $p_x$ of exceedance of a high level $x$, defined in (1.1) and (1.2), respectively. The finite-sample performance of these estimators is illustrated, in Section 3, through a Monte Carlo simulation study. In Section 4, we put forward an adaptive choice of thresholds, again on the basis of computer-intensive bootstrap methods. Applications to a real data set in the field of insurance as well as to a simulated data set are provided in Section 5.


References

Statistics of extremes

Modelling Extremal Events

Modelling Extremal Events: for Insurance and Finance

Statistical Inference Using Extreme Order Statistics

A Simple General Approach to Inference About the Tail of a Distribution
