
Statistics, Vol. 39, No. 6, December 2005, 503–518
Kernel density estimation for heavy-tailed distributions using the Champernowne transformation
TINE BUCH-LARSEN†, JENS PERCH NIELSEN‡, MONTSERRAT GUILLÉN*§ and CATALINA BOLANCÉ¶
†Department of Research, Codan, 60 Gammel Kongevej, DK-1790 Copenhagen V, Denmark
‡Royal & SunAlliance, 60 Gammel Kongevej, DK-1790 Copenhagen V, Denmark
§Department of Econometrics, RFA-IREA, University of Barcelona, Diagonal,
690, 08034 Barcelona, Spain
¶Department of Econometrics, RFA-IREA, University of Barcelona, 08034 Barcelona, Spain
(Received 17 January 2005; in final form 17 October 2005)
When estimating loss distributions in insurance, large and small losses are usually split because it is
difficult to find a simple parametric model that fits all claim sizes. This approach involves determining
the threshold level between large and small losses. In this article, a unified approach to the estimation
of loss distributions is presented. We propose an estimator obtained by transforming the data set with
a modification of the Champernowne cdf and then estimating the density of the transformed data by
use of the classical kernel density estimator. We investigate the asymptotic bias and variance of the
proposed estimator. In a simulation study, the proposed method shows a good performance. We also
present two applications dealing with claims costs in insurance.
Keywords: Actuarial loss models; Transformation; Skewness; Champernowne distribution
2000 Mathematics Subject Classifications: 62G07; 62-07; 91B30
1. Introduction
In finance and non-life insurance, estimation of loss distributions is a fundamental part
of the business. In most situations, losses are small, and extreme losses are rarely observed,
but the number and the size of extreme losses can have a substantial influence on the profit of
the company. Standard statistical methodology, such as integrated error and likelihood, does
not weigh small and big losses differently in the evaluation of an estimator. These evaluation
methods do not, therefore, emphasize an important part of the error: the error in the tail.
Practitioners often decide to analyse large and small losses separately, because no single, classical parametric model fits all claim sizes. This approach leaves some important challenges: choosing the appropriate parametric model, identifying the best way of estimating the parameters and determining the threshold level between large and small losses.
*Corresponding author. Email: mguillen@ub.edu
Statistics
ISSN 0233-1888 print/ISSN 1029-4910 online © 2005 Taylor & Francis
http://www.tandf.co.uk/journals
DOI: 10.1080/02331880500439782

This work presents a systematic approach to the estimation of loss distributions which is suitable for heavy-tailed situations. The proposed estimator is obtained by transforming the data set with a parametric estimator and afterwards estimating the density of the transformed data set using the classical kernel density estimator [1, 2]

\hat{f}(y) = \frac{1}{Nb} \sum_{i=1}^{N} K\left( \frac{y - Y_i}{b} \right),

where K is the kernel function, b is the bandwidth and Y_i, i = 1, ..., N, is the transformed data set. The estimator of the original density is obtained by back-transformation of \hat{f}(y).
We will call this method a semiparametric estimation procedure because a parametrized transformation family is used. We propose to use a transformation based on the little-known Champernowne cdf, because it produces good results in all the studied situations and it is straightforward to apply.
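As a concrete illustration of the classical kernel density step, here is a minimal Python sketch (not the authors' code); the Gaussian kernel, the function name and the toy normal sample are illustrative choices:

```python
import numpy as np

def kernel_density(y, data, b):
    """Classical kernel density estimator
    f_hat(y) = 1/(N*b) * sum_i K((y - Y_i)/b), here with a Gaussian kernel K."""
    u = (y - data[:, None]) / b                        # scaled distances, shape (N, len(y))
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)   # Gaussian kernel K(u)
    return k.sum(axis=0) / (len(data) * b)

# toy check on a standard normal sample: the estimate behaves like a density
rng = np.random.default_rng(0)
sample = rng.normal(size=500)
grid = np.linspace(-5.0, 5.0, 1001)
dens = kernel_density(grid, sample, b=0.3)
mass = dens.sum() * (grid[1] - grid[0])                # Riemann-sum approximation of its integral
```

Any symmetric density can play the role of K; the bandwidth b governs the usual bias-variance trade-off.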
The semiparametric estimator with shifted power transformation was introduced by Wand
et al. [3] in 1991. They showed that the classical kernel density estimator was improved
substantially by applying a transformation and suggested the shifted power transformation
family. Bolancé et al. [4] improved the shifted power transformation for highly skewed data
by proposing an alternative parameter selection algorithm. The semiparametric estimator with
the Johnson family transformation function was studied by Yang and Marron [5]. Hjort and
Glad [6] advocated a semiparametric estimator with a parametric start, which is closely related
to the bias reduction method described by Jones et al. [7]. The Möbius-like transformation was introduced by Clements et al. [8]. In contrast to the shifted power transformation, which transforms (0, ∞) into (−∞, ∞), the Möbius-like transformation transforms (0, ∞) into (−1, 1) and the parameter estimation method is designed to avoid boundary problems. Scaillet [9] has recently studied non-parametric estimators for probability density functions which have support on the non-negative real line using alternative kernels.
The original Champernowne distribution has density [10]

f(x) = \frac{c}{x \left( \tfrac{1}{2}(x/M)^{-\alpha} + \lambda + \tfrac{1}{2}(x/M)^{\alpha} \right)}, \quad x \ge 0, \qquad (1)

where c is a normalizing constant and α, λ and M are parameters. The distribution was first mentioned in 1936 by D. G. Champernowne, when he spoke on 'The Theory of Income Distribution' at the Oxford Meeting of the Econometric Society [11, 12]. Later, he gave more details about the distribution [13] and its application to economics. When λ equals 1 and the normalizing constant c equals 1/2, the density of the original distribution is simply called the Champernowne distribution

f(x) = \frac{\alpha M^{\alpha} x^{\alpha - 1}}{\left( x^{\alpha} + M^{\alpha} \right)^{2}}

with cdf

F(x) = \frac{x^{\alpha}}{x^{\alpha} + M^{\alpha}}. \qquad (2)

The Champernowne distribution converges to a Pareto distribution in the tail, while looking more like a lognormal distribution near 0 when α > 1. Its density is either 0 or infinity at 0 (unless α = 1).
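Because the cdf (2) is available in closed form, its two defining features — F(M) = 0.5 for every α, and the Pareto-like tail — are easy to check numerically. A small sketch (the function name and parameter values are my own illustrations):

```python
import numpy as np

def champernowne_cdf(x, alpha, M):
    """Champernowne cdf of eq. (2): F(x) = x^alpha / (x^alpha + M^alpha), x >= 0."""
    xa = np.power(x, alpha)
    return xa / (xa + M ** alpha)

# F(M) = 0.5 for every alpha, so M is always the median of the distribution
vals = [champernowne_cdf(3.0, a, 3.0) for a in (0.5, 1.0, 2.0)]

# Pareto-like tail: 1 - F(x) is approximately (M/x)^alpha for large x
x_big = 1e6
tail = 1.0 - champernowne_cdf(x_big, 2.0, 3.0)
```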
In the transformation kernel density estimation method, if we transform the data with the Champernowne cdf, the inflexible shape near 0 results in boundary problems. We argue that a modification of the Champernowne distribution with an additional parameter can solve this inconvenience.

We did not choose to work with classical extensions of the Pareto distribution such as the generalized Pareto distribution (GPD) [14]. The reason is that the GPD often estimates distributions of infinite support to have finite support and hence it cannot be used as a transformation. We carried out a small simulation study of a standard lognormal distribution; more than half the time the GPD suggested a distribution with finite support. Furthermore, the GPD needs a (hard to pick) threshold from where the distribution starts, so the transformation methodology also meets problems at the beginning of the distribution.
In this paper, we study the transformation kernel density estimation method. The conclusion of the simulation study is that the new approach based on the modified Champernowne distribution is the preferable method, because it is the only estimator which has a good performance in most of the investigated situations. Section 2 describes the transformation family and explains the parameter estimation procedure. Section 3 presents the semiparametric kernel density estimator and its properties. In section 4, the simulation study is presented and section 5 shows two applications. Finally, section 6 outlines the main conclusions.
2. The modified Champernowne distribution function
We generalize the Champernowne distribution with a new parameter c. This parameter ensures
the possibility of a positive finite value of the density at 0 for all α.
DEFINITION 2.1  The modified Champernowne cdf is defined for x ≥ 0 and has the form

T_{\alpha,M,c}(x) = \frac{(x + c)^{\alpha} - c^{\alpha}}{(x + c)^{\alpha} + (M + c)^{\alpha} - 2c^{\alpha}}, \quad x \in \mathbb{R}_{+}, \qquad (3)

with parameters α > 0, M > 0 and c ≥ 0, and density

t_{\alpha,M,c}(x) = \frac{\alpha (x + c)^{\alpha - 1} \left( (M + c)^{\alpha} - c^{\alpha} \right)}{\left( (x + c)^{\alpha} + (M + c)^{\alpha} - 2c^{\alpha} \right)^{2}}, \quad x \in \mathbb{R}_{+}.
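Definition 2.1 translates directly into code. The sketch below (my own naming, with arbitrary parameter values) implements T and t and checks that T(0) = 0, that T(M) = 0.5, and that t is indeed the derivative of T:

```python
import numpy as np

def T(x, alpha, M, c):
    """Modified Champernowne cdf T_{alpha,M,c}(x) of eq. (3)."""
    ca = c ** alpha
    return ((x + c) ** alpha - ca) / ((x + c) ** alpha + (M + c) ** alpha - 2.0 * ca)

def t(x, alpha, M, c):
    """Its density t_{alpha,M,c}(x), the derivative of T."""
    ca = c ** alpha
    num = alpha * (x + c) ** (alpha - 1.0) * ((M + c) ** alpha - ca)
    den = ((x + c) ** alpha + (M + c) ** alpha - 2.0 * ca) ** 2
    return num / den

alpha, M, c = 1.5, 3.0, 2.0
h = 1e-6
num_deriv = (T(1.0 + h, alpha, M, c) - T(1.0 - h, alpha, M, c)) / (2.0 * h)
t_at_zero = t(0.0, alpha, M, c)   # finite and positive because c > 0
```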
Corresponding to the Champernowne distribution, the modified Champernowne distribution converges to a Pareto distribution in the tail:

t_{\alpha,M,c}(x) \sim \alpha \left( (M + c)^{\alpha} - c^{\alpha} \right) \frac{1}{x^{\alpha + 1}} \quad \text{as } x \longrightarrow \infty.
The effect of the additional parameter c is different for α > 1 and for α < 1. The parameter c has some 'scale parameter properties': when α < 1, the derivative of the cdf becomes larger for increasing c, and conversely, when α > 1, the derivative of the cdf becomes smaller for increasing c. When α ≠ 1, the choice of c affects the density in three ways. First, c changes the density in the tail. When α < 1, positive values of c result in lighter tails, and the opposite when α > 1. Secondly, c changes the density at 0. A positive c provides a positive finite density at 0:

0 < t_{\alpha,M,c}(0) = \frac{\alpha c^{\alpha - 1}}{(M + c)^{\alpha} - c^{\alpha}} < \infty \quad \text{when } c > 0.

Thirdly, c moves the mode. When α > 1, the density has a mode, and positive values of c shift the mode to the left. We therefore see that the parameter c also has a shift parameter effect. When α = 1, the choice of c has no effect.
Figure 1. Different shapes of the modified Champernowne distribution with different choices of α, as well as the effect of the parameter c. In all plots, c = 0 (dashed line) and c = 2 (solid line).

Figure 1 illustrates the role of c: the two graphs on the top show the cdfs and the densities for the modified Champernowne distribution for fixed α < 1 and M = 3. In the cdf plot, we see that increasing c results in lower values of the cdf in the interval [0, M) and higher values of the cdf in the interval [M, ∞). In the density plot, we see that increasing c results in a lighter tail and a finite density at 0. In the two graphs in the middle, we have fixed α = 1 and M = 3. We see that changing c has no effect. The two graphs at the bottom illustrate the effect of increasing c when α > 1, for M = 3. Notice that the values of the cdf become higher in the interval [0, M) and lower in the interval [M, ∞). The density plot shows that positive values of c move the mode to the left and produce a heavier tail.
From a computational point of view, it is simpler to estimate M and then proceed to the
other parameters.
In the Champernowne distribution, we notice that T_{\alpha,M,0}(M) = 0.5. The same holds for the modified Champernowne distribution: T_{\alpha,M,c}(M) = 0.5. This suggests that M can be estimated
as the empirical median of the data set. The empirical median is a robust estimator, especially
for heavy-tailed distributions, as shown by Lehmann [15]. He studied the properties of the
median and the mean as an estimator of location for the normal distribution and the Cauchy distribution, and showed that whereas the mean works well as an estimator of location for the
normal distribution, it works poorly for the Cauchy distribution due to its heavy tail. Tukey [16]
reached the same conclusion when he studied the efficiency of the median and the mean. He
showed that the median efficiency increases as the tail becomes heavier. Corresponding models
have also been studied for heavy-tailed distributions [17–19]. A similar type of discussion for
the variance estimation was done by Hubert [20]. As we are especially concerned about heavy
tails, we consider the robustness of the median to be important.
After parameter M has been estimated as described earlier, the next step is to estimate the pair (α, c) which maximizes the log likelihood function:

l(\alpha, c) = N \log \alpha + N \log\left( (M + c)^{\alpha} - c^{\alpha} \right) + (\alpha - 1) \sum_{i=1}^{N} \log(X_i + c) - 2 \sum_{i=1}^{N} \log\left( (X_i + c)^{\alpha} + (M + c)^{\alpha} - 2c^{\alpha} \right). \qquad (4)

For a fixed M, this likelihood function is concave and has a maximum.
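A rough implementation of the whole estimation scheme — M̂ as the empirical median, then (α, c) maximizing (4) — might look as follows. A coarse grid search stands in for a proper optimizer, and the inversion sampler and all names are my own; the synthetic data come from the ordinary Champernowne cdf (2) with α = 2, M = 3 (i.e. c = 0):

```python
import numpy as np

def loglik(alpha, c, M, data):
    """Log likelihood l(alpha, c) of eq. (4), with M held fixed."""
    N = len(data)
    ca = c ** alpha
    return (N * np.log(alpha)
            + N * np.log((M + c) ** alpha - ca)
            + (alpha - 1.0) * np.log(data + c).sum()
            - 2.0 * np.log((data + c) ** alpha + (M + c) ** alpha - 2.0 * ca).sum())

def fit(data):
    """M from the empirical median, then (alpha, c) by a coarse grid search on l."""
    M = np.median(data)
    grid = [(a, c) for a in np.linspace(0.2, 5.0, 49) for c in np.linspace(0.0, 3.0, 16)]
    alpha, c = max(grid, key=lambda p: loglik(p[0], p[1], M, data))
    return alpha, M, c

# draw from the ordinary Champernowne cdf (2) by inversion: x = M * (u/(1-u))**(1/alpha)
rng = np.random.default_rng(42)
u = rng.uniform(size=2000)
X = 3.0 * (u / (1.0 - u)) ** 0.5
alpha_hat, M_hat, c_hat = fit(X)
```

On a sample of this size, the fitted values should land near the true (α, M) = (2, 3); in practice the concavity in (α, c) noted above allows a gradient-based optimizer instead of a grid.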
3. The semiparametric transformation kernel density estimator
In this section, we will make a detailed derivation of the estimator based on the modified
Champernowne distribution, which we will call KMCE. The resulting estimator is obtained
by computing a non-parametric classical kernel density estimator for the transformed data set
and, finally, the result is back-transformed.
3.1 Transformation with the modified Champernowne distribution

Let X_i, i = 1, ..., N, be positive stochastic variables with an unknown cdf F and density f. The following describes in detail the transformation kernel density estimator of f, and figure 2 illustrates the four steps of the estimation procedure for a data set with 1000 observations generated from a Weibull distribution. The resulting transformation kernel density estimator of f based on the Champernowne distribution is denoted by KMCE.

(i) Calculate the parameters (\hat{\alpha}, \hat{M}, \hat{c}) of the modified Champernowne distribution as described in section 2 to obtain the transformation function. In the first plot in figure 2, we see the estimated transformation function and the true Weibull distribution. Notice that the modified Champernowne density has a larger mode and that the tail is too heavy.

(ii) Transform the data set X_i, i = 1, ..., N, with the transformation function T:

Y_i = T_{\hat{\alpha}, \hat{M}, \hat{c}}(X_i), \quad i = 1, ..., N.

The transformation function transforms data into the interval (0, 1), and the parameter estimation is designed to make the transformed data as close to a uniform distribution as possible. The transformed data are illustrated in the second plot in figure 2.
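Steps (i) and (ii) can be sketched as follows. The Weibull sample mirrors the setting of figure 2, but alpha_hat and c_hat are illustrative stand-ins rather than the output of the section 2 likelihood step; only M_hat uses the median rule:

```python
import numpy as np

def T(x, alpha, M, c):
    """Modified Champernowne cdf of eq. (3), used as the transformation function."""
    ca = c ** alpha
    return ((x + c) ** alpha - ca) / ((x + c) ** alpha + (M + c) ** alpha - 2.0 * ca)

rng = np.random.default_rng(7)
X = 2.0 * rng.weibull(1.5, size=1000)   # positive, claims-like sample

# step (i): M from the empirical median; alpha_hat and c_hat are illustrative stand-ins
M_hat = np.median(X)
alpha_hat, c_hat = 1.5, 0.2

# step (ii): transform the data into (0, 1); T(M_hat) = 0.5 keeps the sample median at 1/2
Y = T(X, alpha_hat, M_hat, c_hat)
```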
