A priori ratemaking using bivariate Poisson regression models

doi:10.1016/J.INSMATHECO.2008.11.005


Lluís Bermúdez i Morata

01 Feb 2009 - Insurance Mathematics & Economics (Xarxa de Referència en Economia Aplicada (XREAP)) - Vol. 44, Iss. 1, pp. 135-141

TL;DR: This article examines an a priori ratemaking procedure that includes two different types of claim, and analyses the consequences for pure and loaded premiums when the independence assumption is relaxed by using a bivariate Poisson regression model.


Abstract: In automobile insurance, it is useful to achieve a priori ratemaking by resorting to generalized linear models, and here the Poisson regression model constitutes the most widely accepted basis. However, insurance companies distinguish between claims with or without bodily injuries, or claims with full or partial liability of the insured driver. This paper examines an a priori ratemaking procedure when including two different types of claim. When assuming independence between claim types, the premium can be obtained by summing the premiums for each type of guarantee and is dependent on the rating factors chosen. If the independence assumption is relaxed, then it is unclear as to how the tariff system might be affected. In order to answer this question, bivariate Poisson regression models, suitable for paired count data exhibiting correlation, are introduced. It is shown that the usual independence assumption is unrealistic here. These models are applied to an automobile insurance claims database containing 80,994 contracts belonging to a Spanish insurance company. Finally, the consequences for pure and loaded premiums when the independence assumption is relaxed by using a bivariate Poisson regression model are analysed.


Summary


1 Introduction

Designing a tariff structure for insurance is one of the main tasks for actuaries.
When the assumption of independence between claim types is relaxed, it is interesting to see how the tariff system might be affected.
In the next section, the model used here is defined.

2 Bivariate Poisson regression models

The usual methodology to obtain the a priori premium under the assumption of independence between types of claims can be described as follows.
This principle builds on the net premium by including a risk loading that is proportional to the variance of the risk.
This is the so-called trivariate reduction method that leads to the bivariate Poisson distribution.
Here the authors follow the zero-inflated bivariate Poisson model proposed by Karlis and Ntzoufras (2005).
Standard errors for the parameter estimates are calculated using standard bootstrap methods (boot package in R).

3 The database

The original sample comprised a ten percent sample of the automobile portfolio of a major insurance company operating in Spain in 1995.
Only cars categorised as being for private use were considered.
The sample is not representative of the actual portfolio as it was drawn from a larger panel of policyholders who had been customers of the company for at least seven years; however, it will be helpful for illustrative purposes.
The meaning of those variables referring to the policyholders’ coverage should also be clarified.
The simplest policy only includes third-party liability (claimed and counted as N1 type) and a set of basic guarantees such as emergency roadside assistance, legal assistance or insurance covering medical costs (claimed and counted as N2 type).

4.1 Fitting bivariate Poisson models

First, parameters related to the type of coverage (v10 and v11) were always significant and their presence increased the expected number of claims markedly.
In order to model the covariance term (λ3), the covariates were introduced in the bivariate Poisson model with the result that only the parameter for v10 was significant.
A profile with a mean lying very close to this average was chosen for the third profile.
In Table 7, it can be observed that the zero-inflated bivariate models did not present any noticeable differences with the non zero-inflated models in terms of the mean scores, but they were present in the case of the variance.

5 Conclusions

This paper has tested the independence assumption between claim types given a set of known risk factors and it has shown that independence should be rejected.
The interpretation of a number of bivariate Poisson models has been illustrated in the context of automobile insurance claims and the conclusion is that using a bivariate Poisson model leads to an a priori ratemaking that presents larger variances and, hence, larger loadings than those obtained under the independence assumption.
For the five models analysed here there seems to be a relationship between the goodness of fit and the level of overdispersion considered in each model.
In short, the main finding is that the independence assumption that is implicitly used when pricing automobile insurance by adding the pure premiums for each guarantee (which are obtained using count data regression models) is insufficient because correlations (conditional on the covariates) are ignored.
Footnote 3: In Frees and Valdez (2008), a hierarchical model captures possible dependencies of claims among the various types through a t-copula specification.


Figures (7)

Table 4: Results for bivariate Poisson model with regressor on λ3

Table 5: Results for zero-inflated bivariate Poisson models

Table 3: Results for bivariate Poisson and double Poisson models

Table 6: Five different policyholders to be compared

Table 7: Comparison of a priori ratemaking

Table 1: Explanatory variables used in the model


A priori ratemaking using bivariate Poisson regression models*

Lluís Bermúdez i Morata†

Departament de Matemàtica Econòmica, Financera i Actuarial. Risc en Finances i Assegurances-IREA. Universitat de Barcelona.

September 22, 2008


JEL classification: C51; IM classification: IM11; IB classification: IB40.

Keywords: Bivariate Poisson regression models, Zero-inflated models, Automobile insurance, Bootstrap methods, A priori ratemaking.

* Acknowledgements. The author wishes to acknowledge the support of the Spanish Ministry of Education and FEDER grant SEJ 2007-63298. The author is grateful for the valuable suggestions from the participants in the 12th International Congress on Insurance: Mathematics and Economics in Dalian on July 16-18, 2008.

† Corresponding Author. Departament de Matemàtica Econòmica, Financera i Actuarial, Universitat de Barcelona, Diagonal 690, 08034-Barcelona, Spain. Tel.: +34-93-4034853; fax: +34-93-4034892; e-mail: lbermudez@ub.edu

1 Introduction

Designing a tariff structure for insurance is one of the main tasks for actuaries. Such pricing is particularly complex in the branch of automobile insurance because of highly heterogeneous portfolios. A thorough review of ratemaking systems for automobile insurance, including the most recent developments, can be found in Denuit et al. (2007).

One way to handle this problem of heterogeneity in a portfolio, referred to as tariff segmentation or a priori ratemaking, involves segmenting the portfolio into homogeneous classes so that all insured parties belonging to a particular class pay the same premium. This procedure ensures that the exact weight of each risk is fairly distributed within the portfolio. In the case of automobile insurance, in order to group the policies into homogeneous classes, a series of classification variables are used (e.g., age, sex and place of residence of the driver, or horsepower, class and use of the vehicle). These variables are called a priori ratemaking variables, since their values can be determined before the insured party begins to drive.

If all the factors influencing a risk could be identified, measured and introduced in the tariff system, then the classes defined would be homogeneous. However, this is not the case, as there are important risk factors that are not considered in the a priori tariff. Some are especially difficult to quantify, such as a driver's reflexes, his or her aggressiveness, or knowledge of the Highway Code, among others. As a result, tariff classes can be quite heterogeneous. Hence, the idea has arisen of considering individual differences in policies within the same class by using an a posteriori mechanism, i.e., fitting an individual premium based on the experience of claims for each insured party. This concept has received the name of a posteriori tariff, experience rating or the bonus-malus system.

Here, only the first step in pricing is studied, the a priori ratemaking. In short, the classification or segmentation of risks involves establishing different classes of risk according to their nature and probability of occurrence. For this purpose, factors are determined in order to classify each risk, and it is statistically tested that the probability of a claim depends on these factors, and hence, their influence can be measured. A priori classification based on generalized linear models is the most widely accepted method; see e.g. Dionne and Vanasse (1989), Haberman and Renshaw (1996), Pinquet (1999), Bermúdez et al. (2001) and Boucher and Denuit (2006) for applications in the actuarial sciences, and McCullagh and Nelder (1989) or Dobson (1990) for a general overview of the statistical theory.

The most commonly used generalized linear model for this tariff system is the Poisson regression model and its generalizations (Denuit et al., 2007). Introduced by Dionne and Vanasse (1989) in the context of automobile insurance, the model can be applied if the number of claims for each individual policy observation is known. Although it is possible to use the total number of claims as the response variable, the nature of automobile insurance policies (covering different risks) is such that the response variable is the number of claims for each type of guarantee. Therefore, a premium is obtained for each class of guarantee as a function of different factors. Then, assuming independence between types of claim, the total premium is obtained from the sum of the expected number of claims of each guarantee.

Here, two different types of guarantee are assumed: third-party liability automobile insurance and the rest of guarantees. Following the usual methodology, assuming independence between types, the premium paid by the policyholder is obtained by summing the premiums for each type of guarantee and this depends on the rating factors. However, the question remains as to whether the independence assumption is realistic. When this assumption is relaxed, it is interesting to see how the tariff system might be affected.

In this study, a bivariate Poisson regression model is introduced. Holgate (1964) provided a practical basis for the bivariate Poisson distribution but its use has been largely ignored, mainly because of computational difficulties. Therefore, only a few applications can be found; for example, Jung and Winkelmann (1993) used a bivariate Poisson regression in a labour mobility study and Karlis and Ntzoufras (2003) modelled sports data. For a comprehensive review of the bivariate Poisson distribution and its applications (especially multivariate regression), the reader should see Kocherlakota and Kocherlakota (1992, 2001) and Johnson, Kotz and Balakrishnan (1997).

One early application of the bivariate Poisson distribution in the actuarial literature is described in Cummins and Wiltbank (1983). In ruin theory, some applications of this distribution are also to be found, for example Partrat (1994), Ambagaspitiya (1999), Walhin and Paris (2000) and Centeno (2005). Cameron and Trivedi (1998) studied the relationship between type of health insurance and various responses that measure the demand for health care by using a bivariate Poisson regression. In addition, two studies related to fitting purposes should also be quoted, albeit that no factors are considered. First, Vernic (1997) carried out a comparative study with the bivariate Poisson distribution based on data related to natural events insurance and third-party liability automobile insurance. Second, Walhin (2003) compared bivariate Hofmann and bivariate Poisson distributions by fitting a data set for accidents sustained by members of a sample of 122 shunters in two consecutive 2-year periods. However, in a ratemaking context, bivariate Poisson regression models have not been used to model claim counts that depend on the usual rating factors.

In the next section, the model used here is defined. This model is based on the bivariate Poisson regression model, which is appropriate for modelling paired count data that exhibit correlation. In Section 3 the database obtained from a Spanish insurance company is described. In Section 4 the results are summarised. Finally, some concluding remarks are given in Section 5.

2 Bivariate Poisson regression models

Let $N_1$ and $N_2$ be the number of claims for third-party liability and for the rest of guarantees respectively, and $N = N_1 + N_2$. The usual methodology to obtain the a priori premium under the assumption of independence between types of claims can be described as follows. First, the model assumed is $N_1 \sim \text{Poisson}(\lambda_1)$ and $N_2 \sim \text{Poisson}(\lambda_2)$ independently, and $\lambda_1$ and $\lambda_2$ depend on a number of rating factors associated with the characteristics of the car, the driver and the use of the car. Second, with $\lambda_1$ and $\lambda_2$ estimated for each policyholder and following the net premium principle, the total net premium ($\pi$) is obtained as $\pi = E[N] = E[N_1] + E[N_2] = \lambda_1 + \lambda_2$.

However, an amount inflates the net premium to ensure that the insurer will not, on average, lose money. Many well-known premium principles can be applied for this purpose. Here the variance premium principle is used. This principle builds on the net premium by including a risk loading that is proportional to the variance of the risk. Under the above assumptions, the variance is equal to the expected value, and the total loaded premium ($\pi^*$) is equal to $\pi^* = E[N] + \alpha V[N] = (1 + \alpha)(E[N_1] + E[N_2])$.
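As an illustration of the two premium principles under independence, the following is a minimal sketch rather than the paper's own computation. The function name `independent_premiums` and the numerical values of the two claim frequencies and of the loading factor `alpha` are hypothetical, and each claim is assumed to cost one monetary unit.

```python
def independent_premiums(lam1, lam2, alpha):
    """Net and loaded premium when N1 and N2 are independent Poisson counts."""
    net = lam1 + lam2                 # pi = E[N1] + E[N2]
    variance = lam1 + lam2            # for a Poisson count the variance equals the mean
    loaded = net + alpha * variance   # variance premium principle: pi* = E[N] + alpha * V[N]
    return net, loaded


# Hypothetical frequencies for one policyholder and a hypothetical loading factor.
net, loaded = independent_premiums(lam1=0.08, lam2=0.10, alpha=0.25)
print(net, loaded)
```

Because the variance equals the mean here, the loaded premium is simply the net premium scaled by (1 + alpha).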

Footnote 1 (on the net premium): Assuming the amount of the expected claim equals one monetary unit.

In bivariate Poisson regression models, the independence assumption is relaxed. The model can be defined as follows. Let us consider independent random variables $X_i$ ($i = 1, 2, 3$) to be distributed as Poisson with parameters $\lambda_i$ respectively. Then the random variables $N_1 = X_1 + X_3$ and $N_2 = X_2 + X_3$ follow jointly a bivariate Poisson distribution:

$$(N_1, N_2) \sim BP(\lambda_1, \lambda_2, \lambda_3).$$

This is the so-called trivariate reduction method that leads to the bivariate Poisson distribution. Its joint probability function is given by:

$$P(N_1 = n_1, N_2 = n_2) = e^{-(\lambda_1 + \lambda_2 + \lambda_3)} \frac{\lambda_1^{n_1}}{n_1!} \frac{\lambda_2^{n_2}}{n_2!} \sum_{i=0}^{\min(n_1, n_2)} \binom{n_1}{i} \binom{n_2}{i} \, i! \left( \frac{\lambda_3}{\lambda_1 \lambda_2} \right)^{i}. \qquad (1)$$
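The joint probability function of the bivariate Poisson distribution built by trivariate reduction can be evaluated directly. The sketch below is illustrative only: the function name `bivariate_poisson_pmf` and the parameter values are assumptions, and the formula is written in a form that requires the first two parameters to be strictly positive.

```python
from math import comb, exp, factorial


def bivariate_poisson_pmf(n1, n2, lam1, lam2, lam3):
    """P(N1 = n1, N2 = n2) for (N1, N2) ~ BP(lam1, lam2, lam3), with lam1, lam2 > 0."""
    base = (exp(-(lam1 + lam2 + lam3))
            * lam1 ** n1 / factorial(n1)
            * lam2 ** n2 / factorial(n2))
    ratio = lam3 / (lam1 * lam2)
    # Sum over the possible values i of the shared component X3.
    shared = sum(comb(n1, i) * comb(n2, i) * factorial(i) * ratio ** i
                 for i in range(min(n1, n2) + 1))
    return base * shared


# Hypothetical parameters; the probabilities over a wide grid should sum to about 1.
lam1, lam2, lam3 = 0.5, 0.3, 0.2
total = sum(bivariate_poisson_pmf(a, b, lam1, lam2, lam3)
            for a in range(30) for b in range(30))
print(total)
```

Setting the third parameter to zero leaves only the i = 0 term of the sum, so the function returns the product of two independent Poisson probabilities.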

The bivariate Poisson distribution defined above presents several interesting and useful properties. First, it allows for positive dependence between the random variables $N_1$ and $N_2$, which is what we expect for these types of claims. Moreover, $\text{Cov}(N_1, N_2) = \lambda_3$ and therefore $\lambda_3$ is a measure of this dependence. Obviously, if $\lambda_3 = 0$ the two random variables are independent and the bivariate Poisson distribution reduces to the product of two independent Poisson distributions, referred to as a double Poisson distribution (Kocherlakota and Kocherlakota, 1992).
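The role of the third parameter as the covariance can be checked by simulation. The sketch below builds the two claim counts from three independent Poisson draws, with the third draw shared by both counts, and compares the sample covariance with the third parameter. The helper names, the seed, the sample size and the parameter values are all hypothetical, and the simple Poisson sampler is only meant for small means.

```python
import random
from math import exp


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value by Knuth's multiplication method (fine for small lam)."""
    limit, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_pairs(lam1, lam2, lam3, size, seed=0):
    """Trivariate reduction: N1 = X1 + X3 and N2 = X2 + X3 share the component X3."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(size):
        x1, x2, x3 = (poisson_draw(lam, rng) for lam in (lam1, lam2, lam3))
        pairs.append((x1 + x3, x2 + x3))
    return pairs


def sample_covariance(pairs):
    """Unbiased sample covariance of the two coordinates."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    return sum((a - m1) * (b - m2) for a, b in pairs) / (n - 1)


pairs = simulate_pairs(lam1=0.5, lam2=0.3, lam3=0.2, size=100_000)
cov = sample_covariance(pairs)
print(cov)  # expected to lie close to lam3 = 0.2, up to sampling error
```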

Second, the marginal distributions for $N_1$ and $N_2$ are Poisson with $E[N_1] = \lambda_1 + \lambda_3$ and $E[N_2] = \lambda_2 + \lambda_3$. Hence, the total net premium can be obtained with $\pi = E[N] = E[N_1] + E[N_2] = \lambda_1 + \lambda_2 + 2\lambda_3$. The variance necessary to obtain the loaded premium is now $V[N] = V[N_1] + V[N_2] + 2\,\text{Cov}(N_1, N_2) = \lambda_1 + \lambda_2 + 4\lambda_3$. Since $\lambda_3$ is expected to be positive, the relaxation of the independence assumption leads to a variance greater than the expected value. Overdispersion has often been observed when modelling claim counts in automobile insurance data (Denuit et al., 2007).
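The effect of the covariance parameter on the loading can be made concrete with a small numerical sketch. The function name `bivariate_premiums`, the parameter values and the loading factor `alpha` are hypothetical, and the comparison holds the two marginal claim frequencies fixed, which is an assumption of this illustration rather than a result taken from the paper's tables.

```python
def bivariate_premiums(lam1, lam2, lam3, alpha):
    """Net premium, variance and loaded premium for (N1, N2) ~ BP(lam1, lam2, lam3)."""
    net = lam1 + lam2 + 2 * lam3        # E[N1] + E[N2]
    variance = lam1 + lam2 + 4 * lam3   # V[N1] + V[N2] + 2 * Cov(N1, N2)
    loaded = net + alpha * variance     # variance premium principle
    return net, variance, loaded


alpha = 0.25
net, variance, loaded = bivariate_premiums(lam1=0.06, lam2=0.08, lam3=0.02, alpha=alpha)

# Independence with the same marginal means: variance equals the net premium.
loaded_independent = (1 + alpha) * net
extra_loading = loaded - loaded_independent   # equals 2 * alpha * lam3
print(net, variance, loaded, extra_loading)
```

With the marginal means held fixed, the net premium is unchanged and the whole difference between the two tariffs sits in the loading, which grows linearly in the covariance parameter.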

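Let us assume that $N_{1j}$ and $N_{2j}$ denote the random variables indicating the number of claims of each type of guarantee for the $j$th policyholder. If covariates are introduced to model $\lambda_{1j}$, $\lambda_{2j}$ and $\lambda_{3j}$, a bivariate Poisson regression model can be defined with the following scheme:

$$(N_{1j}, N_{2j}) \sim BP(\lambda_{1j}, \lambda_{2j}, \lambda_{3j}),$$
$$\log(\lambda_{1j}) = x_{1j}^{\top} \beta_1,$$
$$\log(\lambda_{2j}) = x_{2j}^{\top} \beta_2,$$
$$\log(\lambda_{3j}) = x_{3j}^{\top} \beta_3, \qquad (2)$$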
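The regression scheme with a log link on each of the three parameters can be sketched as a log-likelihood evaluation. This is a minimal illustration and not the estimation procedure used in the paper: the function names, the toy observations and the coefficient values are hypothetical, and for brevity the same covariate vector is used for all three parameters, whereas the scheme allows a different vector for each.

```python
from math import comb, exp, factorial, log


def rate(x, beta):
    """Log link: lambda = exp(x' beta)."""
    return exp(sum(xi * bi for xi, bi in zip(x, beta)))


def bp_pmf(n1, n2, lam1, lam2, lam3):
    """Bivariate Poisson probability function (requires lam1, lam2 > 0)."""
    base = (exp(-(lam1 + lam2 + lam3))
            * lam1 ** n1 / factorial(n1)
            * lam2 ** n2 / factorial(n2))
    ratio = lam3 / (lam1 * lam2)
    return base * sum(comb(n1, i) * comb(n2, i) * factorial(i) * ratio ** i
                      for i in range(min(n1, n2) + 1))


def log_likelihood(observations, beta1, beta2, beta3):
    """Sum of log-probabilities over policyholders j = 1, ..., n."""
    total = 0.0
    for n1, n2, x in observations:
        total += log(bp_pmf(n1, n2, rate(x, beta1), rate(x, beta2), rate(x, beta3)))
    return total


# Hypothetical data: (n1, n2, covariates), with an intercept and one dummy variable.
observations = [(0, 0, (1.0, 0.0)), (1, 0, (1.0, 1.0)),
                (1, 1, (1.0, 1.0)), (0, 2, (1.0, 0.0))]
beta1, beta2, beta3 = (-2.5, 0.4), (-2.0, 0.6), (-4.0, 0.8)
print(log_likelihood(observations, beta1, beta2, beta3))
```

A numerical optimiser or an EM scheme could maximise this function over the coefficients; the sketch only evaluates it at fixed values.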

Footnote 2 (on positive dependence): In the case of negatively correlated claims (not considered here), a more general specification would be necessary.
