Article

Estimating Quantile Families of Loss Distributions for Non-Life Insurance Modelling via L-Moments

1 Department of Statistical Science, University College London, London WC1E 6BT, UK
2 Oxford-Man Institute, Oxford University, Oxford OX2 6ED, UK
3 Systemic Risk Centre, London School of Economics, London WC2A 2AE, UK
4 Discipline of Business Analytics, The University of Sydney, Sydney 2006, Australia
* Author to whom correspondence should be addressed.
Risks 2016, 4(2), 14; https://doi.org/10.3390/risks4020014
Submission received: 28 February 2016 / Revised: 19 April 2016 / Accepted: 2 May 2016 / Published: 20 May 2016

Abstract: This paper discusses different classes of loss models in non-life insurance settings. It then overviews the class of Tukey transform loss models that have not yet been widely considered in non-life insurance modelling, but offer opportunities to produce flexible skewness and kurtosis features often required in loss modelling. In addition, these loss models admit explicit quantile specifications which make them directly relevant for quantile-based risk measure calculations. We detail various parameterisations and sub-families of the Tukey transform based models, such as the g-and-h, g-and-k and g-and-j models, including their properties of relevance to loss modelling. One of the challenges practitioners face when fitting such models is performing robust estimation of the model parameters. In this paper we develop a novel, efficient, and robust procedure for estimating the parameters of this family of Tukey transform models, based on L-moments. It is shown to be more efficient than the current state-of-the-art estimation methods for such families of loss models while being simple to implement for practical purposes.

1. Introduction: Context of Modelling Losses in General Insurance

In general, one can consider insurance to be principally a data-driven industry in which the primary cash out-flows comprise claim payments. Insurance companies therefore employ large numbers of analysts, including actuaries, to understand the claims data. There are many categories of insurance business lines from which claims payment flows arise. The types of insurance business lines considered in this paper will be general insurance and non-life insurance, which is probably one of the most active areas of research in actuarial science. General insurance is typically defined as any insurance that is not determined to be life insurance. It is called property and casualty insurance in the U.S. and Canada and non-life insurance in continental Europe. Non-life insurance typically includes modelling lines of business such as health insurance, personal/property insurance such as home and motor insurance, as well as large commercial risks and liability insurance.
When considering the claim payments in non-life lines of business, claim actuaries are traditionally concerned with the amount the insurance company will have to pay. Therefore, the general insurance actuary needs an understanding of the various models for the risk consisting of the total or aggregate amount of claims payable by an insurance company over a fixed period of time. Actuaries may consider fitting statistical models to insurance data that may contain relatively large but infrequent claim amounts. Therefore, an important aspect of actuarial science involves the development of statistical models that can be utilised to describe the claim process accurately, so that reserving and liability management can be performed accurately. The models that non-life insurance actuaries develop should enable them to make accurate decisions on things such as: premium loading, expected profits, the reserve necessary to ensure (with high probability) profitability, and the impact of reinsurance and deductibles. In particular, a core role for a non-life insurance actuary involves preserving the insurance company's financial security by accurately estimating future claims liabilities. Reserving for the amount of future claims payments involves a large degree of uncertainty, especially for long-tail classes of business where tail behaviour can vary widely. Hence, it can be difficult to estimate the loss reserve precisely. Fortunately, there are now numerous classes of models that have been developed for modelling claims in non-life insurance settings. For excellent reviews, see [1,2,3,4], and for models discussed in a similar context for heavy-tailed, lepto-kurtic and platy-kurtic loss models, see [5].
In particular, many studies have pointed to the importance of considering flexible models with a variety of skewness and kurtosis properties in modelling non-life insurance loss processes. In practice, it is popular to consider two-parameter shape-and-scale models such as log-normal, gamma, Weibull, and Pareto models. See [6] for discussions. Primarily, these models have been popular due to the simplicity of parameter estimation and model selection. However, it has been observed that in making these distributional assumptions, actuaries may underestimate the risk inherent in the long tail, which is associated with large claim liabilities. This is because these distributions do not possess sufficiently flexible tails to describe the features caused by large claims. Failure to estimate the large claim liabilities adequately can cause financial instability for the company, which may eventually lead to insolvency. In order to improve modelling accuracy and reliability, a number of more sophisticated models have been studied for such loss modelling. These include the Poisson-Tweedie family of models in the additive exponential dispersion class [7], the GB2 models [8,9], the generalised-t (GT) [10], the Stable family [11,12], and the Pearson family [13]. In this paper, we aim to raise awareness in the actuarial community of another alternative class of models that can be considered for such loss modelling, based on different variations of the Tukey family.
We note that, in practice, many practitioners are reluctant to utilise such flexible models due to complications that can arise in real applications relating to parameter estimation and model selection. In this paper, we will therefore focus on two aspects, firstly the introduction of an under-utilised family of flexible skewness and kurtosis models for such non-life insurance modelling applications, and secondly a novel, accurate, and robust estimation method that we have developed for fitting such models in practice. This will allow the general Tukey transform models, such as the g-and-h, g-and-k, and g-and-j models, to be easily implemented in practice. We show that our proposed estimation method, based on matching L-moments, is more accurate than the existing methods. We will discuss in detail the family of Tukey transform models, then introduce our estimation method, and finally present an empirical application using a real world dataset.

2. General Families of Quantile Transform Distributions

Here we discuss several distributional families relevant to loss modelling in insurance which can only be specified via the transformation of another standard random variable, for example a Gaussian. Examples of such models, which are typically defined through their quantile functions, include the Johnson family and the Tukey family, each with base distribution typically given by a Gaussian or logistic. The concept of constructing skewed and heavy-tailed distributions through the use of a transformation of a Gaussian random variable was originally proposed in the work of [14] and is therefore aptly named the family of Tukey distributions. This family of distributions was then extended by [15,16,17,18]. The multivariate versions of these models have been discussed by [19].
Within this family of distributions, two particular subfamilies have received the most attention in the literature; these correspond to the g-and-h and the g-and-k distributions. The first of these families, the g-and-h, has been studied in several contexts. See for instance the developments in the areas of risk and insurance modelling in [20,21,22,23], and the detailed discussion in ([24], Chapter 9). The second family of g-and-k models has been looked at in works such as [25,26].
A key advantage of models such as the g-and-h family for modelling losses in a non-life insurance setting is the fact that they provide a very flexible range of skewness, kurtosis, and heavy-tailed features while also being specified as a rather simple transformation of standard Gaussian random variates, making simulation under such models efficient and simple. It is important to note that the support of the g-and-h density includes the entire real line. In some cases this is appropriate in non-life insurance modelling, such as under logarithmic transforms of the loss or claim amounts. In other cases it may be more appropriate to consider loss models with strictly positive supports; in general, this can be achieved either by truncation or by restriction of the parameter values. Some subfamily members automatically take a positive support, such as the double h-h subfamily.

2.1. The Tukey Family of Loss Models

We begin by discussing the general family of Tukey distributions. Tukey suggested several nonlinear transformations of a standard Gaussian random variable, denoted throughout by $W \sim \mathrm{Normal}(0,1)$. The g-and-h transformations involve a skewness transformation of type g and a kurtosis transformation of type h. If one replaces the kurtosis transformation of type h with type k, one obtains the g-and-k family of distributions discussed by [27]. If the type h transformation is replaced by the type j transformation, one obtains the g-and-j transformations of [28].
The generic specification of the Tukey transformation is provided in Definition 2.1. These types of transformations were labelled elongation transformations, where the notion of elongation was noted to be closely related to tail properties such as heavy-tailedness. See discussions by [17]. In considering such a class of elongation transformations to obtain a distribution, one is comparing the tail strength of the new distribution with that of the base distribution (such as a Gaussian or logistic). In this regard, one can think of tail strength or heavy-tailedness as an absolute concept, whereas the notion of elongation strength is a relative concept. In the following, we will first consider relative elongation compared to a base distribution for a generic random variable W. It should be clear that such a measure of relative tail behaviour is independent of location and scale. An elongation transformation $T(\cdot)$ should also satisfy the following properties: (1) it should preserve symmetry, $T(w) = T(-w)$; (2) the base distribution should not be significantly transformed in the centre, such that $T(w) = w + O(w^2)$ for w around the mode; (3) to increase the tails of the resulting distribution relative to the base, it is important to assume that $T(\cdot)$ is a strictly monotonically increasing transform that is convex, that is, for $w > 0$ one has $T'(w) > 0$ and $T''(w) > 0$. One such transformation family satisfying these properties is the family of Tukey transformations.
Definition 2.1 (Tukey transformations).
Consider a Gaussian random variable $W \sim \mathrm{Normal}(0,1)$ and a transformation $X = r(W)$; then the resultant transformed loss random variable X will be from a Tukey law if the corresponding transformation $r(W)$ is given by
$$X = r(W) = W\left[T(W)\right]^{\theta},$$
for a parameter $\theta \in \mathbb{R}$. Under this transformation we also have, directly in closed form, the quantile function of the loss random variable X in terms of the quantile function of the base random variable W as follows:
$$Q_X(\alpha) = a + b\,Q_W(\alpha)\left[T\big(Q_W(\alpha)\big)\right]^{\theta},$$
with translation and scaling constants a and b.
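Because the quantile function is available in closed form, evaluating it requires only the base quantile $Q_W(\alpha)$. The following is a minimal sketch, assuming a standard Gaussian base; the function name `tukey_quantile` and the parameter values are illustrative, not from the paper.

```python
import math
from statistics import NormalDist

def tukey_quantile(alpha, T, theta=1.0, a=0.0, b=1.0):
    """Q_X(alpha) = a + b * Q_W(alpha) * T(Q_W(alpha))**theta, Gaussian base W."""
    z = NormalDist().inv_cdf(alpha)  # Q_W(alpha) for the standard Gaussian base
    return a + b * z * T(z) ** theta

# Example: h-type elongation T(w) = exp(w^2) with theta = h/2, so that
# X = W * exp(h * W^2 / 2); the 95% quantile exceeds the Gaussian one.
h = 0.2
q95 = tukey_quantile(0.95, lambda w: math.exp(w * w), theta=h / 2)
```

Note that the median maps to $a$, since $Q_W(0.5) = 0$ and $w\,T(w)^{\theta} \to 0$ as $w \to 0$.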
Typically, in several applications, it will be desirable when working with such severity models to enforce a constraint that the tails of the resulting distribution after transformation are heavier than those of the Gaussian distribution. In this case, one should consider a transformation $T(w)$ which is positive, symmetric, and strictly monotonically increasing for $w \geq 0$. In addition, to obtain this property of heavy tails relative to the Gaussian, it will be desirable to set the parameter $\theta \geq 0$. As discussed, a series of kurtosis transformations have been proposed in the literature. The Tukey transformations of types h, k, and j are provided in Definition 2.2.
Definition 2.2 (Tukey’s kurtosis transformations of types h, k and j).
The h-type transformation, denoted by $T_h(w)$, is given by
$$T_h(w) = \exp(w^2).$$
The k-type transformation, denoted by $T_k(w)$, is given by
$$T_k(w) = 1 + w^2.$$
The j-type transformation, denoted by $T_j(w)$, is given by
$$T_j(w) = \tfrac{1}{2}\left(\exp(w) + \exp(-w)\right).$$
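The three kurtosis-type transformations of Definition 2.2 are one-liners; the sketch below (our own illustrative code, not from the paper) also checks the elongation properties noted earlier: each transform is even and equals one at the origin.

```python
import math

# The three kurtosis-type transformations from Definition 2.2
def T_h(w):  # h-type: exp(w^2)
    return math.exp(w * w)

def T_k(w):  # k-type: 1 + w^2
    return 1.0 + w * w

def T_j(w):  # j-type: (exp(w) + exp(-w)) / 2, i.e. cosh(w)
    return 0.5 * (math.exp(w) + math.exp(-w))

# All three preserve symmetry, T(w) = T(-w), and equal 1 at the origin
for T in (T_h, T_k, T_j):
    assert T(0.0) == 1.0 and abs(T(1.3) - T(-1.3)) < 1e-12
```

The j-type transform is simply $\cosh(w)$, which grows like $e^{|w|}/2$, giving tails intermediate between the polynomial elongation of the k-type and the Gaussian-squared elongation of the h-type.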
In addition to the kurtosis transformations, there are skewness transformations that have been developed in the Tukey family, such as the g-type transformation.
Definition 2.3 (Tukey’s skewness transformation).
The g-type transformation, denoted by $T_g(w)$, is given by
$$T_g(w) = \frac{\exp(w) - 1}{w}.$$
The generalised g-type transformation, denoted by $T_{g^*}(w)$, is given by
$$T_{g^*}(w) = 1 + c\,\frac{1 - \exp(-gw)}{1 + \exp(-gw)}.$$
To nest all these transformations within one class of transformations, the work of [29] proposed a power series representation denoted by the subscript a given in Equation (8). This suggestion, though it nested the other families of distributions, is not practical for use as it involves the requirement of estimating a very large (infinite) number of parameters a i to obtain the data-generating mechanism:
$$T_a(w) = \sum_{i=0}^{\infty} a_i w^{2i}.$$
It was further observed in [29] that this nesting structure may be replaced with a different form, given by the general transformation taking the form given in Equation (9):
$$T_{hjk}(w; \alpha, \beta, \gamma) = \left(1 + \frac{w^2 + \gamma}{\alpha\beta} - \frac{\gamma}{\alpha\beta}\right)^{\beta}, \qquad \alpha > 0,\ \beta \geq 1,\ \gamma > 0.$$
Then it is clear that the original h-, k-, and j-type transformations are recovered with $T_h(w) = T_{hjk}(w; 1, \infty, \gamma)$ (in the limit $\beta \to \infty$), $T_k(w) = T_{hjk}(w; 1, 1, \gamma)$, and $T_j(w) \approx T_{hjk}(w; 0.5, \infty, 0.5)$. Next, we explain the properties of specific subfamilies of distributions, showing how these results are derived for the basic g-and-h family and the g-and-k family.

2.2. Examples of the g-and-h, g, h, and h-h Loss Models

The g-and-h family can be considered as composed of three transformations that can produce subfamilies of non-Gaussian distributions for loss amount severity based on the g-distributions, the h-distributions, and the g-and-h distributional families. The basic specifications in which g and h components are treated as constants are given in Definition 2.4 in terms of transformations of a random variable W, typically Gaussian.
Definition 2.4 (g-and-h distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the g-and-h distribution with parameters $a, b, g, h \in \mathbb{R}$, denoted $X \sim \mathrm{GH}(a, b, g, h)$, if X is given by (for $g \neq 0$)
$$X = a + b\,W\left[T_{g,h}(W; g, h)\right]^{\theta} := a + b\,G(W)\,H(W)\,W,$$
where $\theta = 1$ and
$$G(w) = \frac{\exp(gw) - 1}{gw}$$
and
$$H(w) = \exp\!\left(\frac{h w^2}{2}\right).$$
One can observe that a and b account for location and scale, respectively. It can be checked from Equation (11) that the reshaping function $G(\cdot)$ is bounded from below by zero, that it is either monotonically increasing or monotonically decreasing for g being, respectively, positive or negative, and that, by rewriting it as its series expansion,
$$G(w) = 1 + \frac{gw}{2!} + \frac{(gw)^2}{3!} + \frac{(gw)^3}{4!} + \cdots,$$
$G(\cdot)$ is equal to one at zero for all g. Thus $G(\cdot)$ generates asymmetry by scaling w differently on each side of zero via the parameter g. Furthermore, as $G(w; g) = G(-w; -g)$, the sign of g affects only the direction of skewness. For $g = 0$, by Equation (13), the constant function $G(w) = 1$ is obtained, and thus the symmetry remains unmodified. For $h > 0$, $H(\cdot)$ is a strictly convex even function with $H(0) = 1$, and thus it generates heavy tails by scaling upward the tails of W while preserving the symmetry. When $h = 0$, the transformation given by Equation (10) generates the subfamily of g-distributions, which coincides with the family of shifted log-normal distributions for $g > 0$. When $g = 0$, the transformation generates the subfamily of h-distributions, which is symmetric and has heavier tails than normal distributions. The parameters a and b effect linear transformations, whereas the parameters g and h, which can be significantly extended to polynomials as discussed later, play an important role in the skewness and kurtosis properties of the g-and-h family.
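Since the family is defined by a transform of a standard Gaussian, simulation is immediate: draw $W \sim \mathrm{Normal}(0,1)$ and push it through Equation (10). A minimal sketch, with illustrative parameter values and our own function name `gh_sample`:

```python
import math
import random

def gh_sample(a=0.0, b=1.0, g=0.5, h=0.1, rng=random):
    """Draw X ~ GH(a, b, g, h) by transforming W ~ Normal(0, 1); requires g != 0."""
    w = rng.gauss(0.0, 1.0)
    # X = a + b * (exp(g*w) - 1)/g * exp(h * w^2 / 2), i.e. a + b*G(w)*H(w)*w
    return a + b * (math.exp(g * w) - 1.0) / g * math.exp(0.5 * h * w * w)

random.seed(42)
losses = [gh_sample() for _ in range(50_000)]
```

For $g > 0$ the draws should be right-skewed, so the sample mean exceeds the sample median; the median itself estimates $a$, since the transform vanishes at $w = 0$.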
Remark 2.1. 
In general, one may consider the constants g and h to be more flexibly selected as polynomials, which would include higher orders of W. These polynomials could take the form, for some non-negative integers p and q,
$$g(w) := \alpha_0 + \alpha_1 w + \cdots + \alpha_p w^p, \qquad h(w) := \beta_0 + \beta_1 w + \cdots + \beta_q w^q.$$
The addition of these polynomial terms can provide additional degrees of freedom to improve the ability to fit data. These have been shown to be significant when modelling certain types of loss data, as demonstrated by [21,23].
Within this family of g-and-h distributions, one can also define the subfamilies of distributions given by the g and the h families. Again, we present these models in their simplest form, with constant g or h, though in practice one may include polynomials in W for such models.
Definition 2.5 (g distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the g distribution with parameters $a, b, g \in \mathbb{R}$, denoted $X \sim \mathrm{G}(a, b, g)$, if X is given by (for $g \neq 0$)
$$X = a + b\,T_g(W; g) := a + b\,\frac{\exp(gW) - 1}{g}.$$
Remark 2.2. 
Note that the g-distribution subfamily corresponds (in the case that g is a constant) to a scaled log-normal distribution.
Definition 2.6 (h distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the h distribution with parameters $a, b, h \in \mathbb{R}$, denoted $X \sim \mathrm{H}(a, b, h)$, if X is given by
$$X = a + b\,W\,T_h(W; a, b, h) := a + b\,W \exp\!\left(\frac{h W^2}{2}\right).$$
In addition, one may obtain an asymmetric class of h-h distributions studied by ([30], Section 2.2), [31,32]. The asymmetric h-h distribution transformation is given in Definition 2.7.
Definition 2.7 (Double h-h distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the unit h-h distribution with parameters $h_l, h_r \in \mathbb{R}$, denoted $X \sim \mathrm{HH}(h_l, h_r)$, if X is given by
$$X = a + b\,W\,T_{h,h}(W; h_l, h_r) := \begin{cases} a + b\,W \exp\!\left(\tfrac{1}{2} h_l W^2\right), & W < 0, \\[4pt] a + b\,W \exp\!\left(\tfrac{1}{2} h_r W^2\right), & W \geq 0, \end{cases}$$
for $h_r \geq 0$ and $h_l \geq 0$.
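The double h-h transform lets the two tails be elongated separately, which is useful when the left and right tails of log-losses behave differently. A minimal sketch, with illustrative parameter values and our own function name `double_hh`:

```python
import math

def double_hh(w, a=0.0, b=1.0, h_l=0.05, h_r=0.3):
    """Asymmetric h-h transform: separate elongation parameters on each side of zero."""
    h = h_l if w < 0 else h_r
    return a + b * w * math.exp(0.5 * h * w * w)
```

With $h_r > h_l$ the right tail is stretched more than the left, e.g. `double_hh(3.0)` is much larger in magnitude than `double_hh(-3.0)`.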
To conclude this section, there is also a generalised g-and-h family, given in Definition 2.8; see discussions in [27].
Definition 2.8 (Generalised g-and-h distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the generalised g-and-h distribution with parameters $a, b, g, h \in \mathbb{R}$, denoted $X \sim \mathrm{Generalised\text{-}GH}(a, b, g, h)$, if X is given by (for $g \neq 0$)
$$X = a + b\,W\left[T^*_{g,h}(W; g, h)\right]^{\theta} := a + b\,G^*(W)\,H(W)\,W,$$
where $\theta = 1$ and
$$G^*(w) = 1 + c\,\frac{1 - \exp(-gw)}{1 + \exp(-gw)}$$
and
$$H(w) = \exp\!\left(\frac{h w^2}{2}\right).$$

2.3. Examples of the g-and-k and g-and-j Loss Models

The g-and-k family of loss models, as parameterised in [33], is given by combining the g and the k transforms as in Definition 2.9.
Definition 2.9 (g-and-k distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the g-and-k distribution with parameters $a, b, g, k \in \mathbb{R}$, denoted $X \sim \mathrm{GK}(a, b, g, k)$, if X is given by (for $g \neq 0$)
$$X = a + b\,W\left[T_{g,k}(W; a, b, g, k)\right]^{\theta} := a + b\,G^*(W)\,K(W)\,W,$$
where $\theta = 1$ and
$$G^*(w) = 1 + c\,\frac{1 - \exp(-gw)}{1 + \exp(-gw)}$$
and
$$K(w) = \left(1 + w^2\right)^{k},$$
where $a \in \mathbb{R}$ is location, $b > 0$ is scale, $g \in \mathbb{R}$ is a skewness measure, $k > -0.5$ is a measure of kurtosis, and c is a constant.
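The g-and-k quantile function follows directly from the definition by substituting $Q_W(\alpha)$ for W. A minimal sketch, assuming a Gaussian base; the parameter values are illustrative, and $c$ is set to 0.8, a value commonly fixed in the g-and-k literature:

```python
import math
from statistics import NormalDist

def gk_quantile(alpha, a=0.0, b=1.0, g=0.5, k=0.2, c=0.8):
    """Q_X(alpha) for the g-and-k family with a standard Gaussian base."""
    z = NormalDist().inv_cdf(alpha)  # Q_W(alpha)
    skew = 1.0 + c * (1.0 - math.exp(-g * z)) / (1.0 + math.exp(-g * z))
    return a + b * skew * z * (1.0 + z * z) ** k
```

For $g > 0$ the upper quantiles are stretched relative to the lower ones, e.g. `gk_quantile(0.99)` exceeds `-gk_quantile(0.01)`, while the median remains at $a$.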
Similarly, the g-and-j family of loss models is obtained by combining the g and the j transforms as given in Definition 2.10.
Definition 2.10 (g-and-j distributional family).
Let $W \sim \mathrm{Normal}(0,1)$ be a standard Gaussian random variable. Then the loss random variable X has severity distribution given by the g-and-j distribution with parameters $a, b, g, j \in \mathbb{R}$, denoted $X \sim \mathrm{GJ}(a, b, g, j)$, if X is given by (for $g \neq 0$)
$$X = a + b\,W\left[T_{g,j}(W; a, b, g, j)\right]^{\theta} := a + b\,G^*(W)\,J(W)\,W,$$
where $\theta = 1$ and
$$G^*(w) = 1 + c\,\frac{1 - \exp(-gw)}{1 + \exp(-gw)}$$
and
$$J(w) = \tfrac{1}{2}\left(\exp(w) + \exp(-w)\right),$$
where $a \in \mathbb{R}$ is location, $b > 0$ is scale, and $g \in \mathbb{R}$ is a skewness measure; in this case one can set $j = 1$.
In the following sections we will explore properties of the two most widely used families of models: the g-and-h (and its sub-families) and the g-and-k claims severity models.

3. Distribution and Density Functions of the g-and-h, g, h, h-h, and g-and-k Families

In this section we discuss properties of the Tukey sub-families of loss models and in particular different ways that people have sought to evaluate and present the distribution and density functions for the popular sub-families such as the g-and-h, generalised g-and-h and g-and-k families. In general, it will be informative for this section to remind the reader of the following basic property.
Proposition 3.1. 
If X is a continuous random variable distributed according to distribution $X \sim F_X(x)$, which is monotonically increasing on support $\mathrm{Supp}\,F_X(x) = \{x : 0 < F_X(x) < 1\}$, then, in this general case, one can show that the quantile function $Q_X(\alpha) = F_X^{-1}(\alpha)$ for $\alpha \in [0,1]$ determines the relationship between the random variable X and any other continuous random variable with monotonically increasing distribution, say $W \sim F_W(w)$, through the relationship given as follows:
$$Q_X(\alpha) = F_X^{-1}\left(F_W\big(Q_W(\alpha)\big)\right).$$
Furthermore, the following relationship between the quantile function of a random variable X and its density can be obtained by using the identity for differentiation of an inverse function given by
$$\frac{d}{dx}\,g^{-1}(x) = \left[\frac{dg}{dx}\left(g^{-1}(x)\right)\right]^{-1}.$$
This result, when applied to the quantile function of the random variable X, produces the following relationship
$$\frac{d Q_X(\alpha)}{d\alpha} = \frac{1}{f_X\big(Q_X(\alpha)\big)},$$
where Q X ( α ) is the quantile function for random variable X at quantile level α, and f X ( · ) represents the density for random variable X. One can then also apply this to the relationship in Equation (27) to obtain
$$\frac{d Q_X(\alpha)}{d\alpha} = \frac{d Q_X(u)}{du}\,\frac{du}{d\alpha} = \frac{f_W\big(Q_W(\alpha)\big)}{f_X\left(F_X^{-1}\left(F_W\big(Q_W(\alpha)\big)\right)\right)}.$$
In the remainder of this paper we will utilise the most popular choice of reference distribution in the literature, the standard Gaussian base distribution, i.e., $W \sim F_W(w) = \Phi(w; 0, 1)$. We note that it is, however, trivial to modify the results below for other choices of base distribution. In this Gaussian case, one can show that for any continuously differentiable transformation $X = T(W)$, X will have a density given in Equation (31) with respect to the standard Gaussian density $\phi(\cdot)$:
$$f_X(x) = \frac{\phi\left(T^{-1}(x)\right)}{T'\left(T^{-1}(x)\right)}.$$
In this case, one can also observe that when the transform $T(\cdot)$ increases rapidly, the resulting density is heavy-tailed; conversely, slower, linear growth in $T(\cdot)$ results in tail behaviour for the distribution of X equivalent to that of a Gaussian.
These two general results in Proposition 3.1 can then be used to characterise the distribution and density functions for different members of the Tukey family of quantile-specified loss models. In the following results we apply the same methodology to obtain the density and distribution for each of the different Tukey classes of loss model. We begin with the density of the superclass of transformations presented previously according to the quantile transformation $T_{hjk}$.
Lemma 3.1 (Super class T h j k density).
One can state the following basic properties for the loss random variable $X = r(W) = W\left[T_{hjk}(W)\right]^{\theta}$: the loss density $f_X(\cdot)$ and quantile function $Q_X(\cdot)$ are given by
$$f_X(x; h, j, k) = \frac{1}{Q_X'\left(Q_X^{-1}(x)\right)} = \frac{\phi\left(r^{-1}(x)\right)}{r'\left(r^{-1}(x)\right)}, \qquad \inf\{x : x \in S\} < x < \sup\{x : x \in S\},$$
$$Q_X(\alpha) = r\big(Q_W(\alpha)\big), \qquad \alpha \in [0,1],$$
with S the appropriate support of the random variable X and
$$r'(x) = \left[T_{hjk}(x)\right]^{\theta - 1}\left[T_{hjk}(x) + \theta\,x\,T'_{hjk}(x)\right].$$
Clearly, this density representation is a composition of two functions, one of which can typically only be evaluated numerically due to the general non-closed-form expression for the inversion. In an analogous manner one can then find the distribution and density for the other Tukey families of loss models.
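The derivative identity $r'(x) = T(x)^{\theta-1}\left(T(x) + \theta x T'(x)\right)$ for $r(x) = x\,T(x)^{\theta}$ is easy to sanity-check numerically against a central finite difference. A sketch using the k-type transform $T(x) = 1 + x^2$; the choice of transform and of $\theta$ is purely illustrative:

```python
# Check r'(x) = T(x)**(theta-1) * (T(x) + theta*x*T'(x)) for r(x) = x*T(x)**theta,
# using the k-type transform T(x) = 1 + x^2 as the example.
theta = 0.3

def T(x):
    return 1.0 + x * x

def T_prime(x):
    return 2.0 * x

def r(x):
    return x * T(x) ** theta

def r_prime(x):
    return T(x) ** (theta - 1.0) * (T(x) + theta * x * T_prime(x))

x, eps = 0.7, 1e-6
numerical = (r(x + eps) - r(x - eps)) / (2.0 * eps)
assert abs(numerical - r_prime(x)) < 1e-8
```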
The first observation one can make for the g-and-h family is that since the transformations are monotonically increasing as long as h > 0 , the quantile function of the g-and-h distribution is readily available. This result was utilised in [20,34] to obtain expressions for the density as such a composite function.
Lemma 3.2 (g-and-h distribution and density functions (constant g and h with h > 0 )).
Consider the g-and-h distributed random variable $X \sim \mathrm{GH}(a{=}0, b{=}1, g, h)$ with constant parameters g and $h > 0$. The distribution and density functions can be specified according to the following composite functions:
$$F_X(x; g, h) = \Phi\left(r^{-1}(x)\right),$$
$$f_X(x; g, h) = \frac{1}{Q_X'\left(Q_X^{-1}(x)\right)} = \frac{\phi\left(r^{-1}(x)\right)}{r'\left(r^{-1}(x)\right)},$$
where $\Phi(\cdot)$ is the standard Gaussian distribution function and the function $r(x)$ is specified by
$$r(x) = \frac{\exp(gx) - 1}{g}\,\exp\!\left(\frac{h x^2}{2}\right),$$
where the derivative is then given by
$$r'(x) = \frac{d}{dx}\left[\frac{\exp(gx) - 1}{g}\,\exp\!\left(\frac{h x^2}{2}\right)\right] = \exp\!\left(gx + \frac{h x^2}{2}\right) + \frac{h}{g}\,x\,\exp\!\left(\frac{h x^2}{2}\right)\left(\exp(gx) - 1\right).$$
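Since $r$ is strictly increasing for $h > 0$, the inverse $r^{-1}(x)$ in the composite density can be found by simple bisection, after which the density is $\phi(r^{-1}(x))/r'(r^{-1}(x))$. A minimal sketch with illustrative parameter values; the function names are our own:

```python
import math

G, H = 0.5, 0.1  # illustrative g and h values

def r(w, g=G, h=H):
    return (math.exp(g * w) - 1.0) / g * math.exp(0.5 * h * w * w)

def r_prime(w, g=G, h=H):
    return math.exp(g * w + 0.5 * h * w * w) \
        + (h / g) * w * math.exp(0.5 * h * w * w) * (math.exp(g * w) - 1.0)

def r_inverse(x, lo=-15.0, hi=15.0, tol=1e-13):
    # r is strictly increasing for h > 0, so plain bisection suffices
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gh_density(x):
    w = r_inverse(x)                                   # w = r^{-1}(x)
    phi = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    return phi / r_prime(w)                            # phi(r^{-1}(x)) / r'(r^{-1}(x))
```

At the median $x = r(0) = 0$ we have $r'(0) = 1$, so the density there equals $\phi(0) = 1/\sqrt{2\pi}$.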
In this parameterisation, the parameter g will control the skew of the distribution, both in terms of sign and magnitude, while the parameter h will control the heaviness of the tails and is related directly to the kurtosis. This will be discussed further when the regular variation properties of this model are explored. As demonstrated previously, the original Tukey h-type transformation had $\theta = 1$ and an additional scaling of $\tfrac{1}{2}$. This transformation has the property that its derivative satisfies
$$\frac{d}{dx} r(x) = \left(1 + h x^2\right)\exp\!\left(\frac{1}{2} h x^2\right) \geq 1$$
for all $h \geq 0$. In addition, in the following discussions, it will be useful to recall the following properties of the g-and-h family of distributions:
  • the g-and-h transformation can be shown to be strictly monotonically increasing in its argument; that is, for all $w_1 \leq w_2$ one has $T_{gh}(w_1) \leq T_{gh}(w_2)$;
  • if $a = 0$, then the g-and-h transformation satisfies the condition $T_{(-g)h}(W) = -T_{gh}(-W)$.
In the case of the generalised g-and-h distribution one has the Tukey quantile transform producing loss random variable X according to base random variable W given by
$$T^*_{gh}(W) = \left[1 + c\,\frac{1 - \exp(-gW)}{1 + \exp(-gW)}\right]\exp\!\left(\frac{h W^2}{2}\right),$$
with $r(W) = W\left[T(W)\right]^{\theta}$, and typically in this family we also consider $\theta = 1$. This then produces the following density and distribution.
Lemma 3.3 (Generalised g-and-h distribution and density functions).
Consider the generalised g-and-h distributed random variable $X \sim \mathrm{Generalised\text{-}GH}(a{=}0, b{=}1, g, h)$ with constant parameters g and $h > 0$. The distribution and density functions can be specified according to the following composite functions:
$$F_X(x; g, h) = \Phi\left(r^{-1}(x)\right),$$
$$f_X(x; g, h) = \frac{1}{Q_X'\left(Q_X^{-1}(x)\right)} = \frac{\phi\left(r^{-1}(x)\right)}{r'\left(r^{-1}(x)\right)},$$
where $\Phi(\cdot)$ is the standard Gaussian distribution function and the function $r(x)$ is specified by
$$r(x) = \left[1 + c\,\frac{1 - \exp(-gx)}{1 + \exp(-gx)}\right] x \exp\!\left(\frac{h x^2}{2}\right),$$
where the derivative is then given by
$$r'(x) := \frac{d}{dx}\left[\left(1 + c\,\frac{1 - \exp(-gx)}{1 + \exp(-gx)}\right) x \exp\!\left(\frac{h x^2}{2}\right)\right] = \frac{\exp\left(h x^2 / 2\right)}{\left(1 + \exp(gx)\right)^2}\left[\left(1 + \exp(gx)\right)^2\left(1 + h x^2\right) + 2c \exp(gx)\left(gx + \left(1 + h x^2\right)\sinh(gx)\right)\right].$$
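A closed-form derivative like the one above is easy to get wrong, so it is worth verifying against a central finite difference of $r$ itself. A sketch with illustrative parameter values (the names `r` and `r_prime` are our own):

```python
import math

g, h, c = 0.4, 0.15, 0.8  # illustrative parameter values

def r(x):
    skew = 1.0 + c * (1.0 - math.exp(-g * x)) / (1.0 + math.exp(-g * x))
    return skew * x * math.exp(0.5 * h * x * x)

def r_prime(x):
    egx = math.exp(g * x)
    return (math.exp(0.5 * h * x * x) / (1.0 + egx) ** 2) * (
        (1.0 + egx) ** 2 * (1.0 + h * x * x)
        + 2.0 * c * egx * (g * x + (1.0 + h * x * x) * math.sinh(g * x))
    )

# Central-difference check of the closed form at several points
for x in (-1.5, -0.3, 0.0, 0.8, 2.0):
    eps = 1e-6
    numerical = (r(x + eps) - r(x - eps)) / (2.0 * eps)
    assert abs(numerical - r_prime(x)) < 1e-6
```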
In the case of the g-and-k distribution family one has the quantile function, for a = 0 and b = 1 , given by
$$Q_X(\alpha) = Q_W(\alpha)\left[T_{gk}\big(Q_W(\alpha)\big)\right]^{\theta} = \left[1 + c\,\frac{1 - \exp\left(-g\,Q_W(\alpha)\right)}{1 + \exp\left(-g\,Q_W(\alpha)\right)}\right] Q_W(\alpha)\left(1 + Q_W(\alpha)^2\right)^{k}.$$
Lemma 3.4 (g-and-k distribution and density functions).
Consider the g-and-k distributed random variable $X \sim \mathrm{GK}(a{=}0, b{=}1, g, k)$ with constant parameters g and $k > -0.5$. The distribution and density functions can be specified according to the following composite functions:
$$F_X(x; g, k) = \Phi\left(r^{-1}(x)\right),$$
$$f_X(x; g, k) = \frac{1}{Q_X'\left(Q_X^{-1}(x)\right)} = \frac{\phi\left(r^{-1}(x)\right)}{r'\left(r^{-1}(x)\right)},$$
where $\Phi(\cdot)$ is the standard Gaussian distribution function and the function $r(x)$ is specified by
$$r(x) = \left[1 + c\,\frac{1 - \exp(-gx)}{1 + \exp(-gx)}\right] x \left(1 + x^2\right)^{k},$$
where the derivative is then given by
$$r'(x; g, k) := \frac{d}{dx}\left[\left(1 + c\,\frac{1 - \exp(-gx)}{1 + \exp(-gx)}\right) x \left(1 + x^2\right)^{k}\right] = \frac{2\exp(gx)\left(1 + x^2\right)^{k-1}}{\left(1 + \exp(gx)\right)^2}\left[\left(1 + (1 + 2k)x^2\right)\left(1 + \cosh(gx)\right) + c\left(gx\left(1 + x^2\right) + \left(1 + (1 + 2k)x^2\right)\sinh(gx)\right)\right].$$
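As with the g-and-h case, the g-and-k derivative can be checked numerically against a central finite difference of $r$. A sketch with illustrative parameter values, using $c = 0.8$ as is common for this family; the function names are our own:

```python
import math

g, k, c = 0.5, 0.2, 0.8  # illustrative parameter values

def r(x):
    skew = 1.0 + c * (1.0 - math.exp(-g * x)) / (1.0 + math.exp(-g * x))
    return skew * x * (1.0 + x * x) ** k

def r_prime(x):
    egx = math.exp(g * x)
    poly = 1.0 + (1.0 + 2.0 * k) * x * x
    return (2.0 * egx * (1.0 + x * x) ** (k - 1.0) / (1.0 + egx) ** 2) * (
        poly * (1.0 + math.cosh(g * x))
        + c * (g * x * (1.0 + x * x) + poly * math.sinh(g * x))
    )

# Central-difference check of the closed form at several points
for x in (-2.0, -0.5, 0.0, 1.0, 2.5):
    eps = 1e-6
    numerical = (r(x + eps) - r(x - eps)) / (2.0 * eps)
    assert abs(numerical - r_prime(x)) < 1e-6
```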

4. Statistical Properties of g-and-h, g, h, h-h, and g-and-k Families Related to Claim Modelling

One advantage of specifying the distribution and density functions through a particular quantile function, as presented above, is that the statistical properties of these distributions can now be easily studied. For instance, the mode and moments of the distribution can be characterised. The result in Proposition 4.1 provides the mode for the g-and-h, generalised g-and-h and g-and-k distributions.
Proposition 4.1 (Mode of the g-and-h, generalised g-and-h and g-and-k densities).
Consider the g-and-h distributed random variable $X \sim \mathrm{GH}(a{=}0, b{=}1, g, h)$ with constant parameters g and $h > 0$, the generalised g-and-h given by $X \sim \mathrm{Generalised\text{-}GH}(a{=}0, b{=}1, g, h)$, and the g-and-k distribution $X \sim \mathrm{GK}(a{=}0, b{=}1, g, k)$ with constant parameters g and $k > -0.5$. In each of these models, generically denoted here by the transform $X = r(W)$, the mode of the density of X is located at $\tilde{x} = r(\tilde{w})$, where $\tilde{w}$ is found as the solution, in w, of
$$\frac{d}{dw}\left[\frac{f_W(w)}{r'(w)}\right] = 0.$$
Analogously the medians can also be obtained.
Proposition 4.2 (Median of the g-and-h, generalised g-and-h and g-and-k densities).
Consider the g-and-h distributed random variable $X \sim \mathrm{GH}(a{=}0, b{=}1, g, h)$ with constant parameters g and $h > 0$, the generalised g-and-h given by $X \sim \mathrm{Generalised\text{-}GH}(a{=}0, b{=}1, g, h)$, and the g-and-k distribution $X \sim \mathrm{GK}(a{=}0, b{=}1, g, k)$ with constant parameters g and $k > -0.5$. In each of these models, generically denoted here by the transform $X = r(W)$, the median of X is $r(w_{0.5})$, where $w_{0.5} = \mathrm{Median}[W] = 0$; since $\lim_{w \to 0} r(w) = 0$, the median of X equals the location parameter a.
Remark 4.1. 
Therefore, one sees that in each of the g-and-h, generalised g-and-h and g-and-k distributions the median of the data set will be the parameter a. Furthermore, in the case of the h-type and double h-type Tukey distributions, the median and mode are at the origin (for a = 0 ).
One can also obtain the moments of the Tukey family of distributions, with generically denoted Tukey quantile transform given by r ( W ) = W T ( W ) θ , as the solution to the following integrals, where the n-th moment is given with respect to the transformed moments of the base density as follows:
E X n = E r ( W ) n = ∫ − ∞ ∞ r ( w ) n f W ( w ) d w .
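For instance, for the g-and-h transform with a standard normal base, this integral can be evaluated directly by numerical quadrature; a minimal sketch in Python (the helper name tukey_moment is ours):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def tukey_moment(r, n):
    """Numerically evaluate the n-th moment E[r(W)^n] of X = r(W), W ~ N(0,1).

    The range is truncated to (-40, 40); the integrand is negligible beyond
    this for parameter values such that 1 - n*h is not too small.
    """
    val, _ = quad(lambda w: r(w) ** n * norm.pdf(w), -40.0, 40.0)
    return val

# g-and-h transform with a = 0, b = 1; the n-th moment exists only for n < 1/h
g, h = 0.5, 0.1
r_gh = lambda w: (np.exp(g * w) - 1) / g * np.exp(h * w ** 2 / 2)
m1 = tukey_moment(r_gh, 1)  # mean of the g-and-h model
```

The value of m1 agrees with the closed-form mean ( exp ( g 2 / ( 2 − 2 h ) ) − 1 ) / ( g √ ( 1 − h ) ) appearing in Proposition 4.3.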
From such a result, one may now express the moments of the g-and-h, generalised g-and-h and g-and-k distributed random variables according to the results in Proposition 4.3, Proposition 4.4 and Proposition 4.5.
Before presenting these, we note, following [35], that since the g-distribution is a horizontally shifted log-normal distribution, the moments of the g-distribution take the same form as those of a log-normal model with an appropriate adjustment for the translation. The h-distribution family is symmetric (except the double h–h family); consequently, all odd-order moments of the h-subfamily are zero.
Proposition 4.3 (Moments of the g-and-h density).
Consider the g-and-h distributed random variable X GH ( a = 0 , b = 1 , g , h ) with constant parameters g and h > 0 . The n-th integer moment is given with respect to the standard Normal distribution and the n-th power of the transformed quantile function given by
r ( W ) = a + b ( ( exp ( g W ) − 1 ) / g ) exp ( h W 2 / 2 ) .
to produce moments according to the relationship
E X n = E r ( W ) n
which will exist if h ∈ [ 0 , 1 / n ) . One can also observe more generally that under the g-and-h transform the following identity holds with regard to powers of the standard Gaussian, W ∼ Normal ( 0 , 1 ) , such that
X n = r ( W ) n = T g , h ( W ; a , b , g , h ) n = a + b T g , h ( W ; a = 0 , b = 1 , g , h ) n = i = 0 n n ! ( n - i ) ! i ! a n - i b i T g , h ( W ; a = 0 , b = 1 , g , h ) i ,
which will produce the n-th moment given by
E X n = E a + b T g , h ( W ; a = 0 , b = 1 , g , h ) n = i = 0 n n ! ( n - i ) ! i ! a n - i b i E T g , h ( W ; a = 0 , b = 1 , g , h ) i .
Furthermore, it was shown by [35] that when it exists one can obtain the general expression
E T g , h ( W ; a = 0 , b = 1 , g , h ) i = ∑ r = 0 i ( − 1 ) r i ! / ( ( i − r ) ! r ! ) exp ( ( i − r ) 2 g 2 / ( 2 ( 1 − i h ) ) ) / ( g i √ ( 1 − i h ) ) .
The results in Proposition 4.3 then produce the following four population moments for the basic g-and-h loss model in closed-form for a = 0 and b = 1 :
E X = ( exp ( g 2 / ( 2 − 2 h ) ) − 1 ) / ( g √ ( 1 − h ) ) , E X 2 = ( 1 − 2 exp ( g 2 / ( 2 − 4 h ) ) + exp ( 2 g 2 / ( 1 − 2 h ) ) ) / ( g 2 √ ( 1 − 2 h ) ) , E X 3 = ( 3 exp ( g 2 / ( 2 − 6 h ) ) + exp ( 9 g 2 / ( 2 − 6 h ) ) − 3 exp ( 2 g 2 / ( 1 − 3 h ) ) − 1 ) / ( g 3 √ ( 1 − 3 h ) ) , E X 4 = s ( g , h ) exp ( 8 g 2 / ( 1 − 4 h ) ) / ( g 4 √ ( 1 − 4 h ) ) .
with the function s ( g , h ) being given by
s ( g , h ) = 1 + 6 exp ( 6 g 2 / ( 4 h − 1 ) ) + exp ( 8 g 2 / ( 4 h − 1 ) ) − 4 exp ( 7 g 2 / ( 8 h − 2 ) ) − 4 exp ( 15 g 2 / ( 8 h − 2 ) ) .
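As a numerical cross-check, the general moment expression of Proposition 4.3 can be coded directly and compared with quadrature; a sketch under our own helper names:

```python
import numpy as np
from math import comb
from scipy.integrate import quad
from scipy.stats import norm

def gh_raw_moment(i, g, h):
    """i-th raw moment of the g-and-h model (a = 0, b = 1); requires h in [0, 1/i)."""
    s = sum((-1) ** r * comb(i, r) * np.exp((i - r) ** 2 * g ** 2 / (2 * (1 - i * h)))
            for r in range(i + 1))
    return s / (g ** i * np.sqrt(1 - i * h))

g, h = 0.3, 0.05  # all four moments exist since h < 1/4
r_gh = lambda w: (np.exp(g * w) - 1) / g * np.exp(h * w ** 2 / 2)
closed = [gh_raw_moment(i, g, h) for i in (1, 2, 3, 4)]
numeric = [quad(lambda w: r_gh(w) ** i * norm.pdf(w), -40.0, 40.0)[0]
           for i in (1, 2, 3, 4)]
```

The two lists agree to quadrature accuracy, confirming the closed-form expressions above.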
Analogously, we can also find the n-th order integer moments for the generalised g-and-h and the g-and-k models as follows.
Proposition 4.4 (Moments of the generalised g-and-h density).
Consider the generalised g-and-h distributed random variable X Generalised - GH ( a = 0 , b = 1 , g , h ) with constant parameters g and h > 0 . The n-th integer moment is given with respect to the standard normal distribution and the n-th power of the transformed quantile function
r ( W ) = a + b [ 1 + c ( 1 − exp ( − g W ) ) / ( 1 + exp ( − g W ) ) ] W exp ( h W 2 / 2 ) .
given by
E X n = E r ( W ) n
which will exist if h ∈ [ 0 , 1 / n ) . Hence, one obtains the n-th moment by
E X n = E a + b T g , h * ( W ; a = 0 , b = 1 , g , h ) n = i = 0 n n ! ( n - i ) ! i ! a n - i b i E T g , h * ( W ; a = 0 , b = 1 , g , h ) i .
The moments of the generalised transform cannot in general be obtained in closed form except in some special cases. However, one may make a Maclaurin series expansion of the term G * ( W ) H ( W ) / W i and then approximate the moments at, say, p-th order. We provide the result for the third-order series expansion of the transform below
T g , h * ( W ; a = 0 , b = 1 , g , h ) i = W i 1 + 1 2 i c g W + i h 2 + 1 8 c 2 g 2 ( i - 1 ) i W 2 + 1 4 i 2 c g h - 1 24 i c g 3 + 1 48 c 3 g 3 ( i - 2 ) ( i - 1 ) i W 3 + O ( W 4 ) .
This can then be integrated to produce the approximate i-th order moments given by
3 π 2 5 - i 2 E T g , h * ( W ; a = 0 , b = 1 , g , h ) i = c g i 12 ( 2 + h i ( 2 + i ) ) + g 2 ( 2 + i ) ( - 2 + c 2 ( 2 - 3 i + i 2 ) ) Γ 1 + i 2 + 3 2 c 2 g 2 i ( - 1 + i 2 ) + 4 ( 2 + h i ( 1 + i ) ) Γ 1 + i 2 + ( - 1 ) i - c g i ( 12 ( 2 + h i ( 2 + i ) ) + g 2 ( 2 + i ) ( - 2 + c 2 ( 2 - 3 i + i 2 ) ) ) Γ 1 + i 2 + ( - 1 ) i 3 2 ( c 2 g 2 i ( - 1 + i 2 ) + 4 ( 2 + h i ( 1 + i ) ) ) Γ 1 + i 2 + O ( W 4 + i + 1 )
Similarly to the results obtained above, we can obtain the moments of the g-and-k model as follows.
Proposition 4.5 (Moments of the g-and-k density).
Consider the g-and-k distributed random variable X GK ( a = 0 , b = 1 , g , k ) with constant parameters g and k > - 0 . 5 . The n-th integer moment of the distribution is given with respect to the standard normal distribution and the n-th power of the transformed quantile function
r ( W ) = [ 1 + c ( 1 − exp ( − g W ) ) / ( 1 + exp ( − g W ) ) ] W ( 1 + W 2 ) k ,
given by
E X n = E r ( W ) n .
Hence, one obtains the n-th moment by
E X n = E a + b T g , k * ( W ; a = 0 , b = 1 , g , k ) n = i = 0 n n ! ( n - i ) ! i ! a n - i b i E T g , k * ( W ; a = 0 , b = 1 , g , k ) i .
The moments of the g-and-k transform cannot in general be obtained in closed form except in some special cases. However, one may make the following Maclaurin series expansion and then approximate the moments at, say, p-th order. We provide the result for the third-order series below.
T g , k * ( W ; a = 0 , b = 1 , g , k ) i = W i ( g W ) i 1 2 i + 1 π - 1 3 π 1 2 3 + i ( 6 + g 2 i - 12 k ) W 2 + O ( W 4 ) .
This can then be integrated to produce the approximate i-th order moment given by
E T g , k * ( W ; a = 0 , b = 1 , g , k ) i - Γ 1 / 2 + i ( - 1 ) i ( - g ) i + g i ( g 2 - 12 k ) i ( 1 + 2 i ) - 12 24 2 π .
Remark 4.2. 
These results allow one to perform model estimation via moment matching of model moments to empirical moments of the loss data.
Furthermore, using these moment identities one can easily find the skewness, kurtosis, and coefficient of variation for the g-and-h, generalised g-and-h and g-and-k loss distribution models, as well as for the g-distribution and h-distribution subfamilies.
In addition, there are numerous authors who have studied the generalised properties of quantile-based functionals of asymmetry and kurtosis (See examples in Definition 4.1; also see [36,37,38]).
Definition 4.1 (Generalised skewness and kurtosis functionals).
In considering the generalisations of the skewness and kurtosis for transformation-based quantile function severity models, one can utilise the generalised specifications given for the skewness functional, for a given distribution F X ( x ) with respect to its quantile function Q X ( x ) by
γ F = [ Q X ( α ) + Q X ( 1 − α ) − 2 Q X ( 1 2 ) ] / [ Q X ( α ) − Q X ( 1 − α ) ] , α ∈ ( 0 , 1 ) .
In addition, there is the spread functional given by
S F = Q X ( α ) − Q X ( 1 − α ) , α ∈ ( 0 , 1 ) .
Such measures were discussed by [37], and it can be shown that | γ F ( α ) | ≤ 1 . In the case of the g-and-h family of severity models, one would obtain the forms given in Definition 4.2.
Definition 4.2 (Generalised skewness and kurtosis for g-and-h and generalised g-and-h families).
Consider the g-and-h distributed random variable X ∼ GH ( a = 0 , b = 1 , g , h ) with constant parameters g and h > 0 ; then the generalised skewness and kurtosis are given for the g-and-h model according to the expressions
S F = Q X ( α ) - Q X ( 1 - α ) = exp g Φ - 1 ( α ) - 1 g exp 1 2 h Φ - 1 ( α ) 2 - exp g Φ - 1 ( 1 - α ) - 1 g exp 1 2 h Φ - 1 ( 1 - α ) 2 , γ F = Q X ( α ) + Q X ( 1 - α ) - 2 Q X 1 2 Q X ( α ) - Q X ( 1 - α ) = exp g Φ - 1 ( α ) - 1 g exp 1 2 h Φ - 1 ( α ) 2 S F + exp g Φ - 1 ( 1 - α ) - 1 g exp 1 2 h Φ - 1 ( 1 - α ) 2 S F - 2 exp g Φ - 1 ( 0 . 5 ) - 1 g exp 1 2 h Φ - 1 ( 0 . 5 ) 2 S F .
In the case of the generalised g-and-h model, where X Generalised - GH ( a = 0 , b = 1 , g , h ) , they are given according to expressions
S F * = Q X ( α ) - Q X ( 1 - α ) = 1 + c 1 - exp - g Φ - 1 ( α ) 1 + exp - g Φ - 1 ( α ) Φ - 1 ( α ) exp h Φ - 1 ( α ) 2 2 - 1 + c 1 - exp - g Φ - 1 ( 1 - α ) 1 + exp - g Φ - 1 ( 1 - α ) Φ - 1 ( 1 - α ) exp h Φ - 1 ( 1 - α ) 2 2 , γ F * = Q X ( α ) + Q X ( 1 - α ) - 2 Q X 1 2 Q X ( α ) - Q X ( 1 - α ) = 1 + c 1 - exp - g Φ - 1 ( α ) 1 + exp - g Φ - 1 ( α ) Φ - 1 ( α ) exp h Φ - 1 ( α ) 2 2 S F + 1 + c 1 - exp - g Φ - 1 ( 1 - α ) 1 + exp - g Φ - 1 ( 1 - α ) Φ - 1 ( 1 - α ) exp h Φ - 1 ( 1 - α ) 2 2 S F - 2 1 + c 1 - exp - g Φ - 1 ( 1 / 2 ) 1 + exp - g Φ - 1 ( 1 / 2 ) Φ - 1 ( 1 / 2 ) exp h Φ - 1 ( 1 / 2 ) 2 2 S F .
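These quantile-based functionals are straightforward to evaluate numerically; a sketch for the g-and-h case with a = 0 and b = 1 (the function names are ours):

```python
import numpy as np
from scipy.stats import norm

def gh_quantile(u, g, h):
    """Quantile function of the g-and-h model with a = 0, b = 1."""
    z = norm.ppf(u)
    return (np.exp(g * z) - 1) / g * np.exp(h * z ** 2 / 2)

def spread(alpha, g, h):
    """Spread functional S_F = Q(alpha) - Q(1 - alpha)."""
    return gh_quantile(alpha, g, h) - gh_quantile(1 - alpha, g, h)

def gamma_skew(alpha, g, h):
    """Skewness functional gamma_F, bounded by 1 in absolute value."""
    num = (gh_quantile(alpha, g, h) + gh_quantile(1 - alpha, g, h)
           - 2 * gh_quantile(0.5, g, h))
    return num / spread(alpha, g, h)

gam = gamma_skew(0.95, g=0.5, h=0.2)  # positive for the right-skewed case g > 0
```

Flipping the sign of g mirrors the distribution, so gamma_skew changes sign while the spread is unchanged.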
Analogously one can trivially find the generalised skewness and kurtosis for the g-and-k and g-and-j families.

Tail Properties of the g-and-h and g-and-k Loss Models

In terms of the tail behaviour of the g-and-h family of distributions, the properties of such severity models have been studied by numerous authors, such as [20,30]. In particular, the tail property (index of regular variation) of the g-and-h family was first studied for the h-distribution by [30] and later for the g-and-h distribution by [20] (see Proposition 4.6). In addition, the second-order regular variation properties of the g-and-h family of distributions were studied by [20].
In order to study the regular variation properties of the g-and-h family of loss distribution models, it is first important to recall some basic definitions. First, we note that a positive measurable function f ( · ) is regularly varying if it satisfies the conditions in Definition 4.3; see the discussion in [39].
Definition 4.3 (Regularly varying function).
A positive measurable function f ( · ) is regularly varying (at infinity) with index α ∈ R if it satisfies:
  • it is defined on some neighbourhood [ x 0 , ) of infinity;
  • it satisfies the following limiting relationship
    lim x → ∞ f ( λ x ) / f ( x ) = λ α , λ > 0 .
We note that when α = 0 , then the function f ( · ) is said to be slowly varying (at infinity). From this definition one can show that a random variable has a regularly varying distribution if it satisfies the condition in Definition 4.4.
Definition 4.4 (Regularly varying random variable).
A loss random variable X with distribution F X ( x ) taking positive support is said to be regularly varying with index α ≥ 0 if the right tail distribution F ¯ X ( x ) = 1 − F X ( x ) is regularly varying with index − α .
The following important features can be noted about regularly varying distributions as shown in Theorem 4.1, see detailed discussion in [40].
Theorem 4.1 (Properties of regularly varying distributions).
Given a loss distribution F X ( x ) satisfying F X ( x ) < 1 for all x ≥ 0 , the following conditions on F X ( x ) can be used to verify that it is regularly varying such that F ¯ X ( x ) ∈ RV − α :
  • If F X ( x ) is absolutely continuous with density f X ( x ) such that for some α > 0 one has the limit
    lim x → ∞ x f X ( x ) / F ¯ X ( x ) = α .
    Then f X ( x ) is regularly varying with index - ( 1 + α ) and consequently F ¯ X ( x ) is regularly varying with index - α ;
  • If the density f X ( x ) of the loss distribution F X ( x ) is assumed to be regularly varying with index − ( 1 + α ) for some α > 0 , then the following limit,
    lim x → ∞ x f X ( x ) / F ¯ X ( x ) = α ,
    will also be satisfied if F ¯ X ( x ) is regularly varying with index − α for some α > 0 and the density f X ( x ) is ultimately monotone.
Many additional properties have been described for such heavy-tailed distribution and density functions. Here we will utilise the above conditions to assess the regular variation properties of the right tail of the g-and-h family of loss models. In particular, we will examine whether a single distributional parameter characterises the heavy-tailed feature, as captured by the index of regular variation, or whether the relationship is more complex.
Proposition 4.6 (Index of regular variation of g-and-h distribution).
Consider the random variable W Normal ( 0 , 1 ) and a loss random variable X, which has severity distribution given by the g-and-h distribution with parameters a , b , g , h R , denoted X GH ( a , b , g , h ) , with h > 0 and density (distribution) f ( x ) (and F ( x ) ) . Then the index of regular variation is obtained by considering the following limit
lim x → ∞ x f ( x ) / F ¯ ( x ) = lim u → ∞ ϕ ( u ) ( exp ( g u ) − 1 ) / { [ 1 − Φ ( u ) ] [ g exp ( g u ) + h u ( exp ( g u ) − 1 ) ] } = 1 / h
for u = k - 1 ( x ) where the function k ( x ) is given by
k ( x ) = ( ( exp ( g x ) − 1 ) / g ) exp ( h x 2 / 2 ) .
Hence, one can state that F ¯ ∈ RV − 1 / h .
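This limit can be checked numerically by working parametrically in u = k − 1 ( x ) , so that x = k ( u ) , f ( x ) = ϕ ( u ) / k ′ ( u ) and F ¯ ( x ) = 1 − Φ ( u ) ; a sketch with our own naming (the convergence in u is slow, consistent with the slow-variation results discussed in this section):

```python
import numpy as np
from scipy.stats import norm

g, h = 0.4, 0.25  # the index of regular variation should be 1/h = 4

def k(u):
    """g-and-h transform k(u) = ((exp(g*u) - 1)/g) * exp(h*u**2/2)."""
    return (np.exp(g * u) - 1) / g * np.exp(h * u ** 2 / 2)

def k_prime(u):
    return np.exp(h * u ** 2 / 2) * (np.exp(g * u) + h * u * (np.exp(g * u) - 1) / g)

def tail_ratio(u):
    """x * f(x) / F_bar(x) evaluated at x = k(u)."""
    return k(u) * (norm.pdf(u) / k_prime(u)) / norm.sf(u)

ratios = [tail_ratio(u) for u in (10.0, 20.0, 30.0)]  # increases slowly toward 4
```

Moderate values of u are used to avoid floating-point underflow of the normal survival function.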
The asymptotic tail behavior of the h-family of Tukey distributions was studied by [30] and is given in Proposition 4.7.
Proposition 4.7 (h-type tail behaviour).
Consider the h-type transformation, where W Normal ( 0 , 1 ) is a standard normal random variable and the loss random variable X has severity distribution given by the h-distribution with parameters a , b , h R , denoted X H ( a , b , h ) according to
X = T h ( W ; a , b , h ) : = a + b W exp ( h W 2 / 2 ) .
The asymptotic tail index of the h-type distribution is then given by 1 / h . This is equivalent to the g-and-h family as g → 0 .
This shows that the h-type family has a Pareto-type heavy tail; hence the restriction that moments will only exist up to order less than 1 / h . The g-family of distributions can be shown to be sub-exponential in the tail behaviour but not regularly varying. It was shown in ([20], Theorem 2.2) that one can obtain an explicit form for the function of slow variation in the g-and-h family, as detailed in Theorem 4.2.
Theorem 4.2 (Slow variation representation of g-and-h severity models).
Consider the random variable W Normal ( 0 , 1 ) and a loss random variable X, which has severity distribution given by the g-and-h distribution with parameters a , b , g , h R , denoted X GH ( a , b , g , h ) , with g > 0 and h > 0 and density (distribution) f ( x ) (and F ( x ) ) . Then F ¯ ( x ) = x - 1 / h L ( x ) for some slowly varying function L ( x ) given as x by
L ( x ) = h 2 π g 1 / h exp g h g 2 + 2 h ln ( g x ) - g 2 h - 1 1 / h g 2 + 2 h ln ( g x ) - g 1 + O 1 ln x .
From this explicit Karamata representation developed by [20], it was also shown that one can obtain the second-order regular variation properties of the g-and-h family.
The implications of these findings are that the g-and-h distribution, under the parameter restrictions g > 0 and h > 0 , belongs to the domain of attraction of an Extreme Value Distribution, such that X ∼ GH ( a , b , g , h ) with distribution F satisfies F ∈ MDA ( H γ ) where γ = h > 0 . As a consequence, by the Pickands–Balkema–de Haan Theorem, discussed in detail in [41] and recently in [24], one can state that there exists an Extreme Value Index (EVI) constant γ and a positive measurable function β ( · ) such that the following result between the excess distribution of the g-and-h, denoted by F u ( x ) = Pr [ X − u ≤ x | X > u ] , and the generalised Pareto distribution (GPD), denoted by F γ , β ( u ) GPD ( x ) , is satisfied in the tails
lim u → ∞ sup x ∈ ( 0 , ∞ ) | F u ( x ) − F γ , β ( u ) GPD ( x ) | = 0 .
For discussion on the rate of convergence in the tails, see [42] and the application of this theorem to the g-and-h case by [20], where it is shown that the order of convergence is given by O ( A ( exp ( V − 1 ( u ) ) ) ) for the functions
V ( x ) : = F ¯ - 1 exp ( - x ) , A ( x ) : = V ( ln x ) V ( ln x ) - γ .
Hence, the conclusion from this analysis regarding the tail convergence of the excess distribution of the g-and-h family toward the GPD F γ , β ( u ) GPD ( x ) is given explicitly by
ln L ( x ) ln x 2 g h 3 2 1 ln ( x ) = O 1 ln k - 1 ( x ) , x .
Remark 4.3. 
The implications of this slow rate of convergence are that when data for severities are obtained from a loss process, if a goodness-of-fit test suggests that one may not reject the null hypothesis that these data came from a g-and-h distribution, then one should avoid performing estimation of the extreme quantiles, such as those used to measure the capital via the Value-at-Risk, via methods based on Peaks Over Threshold (POT) or Extreme Value Theory (EVT) based penultimate approximations.
Proposition 4.8 (Index of regular variation of the generalised g-and-h distribution).
Consider the random variable W Normal ( 0 , 1 ) and a loss random variable X, which has severity distribution given by the generalised g-and-h distribution with parameters a , b , g , h R , denoted X Generalised - GH ( a , b , g , h ) , with g > 0 and density (distribution) f ( x ) (and F ( x ) ) . Recall that we have, for the generalised g-and-h loss model, the function r ( x ) with a = 0 and b = 1 given by
r ( x ) = [ 1 + c ( 1 − exp ( − g x ) ) / ( 1 + exp ( − g x ) ) ] x exp ( h x 2 / 2 ) .
Using this, we can then find the index of regular variation as x → ∞ as follows
lim x → ∞ x f ( x ) / F ¯ ( x ) = lim x → ∞ x ϕ ( r − 1 ( x ) ) / { r ′ ( r − 1 ( x ) ) [ 1 − Φ ( r − 1 ( x ) ) ] } = 1 / h .
The proof of this result is detailed in Appendix A.
We note that this result is not unexpected, since the g transform in each case drives the skewness but not the kurtosis. We can also carry out this analysis for the g-and-k model, which yields that the g-and-k does not admit a finite limit for either sign of the parameter g, showing that such a model is not regularly varying, unlike the g-and-h models. However, even though this is the case, we can still assess the relative heavy-tailedness of the g-and-k models compared to the base distribution under the Tukey k-type transformation.

5. Estimating the General Tukey Family Loss Model Parameters

Several studies in the statistics literature have considered the estimation of these types of quantile-function-specified models; see the likelihood-based estimation in [26,27] and the Bayesian approaches in [23,33,43]. In this section we propose and develop a class of novel estimation methods based on L-moments for a range of Tukey families, which shows favourable properties compared to previously proposed methods. Before presenting the new approach developed in this paper, we first comment on a few previously proposed approaches, for instance for the g-and-h family.

5.1. Estimating the g-and-h Loss Model Parameters

As the g-and-h family does not admit a closed-form density function, the likelihood function can only be expressed in terms of the inverse quantile function;
f X ( x 1 , , x n ; θ ) = i = 1 n d d x i Q X - 1 ( x i ; θ ) ,
where x 1 , … , x n are observations, θ = ( a , b , g , h ) is the parameter vector, and Q X − 1 ( · ) is the inverse quantile function. The high computational cost of evaluating the likelihood function comes from the fact that Q X − 1 ( · ) cannot be expressed in closed form, and thus the quantile function must be inverted using an iterative root-search algorithm. The maximum likelihood (ML) estimates of the g-and-h parameters can be found by iteratively searching over the parameter space. The quality of the ML estimates was investigated via simulations by [27] for quantile distribution families generated using skewness and spread functionals, where the authors find that the ML method can be unstable for small samples.
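A direct implementation of this likelihood inverts the quantile function at each observation with a numerical root search; a minimal sketch assuming a standard normal base (the function names and the bracketing interval are our own choices):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def gh_pdf(x, a, b, g, h):
    """Density of the g-and-h model, via numerical inversion of x = a + b*r(z)."""
    r = lambda z: (np.exp(g * z) - 1) / g * np.exp(h * z ** 2 / 2)
    z = brentq(lambda z: a + b * r(z) - x, -15.0, 15.0)  # root search in z-space
    r_prime = np.exp(h * z ** 2 / 2) * (np.exp(g * z) + h * z * (np.exp(g * z) - 1) / g)
    return norm.pdf(z) / (b * r_prime)  # change-of-variables density

def neg_log_likelihood(theta, data):
    a, b, g, h = theta
    return -np.sum([np.log(gh_pdf(x, a, b, g, h)) for x in data])
```

At the median x = a the inversion gives z = 0, so the density equals ϕ(0)/b; an off-the-shelf optimiser can then be run on neg_log_likelihood, although, as noted above, each evaluation requires one root search per observation.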
Another method of fitting the g-and-h distributions is by matching moments. The k-th moment exists if and only if h ∈ [ 0 , 1 / k ) . The expression for the k-th raw moment can be found in [44]; for a = 0 , b = 1 , and g ≠ 0 , the k-th raw moment is given by
m k * = E x k = 1 / ( g k √ ( 1 − k h ) ) ∑ i = 0 k ( − 1 ) i ( k i ) exp ( [ ( k − i ) g ] 2 / ( 2 ( 1 − k h ) ) ) ,
and for g = 0 ,
m k * = E x k = k ! 2 k / 2 [ ( k / 2 ) ! ] ( 1 - k h ) - ( k + 1 ) / 2 if k is even 0 if k is odd .
Given the k-th raw moment, the k-th central moment can be computed by,
m k = E x - m 1 * k = i = 0 k k i ( - 1 ) k - i m i * ( m 1 * ) k - i .
From the central moments, the skewness ζ 3 and kurtosis ζ 4 are given by
ζ 3 = m 3 / m 2 3 / 2 , ζ 4 = m 4 / m 2 2 .
As the skewness and kurtosis are location and scale invariant shape measures, g and h can be simultaneously found by minimising the objective,
( ζ 3 - ζ ^ 3 ) 2 + ( ζ 4 - ζ ^ 4 ) 2 ,
subject to 0 ≤ h < 1 / 4 , where ζ ^ 3 is the sample skewness and ζ ^ 4 is the sample kurtosis. Given the estimates of g and h, b and a can be solved straightforwardly as follows.
b = ( m ^ 2 / m 2 ) 1 / 2 , a = m ^ 1 - b m 1 ,
where m ^ 1 is the sample mean and m ^ 2 is the sample variance. This moment-matching estimator was proposed by [45]; however, the quality of such an estimator was not investigated in depth by the authors.
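The moment-matching procedure described above can be sketched end to end as follows (a simplified illustration with our own helper names; a production implementation would handle the g = 0 branch and the existence constraint h < 1/4 more carefully):

```python
import numpy as np
from math import comb
from scipy.optimize import minimize
from scipy.stats import kurtosis, skew

def gh_raw_moment(k, g, h):
    """k-th raw moment for a = 0, b = 1 and g != 0; requires h < 1/k."""
    s = sum((-1) ** i * comb(k, i) * np.exp(((k - i) * g) ** 2 / (2 * (1 - k * h)))
            for i in range(k + 1))
    return s / (g ** k * np.sqrt(1 - k * h))

def gh_skew_kurt(g, h):
    m1, m2, m3, m4 = (gh_raw_moment(k, g, h) for k in (1, 2, 3, 4))
    c2 = m2 - m1 ** 2                                   # central moments
    c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    c4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return c3 / c2 ** 1.5, c4 / c2 ** 2

def fit_gh_moments(data):
    z3, z4 = skew(data), kurtosis(data, fisher=False)   # sample skewness/kurtosis
    def obj(p):
        s3, s4 = gh_skew_kurt(p[0], p[1])
        return (s3 - z3) ** 2 + (s4 - z4) ** 2
    g, h = minimize(obj, x0=[0.1, 0.05], bounds=[(1e-4, 2.0), (0.0, 0.249)]).x
    c2 = gh_raw_moment(2, g, h) - gh_raw_moment(1, g, h) ** 2
    b = np.sqrt(np.var(data) / c2)
    a = np.mean(data) - b * gh_raw_moment(1, g, h)
    return a, b, g, h
```

The lower bound on g avoids the numerically unstable g → 0 limit of the raw-moment formula, which is our simplification rather than part of the original proposal.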
Instead of matching moments, the g-and-h parameters can be estimated by matching quantiles, as proposed by [46]. Let 0 < u 1 < < u q < 1 denote a set of quantile levels chosen a priori, Q X ( u 1 ) , , Q X ( u q ) denote the set of g-and-h quantiles, and Q ^ u 1 , , Q ^ u q denote the set of sample quantiles. The estimate of θ is found by minimising the objective,
i = 1 q [ Q X ( u i ) - Q ^ u i ] 2 ,
subject to b > 0 and h ≥ 0 . The quality of the quantile matching estimator is determined by the selection of u 1 , … , u q . The authors of [46] choose equally spaced quantiles given by u i = ( i − 1 / 3 ) / ( q + 1 / 3 ) , for i ∈ { 1 , … , q } . By treating u 1 , … , u q as auxiliary parameters, the number of quantiles q ∈ { 4 , … , 20 } is then selected by minimising the Akaike information criterion (AIC) given by
AIC = n log SSE n + 2 ( q + 1 ) ,
where
SSE = i = 1 n [ Q X ( p i ) - Q ^ p i ] 2
and
p i = ( i − 1 / 3 ) / ( n + 1 / 3 ) .
Notice that, for each q, the corresponding SSE is computed using the same number of quantiles as the number of observations, thus making use of the full sample.
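The quantile-matching estimator is equally compact to sketch (our own naming; for simplicity we fix q, restrict to right-skewed fits with g > 0, and omit the AIC-based selection of q):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def gh_quantile(u, a, b, g, h):
    z = norm.ppf(u)
    return a + b * (np.exp(g * z) - 1) / g * np.exp(h * z ** 2 / 2)

def fit_gh_quantiles(data, q=10):
    i = np.arange(1, q + 1)
    u = (i - 1 / 3) / (q + 1 / 3)          # quantile levels of [46]
    emp = np.quantile(data, u)             # sample quantiles
    obj = lambda p: np.sum((gh_quantile(u, *p) - emp) ** 2)
    res = minimize(obj, x0=[np.median(data), 1.0, 0.1, 0.1],
                   bounds=[(None, None), (1e-6, None), (1e-6, None), (0.0, None)])
    return res.x
```

The start value for a is taken as the sample median, since for the g-and-h family the location parameter coincides with the population median.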

5.1.1. New Robust Estimation Approach for g-and-h Loss Models Based on the Method of L-moments

We propose a method for fitting Tukey transform distributions, such as the g-and-h family, using L-moments. This is a general extension of the previous specific modified Tukey family models of [32], as there is no assumption here on the choice of base distribution for W. Compared to classical moments, L-moments have a number of advantages: L-moments are able to characterise a wider range of distributions, as all L-moments of a distribution exist if and only if the mean exists; L-moment estimators are nearly unbiased for all sample sizes and all distributions; a distribution with finite mean is uniquely characterised by its sequence of L-moments; and L-moment estimators are relatively insensitive to outliers. See [47,48] for detailed discussions of the advantages of L-moments over classical moments. Although L-moment estimators are more robust to outliers than their classical counterparts, they still assign positive weights to the extreme observations, and therefore are not resistant enough to outliers in extremely heavy-tailed settings. To tackle these cases we refer the interested reader to [49], who proposed trimmed L-moments (TL-moments) as a robust generalisation of L-moments, whereby the extreme observations are less influential on the estimation. TL-moments exist even if the distribution does not have a finite mean. The estimation method introduced in this section can be adapted to use TL-moments in a straightforward manner to make it more robust; however, such approaches are only really required in very heavy-tailed data analysis and will not be needed for the study undertaken in this paper.
L-moments are defined by [50] to be certain linear combinations of expectations of order statistics. Specifically, let x ( 1 ) ≤ x ( 2 ) ≤ ⋯ ≤ x ( k ) denote an ordered sample of size k. For k ∈ { 1 , 2 , … } , the k-th L-moment is defined as
l k = 1 k i = 0 k - 1 ( - 1 ) i k - 1 i E x ( k - i ) .
The connection between L-moments and a quantile function becomes apparent when L-moments are expressed as projections of a quantile function onto a sequence of orthogonal polynomials that forms a basis of L 2 ;
l k = ∫ 0 1 Q X ( u ) L k − 1 ( u ) d u ,
where L k − 1 is the ( k − 1 )-th shifted Legendre polynomial, given generically by
L k − 1 ( u ) = ∑ j = 0 k − 1 ( − 1 ) k − 1 − j ( k − 1 + j ) ! / ( ( j ! ) 2 ( k − 1 − j ) ! ) u j .
Using the representation of Equation (92), the first four L-moments are given by
l 1 = 0 1 Q X ( u ) d u , l 2 = 0 1 Q X ( u ) ( 2 u - 1 ) d u , l 3 = 0 1 Q X ( u ) ( 6 u 2 - 6 u + 1 ) d u , l 4 = 0 1 Q X ( u ) ( 20 u 3 - 30 u 2 + 12 u - 1 ) d u .
Remark 5.1. 
In the above notation, the Q X ( u ) is understood to be the quantile function of the random variable X at quantile level u. Hence, for the Tukey family of models the L-moments are to be considered with respect to the integral of the transform of the quantile function of the base distribution, which is implicitly included above when we write Q X ( u ) and could be considered with respect to the base distribution quantile function as follows Q X ( u ) : = r ( Q W ( u ) ) .
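Numerically, these population L-moments can be computed for any transform r by the substitution u = Φ(z), which turns each integral into an expectation against the standard normal density; a sketch (the helper names are ours):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# shifted Legendre weights appearing in the first four L-moments
WEIGHTS = [
    lambda u: 1.0,
    lambda u: 2 * u - 1,
    lambda u: 6 * u ** 2 - 6 * u + 1,
    lambda u: 20 * u ** 3 - 30 * u ** 2 + 12 * u - 1,
]

def population_l_moments(r):
    """First four L-moments of X = r(W), W ~ N(0,1): l_k = E[r(W) P*_{k-1}(Phi(W))].

    Integration is truncated to (-40, 40), where the integrand is negligible.
    """
    return [quad(lambda z: r(z) * P(norm.cdf(z)) * norm.pdf(z), -40.0, 40.0)[0]
            for P in WEIGHTS]

g, h = 0.5, 0.1
r_gh = lambda z: (np.exp(g * z) - 1) / g * np.exp(h * z ** 2 / 2)
l1, l2, l3, l4 = population_l_moments(r_gh)
```

The first L-moment l1 coincides with the mean of the g-and-h model, which provides a direct check of the quadrature.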
The location and scale invariant L-moment ratios, τ 3 and τ 4 , analogous to the classical skewness and kurtosis, respectively termed L-skewness and L-kurtosis in [50], are defined as
τ 3 = l 3 / l 2 , τ 4 = l 4 / l 2 .
Unlike the classical skewness and kurtosis, L-skewness and L-kurtosis are bounded, with τ 3 ∈ ( − 1 , 1 ) and τ 4 ∈ [ 1 4 ( 5 τ 3 2 − 1 ) , 1 ) for continuous base distributions. The boundedness of L-moment ratios makes them easy to interpret.
The sample L-moments, also known as L-statistics, are unbiased estimates of L-moments based on the order statistics of an observed sample. In particular, the first four sample L-moments are given by
l ^ 1 = M ^ 0 , l ^ 2 = 2 M ^ 1 - M ^ 0 , l ^ 3 = 6 M ^ 2 - 6 M ^ 1 + M ^ 0 , l ^ 4 = 20 M ^ 3 - 30 M ^ 2 + 12 M ^ 1 - M ^ 0 ,
where M ^ k is the k-th sample probability weighted moment [51], given by
M ^ k = 1 n i = 1 n x ( i ) if k = 0 1 n i = 1 n ( i - 1 ) ( i - 2 ) ( i - k ) ( n - 1 ) ( n - 2 ) ( n - k ) x ( i ) if k > 0 .
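A compact implementation of the sample L-moments via these probability-weighted moments (the function name is ours):

```python
import numpy as np

def sample_l_moments(data):
    """First four sample L-moments from the probability-weighted moments M_0..M_3."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    M = [x.mean()]
    w = np.ones(n)
    for j in (1, 2, 3):
        w = w * (i - j) / (n - j)   # ((i-1)...(i-j)) / ((n-1)...(n-j))
        M.append(np.mean(w * x))
    return (M[0],
            2 * M[1] - M[0],
            6 * M[2] - 6 * M[1] + M[0],
            20 * M[3] - 30 * M[2] + 12 * M[1] - M[0])
```

For the toy sample {1, 2, 3, 4} this returns l̂1 = 2.5 and l̂2 = 5/6, with l̂3 = l̂4 = 0, as expected for an equally spaced sample.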
An alternative, but numerically equivalent, method of computing the sample L-moments is by following closely the definition in Equation (91). See [52] for details.
The most general approach to performing the method of L-moments that is applicable for any base distribution W and any Tukey sub-family of models involves matching the population L-moments to the sample L-moments. We note that, in general, the integrals in Equation (102) for the Tukey families of models may be obtained accurately via a one-dimensional numerical integration algorithm such as adaptive quadrature. However, in some cases we can also obtain closed-form expressions for these L-moments, as detailed below, where we derive these moments in closed form for the Gaussian base distribution most commonly used in practice. To obtain the L-moment expressions we will work with the quantile function of the Gaussian base distribution for W, given for the standard normal by
Q W ( u ) = √ 2 erf − 1 ( 2 u − 1 ) .
In the case of the first L-moment we can find a closed-form expression for the Tukey families of models, and in the case of higher-order L-moments we will utilise a series expansion to obtain the L-moments to any desired accuracy by truncation of the expansion and explicit integration, as follows. We first make the change of variable
R = √ 2 erf − 1 ( 2 u − 1 ) ,
which means that we have the n-th L-moment for the general Tukey family of models which can be written according to the expression
l n = j = 0 n - 1 ( - 1 ) n - j ( n + j ) ! ( j ! ) 2 ( n - j ) ! 0 1 r Q W ( u ) u j d u , = j = 0 n - 1 2 ( - 1 ) n - j ( n + j ) ! ( j ! ) 2 ( n - j ) ! - r R 1 2 ( Φ ( R ) + 1 ) j ϕ 2 R d R
where we used the fact that
d d u erf - 1 u = 1 2 π exp erf - 1 u 2 ,
and erf − 1 ( − 1 ) = − ∞ and erf − 1 ( 1 ) = ∞ . Furthermore, for the first four L-moments we can identify the Legendre polynomials for this change of variable as follows
L 0 ( R ) = 1 , L 1 ( R ) = Φ ( R ) , L 2 ( R ) = 3 Φ ( R ) 2 + 3 Φ ( R ) + 1 , L 3 ( R ) = 10 Φ ( R ) 3 + 15 Φ ( R ) 2 + 12 Φ ( R ) - 6 ,
which means we can write the first four L-Moments as follows
l 1 = 2 - r R ϕ 2 R d R , l 2 = 2 - r R Φ ( R ) ϕ 2 R d R , l 3 = 2 - r R 3 Φ ( R ) 2 + 3 Φ ( R ) + 1 ϕ 2 R d R , l 4 = 2 - r R 10 Φ ( R ) 3 + 15 Φ ( R ) 2 + 12 Φ ( R ) - 6 ϕ 2 R d R .
These representations under the change of variable will now be used in the following propositions to obtain the L-Moments for each of the different Tukey sub-families considered.
Proposition 5.1 (g-family loss model population L-moments).
By considering
r ( R ) = ( exp ( g R ) − 1 ) / g ,
the first four population L-moments of the g-family of Tukey transform loss models are given for a = 0 , b = 1 as follows:
l 1 = exp g 2 2 - 1 g , l 2 = 2 π 2 g exp g 2 - 1 2 g + 2 2 π ψ 1 1 , 1 2 , 3 2 , 1 2 ; - 1 2 , g 2 4 , l 3 = 3 g π ψ 1 1 , 1 2 , 3 2 , 1 2 ; - 1 2 , g 2 4 + 3 2 π 4 g exp g 2 4 - 3 Arctan 3 g π π + 3 l 2 + l 1 , l 4 = 5 l 3 + 27 l 2 - 1 l 1 + 45 Arctan 3 g π π + 30 2 3 π Arctan 1 15 + O 0 R 3 Φ ( R ) 3 ϕ 2 R d R .
where ψ 1 ( · ) denotes the confluent hypergeometric function.
The proof of this result can be obtained in Appendix A. Similarly, the first four L-moments for the h-family, the g-and-h family and the g-and-k family of Tukey elongation loss models are given respectively as follows.
Proposition 5.2 (h-family loss model population L-moments).
By considering
r ( R ) = R exp ( h R 2 / 2 ) ,
the first four population L-moments of the h-family of Tukey transform loss models are given for a = 0 , b = 1 as follows:
l 1 = 2 - r R ϕ 2 R d R = 1 π - R exp - ( 1 - h ) R 2 d R = 0 , h < 1 , l 2 = 2 - r R Φ ( R ) ϕ 2 R d R = 1 π 1 ( 1 - h ) 1 + 1 1 + 2 ( 1 - h ) , l 3 = 2 - r R 3 Φ ( R ) 2 + 3 Φ ( R ) + 1 ϕ 2 R d R = 3 l 2 + l 1 , l 4 = 2 - r R 10 Φ ( R ) 3 + 15 Φ ( R ) 2 + 12 Φ ( R ) - 6 ϕ 2 R d R = 10 8 2 π l 3 + 12 l 2 - 6 l 1 + 30 8 π ( 1 - h ) π [ 1 + 2 ( 1 - h ) ] Arctan 1 [ 2 + 4 ( 1 - h ) ] [ 3 2 + ( 1 - h ) ] .
The proof of this result is in Appendix A.
Proposition 5.3 (k-family loss model population L-moments).
By considering
r ( R ) = R ( 1 + R 2 ) k ,
the first four population L-moments of the k-family of Tukey transform loss models are given for a = 0 , b = 1 as follows:
l 1 = 2 - r R ϕ 2 R d R = 0 , l 2 = 2 - r R Φ ( R ) ϕ 2 R d R = 1 2 π I 1 + I ˜ 1 + 2 k I 3 + ( k - 1 ) k I 5 + 1 / 3 ( k - 2 ) ( k - 1 ) k I 7 + O - R 9 Φ ( R ) exp - R 2 d R , l 3 = 2 - r R 3 Φ ( R ) 2 + 3 Φ ( R ) + 1 ϕ 2 R d R = 3 2 2 π ( G 1 + G ˜ 1 ) + k ( G 3 + G ˜ 3 ) + 1 / 2 ( k - 1 ) k ( G 5 + G ˜ 5 ) + 1 / 6 ( k - 2 ) ( k - 1 ) k ( G 7 + G ˜ 7 ) + 3 l 2 + 3 4 π + 1 l 1 + O - R 9 Φ ( R ) exp - R 2 d R , l 4 = 2 - r R 10 Φ ( R ) 3 + 15 Φ ( R ) 2 + 12 Φ ( R ) - 6 ϕ 2 R d R = 5 2 l 3 - 3 l 2 - l 1 + 12 l 2 - 6 l 1 + 30 8 π 3 π Arctan 1 4 + O - R 9 Φ ( R ) exp - R 2 d R ,
where
I 1 = 1 2 1 + 1 3 , I 2 m + 1 = ( - 1 ) m m ! 2 - ( - 1 ) m 2 m p m 1 p p + 1 / 2 p = 1 , I ˜ 1 = 1 2 1 - 1 3 , G 2 n + 1 = ( - 1 ) n 2 π n p n 1 p 1 / 2 + p Arctan 1 1 + 2 p p = 1 , G ˜ 2 n + 1 = ( - 1 ) n + 1 2 π n p n 1 p 1 / 2 + p Arctan - 1 1 + 2 p p = 1 .
Finally, one can also find the g-and-h family population L-moments in closed form as follows.
Proposition 5.4 (g-and-h family loss model population L-moments).
By considering
r ( R ) = ( ( exp ( g R ) − 1 ) / g ) exp ( h R 2 / 2 ) ,
the first four population L-moments of the g-and-h family of Tukey transform loss models are given for a = 0 , b = 1 as follows:
l 1 = 2 - r R ϕ 2 R d R = exp g 2 4 - 2 h 2 g 2 - h - 2 g 4 - 2 h , h < 2 ,
\[
\begin{aligned}
l_2 ={}& \sqrt{2}\int_{-\infty}^{\infty} r(R)\,\Phi(R)\,\phi\!\left(\sqrt{2}R\right)dR\\
={}& \frac{1}{\sqrt{2}\,g\sqrt{\pi}}\exp\!\left(\frac{g^2}{2(4-2h)}\right)\sqrt{\frac{2\pi}{2-h}}
+ \frac{1}{g\sqrt{\pi}}\sqrt{\frac{2}{1-h/2}}\,\psi_1\!\left(1,\frac{1}{2};\frac{3}{2},\frac{1}{2};-\frac{1}{2(1-h/2)},\frac{g^2}{4(1-h/2)}\right)
- \frac{1}{g\sqrt{\pi}}\frac{\sqrt{2\pi}}{2},\\
l_3 ={}& \sqrt{2}\int_{-\infty}^{\infty} r(R)\left[3\,\Phi(R)^2+3\,\Phi(R)+1\right]\phi\!\left(\sqrt{2}R\right)dR\\
={}& 3l_2 + l_1 + \sum_{n=0}^{\infty}\frac{6\,g^{\,n-1}}{\sqrt{\pi}\,n!}\Bigg\{\frac{1}{4}\,2^{n}(2-h)^{-1-n}\,\Gamma(1+n)
+ \frac{2^{1+n}(2-h)^{-\frac{3}{2}-n}\,\Gamma\!\left(\frac{3}{2}+n\right)\,{}_2F_1\!\left(\frac{1}{2},\frac{3}{2}+n;\frac{3}{2};\frac{1}{-2+h}\right)}{2\sqrt{\pi}}\\
&\qquad + \frac{1}{4}\,\frac{(-1)^n}{\sqrt{\pi}}\,2^{n}\left[\frac{\partial^n}{\partial p^n}\left(\frac{1}{p\sqrt{1/2+p}}\operatorname{Arctan}\frac{1}{\sqrt{1+2p}}\right)\right]_{p=1-h/2}\Bigg\},\\
l_4 ={}& \sqrt{2}\int_{-\infty}^{\infty} r(R)\left[10\,\Phi(R)^3+15\,\Phi(R)^2+12\,\Phi(R)-6\right]\phi\!\left(\sqrt{2}R\right)dR\\
={}& 12l_2 - 6l_1
+ \frac{15}{g\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{g^{\,2n+1}}{(2n+1)!}\,\frac{(-1)^n\sqrt{2}}{\pi}\left[\frac{\partial^n}{\partial p^n}\left(\frac{1}{p\sqrt{1/2+p}}\operatorname{Arctan}\frac{1}{\sqrt{1+2p}}\right)\right]_{p=1-h/2}\\
&+ \frac{45}{2g\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{g^{\,2n+1}}{(2n+1)!}\,\frac{2^{1+n}(2-h)^{-\frac{3}{2}-n}\,\Gamma\!\left(\frac{3}{2}+n\right)\,{}_2F_1\!\left(\frac{1}{2},\frac{3}{2}+n;\frac{3}{2};\frac{1}{-2+h}\right)}{\sqrt{\pi}}\\
&+ \frac{5}{g\sqrt{\pi}}\sum_{n=1}^{\infty}\frac{g^{\,n}}{n!}\,2^{\frac{1}{2}(-1+n)}\left(1+(-1)^{n}\right)(2-h)^{-\frac{1}{2}-\frac{n}{2}}\,\Gamma\!\left(\frac{1+n}{2}\right)\\
&+ \frac{5}{4g\sqrt{\pi}}\sum_{n=1}^{\infty}\frac{g^{\,n}}{n!}\,\frac{1}{\pi(1-h/2)}\,\frac{3}{\sqrt{1+2(1-h/2)}}\operatorname{Arctan}\frac{1}{2\sqrt{\left(\frac{1}{2}+(1-h/2)\right)\Delta}}\\
&+ O\!\left(\int_{-\infty}^{\infty} R^{n}\operatorname{erf}\!\left(\frac{R}{\sqrt{2}}\right)^{3}\exp\!\left(-(1-h/2)R^2\right)dR\right),
\end{aligned}
\]
where $\Delta = \frac{3}{2} + \left(1-\frac{h}{2}\right)$.
Then, given the L-moments, for instance in the case of the g-and-h subfamily, the estimates of g and h are simultaneously found by iteratively minimising the objective
\[
\left(\tau_3 - \hat{\tau}_3\right)^2 + \left(\tau_4 - \hat{\tau}_4\right)^2,
\]
subject to $0 \leq h < 1$, where $\hat{\tau}_3 = \hat{l}_3/\hat{l}_2$ is the sample L-skewness and $\hat{\tau}_4 = \hat{l}_4/\hat{l}_2$ is the sample L-kurtosis. The estimates of a and b can be obtained using the following properties of L-moments.
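The sample L-moments $\hat{l}_1,\ldots,\hat{l}_4$ entering this objective are computed from the order statistics via probability-weighted moments. A minimal sketch of the standard unbiased estimators (the function name is ours, not from the paper):

```python
def sample_l_moments(data):
    """First four sample L-moments via unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b = [0.0] * 4  # b[r] estimates beta_r = E[X F(X)^r]
    for i, xi in enumerate(x, start=1):
        w = 1.0
        b[0] += xi
        for r in range(1, 4):
            w *= (i - r) / (n - r)  # (i-1)...(i-r) / [(n-1)...(n-r)]
            b[r] += w * xi
    b = [v / n for v in b]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3, l4
```

With these, $\hat{\tau}_3 = \hat{l}_3/\hat{l}_2$ and $\hat{\tau}_4 = \hat{l}_4/\hat{l}_2$ feed directly into the objective above, which can then be minimised over $(g,h)$ by any standard numerical optimiser.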
Proposition 5.5 (L-moments of affine functions of random variables).
Let L k ( · ) denote the k-th L-moment operator. Consider random variables X and Y such that Y = a + b X , where a and b are constants. The first and second L-moments of Y can be expressed as
\[
L_1(Y) = a + b\,L_1(X), \qquad L_2(Y) = b\,L_2(X).
\]
Given the estimates of g and h, using Equation (111), one can estimate the values of b and a by
\[
b = \hat{l}_2 / l_2, \qquad a = \hat{l}_1 - b\,l_1.
\]
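Proposition 5.5 thus reduces recovery of the location and scale parameters to two arithmetic operations. A toy sketch with hypothetical numerical values (in practice $l_1, l_2$ would be the population L-moments of the fitted standardised model and the hatted values the sample L-moments):

```python
def recover_a_b(l1_pop, l2_pop, l1_hat, l2_hat):
    # b rescales the second L-moment; a then shifts the first (Proposition 5.5)
    b = l2_hat / l2_pop
    a = l1_hat - b * l1_pop
    return a, b

# hypothetical values for illustration only
a, b = recover_a_b(0.2, 0.5, 1.4, 1.0)
```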
An alternative L-moment based approach can also be considered, in which a special choice of base distribution for W is made, namely the logistic model. If one modifies the Tukey transform family in the g-and-h case as follows, it is also possible to obtain a reparameterised form which admits closed form expressions for the L-moments. This particular sub-family is known as the L-moment Tukey transformation family and was first developed by [32]. Under the logistic choice for W, the density, distribution, and quantile functions are given by
\[
f(w) = \frac{\exp\!\left(-(w-\mu)/s\right)}{s\left[1+\exp\!\left(-(w-\mu)/s\right)\right]^2},\qquad
F(w) = \frac{1}{1+\exp\!\left(-(w-\mu)/s\right)},\qquad
Q_W(\alpha) = \mu + s\ln\!\left(\frac{\alpha}{1-\alpha}\right),\quad \alpha\in(0,1),
\]
for all $w\in\mathbb{R}$, $\mu\in\mathbb{R}$, and $s\in\mathbb{R}_{+}$. The motivation for modifying the distribution transformed under the Tukey structure is that inference on the parameters can then be performed more readily via L-moments and L-correlation. The resulting four basic classes of modified Tukey quantile function transformations are given in Definition 5.1.
Definition 5.1 (L-moment Tukey transforms).
Let W Logistic ( μ = 0 , s = 1 ) be a standard logistic distributed random variable. Then the loss random variable X has severity distribution given by the L-moment Tukey family as follows:
  • The γ-and-κ Tukey family transformation is given by
    \[
    X = T_{\gamma,\kappa}(W) = \gamma^{-1}\left[\exp(\gamma W)-1\right]\exp\!\left(\kappa|W|\right).
    \]
    This is the analog of the g-and-h Tukey transform for the logistic distribution case, for $\gamma \neq 0$ and $\kappa \geq 0$;
  • The κ L -and- κ R Tukey family transformation is given by
    \[
    X = T_{\kappa_L,\kappa_R}(W) =
    \begin{cases}
    W\exp\!\left(\kappa_L|W|\right), & W \leq 0,\\
    W\exp\!\left(\kappa_R|W|\right), & W > 0.
    \end{cases}
    \]
    This is the analog of the double h-h Tukey transform for the logistic distribution case, for $\kappa_L \geq 0$, $\kappa_R \geq 0$, and $\kappa_L \neq \kappa_R$.
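Both transformations are straightforward to implement; sampling then amounts to pushing standard logistic draws through the transform. A sketch under the stated parameter restrictions (the function names are ours):

```python
import math
import random

def gamma_kappa(w, gamma, kappa):
    # gamma-and-kappa transform of a logistic draw (gamma != 0, kappa >= 0)
    return (math.exp(gamma * w) - 1.0) / gamma * math.exp(kappa * abs(w))

def kappa_lr(w, kappa_l, kappa_r):
    # asymmetric kappa_L-and-kappa_R transform (kappa_L, kappa_R >= 0)
    return w * math.exp((kappa_l if w <= 0 else kappa_r) * abs(w))

def sample_logistic(mu=0.0, s=1.0):
    # draw via the logistic quantile function Q_W(alpha) = mu + s ln(alpha/(1-alpha))
    u = random.random()
    return mu + s * math.log(u / (1.0 - u))

x = [gamma_kappa(sample_logistic(), 0.3, 0.1) for _ in range(5)]
```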
These modified transformations then allow one to obtain the population L-moments in terms of the parameters of the L-moment γ-and-κ Tukey family, as well as the asymmetric L-moment κ L -and- κ R Tukey family. These can be matched to the sample-estimated L-moments, giving a system of nonlinear equations to be solved numerically via root search for the resulting L-moment parameter estimates.
Proposition 5.6 (L-moment estimators for the γ-and-κ Tukey family).
Consider a γ-and-κ distributed random variable $X \sim F(\gamma,\kappa)$ and a sample of $n$ loss data points with order statistics $\{X_{(i,n)}\}_{i=1}^{n}$ that will be used to fit the γ-and-κ distribution. Then, under the restrictions $\gamma+\kappa<1$, $\kappa<1$, and $1+\gamma>\kappa$, which ensure that the first two L-moments are finite, one obtains the following two equations for the population's first two L-moments $\lambda_1$ and $\lambda_2$:
\[
\begin{aligned}
\lambda_1 &= \frac{(-\gamma-\kappa)h_1 + (\gamma-\kappa)h_2 + (-\gamma+\kappa)h_3 + 2\kappa h_4 + (\gamma+\kappa)h_5 - 2\kappa h_6}{2\gamma},\\
\lambda_2 &= \frac{2\gamma - (\gamma+\kappa)^2 h_1 + (\gamma-\kappa)^2\left(h_1-h_3\right) + (\gamma+\kappa)^2 h_5}{2\gamma},
\end{aligned}
\]
where h 1 , h 2 , , h 6 are defined with respect to the Harmonic number functions with the following arguments according to
\[
\begin{aligned}
h_1 &= H\!\left[\tfrac{1}{2}(-1-\gamma-\kappa)\right], & h_2 &= H\!\left[\tfrac{1}{2}(-1+\gamma-\kappa)\right], & h_3 &= H\!\left[\tfrac{1}{2}(\gamma-\kappa)\right],\\
h_4 &= H\!\left[\tfrac{1}{2}(-1-\kappa)\right], & h_5 &= H\!\left[-\tfrac{1}{2}(\gamma+\kappa)\right], & h_6 &= H\!\left[-\tfrac{1}{2}\kappa\right],
\end{aligned}
\]
with the harmonic number functions defined for any x > 0 by
\[
H[x] := x\sum_{k=1}^{\infty}\frac{1}{k(x+k)}.
\]
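For example, $H[1] = 1$ (the series telescopes) and $H[2] = 3/2$, matching the familiar harmonic numbers at integer arguments. A truncated-sum sketch (the truncation point K is our choice; the tail decays roughly like $x/K$):

```python
def harmonic(x, K=200_000):
    # H[x] = x * sum_{k>=1} 1/(k(x+k)), truncated at K terms
    return x * sum(1.0 / (k * (x + k)) for k in range(1, K + 1))
```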
The sample L-moments can then be estimated and matched to these population L-moments, and the resulting system solved numerically for the parameters.
Remark 5.2. 
As noted by [32], expressions are also developed for the population L-skewness τ 3 and L-kurtosis τ 4 should one wish to utilise these for L-moment matching parameter estimation.
Analogously, the solutions for the first two population L-moments for the class of κ L -and- κ R Tukey transformations were detailed by [32] and can be used to perform parameter estimation, as detailed in Proposition 5.7.
Proposition 5.7 (L-moment estimators for the κ L -and- κ R Tukey family).
Consider the asymmetric κ L -and- κ R distributed random variable $X \sim F(\kappa_L,\kappa_R)$ and a sample of $n$ loss data points with order statistics $\{X_{(i,n)}\}_{i=1}^{n}$ that will be used to fit the κ L -and- κ R distribution. Then, under the restrictions $\kappa_L<1$ and $\kappa_R<1$, which ensure that the first two L-moments are finite, one obtains the following two equations for the population's first two L-moments $\lambda_1$ and $\lambda_2$:
\[
\begin{aligned}
\lambda_1 &= \tfrac{1}{4}\left(2p_5 - 2p_6 - 2p_7 + 2p_8 - \kappa_L p_9 + \kappa_L p_{10} + \kappa_R p_{11} - \kappa_R p_{12}\right),\\
\lambda_2 &= \tfrac{1}{4}\left[4 + \kappa_L - 4p_5 + 4p_6 + \kappa_L\left(p_9 - p_{10}\right) + 4 + \kappa_R - 4p_7 + 4p_8 + \kappa_R\left(p_{11} - p_{12}\right)\right],
\end{aligned}
\]
where p 5 , p 6 , , p 12 are defined with respect to the polygamma functions with the following arguments according to
\[
\begin{aligned}
p_5 &= P\!\left[0,\tfrac{1}{2}-\tfrac{\kappa_L}{2}\right], & p_6 &= P\!\left[0,1-\tfrac{\kappa_L}{2}\right], & p_7 &= P\!\left[0,\tfrac{1}{2}-\tfrac{\kappa_R}{2}\right], & p_8 &= P\!\left[0,1-\tfrac{\kappa_R}{2}\right],\\
p_9 &= P\!\left[1,\tfrac{1}{2}-\tfrac{\kappa_L}{2}\right], & p_{10} &= P\!\left[1,1-\tfrac{\kappa_L}{2}\right], & p_{11} &= P\!\left[1,\tfrac{1}{2}-\tfrac{\kappa_R}{2}\right], & p_{12} &= P\!\left[1,1-\tfrac{\kappa_R}{2}\right],
\end{aligned}
\]
with the polygamma functions defined by
\[
P[m,x] := (-1)^{m+1}\,m!\sum_{k=0}^{\infty}\frac{1}{(x+k)^{m+1}}.
\]
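For $m \geq 1$ this series coincides with the classical polygamma function $\psi^{(m)}(x)$; for instance $P[1,1] = \pi^2/6$. A truncated-sum sketch (note that for $m=0$ the series as written diverges, and the digamma function is presumably the intended reading — an assumption on our part):

```python
import math

def polygamma_series(m, x, K=200_000):
    # P[m, x] = (-1)^(m+1) m! sum_{k>=0} 1/(x+k)^(m+1), valid for m >= 1
    s = sum(1.0 / (x + k) ** (m + 1) for k in range(K))
    return (-1) ** (m + 1) * math.factorial(m) * s
```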
The sample L-moments can then be estimated and matched to these population L-moments, and the resulting system solved numerically for the parameters.

6. Simulation Study: Comparison of Estimation Procedures for the Tukey Family of Claim Models

To investigate the quality of the various parameter estimation methods, including the approach we have developed in this paper based on L-moments, in fitting the g-and-h distributions, we conduct a simulation study, focusing on bias, variability, and computational cost of each method. The methods under comparison are moment matching (MoM), maximum likelihood (ML), quantile matching (QM), and method of L-moments (MoLM).
Independent samples are generated from two g-and-h distributions with parameters $(a,b,g,h) = (0, 1, 0.1, 0.1)$ and $(0, 1, 0.5, 0.2)$, with the second being more skewed and heavier tailed than the first. Notice that the first four moments are finite for both distributions, as $h < 1/4$ in both cases, and thus the MoM estimator is not disadvantaged by design. Independent observations from a g-and-h distribution are generated by first drawing $u_i$ from a uniform distribution on the interval $(0,1)$ for $i \in \{1,\ldots,n\}$. A sample from the g-and-h distribution is then obtained by applying the transformation $x_i = Q_X(u_i)$ given by the g-and-h quantile function. We use the Mersenne Twister (MT) pseudo-random number generator [53] to generate the uniform observations. Samples of sizes $n = 50$, $n = 100$, and $n = 1000$ are considered. We generate 1000 samples of each size from each g-and-h distribution.
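The inverse-transform sampler just described can be sketched as follows (we use Python's built-in NormalDist for $\Phi^{-1}$ rather than a specific MT wrapper; the parameter order $(a,b,g,h)$ and the function names are ours):

```python
import math
import random
from statistics import NormalDist

def gh_quantile(u, a, b, g, h):
    # g-and-h quantile: a + b * (exp(g z) - 1)/g * exp(h z^2 / 2), z = Phi^{-1}(u), g != 0
    z = NormalDist().inv_cdf(u)
    return a + b * (math.exp(g * z) - 1.0) / g * math.exp(h * z * z / 2.0)

def gh_sample(n, a, b, g, h, seed=1):
    # inverse-transform sampling: push uniforms through the quantile function
    rng = random.Random(seed)
    return [gh_quantile(rng.random(), a, b, g, h) for _ in range(n)]
```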
Summaries of the estimates by the studied methods are reported in Table 1. For each parameter, we report the Monte Carlo mean (Mean), standard deviation (SD), and mean squared error (MSE). We also report the mean and standard deviation of the per-estimation-time in seconds (Time) of each method, which allows us to assess the feasibility and scalability of the method.
It can be seen that for both g-and-h distributions, all sample sizes, and all four parameters, the MoLM has either the lowest or equal lowest MSE amongst the considered estimators. The MoM has the highest MSE in most cases. As the sample size increases, the MSE is reduced for all estimators of all parameters, except for the MoM estimator of the h parameter. Comparing the Monte Carlo means, the MoM is particularly poor at estimating the h parameter; the value of h is significantly underestimated even for n = 1000. The MoLM slightly underestimates the g and h parameters; however, this bias decreases as the sample size increases. Comparing the QM and the MoLM for the g and h parameters, the mean of the QM estimates is closer to the true value, but the MoLM has a lower standard deviation, especially for n = 50 and n = 100.
Comparing the mean per-estimation-time amongst the estimators, the MoLM is the fastest for the first g-and-h distribution, and is sometimes slightly slower than MoM for the second one. The ML estimator is the slowest in each case and scales poorly with sample size.
The conclusion drawn from this simulation study is that the parameters of the g-and-h family of claim severity models are accurately and efficiently estimated using the proposed approach, based on equating the derived population L-moments to the estimated sample L-moments and solving the resulting system of non-linear equations. Importantly, the method is able to estimate the h parameter accurately; this parameter dictates the kurtosis of the severity distribution and is important in practice for calculating risk measures associated with extreme events. Furthermore, our proposed L-moments based approach is computationally more efficient than the other procedures, especially as the sample size increases.

7. Empirical Application

In this section we consider an empirical study involving a very large real-world dataset of claim records from Australia. The g, h, g-and-h, and g-and-k distributions are employed to model the total gross payment of individual claims under Compulsory Third Party (CTP) insurance, from an insurance company based in Queensland, Australia. The data contain 115,300 accident records from September 1994 to November 2008, covering 15 calendar years. Unlike many models in non-life insurance, where the annual aggregate amounts of claims are modelled, in this paper we are interested in modelling the distributions of the individual claim payment amounts. Such high-resolution data can provide valuable details about the claim distributions, which we can capture through a flexible severity loss model. Two key challenges of modelling such data are the computational burden associated with the large number of observations, and the fact that models for aggregate amounts may not be flexible enough to adequately describe losses at the individual claim level.
In undertaking this study, we are also interested in studying how the distribution of the losses varies over time. To achieve this, we first partition the records into monthly blocks and study the distributional properties of each month. We then track how the distribution of a particular month changes over the years covered in our dataset, which allows us to study the temporal variation in the model parameters while controlling for possible monthly seasonal patterns.
Let $\boldsymbol{x}_{s,t} = \{x_{1,s,t},\ldots,x_{n_{s,t},s,t}\}$ denote a vector representing a monthly block, where $x_{i,s,t}$ is the log total gross payment of the $i$th claim in the $s$th month of year $t$, and $n_{s,t}$ is the block size. For example, $x_{1,1,1}$ is the first payment in January 1994. We fit the g, h, g-and-h, and g-and-k models to each $\boldsymbol{x}_{s,t}$ for $s \in \{1,\ldots,12\}$ and $t \in \{1,\ldots,15\}$. The parameters of all the models are estimated using the method of L-moments. As standard, we set the overall asymmetry parameter $c$ of the g-and-k model to 0.8. The block size $n_{s,t}$ varies from month to month, with a sample mean of 674 and a sample standard deviation of 174. If a month has fewer than 50 claims, it is excluded from the study. To compare the goodness-of-fit between the models, we compute the root-mean-square error (RMSE) between the model quantiles and the sample quantiles. Specifically,
\[
\mathrm{RMSE} = \left\{\frac{1}{n_{s,t}}\sum_{i=1}^{n_{s,t}}\left[Q_X(u_i)-\hat{Q}(u_i)\right]^2\right\}^{1/2},
\]
where $Q_X(u_i)$ is the model quantile, $\hat{Q}(u_i)$ is the sample quantile, and the quantile level $u_i$ is given by
\[
u_i = \frac{i-0.5}{n_{s,t}}.
\]
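The RMSE computation pairs the $i$-th order statistic with the model quantile at level $u_i = (i-0.5)/n_{s,t}$; a sketch (the function names are ours):

```python
def qq_rmse(sample, model_quantile):
    """RMSE between model quantiles and sample quantiles at levels (i - 0.5)/n."""
    x = sorted(sample)  # order statistics play the role of sample quantiles
    n = len(x)
    sq = 0.0
    for i, q_hat in enumerate(x, start=1):
        u = (i - 0.5) / n
        sq += (model_quantile(u) - q_hat) ** 2
    return (sq / n) ** 0.5
```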
As an example, Figure 1 and Figure 2 show the Q-Q plots of the sample quantiles plotted against the quantiles of the models fitted to the monthly blocks of 1998 and 2003, respectively. In each panel (a, b, c, or d), twelve Q-Q plots are shown, corresponding to the monthly blocks of a year. For year 1998, the g distribution (a) is unable to adequately model both tails of the data, as the data suggest a heavier-tailed distribution than the g distribution. The h distribution (b) also shows a lack of fit in both tails, primarily due to its inability to account for the asymmetry in the data. The g-and-h and g-and-k distributions (c and d) appear to be adequate for the right tail while showing a better fit for the left tail, as they are able to model both the asymmetry and the heavy tails. For year 2003, the fact that the g distribution fits most of the data well while the h distribution fails to do so indicates that the primary feature of the data is asymmetry rather than heavy tails. All of the models except the h distribution are able to adequately describe most of the data. Notice that for both years, all of the models struggle to fit the far left tail of the data. This suggests that a distribution whose far left tail is lighter than that of our models would be appropriate.
From studying the individual Q-Q plots, we know that it is important to model both the asymmetry and the heavy tails, and that the parameters may not be constant over time. Next, we study how the parameters of the g-and-h and g-and-k distributions evolve over the years. To achieve this, we construct a time series plot (against the years $\{t\}$) of each parameter estimate for each month $s$. Such time series plots are shown in Figure 3 and Figure 4 for the g-and-h and the g-and-k models, respectively. For a given parameter, there are twelve observations plotted for each year, corresponding to the calendar months, except for 1994 and 2008. The missing observations in those years are due to the fact that our dataset starts in September 1994 and ends in November 2008, and that we have excluded November 2008 from the study as there are only 37 claims in that month.
Some interesting observations can be made by examining these time series plots. It is clear from the plots that the parameters vary significantly over the years for both models. The underlying process of each parameter appears to be non-stationary. The scale parameters are negatively correlated with the tail-shape parameters; the location parameters are negatively correlated with the asymmetry parameters. It seems that there is a structural change in year 2002, where the severity distribution starts to increase in scale but becomes less heavy-tailed; the location of the distribution starts to shift downward while the skewness starts to move from negative to neutral or even positive. In fact, the g-and-k model reveals (in panel (d) of Figure 4) that the severity distribution switches gradually from being heavy-tailed to being light-tailed (relative to the normal distribution) after 2002.
To compare the goodness-of-fit between the models for each year, we compute the RMSE using Equation (122) for each monthly block, and then plot the mean of the twelve monthly RMSE values for each year. The result is a time series plot of the average RMSE values for each model, shown in Figure 5. We can see that the goodness-of-fit varies both over time and across the models. It is interesting to see that the g-and-k distribution clearly provides the best fit in every year, and that the h distribution provides the worst fit for most of the years.

8. Conclusions

In this paper we have proposed a new family of claims reserving models for non-life insurance severity modelling, corresponding to a flexible class of Tukey elongation transform models. We have outlined the characterisation of the sub-families of g, h, k, j, h-h, g-and-h, g-and-k, g-and-j and generalised g variants, and studied the properties of such models, including deriving novel relationships for the tail behaviour and the population L-moments. Furthermore, we have demonstrated that the parameters of these claim models can be estimated using the method of L-moments, which can often be more accurate and efficient than alternative procedures based on moment-matching or maximum likelihood. Finally, we have applied the methods and models to the calibration of claim severity models for a large insurance database of CTP automotive claims data from Australia.

Acknowledgments

We thank the two anonymous reviewers for their helpful comments.

Author Contributions

The authors Gareth W. Peters, Wilson Ye Chen, and Richard H. Gerlach designed the research, analysed the results, and wrote the paper in close collaboration. Gareth W. Peters derived the relevant properties of the Tukey family; Wilson Ye Chen developed the L-moment estimators.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Proof of Proposition 4.8

Proof. 
This result is obtained by first considering the limit
\[
\lim_{x\to\infty}\frac{x\,f(x)}{\bar{F}(x)} = \lim_{x\to\infty}\frac{x\,\phi\!\left(r^{-1}(x)\right)}{r'\!\left(r^{-1}(x)\right)\left[1-\Phi\!\left(r^{-1}(x)\right)\right]} \quad\text{(A1)}
\qquad = \frac{1}{h}, \quad\text{(A2)}
\]
which, by defining $u = r^{-1}(x)$ and noting that $u\left(1-\Phi(u)\right)/\phi(u)\to 1$ as $u\to\infty$, one can rewrite using the reciprocal law of limits (if $\lim_{x\to c} g(x) = M$ with $M\neq 0$, then $\lim_{x\to c} 1/g(x) = 1/M$) together with the product law of limits:
\[
\lim_{u\to\infty}\frac{\phi(u)\,r(u)}{\left(1-\Phi(u)\right)r'(u)}
= \lim_{u\to\infty}\frac{u\,r(u)}{r'(u)}\cdot\lim_{u\to\infty}\frac{\phi(u)}{u\left(1-\Phi(u)\right)}
= \lim_{u\to\infty}\frac{u\,r(u)}{r'(u)}
= \lim_{u\to\infty}\frac{\left(1+\exp(gu)\right)u^2}{1+gu+hu^2+\exp(gu)\left(1+hu^2\right)}.
\]
Now, in the case that $g<0$, one has $\exp(gu)\to 0$ as $u\to\infty$, and hence
\[
\lim_{u\to\infty}\frac{\phi(u)\,r(u)}{\left(1-\Phi(u)\right)r'(u)} = \lim_{u\to\infty}\frac{\left(1+\exp(gu)\right)u^2}{1+gu+hu^2+\exp(gu)\left(1+hu^2\right)} = \frac{1}{h},
\]
where, as expected, we again see that the generalised g-and-h model produces a distribution that satisfies $\bar{F}\in\mathrm{RV}_{-1/h}$. ☐
The following integral identities from [54] are useful in the derivations of the population L-moments below.
\[
\begin{aligned}
&\Phi(x) = \frac{1}{2}\operatorname{erfc}\!\left(-\frac{x}{\sqrt{2}}\right),\qquad
\int_{-\infty}^{\infty} x\,\Phi(x)\,\phi(x)\,dx = \frac{1}{2\sqrt{\pi}},\qquad
\int_{-\infty}^{\infty}\Phi(x)^2\,\phi(x)\,dx = \frac{1}{\pi}\operatorname{Arctan}\sqrt{3},\\
&\int_{-\infty}^{\infty}\exp\!\left(cx-\frac{x^2}{2}\right)dx = \sqrt{2\pi}\,\exp\!\left(\frac{c^2}{2}\right),\qquad
\int \exp(cx)\,\phi(bx)^{n}\,dx = \frac{\exp\!\left(\frac{c^2}{2nb^2}\right)}{b\sqrt{n}\left(\sqrt{2\pi}\right)^{n-1}}\,\Phi\!\left(b\sqrt{n}\,x-\frac{c}{b\sqrt{n}}\right)+C,\\
&\int_{-\infty}^{0}\Phi(bx)\,\phi(ax)\,dx = \frac{1}{2\pi|a|}\left[\frac{\pi}{2}-\operatorname{Arctan}\frac{b}{|a|}\right],\qquad
\int_{0}^{\infty}\Phi(bx)\,\phi(ax)\,dx = \frac{1}{2\pi|a|}\left[\frac{\pi}{2}+\operatorname{Arctan}\frac{b}{|a|}\right],\\
&\int_{0}^{\infty}\operatorname{erfc}(cx)\,\exp\!\left(-px^2\right)x^{n}\,dx = I_n,\qquad \Re\left\{c^2+p\right\}>0,\\
&\qquad I_1 = \frac{1}{2p}\left(1-\frac{c}{\sqrt{c^2+p}}\right),\qquad
I_{2m} = \frac{(-1)^m}{\sqrt{\pi}}\,\frac{\partial^m}{\partial p^m}\!\left[\frac{1}{\sqrt{p}}\operatorname{Arctan}\frac{\sqrt{p}}{c}\right],\\
&\qquad I_{2m+1} = \frac{(-1)^m\,m!}{2p^{m+1}} - \frac{(-1)^m c}{2}\,\frac{\partial^m}{\partial p^m}\!\left[\frac{1}{p\sqrt{p+c^2}}\right],\\
&\int_{0}^{\infty} x^{a-1}\operatorname{erf}(cx)\,\exp\!\left(-px-bx^2\right)dx
= \frac{c\,\Gamma\!\left(\frac{a+1}{2}\right)}{\sqrt{\pi}\,b^{(a+1)/2}}\,\psi_1\!\left(\frac{a+1}{2},\frac{1}{2};\frac{3}{2},\frac{1}{2};-\frac{c^2}{b},\frac{p^2}{4b}\right)
- \frac{c\,p\,\Gamma\!\left(\frac{a}{2}+1\right)}{\sqrt{\pi}\,b^{a/2+1}}\,\psi_1\!\left(\frac{a}{2}+1,\frac{1}{2};\frac{3}{2},\frac{3}{2};-\frac{c^2}{b},\frac{p^2}{4b}\right),\\
&\int_{0}^{\infty}\operatorname{erf}^2(cx)\,\exp\!\left(-px^2\right)x^{2n+1}\,dx
= \frac{(-1)^n\,2c}{\pi}\,\frac{\partial^n}{\partial p^n}\!\left[\frac{1}{p\sqrt{c^2+p}}\operatorname{Arctan}\frac{c}{\sqrt{c^2+p}}\right],\qquad \Re\{p\}>0,\\
&\int_{0}^{\infty}\operatorname{erf}(ax)\operatorname{erf}(bx)\operatorname{erf}(cx)\,\exp\!\left(-px^2\right)x\,dx\\
&\qquad = \frac{1}{\pi p}\left[\frac{a}{\sqrt{a^2+p}}\operatorname{Arctan}\frac{bc}{\sqrt{(a^2+p)\,\Delta}}
+\frac{b}{\sqrt{b^2+p}}\operatorname{Arctan}\frac{ac}{\sqrt{(b^2+p)\,\Delta}}
+\frac{c}{\sqrt{c^2+p}}\operatorname{Arctan}\frac{ab}{\sqrt{(c^2+p)\,\Delta}}\right],\\
&\qquad\text{where } \Delta = a^2+b^2+c^2+p,\qquad \Re\{p\}>0.
\end{aligned}
\]
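As a quick numerical sanity check on this list, the identity $\int_{-\infty}^{\infty}\Phi(x)^2\,\phi(x)\,dx = \frac{1}{\pi}\operatorname{Arctan}\sqrt{3} = \frac{1}{3}$ (the value $1/3$ also follows because $\Phi(X)$ is uniform for $X \sim N(0,1)$) can be verified by simple quadrature; a sketch:

```python
import math

def Phi(x):
    # standard normal CDF via the complementary error function
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def phi(x):
    # standard normal density
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def integral(f, lo=-8.0, hi=8.0, n=4000):
    # composite trapezoid rule; very accurate for rapidly decaying smooth integrands
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return s * h

val = integral(lambda x: Phi(x) ** 2 * phi(x))
```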

Appendix A.2. Proof of Proposition 5.1

Proof. 
We make use of the integral identities from [54] listed in Appendix A above, which are applied in the derivation of each of the first four L-moments as follows:
\[
l_1 = \sqrt{2}\int_{-\infty}^{\infty} r(R)\,\phi\!\left(\sqrt{2}R\right)dR
= \frac{\sqrt{2}}{g}\int_{-\infty}^{\infty}\left[\exp(gR)-1\right]\phi\!\left(\sqrt{2}R\right)dR
= \frac{\exp\!\left(\frac{g^2}{2}\right)-1}{g},
\]
\[
\begin{aligned}
l_2 ={}& \sqrt{2}\int_{-\infty}^{\infty} r(R)\,\Phi(R)\,\phi\!\left(\sqrt{2}R\right)dR\\
={}& \frac{\sqrt{2}}{2g}\int_{0}^{\infty}\exp\!\left(gR-R^2\right)\operatorname{erfc}\!\left(-\frac{R}{\sqrt{2}}\right)dR
+ \frac{\sqrt{2}}{2g}\int_{0}^{\infty}\exp\!\left(-gR-R^2\right)\operatorname{erfc}\!\left(-\frac{R}{\sqrt{2}}\right)dR\\
&- \frac{\sqrt{2}}{g}\int_{-\infty}^{0}\Phi(R)\,\phi\!\left(\sqrt{2}R\right)dR
- \frac{\sqrt{2}}{g}\int_{0}^{\infty}\Phi(R)\,\phi\!\left(\sqrt{2}R\right)dR\\
={}& \frac{\sqrt{2\pi}}{2g}\exp\!\left(g^2\right) - \frac{1}{\sqrt{2}\,g}
+ \frac{\sqrt{2}}{2\sqrt{\pi}}\,\psi_1\!\left(1,\frac{1}{2};\frac{3}{2},\frac{1}{2};-\frac{1}{2},\frac{g^2}{4}\right),
\end{aligned}
\]
\[
\begin{aligned}
l_3 ={}& \sqrt{2}\int_{-\infty}^{\infty} r(R)\left[3\,\Phi(R)^2+3\,\Phi(R)+1\right]\phi\!\left(\sqrt{2}R\right)dR\\
={}& \frac{3\sqrt{2}}{g}\int_{-\infty}^{\infty}\exp\!\left(gR-R^2\right)\Phi(R)^2\,dR - \frac{3\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 3l_2 + l_1\\
={}& \frac{3\sqrt{2}}{4g}\left[\int_{-\infty}^{\infty}\exp\!\left(gR-R^2\right)\operatorname{erf}\!\left(\frac{R}{\sqrt{2}}\right)^{2}dR
+ 2\int_{-\infty}^{\infty}\exp\!\left(gR-R^2\right)\operatorname{erf}\!\left(\frac{R}{\sqrt{2}}\right)dR\right]
+ \frac{3\sqrt{2\pi}}{4g}\exp\!\left(\frac{g^2}{4}\right) - \frac{3\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 3l_2 + l_1\\
={}& \frac{3}{g\sqrt{\pi}}\,\psi_1\!\left(1,\frac{1}{2};\frac{3}{2},\frac{1}{2};-\frac{1}{2},\frac{g^2}{4}\right)
+ \frac{3\sqrt{2\pi}}{4g}\exp\!\left(\frac{g^2}{4}\right) - \frac{3\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 3l_2 + l_1\\
&+ \frac{3\sqrt{2}}{4g}\sum_{n=0}^{\infty}\left[\int_{0}^{\infty}\frac{(gR)^n}{n!}\exp\!\left(-R^2\right)\operatorname{erf}\!\left(\frac{R}{\sqrt{2}}\right)^{2}dR
+ \int_{0}^{\infty}\frac{(gR)^n}{n!}\exp\!\left(-gR-R^2\right)\operatorname{erf}\!\left(\frac{R}{\sqrt{2}}\right)^{2}dR\right]\\
={}& \frac{3}{g\sqrt{\pi}}\,\psi_1\!\left(1,\frac{1}{2};\frac{3}{2},\frac{1}{2};-\frac{1}{2},\frac{g^2}{4}\right)
+ \frac{3\sqrt{2\pi}}{4g}\exp\!\left(\frac{g^2}{4}\right) - \frac{3\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 3l_2 + l_1,
\end{aligned}
\]
\[
\begin{aligned}
l_4 ={}& \sqrt{2}\int_{-\infty}^{\infty} r(R)\left[10\,\Phi(R)^3+15\,\Phi(R)^2+12\,\Phi(R)-6\right]\phi\!\left(\sqrt{2}R\right)dR\\
={}& 10\sqrt{2}\int_{-\infty}^{\infty} r(R)\,\Phi(R)^3\,\phi\!\left(\sqrt{2}R\right)dR + 5l_3 + \frac{45\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 27l_2 - l_1\\
={}& 5l_3 + \frac{45\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 27l_2 - l_1
+ \frac{20\sqrt{2}}{g}\sum_{n=1}^{\infty}\frac{g^{\,2n+1}}{(2n+1)!}\int_{0}^{\infty} R^{2n+1}\,\Phi(R)^3\,\phi\!\left(\sqrt{2}R\right)dR\\
={}& 5l_3 + \frac{45\operatorname{Arctan}\sqrt{3}}{g\pi\sqrt{\pi}} + 27l_2 - l_1
+ \frac{30\sqrt{2}}{3\pi}\operatorname{Arctan}\frac{1}{\sqrt{15}}
+ O\!\left(\int_{0}^{\infty} R^{3}\,\Phi(R)^3\,\phi\!\left(\sqrt{2}R\right)dR\right).
\end{aligned}
\]

Appendix A.3. Proof of Proposition 5.2

Proof. 
By considering
\[
r(R) = R\,\exp\!\left(\frac{hR^2}{2}\right),
\]
the first four population L-moments of the h-family of Tukey transform loss models are given, for a = 0 , b = 1 , as follows:
\[
l_1 = \sqrt{2}\int_{-\infty}^{\infty} r(R)\,\phi\!\left(\sqrt{2}R\right)dR
= \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} R\,\exp\!\left(-(1-h)R^2\right)dR = 0,\qquad h<1,
\]
\[
l_2 = \sqrt{2}\int_{-\infty}^{\infty} r(R)\,\Phi(R)\,\phi\!\left(\sqrt{2}R\right)dR
= \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} R\,\Phi(R)\,\exp\!\left(-(1-h)R^2\right)dR
= \frac{1}{\sqrt{\pi}}\int_{0}^{\infty} R\,\operatorname{erfc}\!\left(-\frac{R}{\sqrt{2}}\right)
\]