1. Introduction
The Archimedean class is an attractive class of copulas that contains numerous well-known families, such as Gumbel–Hougaard [1,2], Clayton [3], Ali–Mikhail–Haq [4] and Frank [5]. The construction method of this class is distinctive: it uses a function with certain characteristics called the Archimedean generator. Marshall and Olkin [6] used inverse Laplace transformations as Archimedean generators. Genest and MacKay [7] had earlier defined Archimedean generators under fewer conditions than those required of inverse Laplace transformations. Precisely, the Archimedean generator is a real continuous function on the unit interval $I = [0,1]$ that is strictly decreasing, convex, and takes the value zero at one. Notably, many concepts and properties of Archimedean copulas can be expressed using their generators. For example, the Kendall distribution function of an Archimedean copula has a simple form in terms of its generator. Also, both Kendall’s tau correlation coefficient and the tail dependence coefficients have special forms in terms of Archimedean generators (see Genest and MacKay [7]). Genest and Rivest [8] discussed the estimation of the dependence parameter for Archimedean copulas and gave goodness-of-fit test statistics based on the distribution function of the copula. Smith [9] proposed a method to select the appropriate copula among the Archimedean class of copulas. Genest et al. [10] introduced graphical tools and non-parametric goodness-of-fit tests for copulas, which can be easily conducted using Archimedean generators.
Durante et al. [11] generalized the Archimedean class of copulas by replacing the Archimedean generator in the form of the Archimedean copula with two different functions, each satisfying certain conditions. Michiels et al. [12] studied the construction of Archimedean copulas through the ratio of the Archimedean generator to its first derivative. Wysocki [13] introduced a form to construct Archimedean copulas through functions called diagonal generators, which are convex and satisfy the properties of diagonal sections of Archimedean copulas. Alhadlaq and Alzaid [14] showed that truncations of cumulative distribution functions (CDFs) and probability generating functions (PGFs) under simple conditions are Archimedean generators. They obtained some of the well-known Archimedean copulas and introduced new ones using either CDFs or PGFs. What makes this technique practicable is that any CDF or PGF can readily suggest a new Archimedean generator. Alzaid and Alhadlaq [15] studied in detail a new Archimedean copula produced by this technique, whose generator is the truncation of the PGF of the Poisson distribution, and then used this new Archimedean copula to model real datasets.
In this paper, we propose a new family of Archimedean copulas (called the half-logistic copula) based on the right truncation of the half-logistic distribution function. Some dependence properties and measures are studied. Two real-life datasets are modeled using the new copula, along with a comparison with other well-known copula models. The motivation of this research is to add more families with various dependence structures to the class of Archimedean copulas. As will be shown in the discussion of the applications, the new half-logistic Archimedean copula is among the best-fitting models for the data.
The remainder of this paper proceeds as follows: In Section 2, the half-logistic copula is introduced for the bivariate case. Some related functions and limiting cases are discussed. In Section 3, some dependence concepts are investigated. Kendall’s tau correlation coefficient is obtained in exact form. In Section 4, an extension of this copula by adding a second parameter is given. Section 5 presents some applications of the new copula to real datasets. Some other known copula models are used for comparison. Our conclusions are presented in Section 6.
First, we need to explain the concepts of additive and multiplicative Archimedean generators and how the latter is used to construct Archimedean copulas.
Definition 1 (Genest and MacKay [7]).
A function $\phi: I \to [0,\infty]$ that is continuous, strictly decreasing, and convex such that $\phi(1)=0$ and $\phi(0)\le\infty$ is called an additive Archimedean generator.
Definition 2 (Nelsen [16]).
Let $\psi: I \to I$ be a continuous, strictly increasing, and log-concave function such that $\psi(1)=1$. Then, ψ is called a multiplicative Archimedean generator. Note that ψ is a multiplicative Archimedean generator if and only if $\psi(t)=\exp(-\phi(t))$, where ϕ is an additive Archimedean generator.
Theorem 1.
Let ψ be a multiplicative generator that satisfies Definition 2. Then
$$C(u,v)=\psi^{[-1]}\big(\psi(u)\,\psi(v)\big),\quad u,v\in I,$$
is a bivariate Archimedean copula, where $\psi^{[-1]}$ is the pseudo-inverse of ψ. Note that if $\psi(0)=0$, the pseudo-inverse is an ordinary inverse function, and in this case, ψ is known as a strict multiplicative Archimedean generator.
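The construction in Theorem 1 can be sketched in a few lines of Python. This is only an illustration for strict generators, where the pseudo-inverse is an ordinary inverse. The helper name `archimedean_copula`, the Clayton example, and the parameter value `A = 2.0` are assumptions introduced here, not part of the paper.

```python
import math

def archimedean_copula(psi, psi_inv):
    """Build C(u, v) = psi_inv(psi(u) * psi(v)) from a strict multiplicative generator."""
    def copula(u, v):
        return psi_inv(psi(u) * psi(v))
    return copula

# psi(t) = t is a strict multiplicative generator; it yields the product copula.
product_copula = archimedean_copula(lambda t: t, lambda s: s)

# Illustrative Clayton example: additive generator phi(t) = (t**-A - 1) / A,
# so the multiplicative generator is psi(t) = exp(-phi(t)).
A = 2.0  # hypothetical parameter value, chosen only for the demonstration

def clayton_psi(t):
    return math.exp(-(t ** -A - 1.0) / A) if t > 0 else 0.0

def clayton_psi_inv(s):
    return (1.0 - A * math.log(s)) ** (-1.0 / A) if s > 0 else 0.0

clayton_copula = archimedean_copula(clayton_psi, clayton_psi_inv)
```

With these definitions, `clayton_copula(u, v)` should agree with the familiar closed form $(u^{-A}+v^{-A}-1)^{-1/A}$ at interior points.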
2. The Half-Logistic Copula
The half-logistic distribution has been considered in many research topics. Inference about this distribution was studied by Balakrishnan and Puthenpura [17], Balakrishnan and Wong [18], and others. Various extensions of this distribution have been introduced; for instance, see Torabi and Bagheri [19] and Olapade [20]. The half-logistic distribution (or one of its extensions) has been used to model data from various fields. For reliability applications, see, for example, Samuel and Kehinde [21]. Also, we refer to Oluwatobi and Portharcourt [22] and Awodutire et al. [23] for survival analysis applications. Moreover, the half-logistic distribution has been studied under different censoring schemes; for instance, see Balakrishnan and Asgharzadeh [24], Balakrishnan and Saleh [25], and Seo et al. [26].
In this section, we will use a truncation of the half-logistic distribution function as a multiplicative Archimedean generator. Before introducing our generator, we need to recall Theorem 4 of Alhadlaq and Alzaid [14].
Theorem 2.
If F is a strictly increasing log-concave cumulative distribution function (CDF) on I, then
$$C(u,v)=F^{-1}\big(F(u)\,F(v)\big) \qquad (1)$$
is a copula. In other words, the function F satisfying (1) is a strict multiplicative Archimedean generator.
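Now, the CDF of the half-logistic distribution, which is given by $F(x)=\frac{1-e^{-\theta x}}{1+e^{-\theta x}}$, $x\ge 0$, $\theta>0$, is strictly increasing and log-concave. To restrict this function to I, we truncate it from the right at one as
$$F_T(x)=\frac{F(x)}{F(1)}=\frac{(1+e^{-\theta})(1-e^{-\theta x})}{(1-e^{-\theta})(1+e^{-\theta x})},\quad x\in I.$$
Thus, $F_T$ is a strictly increasing CDF on I, and it is log-concave, as
$$\frac{d^2}{dx^2}\log F_T(x)=-\frac{2\theta^2 e^{-\theta x}\,(1+e^{-2\theta x})}{(1-e^{-2\theta x})^2}<0.$$
So, if we set $\psi_\theta(t)=F_T(t)$, we obtain from Theorem 2 that $\psi_\theta$ is a strict multiplicative Archimedean generator. The inverse of $\psi_\theta$ is
$$\psi_\theta^{-1}(s)=\frac{1}{\theta}\ln\frac{(1+e^{-\theta})+(1-e^{-\theta})\,s}{(1+e^{-\theta})-(1-e^{-\theta})\,s},\quad s\in I.$$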
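A minimal numerical sketch of the generator and its inverse follows. It assumes that the half-logistic CDF is taken with rate θ, i.e., $F(x)=(1-e^{-\theta x})/(1+e^{-\theta x})=\tanh(\theta x/2)$, so that the truncated generator is $\tanh(\theta t/2)/\tanh(\theta/2)$. The function names are hypothetical. For very large θ, `math.tanh` rounds to 1.0 and `math.atanh` then fails, so the sketch is meant for moderate parameter values only.

```python
import math

def psi(t, theta):
    """Assumed generator: truncated half-logistic CDF, tanh(theta*t/2) / tanh(theta/2)."""
    return math.tanh(theta * t / 2.0) / math.tanh(theta / 2.0)

def psi_inv(s, theta):
    """Inverse of psi on [0, 1] (valid for moderate theta)."""
    return (2.0 / theta) * math.atanh(s * math.tanh(theta / 2.0))

def log_psi_slope(t, theta):
    """d/dt log psi(t) = theta / sinh(theta*t); a decreasing slope means psi is log-concave."""
    return theta / math.sinh(theta * t)
```

For example, `psi_inv(psi(0.37, 2.0), 2.0)` should return a value close to 0.37, and `log_psi_slope` should decrease in `t`, which is the numerical counterpart of log-concavity.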
Figure 1 shows the shape of the multiplicative Archimedean generator $\psi_\theta$ for different values of the dependence parameter ($\theta$) and the corresponding Kendall’s tau correlation coefficient values ($\tau$) (to be obtained in Section 3). We note from the plot that as $\theta$ gets smaller, the generator approaches the independence case.
The Archimedean copula corresponding to $\psi_\theta$ is
$$C_\theta(u,v)=\psi_\theta^{-1}\big(\psi_\theta(u)\,\psi_\theta(v)\big),\quad u,v\in I. \qquad (2)$$
Henceforth, we call copula (2) the half-logistic copula. It is interesting to note that this copula is symmetric in $\theta$, i.e., $C_\theta=C_{-\theta}$. This implies that the dependence parameter of this copula is not identifiable over the whole set of real numbers. Thus, the range of $\theta$ will be limited to the non-negative real numbers.
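The symmetry in the parameter and the boundary conditions of a copula can be checked numerically. The sketch below assumes the generator $\psi_\theta(t)=\tanh(\theta t/2)/\tanh(\theta/2)$ (the truncated half-logistic CDF with rate θ); the function name is hypothetical.

```python
import math

def half_logistic_copula(u, v, theta):
    """C(u, v) = psi_inv(psi(u) * psi(v)) for the assumed generator (theta != 0)."""
    a = math.tanh(theta / 2.0)
    y = math.tanh(theta * u / 2.0) * math.tanh(theta * v / 2.0) / a
    return (2.0 / theta) * math.atanh(y)

# Changing the sign of theta leaves the copula unchanged.
value_pos = half_logistic_copula(0.3, 0.7, 2.0)
value_neg = half_logistic_copula(0.3, 0.7, -2.0)
```

Because the hyperbolic tangent is odd, the sign changes cancel, which is why only non-negative parameter values need to be considered.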
The survival (half-logistic) copula and the copula density are, respectively, given by
Figure 2 and Figure 3 exhibit the contour plots of the half-logistic copula and its copula density for different correlation levels in terms of Kendall’s tau correlation coefficient $\tau$. From the graphs, it is noticeable that the half-logistic copula is increasing in the dependence parameter $\theta$ (a mathematical proof is provided in Theorem 4). Also, the contour curves are steeper for small values of $\theta$
, i.e., the copula increases faster with smaller values of the parameter (i.e., weaker correlations). The copula density takes higher values when
and takes its maximum at the point (
). Also, it seems that as the dependence parameter increases (i.e., the correlation increases), the density gets steeper.
Theorem 3.
- i.
The product copula is a limit case of the half-logistic copula when θ approaches zero.
- ii.
The Fréchet–Hoeffding upper bound copula is a limit case of the half-logistic copula when θ tends to infinity.
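The two limiting cases of Theorem 3 can be illustrated numerically on a grid. The sketch assumes the generator $\psi_\theta(t)=\tanh(\theta t/2)/\tanh(\theta/2)$; the grid, the parameter values, and the helper `max_gap` are choices made only for this demonstration. Very large θ is avoided because `math.tanh` rounds to 1.0 there.

```python
import math

def copula(u, v, theta):
    """Half-logistic copula under the assumed generator."""
    a = math.tanh(theta / 2.0)
    y = math.tanh(theta * u / 2.0) * math.tanh(theta * v / 2.0) / a
    return (2.0 / theta) * math.atanh(y)

GRID = [i / 10.0 for i in range(1, 10)]

def max_gap(theta, target):
    """Largest absolute difference between C_theta and a target copula on the grid."""
    return max(abs(copula(u, v, theta) - target(u, v)) for u in GRID for v in GRID)

gap_product = max_gap(1e-3, lambda u, v: u * v)  # theta near zero vs. product copula
gap_upper_10 = max_gap(10.0, min)                # theta = 10 vs. min(u, v)
gap_upper_30 = max_gap(30.0, min)                # theta = 30 vs. min(u, v)
```

In this sketch the gap to the product copula is negligible for small θ, while the gap to the upper bound shrinks as θ grows, though only slowly.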
The Kendall distribution function of $C_\theta$ is
$$K_\theta(t)=t-\frac{e^{\theta t}-e^{-\theta t}}{2\theta}\,\ln\psi_\theta(t),\quad t\in(0,1].$$
The Kendall distribution function is valuable in studying stochastic ordering; see Genest and Rivest [27] and Nelsen et al. [28] for more explanation. Also, some goodness-of-fit tools are based on comparing the Kendall distribution function with the empirical distribution function; see Section 5.
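For an Archimedean copula with additive generator ϕ, the Kendall distribution function is $K(t)=t-\phi(t)/\phi'(t)$. The sketch below evaluates it under the assumption that $\psi_\theta(t)=\tanh(\theta t/2)/\tanh(\theta/2)$ and $\phi=-\ln\psi_\theta$, in which case $\phi(t)/\phi'(t)=\sinh(\theta t)\ln\psi_\theta(t)/\theta$. The function name is hypothetical.

```python
import math

def kendall_distribution(t, theta):
    """K(t) = t - sinh(theta*t) * ln(psi(t)) / theta for the assumed generator."""
    if t <= 0.0:
        return 0.0
    log_psi = math.log(math.tanh(theta * t / 2.0) / math.tanh(theta / 2.0))
    return t - math.sinh(theta * t) * log_psi / theta

# For theta near zero, K(t) should approach t - t*ln(t), the independence case.
k_small_theta = kendall_distribution(0.5, 1e-3)
k_independence = 0.5 - 0.5 * math.log(0.5)
```

Since the logarithm of the generator is non-positive on the unit interval, $K(t)\ge t$, as expected of a Kendall distribution function.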
3. Dependence
In the following, the dependence ordering of the half-logistic copula is investigated. We obtain Kendall’s tau correlation coefficient ($\tau$) in its exact form. This copula is proven to have no tail dependence. The copula density is shown to be totally positive of order two ($\mathrm{TP}_2$), which is a strong positive dependence property that plays an important role in statistics. For more details, see Karlin [29] and Joe [30]. Accordingly, some dependence concepts are guaranteed.
Theorem 4.
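The half-logistic family of copulas is positively ordered, denoted $C_{\theta_1}\prec C_{\theta_2}$, i.e., for any $0\le\theta_1\le\theta_2$, we have $C_{\theta_1}(u,v)\le C_{\theta_2}(u,v)$ for all $u,v$ in I.
For more details on positively ordered copulas, see Nelsen [16].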
Theorem 5.
The half-logistic copula density is $\mathrm{TP}_2$.
Theorem 6.
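Kendall’s tau correlation coefficient for the half-logistic copula is given by
$$\tau_\theta=1-\frac{8}{\theta^2}\ln\cosh\!\left(\frac{\theta}{2}\right).$$
Figure 4 exhibits the plot of Kendall’s tau coefficient for the half-logistic copula. Notice that $\tau$ ranges between 0 and 1, which are attained in the limits $\theta\to 0$ and $\theta\to\infty$, respectively.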
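A closed form for Kendall’s tau can be cross-checked against the general Archimedean identity $\tau=3-4\int_0^1 K(t)\,dt$. The sketch below assumes the generator $\psi_\theta(t)=\tanh(\theta t/2)/\tanh(\theta/2)$, for which integrating $\phi/\phi'$ gives $\tau=1-8\ln\cosh(\theta/2)/\theta^2$. This expression is derived under that assumption and may be written differently in the theorem. The midpoint rule and the function names are illustrative choices.

```python
import math

def kendall_distribution(t, theta):
    """K(t) for the assumed generator psi(t) = tanh(theta*t/2) / tanh(theta/2)."""
    log_psi = math.log(math.tanh(theta * t / 2.0) / math.tanh(theta / 2.0))
    return t - math.sinh(theta * t) * log_psi / theta

def kendall_tau_closed(theta):
    """Closed form derived under the assumed generator."""
    return 1.0 - 8.0 * math.log(math.cosh(theta / 2.0)) / theta ** 2

def kendall_tau_numeric(theta, n=20000):
    """tau = 3 - 4 * integral of K over (0, 1), using the midpoint rule."""
    h = 1.0 / n
    integral = sum(kendall_distribution((i + 0.5) * h, theta) for i in range(n)) * h
    return 3.0 - 4.0 * integral
```

The two functions should agree to several decimal places for moderate θ, and the closed form moves from values near 0 toward 1 as θ increases.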
Unfortunately, Spearman’s rho correlation coefficient for the half-logistic copula is complicated and cannot be obtained in a neat closed form.
The following corollary lists some dependence concepts for the half-logistic copula, which are guaranteed by the total positivity of the density (see Joe [31], p. 366 and Nelsen [16], p. 195).
Corollary 1.
Let X and Y be continuous random variables with half-logistic copula $C_\theta$. Then, X and Y are
- i.
Stochastically increasing, thus left-tail decreasing and right-tail increasing, hence positive quadrant-dependent;
- ii.
$C_\theta$ is $\mathrm{TP}_2$, and the survival copula is also $\mathrm{TP}_2$;
- iii.
, where denotes Spearman’s rho correlation coefficient;
- iv.
.
Theorem 7.
The upper and lower tail dependence coefficients, $\lambda_U$ and $\lambda_L$, for the half-logistic copula are both zero.
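The absence of tail dependence can be illustrated with the usual limit definitions, $\lambda_L=\lim_{u\to 0^+}C(u,u)/u$ and $\lambda_U=\lim_{u\to 1^-}(1-2u+C(u,u))/(1-u)$. The sketch assumes the generator $\psi_\theta(t)=\tanh(\theta t/2)/\tanh(\theta/2)$ and evaluates the ratios close to the corners. The function names and the evaluation points are hypothetical choices.

```python
import math

def copula(u, v, theta):
    """Half-logistic copula under the assumed generator."""
    a = math.tanh(theta / 2.0)
    y = math.tanh(theta * u / 2.0) * math.tanh(theta * v / 2.0) / a
    return (2.0 / theta) * math.atanh(y)

def lower_tail_ratio(u, theta):
    """C(u, u) / u, whose limit as u -> 0 is the lower tail dependence coefficient."""
    return copula(u, u, theta) / u

def upper_tail_ratio(u, theta):
    """(1 - 2u + C(u, u)) / (1 - u), whose limit as u -> 1 is the upper coefficient."""
    return (1.0 - 2.0 * u + copula(u, u, theta)) / (1.0 - u)
```

Both ratios shrink toward zero as the evaluation point approaches the corresponding corner, which is consistent with the theorem.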
4. The Two-Parameter Half-Logistic Copula
In this section, we propose an extension of the half-logistic Archimedean generator $\psi_\theta$ by adding a second parameter to generate a two-parameter half-logistic copula. Although two-parameter families involve more complicated calculations than one-parameter families, they are more practicable in model selection techniques. The reasoning behind this is that they usually include more than one one-parameter family as special cases; thus, testing the fit of the two-parameter family is easier than testing each of the one-parameter families alone. For a better understanding, see Chen and Fan [32]. In our case, besides the one-parameter half-logistic copula, the two-parameter half-logistic copula reduces to the Frank copula in a limiting case.
Consider an adjustment of for a real constant , as . We assure that this Archimedean generator still satisfies the conditions of a multiplicative Archimedean generator, as
is strictly increasing, because its first derivative is positive for all .
is log-concave. As
is decreasing in
s.
and .
Thus,
is indeed a multiplicative Archimedean generator. The corresponding two-parameter half-logistic copula, denoted
, is given by
Theorem 8.
The two-parameter half-logistic copula approaches the Frank copula with parameter θ as .
Figure 5 exhibits the density of the two-parameter half-logistic copula at
, with the special cases of the Frank copula and the half-logistic copula.
Theorem 9.
Kendall’s tau correlation coefficient for the two-parameter half-logistic copula is given bywhere , and is the dilogarithm function. Figure 6 exhibits Kendall’s tau correlation coefficient for different values of the second parameter
and
.
5. Applications
In this section, we will use the half-logistic copula to model two types of data. The first dataset considers the continuous case; the second is a discrete dataset. We chose datasets that were previously studied in the literature using other bivariate distributions (some of which are copulas). Some well-known copulas (such as Frank, Gumbel–Hougaard, Clayton, and Farlie–Gumbel–Morgenstern (FGM)) will also be fitted for the sake of comparison.
For each dataset, some goodness-of-fit methods are used. The Kendall plot is used to check the existence of dependence, as dependence is confirmed in this plot when the points spread far from the diagonal line; see Genest and Boies [33]. Genest and Rivest [8] suggested using the Lambda function as a graphical goodness-of-fit tool. Their technique is to choose, among all the suggested copulas, the one whose Lambda function is closest to the empirical one. Two goodness-of-fit tests are applied. The first test is based on the Cramér–von Mises criterion, while the other is based on the Kolmogorov–Smirnov criterion. Small values of both statistics indicate that the copula is suitable to fit the data; see Genest et al. [34] for more explanation. After that, the dependence parameter is estimated using the maximum likelihood method. The models are compared using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).
5.1. St-Maurice’s River Annual Flow Data
A record of 76 annual flows taken from two stations on the St-Maurice River, Province of Quebec, Canada, during the period of 1951–1988 is considered in this application. These data are freely available through a download tool on the official website of the Government of Canada (https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/, accessed on 1 April 2022). Mesfioui et al. [35] used this dataset to check the performance of their new goodness-of-fit test for copula model selection. The annual flow at Gouin station is assigned to the first variable, and the annual flow at Rapide-blanc station is assigned to the second one. Using the univariate goodness-of-fit test, they chose the log-normal distribution for both of the marginals. They examined seven copula families, which are Gaussian, Clayton, Frank, Durante, Cuadras–Augé, Fréchet, and ordinal sum copulas. In terms of their goodness-of-fit test, all but one (the ordinal sum) of the suggested copulas are suitable to fit the data. The largest p-value corresponded to the Frank copula.
The scatter plot of these data is shown in Figure 7. It seems that there is no special relationship between large (or small) values of the variables, i.e., these data exhibit no upper (or lower) tail dependence. The Spearman’s rho and Kendall’s tau measures of correlation are, respectively,
and
, with both corresponding
p-value
, which indicates a strong positive correlation. This is also confirmed by the K-plot in Figure 8, where the dots lie far above the main diagonal. Therefore, we expect that the half-logistic copula will give a good fit.
We modeled the data with one-parameter, survival, and two-parameter half-logistic copulas. Also, Frank, Clayton, and Gumbel–Hougaard copulas were considered for comparison.
A plot of the Lambda function for the empirical data against the copulas is depicted in Figure 9. In this plot, Gumbel–Hougaard, Frank, and half-logistic copulas give the best matches to the empirical curve. The values of the goodness-of-fit test statistics with their corresponding p-values are presented in Table 1. Our results are the same as those of Mesfioui et al. [35] for the Frank and Clayton copulas. The maximum likelihood estimates of the parameters using the full maximum likelihood method (FML) with the corresponding Akaike information criterion (AIC) and Bayesian information criterion (BIC) values are tabulated in Table 2.
Regarding the p-values of the goodness-of-fit tests, it seems that all the models are acceptable. AIC and BIC criteria reveal that Frank and half-logistic outperform other models.
5.2. Shunters’ Accidents Data
These data were first discussed by Arbous and Kerrich [36]. The data consist of 122 shunters’ major accidents on the South African Railways recorded in the period 1937–1947, inclusive. Arbous and Kerrich [36] hypothesized that most of the accidents happened to a small proportion of the employees. So, after a period of time from the beginning of the study, they omitted the number of accidents corresponding to those individuals with the highest numbers of accidents and examined whether the annual number of accidents would be significantly reduced in the following years. The first variable represents the number of accidents in the first six years, and the second variable represents the number of accidents in the last five years. A plot of these data is presented in Figure 10. The sample mean and sample variance for each period are, respectively,
and
. The correlation between the occurrence of accidents in the two successive periods is 0.28. They showed that for each period’s dataset, a Poisson distribution gives a good fit for the data, but the Poisson distribution is not adequate to fit the whole dataset over the 11-year period. Kocherlakota and Kocherlakota [37], in their book, modeled the data by the Marshall–Olkin model with negative binomial marginals and obtained the MLEs. Barbiero [38] concluded that discrete Weibull distributions are more suitable to model the marginals. Thus, he fitted the data with the FGM copula using discrete Weibull margins.
We used the truncated Poisson, half-logistic, survival half-logistic, and two-parameter half-logistic copulas to fit these data. Also, we added the Frank copula for comparison. We used discrete Weibull margins, which are given by
and
, to compare our results with those of Barbiero [38].
The estimated parameters of the discrete Weibull margins, obtained using the inference functions for margins (IFM) method, are
and
. Other estimation results using IFM are listed in Table 3.
In terms of AIC and BIC values, the half-logistic and the Frank models with discrete Weibull margins seem to give the best fits. Also, one can notice that the truncated Poisson copula fits poorly, which is expected since the truncated Poisson copula is only suitable for datasets with smaller correlations.
Figure 11 exhibits the cumulative distribution functions of these two models with the empirical one. In most areas of the graph, the Frank copula is beneath the half-logistic copula, except in a small area near the point
. The surfaces of the half-logistic and the Frank copulas alternately go up and down around the surface of the empirical cumulative distribution.
To assess the fits of our models, we calculate the mean absolute distance and the mean squared distance between their expected probabilities and the observed probabilities. The results, multiplied by one hundred, are shown in Table 4.
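The two distance summaries can be computed as in the following sketch. The function name and the toy probability vectors are hypothetical and are not taken from the paper’s tables. The code only illustrates the calculation of the mean absolute and mean squared distances, each scaled by one hundred.

```python
def mean_distances(expected, observed):
    """Return (100 * mean absolute distance, 100 * mean squared distance)."""
    if len(expected) != len(observed) or not expected:
        raise ValueError("expected and observed must be non-empty and of equal length")
    diffs = [e - o for e, o in zip(expected, observed)]
    n = len(diffs)
    mean_abs = sum(abs(d) for d in diffs) / n
    mean_sq = sum(d * d for d in diffs) / n
    return 100.0 * mean_abs, 100.0 * mean_sq

# Toy cell probabilities, for illustration only.
expected_probs = [0.5, 0.3, 0.2]
observed_probs = [0.4, 0.4, 0.2]
mad_x100, msd_x100 = mean_distances(expected_probs, observed_probs)
```

Smaller values of either summary indicate that the fitted model’s cell probabilities are closer to the observed relative frequencies.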