Article

Estimation of Pianka Overlapping Coefficient for Two Exponential Distributions

by Suad Alhihi 1,† and Maalee Almheidat 2,*,†
1 Department of Mathematics, Al-Balqa Applied University, Alsalt 19117, Jordan
2 Department of Mathematics, University of Petra, Amman 11196, Jordan
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2023, 11(19), 4152; https://doi.org/10.3390/math11194152
Submission received: 29 August 2023 / Revised: 24 September 2023 / Accepted: 27 September 2023 / Published: 2 October 2023
(This article belongs to the Special Issue Probability, Statistics and Random Processes)

Abstract

Overlapping coefficients (OVL) are commonly used to estimate the similarity between populations in terms of their density functions. In this paper, we consider Pianka’s overlap coefficient for two exponential populations. The methods for statistical inference of Pianka’s coefficient are presented. The bias and mean square error (MSE) of the maximum likelihood estimator (MLE) and the Bayes estimator of Pianka’s overlap coefficient are investigated by simulation. Confidence intervals for Pianka’s overlap measure are constructed.

1. Introduction

Overlapping coefficients (OVL) are measures of how similar two populations are; this similarity is a function that assigns a real number between 0 and 1, where a value of zero indicates that the distributions are completely different and a value of one indicates that they are identical. There are many overlapping coefficients in the literature, including measures of overlap that determine the percentage of area that the two distributions have in common [1]. Gini and Livada [2] first introduced the idea of overlapping in 1943. Matusita’s coefficient [3] was introduced to calculate the significant distance between two probability density functions, and has applications in several practical areas, including reliability analysis and clinical research [4,5]. Matusita developed a discrete version known as the Freeman–Tukey (FT) measure, which is related to the Hellinger distance [6,7] and the delta method [8]. The Chi-Squared measure [9] and Hellinger measure [10] play key roles in information theory, statistics, learning, signal processing, and other theoretical and applied branches of mathematics [11,12]. Morisita’s coefficient [13] was proposed as an index of similarity between two communities. Weitzman’s coefficient [14], primarily used to compare income distributions, was defined as the region where the curves of two probability distributions intersect. Kullback and Leibler [15] introduced the Kullback–Leibler measure, which measures the gain in information between two distributions and has been widely used in the literature on data mining. Jeffreys [16] introduced and studied a divergence measure called the Jeffreys distance, which is regarded as a symmetrization of the Kullback–Leibler measure. For a comprehensive review of various divergence measures, see [17,18,19].
The OVL coefficients are used in various fields, such as ecological processes [20], statistical ecology [21], clinical trials [5], data fusion [22], information processing [23], applied statistics [24], economics [25], and others.
Inference for OVL measures has been investigated by several researchers under normal, Weibull, and exponential distributions. In 2005, Al-Saidy et al. [26] presented inference for three OVL coefficients for two Weibull distributions with the same shape parameter and different scale parameters.
Al-Saleh and Samawi [27] used bootstrap and Taylor series approximation to investigate the interval estimation of three OVL coefficients for two exponential distributions with different means. Samawi and Al-Saleh [28] studied three OVL coefficients for two exponential distributions and estimated them using ranked set sampling. Hamza et al. [29] proposed a new OVL coefficient based on the Kullback–Leibler measure for two exponential distributions. Sibil et al. [30] investigated both interval estimation and hypothesis testing for the OVL coefficients for one- and two-parameter exponential distributions using the concept of a generalized pivotal quantity.
Pianka’s overlap coefficient is used to assess the similarity of resource use by two species [31]. Ecologists use it as a summary measure and as a basis for inference, typically about competition for resources.
Pianka’s overlap also appears in studies of mechanisms that favour morphological co-occurrence; Vieira and Port [32] evaluated Pianka’s overlap between two species along three main niche dimensions: habitat, food, and time. Jacqueline et al. [33] calculated dietary overlap between foxes and dingoes using Pianka’s index. Sa-Oliveira et al. [34] investigated diet and niche breadth in fish communities, estimating niche breadth and overlap using the Levins index and Pianka’s measure.
In this paper, we consider Pianka’s OVL coefficient ($\rho$) between two exponential distributions. We determine both the limiting and exact distributions of the maximum likelihood estimator (MLE) of $\rho$. We study the MLE and the Bayes estimator and compare their efficiency. In addition, we consider interval estimation of $\rho$ using the asymptotic technique and the transformation technique, and compare the effectiveness of the two techniques.

2. General Setting and Definition of the Pianka Overlap Measure

Let $f_1(x)$ and $f_2(x)$ be two continuous probability density functions. Pianka’s overlap measure is defined as follows [31]:
$$\rho(f_1, f_2) = \frac{\int f_1(x)\, f_2(x)\, dx}{\sqrt{\int f_1^2(x)\, dx \int f_2^2(x)\, dx}}. \tag{1}$$
If a random variable $X$ follows the exponential distribution, then the respective cdf and pdf of $X$ are given by
$$F(x) = 1 - e^{-x/\theta}, \quad x > 0;\ \theta > 0 \qquad \text{and} \qquad f(x) = \frac{1}{\theta}\, e^{-x/\theta}, \quad x > 0;\ \theta > 0,$$
and $X$ is denoted by $Exp(\theta)$.
Now, let $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_m)$ be two independent random samples taken from $Exp(\theta_1)$ and $Exp(\theta_2)$, respectively. Then, Pianka’s overlap coefficient $\rho$ between the two exponential distributions, as defined in Equation (1), is given by
$$\rho = \rho(\theta_1, \theta_2) = \frac{\int_0^\infty \frac{1}{\theta_1} e^{-x/\theta_1}\, \frac{1}{\theta_2} e^{-x/\theta_2}\, dx}{\sqrt{\int_0^\infty \frac{1}{\theta_1^2} e^{-2x/\theta_1}\, dx \int_0^\infty \frac{1}{\theta_2^2} e^{-2x/\theta_2}\, dx}} = \frac{2\sqrt{\theta_1 \theta_2}}{\theta_1 + \theta_2}, \quad \theta_1 > 0,\ \theta_2 > 0. \tag{2}$$
Let $k = \theta_1/\theta_2$. Then, Pianka’s OVL coefficient in (2) can be written as a function of $k$, as follows:
$$\rho = \rho(k) = \frac{2\sqrt{k}}{k + 1}, \quad k > 0. \tag{3}$$
Several properties of $\rho(k)$ are provided in the following lemma.
Lemma 1.
For $\rho$ defined in (3):
1. $0 \le \rho(k) \le 1$ for all $k > 0$
2. $\rho(k) = 1$ iff $k = 1$, i.e., $\theta_1 = \theta_2$
3. $\rho(k) > 0$, since $\theta_1, \theta_2 > 0$
4. $\rho(k) = \rho(1/k)$
5. $\rho(k)$ is monotonically increasing for $k < 1$ and decreasing for $k > 1$, with a maximum of $\rho(k)$ at $k = 1$.
Proof.
These results follow directly from the formula for Pianka’s overlap coefficient in (3). □
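The closed forms in Equations (2) and (3) are easy to check numerically. The following is a minimal Python sketch; the function names `pianka_exponential` and `pianka_ratio` are illustrative choices and do not come from the paper.

```python
import math


def pianka_exponential(theta1: float, theta2: float) -> float:
    """Pianka's overlap between Exp(theta1) and Exp(theta2), as in Equation (2)."""
    if theta1 <= 0 or theta2 <= 0:
        raise ValueError("scale parameters must be positive")
    return 2.0 * math.sqrt(theta1 * theta2) / (theta1 + theta2)


def pianka_ratio(k: float) -> float:
    """Pianka's overlap as a function of k = theta1 / theta2, as in Equation (3)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 2.0 * math.sqrt(k) / (k + 1.0)


if __name__ == "__main__":
    # Print rho(k) next to rho(1/k) to illustrate the symmetry in Lemma 1.
    for k in (0.2, 0.5, 0.8, 1.0, 5.0):
        print(k, round(pianka_ratio(k), 4), round(pianka_ratio(1.0 / k), 4))
```

Evaluating `pianka_ratio` at k = 0.2, 0.5, and 0.8 gives the values of ρ quoted in the captions of the simulation tables.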
Figure 1 shows the plot of Pianka’s overlap coefficient between two exponential distributions as a function of $k$, where $k < 1$.
In the following section, we find the maximum likelihood estimator of Pianka’s overlap coefficient $\rho$, namely, $\hat{\rho}_{MLE}$, along with its distribution. In addition, we investigate the limiting distribution of $\hat{\rho}_{MLE}$.

3. Maximum Likelihood Estimator of $\rho(\theta_1, \theta_2)$

It is known that the MLEs of $\theta_1$ and $\theta_2$ based on two samples taken from $Exp(\theta_1)$ and $Exp(\theta_2)$ are given by $\hat{\theta}_1 = \bar{X}$ and $\hat{\theta}_2 = \bar{Y}$, respectively. From the basic properties of the exponential distribution, we have $\hat{\theta}_1 \sim Gamma(n, \theta_1/n)$ with $Var(\hat{\theta}_1) = \theta_1^2/n$, and $\hat{\theta}_2 \sim Gamma(m, \theta_2/m)$ with $Var(\hat{\theta}_2) = \theta_2^2/m$, where $Gamma(\alpha, \beta)$ stands for the gamma distribution with shape parameter $\alpha$ and scale parameter $\beta$. It follows that $(\hat{\theta}_1, \hat{\theta}_2)$ is a complete minimal sufficient statistic for $(\theta_1, \theta_2)$. Thus, from the invariance property of the MLE, the MLE of $\rho(\theta_1, \theta_2)$ is
$$\hat{\rho}_{MLE} = \rho(\hat{\theta}_1, \hat{\theta}_2) = \frac{2\sqrt{\bar{x}\,\bar{y}}}{\bar{x} + \bar{y}}. \tag{4}$$
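As a small illustration of Equation (4), the sketch below computes the MLE from two samples in Python. The function name `pianka_mle`, the sample sizes, and the seed are assumptions made for this example only.

```python
import math
import random


def pianka_mle(x, y):
    """MLE of Pianka's overlap from two exponential samples, as in Equation (4)."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    xbar = sum(x) / len(x)  # MLE of theta1
    ybar = sum(y) / len(y)  # MLE of theta2
    return 2.0 * math.sqrt(xbar * ybar) / (xbar + ybar)


if __name__ == "__main__":
    rng = random.Random(2023)  # arbitrary seed, for reproducibility only
    # random.expovariate takes the rate 1/theta, not the mean theta.
    x = [rng.expovariate(1.0 / 2.0) for _ in range(50)]
    y = [rng.expovariate(1.0 / 10.0) for _ in range(50)]
    print(pianka_mle(x, y))
```

Only the two sample means enter the estimator, which is consistent with $(\hat{\theta}_1, \hat{\theta}_2)$ being sufficient.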

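3.1. Limiting Distribution of $\hat{\rho}_{MLE}$

The following theorem shows that the limiting distribution of the MLE of Pianka’s overlap coefficient for two exponential distributions with different scale parameters is the normal distribution, using $N(\mu, \sigma^2)$ to denote the normal distribution with location parameter $\mu$ and scale parameter $\sigma$.
Theorem 1.
Let $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_m)$ be two independent random samples from $Exp(\theta_1)$ and $Exp(\theta_2)$, respectively, with $\theta_1 \neq \theta_2$. Then, the asymptotic distribution of $\hat{\rho}_{MLE}$ is
$$\sqrt{\frac{nm}{n+m}}\,\bigl(\hat{\rho}_{MLE} - \rho(\theta_1, \theta_2)\bigr) \xrightarrow{d} N\!\left(0,\ \frac{\theta_1 \theta_2 (\theta_1 - \theta_2)^2}{(\theta_1 + \theta_2)^4}\right), \quad n, m \to \infty.$$
Proof.
Using the asymptotic properties of the MLE and the multivariate delta method ($\delta$-method), we have
$$\sqrt{n}\,(\hat{\theta}_1 - \theta_1) \xrightarrow{d} N\bigl(0,\ I_1^{-1}(\theta_1)\bigr), \quad n \to \infty.$$
That is,
$$\sqrt{n}\,(\hat{\theta}_1 - \theta_1) \xrightarrow{d} N(0,\ \theta_1^2), \quad n \to \infty \qquad \text{and} \qquad \sqrt{m}\,(\hat{\theta}_2 - \theta_2) \xrightarrow{d} N(0,\ \theta_2^2), \quad m \to \infty,$$
where $I_1(\theta) = 1/\theta^2$ is the Fisher information of a single observation and $I_1^{-1}(\theta) = \theta^2$ is its inverse.
We want to find the asymptotic distribution of $\hat{\rho}_{MLE} = \rho(\hat{\theta}_1, \hat{\theta}_2)$ as $n, m \to \infty$.
Using the fact that $\hat{\theta}_1 \xrightarrow{P} \theta_1$ and $\hat{\theta}_2 \xrightarrow{P} \theta_2$, together with the continuous mapping theorem, we obtain
$$\hat{\rho}_{MLE} = \frac{2\sqrt{\hat{\theta}_1 \hat{\theta}_2}}{\hat{\theta}_1 + \hat{\theta}_2} \xrightarrow{P} \frac{2\sqrt{\theta_1 \theta_2}}{\theta_1 + \theta_2} = \rho(\theta_1, \theta_2).$$
Because $\hat{\rho}_{MLE}$ is a function of $\hat{\theta}_1$ and $\hat{\theta}_2$, using an alternative form of the multivariate $\delta$-method [35] we obtain $E(\hat{\rho}_{MLE}) \approx \rho(\theta_1, \theta_2)$ and
$$Var(\hat{\rho}_{MLE}) \approx Var(\hat{\theta}_1) \left(\frac{\partial \rho(\theta_1, \theta_2)}{\partial \theta_1}\right)^2 + Var(\hat{\theta}_2) \left(\frac{\partial \rho(\theta_1, \theta_2)}{\partial \theta_2}\right)^2$$
$$= \frac{\theta_1^2}{n} \left(\frac{\theta_2 (\theta_2 - \theta_1)}{\sqrt{\theta_1 \theta_2}\,(\theta_1 + \theta_2)^2}\right)^2 + \frac{\theta_2^2}{m} \left(\frac{\theta_1 (\theta_1 - \theta_2)}{\sqrt{\theta_1 \theta_2}\,(\theta_1 + \theta_2)^2}\right)^2$$
$$= \frac{(n + m)\, \theta_1 \theta_2 (\theta_1 - \theta_2)^2}{n m\, (\theta_1 + \theta_2)^4}.$$
Therefore, the asymptotic distribution of $\hat{\rho}_{MLE}$ is
$$\sqrt{\frac{nm}{n+m}}\,\bigl(\hat{\rho}_{MLE} - \rho(\theta_1, \theta_2)\bigr) \xrightarrow{d} N\!\left(0,\ \frac{\theta_1 \theta_2 (\theta_1 - \theta_2)^2}{(\theta_1 + \theta_2)^4}\right).$$
□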

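3.2. The Exact Distribution of $\hat{\rho}_{MLE}$

To ease the derivation of the distribution of $\hat{\rho}_{MLE}$, we can rewrite Equation (4) as follows:
$$\hat{\rho}_{MLE} = \frac{2}{\sqrt{nm}\left(\frac{1}{n}\sqrt{\frac{V}{W}} + \frac{1}{m}\sqrt{\frac{W}{V}}\right)}, \tag{5}$$
where $V = \sum_{i=1}^{n} X_i \sim Gamma(n, \theta_1)$ and $W = \sum_{i=1}^{m} Y_i \sim Gamma(m, \theta_2)$. Now, we apply the following steps.
Step 1. Find the pdf of $H = \sqrt{V/W}$ by considering the following transformations.
Let $H_1 = \sqrt{V/W}$ and $H_2 = \sqrt{W}$; then, $V = H_1^2 H_2^2$ and $W = H_2^2$. The absolute value of the Jacobian of this transformation is $|J| = 4 h_1 h_2^3$.
Thus, the joint pdf of $H_1$ and $H_2$ is
$$f_{H_1, H_2}(h_1, h_2) = f_{V, W}(v = h_1^2 h_2^2,\ w = h_2^2)\, |J|$$
$$= \frac{(h_1^2 h_2^2)^{n-1} e^{-h_1^2 h_2^2/\theta_1}}{\Gamma(n)\, \theta_1^n} \cdot \frac{(h_2^2)^{m-1} e^{-h_2^2/\theta_2}}{\Gamma(m)\, \theta_2^m} \cdot (4 h_1 h_2^3)$$
$$= \frac{4\, h_1^{2n-1}\, h_2^{2n+2m-1}\, e^{-h_2^2 \left(\frac{h_1^2}{\theta_1} + \frac{1}{\theta_2}\right)}}{\Gamma(n)\, \Gamma(m)\, \theta_1^n\, \theta_2^m}, \quad h_1 > 0,\ h_2 > 0;\ \theta_1 > 0,\ \theta_2 > 0.$$
By integrating $h_2$ out, the pdf of $H_1$ is
$$f_{H_1}(h_1) = \int_0^\infty \frac{4\, h_1^{2n-1}\, h_2^{2n+2m-1}\, e^{-h_2^2 \left(\frac{h_1^2}{\theta_1} + \frac{1}{\theta_2}\right)}}{\Gamma(n)\, \Gamma(m)\, \theta_1^n\, \theta_2^m}\, dh_2 = \frac{c\, h_1^{2n-1}}{(\theta_1 + \theta_2 h_1^2)^{n+m}}, \quad h_1 > 0;\ \theta_1 > 0,\ \theta_2 > 0.$$
Consequently, the pdf of $H$ is
$$f_H(h) = \frac{c\, h^{2n-1}}{(\theta_1 + \theta_2 h^2)^{n+m}}, \quad h > 0;\ \theta_1 > 0,\ \theta_2 > 0,$$
where $c = \dfrac{2\, \Gamma(n+m)\, \theta_1^m\, \theta_2^n}{\Gamma(n)\, \Gamma(m)}$.
Step 2. Solve $\hat{\rho}_{MLE}$ for $h$.
From Equation (5) and the transformation $H = \sqrt{V/W}$, we have
$$\hat{\rho}_{MLE} = \frac{2}{\sqrt{nm}\left(\frac{h}{n} + \frac{1}{m h}\right)} = \frac{2h}{\sqrt{nm}\left(\frac{h^2}{n} + \frac{1}{m}\right)}.$$
Now, let $\hat{\rho}_{MLE} = R$, allowing $R = \dfrac{2h}{\sqrt{nm}\left(\frac{h^2}{n} + \frac{1}{m}\right)}$ to be rewritten as the quadratic equation
$$m\sqrt{nm}\, R\, h^2 - 2nm\, h + n\sqrt{nm}\, R = 0. \tag{6}$$
The two solutions of Equation (6) are
$$U_1 = \frac{\sqrt{nm}}{mR}\left(1 - \sqrt{1 - R^2}\right),\ U_1 > 0 \qquad \text{and} \qquad U_2 = \frac{\sqrt{nm}}{mR}\left(1 + \sqrt{1 - R^2}\right),\ U_2 > 0.$$
Step 3. The pdf of $R = \hat{\rho}_{MLE}$ is
$$f_R(r) = f_H(u_1) \left|\frac{\partial u_1}{\partial r}\right| + f_H(u_2) \left|\frac{\partial u_2}{\partial r}\right|$$
$$= \frac{c}{r\sqrt{1 - r^2}} \left[ \frac{\left(\sqrt{\frac{n}{m}}\, \frac{1 - \sqrt{1 - r^2}}{r}\right)^{2n}}{\left(\theta_1 + \frac{n\, \theta_2 \left(1 - \sqrt{1 - r^2}\right)^2}{m\, r^2}\right)^{n+m}} + \frac{\left(\sqrt{\frac{n}{m}}\, \frac{1 + \sqrt{1 - r^2}}{r}\right)^{2n}}{\left(\theta_1 + \frac{n\, \theta_2 \left(1 + \sqrt{1 - r^2}\right)^2}{m\, r^2}\right)^{n+m}} \right], \quad 0 < r < 1;\ \theta_1 > 0,\ \theta_2 > 0.$$
Figure 2 shows different plots of the density of $\hat{\rho}_{MLE}$ for $(n, m) = (30, 50)$. Based on the figure, the pdf of $\hat{\rho}_{MLE}$ can be bell-shaped, bimodal, or J-shaped.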
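The density obtained in Step 3 can be evaluated numerically. The sketch below is one possible Python implementation, written under the assumption that the density is the sum of the two branch contributions $f_H(u_i)\,|\partial u_i/\partial r|$ with $|\partial u_i/\partial r| = u_i/(r\sqrt{1-r^2})$. The names `exact_pdf_mle` and `integrate_pdf` are illustrative, and logarithms are used only to avoid overflow in the gamma functions.

```python
import math


def exact_pdf_mle(r, n, m, theta1, theta2):
    """Density of R = rho_hat_MLE at r, summing the branches u1(r) and u2(r)."""
    if not 0.0 < r < 1.0:
        return 0.0
    s = math.sqrt(1.0 - r * r)
    scale = math.sqrt(n / m)
    # Roots of the quadratic in h; r / (1 + s) equals (1 - s) / r but avoids cancellation.
    u1 = scale * r / (1.0 + s)
    u2 = scale * (1.0 + s) / r
    # log of c = 2 * Gamma(n + m) * theta1^m * theta2^n / (Gamma(n) * Gamma(m))
    log_c = (math.log(2.0) + math.lgamma(n + m) - math.lgamma(n) - math.lgamma(m)
             + m * math.log(theta1) + n * math.log(theta2))
    total = 0.0
    for u in (u1, u2):
        # f_H(u) * |du/dr| = c * u^(2n) / ((theta1 + theta2 * u^2)^(n + m) * r * s)
        log_term = (log_c + 2 * n * math.log(u)
                    - (n + m) * math.log(theta1 + theta2 * u * u)
                    - math.log(r) - math.log(s))
        total += math.exp(log_term)
    return total


def integrate_pdf(n, m, theta1, theta2, steps=5000):
    """Midpoint rule for the integral of the density over (0, 1), using r = sin(t)."""
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        # dr = cos(t) dt removes the integrable singularity at r = 1.
        total += exact_pdf_mle(math.sin(t), n, m, theta1, theta2) * math.cos(t) * h
    return total


if __name__ == "__main__":
    for thetas in ((2.0, 10.0), (5.0, 10.0), (8.0, 10.0)):
        print(thetas, integrate_pdf(30, 50, *thetas))
```

A quick sanity check is that the density integrates to approximately one over $(0, 1)$ for each parameter setting.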

4. Interval Estimation of $\rho(\theta_1, \theta_2)$

In this section, we find interval estimation of Pianka’s overlap coefficient ρ by considering both asymptotic and transformation techniques; later, in Section 6, we perform a Monte Carlo analysis to compare the effectiveness of these two different approaches.

4.1. Asymptotic Technique

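A large-sample confidence interval for $\rho(\theta_1, \theta_2)$ can be easily calculated. From Theorem 1 and the continuous mapping theorem, we have
$$\frac{\hat{\theta}_1 \hat{\theta}_2 (\hat{\theta}_1 - \hat{\theta}_2)^2}{(\hat{\theta}_1 + \hat{\theta}_2)^4} \xrightarrow{p} \frac{\theta_1 \theta_2 (\theta_1 - \theta_2)^2}{(\theta_1 + \theta_2)^4}.$$
Hence, a $100(1 - \alpha)\%$ large-sample confidence interval for $\rho(\theta_1, \theta_2)$ is
$$\left( \hat{\rho}_{MLE} - Z_{1 - \frac{\alpha}{2}} \sqrt{\frac{(n + m)\, \hat{\theta}_1 \hat{\theta}_2 (\hat{\theta}_1 - \hat{\theta}_2)^2}{n m\, (\hat{\theta}_1 + \hat{\theta}_2)^4}},\ \ \hat{\rho}_{MLE} + Z_{1 - \frac{\alpha}{2}} \sqrt{\frac{(n + m)\, \hat{\theta}_1 \hat{\theta}_2 (\hat{\theta}_1 - \hat{\theta}_2)^2}{n m\, (\hat{\theta}_1 + \hat{\theta}_2)^4}} \right),$$
where $Z_\gamma$ is the $\gamma$th percentile of the standard normal distribution.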
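The large-sample interval can be computed directly from the two sample means. The following Python sketch uses `statistics.NormalDist` from the standard library for the normal quantile; the function name `asymptotic_ci` is an illustrative choice, not the authors' code.

```python
import math
from statistics import NormalDist


def asymptotic_ci(theta1_hat, theta2_hat, n, m, alpha=0.05):
    """Large-sample interval for rho based on the delta-method variance."""
    t1, t2 = theta1_hat, theta2_hat
    rho_hat = 2.0 * math.sqrt(t1 * t2) / (t1 + t2)
    # Plug-in estimate of the asymptotic variance of rho_hat
    var_hat = (n + m) * t1 * t2 * (t1 - t2) ** 2 / (n * m * (t1 + t2) ** 4)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(var_hat)
    return rho_hat - half_width, rho_hat + half_width


if __name__ == "__main__":
    print(asymptotic_ci(2.0, 10.0, 50, 50))
```

Because the interval is symmetric around $\hat{\rho}_{MLE}$ and is not truncated, its upper limit can exceed one when the estimate is close to one. Note also that the estimated variance is zero when the two sample means coincide.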

4.2. Transformation Technique

From Equation (3), $\rho(k) = \dfrac{2\sqrt{k}}{k + 1}$, where the MLE of $k$ is $\hat{k} = \hat{\theta}_1/\hat{\theta}_2$. From Section 3 and the relationship between the gamma distribution and the chi-square distribution, it is easy to conclude that $\dfrac{2n\hat{\theta}_1}{\theta_1} \sim \chi^2_{(2n)}$ and $\dfrac{2m\hat{\theta}_2}{\theta_2} \sim \chi^2_{(2m)}$; thus, $\dfrac{\hat{\theta}_2}{\hat{\theta}_1} \cdot \dfrac{\theta_1}{\theta_2}$ has an F-distribution with $(2m, 2n)$ degrees of freedom.
Let $L$ and $U$ be the lower and upper confidence limits for $k$, respectively; from the concept of the confidence interval, we have
$$1 - \alpha = \Pr\left( F_{\frac{\alpha}{2},\, 2m,\, 2n} < \frac{\hat{\theta}_2}{\hat{\theta}_1}\, k < F_{1 - \frac{\alpha}{2},\, 2m,\, 2n} \right). \tag{7}$$
By solving (7) for $k$, we obtain the values of $L$ and $U$ as $L = \dfrac{\hat{\theta}_1}{\hat{\theta}_2}\, F_{\frac{\alpha}{2},\, 2m,\, 2n}$ and $U = \dfrac{\hat{\theta}_1}{\hat{\theta}_2}\, F_{1 - \frac{\alpha}{2},\, 2m,\, 2n}$.
Because the overlap coefficient $\rho(k)$ is not a monotone function of $k$, the transformed endpoints are ordered by taking their minimum and maximum. Using the transformation technique, we obtain a $100(1 - \alpha)\%$ confidence interval for $\rho$, as follows:
$$\left( \min\left\{ \frac{2\sqrt{L}}{L + 1},\ \frac{2\sqrt{U}}{U + 1} \right\},\ \ \max\left\{ \frac{2\sqrt{L}}{L + 1},\ \frac{2\sqrt{U}}{U + 1} \right\} \right),$$
where $F_{\gamma, r_1, r_2}$ is the $\gamma$th percentile of the F-distribution with $(r_1, r_2)$ degrees of freedom.
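The transformation interval can be sketched in Python as follows. The Python standard library has no F quantile function, so this sketch assumes that the two percentiles $F_{\alpha/2, 2m, 2n}$ and $F_{1-\alpha/2, 2m, 2n}$ are supplied by the caller (for example, from tables or a statistics package). The function names and the numerical quantiles in the example are illustrative assumptions.

```python
import math


def rho_of_k(k):
    """Pianka's overlap as a function of k = theta1 / theta2, Equation (3)."""
    return 2.0 * math.sqrt(k) / (k + 1.0)


def transformation_ci(theta1_hat, theta2_hat, f_lower, f_upper):
    """Transformation interval for rho.

    f_lower and f_upper are the alpha/2 and 1 - alpha/2 percentiles of the
    F-distribution with (2m, 2n) degrees of freedom, supplied by the caller.
    """
    k_hat = theta1_hat / theta2_hat
    lower_k = k_hat * f_lower  # L in the text
    upper_k = k_hat * f_upper  # U in the text
    a, b = rho_of_k(lower_k), rho_of_k(upper_k)
    return min(a, b), max(a, b)


if __name__ == "__main__":
    # Approximate 2.5% and 97.5% points of F(40, 40), i.e. n = m = 20 (table values).
    print(transformation_ci(2.0, 10.0, 0.532, 1.88))
```

This follows the formula as stated in the text. When the interval $(L, U)$ for $k$ contains one, $\rho(k)$ reaches its maximum of one inside that interval, so the endpoint-based upper limit lies below that maximum; the sketch does not adjust for this case.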

5. Bayes Estimator of $\rho(\theta_1, \theta_2)$

Let $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_m)$ be two independent random samples taken from $Exp(\theta_1)$ and $Exp(\theta_2)$, respectively. Let $V = \sum X_i$, $W = \sum Y_i$, $\theta_1 \sim InvGamma(a, b)$, and $\theta_2 \sim InvGamma(c, d)$, where $InvGamma(\cdot, \cdot)$ is the inverse gamma distribution.
Using the fact that $V \sim Gamma(n, \theta_1)$ and $W \sim Gamma(m, \theta_2)$, the posterior distribution of $\theta = (\theta_1, \theta_2)$ given $V, W$ is
$$\pi(\theta_1, \theta_2 \mid v, w) = \frac{f_V(v \mid \theta_1)\, p_1(\theta_1)\, f_W(w \mid \theta_2)\, p_2(\theta_2)}{\int_0^\infty \int_0^\infty f_V(v \mid \theta_1)\, p_1(\theta_1)\, f_W(w \mid \theta_2)\, p_2(\theta_2)\, d\theta_1\, d\theta_2} = \frac{g(\theta_1, \theta_2)}{\int_0^\infty \int_0^\infty g(\theta_1, \theta_2)\, d\theta_1\, d\theta_2},$$
with
$$g(\theta_1, \theta_2) = \frac{v^{n-1} e^{-v/\theta_1}}{\Gamma(n)\, \theta_1^n} \cdot \frac{\theta_1^{-a-1}\, b^a\, e^{-b/\theta_1}}{\Gamma(a)} \cdot \frac{w^{m-1} e^{-w/\theta_2}}{\Gamma(m)\, \theta_2^m} \cdot \frac{\theta_2^{-c-1}\, d^c\, e^{-d/\theta_2}}{\Gamma(c)},$$
so that
$$\pi(\theta_1, \theta_2 \mid v, w) \propto \theta_1^{-(n+a)-1}\, e^{-\frac{1}{\theta_1}(v + b)}\ \theta_2^{-(m+c)-1}\, e^{-\frac{1}{\theta_2}(w + d)}, \quad \theta_1 > 0,\ \theta_2 > 0;\ a > 0,\ b > 0,\ c > 0,\ d > 0,$$
where $p_1(\theta_1)$ and $p_2(\theta_2)$ are the prior probability distributions of $\theta_1$ and $\theta_2$, respectively.
Then,
$$\pi(\theta \mid v, w) = \pi_1(\theta_1 \mid v)\, \pi_2(\theta_2 \mid w),$$
where $\theta_1 \mid v \sim InvGamma(n + a,\ v + b)$ and $\theta_2 \mid w \sim InvGamma(m + c,\ w + d)$.
The Bayes estimator $\hat{\rho}_{Bayes}$ is
$$\hat{\rho}_{Bayes} = \int\!\!\int \rho(\theta_1, \theta_2)\, \pi(\theta_1, \theta_2 \mid v, w)\, d\theta_1\, d\theta_2$$
$$= \int_0^\infty\!\! \int_0^\infty \frac{2\sqrt{\theta_1 \theta_2}}{\theta_1 + \theta_2} \cdot \frac{\theta_1^{-(n+a)-1}\, e^{-\frac{1}{\theta_1}(v+b)}\, (v+b)^{n+a}}{\Gamma(n+a)} \cdot \frac{\theta_2^{-(m+c)-1}\, e^{-\frac{1}{\theta_2}(w+d)}\, (w+d)^{m+c}}{\Gamma(m+c)}\, d\theta_1\, d\theta_2$$
$$= \frac{2\, (v+b)^{n+a}\, (w+d)^{m+c}}{\Gamma(n+a)\, \Gamma(m+c)} \int_0^\infty\!\! \int_0^\infty \frac{\theta_1^{-(n+a)-\frac{1}{2}}\, \theta_2^{-(m+c)-\frac{1}{2}}\, e^{-\frac{1}{\theta_2}(w+d)}\, e^{-\frac{1}{\theta_1}(v+b)}}{\theta_1 + \theta_2}\, d\theta_1\, d\theta_2.$$
The above estimator does not have a simple closed form; thus, we obtain it numerically. For the asymptotic distribution of the Bayes estimator $\hat{\rho}_{Bayes}$, the Bernstein–von Mises theorem [36] implies that the Bayes estimator and the maximum likelihood estimator are asymptotically equivalent for large sample sizes.
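One simple way to approximate the double integral is Monte Carlo sampling from the two independent inverse gamma posteriors. This is a substituted technique chosen for illustration; the text does not say which numerical method was used. The function name `bayes_estimate_mc`, the number of draws, the seed, and the hyperparameter values in the example are all assumptions.

```python
import math
import random


def bayes_estimate_mc(v, w, n, m, a, b, c, d, draws=20000, seed=0):
    """Monte Carlo approximation of the posterior mean of rho(theta1, theta2).

    Posteriors: theta1 | v ~ InvGamma(n + a, v + b), theta2 | w ~ InvGamma(m + c, w + d).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # If G ~ Gamma(shape, scale=1), then rate / G ~ InvGamma(shape, rate).
        theta1 = (v + b) / rng.gammavariate(n + a, 1.0)
        theta2 = (w + d) / rng.gammavariate(m + c, 1.0)
        total += 2.0 * math.sqrt(theta1 * theta2) / (theta1 + theta2)
    return total / draws


if __name__ == "__main__":
    # Hypothetical data summaries: n = m = 50, sample sums v = 100 and w = 500,
    # with hypothetical hyperparameters a = b = c = d = 1.
    print(bayes_estimate_mc(100.0, 500.0, 50, 50, 1.0, 1.0, 1.0, 1.0))
```

With samples of this size the approximation should land close to the MLE computed from the same sample means, in line with the asymptotic equivalence noted above.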
In the next section, we present a simulation study to compare the two approaches for finding the interval estimator of Pianka’s overlap coefficient, as described earlier in Section 4. Additionally, we investigate the performance of the maximum likelihood estimator ($\hat{\rho}_{MLE}$) and the Bayes estimator ($\hat{\rho}_{Bayes}$) of Pianka’s overlap coefficient detailed in Section 3 and Section 5.

6. Simulation Study

To compare the two approaches to interval estimation of Pianka’s overlap coefficient, we consider two criteria:
1. An interval estimation procedure is said to have a “valid confidence level” when, in repeated sampling, the actual coverage of the true but unknown parameter is close to the nominal confidence level;
2. An interval estimation procedure is said to be “valid length-efficient” when the expected length of the simulated interval is short.
To compare the estimators, we use the bias, mean square error (MSE), and efficiency of each estimator. In order to apply the above criteria, we conducted a simulation study, as follows:
1. A random sample of size $n$ is generated from $Exp(\theta_1)$. This random sample is used to calculate $\hat{\theta}_1$.
2. A random sample of size $m$ is generated from $Exp(\theta_2)$. This random sample is used to calculate $\hat{\theta}_2$.
3. The lower limit $L_i$, upper limit $U_i$, and width $W_i$ are calculated with a nominal confidence level of 95%.
4. The MLE $(\hat{\rho}_{MLE})_i$ and the Bayes estimator $(\hat{\rho}_{Bayes})_i$ are calculated.
5. Steps 1–4 above are repeated 10,000 times.
6. The average of the lower limits (AL), median of the lower limits (ML), average of the upper limits (AU), median of the upper limits (MU), average width (AW), and median width (MW) are calculated for each interval.
7. The proportion of the 10,000 intervals computed in Step 3 that contain the true $\rho$ is called the “coverage probability” and is denoted by AP.
8. Histogram plots for $\hat{\rho}_{MLE}$ and $\hat{\rho}_{Bayes}$ are generated.
9. Bias and MSE are calculated for $\hat{\rho}_{MLE}$ and $\hat{\rho}_{Bayes}$; then, the efficiency is calculated, i.e., $\text{efficiency}(\hat{\rho}_{MLE}, \hat{\rho}_{Bayes}) = \dfrac{MSE(\hat{\rho}_{MLE})}{MSE(\hat{\rho}_{Bayes})}$.
10. Steps 1–9 above are repeated for $(n, m) = (20, 20), (20, 30), (30, 50), (50, 50), (50, 100), (100, 100)$, and $(\theta_1, \theta_2) = (2, 10), (5, 10), (8, 10)$, corresponding to $k = 0.2, 0.5, 0.8$, respectively.
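The steps above can be sketched in Python for the MLE and the asymptotic interval. This is a reduced illustration and not the authors' Mathematica code: it omits the Bayes estimator and the transformation interval, uses fewer replications by default, and the function name `simulate`, the seed, and the returned dictionary keys are assumptions.

```python
import math
import random
from statistics import NormalDist


def simulate(theta1, theta2, n, m, reps=2000, alpha=0.05, seed=1):
    """Bias and MSE of the MLE, plus coverage and average width of the asymptotic interval."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rho = 2.0 * math.sqrt(theta1 * theta2) / (theta1 + theta2)
    covered = 0
    err_sum = sq_err_sum = width_sum = 0.0
    for _ in range(reps):
        # Steps 1-2: sample means are the MLEs (expovariate takes the rate 1/theta).
        t1 = sum(rng.expovariate(1.0 / theta1) for _ in range(n)) / n
        t2 = sum(rng.expovariate(1.0 / theta2) for _ in range(m)) / m
        # Step 4: MLE of rho
        rho_hat = 2.0 * math.sqrt(t1 * t2) / (t1 + t2)
        # Step 3: asymptotic interval with the plug-in standard error
        se = math.sqrt((n + m) * t1 * t2 * (t1 - t2) ** 2 / (n * m * (t1 + t2) ** 4))
        lo, hi = rho_hat - z * se, rho_hat + z * se
        covered += lo <= rho <= hi
        width_sum += hi - lo
        err_sum += rho_hat - rho
        sq_err_sum += (rho_hat - rho) ** 2
    return {
        "bias": err_sum / reps,
        "mse": sq_err_sum / reps,
        "coverage": covered / reps,
        "avg_width": width_sum / reps,
    }


if __name__ == "__main__":
    print(simulate(2.0, 10.0, 50, 50))
```

Calling `simulate(2.0, 10.0, 50, 50, reps=10000)` matches the replication count in Step 5, at the cost of a longer run time.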
Mathematica was used to simulate each of the interval estimation and point estimation methods for the Pianka’s overlap measure ρ .
Table 1, Table 2 and Table 3 show the simulated interval estimators using the asymptotic and transformation techniques based on exponential random samples with a nominal confidence level of 95%. These results show that the average width (AW) is almost the same as the median width (MW) and that the transformation method consistently performs better in terms of the confidence interval width. Moreover, the transformation method appears to be effective in terms of the coverage probability except for values of $k$ around one and very small sample sizes.
As the sample size increases, the coverage probability of the two techniques approaches the nominal value. The coverage probability of the asymptotic technique works very well, and increases as $k$ approaches one; however, when $k < 0.5$ and for small sample sizes the transformation technique performs exceptionally well.
Figure 3, Figure 4 and Figure 5 plot the MLE and Bayes estimators of $\rho$ for $k = 0.2, 0.5, 0.8$ and $(n, m) = (50, 50)$.
Table 4, Table 5 and Table 6 present the results of the simulation study carried out to compare the MLE and Bayes estimators of Pianka’s overlap coefficient. These results consider only values of $k < 1$; because $\rho(k) = \rho(1/k)$, the results for $k > 1$ are obtained by working in terms of $1/k$. The absolute values of the bias are in all cases less than 0.05 and decrease as the sample size increases. It appears that the MLE works well, and the Bayes estimator seems to work quite well at $k = 0.5$. For sample sizes larger than 30, the bias and MSE are quite close to zero.
The estimates of the bias are plotted in Figure 6 for the MLE and Bayes estimators. From these results, it can be seen that the bias decreases significantly as the sample size increases. Figure 6a shows that the actual Pianka’s overlap is underestimated; however, for very small values of k and small sample sizes the true Pianka’s overlap is overestimated. Furthermore, the bias increases as k increases for the MLE estimator.
The estimates of MSE are plotted in Figure 7 for the MLE and Bayes estimators. From these results, it can be seen that the MSE decreases significantly as the sample size increases. Figure 7a shows that for small k values and small sample sizes, there is a significant increase in the MSE for the MLE estimator. For the Bayes estimator, Figure 6b and Figure 7b show that both the bias and the MSE decrease as the value of k increases.

7. Conclusions

We have estimated Pianka’s overlap coefficient for two exponential populations with different scale parameters using the MLE and Bayes estimators, then compared these estimators by calculating the bias and MSE in a simulation study. In addition, we have constructed confidence intervals for the Pianka’s overlap measure using asymptotic and transformation techniques, then compared them using the “valid confidence level” and “valid length-efficiency”.
We investigated the accuracy of the Pianka’s overlap coefficient through a Monte Carlo analysis. In conclusion, it appears that there is no ideal approach. Therefore, a transformation procedure is recommended when k < 0.5 and the sample size is small. The asymptotic approach can be used if computers are available. For larger sample sizes and k < 0.8 , the transformation approach is recommended.

Author Contributions

Conceptualization, S.A. and M.A.; Methodology, S.A. and M.A.; Software, S.A. and M.A.; Validation, S.A. and M.A.; Formal analysis, S.A. and M.A.; Resources, S.A. and M.A.; Writing—original draft, S.A.; Writing—review & editing, M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tilton, J.W. The measurement of overlapping. J. Educ. Psychol. 1937, 28, 656–662.
  2. Gini, C.; Livada, G. Nuovi Contribute Alla Teoria Della Transvariazione; Atti della VI Riunione della Società Italiana di Statistica: Rome, Italy, 1943.
  3. Matusita, K. Decision rules based on distance, for problems of fit, two samples and applications. Ann. Inst. Math. Stat. 1955, 19, 181–192.
  4. Anderson, G. Toward an empirical analysis of polarization. J. Econom. 2004, 122, 1–26.
  5. Mizuno, S.; Yamaquchi, T.; Fukushima, A.; Matsuyama, Y.; Ohashi, Y. Overlap coefficient for assessing the similarity of pharmacokinetic data between ethnically different populations. Clin. Trials 2005, 2, 174–181.
  6. Beran, R. Minimum Hellinger distance estimates for parametric models. Ann. Stat. 1977, 5, 455–463.
  7. Rao, K.J.N.; Tintner, G. On the variate difference method. Aust. J. Stat. 1963, 5, 106–116.
  8. Smith, E.P. Niche breadth, resource availability, and inference. Ecology 1982, 63, 1675–1681.
  9. Pearson, K. On the Criterion that a Given System of Deviations From the Probable in the Case of a Correlated System of Variables is such that it Can be Reasonably Supposed to have a Risen From Random Sampling. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1991, 50, 157–172.
  10. Hellinger, E. Neue begründung der theorie quadratischer formen von unendlichvielen veränderlichen. J. Reine Angew. Math. 1909, 136, 210–271.
  11. Nishiyama, T. A tight lower bound for the Hellinger distance with given means and variances. arXiv 2020, arXiv:2010.13548.
  12. Nishiyama, T.; Sason, I. On relations between the relative entropy and χ2-divergence, generalizations and applications. Entropy 2020, 22, 563.
  13. Morisita, M. Measuring of the dispersion and analysis of distribution patterns, Memoires of the Faculty of Science, Series E. Biol. Kyushu Univ. 1959, 2, 215–235.
  14. Weitzman, M.S. Measures of overlap of income distributions of white and Negro families in the United States. In US Bureau of the Census; U.S. Department of Commerce: Washington, DC, USA, 1970; Volume 22.
  15. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
  16. Jeffreys, H. An invariant form for the prior probability in estimation problems. Proc. R. Soc. Lond. Ser. A Math. Phys. Sci. 1946, 186, 453–461.
  17. Abu, A.H.; Hassanat, A.; Lasassmeh, O.; Tarawneh, A.; Alhasanat, M.; Eyal, S.H.; Prasath, V. Effects of distance measure choice on k-nearest neighbor classifier performance: A review. Big Data 2019, 7, 221–248.
  18. Cha, S. Comprehensive survey on distance/similarity measures between probability density functions. City 2007, 1, 1.
  19. Taneja, I. On symmetric and nonsymmetric divergence measures and their generalizations. Adv. Imaging Electron Phys. 2005, 138, 177–250.
  20. Abele, L.G. The community structure of coral-associated decapod crustaceans in a variable environment. Ecol. Process. Coast. Mar. Syst. Mar. Sci. 1979, 10, 265–287.
  21. Chao, A.; Hwang, W.; Chen, Y.; Kuo, C. Estimating the number of shared species in two communities. Stat. Sin. 2000, 10, 227–246.
  22. Moravec, H. Mind Children: The Future of Robot and Human Intelligence; Harvard University Press: Cambridge, MA, USA, 1988.
  23. Viola, P.; Wells, W., III. Alignment by maximization of mutual information. Int. J. Comput. Vis. 1997, 24, 137–154.
  24. Inman, H.F.; Bradley, E.L. The overlapping coefficient as a measure of agreement between probability distributions and point estimation of the overlap of two normal densities. Commun. Stat. Theory Methods 1989, 18, 3851–3874.
  25. Milanovic, B.; Shlomo, Y. Decomposing world income distribution: Does the world have a middle class? Rev. Income Wealth 2002, 48, 155–178.
  26. Al-Saidy, O.; Samawi, H.M.; Al-Saleh, M.F. Inference on overlap coefficients under the Weibul distribution: Equal Shape Parameter. ESAIM Probab. Stat. 2005, 9, 206–219.
  27. Al-Saleh, M.F.O.; Samawi, H. Interference on Overlapping Coefficients in Two Exponential Populations. J. Mod. Appl. Stat. Methods 2007, 6, 503–516.
  28. Samawi, H.; Al-Saleh, M.F.O. Inference on Overlapping Coefficients in Two Exponential Populations Using Ranked Set Sample. Commun. Korean Stat. Soc. 2008, 15, 147–159.
  29. Hamza, D.; Papa, N.; Malick, M. Overlap Coefficients Based on Kullback-Leibler Divergence: Exponential Populations Case. Int. J. Appl. Math. Res. 2017, 6, 135–140.
  30. Sibil, J.; Seemon, T.; Thomas, M. Interval Estimation of the Overlapping Coefficient of Two Exponential Distributions. J. Stat. Theory Appl. 2019, 18, 26–32.
  31. Pianka, E. Niche Overlap and Diffuse Competition. Proc. Natl. Acad. Sci. USA 1974, 71, 2141–2145.
  32. Vieira, E.M.; Port, D. Niche overlap and resource partitioning between two sympatric fox species in southern Brazil. J. Zool. 2006, 272, 57–63.
  33. Jacqueline, B.C.; Mathew, S.C.; Georgeanna, S.; Mike, L. Dietary overlap and prey selectivity among sympatric carnivores: Could dingoes suppress foxes through competition for prey? J. Mammal. 2011, 92, 590–600.
  34. Sa-Oliveira, J.C.; Ronaldo, A.; Victoria, J.I.N. Diet and niche breadth and overlap in fish communities within the area affected by an Amazonian reservoir (Amapá, Brazil). Ann. Braz. Acad. Sci. 2014, 86, 383–405.
  35. Bodkin, R.G.; Klein, L.R.; Marwah, K. A History of Macroeconometric Model-Building; Edward Elgar Publishing: Cheltenham, UK, 1991.
  36. Doob, J. Application of the theory of martingales. Calc. Des Probab. Ses Appl. 1949, 13, 23–27.
Figure 1. Pianka overlap coefficient as a function of $k$.
Figure 2. The pdf of $\hat{\rho}_{MLE}$ for $(\theta_1, \theta_2) = (2, 10)$, $(5, 10)$, and $(8, 10)$.
Figure 3. Histogram of Pianka estimators for $(n, m) = (50, 50)$ and $k = 0.2$.
Figure 4. Histogram of Pianka estimators for $(n, m) = (50, 50)$ and $k = 0.5$.
Figure 5. Histogram of Pianka estimators for $(n, m) = (50, 50)$ and $k = 0.8$.
Figure 6. Relationship of bias to $k$ for Pianka’s coefficient: (a) relation of bias to $\rho$ for the MLE estimator and (b) relation of bias to $\rho$ for the Bayes estimator.
Figure 7. Relationship of MSE to $k$ for Pianka’s coefficient: (a) relation of MSE to $\rho$ for the MLE estimator and (b) relation of MSE to $\rho$ for the Bayes estimator.
Table 1. Simulation results for the two approaches of interval estimation of Pianka’s OVL coefficient for $k = 0.2$, $\rho = 0.7454$.
| (n, m) | Technique | AL | AU | AW | ML | MU | MW | AP |
|---|---|---|---|---|---|---|---|---|
| (20, 20) | Asymptotic | 0.594 | 0.894 | 0.299 | 0.592 | 0.890 | 0.305 | 0.931 |
| | Transformation | 0.592 | 0.884 | 0.292 | 0.591 | 0.891 | 0.299 | 0.951 |
| (20, 30) | Asymptotic | 0.606 | 0.880 | 0.275 | 0.604 | 0.885 | 0.279 | 0.935 |
| | Transformation | 0.608 | 0.876 | 0.269 | 0.607 | 0.882 | 0.274 | 0.952 |
| (30, 30) | Asymptotic | 0.622 | 0.869 | 0.247 | 0.620 | 0.872 | 0.250 | 0.937 |
| | Transformation | 0.621 | 0.863 | 0.243 | 0.619 | 0.867 | 0.247 | 0.941 |
| (30, 50) | Asymptotic | 0.632 | 0.854 | 0.221 | 0.632 | 0.857 | 0.224 | 0.935 |
| | Transformation | 0.634 | 0.853 | 0.218 | 0.634 | 0.856 | 0.221 | 0.945 |
| (50, 50) | Asymptotic | 0.649 | 0.841 | 0.192 | 0.648 | 0.843 | 0.194 | 0.942 |
| | Transformation | 0.648 | 0.839 | 0.191 | 0.647 | 0.831 | 0.193 | 0.951 |
| (50, 100) | Asymptotic | 0.651 | 0.827 | 0.167 | 0.651 | 0.829 | 0.169 | 0.946 |
| | Transformation | 0.662 | 0.828 | 0.166 | 0.662 | 0.829 | 0.167 | 0.951 |
| (100, 100) | Asymptotic | 0.676 | 0.813 | 0.137 | 0.676 | 0.814 | 0.138 | 0.942 |
| | Transformation | 0.676 | 0.812 | 0.136 | 0.676 | 0.813 | 0.137 | 0.944 |
Table 2. Simulation results for the two approaches of interval estimation of Pianka’s OVL coefficient for $k = 0.5$, $\rho = 0.9428$.
| (n, m) | Technique | AL | AU | AW | ML | MU | MW | AP |
|---|---|---|---|---|---|---|---|---|
| (20, 20) | Asymptotic | 0.841 | 1.027 | 0.185 | 0.845 | 1.035 | 0.195 | 0.879 |
| | Transformation | 0.812 | 0.987 | 0.176 | 0.815 | 0.994 | 0.184 | 0.957 |
| (20, 30) | Asymptotic | 0.848 | 1.011 | 0.172 | 0.852 | 1.028 | 0.179 | 0.891 |
| | Transformation | 0.827 | 0.988 | 0.162 | 0.831 | 0.995 | 0.168 | 0.951 |
| (30, 30) | Asymptotic | 0.859 | 1.013 | 0.154 | 0.861 | 1.020 | 0.161 | 0.896 |
| | Transformation | 0.839 | 0.987 | 0.149 | 0.841 | 0.994 | 0.155 | 0.949 |
| (30, 50) | Asymptotic | 0.867 | 1.006 | 0.131 | 0.869 | 1.013 | 0.144 | 0.913 |
| | Transformation | 0.853 | 0.987 | 0.134 | 0.856 | 0.993 | 0.138 | 0.952 |
| (50, 50) | Asymptotic | 0.879 | 0.999 | 0.120 | 0.881 | 1.004 | 0.123 | 0.916 |
| | Transformation | 0.866 | 0.984 | 0.118 | 0.869 | 0.989 | 0.120 | 0.952 |
| (50, 100) | Asymptotic | 0.877 | 0.992 | 0.105 | 0.889 | 0.996 | 0.107 | 0.927 |
| | Transformation | 0.879 | 0.982 | 0.102 | 0.881 | 0.985 | 0.104 | 0.953 |
| (100, 100) | Asymptotic | 0.898 | 0.984 | 0.086 | 0.899 | 0.986 | 0.087 | 0.932 |
| | Transformation | 0.891 | 0.976 | 0.085 | 0.892 | 0.979 | 0.086 | 0.947 |
Table 3. Simulation results for the two approaches of interval estimation of Pianka’s OVL coefficient for k = 0.8, ρ = 0.9938.

| (n, m) | Technique | AL | AU | AW | ML | MU | MW | AP |
|---|---|---|---|---|---|---|---|---|
| (20, 20) | Asymptotic | 0.935 | 1.028 | 0.093 | 0.949 | 1.031 | 0.084 | 0.924 |
| (20, 20) | Transformation | 0.908 | 0.969 | 0.087 | 0.915 | 0.971 | 0.078 | 0.266 |
| (20, 30) | Asymptotic | 0.943 | 1.024 | 0.081 | 0.955 | 1.026 | 0.072 | 0.914 |
| (20, 30) | Transformation | 0.911 | 0.975 | 0.074 | 0.928 | 0.983 | 0.066 | 0.290 |
| (30, 30) | Asymptotic | 0.952 | 1.011 | 0.068 | 0.962 | 1.022 | 0.061 | 0.898 |
| (30, 30) | Transformation | 0.930 | 0.982 | 0.065 | 0.936 | 0.989 | 0.059 | 0.375 |
| (30, 50) | Asymptotic | 0.958 | 1.016 | 0.058 | 0.967 | 1.018 | 0.053 | 0.884 |
| (30, 50) | Transformation | 0.941 | 0.986 | 0.054 | 0.947 | 0.991 | 0.049 | 0.437 |
| (50, 50) | Asymptotic | 0.965 | 1.013 | 0.047 | 0.971 | 1.015 | 0.045 | 0.873 |
| (50, 50) | Transformation | 0.950 | 0.992 | 0.046 | 0.954 | 0.996 | 0.043 | 0.582 |
| (50, 100) | Asymptotic | 0.961 | 1.001 | 0.040 | 0.974 | 1.011 | 0.039 | 0.861 |
| (50, 100) | Transformation | 0.959 | 0.994 | 0.038 | 0.962 | 0.997 | 0.363 | 0.691 |
| (100, 100) | Asymptotic | 0.976 | 1.007 | 0.031 | 0.979 | 1.008 | 0.031 | 0.856 |
| (100, 100) | Transformation | 0.967 | 0.997 | 0.031 | 0.969 | 0.999 | 0.030 | 0.853 |

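The interval constructions behind Tables 1–3 are not shown in this part of the text. As a rough, illustrative sketch only, the Python code below estimates by Monte Carlo the coverage and average width of a delta-method (Wald-type) interval for ρ. It assumes that ρ = 2√k/(1 + k), with k the ratio of the two exponential means (this closed form reproduces the ρ values quoted in the table captions), that k is estimated by the ratio of sample means, and that Var(log k̂) ≈ 1/n + 1/m. The function names, replication count, and seed are hypothetical choices, and the interval may differ in detail from the asymptotic technique used by the authors.

```python
import math
import random


def pianka_exponential(k):
    """Assumed closed form of Pianka's overlap for two exponential densities
    whose means have ratio k: rho = 2*sqrt(k) / (1 + k)."""
    return 2.0 * math.sqrt(k) / (1.0 + k)


def wald_interval(x, y, z=1.959964):
    """Delta-method interval for rho from two samples (illustrative only)."""
    n, m = len(x), len(y)
    k_hat = (sum(x) / n) / (sum(y) / m)  # ratio of sample means
    rho_hat = pianka_exponential(k_hat)
    # d rho / d log k = sqrt(k) * (1 - k) / (1 + k)^2, Var(log k_hat) ~ 1/n + 1/m
    slope = math.sqrt(k_hat) * abs(1.0 - k_hat) / (1.0 + k_hat) ** 2
    se = slope * math.sqrt(1.0 / n + 1.0 / m)
    return rho_hat - z * se, rho_hat + z * se


def coverage(k, n, m, reps=2000, seed=1):
    """Monte Carlo estimate of (coverage, average width) of the interval."""
    rng = random.Random(seed)
    rho = pianka_exponential(k)
    hits = 0
    total_width = 0.0
    for _ in range(reps):
        x = [rng.expovariate(1.0 / k) for _ in range(n)]  # exponential, mean k
        y = [rng.expovariate(1.0) for _ in range(m)]      # exponential, mean 1
        lo, hi = wald_interval(x, y)
        if lo <= rho <= hi:
            hits += 1
        total_width += hi - lo
    return hits / reps, total_width / reps


if __name__ == "__main__":
    for k in (0.2, 0.5, 0.8):
        cov, width = coverage(k, n=20, m=20)
        print(f"k = {k}: rho = {pianka_exponential(k):.4f}, "
              f"coverage = {cov:.3f}, average width = {width:.3f}")
```

Because this interval is symmetric about the point estimate, its upper limit can exceed 1 when ρ is close to 1, which is in line with the AU and MU values above 1 reported for the asymptotic technique in Tables 2 and 3.
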
Table 4. Bias, MSE, and efficiency of the two estimators of Pianka’s OVL coefficient for k = 0.2. Exact Pianka’s coefficient ρ = 0.7454.

| (n, m) | MLE Bias | MLE MSE | Bayes Bias | Bayes MSE | Efficiency |
|---|---|---|---|---|---|
| (20, 20) | 0.0017 | 0.0061 | 0.0386 | 0.0055 | 1.0752 |
| (20, 30) | 0.0038 | 0.0051 | 0.0433 | 0.0052 | 0.9698 |
| (30, 30) | 0.0005 | 0.0040 | 0.0274 | 0.0039 | 1.0188 |
| (30, 50) | 0.0026 | 0.0032 | 0.0317 | 0.0035 | 0.9261 |
| (50, 50) | 0.0004 | 0.0024 | 0.0172 | 0.0024 | 1.0111 |
| (50, 100) | 0.0011 | 0.0018 | 0.0201 | 0.0019 | 0.9529 |
| (100, 100) | 0.0001 | 0.0012 | 0.0086 | 0.0012 | 1.0145 |

Table 5. Bias, MSE, and efficiency of the two estimators of Pianka’s OVL coefficient for k = 0.5. Exact Pianka’s coefficient ρ = 0.9428.

| (n, m) | MLE Bias | MLE MSE | Bayes Bias | Bayes MSE | Efficiency |
|---|---|---|---|---|---|
| (20, 20) | 0.0095 | 0.0025 | 0.0102 | 0.0018 | 1.3915 |
| (20, 30) | 0.0093 | 0.0022 | 0.0051 | 0.0014 | 1.5146 |
| (30, 30) | 0.0061 | 0.0017 | 0.0069 | 0.0013 | 1.2805 |
| (30, 50) | 0.0059 | 0.0014 | 0.0024 | 0.0010 | 1.3485 |
| (50, 50) | 0.0033 | 0.0001 | 0.0031 | 0.0009 | 1.1096 |
| (50, 100) | 0.0038 | 0.0008 | 0.0012 | 0.0006 | 1.2178 |
| (100, 100) | 0.0014 | 0.0005 | 0.0022 | 0.0005 | 1.0559 |

Table 6. Bias, MSE, and efficiency of the two estimators of Pianka’s OVL coefficient for k = 0.8. Exact Pianka’s coefficient ρ = 0.9938.

| (n, m) | MLE Bias | MLE MSE | Bayes Bias | Bayes MSE | Efficiency |
|---|---|---|---|---|---|
| (20, 20) | 0.0111 | 0.0007 | 0.0213 | 0.0009 | 0.8145 |
| (20, 30) | 0.0103 | 0.0006 | 0.0175 | 0.0006 | 0.8923 |
| (30, 30) | 0.0079 | 0.0004 | 0.0146 | 0.0005 | 0.7897 |
| (30, 50) | 0.0067 | 0.0003 | 0.0111 | 0.0003 | 0.9122 |
| (50, 50) | 0.0049 | 0.0002 | 0.0091 | 0.0002 | 0.8199 |
| (50, 100) | 0.0038 | 0.0001 | 0.0064 | 0.0001 | 0.9160 |
| (100, 100) | 0.0023 | 0.0001 | 0.0046 | 0.0001 | 0.8554 |

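As a hedged illustration of how bias and MSE figures of the kind reported in Tables 4–6 can be approximated, the Python sketch below simulates the plug-in MLE of ρ. It assumes ρ = 2√k/(1 + k) with k the ratio of the two exponential means, and takes the MLE to be the same expression evaluated at the ratio of sample means. The function names, replication count, and seed are hypothetical, and the Bayes estimator is omitted because its prior is not specified in this part of the text. The Efficiency column appears consistent with the ratio of the MLE's MSE to the Bayes estimator's MSE, although that reading is an inference from the tabulated values.

```python
import math
import random


def pianka_exponential(k):
    """Assumed closed form of Pianka's overlap for two exponential densities
    whose means have ratio k: rho = 2*sqrt(k) / (1 + k)."""
    return 2.0 * math.sqrt(k) / (1.0 + k)


def simulate_mle(k, n, m, reps=5000, seed=7):
    """Monte Carlo estimate of (bias, MSE) of the plug-in MLE of rho."""
    rng = random.Random(seed)
    rho = pianka_exponential(k)
    err_sum = 0.0
    sq_sum = 0.0
    for _ in range(reps):
        # Sample means are the MLEs of the two exponential means.
        mean_x = sum(rng.expovariate(1.0 / k) for _ in range(n)) / n  # true mean k
        mean_y = sum(rng.expovariate(1.0) for _ in range(m)) / m      # true mean 1
        err = pianka_exponential(mean_x / mean_y) - rho
        err_sum += err
        sq_sum += err * err
    return err_sum / reps, sq_sum / reps


if __name__ == "__main__":
    for n, m in ((20, 20), (50, 50), (100, 100)):
        bias, mse = simulate_mle(k=0.2, n=n, m=m)
        print(f"(n, m) = ({n}, {m}): bias = {bias:+.4f}, MSE = {mse:.4f}")
```

The sketch reports signed bias; the tables may list absolute values, so only the magnitudes should be compared.
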
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.