Article

Advanced Approach for Estimating Failure Rate Using Saddlepoint Approximation

Department of Mathematics, Faculty of Science, Taibah University, Medina 54321, Saudi Arabia
Symmetry 2021, 13(11), 2041; https://doi.org/10.3390/sym13112041
Submission received: 29 September 2021 / Revised: 23 October 2021 / Accepted: 27 October 2021 / Published: 29 October 2021

Abstract

In this paper, the failure (hazard) rate function is approximated for a linear combination of random variables, whose probability distribution is considered a highly complex model. The saddlepoint approximation approach is used to approximate the probability mass function and the cumulative distribution function, from which an accurate approximation of the failure (hazard) rate is derived. The superior performance of this method is shown by numerical simulations and by comparison with other approximation methods.

1. Introduction

Advanced statistical methods often require probabilities derived from intractable distributions, which leads to complicated calculations. The saddlepoint approximation technique can solve some of these problems. This method was proposed by Daniels [1], and it is a particular application of the mathematical saddlepoint method to statistical science. The technique provides an approximation formula for the cumulative distribution function (CDF) or the probability mass function (PMF) from the associated moment generating function (MGF). The saddlepoint approach can address many applied scientific problems, such as bivariate symmetry tests for complete and competing risks data (Bell and Haller [2]). Kepner and Randles [3] compared tests for bivariate symmetry versus location and/or scale alternatives. Rao and Raghunath [4] proposed a simple nonparametric test for bivariate symmetry about a line. The saddlepoint approximation also provides a very accurate solution for estimating the hazard rate function (see Dissanayake and Trindade [5]), and Al Mutairi [6] discussed improved methods for calculating the hazard rate function for the stopped sum model using saddlepoint approximation. The distribution of a linear combination of random variables appears in many applied problems and has attracted the attention of many scholars; specifically, the linear combination of two independent random variables has been investigated by many researchers (see, e.g., Amari and Misra [7], Di Salvo [8], Almasi et al. [9], Al Mutairi and Low [10], and Al Mutairi [11], among others). Demand for these models has increased owing to their importance and wide range of applications.
This study improves the hazard rate function estimation methods for a linear combination of independent Poisson random variables by using the saddlepoint approximation.
It is well known that the Poisson family is closed under the summation of independent variables; that is, the sum of two independent Poisson random variables is again Poisson. Thus, if $X_1 \sim \mathrm{Poisson}(\lambda_1)$ and $X_2 \sim \mathrm{Poisson}(\lambda_2)$ are independent, then $X_1 + X_2 \sim \mathrm{Poisson}(\lambda_1 + \lambda_2)$.
At the same time, a rescaled Poisson random variable is no longer Poisson. For example, the random variable $Y = 0.5X$, where $X \sim \mathrm{Poisson}(\lambda)$, is not integer-valued and, hence, not Poisson. Thus, the exact distribution of a linear combination of independent Poisson random variables is not always known and can be very difficult to obtain.
Therefore, approximation methods are essential and can provide useful information about the distribution of these complicated models. The saddlepoint approximation can stand in for the exact answer with high accuracy, which saves time and effort.
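The two facts used above, closure under summation and loss of the integer support under rescaling, can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names `poisson_pmf` and `pmf_of_sum` are illustrative assumptions and are not part of the original study.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Exact Poisson probability P(X = k) for rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def pmf_of_sum(k: int, lam1: float, lam2: float) -> float:
    """P(X1 + X2 = k), by direct convolution of two independent Poisson PMFs."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

# Closure under summation: the convolution agrees with Poisson(lam1 + lam2).
for k in range(6):
    assert math.isclose(pmf_of_sum(k, 1.0, 2.0), poisson_pmf(k, 3.0), rel_tol=1e-12)

# Rescaling breaks the integer support: Y = 0.5 * X takes the values 0, 0.5, 1, 1.5, ...
support_of_half_x = [0.5 * k for k in range(5)]
print(support_of_half_x)
```

The convolution check only covers the unit-weight case; for general weights no such closed form is available, which is the motivation for the approximation developed in the next section.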

2. Saddlepoint Approximation for the Linear Combination of Poisson Random Variables

Linear combinations of random variables, whose distributions are convolutions, occur in a wide range of fields. In most cases, the exact distribution of such a linear combination is extremely difficult to determine, and the normal approximation usually performs very badly for these complicated distributions. The saddlepoint approximation is a better method: it provides accurate expressions for distribution functions that are not available in closed form. It not only yields an accurate approximation near the centre of the distribution but also controls the relative error in the far tail of the distribution.
The linear combination of Poisson random variables is given by
$$L_c = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n,$$
where $c_i$, $i = 1, 2, \ldots, n$, are real constants and $X_i \sim \mathrm{Poisson}(\lambda_i)$, $i = 1, 2, \ldots, n$, are independent Poisson random variables. Note that we do not require the $X_i$ to be identically distributed.
Recall that the cumulant generating function (CGF) of a random variable $X$ is defined as the logarithm of its moment generating function:
$$K_X(s) = \ln E\left(e^{sX}\right).$$
The saddlepoint technique requires the cumulant generating function of the linear combination $L_c$ of Poisson random variables, which is
$$K_{L_c}(s) = \sum_{i=1}^{n} \lambda_i \left(e^{c_i s} - 1\right).$$
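The CGF of the linear combination follows from independence and the Poisson moment generating function. A short derivation, using only the definitions already given, together with the two derivatives that the saddlepoint formulas need, can be written as:

```latex
\begin{align*}
K_{L_c}(s)
  &= \ln E\left(e^{s \sum_{i=1}^{n} c_i X_i}\right)
   = \sum_{i=1}^{n} \ln E\left(e^{(c_i s) X_i}\right)
   && \text{(independence)} \\
  &= \sum_{i=1}^{n} \lambda_i \left(e^{c_i s} - 1\right),
   && \text{since } E\left(e^{t X_i}\right) = \exp\left\{\lambda_i \left(e^{t} - 1\right)\right\}, \\
K'_{L_c}(s)
  &= \sum_{i=1}^{n} \lambda_i c_i e^{c_i s},
  \qquad
K''_{L_c}(s) = \sum_{i=1}^{n} \lambda_i c_i^{2} e^{c_i s}.
\end{align*}
```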
Note that in Butler [12], Sections 1.1.2 and 1.1.5 (formulae (1.13) and (1.14)), the saddlepoint approximation is defined only for continuous and integer-valued distributions:
$$\hat{f}(x) = \frac{1}{\sqrt{2\pi K''(\hat{s})}} \exp\left\{K(\hat{s}) - \hat{s}x\right\}, \tag{1}$$
where $\hat{s}$ is the unique solution of the saddlepoint equation $K'(\hat{s}) = x$.
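As an illustration of how Equation (1) can be evaluated, the sketch below solves the saddlepoint equation by bisection and then applies the formula for a weighted sum of independent Poisson variables. It is a minimal sketch under stated assumptions: all rates and weights are positive, $x > 0$, and the helper names (`cgf`, `cgf_d1`, `cgf_d2`, `solve_saddlepoint`, `saddlepoint_pmf`) are hypothetical rather than taken from the paper.

```python
import math

def cgf(s, lams, cs):
    """K(s) = sum_i lam_i * (exp(c_i * s) - 1)."""
    return sum(lam * (math.exp(c * s) - 1.0) for lam, c in zip(lams, cs))

def cgf_d1(s, lams, cs):
    """First derivative K'(s)."""
    return sum(lam * c * math.exp(c * s) for lam, c in zip(lams, cs))

def cgf_d2(s, lams, cs):
    """Second derivative K''(s)."""
    return sum(lam * c * c * math.exp(c * s) for lam, c in zip(lams, cs))

def solve_saddlepoint(x, lams, cs):
    """Solve K'(s) = x by bisection (assumes x > 0, positive rates and weights)."""
    lo, hi = -1.0, 1.0
    while cgf_d1(lo, lams, cs) > x:   # widen the bracket to the left
        lo *= 2.0
    while cgf_d1(hi, lams, cs) < x:   # widen the bracket to the right
        hi *= 2.0
    for _ in range(200):              # K' is increasing, so bisection converges
        mid = 0.5 * (lo + hi)
        if cgf_d1(mid, lams, cs) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def saddlepoint_pmf(x, lams, cs):
    """Saddlepoint approximation of Equation (1)."""
    s_hat = solve_saddlepoint(x, lams, cs)
    exponent = cgf(s_hat, lams, cs) - s_hat * x
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * cgf_d2(s_hat, lams, cs))

# Unit weights with rates 1 and 2, evaluated at x = 1.
print(saddlepoint_pmf(1.0, [1.0, 2.0], [1.0, 1.0]))
```

Bisection is used here only because $K'$ is monotone, which makes the root bracket easy to guarantee; a Newton iteration would be an equally reasonable choice.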
The saddlepoint approximation for a discrete cumulative distribution function requires some modification of the continuous formula in order to achieve a very high level of accuracy. Daniels [13] derived three such continuity corrections. Let $X$ be a discrete random variable with CDF $F(k)$, supported on the integers, with mean $\mu$. The right or left tail probability is used to solve many applied problems and, as shown in this work, to compute the failure rate.
All three saddlepoint continuity corrections are very accurate, and which one is the most accurate depends on the specific application or problem (Butler [12]).
The distribution of the linear combination of Poisson random variables $L_c$ is neither continuous nor integer-valued, and, hence, we need a continuity correction for the saddlepoint approximation. For a more detailed summary of these arguments, see Daniels [1] and Daniels [13], Section 2. The first continuity correction for the right tail is given as
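$$\widehat{\Pr}_1(X \ge x) = \begin{cases} 1 - \Phi(\tilde{w}_1) - \phi(\tilde{w}_1)\left(\dfrac{1}{\tilde{w}_1} - \dfrac{1}{\tilde{u}_1}\right), & \text{if } x \ne \mu, \\[2ex] \dfrac{1}{2} - \dfrac{1}{\sqrt{2\pi}}\left(\dfrac{K'''(0)}{6K''(0)^{3/2}} - \dfrac{1}{2\sqrt{K''(0)}}\right), & \text{if } x = \mu, \end{cases} \tag{2}$$
where $\mu$ is the mean of the random variable $X$, $\Phi$ and $\phi$ are the standard normal distribution and density functions, respectively, and
$$\tilde{w}_1 = \operatorname{sgn}(\hat{s})\sqrt{2\left(\hat{s}x - K(\hat{s})\right)}, \qquad \tilde{u}_1 = \left(1 - e^{-\hat{s}}\right)\sqrt{K''(\hat{s})}.$$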
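The $x \ne \mu$ branch of the first continuity correction can be sketched for a single $\mathrm{Poisson}(\lambda)$ variable, which is the unit-weight case, where the saddlepoint has the closed form $\hat{s} = \ln(x/\lambda)$. This is a minimal illustration; the function names are assumptions, and the $x = \mu$ branch is deliberately left out.

```python
import math

def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z: float) -> float:
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def first_correction_poisson(x: int, lam: float) -> float:
    """First continuity-corrected right tail Pr_1(X >= x) for X ~ Poisson(lam).

    Covers only the branch x != mean, for a positive integer x.
    """
    if x <= 0 or x == lam:
        raise ValueError("sketch handles positive x different from the mean only")
    s_hat = math.log(x / lam)                  # root of K'(s) = lam * exp(s) = x
    k0 = lam * (math.exp(s_hat) - 1.0)         # K(s_hat)
    k2 = lam * math.exp(s_hat)                 # K''(s_hat)
    w = math.copysign(math.sqrt(2.0 * (s_hat * x - k0)), s_hat)
    u = (1.0 - math.exp(-s_hat)) * math.sqrt(k2)
    return 1.0 - norm_cdf(w) - norm_pdf(w) * (1.0 / w - 1.0 / u)

# Compare the approximation with the exact right tail 1 - exp(-lam) at x = 1.
for lam in (3.0, 11.0):
    print(lam, first_correction_poisson(1, lam), 1.0 - math.exp(-lam))
```

A direct double-precision evaluation such as this one need not reproduce rounded figures quoted elsewhere digit for digit.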
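The second continuity correction is given as
$$\widehat{\Pr}_2(X \ge x) = \begin{cases} 1 - \Phi(\tilde{w}_2) - \phi(\tilde{w}_2)\left(\dfrac{1}{\tilde{w}_2} - \dfrac{1}{\tilde{u}_2}\right), & \text{if } x^- \ne \mu, \\[2ex] \dfrac{1}{2} - \dfrac{K'''(0)}{6\sqrt{2\pi}\,K''(0)^{3/2}}, & \text{if } x^- = \mu, \end{cases}$$
where $\tilde{s}$ is the unique solution of the saddlepoint equation $K'(\tilde{s}) = x - 0.5 = x^-$, and
$$\tilde{w}_2 = \operatorname{sgn}(\tilde{s})\sqrt{2\left(\tilde{s}x^- - K(\tilde{s})\right)}, \qquad \tilde{u}_2 = 2\sinh\left(\frac{\tilde{s}}{2}\right)\sqrt{K''(\tilde{s})}.$$
The third continuity correction, suggested in Butler [12], Section 1 (pp. 17–18), is the same as formula $\widehat{\Pr}_2$ but with $\tilde{u}_2$ replaced by $\tilde{u}_3 = \tilde{s}\sqrt{K''(\tilde{s})}$.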
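The second and third corrections differ from the first only in the offset point $x^- = x - 0.5$ and in the choice of $\tilde{u}$. The sketch below evaluates both for a single $\mathrm{Poisson}(\lambda)$ variable, where $\tilde{s} = \ln(x^-/\lambda)$ in closed form. Function names are illustrative assumptions, and the $x^- = \mu$ branch is not handled.

```python
import math

def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def second_third_correction_poisson(x: int, lam: float):
    """Return (Pr_2, Pr_3) for the right tail P(X >= x), X ~ Poisson(lam).

    Uses the offset point x_minus = x - 0.5 and assumes x >= 1, x_minus != lam.
    """
    x_minus = x - 0.5
    if x < 1 or x_minus == lam:
        raise ValueError("sketch handles x >= 1 with x - 0.5 different from the mean")
    s_t = math.log(x_minus / lam)              # root of K'(s) = lam * exp(s) = x_minus
    k0 = lam * (math.exp(s_t) - 1.0)           # K(s_tilde)
    k2 = lam * math.exp(s_t)                   # K''(s_tilde)
    w = math.copysign(math.sqrt(2.0 * (s_t * x_minus - k0)), s_t)
    u2 = 2.0 * math.sinh(0.5 * s_t) * math.sqrt(k2)
    u3 = s_t * math.sqrt(k2)
    base = 1.0 - norm_cdf(w)
    pr2 = base - norm_pdf(w) * (1.0 / w - 1.0 / u2)
    pr3 = base - norm_pdf(w) * (1.0 / w - 1.0 / u3)
    return pr2, pr3

pr2, pr3 = second_third_correction_poisson(1, 3.0)
print(pr2, pr3, 1.0 - math.exp(-3.0))
```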

3. An Example of the Saddlepoint Approximation for the Distribution Function

Consider the linear combination of two independent Poisson random variables
$$L_{c1} = c_1 X_1 + c_2 X_2,$$
where $X_1 \sim \mathrm{Poisson}(\lambda_1)$ and $X_2 \sim \mathrm{Poisson}(\lambda_2)$. The CGF is
$$K_{L_{c1}}(s) = \lambda_1\left(e^{c_1 s} - 1\right) + \lambda_2\left(e^{c_2 s} - 1\right).$$
The saddlepoint equation is
$$K'_{L_{c1}}(\hat{s}) = \lambda_1 c_1 e^{c_1 \hat{s}} + \lambda_2 c_2 e^{c_2 \hat{s}} = x.$$
To show the performance of this method, we consider the sum of two independent Poisson random variables with means $\lambda_1$ and $\lambda_2$, which is also a Poisson random variable with mean $\lambda_1 + \lambda_2$. Let
$$c_1 = c_2 = 1 \quad \text{and} \quad \lambda_1 = 1, \ \lambda_2 = 2.$$
For $x = 1$, we have
$$K'_{L_{c1}}(\hat{s}) = e^{\hat{s}} + 2e^{\hat{s}} = 1,$$
and, hence, the saddlepoint is $\hat{s} = -1.098612289$ and the PMF approximation $\hat{f}(x)$ from Equation (1) is
$$\hat{f}(1) = \frac{1}{\sqrt{2\pi}} \exp(-2 + 1.098612289) = 0.1619728996. \tag{3}$$
The exact $f(x)$ is given by the $\mathrm{Poisson}(3)$ mass function at $x = 1$:
$$f(1) = 3e^{-3} = 0.1493612051. \tag{4}$$
The absolute error of our approximation is $\left|f(1) - \hat{f}(1)\right| = 0.01$.
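The arithmetic of this example can be retraced step by step. The short sketch below recomputes the saddlepoint, the two CGF quantities that enter Equation (1), and the resulting absolute error; variable names are illustrative.

```python
import math

lam_total = 1.0 + 2.0                        # unit weights, so K(s) = 3 * (exp(s) - 1)
x = 1.0

s_hat = math.log(x / lam_total)              # solves 3 * exp(s) = 1, about -1.0986
k0 = lam_total * (math.exp(s_hat) - 1.0)     # K(s_hat), about -2
k2 = lam_total * math.exp(s_hat)             # K''(s_hat), about 1

f_hat = math.exp(k0 - s_hat * x) / math.sqrt(2.0 * math.pi * k2)   # about 0.16197
f_exact = lam_total * math.exp(-lam_total)                         # about 0.14936
abs_err = abs(f_exact - f_hat)                                     # about 0.0126

print(s_hat, f_hat, f_exact, abs_err)
```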
Next, we find the first continuity correction given in Equation (2):
$$\widehat{\Pr}_1 = 1 - \Phi(-1.34267) - \phi(-1.34267)\left(\frac{1}{-1.34267} - \frac{1}{-2}\right) = 0.9428487449 \tag{5}$$
because
$$\tilde{w}_1 = -\left[2\left(\ln(1/3) + 2\right)\right]^{0.5} \quad \text{and} \quad \tilde{u}_1 = 1 - e^{1.098}.$$
The exact right tail is
$$P(X \ge 1) = 1 - f(0) = 0.9502129316 \tag{6}$$
with an absolute error of 0.007.
We note that saddlepoint approximations give accurate results in calculating the mass function and distribution function and can replace the exact answers. We can compare this with other approximation methods, for example, the normal approximation for the right tail when $x = 1$:
$$1 - \Phi\left(\frac{1 - E(L_c)}{\sqrt{\mathrm{Var}(L_c)}}\right) = 1 - \Phi\left(\frac{1 - 3}{\sqrt{3}}\right) = 0.8749 \tag{7}$$
with an absolute error of 0.07.
To assess the performance of the saddlepoint approach, the present study also compared it with another method of approximation, namely, the Haldane approximation. Pentikäinen [14] showed the accuracy of that method under certain circumstances. Suppose $X$ is a random variable with mean $\mu_x$, variance $\sigma_x^2$, and coefficient of skewness $\gamma_x$. The Haldane type A approximation for the right tail is given by
$$\hat{H} = 1 - \Phi\left(\frac{(1 + r\tilde{x}_o)^h - \mu_{h,r}}{\sigma_{h,r}}\right),$$
where
$$\tilde{x}_o = \frac{x_o - \mu_x}{\sigma_x}, \qquad r = \frac{\sigma_x}{\mu_x}, \qquad h = 1 - \frac{\gamma_x}{3r},$$
$$\mu_{h,r} = 1 - \frac{1}{2}h(1 - h)\left[1 - \frac{1}{4}(2 - h)(1 - 3h)r^2\right]r^2,$$
$$\sigma_{h,r} = hr\sqrt{1 - \frac{1}{2}(1 - h)(1 - 3h)r^2},$$
(see Borowiak [15]). We get $\hat{H} = 0.8849$ with an absolute error of 0.06.
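The two competing approximations can be evaluated side by side. The sketch below implements the Haldane type A formula exactly as written above, together with the normal approximation, for $\mathrm{Poisson}(3)$ at $x_o = 1$ (mean 3, standard deviation $\sqrt{3}$, skewness $1/\sqrt{3}$). Function names are assumptions; because the script works in full double precision, its output may differ slightly from hand-rounded values.

```python
import math

def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def normal_right_tail(x0: float, mean: float, sd: float) -> float:
    """Normal approximation 1 - Phi((x0 - mean) / sd)."""
    return 1.0 - norm_cdf((x0 - mean) / sd)

def haldane_type_a_right_tail(x0: float, mean: float, sd: float, skew: float) -> float:
    """Haldane type A right-tail approximation as given in the text."""
    r = sd / mean
    h = 1.0 - skew / (3.0 * r)
    x_std = (x0 - mean) / sd
    mu_hr = 1.0 - 0.5 * h * (1.0 - h) * (
        1.0 - 0.25 * (2.0 - h) * (1.0 - 3.0 * h) * r * r
    ) * r * r
    sigma_hr = h * r * math.sqrt(1.0 - 0.5 * (1.0 - h) * (1.0 - 3.0 * h) * r * r)
    z = ((1.0 + r * x_std) ** h - mu_hr) / sigma_hr
    return 1.0 - norm_cdf(z)

mean, sd, skew = 3.0, math.sqrt(3.0), 1.0 / math.sqrt(3.0)
print(normal_right_tail(1.0, mean, sd))
print(haldane_type_a_right_tail(1.0, mean, sd, skew))
print(1.0 - math.exp(-3.0))   # exact right tail P(X >= 1)
```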
From the calculations above, we conclude that the saddlepoint approximation for the distribution function is more accurate than the normal and Haldane approximations and, hence, the saddlepoint hazard (rate) function is more accurate than those of the associated normal and Haldane approximations.
The aim of this study is to compare the saddlepoint and normal approximations, which are common approximation methods.

4. The Hazard Rate Function Based on the Saddlepoint Approximation

The hazard rate function for a discrete distribution taking values in $\mathbb{Z}_+ = \{0, 1, 2, 3, \ldots\}$ is defined as
$$h(x) = \frac{P(X = x)}{P(X \ge x)}, \qquad x \in \mathbb{Z}_+,$$
(see, e.g., Daly [16]). Consider the same example of the sum of two independent Poisson random variables with means 1 and 2 as in the previous section. Equations (3) and (5) lead to the saddlepoint approximation of the hazard rate function with the first continuity correction,
$$\hat{h}_{s1}(1) = \frac{0.1619728996}{0.9428487449} = 0.1717909589,$$
while the exact hazard rate based on Equations (4) and (6) is
$$h(1) = \frac{0.1493612051}{0.9502129316} = 0.1571870895.$$
Here, the absolute error between the saddlepoint and exact values is 0.01. For the normal approximation, based on Equation (7), the hazard rate approximation is
$$h_N(1) = \frac{0.0805}{0.8749} = 0.0920105$$
with an absolute error of 0.06.
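For the unit-weight case the exact hazard rate is available directly from the Poisson mass function, which gives a reference value for the comparisons above. A minimal sketch, with illustrative function names:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Exact Poisson probability P(X = k)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_right_tail(x: int, lam: float) -> float:
    """Exact P(X >= x), computed as 1 - P(X <= x - 1)."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(x))

def poisson_hazard(x: int, lam: float) -> float:
    """Exact discrete hazard rate h(x) = P(X = x) / P(X >= x)."""
    return poisson_pmf(x, lam) / poisson_right_tail(x, lam)

# Sum of Poisson(1) and Poisson(2), i.e. Poisson(3), at x = 1: about 0.15719
print(poisson_hazard(1, 3.0))
```

Subtracting the lower tail from one is adequate for small $x$; far in the right tail a direct summation of the upper tail would be numerically safer.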
The discussion above pertains to the special case of a linear combination of two independent, not identically distributed Poisson random variables with constants $c_1 = c_2 = 1$. We consider this case because the exact distribution of this sum is known, which makes it possible to compare the exact solution with the approximation methods.
Now we consider other positive real values for $c_1$ and $c_2$, which give a general linear combination of independent Poisson random variables. It has a complicated distribution, and the exact mass and density functions are, in most cases, unknown or difficult to obtain. Therefore, approximation methods are very important.
Table 1, Table 2 and Table 3 show the exact values of the hazard rate function together with the three saddlepoint continuity corrections and the corresponding normal approximations for the linear combination $L_{c1} = 3X_1 + 5X_2$, where $X_1 \sim \mathrm{Poisson}(2)$ and $X_2 \sim \mathrm{Poisson}(3)$, for different values of $x$. The exact value is found by generating $10^6$ random samples from the linear combination using the R programme.
In Table 1, Table 2 and Table 3, the performance of the saddlepoint approach is assessed through the relative error between the approximate and the exact values. The relative error of the saddlepoint approximation is smaller than the relative error of the normal approximation.
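The simulation step can be sketched as follows. This is not the author's R code: it is a Python illustration of a Monte Carlo estimate of $h(x) = P(X = x)/P(X \ge x)$ for $3X_1 + 5X_2$, using a hand-written Poisson sampler (Knuth's multiplication method), a fixed seed, and a smaller sample size than $10^6$ so that it runs quickly. All names are assumptions.

```python
import math
import random
from collections import Counter

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def empirical_hazard(xs, n_samples: int = 100_000, seed: int = 0) -> dict:
    """Estimate h(x) = P(L = x) / P(L >= x) for L = 3*X1 + 5*X2 by simulation.

    X1 ~ Poisson(2) and X2 ~ Poisson(3) are independent; n_samples is reduced
    from the 10**6 used in the text to keep the sketch fast.
    """
    rng = random.Random(seed)
    counts = Counter(
        3 * sample_poisson(2.0, rng) + 5 * sample_poisson(3.0, rng)
        for _ in range(n_samples)
    )
    hazards = {}
    for x in xs:
        at_x = counts.get(x, 0)
        at_or_above = sum(c for value, c in counts.items() if value >= x)
        hazards[x] = at_x / at_or_above if at_or_above else float("nan")
    return hazards

hazard_estimates = empirical_hazard([5, 10, 15, 20, 25])
print(hazard_estimates)
```

Because the estimate is a ratio of sample frequencies, its precision falls in the far right tail, where few samples land; increasing `n_samples` towards $10^6$ reduces that noise.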
Next, to show the effectiveness of this method of approximation, we consider a four-component linear combination of independent Poisson random variables
$$L_{c2} = c_1 X_1 + c_2 X_2 + c_3 X_3 + c_4 X_4.$$
Applying the same technique as for two independent Poisson random variables, we obtain the CGF
$$K_{L_{c2}}(s) = \lambda_1\left(e^{c_1 s} - 1\right) + \lambda_2\left(e^{c_2 s} - 1\right) + \lambda_3\left(e^{c_3 s} - 1\right) + \lambda_4\left(e^{c_4 s} - 1\right),$$
and the saddlepoint equation
$$K'_{L_{c2}}(\hat{s}) = \lambda_1 c_1 e^{c_1 \hat{s}} + \lambda_2 c_2 e^{c_2 \hat{s}} + \lambda_3 c_3 e^{c_3 \hat{s}} + \lambda_4 c_4 e^{c_4 \hat{s}} = x.$$
For example, suppose that
$$X_1 \sim \mathrm{Poisson}(1), \quad X_2 \sim \mathrm{Poisson}(2), \quad X_3 \sim \mathrm{Poisson}(5), \quad \text{and} \quad X_4 \sim \mathrm{Poisson}(3)$$
with constants $c_1 = c_2 = c_3 = c_4 = 1$. Consider the value $x = 1$. This also gives us a good opportunity to investigate the accuracy of this method, because the exact distribution is already known to be $\mathrm{Poisson}(11)$.
From the saddlepoint equation $K'_{L_{c2}}(\hat{s}) = x$, we obtain $e^{\hat{s}} = \frac{1}{11}$, so the saddlepoint is $\hat{s} = -2.3978952$.
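With unit weights the saddlepoint equation collapses to a single exponential, so the quantities needed in Equations (1) and (2) follow in closed form. Written out for $\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 11$ and $x = 1$:

```latex
\begin{align*}
K'_{L_{c2}}(\hat{s}) &= (1 + 2 + 5 + 3)\,e^{\hat{s}} = 11\,e^{\hat{s}} = 1
  \quad\Longrightarrow\quad \hat{s} = -\ln 11 \approx -2.3978952, \\
K_{L_{c2}}(\hat{s}) &= 11\left(e^{\hat{s}} - 1\right) = 11\left(\tfrac{1}{11} - 1\right) = -10, \\
K''_{L_{c2}}(\hat{s}) &= 11\,e^{\hat{s}} = 1, \\
K_{L_{c2}}(\hat{s}) - \hat{s}\,x &= -10 + 2.3978952 .
\end{align*}
```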
Based on Equation (1),
$$\hat{f}(1) = \frac{1}{\sqrt{2\pi}} \exp(-10 + 2.3978952) = 0.0001992819,$$
with its corresponding exact answer $f(1) = 11e^{-11} = 0.0001837187$. This gives a high level of accuracy for the mass function, with an absolute error of 0.00001556.
The exact right tail probability when $x = 1$ is obtained as
$$P(X \ge 1) = 1 - f(0) = 1 - e^{-11} = 0.99998.$$
Calculating the first continuity correction by Equation (2), we obtain
$$\tilde{w}_1 = -\left[2\left(-2.3978952 + 10\right)\right]^{0.5} = -3.89925 \quad \text{and} \quad \tilde{u}_1 = 1 - e^{2.3978952} = -9.9999.$$
This provides the first continuity correction of the saddlepoint approximation for the right tail with $x = 1$:
$$\widehat{\Pr}_1(X \ge 1) = 1 - \Phi(-3.89925) - \phi(-3.89925)\left(-\frac{1}{3.89925} + \frac{1}{9.9999}\right) = 1.000$$
with an absolute error of 0.00002.
The corresponding normal approximation for $x = 1$ is
$$P_N(X \ge 1) = 1 - \Phi\left(\frac{1 - 11}{\sqrt{11}}\right) = 0.8159$$
with an absolute error of 0.184. This means that the saddlepoint approximation is closer to the exact value than the normal approximation.
From the calculations above, we can derive the saddlepoint approximation of the hazard rate with the first continuity correction when $x = 1$ as
$$\hat{h}_{s1}(1) = \frac{\hat{f}(1)}{\widehat{\Pr}_1(X \ge 1)} = 0.0001992819$$
with an absolute error of 0.0000156 with respect to the exact hazard rate $h_e(1) = 0.0001837$.
Table 4, Table 5 and Table 6 present the three saddlepoint approximations, each with its associated normal approximation.
From Table 4, Table 5 and Table 6, we see that the saddlepoint approximation performs well, with a higher level of accuracy than the normal approximation; that is, the relative error of the saddlepoint approximation is smaller than that of the normal approximation.

5. Conclusions

The saddlepoint approximation technique is an efficient tool for deriving an accurate approximation of the hazard rate function for complicated models, such as a linear combination of random variables, with a relative error that remains bounded and very small. All three saddlepoint continuity corrections work very well and can replace the exact solution.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The author would like to thank the Editorial Board and the referees for their valuable comments and suggestions, which improved the final version of the manuscript. Special thanks to Andrei Volodin, Department of Mathematics and Statistics, University of Regina, Canada, for his valuable comments and supervision to improve the manuscript.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Daniels, H.E. Saddlepoint approximations in statistics. Ann. Math. Stat. 1954, 25, 631–650.
  2. Bell, C.B.; Haller, H.S. Bivariate symmetry tests: Parametric and nonparametric. Ann. Math. Stat. 1969, 40, 259–269.
  3. Kepner, J.L.; Randles, R.H. Comparison of tests for bivariate symmetry versus location and/or scale alternatives. Commun. Stat.-Theory Methods 1984, 13, 915–930.
  4. Rao, K.M.; Raghunath, M. A simple nonparametric test for bivariate symmetry about a line. J. Stat. Plan. Inference 2012, 142, 430–444.
  5. Dissanayake, M.; Trindade, A.A. An empirical saddlepoint approximation method for producing smooth survival and hazard functions under interval-censoring. Stat. Med. 2020, 39, 2755–2766.
  6. Al Mutairi, A. Improved Methods of Hazard Rate Function Calculation Using Saddlepoint Approximations. Lobachevskii J. Math. 2021, 42, 408–414.
  7. Amari, S.V.; Misra, R.B. Closed-form expressions for distribution of sum of exponential random variables. IEEE Trans. Reliab. 1997, 46, 519–522.
  8. Di Salvo, F. A characterization of the distribution of a weighted sum of gamma variables through multiple hypergeometric functions. Integral Transform. Spec. Funct. 2008, 19, 563–575.
  9. Almasi, I.; Jalilian, R.; Sayehmiri, K. The exact distribution of sums weights of Gamma variables. J. Iran. Stat. Soc. 2012, 11, 23–37.
  10. Al Mutairi, A.O.; Low, H.C. Improved measures of the spread of data for some unknown complex distributions using saddlepoint approximations. Commun. Stat.-Simul. Comput. 2016, 45, 33–47.
  11. Al Mutairi, A. Modified Branching Process and Saddlepoint Approximations. Lobachevskii J. Math. 2021, 42, 404–407.
  12. Butler, R.W. Saddlepoint Approximations with Applications; Cambridge University Press: Cambridge, UK, 2007.
  13. Daniels, H.E. Tail Probability Approximations. Int. Stat. Rev. 1987, 55, 37–48.
  14. Pentikäinen, T. Approximative Evaluation of the Distribution Function of Aggregate Claims 1. ASTIN Bull. J. IAA 1987, 17, 15–39.
  15. Borowiak, D.S. A saddlepoint approximation for tail probabilities in collective risk models. J. Actuar. Pract. 1999, 7, 239–249.
  16. Daly, F. On strong stationary times and approximation of Markov chain hitting times by geometric sums. Stat. Probab. Lett. 2019, 150, 74–80.
Table 1. First continuity saddlepoint correction $\hat{h}_{s1}(x)$ with its corresponding normal approximation $\hat{h}_{N1}(x)$ for $L_{c1}$.

| $x$ | Exact $h(x)$ | $\hat{h}_{s1}(x)$ | $\hat{h}_{N1}(x)$ | Relative Error Saddlepoint 1 | Relative Error Normal Approximation 1 |
|---|---|---|---|---|---|
| 5 | 0.0516 | 0.0776 | 0.0116 | 0.0500 | 0.7550 |
| 10 | 0.0620 | 0.0820 | 0.0220 | 0.3220 | 0.6451 |
| 15 | 0.1031 | 0.0781 | 0.0431 | 0.2420 | 0.5819 |
| 20 | 0.1431 | 0.1371 | 0.0731 | 0.0419 | 0.4108 |
| 25 | 0.1574 | 0.1614 | 0.0874 | 0.0254 | 0.4552 |
Table 2. Second continuity saddlepoint correction $\hat{h}_{s2}(x)$ with its corresponding normal approximation $\hat{h}_{N2}(x)$ for $L_{c1}$.

| $x$ | Exact $h(x)$ | $\hat{h}_{s2}(x)$ | $\hat{h}_{N2}(x)$ | Relative Error Saddlepoint 2 | Relative Error Normal Approximation 2 |
|---|---|---|---|---|---|
| 5 | 0.0516 | 0.0789 | 0.0088 | 0.5290 | 0.8294 |
| 10 | 0.0620 | 0.0883 | 0.0193 | 0.4240 | 0.6887 |
| 15 | 0.1031 | 0.0771 | 0.0404 | 0.2521 | 0.6081 |
| 20 | 0.1431 | 0.1306 | 0.0704 | 0.0873 | 0.5080 |
| 25 | 0.1574 | 0.1664 | 0.0847 | 0.0571 | 0.4618 |
Table 3. Third continuity saddlepoint correction $\hat{h}_{s3}(x)$ with its corresponding normal approximation $\hat{h}_{N3}(x)$ for $L_{c1}$.

| $x$ | Exact $h(x)$ | $\hat{h}_{s3}(x)$ | $\hat{h}_{N3}(x)$ | Relative Error Saddlepoint 3 | Relative Error Normal Approximation 3 |
|---|---|---|---|---|---|
| 5 | 0.0516 | 0.0771 | 0.0088 | 0.8294 | 0.8294 |
| 10 | 0.0620 | 0.0751 | 0.0193 | 0.6887 | 0.6887 |
| 15 | 0.1031 | 0.0788 | 0.0404 | 0.6081 | 0.6081 |
| 20 | 0.1431 | 0.1361 | 0.0704 | 0.5080 | 0.5080 |
| 25 | 0.1574 | 0.1669 | 0.0847 | 0.4618 | 0.4618 |
Table 4. First continuity saddlepoint correction $\hat{h}_{s1}(x)$ with its corresponding normal approximation $\hat{h}_{N1}(x)$ for $L_{c2}$.

| $x$ | Exact $h_e(x)$ | $\hat{h}_{s1}(x)$ | $\hat{h}_{N1}(x)$ | Relative Error Saddlepoint 1 | Relative Error Normal Approximation 1 |
|---|---|---|---|---|---|
| 1 | 0.0001837 | 0.0001992819 | 0.0256314 | 0.08480 | 138.528 |
| 2 | 0.0010102 | 0.0010203000 | 0.0360780 | 0 | 34.7130 |
| 3 | 0.0037048 | 0.0007183000 | 0.0123490 | 0.80611 | 2.33300 |
| 4 | 0.0102380 | 0.0102496000 | 0.0853800 | 0 | 7.33950 |
| 5 | 0.0229000 | 0.0229100000 | 0.0997130 | 0 | 3.35420 |
Table 5. Second continuity saddlepoint correction $\hat{h}_{s2}(x)$ with its corresponding normal approximation $\hat{h}_{N2}(x)$ for $L_{c2}$.

| $x$ | Exact $h_e(x)$ | $\hat{h}_{s2}(x)$ | $\hat{h}_{N2}(x)$ | Relative Error Saddlepoint 2 | Relative Error Normal Approximation 2 |
|---|---|---|---|---|---|
| 1 | 0.0001837 | 0.0002893 | 0.0299 | 0.5748 | 161.765 |
| 2 | 0.0010102 | 0.0011751 | 0.0410 | 34.713 | 39.586 |
| 3 | 0.0037048 | 0.0007710 | 0.0141 | 0.7918 | 2.8058 |
| 4 | 0.0102380 | 0.0192810 | 0.0779 | 0.8832 | 6.6080 |
| 5 | 0.0229000 | 0.0228100 | 0.0889 | 1.2270 | 2.8820 |
Table 6. Third continuity saddlepoint correction $\hat{h}_{s3}(x)$ with its corresponding normal approximation $\hat{h}_{N3}(x)$ for $L_{c2}$.

| $x$ | Exact $h_e(x)$ | $\hat{h}_{s3}(x)$ | $\hat{h}_{N3}(x)$ | Relative Error Saddlepoint 3 | Relative Error Normal Approximation 3 |
|---|---|---|---|---|---|
| 1 | 0.0001837 | 0.0001899 | 0.029910 | 0.0337 | 161.819 |
| 2 | 0.0010102 | 0.0019980 | 0.040130 | 0.9778 | 38.7200 |
| 3 | 0.0037048 | 0.0008011 | 0.013991 | 0.7837 | 2.77600 |
| 4 | 0.0102380 | 0.0203300 | 0.090340 | 0.9857 | 7.82390 |
| 5 | 0.0229000 | 0.0389100 | 0.088950 | 0.6991 | 2.88420 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
