Article

The Exponential Dispersion Model Generated by the Landau Distribution—A Comprehensive Review and Further Developments

Faculty of Industrial Engineering and Technology Management, HIT-Holon Institute of Technology, 52 Golomb St., Holon 5810201, Israel
Mathematics 2023, 11(20), 4343; https://doi.org/10.3390/math11204343
Submission received: 12 September 2023 / Revised: 13 October 2023 / Accepted: 16 October 2023 / Published: 19 October 2023
(This article belongs to the Section Probability and Statistics)

Abstract

The paper comprehensively studies the natural exponential family and its associated exponential dispersion model generated by the Landau distribution. These families exhibit probabilistic and statistical properties and are suitable for modeling skewed continuous data sets on the whole real line. The study explores and further develops various probabilistic properties, including reciprocity, self-decomposability, reproducibility, unimodality, and characterizations. It delves into statistical aspects such as maximum-likelihood estimation, hypothesis testing, and generalized linear models.

1. Introduction

The present paper reviews, develops, and provides a comprehensive study of the natural exponential family (NEF) of distributions, along with its exponential dispersion model (EDM), generated by the Landau distribution. Such NEFs and EDMs are associated with an exponential variance function (VF). They will be abbreviated henceforth as NEF-EVF and EDM-EVF, respectively. These families are absolutely continuous, supported on the whole real line, and rich in probabilistic and statistical properties.
Our study covers both the probabilistic and the statistical properties of these families, collecting results scattered in the literature and developing additional new ones. We present these families exhaustively in a unified approach and propose them as statistical model candidates for fitting skewed continuous data sets on the real line (with or without covariates). Various other continuous distributions supported on the whole real line are available in the statistical literature (e.g., the Behrens–Fisher distribution [1], the exponentially modified Gaussian distribution [2], the hyperbolic secant distribution [3], and the Johnson SU distribution [4]). However, we trust that our proposed NEF and EDM have many virtues (detailed in the sequel), making them important models for statistical modeling and GLM applications.
The standard Landau distribution $\mu_0$ is stable and supported on $\mathbb{R}$. In general, the $\alpha$-stable distribution requires four parameters for a complete description: a stability index $\alpha \in (0, 2]$ (cf. [5]), a skewness parameter $\beta \in [-1, 1]$, a scale parameter $\sigma > 0$, and a location parameter $b \in \mathbb{R}$. For $\mu_0$, the corresponding set of parameters is $\alpha = \beta = 1$, $\sigma = \pi/2$, and $b = 0$, yielding a characteristic function (c.f.)
$$\varphi_{\mu_0}(t) = \exp\left\{-\frac{\pi}{2}|t| - it\log|t|\right\}$$
and density
$$f_0(x) = \frac{1}{\pi}\int_0^\infty e^{-xy - y\log y}\sin(\pi y)\,dy, \quad x \in \mathbb{R}.$$
The measure $\mu_0$ is analyzed in [5]. It is also studied by [6] and further discussed in different contexts by [7,8,9]. It is named after Lev Landau owing to his brilliant treatment of ionization losses, i.e., the energy losses of fast charged particles traveling through matter. This process has been studied for over 100 years, and the theoretical explanation spans a similar period. About 80 years ago, ref. [10] published a theoretical paper on the subject that substantially advanced the research and remains among the most cited in the field. See [11,12] for more on the history of the theoretical developments and attempts to clarify Landau's method of research and the function named after him.
By transforming the measure $\mu_0$ by $x \mapsto x + b$, $b \in \mathbb{R}$, we obtain the measure $\mu_b$ with density
$$f_b(x) = \frac{1}{\pi}\int_0^\infty e^{(b-x)y - y\log y}\sin(\pi y)\,dy, \quad x \in \mathbb{R}, \tag{1}$$
and c.f.
$$\varphi_{\mu_b}(t) = \exp\left\{-\frac{\pi}{2}|t| + it\,(b - \log|t|)\right\}.$$
This simple translation yielding $\mu_b$ has some practical importance, as seen in the sequel. First, it provides, for $b = 1$, a more elegant expression for the VF of the NEF-EVF, and second, it allows a simple computation of the density of its corresponding EDM.
The paper is organized as follows. Section 2 is dedicated to preliminaries. We first elaborate on the measure $\mu_b$ and its Laplace transform and provide a new proof for the form of this transform, which is needed for all derivations in the sequel. We then present basic preliminaries on NEFs, EDMs, and associated VFs. Section 3 introduces the EDM-EVF generated by the measure $\mu_b$ and derives the corresponding densities. Section 4 and Section 5 present numerous probabilistic and statistical properties of EDM-EVFs. In Section 4, we derive expressions for the cumulants, skewness, and kurtosis coefficients and show that the EDM-EVF densities are skewed to the right and leptokurtic (i.e., have fatter tails). We show that the VF of the EDM-EVF is a limit of VFs in the Tweedie scale, which is the reason for calling it the Tweedie family with power infinity. We study the EDM-EVF with respect to the following properties: reciprocity, self-decomposability and unimodality, reproducibility in the broad and regular sense, duality, chainability (a new notion), characterizations by zero regression on the sample mean, and large deviations. Section 5 describes and develops various statistical features related to EDM-EVFs. In particular, we consider maximum-likelihood estimation, second-order minimax estimation of the mean, and hypothesis testing, and we describe practical steps toward GLM applications. Some concluding remarks are presented in Section 6.

2. Preliminaries

This section includes two subsections. The first is devoted to deriving the Laplace transform (LT) of the Landau distribution $\mu_b$, which is needed to define the set NEF-EVF. The second introduces some essential preliminaries on NEFs and EDMs required to determine the sets NEF-EVF and EDM-EVF generated by the Landau distribution $\mu_b$.
Let $\mu$ be a non-Dirac positive measure on $\mathbb{R}$. Its LT $L_\mu$ and effective domain $D_\mu$ are defined as
$$L_\mu(\theta) = \int_{\mathbb{R}} e^{\theta x}\,\mu(dx)$$
and
$$D_\mu = \{\theta \in \mathbb{R} : L_\mu(\theta) < \infty\}.$$
Also, let
$$\Theta_\mu = \operatorname{int} D_\mu$$
and define the cumulant transform of $\mu$ by
$$k_\mu(\theta) = \log L_\mu(\theta), \quad \theta \in \Theta_\mu.$$

2.1. The Laplace Transform of the Landau Distribution

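The form of the LT of the Landau measure $\mu_0$ is presented in various sources in the literature as
$$L_{\mu_0}(\theta) = \int_{\mathbb{R}} e^{\theta x}\,\mu_0(dx) = (-\theta)^{-\theta}, \quad \theta < 0, \tag{5}$$
from which that of $\mu_b$ is simply
$$L_{\mu_b}(\theta) = \int_{\mathbb{R}} e^{\theta x}\,\mu_b(dx) = e^{b\theta}(-\theta)^{-\theta}, \quad \theta < 0,\ b \in \mathbb{R}. \tag{6}$$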
However, we were unable to find a proof of (5) in the literature. It is, of course, possible to derive the LT of $\mu_0$ from its c.f. given in (1), but such a derivation is not always simple. Since the present paper reviews and studies in detail the NEF and EDM generated by the Landau distribution, a study that depends heavily on the form of $L_{\mu_b}$, we decided, for the sake of completeness, to provide a proof of (6). This proof is presented in Proposition 1 below. It was provided to the author by Gérard Letac (Institut de Mathématiques de Toulouse, Université Paul Sabatier, France).
Proposition 1.
The Laplace transform of $\mu_b$ has the form given in (6).
Proof. 
In the Lévy canonical representation, a c.f. of an infinitely divisible distribution $\mu$ has the form
$$\int e^{itx}\,\mu(dx) = \exp\left\{ibt - \frac{1}{2}\sigma^2 t^2 + \int_{\mathbb{R}\setminus\{0\}}\left(e^{itx} - 1 - it\,\tau(x)\right)\nu(dx)\right\},$$
where $b$ is a location parameter, $\sigma^2$ is the Gaussian parameter, $\tau(x)$ is a centering function, and $\nu$ is the Lévy measure defined by its c.f. (see [5,6]), which can be of type 0, 1, or 2. If the Lévy measure $\nu$ has type 2, then the convex support of $\mu$ is $\mathbb{R}$ even if the support of $\nu$ is contained in $(0, \infty)$ and the Gaussian coefficient $\sigma^2$ is zero. Ref. [6] uses $\tau(x) = \sin x$, while the Russian literature uses $\tau(x) = x/(1 + x^2)$. We shall adopt Feller's usage of $\tau$. Since $\mu_b$ is stable and supported on $\mathbb{R}$, and its associated Lévy measure is $\nu(dx) = \frac{1}{x^2} I(x > 0)\,dx$, it follows that its c.f. is
$$\exp\left\{ibt + \int_0^\infty \left(e^{itx} - 1 - it\sin x\right)\frac{dx}{x^2}\right\}$$
and thus its LT has the form
$$L_b(\theta) = \int e^{\theta x}\,\mu_b(dx) = \exp\left\{\theta b + \int_0^\infty \left(e^{\theta x} - 1 - \theta\sin x\right)\frac{dx}{x^2}\right\}.$$
The trick is to integrate by parts first and then split the result into two integrals:
$$\int_0^\infty \left(e^{\theta x} - 1 - \theta\sin x\right)\frac{dx}{x^2} = \theta\int_0^\infty \left(e^{\theta x} - \cos x\right)\frac{dx}{x} = \theta\,(I_1 + I_2),$$
where
$$I_1 = \int_0^\infty \left(e^{\theta x} - e^{-x}\right)\frac{dx}{x}, \qquad I_2 = \int_0^\infty \left(e^{-x} - \cos x\right)\frac{dx}{x}.$$
Recall that the Frullani integral is $\int_0^\infty (f(ax) - f(bx))\frac{dx}{x} = f(0)\log(b/a)$ when $a, b > 0$, $f$ is continuous on $[0, \infty)$, and $\int_1^\infty |f(x)|\frac{dx}{x}$ converges. By applying this to $f(x) = e^{-x}$, $a = -\theta$, and $b = 1$, we obtain $I_1 = -\log(-\theta)$. Showing that $I_2 = 0$ is more complicated. One way to proceed is to consider the entire function $f(z) = (e^{-z} - e^{iz})/z$ in
$$\int_0^R \left(e^{-x} - \cos x\right)\frac{dx}{x} = \operatorname{Re}\int_0^R f(z)\,dz,$$
and to apply the Cauchy theorem to $f$ along the segment $(0, R)$, the quarter circle $\{Re^{it};\ 0 \le t \le \pi/2\}$, and the segment $(iR, 0)$. In this way, we obtain
$$2\operatorname{Re}\int_0^R f(z)\,dz = -\operatorname{Re}\, i\int_0^{\pi/2}\left(e^{-Re^{it}} - e^{iRe^{it}}\right)dt.$$
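We now use the following majorization for $0 \le t \le \pi/2$:
$$\left|e^{-Re^{it}} - e^{iRe^{it}}\right| \le \left|e^{-Re^{it}}\right| + \left|e^{iRe^{it}}\right| = e^{-R\cos t} + e^{-R\sin t} \le 2.$$ For every $t \in (0, \pi/2)$ the sum $e^{-R\cos t} + e^{-R\sin t}$ tends to 0 as $R \to \infty$, so by dominated convergence the integral over the quarter circle vanishes in the limit.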
Therefore, $I_2 = 0$. Another way to show this is to compute $F(s) = \int_0^\infty (e^{-x} - \cos x)\,e^{-sx}\frac{dx}{x}$ for $s > 0$. Since
$$-F'(s) = \frac{1}{1+s} - \frac{s}{1+s^2}$$
is easily computed and since $F(\infty) = 0$, we obtain that $-F(s) = \log(1+s) - \frac{1}{2}\log(1+s^2)$, and thus $I_2 = F(0) = 0$. This concludes the proof of (6). □
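The two ingredients of the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it approximates $I_1$ by a midpoint rule (the helper names `frullani_i1` and `log_laplace_landau`, the truncation point, and the step count are illustrative assumptions) and compares $\theta I_1$ with the exponent $-\theta\log(-\theta)$ of (6) for $b = 0$, taking $I_2 = 0$ as established above.

```python
import math


def frullani_i1(theta, upper=80.0, steps=160_000):
    """Midpoint-rule estimate of I1 = int_0^inf (e^{theta*x} - e^{-x}) / x dx, theta < 0."""
    if theta >= 0:
        raise ValueError("theta must be negative")
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoints never touch the removable singularity at x = 0
        total += (math.exp(theta * x) - math.exp(-x)) / x
    return total * h


def log_laplace_landau(theta, b=0.0):
    """log L_{mu_b}(theta) = b*theta - theta*log(-theta), the closed form in (6)."""
    return b * theta - theta * math.log(-theta)


for theta in (-0.5, -2.0):
    i1 = frullani_i1(theta)
    # With I2 = 0, the exponent in the proof (for b = 0) is theta * I1.
    print(theta, i1, -math.log(-theta), theta * i1, log_laplace_landau(theta))
```

The truncation at `upper = 80` is harmless for the values of $\theta$ shown because the integrand decays exponentially; values of $\theta$ much closer to 0 would need a larger upper limit.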

2.2. Preliminaries on NEFs and EDMs

Let $\mu$ be a non-Dirac positive measure on $\mathbb{R}$ with LT $L_\mu$ and cumulant transform $k_\mu$ as defined in (1)–(4). We henceforth define the notions of NEF, VF, EDM, and other related quantities needed to describe NEF-EVF (for a good reference, see [13]).
NEF. The NEF $F$ generated by $\mu$ is defined by the probabilities
$$F = F(\mu) = \left\{P(\theta, \mu)(dx) = \exp\{\theta x - k_\mu(\theta)\}\,\mu(dx),\ \theta \in \Theta_\mu\right\}.$$
Full NEF. It is defined as follows. Let $F(\mu)$ be an NEF generated by $\mu$. If $\mu$ is a probability measure, then
$$\bar{F} = \bar{F}(\mu) = F(\mu) \cup \{\mu\} \tag{8}$$
is called the full NEF generated by $\mu$.
Cumulants. The cumulants of $F(\mu)$ are obtained by the derivatives of $k_\mu$. The $r$-th cumulant of $F(\mu)$ is given by
$$\kappa_r(\theta) \doteq \frac{d^r k_\mu(\theta)}{d\theta^r}, \quad \theta \in \Theta_\mu,\ r = 1, 2, \ldots \tag{9}$$
In particular, $\kappa_1(\theta) = k_\mu'(\theta)$ and $\kappa_2(\theta) = k_\mu''(\theta)$ are the mean and variance of $F(\mu)$.
Mean domain. The image $M_F = \kappa_1(\Theta_\mu)$ of $\Theta_\mu$ under $\kappa_1$ is called the mean domain of $F(\mu)$. Since $k_\mu$ is strictly convex and real analytic on $\Theta_\mu$, the map $\theta \mapsto \kappa_1(\theta)$ is one to one, and its inverse function $\psi_\mu : M_F \to \Theta_\mu$ is well defined.
VF. The variance corresponding to $P(m, F)$ is $V_F(m) = 1/\psi_\mu'(m) = k_\mu''(\theta)$. The map $m \mapsto V_F(m)$ from $M_F$ into $\mathbb{R}^+$ is called the VF of $F$. More precisely, a VF of an NEF $F$ is a pair $(V_F, M_F)$. It uniquely determines an NEF within the class of NEFs (cf. [3,14]). Since VFs were first defined by [3], various classes of VFs have appeared in numerous papers in the last four decades. Morris himself characterized all six NEFs having at most quadratic VFs (e.g., Poisson with VF $(m, \mathbb{R}^+)$ and gamma with VF $(am^2, \mathbb{R}^+)$); a familiar example beyond the quadratic class is the inverse Gaussian NEF, with cubic VF $(am^3, \mathbb{R}^+)$.
Steep NEF. An NEF $F = F(\mu)$ is called steep iff $M_F = \operatorname{int} C_\mu$, where $C_\mu$ denotes the convex hull of the support of $\mu$. The steepness of $F$ (or $\mu$) ensures that the MLE of the canonical parameter $\theta$ or the related mean exists with probability 1 and is given as the unique solution of the maximum-likelihood equation based on $n$ independent replicas $X_1, \ldots, X_n$ taken from $P(\theta, \mu)$ (cf. [15], Chapter 9). This property is essential in various derivations of the MLE and GLM applications (see Section 4.8 and Section 6).
Mean value parameterization. For a given VF $(V_F, M_F)$ of an NEF $F$, it is clear that $\psi_\mu(m) = \theta$ and $\varphi_\mu(m) \doteq k_\mu(\theta)$ are two primitives of $1/V(m)$ and $m/V(m)$, respectively; i.e.,
$$\psi_\mu(m) = \int \frac{dm}{V(m)}, \qquad \varphi_\mu(m) = \int \frac{m\,dm}{V(m)}. \tag{10}$$
Accordingly, the NEF $F$ can be represented as
$$F = \left\{P(m, \mu)(dx) = \exp\{x\,\psi_\mu(m) - \varphi_\mu(m)\}\,\mu(dx) : m \in M_F\right\}. \tag{11}$$
The reparameterization in (11) is called the mean value parameterization of $F$ (see [13], Proposition 2.3, and [16]). Such a parameterization of $F$ in terms of its mean $m$ is more appealing than that of $\theta$, as $\theta$ is just an artificial parameter (the argument of the LT of $\mu$).
EDM. An EDM is related to an NEF as follows. The Jorgensen set $\Lambda = \Lambda_F$ associated with $F$ is defined by
$$\Lambda = \Lambda(F) = \left\{p \in \mathbb{R}^+ : p\,k_\mu(\theta) \text{ is a cumulant transform of some Radon measure } \mu_p\right\}.$$
Note that $\mu_p$, $p \in \Lambda$, is the $p$-th fold convolution of $\mu$ (even if $p$ is not a positive integer); i.e., $L_{\mu_p}(\theta) = L_\mu^p(\theta)$, $\theta \in \Theta_\mu$.
The Jorgensen set is nonempty since, by convolution, it contains $\mathbb{N}$. Also, $\Lambda = \mathbb{R}^+$ iff $\mu$ is infinitely divisible (and thus $F(\mu)$ is composed of infinitely divisible (i.d.) distributions).
For $p \in \Lambda$, the NEF $F_p \doteq F(\mu_p)$ generated by $\mu_p$ is
$$F_p \doteq F_p(\mu_p) = \left\{P(\theta, p, \mu_p)(dx) = \exp\{\theta x - p\,k_\mu(\theta)\}\,\mu_p(dx),\ \theta \in \Theta_\mu\right\}, \quad p \in \Lambda,$$
where the support of $\mu_p$ may depend on $p$. For $p \in \Lambda$, we denote, respectively, the mean function, mean domain, and VF of the NEF $F_p$ by $m_p$, $M_p$, and $V_p$ (instead of $m_{F_p}$, $M_{F_p}$, and $V_{F_p}$). These are given by
$$m_p = p\,k_\mu'(\theta) = pm, \qquad M_p = pM_F$$
and
$$V_p(m) = p\,k_\mu''(\theta) = p\,V_F(m),\ m \in M_F, \quad \text{or} \quad V_p(m_p) = p\,V_F\!\left(\frac{m_p}{p}\right),\ m_p \in M_p.$$
Then, the set of families
$$G = \{F_p : p \in \Lambda\}$$
is called the EDM associated with $F(\mu)$, and it is parameterized by $(\theta, p) \in \Theta_\mu \times \Lambda$.
EDMs have been studied thoroughly by [17,18] and others, who suggested using them to describe the error component in generalized linear models (GLMs). The statistical literature contains hundreds of articles applying EDMs in GLM methodology.
Remark 1.
It should be noted, however, that applying EDMs in GLM methodology requires knowledge of the exact expression of the measure $\mu_p$, $p \in \Lambda$, appearing in the EDM model (12). Such an expression may be very complicated to obtain, if it is attainable at all. To appreciate the difficulty, consider the case $p = n \in \mathbb{N}$. For this case, $\mu_n$ is the $n$-fold convolution of the generating measure $\mu$ of $F(\mu)$, which is a rather complicated computational task.
In the sequel, when no confusion is caused, we shall suppress the dependence of $D_\mu$, $\Theta_\mu$, $k_\mu$, $M_F$, and $V_F$ on $\mu$ and $F$ and write $D$, $\Theta$, $k$, $M$, and $V$.
To conclude these introductory preliminaries, we state two rules regarding transformations of an NEF $F = F(\mu)$. These are
the Jorgensen rule:
$$\text{if } p \in \Lambda(F) \text{ then } V_{F_p}(m) = p\,V_F(m/p), \tag{13}$$
and
the affine rule:
$$\text{if } \varphi(x) = ax + b \text{ then } V_{\varphi(F)}(m) = a^2\,V_F\big((m - b)/a\big). \tag{14}$$

3. The NEF-EVF and EDM-EVF

These are described in the two following subsections.

3.1. The NEF-EVF Generated by the Landau Distribution $\mu_1$

The density and LT of the Landau measure $\mu_b$ are given, respectively, by (1) and (6). Hence,
$$m_b = k_{\mu_b}'(\theta) = b - \ln(-\theta) - 1, \qquad V_{F_b}(\theta) = k_{\mu_b}''(\theta) = -\frac{1}{\theta}, \quad \theta < 0, \tag{15}$$
and thus
$$V_b(m_b) = e^{m_b - b + 1}, \quad m_b \in \mathbb{R}. \tag{16}$$
The choice $b = 1$ in (16) provides a more elegant form of the VF. We adopt this choice and consider the NEF-EVF to be generated by $\mu_1$ with cumulant transform
$$k_1(\theta) \doteq k_{\mu_1}(\theta) = \theta - \theta\log(-\theta). \tag{17}$$
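The passage from the cumulant transform to the exponential VF can be verified numerically. The sketch below is illustrative only: it assumes the form $k_b(\theta) = b\theta - \theta\log(-\theta)$ implied by (6), approximates $k_b'$ and $k_b''$ by central finite differences (the helper names and the step `h` are assumptions), and compares the resulting variance with $e^{m_b - b + 1}$.

```python
import math


def k_b(theta, b):
    """Cumulant transform of mu_b: k_b(theta) = b*theta - theta*log(-theta), theta < 0."""
    return b * theta - theta * math.log(-theta)


def mean_and_variance(theta, b, h=1e-4):
    """Central finite differences for k_b'(theta) (mean) and k_b''(theta) (variance)."""
    mean = (k_b(theta + h, b) - k_b(theta - h, b)) / (2 * h)
    var = (k_b(theta + h, b) - 2 * k_b(theta, b) + k_b(theta - h, b)) / h**2
    return mean, var


def variance_function(m, b):
    """Closed-form VF: V_b(m) = exp(m - b + 1)."""
    return math.exp(m - b + 1)


for b in (0.0, 1.0, 2.5):
    for theta in (-0.3, -1.0, -4.0):
        mean, var = mean_and_variance(theta, b)
        # The last two printed columns should agree up to finite-difference error.
        print(b, theta, mean, var, variance_function(mean, b))
```

For $b = 1$ the last column reduces to $e^{m}$, which is the form adopted in the text.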
Hence, the NEF-EVF $F(\mu_1)$ is presented by density and VF,
$$f_1(x; \theta) \doteq f_1(x)\exp\{\theta x - [\theta - \theta\ln(-\theta)]\}, \quad x \in \mathbb{R},\ \theta < 0, \tag{18}$$
and
$$(V, M) = (e^m, \mathbb{R}), \tag{19}$$
where, by (1), $f_1$ is
$$f_1(x) = \frac{1}{\pi}\int_0^\infty e^{(1-x)y - y\log y}\sin(\pi y)\,dy. \tag{20}$$
The NEF-EVF $F(\mu_1)$ is discussed in the context of EDMs by [17,18,19]. (For simplicity of notation, we have suppressed the dependence of $m_1$ and $V_1$ on $b = 1$.)
Note that the full family $\bar{F}(\mu_1)$ (see (8)) related to $F(\mu_1)$ is well defined as $\mu_1$ is a probability measure, in which case $\bar{F}(\mu_1) = F(\mu_1) \cup \{\mu_1\}$. The full family is relevant in Section 4.4 when discussing the properties of self-decomposability and unimodality of $F(\mu_1)$.
The mean value parameterization of $F(\mu_1)$ is obtained from (19), (10) and (11), with
$$\psi_1(m) = \theta(m) = -e^{-m} \quad \text{and} \quad \varphi_1(m) = k_1(\theta(m)) = -e^{-m}(m + 1). \tag{21}$$
Thus, the mean value parameterization of (18) is
$$f_1(x)\exp\left\{-e^{-m}x + e^{-m}(m + 1)\right\}, \quad x, m \in \mathbb{R}. \tag{22}$$
Such a representation of $F(\mu_1)$ is needed when discussing GLM methodology in Section 5.4.

3.2. The EDM-EVF Generated by $\mu_1$

Since $\mu_1$ is stable (and thus is infinitely divisible), $L_1^p$ is a LT of some measure $\mu_p$, for all $p \in \mathbb{R}^+$. Hence, the EDM-EVF generated by $\mu_1$ has, by (12), densities of the form
$$f_p(x; \theta) \doteq \mu_p(x)\exp\{\theta x - p\,k_\mu(\theta)\} = \mu_p(x)\exp\{\theta x - p[\theta - \theta\ln(-\theta)]\}, \quad x \in \mathbb{R},\ \theta < 0,\ p \in \Lambda, \tag{23}$$
where
$$k_p(\theta) = p[\theta - \theta\ln(-\theta)] \tag{24}$$
is the cumulant transform of $\mu_p$. So, the question is: what is $\mu_p$? As indicated before, answering this is, in general, a rather complicated problem. Fortunately, in our case, $\mu_1$ is stable, a fact that allows the computation of $\mu_p$ using the general form of the density $f_b$ given in (1). This is performed in the following proposition.
Proposition 2.
Consider the EDM-EVF in (23). Then:
(i) The Jorgensen set of $\mu_1$ is $\Lambda = \mathbb{R}^+$.
(ii) With $f_b$ defined in (1), $\mu_p$ is
$$\mu_p(x) = \frac{1}{p}\,f_{1+\ln p}(x/p),$$
and the densities $f_p(x; \theta)$ of the EDM-EVF in (23) are
$$f_p(x; \theta) = \frac{1}{p}\,f_{1+\ln p}(x/p)\exp\{\theta x - p[\theta - \theta\ln(-\theta)]\} = \frac{1}{p\pi}\int_0^\infty e^{(1 - \frac{x}{p})y - y\log y}\,p^y\sin(\pi y)\,dy\,\exp\{\theta x - p[\theta - \theta\ln(-\theta)]\}, \quad x \in \mathbb{R},\ (\theta, p) \in \mathbb{R}^- \times \mathbb{R}^+. \tag{25}$$
Proof. 
(i) This is simple as $\mu_1$ is infinitely divisible, implying that $L_1^p$ is a LT for all $p \in \mathbb{R}^+ = \Lambda$.
(ii) Note that
$$p\,k_1(\theta) = p[\theta - \theta\ln(-\theta)] = p\theta - p\theta[\ln(-p\theta) - \ln p] = p\theta(1 + \ln p) - p\theta\ln(-p\theta),$$
i.e.,
$$p\,k_1(\theta) = k_{1+\ln p}(p\theta),$$
where $k_{\mu_b}(\theta) = \log L_{\mu_b}(\theta)$, with $L_{\mu_b}$ given in (6). This means that
$$L_p(\theta) = L_1^p(\theta) = \int e^{\theta x}\,\mu_p(dx) = \int e^{p\theta x}\,f_{1+\ln p}(x)\,dx,$$
which, by changing variables $x \mapsto x' = px$, leads to
$$L_p(\theta) = \frac{1}{p}\int e^{\theta x}\,f_{1+\ln p}(x/p)\,dx,$$
or, equivalently, that
$$\mu_p(x) = \frac{1}{p}\,f_{1+\ln p}(x/p).$$
Thus, using (1) with $b = 1 + \ln p$ leads to (25). □
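The key identity of the proof, $p\,k_1(\theta) = k_{1+\ln p}(p\theta)$, is easy to confirm numerically. The snippet below is a small sketch under the assumption $k_b(\theta) = b\theta - \theta\log(-\theta)$; the function names are illustrative and not taken from the paper.

```python
import math


def k_b(theta, b):
    """Cumulant transform of mu_b: k_b(theta) = b*theta - theta*log(-theta), theta < 0."""
    return b * theta - theta * math.log(-theta)


def k_p(theta, p):
    """Cumulant transform (24) of mu_p: p * k_1(theta)."""
    return p * k_b(theta, 1.0)


for p in (0.5, 1.0, 3.0):
    for theta in (-0.2, -1.0, -5.0):
        lhs = k_p(theta, p)
        rhs = k_b(p * theta, 1.0 + math.log(p))  # k_{1 + ln p}(p * theta)
        print(p, theta, lhs, rhs)
```

Read at the level of random variables, the identity says that if $X$ has law $\mu_1$, then $pX + p\ln p$ has law $\mu_p$, which is the same affine map that appears in the broad-sense reproducibility property of Section 4.5.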
Remark 2.
In particular, we note the simple but interesting case where $p = n$. This case concerns the convolution of $n$ i.i.d. random variables $X_1, \ldots, X_n$ taken from an NEF-EVF distribution. So if $Y_n = \sum_{i=1}^n X_i$, then by (25) the density of $Y_n$ is
$$f_n(x; \theta) = \frac{1}{n\pi}\int_0^\infty e^{(1 - \frac{x}{n})y - y\log y}\,n^y\sin(\pi y)\,dy\,\exp\{\theta x - n[\theta - \theta\ln(-\theta)]\}, \quad x \in \mathbb{R}.$$
The VF corresponding to EDM-EVF is, by (12),
$$(V, M) = \left(p\,e^{m/p}, \mathbb{R}\right), \quad p > 0,$$
where its mean value parameterization is obtained from (10) by setting
$$\psi_p(m) = -e^{-m/p}, \qquad \varphi_p(m) = -e^{-m/p}(m + p),$$
in which case
$$f_p(x; m) = \frac{1}{p}\,f_{1+\ln p}(x/p)\exp\left\{-e^{-m/p}x + e^{-m/p}(m + p)\right\}, \quad x \in \mathbb{R},\ (m, p) \in \mathbb{R} \times \mathbb{R}^+. \tag{26}$$
Figure 1 plots the EDM-EVF density (26) for the four pairs $(p, m) = (1, 2), (1, 6), (0.5, 2), (0.5, 4)$. The skewness to the right is clearly evident.
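Densities of this kind can be evaluated directly from the integral representation (1). The following is a minimal sketch, not the procedure used to produce Figure 1: it applies Simpson's rule to the integral over $y$, truncated at an assumed upper limit, and then applies the exponential tilt of the mean value parameterization. The names `landau_density` and `edm_density`, the truncation point, and the step count are illustrative assumptions. The integrand oscillates with amplitude of order $e^{e^{b-x}}$, so the sketch is only reliable when $x - b$ is not far in the left tail.

```python
import math


def landau_density(x, b=0.0, upper=40.0, steps=4000):
    """Simpson estimate of f_b(x) = (1/pi) int_0^inf exp((b-x)y - y log y) sin(pi y) dy."""

    def integrand(y):
        if y == 0.0:
            return 0.0  # y*log(y) -> 0 and sin(pi*y) -> 0 as y -> 0
        return math.exp((b - x) * y - y * math.log(y)) * math.sin(math.pi * y)

    h = upper / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / (3 * math.pi)


def edm_density(x, m, p):
    """Mean-value form: (1/p) f_{1+ln p}(x/p) exp(-e^{-m/p} x + e^{-m/p} (m + p))."""
    base = landau_density(x / p, b=1.0 + math.log(p)) / p
    tilt = math.exp(-m / p)
    return base * math.exp(-tilt * x + tilt * (m + p))


# The standard Landau density (b = 0) peaks near x = -0.22.
print(landau_density(-0.2228))
print(edm_density(2.0, m=2.0, p=1.0), edm_density(2.0, m=2.0, p=0.5))
```

Evaluating `edm_density` on a grid of `x` values for each pair $(p, m)$ gives curves that can be plotted with any plotting tool.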

4. Probabilistic Features of NEF-EVF and EDM-EVF

This section presents and develops several probabilistic features of both NEF-EVF and EDM-EVF. Usually, we present these for EDM-EVF, as those for NEF-EVF are obtained by setting $p = 1$. However, sometimes we present them only for NEF-EVF for the sake of notational simplicity.
These probabilistic features include: (1) A derivation of cumulants, skewness, and kurtosis coefficients. Primarily, we show that all such distributions are skewed to the right and leptokurtic; (2) a presentation of EVFs as a limit of a sequence of VFs in the Tweedie scale; (3) the property of reciprocity; (4) properties of self-decomposability and unimodality; (5) reproducibility in the broad and regular sense; (6) duality; (7) chainability (a new property of infinitely divisible distributions); (8) characterizations by zero regression on the sample mean; and (9) large deviations.

4.1. Cumulants, Central Moments, Skewness, and Kurtosis

By (9), the $r$-th cumulant is $\kappa_r = d^r k_p(\theta)/d\theta^r$, where $k_p$ is given by (24). This yields
$$\kappa_{r+1}(\theta) = p\,(r-1)!\,(-\theta)^{-r}, \quad r = 1, 2, \ldots$$
or in terms of $m$,
$$\kappa_{r+1} = p\,(r-1)!\,e^{rm/p}, \quad r = 1, 2, \ldots$$
Let $m_r$, $r = 2, \ldots$, denote the $r$-th central moment. Then, $m_2 = \kappa_2$, $m_3 = \kappa_3$ and
$$m_{r+2} = \kappa_{r+2} + \sum_{j=2}^{r}\binom{r+1}{j}\,m_j\,\kappa_{r-j+2}, \quad r = 2, 3, \ldots$$
The skewness and kurtosis coefficients are
$$\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}} = p^{-1/2}e^{m/2p} > 0, \qquad \gamma_2 = \frac{\kappa_4 + 3\kappa_2^2}{\kappa_2^2} = \frac{\kappa_4}{\kappa_2^2} + 3 = 2p^{-1}e^{m/p} + 3 > 3. \tag{29}$$
Hence, all members of the EDM-EVF are skewed to the right and leptokurtic. Note that (29) entails an interesting observation that the kurtosis coefficient $\gamma_2$ is quadratic in $\gamma_1$; i.e.,
$$\gamma_2 = 2\gamma_1^2 + 3.$$
Also, note that by (27) and (28) all central moments are positive. Accordingly, by defining
$$\gamma_{2n+1} = \frac{m_3\,m_{2n+3}}{m_2^{n+3}}, \quad n \ge 1,$$
as general measures of "skewness" ([20], Section 3.31), all such skewness measures are also positive.
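The formulas of this subsection translate directly into a few lines of code. The sketch below is illustrative (function names are assumptions): it computes the cumulants in terms of $(m, p)$, builds the central moments with the recursion above, and confirms the relation $\gamma_2 = 2\gamma_1^2 + 3$.

```python
import math
from math import comb, factorial


def cumulant(r, m, p):
    """kappa_1 = m and kappa_r = p (r-2)! exp((r-1) m / p) for r >= 2."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    if r == 1:
        return m
    return p * factorial(r - 2) * math.exp((r - 1) * m / p)


def central_moments(max_order, m, p):
    """Central moments m_2..m_max via m_n = kappa_n + sum_{j=2}^{n-2} C(n-1, j) m_j kappa_{n-j}."""
    mom = {}
    for n in range(2, max_order + 1):
        mom[n] = cumulant(n, m, p) + sum(
            comb(n - 1, j) * mom[j] * cumulant(n - j, m, p) for j in range(2, n - 1)
        )
    return mom


def skewness_and_kurtosis(m, p):
    mom = central_moments(4, m, p)
    return mom[3] / mom[2] ** 1.5, mom[4] / mom[2] ** 2


for m, p in ((2.0, 1.0), (-1.0, 0.5), (0.0, 3.0)):
    g1, g2 = skewness_and_kurtosis(m, p)
    print(m, p, g1, g2, 2 * g1**2 + 3)
```

Because the cumulants grow factorially in $r$, higher-order central moments become large quickly; the recursion itself stays exact up to floating-point rounding.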

4.2. NEF-EVFs as a Limit of a Sequence of VFs in the Tweedie Scale

Mora [14] (see also [13]) discussed the situation when a limit of a sequence of VFs is a VF. As this work is also a review paper, we find it beneficial to quote Mora’s result.
Theorem 1
([14]). Let $(F_n)_{n=1}^\infty$ be a sequence of NEFs with VFs $(V_n, M_n)$. Assume there exists a nonempty open interval $J$ contained in $\bigcap_{n=1}^\infty M_n$ and a strictly positive function $V$ on $J$ with $\lim_{n\to\infty} V_n(m) = V(m)$, uniformly on all compact subintervals of $J$. Then:
(i) There exists an NEF $F$ such that $M_F \supset J$ and such that $V_F$ restricted to $J$ is equal to $V$.
(ii) For all $m \in J$, $\lim_{n\to\infty} P(m, F_n) = P(m, F)$, in the weak convergence sense.
We apply this theorem for an NEF-EVF (where the same holds for EDM-EVF). First note that $(m^n, \mathbb{R}^+)$ is a VF for all $n \in \mathbb{N}$ (cf. [21]). For the latter VF, apply the Jorgensen rule (13) with $p = n$ and then the affine rule (14) with
$$\varphi(x) = n^{\frac{1}{n-2}}\,x - n.$$
This yields
$$(V_n, M_n) = \left(\left(1 + \frac{m}{n}\right)^n, (-n, \infty)\right).$$
Finally, by letting $n \to \infty$, we obtain the limit $(e^m, \mathbb{R})$, the VF of the NEF-EVF in (19). This limiting behavior is the reason for calling the NEF-EVF a Tweedie NEF with power infinity (cf. [17]).
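The construction above can be traced step by step in code. The sketch below is illustrative: it implements the Jorgensen rule and the affine rule as higher-order functions (the names are assumptions), applies them to the power VF $m^n$ with $p = n$ and $\varphi(x) = n^{1/(n-2)}x - n$ (so $n \ge 3$ is required), and compares the result with $(1 + m/n)^n$ and with the limit $e^m$.

```python
import math


def jorgensen_rule(V, p):
    """V_{F_p}(m) = p * V(m / p)."""
    return lambda m: p * V(m / p)


def affine_rule(V, a, b):
    """V_{phi(F)}(m) = a^2 * V((m - b) / a) for phi(x) = a*x + b."""
    return lambda m: a * a * V((m - b) / a)


def tweedie_approximation(n):
    """VF obtained from m^n by the Jorgensen rule (p = n) and phi(x) = n^(1/(n-2)) x - n."""
    if n < 3:
        raise ValueError("this construction needs n >= 3")
    power_vf = lambda m: m**n
    a = n ** (1.0 / (n - 2))
    return affine_rule(jorgensen_rule(power_vf, n), a, -float(n))


m = 1.5
for n in (10, 1000, 100000):
    v_n = tweedie_approximation(n)(m)
    # Columns: n, composed VF, (1 + m/n)^n, and the limit e^m.
    print(n, v_n, (1 + m / n) ** n, math.exp(m))
```

The composed VF is only defined for $m > -n$, in line with the mean domain $(-n, \infty)$ stated above.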

4.3. Reciprocity

The formal definition of reciprocity between two NEFs is somewhat tedious, but it has some interesting probabilistic interpretations. For brevity, we provide here a theorem (which can also serve as a definition) regarding a reciprocal pair of NEFs, taken from [13] (for more details, see their Definition 5.1, Proposition 5.1, and Theorem 5.2).
Theorem 2
([13], Theorem 5.2). Let $F$ and $F_1$ be two NEFs and denote $\tilde{M}_F = M_F \cap (0, \infty)$, $\tilde{M}_{F_1} = M_{F_1} \cap (0, \infty)$. Then, $(F, F_1)$ is a reciprocal pair iff the three following conditions hold: (i) $\tilde{M}_F$ and $\tilde{M}_{F_1}$ are nonempty; (ii) the mapping $m \mapsto 1/m$ is bijective from $\tilde{M}_F$ onto $\tilde{M}_{F_1}$; (iii)
$$V_F(m) = m^3\,V_{F_1}(1/m), \quad m \in \tilde{M}_F.$$
The most famous examples of reciprocal pairs $(F, F_1)$ are perhaps: (i) the normal and inverse Gaussian NEFs given, respectively, by their VFs $(V_F, M_F) = (p, \mathbb{R})$, where $p > 0$ is constant, and $(V_{F_1}, M_{F_1}) = (p\,m^3, \mathbb{R}^+)$; and (ii) the exponential and Poisson NEFs given, respectively, by their VFs $(V_F, M_F) = (m^2, \mathbb{R}^+)$ and $(V_{F_1}, M_{F_1}) = (m, \mathbb{R}^+)$.
Although a general probabilistic interpretation of reciprocity is still lacking, certain cases (such as the above two examples) can be explained using fluctuation theory (see [22], pp. 24–26, and [23]).
We now apply reciprocity to the NEF-EVF $F(\mu_1)$ given by (18). Consider the image $F(\tilde{\mu}_1)$ of $F(\mu_1)$ by the dilation mapping $\varphi : x \mapsto -x$, and consider the pair $(F(\mu_1), F(\tilde{\mu}_1))$. By the affine rule, the VF of $F(\tilde{\mu}_1)$ is $(e^{-m}, \mathbb{R})$. The family $F(\tilde{\mu}_1)$ is composed of infinitely divisible members (see the next property below), and its Lévy measure is concentrated on the negative half-line; thus, $F(\tilde{\mu}_1)$ admits a reciprocal NEF, say $A$, whose mean domain is $\mathbb{R}^+$ and whose VF is
$$(V_A, M_A) = \left(m^3 e^{-1/m}, \mathbb{R}^+\right).$$
Here, $A$ is the family of stopping times $T = \inf\{t : X(t) = 1\}$, where $X(t)$ is a Lévy process such that the distribution of $X(1)$ is $P(\theta_0, \mu_1)$ (given by (18)) when $\theta_0$ varies in $(-\infty, 0)$. If we now consider the image $F(\tilde{\mu}_b)$ of $F(\tilde{\mu}_1)$ by a translation mapping $x \mapsto x + b$, then, as above, $F(\tilde{\mu}_b)$ admits a reciprocal NEF, say $A_b$, having mean domain $M_{A_b} = \mathbb{R}^+$ and VF $V_{A_b}(m) = m^3 e^{-1/m + b} = e^b\,V_A(m)$. This fact distinguishes $F(\mu_1)$ from other NEFs and could be formulated as a (rather trivial) characterization of $F(\mu_1)$ in (18).
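Condition (iii) of Theorem 2 is easy to test on a grid of positive means. The sketch below is illustrative only (the helper name `is_reciprocal` and the sample points are assumptions). It checks the two classical pairs and the pair formed by the reflected family, assuming that the image of $F(\mu_1)$ under $x \mapsto -x$ has VF $e^{-m}$, as follows from the affine rule (14) with $a = -1$, $b = 0$.

```python
import math


def is_reciprocal(V_F, V_F1, points, tol=1e-9):
    """Check V_F(m) = m^3 * V_F1(1/m) on the given positive means."""
    return all(
        math.isclose(V_F(m), m**3 * V_F1(1.0 / m), rel_tol=tol) for m in points
    )


points = (0.2, 0.5, 1.0, 2.0, 7.5)
p = 1.7  # arbitrary positive constant for the normal / inverse Gaussian pair

pairs = {
    "normal / inverse Gaussian": (lambda m: p, lambda m: p * m**3),
    "exponential / Poisson": (lambda m: m**2, lambda m: m),
    # Assumed VF of the reflected Landau family and the VF of its reciprocal A.
    "reflected EVF / A": (lambda m: math.exp(-m), lambda m: m**3 * math.exp(-1.0 / m)),
}

for name, (V_F, V_F1) in pairs.items():
    print(name, is_reciprocal(V_F, V_F1, points))
```

The check covers only the VF identity; conditions (i) and (ii) on the mean domains have to be verified separately.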

4.4. Self-Decomposability and Unimodality

Let $P$ be a probability on $\mathbb{R}$ and $H_\alpha(P)$ be its image by the map $x \mapsto H_\alpha(x) = \alpha x$ ($\alpha \ne 0$). Then $P$ is said to be self-decomposable if for every $\alpha \in (0, 1)$ there exists a probability $Q_\alpha$ on $\mathbb{R}$ such that $P = Q_\alpha * H_\alpha(P)$, where $*$ indicates convolution. Self-decomposability is an important property with a significant literature (see [5,6,24]). All self-decomposable probabilities are also infinitely divisible. Moreover, a striking property of self-decomposability is that it implies both absolute continuity and unimodality of $P$. This property has been shown by [25]. All stable distributions are self-decomposable and thus unimodal ([5]).
Bar-Lev, Bshouty, and Letac [26] dealt with the following problem: Consider a full NEF $\bar{F}(\mu) = \{P(\theta, \mu)(dx),\ \theta \in D_\mu\}$ generated by $\mu \in \mathcal{M}$. If a member of $\bar{F}(\mu)$ is self-decomposable, can one conclude that all other members of $\bar{F}(\mu)$ also have this property? They showed that this does not generally hold but provided necessary and/or sufficient conditions for this property. These conditions are related to the behavior of the Lévy measure associated with $\mu$. In particular, they showed that the full NEF-EVF $\bar{F}(\mu_1)$ generated by $\mu_1$ in (18) satisfies such conditions and thus is composed of self-decomposable members, implying that all NEF-EVF (and therefore its associated EDM-EVF) distributions are unimodal.

4.5. Reproducibility in the Broad and Regular Sense

Bar-Lev and Casalis [27] defined the notion of reproducibility in the broad sense and developed a discrete version of this definition. They showed that all NEFs in the Tweedie scale are such. This property is defined as follows. Let $F(\mu)$ be the NEF generated by $\mu$. Then $F$ is said to be reproducible in the broad sense if there exist a real number $p$ belonging to the Jorgensen set $\Lambda_F$ and an affine transformation $f_{\alpha,\beta} : x \mapsto \alpha x + \beta$, $\alpha \ne 0$, such that $f_{\alpha,\beta}(F) = F_p$. In other words, an NEF $F$ is reproducible in the broad sense if a $p$-th power convolution of $F$ equals an affine transformation of $F$.
The NEF-EVF $F(\mu_1)$ can easily be shown to be reproducible in the broad sense. Indeed, the cumulant transforms of $\mu_1$ and $\mu_p$ ($p \in \Lambda = \mathbb{R}^+$) are given, respectively, by (17) and (24) as $\theta - \theta\ln(-\theta)$ and $p[\theta - \theta\ln(-\theta)]$. Thus, $f_{\alpha,\beta}(k_1) = k_p$ implies
$$\alpha\theta - \alpha\theta\ln(-\alpha\theta) + \beta\theta = p[\theta - \theta\ln(-\theta)], \quad \theta < 0.$$
Thus, by choosing $\alpha = p$ and $\beta = p\ln p$, we obtain
$$f_{p,\,p\ln p}(F(\mu_1)) = F_p(\mu_1)\ (= F(\mu_p)) \quad \text{for all } p > 0,$$
implying that $F(\mu_1)$ is reproducible in the broad sense.
The regular definition of reproducibility, which preceded the one given by [27], is a particular case of the above definition. It was first defined by [28] for one-parameter NEFs. It resulted in the characterization of NEFs having power variance functions, i.e., NEFs in the Tweedie scale (in this respect, see also [29]).
Here, however, we consider a generalization of their definition to a two-parameter family. Indeed, let $F = \{F_{\omega_1,\omega_2} : (\omega_1, \omega_2) \in \Upsilon \subset \mathbb{R}^2\}$ be a family of distributions indexed by two parameters $\omega = (\omega_1, \omega_2) \in \Upsilon$, where $\Upsilon$ has a nonempty interior in $\mathbb{R}^2$. Also, let $X_1, \ldots, X_n$ be i.i.d. r.v.'s with $\mathcal{L}(X_1) = F_\omega \in F$ (where $\mathcal{L}(X)$ stands for the law of $X$). Then, $F$ is said to be reproducible if, for all $\omega \in \Upsilon$ and $n \in \mathbb{N}$, there exist sequences $(\alpha_n)_{n \ge 1}$ and $(\beta_n)_{n \ge 1} : \mathbb{N} \to \mathbb{R}$, and there exists a mapping $(g_n, h_n) : \Upsilon \to \Upsilon$, $n \in \mathbb{N}$, such that
$$\mathcal{L}\Big(\alpha_n\sum_{i=1}^n X_i + \beta_n\Big) = F_{(g_n(\omega),\,h_n(\omega))} \in F, \quad (\omega_1, \omega_2) \in \Upsilon,\ n \in \mathbb{N}. \tag{31}$$
This problem of reproducibility, in its general setting, is rather complex to solve, and its complexity is discussed in Bar-Lev (2021). We shall implement, however, this definition of reproducibility for the EDM-EVF $F(\mu_p)$ given in (23) and (24), while considering $\omega_1 = \theta$, $\omega_2 = p$ and $\Upsilon = \Theta \times \Lambda = \mathbb{R}^- \times \mathbb{R}^+$. Note that (31) can be expressed in terms of $k_p$ (see (24)), the cumulant transform of $\mu_p$, as
$$n\,k_p(\alpha_n\theta) + \beta_n\theta = k_{h_n(p)}(g_n(\theta))$$
or
$$np\left[\alpha_n\theta - \alpha_n\theta\ln(-\alpha_n\theta)\right] + \beta_n\theta = h_n(p)\left[g_n(\theta) - g_n(\theta)\ln(-g_n(\theta))\right].$$
The general solution of (32) is rather intricate and cumbersome. So, we leave it as an open problem. We do, however, demonstrate some special solutions:
(a) $h_n(p) = np$, $\alpha_n \equiv 1$, $\beta_n \equiv 0$, and $g_n(\theta) \equiv \theta$. This result implies that $\mathcal{L}(\sum_{i=1}^n X_i) \in F(\mu_{np})$, i.e., if $X_1, \ldots, X_n$ are i.i.d. taken from $F(\mu_p)$, then the distribution of their sum belongs to $F(\mu_{np})$, a rather important fact for statistical applications.
(b) $h_n(p) = p$, $\beta_n \equiv 0$, $g_n(\theta) = n\alpha_n\theta$, where $\alpha_n > 0$ is increasing in $n \in \mathbb{N}$. The implications of this result are the following: If $X_1, \ldots, X_n$ are i.i.d. taken from $F(\mu_p)$, then the distribution of their sum $\sum_{i=1}^n X_i$ belongs to $F(\mu_1)$ but with parameter $n\alpha_n\theta$ instead of $\theta$, a quite surprising result.

4.6. Duality

The notion of duality among NEFs has been introduced by [30] as follows. Consider two NEFs $F(\mu)$ and $F(\mu^*)$ generated by the two measures $\mu, \mu^* \in \mathcal{M}$ and let $k_\mu$ and $k_{\mu^*}$, $V_{F(\mu)}$ and $V_{F(\mu^*)}$, $\Theta_\mu$ and $\Theta_{\mu^*}$, be their corresponding cumulant transforms, VFs, and canonical parameter spaces. Also, denote $l_\mu(s) = k_\mu(-s)$, $s \in S_\mu = -\Theta_\mu$. Then, $\mu^*$ is called the dual of $\mu$ if
$$-l_{\mu^*}'\big(-l_\mu'(s)\big) = s,$$
which implies that
$$l_{\mu^*}''(m) = \frac{1}{V_{F(\mu)}(m)}.$$
As $k_\mu = k_1 = \theta - \theta\ln(-\theta)$ for an NEF-EVF and $k_{\mu^*} = e^\theta$ for the Poisson NEF generated by $\mu^* = \sum_{n=0}^\infty \frac{1}{n!}\delta_n$, it follows that these two families are dual (cf. [30], Section 4.1). Among many others, one more example of duality is the pair of normal and inverse Gaussian NEFs. It should be noted, however, that duality is not valid for all NEFs.
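The duality between the NEF-EVF and the Poisson NEF can be checked with closed-form derivatives. The sketch below is illustrative and rests on an assumption about the sign convention: it takes the duality relation in the form $-l_{\mu^*}'(-l_\mu'(s)) = s$ with $l_\mu(s) = k_\mu(-s)$, and the function names are hypothetical. For the Landau generator, $l_\mu(s) = -s + s\log s$ for $s > 0$; for the Poisson generator, $l_{\mu^*}(s) = e^{-s}$.

```python
import math


def l_landau_prime(s):
    """l_mu(s) = k_1(-s) = -s + s*log(s), s > 0, so l_mu'(s) = log(s)."""
    return math.log(s)


def l_poisson_prime(s):
    """l_mu*(s) = exp(-s), so l_mu*'(s) = -exp(-s)."""
    return -math.exp(-s)


def l_poisson_second(s):
    """Second derivative of l_mu*(s) = exp(-s)."""
    return math.exp(-s)


def evf(m):
    """Variance function of the NEF-EVF: V(m) = exp(m)."""
    return math.exp(m)


for s in (0.1, 1.0, 7.5):
    # Duality relation (assumed sign convention): the printed pair should coincide.
    print(s, -l_poisson_prime(-l_landau_prime(s)))

for m in (-2.0, 0.0, 3.0):
    # Consequence of duality: l_mu*''(m) = 1 / V(m).
    print(m, l_poisson_second(m), 1.0 / evf(m))
```

The second loop illustrates why the exponential VF pairs naturally with the Poisson family, whose cumulant transform is itself an exponential.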

4.7. Chainability

We introduce a new notion of a property regarding infinitely divisible probability measures, which we term chainability. Let
$$\mathcal{M} = \{\text{non-Dirac positive Radon measures } \mu \text{ on } \mathbb{R} \text{ with } \Theta_\mu = \operatorname{int} D_\mu \ne \emptyset\}$$
and $\bar{\mathcal{M}}$ be the union of $\mathcal{M}$ with the set of positive Dirac measures on $\mathbb{R}$.
It is well known ([6], Chapter XIII) that a measure $\rho_0 \in \mathcal{M}$ is infinitely divisible (i.d.) iff there exists a measure $\rho_1 \in \bar{\mathcal{M}}$ such that
$$\Theta_{\rho_1} = \Theta_{\rho_0} \quad \text{and} \quad L_{\rho_1} = k_{\rho_0}''.$$
If $\rho_1$ is also i.d., then there exists a measure $\rho_2 \in \bar{\mathcal{M}}$ such that
$$\Theta_{\rho_2} = \Theta_{\rho_1} \quad \text{and} \quad L_{\rho_2} = k_{\rho_1}''.$$
This procedure can be continued by assuming that $\rho_2$, then $\rho_3$, and so forth, are also i.d. This process leads to the following property of chainable i.d. measures on $\mathcal{M}$.
Definition 1.
With the definitions of $\mathcal{M}$ and $\bar{\mathcal{M}}$ above, let $\rho_0$ be an i.d. measure in $\mathcal{M}$ and $(\rho_k)_{k=0}^\infty$ a sequence of i.d. measures in $\bar{\mathcal{M}}$. Then, $\rho_0$ is called infinitely chainable iff
$$L_{\rho_n} = k_{\rho_{n-1}}'', \quad n = 1, 2, \ldots, \quad \text{with } \Theta_{\rho_n} = \Theta_{\rho_0},\ n \in \mathbb{N}.$$
It is called chainable of order $r \in \mathbb{N}$ if $\rho_1, \ldots, \rho_r$ are i.d. measures in $\bar{\mathcal{M}}$ such that
$$L_{\rho_i} = k_{\rho_{i-1}}'' \text{ for } i = 1, \ldots, r, \text{ but } L_{\rho_{r+1}} = k_{\rho_r}'' \text{ is not a LT of an i.d. measure in } \bar{\mathcal{M}}.$$
The problem of chainability raises some stimulating probabilistic questions. For instance, one may seek necessary and/or sufficient conditions under which an i.d. measure $\rho_0 \in \mathcal{M}$ is infinitely chainable or just chainable of order $r$, $r \ge 2$. Answering such questions, however, goes beyond this paper's scope and deserves a special study.
Here, we only analyze the Landau distribution μ 1 generating the NEF-EVF F ( μ 1 ) , with respect to chainability.
Proposition 3.
The Landau distribution $\mu_1$ is infinitely chainable.
Proof. 
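For simplicity, denote $\rho_0 = \mu_1$ and define the i.d. measure $\nu_{\alpha,\beta} \in \mathcal{M}$ by
$$\nu_{\alpha,\beta}(dx) = \beta\,\frac{x^{\alpha-1}}{\Gamma(\alpha)}\, I(x > 0)\, dx, \quad \alpha > 0, \ \beta > 0,$$
for which
$$L_{\nu_{\alpha,\beta}}(\theta) = \int_0^{\infty} e^{\theta x}\, \nu_{\alpha,\beta}(dx) = \beta(-\theta)^{-\alpha}, \quad \theta < 0.$$
Now, for $\rho_0$ given by (23), we have
$$k_{\rho_0}(\theta) = \theta - \theta\ln(-\theta), \quad k'_{\rho_0}(\theta) = -\ln(-\theta), \quad \text{and} \quad k''_{\rho_0}(\theta) = (-\theta)^{-1}.$$
Thus, $\rho_1 = \nu_{1,1}$ with LT
$$L_{\rho_1}(\theta) = k''_{\rho_0}(\theta) = (-\theta)^{-1},$$
for which
$$k_{\rho_1}(\theta) = -\ln(-\theta), \quad k'_{\rho_1}(\theta) = (-\theta)^{-1}, \quad \text{and} \quad k''_{\rho_1}(\theta) = (-\theta)^{-2},$$
implying that $\rho_2 = \nu_{2,1}$ with
$$L_{\rho_2}(\theta) = k''_{\rho_1}(\theta) = (-\theta)^{-2}.$$
Continuing this way, we obtain, by a simple induction, that
$$L_{\rho_n}(\theta) = 2(-\theta)^{-2}, \quad n \geq 3,$$
i.e.,
$$\rho_n = \nu_{2,2} \quad \text{for all } n \geq 3.$$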
This concludes the proof. □
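The induction step in the proof can be made explicit. If $\rho_n = \nu_{\alpha,\beta}$, then $k_{\rho_n}(\theta) = \ln\beta - \alpha\ln(-\theta)$, so $k''_{\rho_n}(\theta) = \alpha(-\theta)^{-2}$, which is the LT of $\nu_{2,\alpha}$. The following Python sketch (the function names are illustrative and not part of the paper) iterates this map on the pair $(\alpha, \beta)$ and compares each step with a finite-difference second derivative of $\ln L$.

```python
import math

def next_in_chain(alpha, beta):
    """Map nu_{alpha,beta} to the next measure in the chain.

    L(theta) = beta * (-theta)**(-alpha) gives k(theta) = ln(beta) - alpha*ln(-theta),
    hence k''(theta) = alpha * (-theta)**(-2), which is the LT of nu_{2,alpha}.
    """
    return (2, alpha)

def chain(n):
    """Return the (alpha, beta) pairs of rho_1, ..., rho_n, starting at rho_1 = nu_{1,1}."""
    measures = [(1, 1)]
    while len(measures) < n:
        measures.append(next_in_chain(*measures[-1]))
    return measures

def lt(alpha, beta, theta):
    """Laplace transform of nu_{alpha,beta} at theta < 0."""
    return beta * (-theta) ** (-alpha)

def k_second_numeric(alpha, beta, theta, h=1e-3):
    """Central-difference approximation of k'' = (ln L)'' at theta < 0."""
    k = lambda th: math.log(lt(alpha, beta, th))
    return (k(theta + h) - 2.0 * k(theta) + k(theta - h)) / h ** 2

print(chain(5))  # [(1, 1), (2, 1), (2, 2), (2, 2), (2, 2)]
```

The printed list shows that the pair stabilizes at $(2, 2)$ from the third step on, in line with $\rho_n = \nu_{2,2}$ for $n \geq 3$.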

4.8. Zero Regression Characterizations of $F(\mu_1)$

Let $X = (X_1, \ldots, X_n)$ be a random sample taken from a common distribution $P$ and let $S = S(X)$ be a polynomial statistic (in the $X_i$'s) such that the regression of $S$ on the sample mean $\bar{X}$ is zero (or constant). If $P$ is the only distribution for which such a property holds, we say that $P$ is characterized by the zero regression of $S$ on $\bar{X}$. The pioneering study of such characterizations is due to [22], who characterized all distributions (six in all) for which the regression of a quadratic statistic $S$ on $\bar{X}$ is constant. Since their seminal work, numerous such characterizations have appeared in the literature (e.g., [31,32,33,34,35,36,37]). At this point, it should be noted that a zero regression characterization of $P$ means a characterization of a family of distributions, say $F$, to which $P$ belongs (e.g., the normal family with unknown location and scale parameters or the Poisson family with unknown mean).
Bar-Lev [38] provided methods for characterizing 'almost' any family $F$ (at least those that constitute NEFs) by zero regression properties (e.g., a zero regression characterization of the generalized Laplace distribution; see [39]). Such methods suggest searching for cumulant relationships existing among the members of $F$. Indeed, let $(\kappa_{r_1}, \ldots, \kappa_{r_m})$, $1 \leq r_1 < \cdots < r_m$, be a set of $m$ arbitrary cumulants of $F$. Derive functional relations among these cumulants in the form $g(\kappa_{r_1}, \ldots, \kappa_{r_m}) = 0$, where $g$ is a polynomial in the $\kappa_{r_j}$'s, and then construct an unbiased polynomial statistic for $g$ with the tools described in [38]. In the next proposition, we demonstrate such a process for obtaining a zero regression characterization of the NEF-EVF $F(\mu_1)$ (which can also be executed for the EDM-EVF $F(\mu_p)$). For this, note that by (27), the cumulants of $F(\mu_1)$ satisfy
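$$\kappa_{r+1} = (r-1)!\, e^{rm} = (r-1)!\, \kappa_2^{r}, \quad r \geq 1.$$
Thus, possible polynomials $g$ have the form
$$g_r(\kappa_2, \kappa_{r+1}) := \kappa_{r+1} - (r-1)!\, \kappa_2^{r} = 0, \quad r \geq 1,$$
where, in particular, for $r = 2$,
$$g_2(\kappa_2, \kappa_3) = \kappa_3 - \kappa_2^2 = 0.$$
Now, let $n^{(k)} = \prod_{i=1}^{k} (n - (i-1))$, and define
$$S(X) := T_3(X) - T_{2,2}(X), \tag{36}$$
with
$$T_3(X) = \frac{1}{n}\sum_{i} X_i^3 - \frac{3}{n^{(2)}}\sum_{i \neq j} X_i X_j^2 + \frac{2}{n^{(3)}}\sum_{i \neq j \neq k} X_i X_j X_k, \tag{37}$$
and
$$T_{2,2}(X) = \frac{1}{n^{(2)}}\sum_{i \neq j} X_i^2 X_j^2 - \frac{2}{n^{(3)}}\sum_{i \neq j \neq k} X_i X_j X_k^2 + \frac{1}{n^{(4)}}\sum_{i \neq j \neq k \neq l} X_i X_j X_k X_l, \tag{38}$$
where the summations in (37) and (38) are taken over all distinct indices $i$, $j$, $k$, and $l$, ranging between 1 and $n$. Each normalized sum is an unbiased estimator of the corresponding product of raw moments $\mu'_r = E(X^r)$. Since $\kappa_3 = \mu'_3 - 3\mu'_1\mu'_2 + 2(\mu'_1)^3$ and $\kappa_2^2 = (\mu'_2 - (\mu'_1)^2)^2$, it follows that $E[S(X)] = g_2(\kappa_2, \kappa_3) = \kappa_3 - \kappa_2^2$, which vanishes under the NEF-EVF.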
Note also that the two components of $S(X)$ can be represented in terms of the power sums $L_r = \sum_{i=1}^{n} X_i^r$, $r = 1, 2, \ldots$, as follows:
$$T_3(X) = \frac{1}{n} L_3 - \frac{3}{n^{(2)}}\left(L_1 L_2 - L_3\right) + \frac{2}{n^{(3)}}\left(L_1^3 - 3 L_1 L_2 + 2 L_3\right) \tag{39}$$
and
$$T_{2,2}(X) = \frac{1}{n^{(4)}}\left[(n^2 - 3n + 3) L_2^2 - (n^2 - n) L_4 - 2n L_2 L_1^2 + 4(n-1) L_3 L_1 + L_1^4\right]. \tag{40}$$
For brevity, we omit the calculation details of (39) and (40).
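As a numerical check of the power-sum representations, the sketch below computes $T_3$ and $T_{2,2}$ in two ways: through the power sums $L_1, \ldots, L_4$, and by brute-force summation over distinct index tuples. It assumes the cubic term of $T_3$ carries the weight 2, as required for unbiasedness of $\kappa_3 = \mu'_3 - 3\mu'_1\mu'_2 + 2(\mu'_1)^3$. All function names are illustrative.

```python
from itertools import permutations

def falling(n, k):
    """Falling factorial n^(k) = n (n-1) ... (n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

def power_sums(x):
    """Return (L1, L2, L3, L4) with L_r = sum of x_i**r."""
    return tuple(sum(v ** r for v in x) for r in range(1, 5))

def t3_power_sums(x):
    n = len(x)
    L1, L2, L3, _ = power_sums(x)
    return (L3 / n
            - 3.0 * (L1 * L2 - L3) / falling(n, 2)
            + 2.0 * (L1 ** 3 - 3.0 * L1 * L2 + 2.0 * L3) / falling(n, 3))

def t22_power_sums(x):
    n = len(x)  # requires n >= 4
    L1, L2, L3, L4 = power_sums(x)
    num = ((n * n - 3 * n + 3) * L2 ** 2 - (n * n - n) * L4
           - 2 * n * L2 * L1 ** 2 + 4 * (n - 1) * L3 * L1 + L1 ** 4)
    return num / falling(n, 4)

def t3_brute_force(x):
    n = len(x)
    # permutations() works on positions, so indices are distinct even if values repeat
    s1 = sum(v ** 3 for v in x) / n
    s2 = sum(a * b * b for a, b in permutations(x, 2)) / falling(n, 2)
    s3 = sum(a * b * c for a, b, c in permutations(x, 3)) / falling(n, 3)
    return s1 - 3.0 * s2 + 2.0 * s3

def t22_brute_force(x):
    n = len(x)
    s1 = sum(a * a * b * b for a, b in permutations(x, 2)) / falling(n, 2)
    s2 = sum(a * b * c * c for a, b, c in permutations(x, 3)) / falling(n, 3)
    s3 = sum(a * b * c * d for a, b, c, d in permutations(x, 4)) / falling(n, 4)
    return s1 - 2.0 * s2 + s3

def s_statistic(x):
    """S(X) = T3(X) - T22(X), computed from power sums in O(n) time."""
    return t3_power_sums(x) - t22_power_sums(x)
```

The brute-force versions cost $O(n^4)$ and are only meant for validating the closed forms on small samples; `s_statistic` is the version one would use in practice.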
We now have all the ingredients for the following characterization proposition of NEF-EVF distributions.
Proposition 4.
Let $F$ be a non-degenerate distribution and $X = (X_1, \ldots, X_n)$ be a random sample of size $n \geq 4$ taken from $F$ having a finite third moment with $\kappa_3 > 0$. Then, $S(X)$ has a zero regression on $L_1$ iff $F$ is an NEF-EVF distribution.
Proof. 
We prove only the necessity part of the proposition, as its sufficiency part is easily verified. Let $f(t)$ be the characteristic function of $F$, $h(t) = \ln f(t)$, and $N_{\delta}$ be some $\delta$-neighborhood of the origin. Then, by Lemma 1.1.1 of [33] and Lemma 1 of [38], it follows that if $S(X)$ has a zero regression on $L_1$, then
$$E\left[S(X)\, e^{itL_1}\right] = f^{n}(t)\left\{ i^{-3} h^{(3)}(t) - i^{-4}\left[h^{(2)}(t)\right]^2 \right\} = 0, \quad t \in N_{\delta},$$
where $h^{(j)}(t) = d^j h(t)/dt^j$, $j = 1, 2, 3$. Thus,
$$i h^{(3)}(t) - \left[h^{(2)}(t)\right]^2 = 0 \quad \text{or} \quad i\,\frac{h^{(3)}(t)}{h^{(2)}(t)} = h^{(2)}(t), \quad t \in N_{\delta},$$
which by integrating becomes
$$i \ln h^{(2)}(t) = h^{(1)}(t) + c_1.$$
Set $u = h^{(1)}$; then $u' = c_2 e^{-iu}$. Integration by separation of variables leads to
$$u = h^{(1)}(t) = -i \ln(1 + c_3 t) + c_4,$$
and hence
$$h(t) = -i c^{-1}(1 + ct)\ln(1 + ct) + at + b,$$
where $c_i$, $i = 1, \ldots, 4$, $a$, $b$, and $c$ are arbitrary constants with $c \neq 0$. Since $h(0) = 0$ and $h^{(j)}(0) = i^j \kappa_j$, it follows that $b = 0$, $a = i(\kappa_1 + 1)$, and $c = -i\kappa_2$, so that
$$h(t) = \kappa_2^{-1}(1 - i\kappa_2 t)\ln(1 - i\kappa_2 t) + i(1 + \kappa_1)t. \tag{41}$$
To conclude the proof, we need to verify that (41) is the cumulant characteristic function of an NEF-EVF distribution. Indeed, by (17) it follows that $h(t)$ corresponding to $F(\mu_1)$ has the form
$$it + \theta\ln(-\theta) - (\theta + it)\ln(-(\theta + it)), \quad \theta < 0. \tag{42}$$
With (17), we obtain $\kappa_1 = k'_1(\theta) = -\ln(-\theta)$ and $\kappa_2 = k''_1(\theta) = (-\theta)^{-1}$. Substituting these into (42) yields the expression in (41). □

4.9. Large Deviations

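  • In his seminal study of the characterization of NEFs by their VFs, Morris [3] introduced the following large deviation theorem for NEFs (see [3], Equation (9.1)). Let $F = F(\mu)$ be an NEF with VF $V$, and let $X$ have a distribution in $F$ with mean $m$ and variance $\sigma^2 = V(m)$. Then, for all $t \geq 0$,
    $$P\left(\frac{X - m}{\sigma} \geq t\right) \leq \exp(-B(t)),$$
    where
    $$B(t) = \sigma^2 \int_0^t \frac{t - w}{V(m + \sigma w)}\, dw.$$
    Applying this to the NEF-EVF $F(\mu_1)$ in (22), for which $V(m) = e^{m}$ and $\sigma = e^{m/2}$, and taking into account that the corresponding skewness coefficient is $\gamma_1 = e^{m/2}$ (see (29)), gives $B(t) = \int_0^t (t - w)\, e^{-\gamma_1 w}\, dw$. This yields an interesting result in which the upper bound depends only on the skewness coefficient,
    $$P\left(\frac{X - m}{e^{m/2}} \geq t\right) \leq \exp\left\{-\frac{\gamma_1 t + e^{-\gamma_1 t} - 1}{\gamma_1^{2}}\right\}, \quad t \geq 0.$$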
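The closed form of $B(t)$ for the exponential VF can be checked numerically. The sketch below evaluates $B(t) = \sigma^2\int_0^t (t-w)/V(m+\sigma w)\,dw$ with $V(m) = e^m$ by composite Simpson quadrature and compares it with $(\gamma_1 t + e^{-\gamma_1 t} - 1)/\gamma_1^2$, where $\gamma_1 = e^{m/2}$. The function names and the grid size are illustrative choices.

```python
import math

def b_closed_form(t, m):
    """B(t) for V(m) = exp(m): (g*t + exp(-g*t) - 1) / g**2 with g = exp(m/2)."""
    g = math.exp(m / 2.0)
    # expm1 keeps the difference exp(-g*t) - 1 accurate when g*t is small
    return (g * t + math.expm1(-g * t)) / g ** 2

def b_numeric(t, m, n=2000):
    """Composite Simpson approximation of sigma^2 * int_0^t (t-w)/V(m+sigma*w) dw (n even)."""
    sigma = math.exp(m / 2.0)
    h = t / n
    f = lambda w: (t - w) / math.exp(m + sigma * w)
    total = f(0.0) + f(t)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return sigma ** 2 * total * h / 3.0

def tail_bound(t, m):
    """Upper bound exp(-B(t)) on P((X - m)/exp(m/2) >= t)."""
    return math.exp(-b_closed_form(t, m))
```

As $\gamma_1 \to 0$ the exponent tends to $t^2/2$, the Gaussian rate, which gives a quick sanity check on the closed form.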

5. Statistical Aspects of EDM-EVFs

In general, for various statistical aspects, particularly for generalized linear model (GLM) applications, it is more effective to represent an absolutely continuous EDM distribution in a form resembling the normal structure, i.e., instead of (12), to write the model densities as
$$P(\theta, \varphi, \nu_{\varphi}(dy)) = \exp\left\{\varphi^{-1}\left[\theta y - k_{\nu}(\theta)\right]\right\}\nu_{\varphi}(dy), \quad \theta \in \Theta_{\nu}, \ \varphi \in \Lambda, \tag{43}$$
(cf. [18,40,41]). The structure in (43) is inappropriate for the discrete case (counting measures on $\mathbb{N}$), as the support of the measure $\nu_{\varphi}$ changes with $\varphi$. For the latter case, the structure in (12) is appropriate.
Accordingly, to obtain the structure (43) for the EDM-EVF densities given in (25), we denote $\varphi = p^{-1}$ and map $x \mapsto \varphi x$. This yields
$$g(y : \theta, \varphi) = f_{1 - \ln\varphi}(y)\exp\left\{\varphi^{-1}\left[\theta y - k_1(\theta)\right]\right\} = \left[\frac{1}{\pi}\int_0^{\infty} e^{(1-y)t - t\log t}\,\varphi^{-t}\sin(\pi t)\, dt\right]\exp\left\{\varphi^{-1}\left[\theta y - \theta + \theta\ln(-\theta)\right]\right\}, \quad y \in \mathbb{R}, \ (\theta, \varphi) \in \mathbb{R}^{-} \times \mathbb{R}^{+}, \tag{44}$$
as the densities that are appropriate for GLM applications as well as other statistical applications. Note that we also changed the variable of interest from $x$ to $y$ to make it more suitable for GLM usage. Accordingly, in the sequel, we denote a random sample of size $n$ taken from (44) by $Y = (Y_1, \ldots, Y_n)$.
Henceforth, we shall describe the following statistical features related to EDM-EVF of the form (44): (1) maximum-likelihood estimation; (2) second-order minimax estimation of the mean; (3) test of hypotheses aspects; and (4) practical steps towards GLM applications. For estimation problems, we shall see that the steepness property of EDM-EVF plays an important role.

5.1. Maximum-Likelihood Estimates (MLEs) of the Mean and Dispersion Parameters

The log-likelihood function of $(\theta, \varphi)$ based on the random sample $Y = (Y_1, \ldots, Y_n)$ taken from (44) is
$$\ell(\theta, \varphi) = \sum_{i=1}^{n} \ln f_{1 - \ln\varphi}(y_i) + \varphi^{-1}\left[\theta\sum_{i=1}^{n} y_i - n k_1(\theta)\right]. \tag{45}$$
Note that for the one-parameter case with known $\varphi$, the maximum-likelihood equation for $\theta$ yields
$$\hat{m} = k'_1(\hat{\theta}) = \bar{Y}_n \quad \text{with probability 1}. \tag{46}$$
This follows since the NEF (44) is steep (see [15], Chapter 9.6). The same result holds for the two-parameter case as well. Note, by (21), that $k'_1$ is strictly increasing, so its inverse $\psi_1(m) = -e^{-m}$ is well defined. Hence, the MLE for $\theta$ (whether $\varphi$ is known or unknown) is $\hat{\theta} = -e^{-\bar{Y}_n}$. Let
$$f'_{1 - \ln\varphi}(y) = \frac{d f_{1 - \ln\varphi}(y)}{d\varphi} = -\frac{1}{\pi}\int_0^{\infty} t\, e^{(1-y)t - t\log t}\,\varphi^{-(t+1)}\sin(\pi t)\, dt; \tag{47}$$
then, the maximum-likelihood equation for $\varphi$ is
$$\sum_{i=1}^{n}\frac{f'_{1 - \ln\hat{\varphi}}(y_i)}{f_{1 - \ln\hat{\varphi}}(y_i)} - \hat{\varphi}^{-2}\left[\hat{\theta}\sum_{i=1}^{n} y_i - n k_1(\hat{\theta})\right] = 0, \tag{48}$$
where $\hat{\varphi} = \hat{\varphi}_n$ is the MLE of $\varphi$. Equation (48) can be solved numerically with the Newton–Raphson method or any other search algorithm, as is done for all NEFs generated by positive stable distributions, all of which have generating measures in integral form.
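A minimal numerical sketch of these estimation steps follows. It assumes the base density has the integral form $f_{1-\ln\varphi}(y) = \frac{1}{\pi}\int_0^\infty e^{(1-y)t - t\ln t}\varphi^{-t}\sin(\pi t)\,dt$, truncates the integral at an ad hoc upper limit, and replaces Newton–Raphson by a plain grid search over candidate values of $\varphi$. The function names, the truncation point, and the grid are illustrative assumptions, not the paper's implementation.

```python
import math

def base_density(y, phi, t_max=40.0, n=8000):
    """Composite Simpson approximation of
    (1/pi) * int_0^inf exp((1-y)t - t ln t) * phi**(-t) * sin(pi t) dt.
    t_max and n (even) are ad hoc; very negative y can make the integrand unstable."""
    h = t_max / n
    log_phi = math.log(phi)

    def integrand(t):
        if t == 0.0:
            return 0.0  # t*ln(t) -> 0 and sin(0) = 0
        return math.exp((1.0 - y) * t - t * math.log(t) - t * log_phi) * math.sin(math.pi * t)

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / (3.0 * math.pi)

def k1(theta):
    """Cumulant function k_1(theta) = theta - theta*ln(-theta), theta < 0."""
    return theta - theta * math.log(-theta)

def theta_mle(ys):
    """Closed-form MLE of theta: -exp(-sample mean)."""
    return -math.exp(-sum(ys) / len(ys))

def log_likelihood(theta, phi, ys):
    """Sum of ln f(y_i) plus (theta*sum(y) - n*k1(theta)) / phi."""
    total = 0.0
    for y in ys:
        f = base_density(y, phi)
        if f <= 0.0:  # numerical underflow or cancellation in the far tail
            return float("-inf")
        total += math.log(f)
    return total + (theta * sum(ys) - len(ys) * k1(theta)) / phi

def phi_mle_grid(ys, grid):
    """Profile the log-likelihood over a user-supplied grid of phi values."""
    theta_hat = theta_mle(ys)
    return max(grid, key=lambda phi: log_likelihood(theta_hat, phi, ys))
```

For $\varphi = 1$ the base density is the Landau density shifted by one unit, so its value near $y \approx 0.777$ should be close to the known Landau maximum of about 0.1807. A grid search is slow but robust; a derivative-based solver could be substituted once the grid has located the region of the maximum.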
One more aspect should be raised concerning the corresponding information matrix. Let $\ell^{(ij)} = \ell^{(ij)}(\theta, \varphi)$, $i, j = 1, 2$, denote the second partial derivatives of $\ell(\theta, \varphi)$ (where index 1 refers to differentiation with respect to $\theta$ and index 2 to differentiation with respect to $\varphi$), and $I_{ij} = I_{ij}(\theta, \varphi) = -E(\ell^{(ij)}(\theta, \varphi; y))$. Since $f_{1 - \ln\varphi}(y)$ is differentiable in $\varphi$ for almost all $y \in \mathbb{R}$, the corresponding Fisher information matrix $I(\theta, \varphi) = \|I_{ij}(\theta, \varphi)\|_{i,j=1}^{2}$ is diagonal; i.e., the parameters $\theta$ and $\varphi$ are orthogonal. This can be easily seen, as
$$\ell^{(12)} = -\frac{1}{\varphi^2}\left[\sum_{i=1}^{n} Y_i - n k'_1(\theta)\right]$$
and $E(Y_i) = k'_1(\theta)$, $i = 1, \ldots, n$. Moreover, adopting the tools and methods developed by [42], it is seen that
$$n^{1/2}(\hat{\varphi}_n - \varphi) = \frac{1}{I_{22}(\theta, \varphi)}\, n^{-1/2}\sum_{i=1}^{n}\frac{\partial \ell(Y_i : \theta, \varphi)}{\partial\varphi} + r_n(Y : \theta, \varphi),$$
where $r_n(Y : \theta, \varphi) \xrightarrow{a.s.} 0$ as $n \to \infty$. Moreover, the moments of $n^{1/2}(\hat{\varphi}_n - \varphi)$ exist and converge, respectively, to the moments of $N(0, I_{22}^{-1}(\theta, \varphi))$, from which it can be deduced that
$$n^{1/2}(\hat{\varphi}_n - \varphi) \xrightarrow{D} N\left(0, I_{22}^{-1}(\theta, \varphi)\right) \quad \text{as } n \to \infty.$$

5.2. Second-Order Minimax Estimation of the Mean

Bar-Lev and Landsman [42] presented a modified second-order minimax estimator for the mean of EDMs associated with steep NEFs and established some of its asymptotic properties. They provided some necessary and sufficient conditions for such a modified estimator to improve on the sample mean Y ¯ n . One of their necessary conditions requires the steepness of the EDM, as indeed is the case with EDM-EVF. They considered the EDM-EVF and showed the following result as a specific example.
Theorem 3
([42], Theorem 4). The estimator $\bar{Y}_n$ of the mean $m$ can be improved in the second-order minimax sense with respect to the power weight $q(m) = \exp(\beta m)$ iff $\beta = 2$. Consequently, the second-order minimax estimator, which improves $\bar{Y}_n$ for any $m \in \mathbb{R}$, is given by
$$m_n^{*} = \bar{Y}_n - \frac{1}{n}\hat{\varphi}_n e^{\bar{Y}_n},$$
where its mean squared error is
$$E_{m,\varphi}(m_n^{*} - m)^2 = \frac{1}{n}\varphi e^{m}\left(1 - \frac{1}{n}\varphi e^{m} + o\left(\frac{1}{n}\right)\right).$$
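The estimator is a simple plug-in correction of the sample mean. The sketch below reads the estimator as $m_n^{*} = \bar{Y}_n - \hat{\varphi}_n e^{\bar{Y}_n}/n$ and the leading part of its mean squared error as $a(1-a)$ with $a = \varphi e^{m}/n$, dropping the $o(1/n)$ remainder. The function names are illustrative, and `phi_hat` is assumed to be supplied by a separate estimation step.

```python
import math

def second_order_minimax_mean(ys, phi_hat):
    """m* = ybar - phi_hat * exp(ybar) / n for a sample ys and a dispersion estimate phi_hat."""
    n = len(ys)
    ybar = sum(ys) / n
    return ybar - phi_hat * math.exp(ybar) / n

def leading_mse(m, phi, n):
    """Leading terms a*(1 - a) of the MSE, with a = phi*exp(m)/n (remainder omitted)."""
    a = phi * math.exp(m) / n
    return a * (1.0 - a)

def sample_mean_variance(m, phi, n):
    """Variance of the sample mean, phi*exp(m)/n, for comparison."""
    return phi * math.exp(m) / n
```

Comparing `leading_mse` with `sample_mean_variance` shows the second-order gain $(\varphi e^{m}/n)^2$, which is most visible for small $n$ or large $m$.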

5.3. Testing Hypotheses

Various tests are available for model fit of the EDM-EVF for real data. Among them are extensive literature studies dealing with goodness-of-fit (gof) tests. Some are based on characterizations of the distributions belonging to the null hypothesis. Indeed, as [43] pointed out, characterization theorems or properties can be natural and practical starting points for constructing gof tests and are essential for assessing the validity of distributional models. The first idea of constructing gof tests based on a characterization of distribution in the realm of the null hypotheses is due to [44] (see [45]). Since then, various studies of constructing gof tests have been suggested; for example, those developed by [46,47,48,49], and the references cited therein. However, the earliest explicit use of a characterization theorem for constructing a gof test was presented by [50], who used Shannon’s maximum entropy characterization to construct a test for a composite hypothesis of normality.
Recently, ref. [51] employed the zero regression characterizations for the Tweedie class of NEFs having power VFs of the form $a m^{\gamma}$, with $\gamma \geq 1$, to construct novel gof tests for deviation from any given family belonging to the Tweedie class. The zero regression characterizations are those obtained by [31,34] for the entire Tweedie class, including those members with $\gamma < 0$; see a comment on the latter members in the concluding remarks.
Accordingly, a similar gof test for testing
$$H_0 : F = \text{NEF-EVF} \quad \text{versus} \quad H_1 : F \neq \text{NEF-EVF}$$
can be obtained by employing the zero regression characterization for the NEF-EVF presented in Proposition 4. The test statistic is naturally $\hat{S}_n(Y) := S(Y)$ defined in (36)–(40). This test statistic has desirable properties, as detailed in the following theorem. Whenever convenient, we use $F_0$ and $F_A$ to denote $F$ under $H_0$ and $H_1$, respectively. We also adopt the symbols $\xrightarrow{a.s.}$, $\xrightarrow{D}$, and $\sim$ for almost sure convergence, weak convergence (in distribution), and asymptotic equivalence, respectively, as $n \to \infty$.
Theorem 4.
Let $Y = (Y_1, \ldots, Y_n)$ be a random sample of size $n \geq 4$ taken from a distribution $F$ having finite first six moments, and let $\hat{S}_n(Y)$ be the statistic defined in (36)–(40). Then, the following properties hold:
(i)
$$E_{F_A}(\hat{S}_n(Y)) = s := g_2(\kappa_2, \kappa_3) = \kappa_3 - \kappa_2^2 \quad \text{and} \quad E_{F_0}(\hat{S}_n(Y)) = 0.$$
(ii)
$$\hat{S}_n(Y) \xrightarrow{a.s.} s \ \text{under } F_A \quad \text{and} \quad \hat{S}_n(Y) \xrightarrow{a.s.} 0 \ \text{under } F_0.$$
(iii)
$$V_F(\hat{S}_n(Y)) \sim \frac{c}{n}, \tag{49}$$
where $c$ is a constant depending on the first six moments of $F$.
(iv)
$$\frac{\hat{S}_n(Y) - s}{\sqrt{V_{F_A}(\hat{S}_n(Y))}} \xrightarrow{D} Z \sim N(0, 1) \quad \text{and} \quad \frac{\hat{S}_n(Y)}{\sqrt{V_{F_0}(\hat{S}_n(Y))}} \xrightarrow{D} Z. \tag{50}$$
(v) Under $H_0$,
$$n\hat{S}_n^2(Y) \xrightarrow{D} c\,\chi_1^2, \tag{51}$$
where $\chi_1^2$ denotes a chi-squared distribution with 1 degree of freedom.
The proof of this theorem is straightforward but somewhat tedious. It can be conducted along the lines of the proof in [51] for the Tweedie scale families. As this paper is mainly expository, we omit the proof for brevity. We do, however, sketch some helpful points related to it. The first part follows easily from the derivations preceding Proposition 4. For the other parts, note that $\hat{S}_n(Y)$, defined in (36)–(40), is a polynomial in the sample moments $L_i/n$, $i = 1, 2, 3$. Hence, the almost sure convergence in part (ii) is straightforward. The variance of $\hat{S}_n(Y)$ can be computed by $V_F(\hat{S}_n(Y)) = E_F(\hat{S}_n^2(Y)) - s^2$. Then, by (36)–(40), note that the squared form $\hat{S}_n^2$ will yield expressions involving $Y_i^6$, implying that the variance of $\hat{S}_n$, and thus also $c$, involves the sixth moment of $F$. The proof of part (iv) follows from the asymptotic multivariate normality of $(L_1, L_2, L_3)$, appropriately scaled, and then the application of well-known classical results concerning the asymptotic normality of a function of these sample moments (cf. [52,53]). Part (v) trivially follows from (49) and (50).
Consequently, as a testing procedure, one is inclined to reject $H_0$ for large absolute values of $\hat{S}_n(Y)$ or large values of $\hat{S}_n^2(Y)$. Part (v) of Theorem 4 presents the general result concerning the asymptotic null distribution of $n\hat{S}_n^2(Y)$. This limiting distribution depends on $c$, which is a function of the first six moments $\alpha_1, \ldots, \alpha_6$ of the NEF-EVF distribution. So, we can write $c = c(\alpha)$, where $\alpha = (\alpha_1, \ldots, \alpha_6)$. Expressions for the $\alpha_i$'s can be obtained directly from (27) and (28) with $p = 1$. Indeed, the cumulants of the NEF-EVF are given by
$$\kappa_{r+1} = (r-1)!\, e^{rm}, \quad r = 1, 2, \ldots,$$
and the central moments are given in (28) as functions of the $\kappa_r$'s. Thus, for instance, $\alpha_1 = \kappa_1 = m$ and $\alpha_2 = e^{m}(1 + e^{m})$, i.e., the $\alpha_i$'s, $i = 2, \ldots, 6$, are polynomials in $e^{m}$. As the MLE for $m$ is $\bar{Y}_n$, we immediately obtain the MLE $\hat{c} = c(\hat{\alpha})$ for $c$. The latter result is crucial when calculating the proposed test's power.
Accordingly, with $F_{\chi_1^2}$ denoting the distribution function of $\chi_1^2$ and $S_n^{\mathrm{obs}}(y)$ the observed value of the test statistic, an approximation of the p-value of the test is obtained by using (51) as
$$\hat{p} = 1 - F_{\chi_1^2}\left(n\hat{c}^{-1}\left[S_n^{\mathrm{obs}}(y)\right]^2\right).$$
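The asymptotic p-value needs only the $\chi_1^2$ survival function, which has the closed form $P(\chi_1^2 > x) = \operatorname{erfc}(\sqrt{x/2})$. The sketch below assumes the p-value is evaluated at $n\,[S_n^{\mathrm{obs}}]^2/\hat{c}$, in line with the limit $n\hat{S}_n^2 \to c\chi_1^2$. The function names are illustrative, and `c_hat` is assumed to be computed elsewhere from the estimated moments.

```python
import math

def chi2_1_sf(x):
    """Survival function of a chi-squared variable with 1 degree of freedom."""
    if x <= 0.0:
        return 1.0
    # If Z ~ N(0,1), then P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x/2))
    return math.erfc(math.sqrt(x / 2.0))

def asymptotic_p_value(s_obs, n, c_hat):
    """Approximate p-value 1 - F(n * s_obs**2 / c_hat) for the zero-regression gof test."""
    return chi2_1_sf(n * s_obs ** 2 / c_hat)
```

For example, `asymptotic_p_value(s_obs, n, c_hat)` drops below 0.05 once `n * s_obs**2 / c_hat` exceeds about 3.84, the usual 95% point of $\chi_1^2$.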
One can also approximate the p-value and the critical points using a parametric bootstrap approach by applying the following procedure as suggested by [51]:
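  • For some large integer $B$, repeat the following steps for every $b \in \{1, \ldots, B\}$:
    (a)
    Generate a bootstrap sample $Y_1^{*b}, \ldots, Y_n^{*b}$;
    (b)
    Based on the bootstrap sample, calculate the bootstrap version $S_n^{*b}$ of the test statistic $S_n$;
  • Approximate the p-value with
$$\hat{p} = \frac{1}{B}\sum_{b=1}^{B} I\left(S_n^{*b} \geq S_n^{\mathrm{obs}}\right)$$
and the critical point with $S^{*}_{c:B,n}$, the $c$-th order statistic of the $B$ bootstrap values, where $c = \lceil (1 - \alpha)B \rceil$, $\alpha$ here denotes the significance level, and $\lceil \cdot \rceil$ is the ceiling function.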
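The bootstrap steps above can be sketched generically. The code takes the test statistic and a sampler for the fitted null model as arguments; a sampler for the EDM-EVF itself is not provided here and would have to be supplied by the user, so the names `statistic` and `sampler` and their signatures are assumptions made for illustration.

```python
import math
import random

def bootstrap_gof(y_obs, statistic, sampler, B=500, alpha=0.05, rng=None):
    """Parametric-bootstrap p-value and critical point.

    statistic: callable mapping a list of floats to a float.
    sampler:   callable (n, rng) -> list of n draws from the fitted null model
               (hypothetical interface; must be provided by the user).
    """
    rng = rng or random.Random(0)
    n = len(y_obs)
    s_obs = statistic(y_obs)
    # Steps (a) and (b): simulate B samples and recompute the statistic
    s_boot = sorted(statistic(sampler(n, rng)) for _ in range(B))
    # p-value: share of bootstrap statistics at least as large as the observed one
    p_hat = sum(1 for s in s_boot if s >= s_obs) / B
    # Critical point: c-th order statistic with c = ceil((1 - alpha) * B)
    c = math.ceil((1.0 - alpha) * B - 1e-9)  # small guard against float round-up
    critical = s_boot[c - 1]
    return p_hat, critical
```

A typical call would pass the absolute value of the zero-regression statistic as `statistic` and a closure over the estimated parameters as `sampler`. In the check below, a standard normal sampler is used purely as a stand-in to exercise the function.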
Various alternatives in H 1 are listed in the introduction section above. Simulations should then be executed to assess the performance of the proposed gof test in terms of type I error rate.

5.4. Practical Steps towards GLM Applications

For GLM applications, we need the following ingredients. Equation (44) presents the EDM-EVF densities in the form required by GLM. However, for better insight, we represent them in terms of the mean m (rather than in terms of θ ) as
$$g(y : m, \varphi) = f_{1 - \ln\varphi}(y)\exp\left\{\varphi^{-1}\left[\theta(m) y - k_1(\theta(m))\right]\right\}, \quad y \in \mathbb{R}, \ (m, \varphi) \in \mathbb{R} \times \mathbb{R}^{+}, \tag{52}$$
where, by (21), the mean value parameterization, $\theta(m)$ and $k_1(\theta(m))$ are
$$\theta(m) = -e^{-m} \quad \text{and} \quad k_1(\theta(m)) = -e^{-m}(m + 1), \tag{53}$$
with VF
$$(V, M) = (\varphi e^{m}, \ m \in \mathbb{R}).$$
If $Y \sim g(\cdot : m, \varphi)$, we also use the standard EDM notation and write $Y \sim \text{EDM-EVF}(m, \varphi)$. The mean, variance, and cumulants of such a $Y$ are
$$E(Y) = m, \quad V(Y) = \varphi e^{m}, \quad \kappa_r(m) = (r-2)!\,\varphi^{r-1} e^{(r-1)m}, \ r \geq 3.$$
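The mean value parameterization is easy to verify numerically. The sketch below codes $\theta(m) = -e^{-m}$ and $k_1(\theta) = \theta - \theta\ln(-\theta)$, and uses the cumulant formula $\kappa_r = \varphi^{r-1}k_1^{(r)}(\theta(m)) = (r-2)!\,\varphi^{r-1}e^{(r-1)m}$ for $r \geq 2$, which is what the density form $\exp\{\varphi^{-1}[\theta y - k_1(\theta)]\}$ implies. The function names are illustrative.

```python
import math

def theta_of_m(m):
    """Canonical parameter as a function of the mean: theta(m) = -exp(-m)."""
    return -math.exp(-m)

def k1(theta):
    """Cumulant function k_1(theta) = theta - theta*ln(-theta), theta < 0."""
    return theta - theta * math.log(-theta)

def k1_of_m(m):
    """k_1(theta(m)) = -exp(-m) * (m + 1)."""
    return -math.exp(-m) * (m + 1.0)

def cumulant(r, m, phi):
    """r-th cumulant of Y ~ EDM-EVF(m, phi): m for r = 1, else (r-2)! phi^(r-1) e^((r-1)m)."""
    if r == 1:
        return m
    return math.factorial(r - 2) * phi ** (r - 1) * math.exp((r - 1) * m)

def skewness(m, phi):
    """kappa_3 / kappa_2^(3/2), which simplifies to sqrt(phi * exp(m))."""
    return cumulant(3, m, phi) / cumulant(2, m, phi) ** 1.5
```

With $\varphi = 1$ the skewness reduces to $e^{m/2}$, the coefficient $\gamma_1$ used in the large deviation bound.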
We shall now consider two essential ingredients needed for GLM applications of EDM-EVFs (52), namely, the scaled deviance and the link function. These were introduced by [18,41] (see also [40,54]). We also discuss some relevant computational aspects involved.
  • Scaled deviance and link function
    Consider
    $$t(y, m) = y\theta(m) - k_1(\theta(m)).$$
    Then, as (52) is steep, it follows that $\max_{m} g(y : \theta(m), \varphi)$ is attained at $m = y$ (see (46) for $n = 1$). Hence, the unit deviance
    $$d(y, m) = 2\left[t(y, y) - t(y, m)\right]$$
    can be considered as a distance measure with two properties: $d(y, y) = 0$ and $d(y, m) > 0$ for $y \neq m$. For the EDM-EVF, we obtain by (53) that
    $$t(y, m) = -y e^{-m} + e^{-m}(m + 1),$$
    and thus, for EDM-EVF,
    $$d(y, m) = 2\left[-y e^{-y} + e^{-y}(y + 1) - \left(-y e^{-m} + e^{-m}(m + 1)\right)\right] = 2\left[e^{-y} + e^{-m}(y - m - 1)\right].$$
    Consequently, (52) can be rewritten as
    $$g(y : m, \varphi) = g(y : y, \varphi)\exp\left\{-\frac{1}{2\varphi} d(y, m)\right\}.$$
    GLMs assume a systematic component where the linear predictor
    $$\eta = \beta_0 + \sum_{j=1}^{p}\beta_j x_j$$
    is linked to the mean $m$ through a link function $g$ such that $g(m) = \eta$. For the EDM-EVF, we choose the canonical link function
    $$\eta = g(m) = \theta(m) = -e^{-m},$$
    a relatively simple link function.
    The set of observations is $y = (y_1, \ldots, y_n)^{T}$, where the $y_i$'s are independent with $y_i \sim \text{EDM-EVF}(m_i, \varphi)$, and observation $i$ is associated with the linear predictor $\eta_i = \beta_0 + \sum_{j=1}^{p}\beta_j x_{ji}$, $i = 1, \ldots, n$. Here, the set of covariates is an $n \times p$ matrix $X$ so that we may write $\eta = X\beta$. The total and scaled deviances are given, respectively, by
    $$D(y, m) = \sum_{i=1}^{n} d(y_i, m_i)$$
    and
    $$D^{*}(y, m) = \frac{D(y, m)}{\varphi}.$$
    When the saddlepoint approximation holds (and it holds for EDM-EVF; see [54]), the scaled deviance follows an approximate chi-square distribution,
    $$D^{*}(y, m) \sim \chi_n^2,$$
    at the true values of $m_i$ (for all $i$) and $\varphi$. Consequently, the log-likelihood is
    $$\ell(m, \varphi) = \sum_{i=1}^{n}\ln g(y_i : y_i, \varphi) - \frac{1}{2} D^{*}(y, m).$$
    All of the above provides the necessary ingredients for GLM applications.
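The deviance and link ingredients translate directly into code. The sketch below uses the simplified unit deviance $d(y, m) = 2[e^{-y} + e^{-m}(y - m - 1)]$, obtained from $t(y, m) = -ye^{-m} + e^{-m}(m+1)$, together with the canonical link $\eta = -e^{-m}$ and its inverse $m = -\ln(-\eta)$. The function names are illustrative and are not tied to any existing GLM package.

```python
import math

def t_fun(y, m):
    """t(y, m) = y*theta(m) - k1(theta(m)) = -y*exp(-m) + exp(-m)*(m + 1)."""
    return -y * math.exp(-m) + math.exp(-m) * (m + 1.0)

def unit_deviance(y, m):
    """d(y, m) = 2*[t(y, y) - t(y, m)], in simplified form."""
    return 2.0 * (math.exp(-y) + math.exp(-m) * (y - m - 1.0))

def canonical_link(m):
    """eta = theta(m) = -exp(-m); always negative."""
    return -math.exp(-m)

def inverse_link(eta):
    """m = -ln(-eta); only defined for eta < 0."""
    if eta >= 0.0:
        raise ValueError("the canonical linear predictor must be negative")
    return -math.log(-eta)

def total_deviance(ys, ms):
    """D(y, m): sum of unit deviances over the observations."""
    return sum(unit_deviance(y, m) for y, m in zip(ys, ms))

def scaled_deviance(ys, ms, phi):
    """D*(y, m) = D(y, m) / phi."""
    return total_deviance(ys, ms) / phi
```

One practical consequence of the canonical link is that the linear predictor must stay negative, so a fitting routine has to constrain $X\beta < 0$ or fall back to step-halving when an iterate leaves that region.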
  • Computational aspects
    The Tweedie class is composed of NEFs with power VFs of the form $V(m) = \varphi m^{\gamma}$, where, for $\gamma > 2$, the corresponding NEFs are generated by positive stable distributions. These families are absolutely continuous with respect to Lebesgue measure on $\mathbb{R}^{+}$ and, except for the inverse Gaussian NEF ($\gamma = 3$), none of them has a density expressible in closed form; their densities are instead given in integral form (or as power series). We already noticed in Section 4.2 that the VF of the EDM-EVF is a limit of a sequence of VFs in the Tweedie scale. Therefore, there is reason to call the EDM-EVF a Tweedie NEF with power infinity, and the same situation, a density available only in integral form, also occurs with the EDM-EVF. The Tweedie scale with power $\gamma < 0$ comprises NEFs generated by extreme stable distributions and lacks the steepness property (this will be discussed in the concluding remarks section). At this point, it seems fair to note that the Tweedie class should also be attributed to [28] for their study of power VFs through the analysis of the notion of reproducibility (see [29] for further details). Indeed, in recent papers, the Tweedie class was abbreviated as the TBE class.
    The situation above, whereby this class of NEFs does not have explicitly expressed densities, probably prevented its use for statistical modeling for quite some years. This difficulty has since been resolved by the availability of powerful mathematical software. Ref. [54] studied two methods for evaluating the density function of a Tweedie distribution, which are based on the inversion of the cumulant generating function using Fourier inversion and the saddlepoint approximation. An algorithm for evaluating their density function based on series expansions was presented by [55] (for these evaluation aspects, see also [56,57]). Dunn created and maintained the tweedie R package [58], while [59] contributed to and maintained the statmod R package. In this frame, the function tweedie.profile in the tweedie R package practically enables the fit of TBE models. These packages can be extended to include the EDM-EVF as well.

6. Concluding Remarks

In this study, we presented a comprehensive review and further developed various properties of the class of EDM-EVF distributions and found that it is abundant with probabilistic and statistical properties. This class of absolutely continuous distributions, supported on the whole real line, possesses a simple VF, cumulants, and central moments, and its members are skewed to the right and leptokurtic. In the context of probabilistic aspects, we illustrated features of EDM-EVF distributions related to reciprocity, self-decomposability, unimodality, reproducibility, duality, chainability, and large deviations. Also, we provided some characterizations by zero regression on the sample mean.
We also described some aspects of statistical features. Mainly, we considered maximum-likelihood estimation, second-order minimax estimation of the mean, and hypotheses testing and presented practical steps toward generalized linear model applications. However, applying the EDM-EVF distributions to real-world data presents a multifaceted challenge that necessitates using advanced estimation techniques and corresponding goodness-of-fit tests. These challenges primarily revolve around computational complexities. The first significant challenge lies in estimating the parameter p. This estimation must be executed numerically, as the probability density function includes an integral with no closed-form solution. The second challenge arises when someone wants to use a classical goodness-of-fit test, such as the Kolmogorov–Smirnov. Since we have a composite goodness-of-fit test, the computation of the p-value of the test should be done by using bootstrap methods. This, in turn, requires both the development of an algorithm for generating random values from the EDM-EVF distribution and the development of an algorithm to estimate the unknown parameters. All of these make it even more complex for GLM applications. The execution and analysis of the latter statistical aspects constitute a distinct project as it involves developing appropriate tools, for example, in R. Such a computationally-oriented project is now being carried out in collaboration with other researchers.
We trust that the proposed EDM-EVF will play an important role in modeling real data, mainly due to its simple link function, simple mean value parameterization, and other properties.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

I am grateful to Gérard Letac for his helpful comments and discussion, which enriched the presentation of the paper. I am also thankful to two reviewers for their constructive comments.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Fenstad, G.U. A comparison between the U and V tests in the Behrens-Fisher problem. Biometrika 1983, 70, 300–302.
  2. Jeansonne, M.S.; Foley, J.P. Review of the exponentially modified Gaussian function since 1983. J. Chromatogr. Sci. 1991, 29, 258–266.
  3. Morris, C.N. Natural Exponential Families with Quadratic Variance Functions. Ann. Stat. 1982, 10, 65–80.
  4. Johnson, N.L. Systems of frequency curves generated by methods of translation. Biometrika 1949, 36, 149–176.
  5. Lukacs, E. Characteristic Functions, 2nd ed.; Hafner: New York, NY, USA, 1970.
  6. Feller, W. An Introduction to Probability Theory and Its Applications, 1st ed.; Wiley: New York, NY, USA, 1966; Volume 2.
  7. Eaton, M.L.; Morris, C.; Rubin, H. On extreme stable laws and some applications. J. Appl. Probab. 1971, 8, 794–801.
  8. Grosswald, E. The Student t-distribution of any degree of freedom is infinitely divisible. Z. Wahrscheinlichkeitstheorie Verw. Geb. 1976, 36, 103–109.
  9. Nolan, J.P. Stable Distributions: Models for Heavy Tailed Data; Retrieved from American University; Birkhäuser: Boston, MA, USA, 2010.
  10. Landau, L. On the energy loss of fast particles by ionization. J. Phys. (USSR) 1944, 8, 201–205.
  11. Marucho, M.; Garcia-Canal, C.; Fanchiotti, H. The Landau distribution for charged particles traversing thin films. Int. J. Mod. Phys. C 2006, 17, 1461–1476.
  12. Bulyal, E.; Shul’ga, N. Landau distribution of ionization losses: History, importance, extensions. arXiv 2022, arXiv:2209.06387v1.
  13. Letac, G.; Mora, M. Natural real exponential families with cubic variance functions. Ann. Stat. 1990, 18, 1–37.
  14. Mora, M. La convergence des fonctions variances des familles exponentielles naturelles. Ann. Faculté Sci. Toulouse 1990, 11, 105–120.
  15. Barndorff-Nielsen, O.E. Information and Exponential Families in Statistical Theory; Wiley: New York, NY, USA, 1978.
  16. Bar-Lev, S.K.; Kokonendji, C.C. On the mean value parameterization of natural exponential families—A revisited review. Math. Methods Stat. 2017, 26, 159–175.
  17. Jorgensen, B. Exponential dispersion models (with discussion). J. R. Stat. Soc. Ser. B 1987, 49, 127–162.
  18. Jorgensen, B. The Theory of Dispersion Models; Chapman and Hall: London, UK, 1997.
  19. Burridge, J. Discussion on paper by B. Jorgensen, Exponential dispersion models. J. R. Stat. Soc. Ser. B 1987, 49, 150–152.
  20. Kendall, M.G.; Stuart, A. The Advanced Theory of Statistics 1, 4th ed.; Macmillan: New York, NY, USA, 1977.
  21. Bar-Lev, S.K. Discussion on paper by B. Jorgensen, Exponential dispersion models. J. R. Stat. Soc. Ser. B 1987, 49, 153–154.
  22. Laha, R.G.; Lukacs, E. On a problem connected with quadratic regression. Biometrika 1960, 47, 335–345.
  23. Bingham, N.H. Fluctuation theory in continuous time. Adv. Appl. Probab. 1975, 7, 705–766.
  24. Lukacs, E. Developments in Characteristic Functions Theory; Griffin: London, UK, 1983.
  25. Yamazato, M. Unimodality in infinitely divisible distribution functions of class L. Ann. Probab. 1978, 6, 523–531.
  26. Bar-Lev, S.K.; Bshouty, D.; Letac, G. Natural exponential families and self-decomposability. Stat. Probab. Lett. 1992, 13, 147–152.
  27. Bar-Lev, S.K.; Casalis, M. A classification of reducible natural exponential families in the broad sense. J. Theor. Probab. 2003, 16, 175–196.
  28. Bar-Lev, S.K.; Enis, P. Reproducibility and natural exponential families with power variance functions. Ann. Stat. 1986, 14, 1507–1522.
  29. Bar-Lev, S.K. Independent, tough Identical results: The class of Tweedie on power variance functions and the class of Bar-Lev and Enis on reproducible natural exponential families. Int. J. Stat. Probab. 2019, 9, 30–35.
  30. Letac, G. Duality for real and multivariate exponential families. J. Multivar. Anal. 2022, 188, 104811.
  31. Bar-Lev, S.K.; Stramer, O. Characterizations of natural exponential families with power variance functions by zero regression properties. Probab. Theory Related Fields 1987, 76, 509–522.
  32. Gordon, F.S. Characterizations of populations using regression properties. Ann. Stat. 1973, 1, 114–126.
  33. Kagan, A.M.; Linnik, Y.V.; Rao, C.R. Characterization Problems in Mathematical Statistics; Wiley: New York, NY, USA, 1973.
  34. Bar-Lev, S.K.; Bshouty, D.; van der Duyn, S. Zero regression characterizations of natural exponential families—A complementary. Math. Methods Stat. 2004, 13, 1–12.
  35. Bar-Lev, S.K.; Kagan, A. Bivariate distributions with Gaussian-Type dependence structure. Commun. Stat. Theory Methods 2009, 38, 2669–2676.
  36. Fosam, E.B.; Shanbhag, D.N. An extended Laha–Lukacs characterization result based on a regression property. J. Stat. Plan. Inference 1997, 63, 173–186.
  37. Wesolowski, J. Characterizations of distributions by constant regression of quadratic statistics on a linear one. Sankhya Ser. A 1990, 52, 383–386.
  38. Bar-Lev, S.K. Methods of constructing characterizations by constancy of regression on the sample mean and related problems for NEF’s. Math. Methods Stat. 2007, 16, 96–109.
  39. Bar-Lev, S.K.; Bshouty, D. A characterization of the generalized Laplace distribution by constant regression on the sample mean. Stat. Probab. Lett. 2016, 113, 79–83.
  40. Dunn, P.K.; Smyth, G.K. Generalized Linear Models with Examples in R; Springer: Berlin/Heidelberg, Germany, 2018.
  41. McCullagh, P.; Nelder, J.A. Generalized Linear Models, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 1989.
  42. Bar-Lev, S.K.; Landsman, Z. Exponential dispersion models: Second-order minimax estimation of the mean for unknown dispersion parameter. J. Stat. Plan. Inference 2006, 136, 3837–3851.
  43. Wilding, G.E.; Mudholkar, G.S. A gamma goodness-of-fit test based on characteristic independence of the mean and coefficient of variation. J. Stat. Plan. Inference 2008, 138, 3813–3821.
  44. Linnik, Y.V. Linear forms and statistical criteria. I, II. Ukrain. Mat. Zh. 1963, 5, 207–243, 247–290. (In Russian). English translation in Sel. Transl. Math. Statist. Prob. 1963, 3, 1–90.
  45. Nikitin, Y.Y. Tests based on characterizations, and their efficiencies: A survey. Acta Comment. Univ. Tartu. Math. 2017, 21, 3–24.
  46. Bar-Lev, S.K.; Batsidis, A.; Economou, P. Tweedie, Bar-Lev, and Enis class of leptokurtic distributions as a candidate for modeling real data. Commun. Stat. Case Stud. Data Anal. Appl. 2021, 7, 229–248.
  47. Marchetti, C.E.; Mudholkar, G.S. Characterization theorems and goodness-of-fit test. In Goodness-of-Fit Tests and Model Validity, Statistics for Industry and Technology; Huber-Carol, C., Balakrishnan, N., Nikulin, M.S., Mesbah, M., Eds.; Birkhäuser: Boston, MA, USA, 2002.
  48. Milosević, B. Asymptotic efficiency of goodness-of-fit tests based on Too-Lin characterization. Commun. Stat.-Simul. Comput. 2020, 49, 2082–2101.
  49. Mudholkar, G.S.; Lin, C.T. On two applications of characterization theorems to goodness-of-fit. Colloq. Math. Soc. Janos Bolyai 1984, 45, 395–414.
  50. Vasicek, O. A test of normality based on sample entropy. J. R. Stat. Soc. Ser. B 1976, 38, 54–59.
  51. Bar-Lev, S.K.; Batsidis, A.; Einbeck, J.; Liu, X.; Ren, P. Cumulant-Based Goodness-of-Fit Tests for the Tweedie, Bar-Lev and Enis Class of Distributions. Mathematics 2023, 11, 1603.
  52. Doob, J.L. The limiting distributions of certain statistics. Ann. Math. Stat. 1935, 6, 160–170.
  53. Hsu, C.T. The limiting distribution of a general class of statistics. Sci. Rec. (Acad. Sin.) 1942, 1, 37–41.
  54. Dunn, P.K.; Smyth, G.K. Evaluation of Tweedie exponential dispersion model densities by Fourier inversion. Stat. Comput. 2008, 18, 73–86.
  55. Dunn, P.K.; Smyth, G.K. Series evaluation of Tweedie exponential dispersion model densities. Stat. Comput. 2005, 15, 267–280.
  56. Vinogradov, V.; Paris, R.B.; Yanushkevichiene, O. New properties and representations for members of the power-variance family, I. Lith. Math. J. 2012, 52, 444–461.
  57. Vinogradov, V.; Paris, R.B.; Yanushkevichiene, O. New properties and representations for members of the power-variance family, II. Lith. Math. J. 2013, 53, 103–120.
  58. Dunn, P.K. Tweedie: Evaluation of Tweedie Exponential Family Models. R Package Version 2.3.5. 2022. Available online: https://cran.r-project.org/web/packages/tweedie/tweedie.pdf (accessed on 12 September 2023).
  59. Smyth, G.K. Statmod: Statistical Modeling. 2017. Available online: https://CRAN.R-project.org/package=statmod (accessed on 12 September 2023).
Figure 1. EDM-EVF density (26) for ( p , m ) = ( 1 , 2 ) , ( 1 , 6 ) , ( 0.5 , 2 ) , ( 0.5 , 4 ) .

Share and Cite

MDPI and ACS Style

Bar-Lev, S.K. The Exponential Dispersion Model Generated by the Landau Distribution—A Comprehensive Review and Further Developments. Mathematics 2023, 11, 4343. https://doi.org/10.3390/math11204343
