Article

An Optimal Three-Way Stable and Monotonic Spectrum of Bounds on Quantiles: A Spectrum of Coherent Measures of Financial Risk and Economic Inequality

Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA
Risks 2014, 2(3), 349-392; https://doi.org/10.3390/risks2030349
Submission received: 15 June 2014 / Revised: 21 August 2014 / Accepted: 10 September 2014 / Published: 23 September 2014

Abstract

A spectrum of upper bounds $(Q_\alpha(X;p))_{\alpha\in[0,\infty]}$ on the (largest) $(1-p)$-quantile $Q(X;p)$ of an arbitrary random variable $X$ is introduced and shown to be stable and monotonic in $\alpha$, $p$, and $X$, with $Q_0(X;p)=Q(X;p)$. If $p$ is small enough and the distribution of $X$ is regular enough, then $Q_\alpha(X;p)$ is rather close to $Q(X;p)$. Moreover, these quantile bounds are coherent measures of risk. Furthermore, $Q_\alpha(X;p)$ is the optimal value in a certain minimization problem, the minimizers in which are described in detail. This allows a comparatively easy incorporation of these bounds into more specialized optimization problems. In finance, $Q_0(X;p)$ and $Q_1(X;p)$ are known as the value at risk (VaR) and the conditional value at risk (CVaR). The bounds $Q_\alpha(X;p)$ can also be used as measures of economic inequality. The spectrum parameter $\alpha$ plays the role of an index of sensitivity to risk. The problems of the effective computation of the bounds are considered. Various other related results are obtained.

1. Introduction

The most common measure of risk is apparently the value at risk, $\mathrm{VaR}_p(X)$, defined as the largest $(1-p)$-quantile of (the distribution of) a random variable (r.v.) $X$, which represents an uncertain future loss on an investment portfolio. Although conceptually very simple, the risk measure $\mathrm{VaR}_p$ is not subadditive and, hence, is not coherent, in the sense established by Artzner et al. [1] and widely accepted afterwards. Other flaws of the value at risk are also well known; quoting Rockafellar and Uryasev [2]:
A very serious shortcoming of VaR, in addition, is that it provides no handle on the extent of the losses that might be suffered beyond the threshold amount indicated by this measure. It is incapable of distinguishing between situations where losses that are worse may be deemed only a little bit worse, and those where they could well be overwhelming. Indeed, it merely provides a lowest bound for losses in the tail of the loss distribution and has a bias toward optimism instead of the conservatism that ought to prevail in risk management.
In other words, the VaR is not sensitive to the amount of risk beyond the threshold. Moreover, as is also discussed in [2], $\mathrm{VaR}_p(X)$ is unstable in $p$ and unstable in (the distribution of) $X$: arbitrarily small changes of the confidence level $1-p$ or of the composition of the portfolio may effect arbitrarily large changes of the value of $\mathrm{VaR}_p(X)$. Closely related to these two kinds of instability is the inherent instability in the computation of $\mathrm{VaR}_p(X)$.
To address these deficiencies of the VaR, Rockafellar and Uryasev [2,3] proposed an alternative risk measure, CVaR, which stands for the conditional value at risk. In the case when (the distribution of) the r.v. $X$ is continuous, $\mathrm{CVaR}_p(X)$ can be defined as $\mathsf E\bigl(X \mid X\ge\mathrm{VaR}_p(X)\bigr)$, the conditional expectation of the loss given that the loss $X$ exceeds the threshold $\mathrm{VaR}_p(X)$. This alternative risk measure, $\mathrm{CVaR}_p(X)$, is coherent and stable in $p$ and in $X$; it also has a certain, fixed sensitivity to the losses beyond the threshold.
However, $\mathrm{CVaR}_p(X)$ provides no handle on the degree of sensitivity to risk. In particular, as will be demonstrated in Section 5.2, one can easily construct two portfolios with the same value of $\mathrm{CVaR}_p$, such that one of the portfolios is clearly riskier than the other. Such indifference may generally be considered “an unwanted characteristic”; see e.g. comments on pages 36 and 48 in [4].
The main objective of the present paper is to remedy this indifference and provide the mentioned missing handle on the degree of sensitivity to risk, while retaining the coherence and stability properties. Indeed, we shall present a spectrum of risk measures $(Q_\alpha(X;p))_{\alpha\in[0,\infty]}$, where the spectrum parameter $\alpha$ may be considered the degree of sensitivity to risk: the greater the value of $\alpha$, the greater the sensitivity to risk; see Section 5.2 for details. In particular, $\alpha=\infty$ corresponds to an “exponentially” high degree of risk sensitivity. Moreover, the proposed spectrum of risk measures possesses the following properties:
(I) The common risk measures VaR and CVaR are in the spectrum: $Q_0(X;p)=\mathrm{VaR}_p(X)$ and $Q_1(X;p)=\mathrm{CVaR}_p(X)$; thus, $Q_\alpha(X;p)$ interpolates between $\mathrm{VaR}_p(X)$ and $\mathrm{CVaR}_p(X)$ for $\alpha\in(0,1)$ and extrapolates from $\mathrm{VaR}_p(X)$ and $\mathrm{CVaR}_p(X)$ on towards higher degrees of risk sensitivity for $\alpha\in(1,\infty]$. Details on this can be found in Section 5.1.
(II) The risk measure $Q_\alpha(\cdot;p)$ is coherent for each $\alpha\in[1,\infty]$ and each $p\in(0,1)$, but it is not coherent for any $\alpha\in[0,1)$ and any $p\in(0,1)$. Thus, $\alpha=1$ is the smallest value of the sensitivity index for which the risk measure $Q_\alpha(X;p)$ is coherent. One may also say that for $\alpha\in[1,\infty]$ the risk measure $Q_\alpha(\cdot;p)$ inherits the coherence of $\mathrm{CVaR}_p=Q_1(\cdot;p)$, and for $\alpha\in[0,1)$ it inherits the lack of coherence of $\mathrm{VaR}_p=Q_0(\cdot;p)$. For details, see Section 5.3.
(III) $Q_\alpha(X;p)$ is three-way stable and monotonic: in $\alpha\in(0,\infty]$, in $p\in(0,1)$, and in $X$. Moreover, as stated in Theorem 3.4 and Proposition 3.5, $Q_\alpha(X;p)$ is nondecreasing in $X$ with respect to the stochastic dominance of any order $\gamma\in[1,\alpha+1]$; but this monotonicity property breaks down for the stochastic dominance of any order $\gamma\in(\alpha+1,\infty]$. Thus, the sensitivity index $\alpha$ is in a one-to-one correspondence with the highest order of the stochastic dominance respected by $Q_\alpha(X;p)$.
Rockafellar and Uryasev [2] also wrote: “Most importantly for applications, however, CVaR can be expressed by a remarkable minimization formula.” It will be shown (in Theorem 3.3) that our risk measures $Q_\alpha(X;p)$ possess quite a similar variational representation for each $\alpha\in(0,\infty]$, which in fact generalizes the minimization formula for CVaR. This representation allows a comparatively easy incorporation of the risk measures $Q_\alpha(X;p)$ into more specialized optimization problems, with additional restrictions on the r.v. $X$; see Section 4.3 for details.
The spectrum of risk measures $(Q_\alpha(X;p))_{\alpha\in[0,\infty]}$ is naturally based on a previously developed spectrum $(P_\alpha(X;x))_{\alpha\in[0,\infty]}$ of upper bounds on the tail probability $\mathsf P(X\ge x)$ for $x\in\mathbb R$, with $P_0(X;x)=\mathsf P(X\ge x)$ and $P_\infty(X;x)$ being the best possible exponential upper bound on $\mathsf P(X\ge x)$; see, e.g., [5,6] and the bibliography therein; a shorter version of [6] appeared as [7]. The spectrum $(P_\alpha(X;x))_{\alpha\in[0,\infty]}$ is shown in the present paper to be stable and monotonic in $\alpha$, $x$, and $X$. The bounds $P_\alpha(X;x)$ are optimal values in certain minimization problems. It is shown that the mentioned minimization problems for which $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ are the optimal values are in a certain sense dual to each other; in the special case $\alpha=\infty$, this corresponds to the bilinear Legendre–Fenchel duality.
A few related results are obtained as well. In particular, a generalization of the Cillo–Delquie necessary and sufficient condition for the so-called mean-risk (M-R) measure to be nondecreasing with respect to the stochastic dominance of order 1 is presented, with a short proof. Moreover, a necessary and sufficient condition for the M-R measure to be coherent is given.
It is also shown that the quantile bounds $Q_\alpha(X;p)$ can be used as measures of economic inequality, and then the spectrum parameter $\alpha$ may be considered an index of sensitivity to inequality: the greater the value of $\alpha$, the greater the sensitivity of the function $Q_\alpha(\cdot;p)$ to inequality.
In addition, it is demonstrated that $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ can be effectively computed.
The paper is structured as follows.
* In Section 2, the three-way stability and monotonicity, as well as other useful properties, of the spectrum $(P_\alpha(X;x))_{\alpha\in[0,\infty]}$ of upper bounds on tail probabilities are established.
* In Section 3, the corresponding properties of the spectrum $(Q_\alpha(X;p))_{\alpha\in[0,\infty]}$ of risk measures are presented, as well as other useful properties.
* The matters of effective computation of $P_\alpha(X;x)$ and $Q_\alpha(X;p)$, as well as optimization of $Q_\alpha(X;p)$ with respect to $X$, are considered in Section 4.
* An extensive discussion of results is presented in Section 5, particularly in relation to the existing literature.
* Concluding remarks are collected in Section 6.
* The necessary proofs are given in Appendix A.
Further details can be found in the arXiv version of this paper [8].

2. An Optimal Three-Way Stable and Three-Way Monotonic Spectrum of Upper Bounds on Tail Probabilities

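Consider the family $(h_\alpha)_{\alpha\in[0,\infty]}$ of functions $h_\alpha\colon\mathbb R\to\mathbb R$ given by the formula
$$h_\alpha(u) := \begin{cases} \mathrm I\{u\ge0\} & \text{if } \alpha=0,\\ (1+u/\alpha)_+^{\alpha} & \text{if } 0<\alpha<\infty,\\ e^{u} & \text{if } \alpha=\infty \end{cases} \tag{2.1}$$
for all $u\in\mathbb R$. Here, as usual, $\mathrm I\{\cdot\}$ denotes the indicator function, $u_+ := 0\vee u$, and $u_+^\alpha := (u_+)^\alpha$ for all real $u$.
Obviously, the function $h_\alpha$ is nonnegative and nondecreasing for each $\alpha\in[0,\infty]$, and it is also continuous for each $\alpha\in(0,\infty]$. Moreover, it is easy to see that, for each $u\in\mathbb R$,
$$h_\alpha(u) \text{ is nondecreasing and continuous in } \alpha\in[0,\infty]. \tag{2.2}$$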
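As an informal illustration (not part of the original development), the family $h_\alpha$ from (2.1) and the monotonicity in $\alpha$ stated in (2.2) can be checked numerically on a few points. The names `h_alpha` and `is_nondecreasing_in_alpha`, as well as the grids of test values, are choices made here for this sketch only.

```python
import math

def h_alpha(u, alpha):
    """Generalized moment function h_alpha(u) as in (2.1)."""
    if alpha == 0:
        return 1.0 if u >= 0 else 0.0          # indicator I{u >= 0}
    if math.isinf(alpha):
        return math.exp(u)                      # exponential case
    return max(0.0, 1.0 + u / alpha) ** alpha   # (1 + u/alpha)_+^alpha

def is_nondecreasing_in_alpha(u, alphas, tol=1e-12):
    """Check h_alpha(u) <= h_beta(u) along an increasing grid of alphas."""
    values = [h_alpha(u, a) for a in alphas]
    return all(v1 <= v2 + tol for v1, v2 in zip(values, values[1:]))

alphas = [0, 0.5, 1, 2, 10, 100, math.inf]      # illustrative grid
for u in (-3.0, -0.5, 0.0, 0.7, 2.0):
    print(u, is_nondecreasing_in_alpha(u, alphas))
```

A finite grid can of course only illustrate (2.2), not prove it.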
Next, let us use the functions h α as generalized moment functions and thus introduce the generalized moments
$$A_\alpha(X;x)(\lambda) := \mathsf E\, h_\alpha\bigl(\lambda(X-x)\bigr). \tag{2.3}$$
Here and in what follows, unless otherwise specified, $X$ is any random variable (r.v.), $x\in\mathbb R$, $\alpha\in[0,\infty]$, and $\lambda\in(0,\infty)$. Since $h_\alpha\ge0$, the expectation in formula (2.3) is always defined, but may take the value $\infty$. It may be noted that in the particular case $\alpha=0$, one has
$$A_0(X;x)(\lambda) = \mathsf P(X\ge x), \tag{2.4}$$
which does not actually depend on $\lambda\in(0,\infty)$.
Now one can introduce the expressions
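$$P_\alpha(X;x) := \inf_{\lambda\in(0,\infty)} A_\alpha(X;x)(\lambda) = \begin{cases} \mathsf P(X\ge x) & \text{if } \alpha=0,\\ \inf_{\lambda\in(0,\infty)} \mathsf E\bigl(1+\lambda(X-x)/\alpha\bigr)_+^{\alpha} & \text{if } 0<\alpha<\infty,\\ \inf_{\lambda\in(0,\infty)} \mathsf E\, e^{\lambda(X-x)} & \text{if } \alpha=\infty. \end{cases} \tag{2.5}$$
By the property stated in (2.2), $A_\alpha(X;x)(\lambda)$ and $P_\alpha(X;x)$ are nondecreasing in $\alpha\in[0,\infty]$. In particular,
$$P_0(X;x) = \mathsf P(X\ge x) \le P_\alpha(X;x). \tag{2.6}$$
It will be shown later (see Proposition 2.3) that $P_\alpha(X;x)$ also largely inherits the property of $h_\alpha(u)$ of being continuous in $\alpha\in[0,\infty]$.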
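To make definition (2.5) concrete, here is a minimal numerical sketch for a finitely supported r.v. It is only an illustration under stated assumptions: it handles $\alpha=0$ and $\alpha\in[1,\infty]$ (where $A_\alpha(X;x)(\lambda)$ is convex in $\lambda$, so a ternary search applies), it restricts $\lambda$ to a bounded range, and the names `tail_bound`, `log_lo`, `log_hi` are hypothetical. The example distribution is the zero-mean two-point law on $\{-1,3\}$.

```python
import math

def h_alpha(u, alpha):
    """h_alpha(u) as in (2.1); returns inf on floating-point overflow."""
    if alpha == 0:
        return 1.0 if u >= 0 else 0.0
    try:
        if math.isinf(alpha):
            return math.exp(u)
        return max(0.0, 1.0 + u / alpha) ** alpha
    except OverflowError:
        return math.inf

def tail_bound(values, probs, x, alpha, log_lo=-12.0, log_hi=6.0, iters=200):
    """Approximate P_alpha(X; x) of (2.5) for alpha = 0 or alpha >= 1.

    Assumption of this sketch: the infimum over lambda is searched only on
    exp(log_lo) <= lambda <= exp(log_hi).
    """
    if alpha == 0:
        return sum(p for v, p in zip(values, probs) if v >= x)

    def A(s):  # A_alpha(X; x)(lambda) with lambda = exp(s)
        lam = math.exp(s)
        return sum(p * h_alpha(lam * (v - x), alpha)
                   for v, p in zip(values, probs))

    lo, hi = log_lo, log_hi
    for _ in range(iters):  # ternary search on the (quasi-)convex function A
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if A(m1) <= A(m2):
            hi = m2
        else:
            lo = m1
    return A(0.5 * (lo + hi))

values, probs = [-1.0, 3.0], [0.75, 0.25]   # zero-mean two-point law
for alpha in (0, 1, 2, math.inf):
    print(alpha, round(tail_bound(values, probs, 1.0, alpha), 6))
```

For this distribution and $x=1$ the bounds can be worked out by hand as $1/4$, $1/2$, $3/4$, and $\sqrt3/2$ for $\alpha=0,1,2,\infty$, in agreement with the monotonicity in $\alpha$ and with (2.6).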
The definition (2.5) can be rewritten as
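$$P_\alpha(X;x) = \inf_{t\in T_\alpha} \tilde A_\alpha(X;x)(t), \tag{2.7}$$
where
$$T_\alpha := \begin{cases} \mathbb R & \text{if } \alpha\in[0,\infty),\\ (0,\infty) & \text{if } \alpha=\infty \end{cases} \tag{2.8}$$
and
$$\tilde A_\alpha(X;x)(t) := \begin{cases} \dfrac{\mathsf E(X-t)_+^{\alpha}}{(x-t)_+^{\alpha}} & \text{if } \alpha\in[0,\infty),\\[1ex] \mathsf E\, e^{(X-x)/t} & \text{if } \alpha=\infty. \end{cases} \tag{2.9}$$
Here and subsequently, we also use the conventions $0^0 := 0$ and $\frac a0 := \infty$ for all $a\in[0,\infty]$. The alternative representation (2.7) of $P_\alpha(X;x)$ follows because (i) $A_\alpha(X;x)(\lambda) = \tilde A_\alpha(X;x)\bigl(x-\frac\alpha\lambda\bigr)$ for $\alpha\in(0,\infty)$; (ii) $A_\infty(X;x)(\lambda) = \tilde A_\infty(X;x)\bigl(\frac1\lambda\bigr)$; and (iii) $P_0(X;x) = \mathsf P(X\ge x) = \inf_{t\in(-\infty,x)} \mathsf P(X>t) = \inf_{t\in(-\infty,x)} \tilde A_0(X;x)(t)$.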
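For readers who want the reparametrization in (i) spelled out, the following short computation may help; it is a sketch added here, with $t := x-\alpha/\lambda$, so that $x-t=\alpha/\lambda>0$, for $\alpha\in(0,\infty)$ and $\lambda\in(0,\infty)$:

```latex
\Bigl(1+\frac{\lambda(X-x)}{\alpha}\Bigr)_+^{\alpha}
  = \Bigl(\frac{\lambda}{\alpha}\Bigr)^{\alpha}
    \Bigl(X-x+\frac{\alpha}{\lambda}\Bigr)_+^{\alpha}
  = \frac{(X-t)_+^{\alpha}}{(x-t)^{\alpha}},
\qquad t := x-\frac{\alpha}{\lambda}.
```

Taking expectations gives $A_\alpha(X;x)(\lambda)=\tilde A_\alpha(X;x)(t)$. As $\lambda$ ranges over $(0,\infty)$, $t$ ranges over $(-\infty,x)$; for $t\ge x$ the convention $\frac a0:=\infty$ makes $\tilde A_\alpha(X;x)(t)=\infty$, so extending the infimum to all $t\in T_\alpha=\mathbb R$ does not change its value.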
In view of Formula (2.7), one can see (cf. Corollary 2.3 in [5]) that, for each $\alpha\in[0,\infty]$, $P_\alpha(X;x)$ is the optimal (that is, least possible) upper bound on the tail probability $\mathsf P(X\ge x)$ given the generalized moments $\mathsf E\, g_{\alpha;t}(X)$ for all $t\in T_\alpha$, where
$$g_{\alpha;t}(u) := \begin{cases} (u-t)_+^{\alpha} & \text{if } \alpha\in[0,\infty),\\ e^{u/t} & \text{if } \alpha=\infty. \end{cases} \tag{2.10}$$
In fact (cf. e.g. Proposition 3.3 in [6]), the bound $P_\alpha(X;x)$ remains optimal given the larger class of generalized moments $\mathsf E\, g(X)$ for all functions $g\in\mathcal H_\alpha$, where
$$\mathcal H_\alpha := \Bigl\{ g\in\mathbb R^{\mathbb R}\colon g(u) = \int_{T_\alpha} g_{\alpha;t}(u)\,\mu(dt) \text{ for some } \mu\in\mathcal M_\alpha \text{ and all } u\in\mathbb R \Bigr\}, \tag{2.11}$$
$\mathcal M_\alpha$ denotes the set of all nonnegative Borel measures on $T_\alpha$, and, as usual, $\mathbb R^{\mathbb R}$ stands for the set of all real-valued functions on $\mathbb R$. By Proposition 1(ii) in [9] and Proposition 3.4 in [6],
$$0\le\alpha<\beta\le\infty \text{ implies } \mathcal H_\alpha \supseteq \mathcal H_\beta. \tag{2.12}$$
This provides another way to come to the mentioned conclusion that
$$P_\alpha(X;x) \text{ is nondecreasing in } \alpha\in[0,\infty]. \tag{2.13}$$
By Proposition 1.1 in [10], the class $\mathcal H_\alpha$ of generalized moment functions can be characterized as follows in the case when $\alpha$ is a natural number: for any $g\in\mathbb R^{\mathbb R}$, one has $g\in\mathcal H_\alpha$ if and only if $g$ has finite derivatives $g^{(0)} := g$, $g^{(1)} := g'$, $\dots$, $g^{(\alpha-1)}$ on $\mathbb R$, such that $g^{(\alpha-1)}$ is convex on $\mathbb R$ and $\lim_{x\to-\infty} g^{(j)}(x) = 0$ for $j=0,1,\dots,\alpha-1$. Moreover, by Proposition 3.4 in [6], $g\in\mathcal H_\infty$ if and only if $g$ is infinitely differentiable on $\mathbb R$, and $g^{(j)}\ge0$ on $\mathbb R$ and $\lim_{x\to-\infty} g^{(j)}(x) = 0$ for all $j=0,1,\dots$.
Thus, the greater the value of $\alpha$, the narrower and easier to deal with is the class $\mathcal H_\alpha$ and the smoother are the functions comprising $\mathcal H_\alpha$. However, the greater the value of $\alpha$, the farther away is the bound $P_\alpha(X;x)$ from the true tail probability $\mathsf P(X\ge x)$.
Of the bounds $P_\alpha(X;x)$, the loosest and easiest one to get is $P_\infty(X;x)$, the so-called exponential upper bound on the tail probability $\mathsf P(X\ge x)$. It is used very widely, in particular when $X$ is the sum of independent r.v.’s $X_i$, in which case one can rely on the factorization $A_\infty(X;x)(\lambda) = e^{-\lambda x}\prod_i \mathsf E\, e^{\lambda X_i}$. A bound very similar to $P_3(X;x)$ was introduced in [11] in the case when $X$ is the sum of independent bounded r.v.’s; see also [12,13,14]. For any $\alpha\in(0,\infty)$, the bound $P_\alpha(X;x)$ is a special case of a more general bound given in Corollary 2.3 in [5]; see also Theorem 2.5 in [5]. For some of the further developments in this direction, see [7] and the bibliography therein. The papers mentioned in this paragraph used the representation (2.7) of $P_\alpha(X;x)$, rather than the new representation (2.5). The new representation appears not only to be of a more unifying form, but also to be more convenient as far as such properties of $P_\alpha(X;x)$ as the monotonicity in $\alpha$ and the continuity in $\alpha$ and in $X$ are concerned; cf. (2.2) and the proofs of Propositions 2.3 and 2.4; those proofs, as well as the proofs of most of the other statements in this paper, are given in Appendix A. Yet another advantage of the representation (2.5) is that, for $\alpha\in[1,\infty)$, the function $A_\alpha(X;x)(\cdot)$ inherits the convexity property of $h_\alpha$, which facilitates the minimization of $A_\alpha(X;x)(\lambda)$ in $\lambda$, as needed to find $P_\alpha(X;x)$ by Formula (2.5); relevant details on the remaining “difficult case” $\alpha\in(0,1)$ can be found in Section 4.1.
On the other hand, the “old” representation (2.7) of $P_\alpha(X;x)$ is more instrumental in establishing the mentioned connection with the classes $\mathcal H_\alpha$ of generalized moment functions; in proving Part (iii) of Proposition 2.2; and in discovering and proving Theorem 3.3.
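The factorization of the exponential moment for sums of independent r.v.’s, mentioned above, can be illustrated by a small sketch. It assumes, purely for illustration, that $X=S_n$ is a sum of $n$ independent Rademacher r.v.’s (each equal to $\pm1$ with probability $1/2$), so that $\mathsf E\,e^{\lambda X_i}=\cosh\lambda$; the function names and the bounded search range `lam_hi` are hypothetical choices of this sketch. The exponent $-\lambda x+n\ln\cosh\lambda$ is convex in $\lambda$, so a ternary search is used.

```python
import math

def exp_bound_rademacher_sum(n, x, lam_hi=20.0, iters=200):
    """Approximate inf over lambda in (0, lam_hi] of exp(-lambda*x) * cosh(lambda)**n."""
    def log_A(lam):  # logarithm of the factorized exponential moment
        return -lam * x + n * math.log(math.cosh(lam))

    lo, hi = 0.0, lam_hi
    for _ in range(iters):  # ternary search on a convex function
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_A(m1) <= log_A(m2):
            hi = m2
        else:
            lo = m1
    return math.exp(log_A(0.5 * (lo + hi)))

def true_tail_rademacher_sum(n, x):
    """P(S_n >= x), using S_n = 2*K - n with K ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, k) for k in range(n + 1) if 2 * k - n >= x) / 2 ** n

bound = exp_bound_rademacher_sum(10, 6)
exact = true_tail_rademacher_sum(10, 6)
print(exact, bound)
```

For $n=10$ and $x=6$ the minimizing $\lambda$ is $\ln 2$, which gives the bound $1.25^{10}/64\approx0.1455$, to be compared with the exact tail probability $56/1024\approx0.0547$.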
***
Some of the more elementary properties of P α ( X ; x ) are presented in
Proposition 2.1.
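(i) $P_\alpha(X;x)$ is nonincreasing in $x\in\mathbb R$.
(ii) If $\alpha\in(0,\infty)$ and $\mathsf E X_+^\alpha=\infty$, then $P_\alpha(X;x)=\infty$ for all $x\in\mathbb R$.
(iii) If $\alpha=\infty$ and $\mathsf E\,e^{\lambda X}=\infty$ for all real $\lambda>0$, then $P_\infty(X;x)=\infty$ for all $x\in\mathbb R$.
(iv) If $\alpha\in(0,\infty)$ and $\mathsf E X_+^\alpha<\infty$, then $P_\alpha(X;x)\to1$ as $x\to-\infty$ and $P_\alpha(X;x)\to0$ as $x\to\infty$, so that $0\le P_\alpha(X;x)\le1$ for all $x\in\mathbb R$.
(v) If $\alpha=\infty$ and $\mathsf E\,e^{\lambda_0X}<\infty$ for some real $\lambda_0>0$, then $P_\alpha(X;x)\to1$ as $x\to-\infty$ and $P_\alpha(X;x)\to0$ as $x\to\infty$, so that $0\le P_\alpha(X;x)\le1$ for all $x\in\mathbb R$.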
In view of Proposition 2.1, it will be henceforth assumed by default that the tail bounds $P_\alpha(X;x)$ (as well as the quantile bounds $Q_\alpha(X;p)$, to be introduced in Section 3, and also the corresponding expressions $A_\alpha(X;x)(\lambda)$, $\tilde A_\alpha(X;x)(t)$, and $B_\alpha(X;p)(t)$, as in Formulas (2.3), (2.9), and (3.9)) are defined and considered only for r.v.’s $X\in\mathcal X_\alpha$ (unless indicated otherwise), where
$$\mathcal X_\alpha := \begin{cases} \mathcal X & \text{if } \alpha=0,\\ \{X\in\mathcal X\colon \mathsf E X_+^\alpha<\infty\} & \text{if } \alpha\in(0,\infty),\\ \{X\in\mathcal X\colon \Lambda_X\ne\emptyset\} & \text{if } \alpha=\infty, \end{cases} \tag{2.14}$$
$\mathcal X$ is the set of all real-valued r.v.’s on a given probability space (implicit in this paper), and
$$\Lambda_X := \{\lambda\in(0,\infty)\colon \mathsf E\,e^{\lambda X}<\infty\}. \tag{2.15}$$
Observe that the set $\mathcal X_\alpha$ is a convex cone containing all real constants; for details on this, one may see comments in the paragraph containing Formula (1.14) in [8].
As usual, we let $\|Z\|_\alpha := (\mathsf E|Z|^\alpha)^{1/\alpha}$, the $L^\alpha$-norm of a r.v. $Z$, which is actually a norm if and only if $\alpha\ge1$.
It follows from Proposition 2.1 and Formula (2.6) that
$$P_\alpha(X;x) \text{ is nonincreasing in } x\in\mathbb R, \text{ with } P_\alpha(X;(-\infty)+)=1 \text{ and } P_\alpha(X;\infty-)=0. \tag{2.16}$$
Here, as usual, $f(a+)$ and $f(a-)$ denote the right and left limits of $f$ at $a$.
One can say more in this respect. To do that, introduce
$$x_* := x_{*,X} := \sup\operatorname{supp}X \quad\text{and}\quad p_* := p_{*,X} := \mathsf P(X=x_*). \tag{2.17}$$
Here, as usual, $\operatorname{supp}X$ denotes the support set of (the distribution of the r.v.) $X$; speaking somewhat loosely, $x_*$ is the maximum value taken by the r.v. $X$, and $p_*$ is the probability with which this value is taken. It is of course possible that $x_*=\infty$, in which case necessarily $p_*=0$, since the r.v. $X$ was assumed to be real-valued.
Introduce also
$$x_\alpha := x_{\alpha,X} := \inf E_\alpha(1), \tag{2.18}$$
where
$$E_\alpha(p) := E_{\alpha,X}(p) := \{x\in\mathbb R\colon P_\alpha(X;x)<p\}. \tag{2.19}$$
Recall that, according to the standard convention, for any subset $E$ of $\mathbb R$, $\inf E=\infty$ if and only if $E=\emptyset$. Now, one can state
Proposition 2.2. 
(i) For all $x\in[x_*,\infty)$, one has $P_\alpha(X;x)=P_0(X;x)=\mathsf P(X\ge x)=\mathsf P(X=x)=p_*\,\mathrm I\{x=x_*\}$.
(ii) For all $x\in(-\infty,x_*)$, one has $P_\alpha(X;x)>0$.
(iii) The function $(-\infty,x_*]\cap\mathbb R\ni x\mapsto P_\alpha(X;x)^{-1/\alpha}$ is continuous and convex if $\alpha\in(0,\infty)$; we use the conventions $0^{-a}:=\infty$ and $\infty^{-a}:=0$ for all real $a>0$; concerning the continuity of functions with values in the set $[0,\infty]$, we use the natural topology on this set. Also, the function $(-\infty,x_*]\cap\mathbb R\ni x\mapsto-\ln P_\infty(X;x)$ is continuous and convex, with the convention $\ln0:=-\infty$.
(iv) If $\alpha\in(0,\infty]$, then the function $(-\infty,x_*]\cap\mathbb R\ni x\mapsto P_\alpha(X;x)$ is continuous.
(v) The function $\mathbb R\ni x\mapsto P_\alpha(X;x)$ is left-continuous.
(vi) $x_\alpha$ is nondecreasing in $\alpha\in[0,\infty]$, and $x_\alpha<\infty$ for all $\alpha\in[0,\infty]$.
(vii) If $\alpha\in[1,\infty]$, then $x_\alpha=\mathsf EX$; even for $X\in\mathcal X_\alpha$, it is of course possible that $\mathsf EX=-\infty$, in which case $P_\alpha(X;x)<1$ for all real $x$.
(viii) $x_\alpha\le x_*$, and $x_\alpha=x_*$ if and only if $p_*=1$.
(ix) $E_\alpha(1)=(x_\alpha,\infty)\ne\emptyset$.
(x) $P_\alpha(X;x)=1$ for all $x\in(-\infty,x_\alpha]$.
(xi) If $\alpha\in(0,\infty]$, then $P_\alpha(X;x)$ is strictly decreasing in $x\in[x_\alpha,x_*]\cap\mathbb R$.
This proposition will be useful when establishing continuity properties of the quantile bounds considered in Section 3 and the matters of effective computation addressed in Section 4. Moreover, Proposition 2.2 will be heavily used in the proof of Proposition 3.1 to establish basic properties of the risk measures Q α ( X ; p ) .
For $\alpha\in(1,\infty)$, Parts (i), (iv), (vii), (x), and (xi) of Proposition 2.2 are contained in [6], Proposition 3.2.
One may also note here that, by (2.16) and Part (v) of Proposition 2.2, the function $P_\alpha(X;\cdot)$ may be regarded as the tail function of some r.v. $Z_\alpha$: $P_\alpha(X;u)=\mathsf P(Z_\alpha\ge u)$ for all real $u$.
Some parts of Propositions 2.1 and 2.2 are illustrated in Example 1.3 in [8] and in the corresponding figure there.
Proposition 2.3. $P_\alpha(X;x)$ is continuous in $\alpha\in[0,\infty]$ in the following sense: Suppose that $(\alpha_n)$ is any sequence in $[0,\infty)$ converging to $\alpha\in[0,\infty]$, with $\beta:=\sup_n\alpha_n$ and $X\in\mathcal X_\beta$; then $P_{\alpha_n}(X;x)\to P_\alpha(X;x)$.
In view of Parts (ii) and (iii) of Proposition 2.1, the condition $X\in\mathcal X_\beta$ in Proposition 2.3 is essential.
Let us now turn to the question of stability of $P_\alpha(X;x)$ with respect to (the distribution of) $X$. First here, recall that one of a number of mutually equivalent definitions of the convergence in distribution, $X_n\xrightarrow[n\to\infty]{\mathrm D}X$, of a sequence of r.v.’s $X_n$ to an r.v. $X$ is the following: $\mathsf P(X_n\ge x)\xrightarrow[n\to\infty]{}\mathsf P(X\ge x)$ for all real $x$ such that $\mathsf P(X=x)=0$; cf. e.g. [15, §4 and Theorem 2.1].
We shall also need the following uniform integrability condition:
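$$\sup_n \mathsf E\,(X_n)_+^{\alpha}\,\mathrm I\{X_n>N\}\xrightarrow[N\to\infty]{}0 \quad\text{if } \alpha\in(0,\infty), \tag{2.20}$$
$$\sup_n \mathsf E\,e^{\lambda X_n}\,\mathrm I\{X_n>N\}\xrightarrow[N\to\infty]{}0 \quad\text{for each } \lambda\in\Lambda_X \text{ if } \alpha=\infty. \tag{2.21}$$
Proposition 2.4. Suppose that $\alpha\in(0,\infty]$. Then $P_\alpha(X;x)$ is continuous in $X$ in the following sense. Take any sequence $(X_n)_{n\in\mathbb N}$ of real-valued r.v.’s such that $X_n\xrightarrow[n\to\infty]{\mathrm D}X$ and the uniform integrability condition (2.20)-(2.21) is satisfied. Then one has the following.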
(i) The convergence
$$P_\alpha(X_n;x)\xrightarrow[n\to\infty]{}P_\alpha(X;x) \tag{2.22}$$
takes place for all real $x\ne x_*$, where $x_*=x_{*,X}$ as in (2.17); thus, by Parts (i) and (iv) of Proposition 2.2, (2.22) holds for all real $x$ that are points of continuity of the function $P_\alpha(X;\cdot)$.
(ii) The convergence (2.22) holds for $x=x_*$ as well, provided that $\mathsf P(X_n=x_*)\xrightarrow[n\to\infty]{}\mathsf P(X=x_*)$. In particular, (2.22) holds for $x=x_*$ if $\mathsf P(X=x_*)=0$.
Note that in the case $\alpha=0$ the convergence (2.22) may fail to hold, not only for $x=x_*$, but for all real $x$ such that $\mathsf P(X=x)>0$.
***
Let us now discuss matters of monotonicity of $P_\alpha(X;x)$ in $X$, with respect to various orders on the mentioned set $\mathcal X$ of all real-valued r.v.’s $X$. Using the family of function classes $\mathcal H_\alpha$, defined by (2.11), one can introduce a family of stochastic orders, say $\le_{\alpha+1}$, on the set $\mathcal X$ by the formula
$$X\le_{\alpha+1}Y \overset{\text{def}}{\iff} \mathsf E\,g(X)\le\mathsf E\,g(Y) \text{ for all } g\in\mathcal H_\alpha,$$
where $\alpha\in[0,\infty]$ and $X$ and $Y$ are in $\mathcal X$. To avoid using the term “order” with two different meanings in one phrase, let us refer to the relation $\le_{\alpha+1}$ as the stochastic dominance of order $\alpha+1$, rather than the stochastic order of order $\alpha+1$. In view of (2.11), it is clear that
$$X\le_{\alpha+1}Y \iff \mathsf E\,g_{\alpha;t}(X)\le\mathsf E\,g_{\alpha;t}(Y) \text{ for all } t\in T_\alpha,$$
so that, in the case when $\alpha=m-1$ for some natural number $m$, the order $\le_{\alpha+1}$ coincides with the “$m$-increasing-convex” order $\le_{m\text{-icx}}$ as defined e.g. on page 206 in [16]. In particular,
$$X\le_1Y \iff \mathsf P(X>t)\le\mathsf P(Y>t) \text{ for all } t\in\mathbb R \iff \mathsf P(X\ge t)\le\mathsf P(Y\ge t) \text{ for all } t\in\mathbb R \iff X\le_{\mathrm{st}}Y, \tag{2.24}$$
where $\le_{\mathrm{st}}$ denotes the usual stochastic dominance of order 1, and
$$X\le_2Y \iff \mathsf E(X-t)_+\le\mathsf E(Y-t)_+ \text{ for all } t\in\mathbb R,$$
so that $\le_2$ coincides with the usual stochastic dominance of order 2. Also,
$$X\le_{\mathrm{st}}Y \text{ iff for some r.v.’s } X_1 \text{ and } Y_1 \text{ one has } X_1\le Y_1,\ X_1\overset{\mathrm D}{=}X, \text{ and } Y_1\overset{\mathrm D}{=}Y,$$
where $\overset{\mathrm D}{=}$ denotes the equality in distribution.
By (2.12), the orders $\le_{\alpha+1}$ are graded in the sense that
$$\text{if } X\le_{\alpha+1}Y \text{ for some } \alpha\in[0,\infty], \text{ then } X\le_{\beta+1}Y \text{ for all } \beta\in[\alpha,\infty].$$
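The characterizations of the dominance of orders 1 and 2 given above can be checked directly for finitely supported r.v.’s. The following is a sketch under that assumption, with hypothetical function names. It relies on the fact that $t\mapsto\mathsf P(X>t)$ is a right-continuous step function and $t\mapsto\mathsf E(X-t)_+$ is piecewise linear, both changing only at atoms, so it suffices to compare the two sides at the atoms of $X$ and $Y$.

```python
def tail_prob(values, probs, t):
    """P(X > t) for a finitely supported X."""
    return sum(p for v, p in zip(values, probs) if v > t)

def stop_loss(values, probs, t):
    """E (X - t)_+ for a finitely supported X."""
    return sum(p * (v - t) for v, p in zip(values, probs) if v > t)

def dominated(x_vals, x_probs, y_vals, y_probs, order, tol=1e-12):
    """Check whether X is dominated by Y in the order 1 or 2 sense."""
    if order == 1:
        g = tail_prob
    elif order == 2:
        g = stop_loss
    else:
        raise ValueError("this sketch covers orders 1 and 2 only")
    atoms = sorted(set(x_vals) | set(y_vals))
    return all(g(x_vals, x_probs, t) <= g(y_vals, y_probs, t) + tol
               for t in atoms)

# X is the constant 0; Y is the zero-mean two-point law on {-1, 3}.
X = ([0.0], [1.0])
Y = ([-1.0, 3.0], [0.75, 0.25])
print(dominated(*X, *Y, order=1))  # order-1 dominance fails here
print(dominated(*X, *Y, order=2))  # order-2 dominance holds here
```

In this example $X\le_2Y$ holds (as Jensen's inequality suggests, since $\mathsf EY=0=X$), while $X\le_1Y$ fails; this is consistent with the grading of the orders, under which dominance of a lower order is the stronger requirement.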
A stochastic order, which is a “mirror image” of the order $\le_{\alpha+1}$, but only for nonnegative r.v.’s, was presented by Fishburn in [17]; note Theorem 2 in [17] on the relation with a “bounded” version of this order, previously introduced and studied in [18]. Denoting the corresponding Fishburn [17] order by $\le^{\mathrm F}_{\alpha+1}$, one has
$$X\le^{\mathrm F}_{\alpha+1}Y \iff (-Y)\le_{\alpha+1}(-X),$$
for nonnegative r.v.’s $X$ and $Y$. However, as shown in this paper (recall Proposition 2.1), the condition of the nonnegativity of the r.v.’s is not essential; without it, one can either deal with infinite expected values or, alternatively, require that they be finite. The case when $\alpha$ is an integer was considered, in a different form, in [19].
One may also consider the order $\le^{-1}_{\alpha}$ defined by the condition that $X\le^{-1}_{\alpha}Y$ if and only if $X$ and $Y$ are nonnegative r.v.’s and $F_X^{(-\alpha)}(p)\le F_Y^{(-\alpha)}(p)$ for all $p\in(0,1)$, where $\alpha\in(0,\infty)$,
$$F_X^{(-\alpha)}(p) := \frac1{\Gamma(\alpha)}\int_{[0,p)}(p-u)^{\alpha-1}\,dF_X^{-1}(u), \tag{2.29}$$
$$F_X^{-1}(p) := \inf\{x\in[0,\infty)\colon \mathsf P(X\le x)\ge p\} = -Q(-X;p)$$
with $Q(\cdot;\cdot)$ as in (3.3), and the integral in (2.29) is understood as the Lebesgue integral with respect to the nonnegative Borel measure $\mu_X^{-1}$ on $[0,1)$ defined by the condition that $\mu_X^{-1}\bigl([0,p)\bigr)=F_X^{-1}(p)$ for all $p\in(0,1)$; cf. [20,21]. Note that $F_X^{(-1)}(p)=F_X^{-1}(p)$. For nonnegative r.v.’s, the order $\le^{-1}_{\alpha+1}$ coincides with the order $\le_{\alpha+1}$ if $\alpha\in\{0,1\}$; again see [20,21]. Even for nonnegative r.v.’s, it seems unclear how the orders $\le_{\alpha+1}$ and $\le^{-1}_{\alpha+1}$ relate to each other for positive real $\alpha\ne1$; see e.g. the discussion following Proposition 1 in [20] and Note 1 on page 100 in [22].
The following theorem summarizes some of the properties of the tail probability bounds P α ( X ; x ) established above and also adds a few simple properties of these bounds.
Theorem 2.5. The following properties of the tail probability bounds P α ( X ; x ) are valid.
Model-independence: $P_\alpha(X;x)$ depends on the r.v. $X$ only through the distribution of $X$.
Monotonicity in $X$: $P_\alpha(\cdot;x)$ is nondecreasing with respect to the stochastic dominance of order $\alpha+1$: for any r.v. $Y$ such that $X\le_{\alpha+1}Y$, one has $P_\alpha(X;x)\le P_\alpha(Y;x)$. Therefore, $P_\alpha(\cdot;x)$ is nondecreasing with respect to the stochastic dominance of any order $\gamma\in[1,\alpha+1]$; in particular, for any r.v. $Y$ such that $X\le Y$, one has $P_\alpha(X;x)\le P_\alpha(Y;x)$.
Monotonicity in $\alpha$: $P_\alpha(X;x)$ is nondecreasing in $\alpha\in[0,\infty]$.
Monotonicity in $x$: $P_\alpha(X;x)$ is nonincreasing in $x\in\mathbb R$.
Values: $P_\alpha(X;x)$ takes only values in the interval $[0,1]$.
$\alpha$-concavity in $x$: $P_\alpha(X;x)^{-1/\alpha}$ is convex in $x$ if $\alpha\in(0,\infty)$, and $\ln P_\alpha(X;x)$ is concave in $x$ if $\alpha=\infty$.
Stability in $x$: $P_\alpha(X;x)$ is continuous in $x$ at any point $x\in\mathbb R$, except the point $x=x_*$ when $p_*>0$.
Stability in $\alpha$: Suppose that a sequence $(\alpha_n)$ is as in Proposition 2.3. Then $P_{\alpha_n}(X;x)\to P_\alpha(X;x)$.
Stability in $X$: Suppose that $\alpha\in(0,\infty]$ and a sequence $(X_n)$ is as in Proposition 2.4. Then $P_\alpha(X_n;x)\to P_\alpha(X;x)$.
Translation invariance: $P_\alpha(X+c;x+c)=P_\alpha(X;x)$ for all real $c$.
Consistency: $P_\alpha(c;x)=P_0(c;x)=\mathrm I\{c\ge x\}$ for all real $c$; that is, if the r.v. $X$ is the constant $c$, then all the tail probability bounds $P_\alpha(X;x)$ precisely equal the true tail probability $\mathsf P(X\ge x)$.
Positive homogeneity: $P_\alpha(\kappa X;\kappa x)=P_\alpha(X;x)$ for all real $\kappa>0$.

3. An Optimal Three-Way Stable and Three-Way Monotonic Spectrum of Upper Bounds on Quantiles

Take any
$$p\in(0,1) \tag{3.1}$$
and introduce the generalized inverse (with respect to $x$) of the bound $P_\alpha(X;x)$ by the formula
$$Q_\alpha(X;p) := \inf E_{\alpha,X}(p) = \inf\{x\in\mathbb R\colon P_\alpha(X;x)<p\}, \tag{3.2}$$
where $E_{\alpha,X}(p)$ is as in (2.19). In particular, in view of the equality in (2.6),
$$Q(X;p) := Q_0(X;p) = \inf\{x\in\mathbb R\colon \mathsf P(X\ge x)<p\} = \inf\{x\in\mathbb R\colon \mathsf P(X>x)<p\}, \tag{3.3}$$
which is a $(1-p)$-quantile of (the distribution of) the r.v. $X$; actually, $Q(X;p)$ is the largest one in the set of all $(1-p)$-quantiles of $X$.
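For a finitely supported r.v., the quantile $Q(X;p)$ in (3.3) can be read off directly from the second expression there: since $x\mapsto\mathsf P(X>x)$ is a right-continuous step function, the infimum is the smallest atom $v$ with $\mathsf P(X>v)<p$. The sketch below assumes such a finitely supported $X$; the function name is hypothetical.

```python
def largest_quantile(values, probs, p):
    """Q(X; p) = inf{x : P(X > x) < p} for a finitely supported X, 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    for v in sorted(set(values)):
        # P(X > v): total mass strictly above the candidate atom v
        if sum(q for w, q in zip(values, probs) if w > v) < p:
            return v
    raise ValueError("probs must be nonnegative")  # not reached for valid input

values, probs = [-1.0, 3.0], [0.75, 0.25]   # zero-mean two-point law
print(largest_quantile(values, probs, 0.25))  # p equal to P(X = 3)
print(largest_quantile(values, probs, 0.30))  # p slightly above P(X = 3)
```

With $p=0.25$ the result is the top atom $3$, whereas with $p=0.30$ it drops to $-1$, which illustrates how abruptly the quantile can change with $p$.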
It follows immediately from (3.2), (2.13), and (3.3) that
$$Q_\alpha(X;p) \text{ is an upper bound on the quantile } Q(X;p), \text{ and } Q_\alpha(X;p) \text{ is nondecreasing in } \alpha\in[0,\infty]. \tag{3.4}$$
Thus, one has a monotonic spectrum of upper bounds, $Q_\alpha(X;p)$, on the quantile $Q(X;p)$, ranging from the tightest bound, $Q_0(X;p)=Q(X;p)$, to the loosest one, $Q_\infty(X;p)$, the latter being based on the exponential bound $P_\infty(X;x)=\inf_{\lambda>0}\mathsf E\,e^{\lambda(X-x)}$ on $\mathsf P(X\ge x)$.
Also, it is obvious from (3.2) that
$$Q_\alpha(X;p) \text{ is nonincreasing in } p\in(0,1). \tag{3.5}$$
Basic properties of Q α ( X ; p ) are collected in
Proposition 3.1. Recall the definitions of x * and x α in (2.17) and (2.18). The following statements are true.
(i) $Q_\alpha(X;p)\in\mathbb R$.
(ii) If $p\in(0,p_*]\cap(0,1)$, then $Q_\alpha(X;p)=x_*$.
(iii) $Q_\alpha(X;p)\le x_*$.
(iv) $Q_\alpha(X;p)\xrightarrow[p\downarrow0]{}x_*$.
(v) If $\alpha\in(0,\infty]$, then the function
$$(p_*,1)\ni p\mapsto Q_\alpha(X;p)\in(x_\alpha,x_*) \tag{3.6}$$
is the unique inverse to the continuous strictly decreasing function
$$(x_\alpha,x_*)\ni x\mapsto P_\alpha(X;x)\in(p_*,1). \tag{3.7}$$
Therefore, the function (3.6), too, is continuous and strictly decreasing.
(vi) If $\alpha\in(0,\infty]$, then for any $y\in\bigl(-\infty,Q_\alpha(X;p)\bigr)$, one has $P_\alpha(X;y)>p$.
(vii) If $\alpha\in[1,\infty]$, then $Q_\alpha(X;p)>\mathsf EX$.
Figure 1. Illustration of Proposition 3.1.
Example 3.2. Some parts of Proposition 3.1 are illustrated in Figure 1, with graphs $\{(p,Q_\alpha(X;p))\colon0<p<1\}$ in the important case when the r.v. $X$ takes only two values. Then, by the translation invariance property stated below in Theorem 3.4, without loss of generality (w.l.o.g.) $\mathsf EX=0$. Thus, $X=X_{a,b}$, where $a$ and $b$ are positive real numbers and $X_{a,b}$ is a r.v. with the uniquely determined zero-mean distribution on the set $\{-a,b\}$. Let us take $a=1$ and $b=3$, with the values of $\alpha$ equal to 0 (black), $\frac12$ (blue), 1 (green), 2 (orange), and $\infty$ (red). One may compare this picture with the one for $P_\alpha(X;x)$ in Example 1.3 in [8] (where the same values of $a$, $b$, and $\alpha$ were used), having in mind that the function $Q_\alpha(X;\cdot)$ is a generalized inverse to the function $P_\alpha(X;\cdot)$.
The definition (3.2) of $Q_\alpha(X;p)$ is rather complicated, in view of the definition (2.5) of $P_\alpha(X;x)$. So, the following theorem will be useful, as it provides a more direct expression of $Q_\alpha(X;p)$; in this connection, one may again recall (3.3), concerning the case $\alpha=0$.
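Theorem 3.3. For all $\alpha\in(0,\infty]$,
$$Q_\alpha(X;p) = \inf_{t\in T_\alpha} B_\alpha(X;p)(t), \tag{3.8}$$
where $T_\alpha$ is as in (2.8) and
$$B_\alpha(X;p)(t) := \begin{cases} t+\dfrac{\|(X-t)_+\|_\alpha}{p^{1/\alpha}} & \text{for } \alpha\in(0,\infty),\\[1ex] t\,\ln\dfrac{\mathsf E\,e^{X/t}}{p} & \text{for } \alpha=\infty. \end{cases} \tag{3.9}$$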
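The variational formula (3.8)-(3.9) lends itself to direct numerical evaluation. The sketch below is an illustration under assumptions that are not part of the theorem: $X$ is finitely supported, $\alpha\in[1,\infty]$ is moderate (so that $B_\alpha(X;p)(\cdot)$ is convex and powers do not overflow), and the search for $t$ is restricted to a bounded range controlled by the hypothetical parameter `span`. For $\alpha=\infty$ the expression $t\ln(\mathsf E\,e^{X/t}/p)$ is rewritten in a numerically stable log-sum-exp form.

```python
import math

def ternary_min(f, lo, hi, iters=300):
    """Minimum value of a convex (or quasi-convex) function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f(0.5 * (lo + hi))

def quantile_bound(values, probs, p, alpha, span=1000.0):
    """Approximate Q_alpha(X; p) via (3.8)-(3.9), assuming alpha >= 1."""
    x_min, x_max = min(values), max(values)
    if math.isinf(alpha):
        def B(s):  # B_inf(X; p)(t) with t = exp(s)
            t = math.exp(s)
            z = sum(q * math.exp((v - x_max) / t) for v, q in zip(values, probs))
            return x_max + t * math.log(z / p)
        return ternary_min(B, -20.0, 20.0)

    def B(t):  # t + ||(X - t)_+||_alpha / p**(1/alpha)
        m = sum(q * max(0.0, v - t) ** alpha for v, q in zip(values, probs))
        return t + (m / p) ** (1.0 / alpha)

    # B(t) >= t, so no minimizer lies to the right of the largest atom.
    lo = x_min - span * (x_max - x_min + 1.0)
    return ternary_min(B, lo, x_max)

values, probs = [-1.0, 3.0], [0.75, 0.25]   # zero-mean two-point law
print(quantile_bound(values, probs, 0.5, 1))                       # CVaR case
print(quantile_bound(values, probs, 0.75, 2))
print(quantile_bound(values, probs, math.sqrt(3) / 2, math.inf))
```

For this distribution one has $P_1(X;1)=1/2$, $P_2(X;1)=3/4$, and $P_\infty(X;1)=\sqrt3/2$, so by the inverse relation between $Q_\alpha(X;\cdot)$ and $P_\alpha(X;\cdot)$ all three calls should return values close to 1.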
Proof of Theorem 3.3. The proof is based on the simple observation, following immediately from the definitions (2.9) and (3.9), that the dual level sets for the functions $\tilde A_\alpha(X;x)$ and $B_\alpha(X;p)$ are the same:
$$T_{\tilde A_\alpha(X;x)}(p) = T_{B_\alpha(X;p)}(x) \tag{3.10}$$
for all $\alpha\in(0,\infty]$, $x\in\mathbb R$, and $p\in(0,1)$, where
$$T_{\tilde A_\alpha(X;x)}(p) := \{t\in T_\alpha\colon \tilde A_\alpha(X;x)(t)<p\} \quad\text{and}\quad T_{B_\alpha(X;p)}(x) := \{t\in T_\alpha\colon B_\alpha(X;p)(t)<x\}.$$
Indeed, by (2.7) and (3.10),
$$P_\alpha(X;x)<p \iff \inf_{t\in T_\alpha}\tilde A_\alpha(X;x)(t)<p \iff T_{\tilde A_\alpha(X;x)}(p)\ne\emptyset \iff T_{B_\alpha(X;p)}(x)\ne\emptyset \iff x>\inf_{t\in T_\alpha}B_\alpha(X;p)(t).$$
Now, (3.8) follows immediately by (3.2). ☐
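The “simple observation” (3.10) can be spelled out as follows for $\alpha\in(0,\infty)$; this is a sketch added for convenience. If $t\ge x$, then $\tilde A_\alpha(X;x)(t)=\infty$ by the convention $\frac a0:=\infty$, and $B_\alpha(X;p)(t)\ge t\ge x$, so such $t$ belongs to neither level set. If $t<x$, then

```latex
\tilde A_\alpha(X;x)(t) < p
\iff \mathsf{E}(X-t)_+^{\alpha} < p\,(x-t)^{\alpha}
\iff \|(X-t)_+\|_{\alpha} < p^{1/\alpha}\,(x-t)
\iff t + \frac{\|(X-t)_+\|_{\alpha}}{p^{1/\alpha}} < x
\iff B_\alpha(X;p)(t) < x .
```

For $\alpha=\infty$ and $t\in(0,\infty)$, the analogous chain is $\mathsf E\,e^{(X-x)/t}<p \iff \mathsf E\,e^{X/t}/p<e^{x/t} \iff t\ln\bigl(\mathsf E\,e^{X/t}/p\bigr)<x$.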
Note that the case $\alpha=\infty$ of Theorem 3.3 is a special case of Proposition 1.5 in [23], and the above proof of Theorem 3.3 is similar to that of Proposition 1.5 in [23]. Correspondingly, the duality presented in the above proof of Theorem 3.3 is a generalization of the bilinear Legendre–Fenchel duality considered in [23].
The following theorem presents the most important properties of the quantile bounds Q α ( X ; p ) , in addition to the variational representation of Q α ( X ; p ) given by Theorem 3.3.
Theorem 3.4. The following properties of the quantile bounds Q α ( X ; p ) are valid.
Model-independence: $Q_\alpha(X;p)$ depends on the r.v. $X$ only through the distribution of $X$.
Monotonicity in $X$: $Q_\alpha(\cdot;p)$ is nondecreasing with respect to the stochastic dominance of order $\alpha+1$: for any r.v. $Y$ such that $X\le_{\alpha+1}Y$, one has $Q_\alpha(X;p)\le Q_\alpha(Y;p)$. Therefore, $Q_\alpha(\cdot;p)$ is nondecreasing with respect to the stochastic dominance of any order $\gamma\in[1,\alpha+1]$; in particular, for any r.v. $Y$ such that $X\le Y$, one has $Q_\alpha(X;p)\le Q_\alpha(Y;p)$.
Monotonicity in $\alpha$: $Q_\alpha(X;p)$ is nondecreasing in $\alpha\in[0,\infty]$.
Monotonicity in $p$: $Q_\alpha(X;p)$ is nonincreasing in $p\in(0,1)$, and $Q_\alpha(X;p)$ is strictly decreasing in $p\in[p_*,1)\cap(0,1)$ if $\alpha\in(0,\infty]$.
Finiteness: $Q_\alpha(X;p)$ takes only (finite) real values.
Concavity in $p^{-1/\alpha}$ or in $\ln\frac1p$: $Q_\alpha(X;p)$ is concave in $p^{-1/\alpha}$ if $\alpha\in(0,\infty)$, and $Q_\infty(X;p)$ is concave in $\ln\frac1p$.
Stability in $p$: $Q_\alpha(X;p)$ is continuous in $p\in(0,1)$ if $\alpha\in(0,\infty]$.
Stability in $X$: Suppose that $\alpha\in(0,\infty]$ and a sequence $(X_n)$ is as in Proposition 2.4. Then $Q_\alpha(X_n;p)\to Q_\alpha(X;p)$.
Stability in $\alpha$: Suppose that $\alpha\in(0,\infty]$ and a sequence $(\alpha_n)$ is as in Proposition 2.3. Then $Q_{\alpha_n}(X;p)\to Q_\alpha(X;p)$.
Translation invariance: $Q_\alpha(X+c;p)=Q_\alpha(X;p)+c$ for all real $c$.
Consistency: $Q_\alpha(c;p)=c$ for all real $c$; that is, if the r.v. $X$ is the constant $c$, then all of the quantile bounds $Q_\alpha(X;p)$ equal $c$.
Positive sensitivity: Suppose here that $X\ge0$. If, in addition, $\mathsf P(X>0)>0$, then $Q_\alpha(X;p)>0$ for all $\alpha\in(0,\infty]$; if, moreover, $\mathsf P(X>0)>p$, then $Q_0(X;p)>0$.
Positive homogeneity: $Q_\alpha(\kappa X;p)=\kappa\,Q_\alpha(X;p)$ for all real $\kappa\ge0$.
Subadditivity: $Q_\alpha(X;p)$ is subadditive in $X$ if $\alpha\in[1,\infty]$; that is, for any other r.v. $Y$ (defined on the same probability space as $X$) one has
$$Q_\alpha(X+Y;p)\le Q_\alpha(X;p)+Q_\alpha(Y;p).$$
Convexity: $Q_\alpha(X;p)$ is convex in $X$ if $\alpha\in[1,\infty]$; that is, for any other r.v. $Y$ (defined on the same probability space as $X$) and any $t\in(0,1)$ one has
$$Q_\alpha\bigl((1-t)X+tY;p\bigr)\le(1-t)\,Q_\alpha(X;p)+t\,Q_\alpha(Y;p).$$
The inequality $Q_1(X;p) \le Q_\infty(X;p)$, in other notations, was mentioned (without proof) in [24]; of course, this inequality is a particular, and important, case of the monotonicity of $Q_\alpha(X;p)$ in $\alpha\in[0,\infty]$. That $Q_\alpha(\cdot;p)$ is nondecreasing with respect to the stochastic dominance of order $\alpha+1$ was shown (using other notations) in [25] in the case $\alpha=1$.
The following two propositions complement the monotonicity property of $Q_\alpha(X;p)$ in $X$ stated in Theorem 3.4.
Proposition 3.5. The upper bound $\alpha+1$ on $\gamma$ in the statement of the monotonicity of $Q_\alpha(X;p)$ in $X$ in Theorem 3.4 is exact in the following rather strong sense. For any $\alpha\in[0,\infty)$, there exist r.v.’s $X$ and $Y$ in $\mathcal{X}_\alpha$ such that $X \le_\gamma Y$ for all $\gamma\in(\alpha+1,\infty]$, whereas $Q_\alpha(X;p) > Q_\alpha(Y;p)$.
Proposition 3.6. Suppose that an r.v. $Y$ is stochastically strictly greater than $X$ (which may be written as $X <_{\mathrm{st}} Y$; cf. (2.24)) in the sense that $X \le_{\mathrm{st}} Y$ and for any $v\in\mathbb{R}$ there is some $u\in(v,\infty)$ such that $\mathsf{P}(X \ge u) < \mathsf{P}(Y \ge u)$. Then $Q_\alpha(X;p) < Q_\alpha(Y;p)$ if $\alpha\in(0,\infty]$.
The latter proposition will be useful in the proof of Proposition 3.7 below.
Given the positive homogeneity, it is clear that the subadditivity and convexity properties of $Q_\alpha(X;p)$ easily follow from each other. In the statements in Theorem 3.4 on these two mutually equivalent properties, it was assumed that $\alpha\in[1,\infty]$. One may ask whether this restriction is essential. The answer to this question is “yes”:
Proposition 3.7. There are r.v.’s $X$ and $Y$ such that for all $\alpha\in[0,1)$ and all $p\in(0,1)$ one has $Q_\alpha(X+Y;p) > Q_\alpha(X;p) + Q_\alpha(Y;p)$, so that the function $Q_\alpha(\cdot;p)$ is not subadditive (and, equivalently, not convex).
It is well known (see e.g. [1,2,26]) that $Q(X;p) = Q_0(X;p)$ is not subadditive in $X$; it could therefore have been expected that $Q_\alpha(X;p)$ will not be subadditive in $X$ if $\alpha$ is close enough to 0. In quite a strong and specific sense, Proposition 3.7 justifies such expectations.
***
Consider briefly the rather important case when the distribution of $X$ belongs to a location-scale family; that is, when (the distribution of) the r.v. $X$ has a probability density function (pdf) of the form
$$f_{\mu,\sigma}(x) = \frac{1}{\sigma}\, f\Big(\frac{x-\mu}{\sigma}\Big) \tag{3.11}$$
for all real $x$, where $f$ is a pdf, $\mu\in\mathbb{R}$ is the “location” parameter, and $\sigma\in(0,\infty)$ is the “scale” parameter. Then $f$ may be referred to as the “standard” pdf of this family. Perhaps the most common example of a location-scale family is the normal distribution family, for which $f$ is the standard normal pdf, and $\mu$ and $\sigma$ are the mean and the standard deviation of the distribution.
Proposition 3.8. If the r.v. $X$ has a pdf of the form (3.11), then
$$Q_\alpha(X;p) = \mu + \sigma\, Q_\alpha(Z;p), \tag{3.12}$$
where $Z$ stands for any r.v. with the “standard” pdf $f$.
This follows immediately by the translation invariance, positive homogeneity, and model-independence properties stated in Theorem 3.4. Note that, given any location-scale family, $Q_\alpha(Z;p)$ depends only on $\alpha$ and $p$.
Remark 3.9. It is shown in [8] that for small enough values of $p$ the quantile bounds $Q_\alpha(X;p)$ are close enough to the true quantiles $Q_0(X;p) = \mathrm{VaR}_p(X)$, provided that the right tail of the distribution of $X$ is light enough and regular enough, depending on $\alpha$; see Proposition 2.7 in [8].
For instance, if the r.v. $X$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$, then, by (3.12) and the monotonicity of $Q_\alpha(X;p)$ in $\alpha$,
$$\mu + \sigma\, Q_0(Z;p) = Q_0(X;p) \le Q_\alpha(X;p) \le Q_\infty(X;p) = \mu + \sigma\, Q_\infty(Z;p). \tag{3.13}$$
Next, obviously $Q_0(Z;p) = \Phi^{-1}(1-p)$, where $\Phi^{-1}$ is the inverse to the standard normal distribution function $\Phi$, and $Q_\infty(Z;p) = \sqrt{2\ln\frac1p}$. Also, $1-\Phi(u) = \exp\{-u^2/(2+o(1))\}$ as $u\to\infty$. Therefore, $Q_0(Z;p) = \Phi^{-1}(1-p) \sim \sqrt{2\ln\frac1p} = Q_\infty(Z;p)$ as $p\downarrow0$. Here, as usual, $a\sim b$ means $a/b\to1$. Hence, by (3.13), $Q_\alpha(X;p) \sim Q_0(X;p) = \mathrm{VaR}_p(X)$ for small $p>0$ and all $\alpha\in(0,\infty]$.
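The asymptotic equivalence just stated can be checked numerically. The following is a minimal sketch, not part of the original derivation; the function names and the chosen values of $p$ are illustrative assumptions. It evaluates the ratio $Q_0(Z;p)/Q_\infty(Z;p)$ for a standard normal $Z$ and shows that the ratio approaches 1 from below, though rather slowly.

```python
import math
from statistics import NormalDist

def q0_std_normal(p: float) -> float:
    """Q_0(Z; p) = Phi^{-1}(1 - p), evaluated as -Phi^{-1}(p) to avoid rounding 1 - p."""
    return -NormalDist().inv_cdf(p)

def q_inf_std_normal(p: float) -> float:
    """Q_infinity(Z; p) = sqrt(2 ln(1/p))."""
    return math.sqrt(2.0 * math.log(1.0 / p))

# Illustrative (arbitrarily chosen) tail probabilities, decreasing to 0.
probs = [5e-2, 1e-3, 1e-6, 1e-12]
ratios = [q0_std_normal(p) / q_inf_std_normal(p) for p in probs]
for p, r in zip(probs, ratios):
    print(f"p = {p:g}: Q_0(Z;p) / Q_inf(Z;p) = {r:.4f}")
```

By (3.13), every $Q_\alpha(Z;p)$ with $\alpha\in(0,\infty]$ lies between the two quantities compared here, so the same ratios bound the relative gap for the whole spectrum.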
Another easy to consider case, also illustrating Remark 3.9, is that of the exponential location-scale family, with the “standard” pdf $f$ given by the formula $f(x) = e^{-x}\,\mathrm{I}\{x>0\}$.
Let then the r.v. $X$ have the corresponding pdf $f_{\mu,\sigma}$, so that $f_{\mu,\sigma}(x) = \frac{1}{\sigma}\exp\big(-\frac{x-\mu}{\sigma}\big)\,\mathrm{I}\{x>\mu\}$. Let $Z$ be any r.v. with the “standard” exponential pdf $f$. Then, obviously, $Q_0(Z;p) = \ln\frac1p$. Also, it is not hard to see that here $Q_\infty(Z;p) = -W_{-1}(-p/e)$, where $W_{-1}$ is the $(-1)$-branch of the Lambert function [27, pages 3 and 16]; that is, $-Q_\infty(Z;p)$ is the only root $u\in(-\infty,-1]$ of the equation $u e^u = -p/e$. Note that $-u e^u = \exp\{(1+o(1))u\}$ as $u\to-\infty$. Therefore, $Q_\infty(Z;p) = -W_{-1}(-p/e) \sim \ln\frac1p = Q_0(Z;p)$ as $p\downarrow0$. Hence, by the monotonicity in $\alpha$, one has $Q_\alpha(Z;p) \sim Q_0(Z;p)$ as $p\downarrow0$, uniformly in $\alpha\in[0,\infty]$. Hence, again by (3.13), $Q_\alpha(X;p) \sim Q_0(X;p) = \mathrm{VaR}_p(X)$ for small $p>0$ and all $\alpha\in(0,\infty]$.
For α [ 1 , ) and a r.v. Z as in the above paragraph, one has B ( 0 ) = 1 p α p 0 if 0 < p p α , where B ( t ) : = B α ( Z ; p ) ( t ) and p α : = Γ ( α + 1 ) α α ; then, in view of Part (i) of Proposition 4.4, the infimum in (3.8) is attained at some point t α [ 0 , ) ; in fact, t α = ln p α p . It follows that Q α ( Z ; p ) = α + ln p α p for all α [ 1 , ) and p ( 0 , p α ) ; so, one can now establish directly that Q α ( Z ; p ) p 0 ln 1 p = Q 0 ( Z ; p ) for each α [ 1 , ) .
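The exponential case lends itself to a short numerical illustration. The sketch below is an assumption-laden aid rather than the paper's own procedure: the function names are hypothetical, the closed form $\alpha + \ln(p_\alpha/p)$ with $p_\alpha = \Gamma(\alpha+1)/\alpha^\alpha$ is used only for $\alpha\in[1,\infty)$ and $p<p_\alpha$, and the Lambert-function value is obtained by plain bisection on the equation $x e^{1-x} = p$ with $x\ge1$ (equivalent to the defining equation of $W_{-1}$) instead of a library call.

```python
import math

def q0_exp(p: float) -> float:
    """Q_0(Z; p) = ln(1/p) for the standard exponential Z."""
    return math.log(1.0 / p)

def q_alpha_exp(alpha: float, p: float) -> float:
    """Closed form alpha + ln(p_alpha / p), with p_alpha = Gamma(alpha + 1) / alpha**alpha.

    Used here only for alpha in [1, inf) and 0 < p < p_alpha.
    """
    log_p_alpha = math.lgamma(alpha + 1.0) - alpha * math.log(alpha)
    if not (alpha >= 1.0 and p > 0.0 and math.log(p) < log_p_alpha):
        raise ValueError("need alpha >= 1 and 0 < p < p_alpha")
    return alpha + log_p_alpha - math.log(p)

def q_inf_exp(p: float) -> float:
    """Q_inf(Z; p) = -W_{-1}(-p/e): the root x >= 1 of x * exp(1 - x) = p, by bisection."""
    def g(x: float) -> float:
        # log of x * exp(1 - x) / p; strictly decreasing in x on [1, inf)
        return math.log(x) + 1.0 - x - math.log(p)
    lo, hi = 1.0, 2.0
    while g(hi) > 0.0:          # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = 0.01  # illustrative value; note p < p_5 = 120 / 3125
chain = [q0_exp(p), q_alpha_exp(1, p), q_alpha_exp(2, p), q_alpha_exp(5, p), q_inf_exp(p)]
print([round(v, 4) for v in chain])
```

The printed values are nondecreasing in $\alpha$, in line with the monotonicity property of Theorem 3.4, and for $\alpha=1$ the closed form reduces to $1+\ln\frac1p$.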

4. Computation of the Tail Probability and Quantile Bounds

4.1. Computation of $P_\alpha(X;x)$

The computation of $P_\alpha(X;x)$ in the case $\alpha=0$ is straightforward, in view of the equality in (2.6). If $x\in[x_*,\infty)$, then the value of $P_\alpha(X;x)$ is easily found by Part (i) of Proposition 2.2. Therefore, in the rest of this subsection it may be assumed that $\alpha\in(0,\infty]$ and $x\in(-\infty,x_*)$.
In the case when $\alpha\in(0,\infty)$, using (2.5), the inequality
$$\big(1+\lambda(X-x)/\alpha\big)_+^\alpha \le 2^{(\alpha-1)_+}\big(\lambda^\alpha X_+^\alpha + (\alpha-\lambda x)_+^\alpha\big)/\alpha^\alpha, \tag{4.1}$$
the condition $X\in\mathcal{X}_\alpha$, and dominated convergence, one sees that $A_\alpha(X;x)(\lambda)$ is continuous in $\lambda\in(0,\infty)$ and right-continuous in $\lambda$ at $\lambda=0$ (assuming the definition (2.3) for $\lambda=0$ as well), and hence
$$P_\alpha(X;x) = \inf_{\lambda\in[0,\infty)} A_\alpha(X;x)(\lambda). \tag{4.2}$$
Similarly, using in place of (4.1) the inequality $e^{\lambda X} \le 1 + e^{\lambda_0 X}$ whenever $0\le\lambda\le\lambda_0$, one can show that $A_\infty(X;x)(\lambda)$ is continuous in $\lambda\in\Lambda_X$ (recall (2.15)) and right-continuous in $\lambda$ at $\lambda=0$, so that (4.2) holds for $\alpha=\infty$ as well, provided that $X\in\mathcal{X}_\infty$. Moreover, by the Fatou lemma for the convergence in distribution (see e.g. Theorem 5.3 in [15]), $A_\infty(X;x)(\lambda)$ is lower-semicontinuous in $\lambda$ at $\lambda=\lambda_* := \sup\Lambda_X$ even if $\lambda_*\in\mathbb{R}\setminus\Lambda_X$. It then follows by the convexity of $A_\infty(X;x)(\lambda)$ in $\lambda$ that $A_\infty(X;x)(\lambda)$ is left-continuous in $\lambda$ at $\lambda=\lambda_*$ whenever $\lambda_*\in\mathbb{R}$; at that, the natural topology on the set $[0,\infty]$ is used, as it is of course possible that $A_\infty(X;x)(\lambda_*)=\infty$.
Since $x\in(-\infty,x_*)$, one can find some $y\in(x,\infty)$ such that $\mathsf{P}(X\ge y)>0$ (of course, necessarily $y\in(x,x_*]$); so, one can introduce
$$\lambda_{\max} := \lambda_{\max,\alpha} := \lambda_{\max,\alpha,X} := \begin{cases} \dfrac{\alpha}{y-x}\Big(\dfrac{1}{\mathsf{P}(X\ge y)^{1/\alpha}} - 1\Big) & \text{if } \alpha\in(0,\infty), \\[2ex] \dfrac{1}{y-x}\,\ln\dfrac{1}{\mathsf{P}(X\ge y)} & \text{if } \alpha=\infty. \end{cases} \tag{4.3}$$
Then, by (2.3), $A_\alpha(X;x)(\lambda) \ge \mathsf{E}\,(1+\lambda(X-x)/\alpha)_+^\alpha\,\mathrm{I}\{X\ge y\} \ge (1+\lambda(y-x)/\alpha)^\alpha\,\mathsf{P}(X\ge y) > 1$ if $\alpha\in(0,\infty)$ and $\lambda\in(\lambda_{\max,\alpha},\infty)$, and $A_\infty(X;x)(\lambda) \ge \mathsf{E}\,e^{\lambda(X-x)}\,\mathrm{I}\{X\ge y\} \ge e^{\lambda(y-x)}\,\mathsf{P}(X\ge y) > 1$ if $\lambda\in(\lambda_{\max,\infty},\infty)$. Therefore, for all $\alpha\in(0,\infty]$ one has $A_\alpha(X;x)(\lambda) > 1 \ge P_\alpha(X;x) = \inf_{\lambda\in(0,\infty)} A_\alpha(X;x)(\lambda)$ provided that $\lambda\in(\lambda_{\max,\alpha},\infty)$, and hence
$$P_\alpha(X;x) = \inf_{\lambda\in[0,\lambda_{\max,\alpha}]} A_\alpha(X;x)(\lambda), \quad\text{if } \alpha\in(0,\infty] \text{ and } x\in(-\infty,x_*). \tag{4.4}$$
Therefore and because $\lambda_{\max,\alpha}<\infty$, the minimization of $A_\alpha(X;x)(\lambda)$ in $\lambda$ in (4.4) in order to compute the value of $P_\alpha(X;x)$ can be done effectively if $\alpha\in[1,\infty]$, because in this case $A_\alpha(X;x)(\lambda)$ is convex in $\lambda$. At that, the positive-part moments $\mathsf{E}\,(1+\lambda(X-x)/\alpha)_+^\alpha$, which express $A_\alpha(X;x)(\lambda)$ for $\alpha\in(0,\infty)$ in accordance with (2.3), can be efficiently computed using formulas in [28]; cf. e.g. Section 3.2.3 in [6]. Of course, for specific kinds of distributions of the r.v. $X$, more explicit expressions for the positive-part moments can be used.
In the remaining case, when $\alpha\in(0,1)$, the function $\lambda\mapsto A_\alpha(X;x)(\lambda)$ cannot in general be “convexified” by any monotonic transformations in the domain and/or range of this function, and the set of minimizing values of $\lambda$ does not even have to be connected, in the following rather strong sense:
Proposition 4.1. For any $\alpha\in(0,1)$, $p\in(0,1)$, and $x\in\mathbb{R}$, there is a r.v. $X$ (taking three distinct values) such that $P_\alpha(X;x)=p$ and the infimum $\inf_{\lambda\in(0,\infty)} A_\alpha(X;x)(\lambda)$ in (2.5) is attained at precisely two distinct values of $\lambda\in(0,\infty)$.
Proposition 4.1 is illustrated by
Example 4.2. Let $X$ be a r.v. taking values $-\frac{27}{11}$, $-1$, $2$ with probabilities $\frac14$, $\frac14$, $\frac12$; then $x_*=2$. Also let $\alpha=\frac12$ and $x=0$, so that $x\in(-\infty,x_*)$, and then let $\lambda_{\max}$ be as in (4.3) with $y=x_*=2$, so that here $\lambda_{\max}=\frac34$. Then the minimum of $A_\alpha(X;0)(\lambda)$ over all real $\lambda\ge0$ equals $\frac{\sqrt3}{2}$ and is attained at each of the two points, $\lambda=\frac{11}{54}$ and $\lambda=\frac12$, and only at these two points. The graph $\{(\lambda, A_{1/2}(X;0)(\lambda)) : 0\le\lambda\le\lambda_{\max}\}$ is shown here in Figure 2.
Figure 2. Illustration of Example 4.2.
Nonetheless, effective minimization of $A_\alpha(X;x)(\lambda)$ in $\lambda$ in (4.4) is possible even in the case $\alpha\in(0,1)$, say by the interval method. Indeed, take any $\alpha\in(0,1)$ and write
$$A_\alpha(X;x)(\lambda) = A_\alpha^+(X;x)(\lambda) + A_\alpha^-(X;x)(\lambda),$$
where (cf. (2.3)) $A_\alpha^+(X;x)(\lambda) := \mathsf{E}\,(1+\lambda(X-x)/\alpha)_+^\alpha\,\mathrm{I}\{X\ge x\}$ and $A_\alpha^-(X;x)(\lambda) := \mathsf{E}\,(1+\lambda(X-x)/\alpha)_+^\alpha\,\mathrm{I}\{X<x\}$. Just as $A_\alpha(X;x)(\lambda)$ is continuous in $\lambda\in[0,\infty)$, so are $A_\alpha^+(X;x)(\lambda)$ and $A_\alpha^-(X;x)(\lambda)$. It is also clear that $A_\alpha^+(X;x)(\lambda)$ is nondecreasing and $A_\alpha^-(X;x)(\lambda)$ is nonincreasing in $\lambda\in[0,\infty)$.
So, as soon as the minimizing values of $\lambda$ are bracketed as in (4.4), one can partition the finite interval $[0,\lambda_{\max,\alpha}]$ into a large number of small subintervals $[a,b]$ with $0\le a<b\le\lambda_{\max,\alpha}$. For each such subinterval,
$$M_{a,b} := \max_{\lambda\in[a,b]} A_\alpha(X;x)(\lambda) \le A_\alpha^+(X;x)(b) + A_\alpha^-(X;x)(a), \qquad m_{a,b} := \min_{\lambda\in[a,b]} A_\alpha(X;x)(\lambda) \ge A_\alpha^+(X;x)(a) + A_\alpha^-(X;x)(b),$$
so that, by the continuity of $A_\alpha^\pm(X;x)(\lambda)$ in $\lambda$,
$$M_{a,b} - m_{a,b} \le A_\alpha^+(X;x)(b) - A_\alpha^+(X;x)(a) + A_\alpha^-(X;x)(a) - A_\alpha^-(X;x)(b) \to 0$$
as $b-a\to0$, uniformly over all subintervals $[a,b]$ of the interval $[0,\lambda_{\max,\alpha}]$. Thus, one can effectively bracket the value $P_\alpha(X;x) = \inf_{\lambda\in[0,\lambda_{\max,\alpha}]} A_\alpha(X;x)(\lambda)$ with any degree of accuracy; this same approach will work, and perhaps may be sometimes useful, for $\alpha\in[1,\infty)$ as well.
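The interval method described above can be sketched for a discrete r.v. as follows. This is only an illustration under stated assumptions: the function names, the equal-width partition, and the number of subintervals are choices made here, not prescriptions of the paper. The data are those of Example 4.2, for which the true value is $P_{1/2}(X;0)=\frac{\sqrt3}{2}$. The lower bracket is the smallest of the subinterval bounds $A_\alpha^+(a)+A_\alpha^-(b)$; the upper bracket is the smallest value of $A_\alpha$ at the grid points.

```python
import math

# Example 4.2: X takes values -27/11, -1, 2 with probabilities 1/4, 1/4, 1/2.
VALUES = [-27.0 / 11.0, -1.0, 2.0]
PROBS = [0.25, 0.25, 0.5]

def a_parts(lam, x, alpha):
    """Return (A_plus, A_minus): E(1 + lam*(X - x)/alpha)_+^alpha over {X >= x} and {X < x}."""
    plus = minus = 0.0
    for v, q in zip(VALUES, PROBS):
        term = q * max(0.0, 1.0 + lam * (v - x) / alpha) ** alpha
        if v >= x:
            plus += term      # this part is nondecreasing in lam
        else:
            minus += term     # this part is nonincreasing in lam
    return plus, minus

def bracket_p_alpha(x, alpha, y, n):
    """Bracket P_alpha(X; x), alpha in (0, inf), using n equal subintervals of [0, lam_max]."""
    tail = sum(q for v, q in zip(VALUES, PROBS) if v >= y)        # P(X >= y) > 0
    lam_max = alpha / (y - x) * (tail ** (-1.0 / alpha) - 1.0)    # formula (4.3)
    grid = [lam_max * i / n for i in range(n + 1)]
    parts = [a_parts(lam, x, alpha) for lam in grid]
    # On each [a, b]: min of A >= A_plus(a) + A_minus(b).
    lower = min(parts[i][0] + parts[i + 1][1] for i in range(n))
    # Any value of A on the grid is an upper bound on the minimum.
    upper = min(pl + mi for pl, mi in parts)
    return lam_max, lower, upper

lam_max, lower, upper = bracket_p_alpha(x=0.0, alpha=0.5, y=2.0, n=20000)
print(lam_max, lower, upper, math.sqrt(3.0) / 2.0)
```

Because the square-root kinks of $A_{1/2}^-$ sit exactly at the two minimizers, the bracket narrows only at the rate of the square root of the mesh size in this example; a finer or adaptive partition tightens it further.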

4.2. Computation of $Q_\alpha(X;p)$

Proposition 4.3. (Quantile bounds: Attainment and bracketing).
(i)
If $\alpha\in(0,\infty)$, then $\inf_{t\in T_\alpha} B_\alpha(X;p)(t) = \inf_{t\in\mathbb{R}} B_\alpha(X;p)(t)$ in (3.8) is attained at some $t_{\mathrm{opt}}\in\mathbb{R}$ and hence
$$Q_\alpha(X;p) = \min_{t\in\mathbb{R}} B_\alpha(X;p)(t) = B_\alpha(X;p)(t_{\mathrm{opt}});$$
moreover, for any
$$s\in\mathbb{R} \quad\text{and}\quad \tilde p\in(p,1),$$
necessarily
$$t_{\mathrm{opt}}\in[t_{\min},t_{\max}], \tag{4.6}$$
where
$$t_{\max} := B_\alpha(X;p)(s), \qquad t_{\min} := t_{0,\min}\wedge t_{1,\min},$$
$$t_{0,\min} := Q_0(X;\tilde p), \qquad t_{1,\min} := \frac{(\tilde p/p)^{1/\alpha}\,t_{0,\min} - t_{\max}}{(\tilde p/p)^{1/\alpha} - 1}.$$
(ii)
Suppose now that $\alpha=\infty$. Then $\inf_{t\in T_\alpha} B_\alpha(X;p)(t) = \inf_{t\in(0,\infty)} B_\alpha(X;p)(t)$ in (3.8) is attained, and hence
$$Q_\infty(X;p) = \min_{t\in(0,\infty)} B_\infty(X;p)(t)$$
unless
$$x_*<\infty \quad\text{and}\quad p\le p_*, \tag{4.9}$$
where $x_*$ and $p_*$ are as in (2.17). On the other hand, if conditions (4.9) hold, then $B_\infty(X;p)(t)$ is strictly increasing in $t>0$ and hence $\inf_{t\in T_\alpha} B_\alpha(X;p)(t) = \inf_{t\in(0,\infty)} B_\alpha(X;p)(t)$ in (3.8) is not attained; rather,
$$Q_\infty(X;p) = \inf_{t>0} B_\infty(X;p)(t) = B_\infty(X;p)(0+) = x_*.$$
For instance, in the case when $\alpha=0.5$, $p=0.05$, and $X$ has the Gamma distribution with the shape and scale parameters equal to 2.5 and 1, respectively, Proposition 4.3 yields $t_{\min}>4.01$ (using $\tilde p=0.095$) and $t_{\max}<6.45$.
When $\alpha=0$, the quantile bound $Q_\alpha(X;p)$ is simply the quantile $Q(X;p)$, which can be effectively computed by Formula (3.3), since the tail probability $\mathsf{P}(X>x)$ is monotone in $x$. Next, as was noted in the proof of Theorem 3.4, $B_\alpha(X;p)(t)$ is convex in $t$ when $\alpha\in[1,\infty]$, which provides for an effective computation of $Q_\alpha(X;p)$ by Formula (3.8).
Therefore, it remains to consider the computation, again by Formula (3.8), of $Q_\alpha(X;p)$ for $\alpha\in(0,1)$. In such a case, as in Section 4.1, one can use an interval method. As soon as the minimizing values of $t$ are bracketed as in (4.6), one can partition the finite interval $[t_{\min},t_{\max}]$ into a large number of small subintervals $[a,b]$ with $t_{\min}\le a<b\le t_{\max}$. For each such subinterval,
$$M_{a,b} := \max_{t\in[a,b]} B_\alpha(X;p)(t) \le b + p^{-1/\alpha}\,\|(X-a)_+\|_\alpha, \qquad m_{a,b} := \min_{t\in[a,b]} B_\alpha(X;p)(t) \ge a + p^{-1/\alpha}\,\|(X-b)_+\|_\alpha,$$
so that, by the continuity of $\|(X-t)_+\|_\alpha$ in $t$,
$$M_{a,b} - m_{a,b} \le b - a + p^{-1/\alpha}\big(\|(X-a)_+\|_\alpha - \|(X-b)_+\|_\alpha\big) \to 0$$
as $b-a\to0$, uniformly over all subintervals $[a,b]$ of the interval $[t_{\min},t_{\max}]$. Thus, one can effectively bracket the value $Q_\alpha(X;p) = \inf_{t\in\mathbb{R}} B_\alpha(X;p)(t)$; this same approach will work, and perhaps may be useful, for $\alpha\in[1,\infty)$ as well.
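The bracketing of Proposition 4.3 and the interval method for $Q_\alpha(X;p)$ can be combined in a short sketch for a discrete r.v. Everything specific below is an assumption made for illustration: the function names, the choices $s=0$ and $\tilde p=(p+1)/2$, the equal-width partition, and the test data (the three-point distribution of Example 4.2 with $\alpha=\frac12$ and $p=\frac{\sqrt3}{2}$, for which the minimum of $B_\alpha(X;p)(t)$ is 0, attained at $t=-\frac{27}{11}$ and $t=-1$). The code uses $B_\alpha(X;p)(t) = t + p^{-1/\alpha}\|(X-t)_+\|_\alpha$.

```python
import math

VALUES = [-27.0 / 11.0, -1.0, 2.0]
PROBS = [0.25, 0.25, 0.5]

def norm_pos_part(t, alpha):
    """||(X - t)_+||_alpha = (E (X - t)_+^alpha)^(1/alpha)."""
    moment = sum(q * max(0.0, v - t) ** alpha for v, q in zip(VALUES, PROBS))
    return moment ** (1.0 / alpha)

def b_alpha(t, alpha, p):
    return t + norm_pos_part(t, alpha) / p ** (1.0 / alpha)

def largest_quantile(p_tilde):
    """Q_0(X; p_tilde): the largest support point v with P(X >= v) >= p_tilde."""
    tail = 0.0
    for v, q in sorted(zip(VALUES, PROBS), reverse=True):
        tail += q
        if tail >= p_tilde:
            return v
    return min(VALUES)  # guards against rounding in the accumulated tail

def bracket_t(alpha, p, s, p_tilde):
    """[t_min, t_max] from Part (i) of Proposition 4.3."""
    t_max = b_alpha(s, alpha, p)
    t0 = largest_quantile(p_tilde)
    r = (p_tilde / p) ** (1.0 / alpha)
    t1 = (r * t0 - t_max) / (r - 1.0)
    return min(t0, t1), t_max

def bracket_q_alpha(alpha, p, n, s=0.0, p_tilde=None):
    """Bracket Q_alpha(X; p) using n equal subintervals of [t_min, t_max]."""
    if p_tilde is None:
        p_tilde = 0.5 * (p + 1.0)   # any value in (p, 1) is admissible
    t_min, t_max = bracket_t(alpha, p, s, p_tilde)
    grid = [t_min + (t_max - t_min) * i / n for i in range(n + 1)]
    norms = [norm_pos_part(t, alpha) for t in grid]
    scale = p ** (-1.0 / alpha)
    # On each [a, b]: min of B >= a + p^(-1/alpha) * ||(X - b)_+||_alpha.
    lower = min(grid[i] + scale * norms[i + 1] for i in range(n))
    # Any value of B on the grid is an upper bound on the minimum.
    upper = min(t + scale * m for t, m in zip(grid, norms))
    return t_min, t_max, lower, upper

t_min, t_max, lower, upper = bracket_q_alpha(alpha=0.5, p=math.sqrt(3.0) / 2.0, n=40000)
print(t_min, t_max, lower, upper)
```

With these choices the initial bracket $[t_{\min},t_{\max}]$ is fairly wide; a value of $\tilde p$ farther from $p$ or a better trial point $s$ would typically shrink it.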
In accordance with Proposition 3.2 in [6], consider
$$x_{**} := x_{**,X} := \sup\big((\operatorname{supp} X)\setminus\{x_*\}\big) \in [-\infty,x_*] \subseteq [-\infty,\infty]. \tag{4.10}$$
The following proposition will be useful.
Proposition 4.4.
(i)
If $\alpha\in[1,\infty]$, then $B_\alpha(X;p)(t)$ is convex in the pair $(X,t)\in\mathcal{X}_\alpha\times T_\alpha$.
(ii)
If $\alpha\in(1,\infty)$, then $B_\alpha(X;p)(t)$ is strictly convex in $t\in(-\infty,x_{**}]\cap\mathbb{R}$.
(iii)
$B_\infty(X;p)(t)$ is strictly convex in $t\in\{s\in(0,\infty) : \mathsf{E}\,e^{X/s}<\infty\}$, unless $\mathsf{P}(X=c)=1$ for some $c\in\mathbb{R}$.
If $\alpha\in(1,\infty)$ then, by Part (ii) of Proposition 4.4 and Part (i) of Proposition 4.3, the set $\operatorname{argmin}_{t\in\mathbb{R}} B_\alpha(X;p)(t)$ is a singleton one; that is, there is exactly one minimizer $t\in\mathbb{R}$ of $B_\alpha(X;p)(t)$. If $\alpha=1$, then $B_\alpha(X;p)(t) = B_1(X;p)(t)$ is convex, but not strictly convex, in $t$, and the set $\operatorname{argmin}_{t\in\mathbb{R}} B_\alpha(X;p)(t)$ of all minimizers of $B_\alpha(X;p)(t)$ in $t$ coincides with the set of all $(1-p)$-quantiles of $X$, as mentioned at the conclusion of the derivation of the identity (5.10). Thus, if $\alpha=1$, then the set $\operatorname{argmin}_{t\in\mathbb{R}} B_\alpha(X;p)(t)$ may in general be, depending on $p$ and the distribution of $X$, a nonzero-length closed interval. Finally, if $\alpha\in(0,1)$ then, in general, the set $\operatorname{argmin}_{t\in\mathbb{R}} B_\alpha(X;p)(t)$ does not have to be connected:
Proposition 4.5. For any $\alpha\in(0,1)$, $p\in(0,1)$, and $x\in\mathbb{R}$, there is a r.v. $X$ (taking three distinct values) such that $Q_\alpha(X;p)=x$ and the infimum $\inf_{t\in T_\alpha} B_\alpha(X;p)(t) = \inf_{t\in\mathbb{R}} B_\alpha(X;p)(t)$ in (3.8) is attained at precisely two distinct values of $t$.
Proposition 4.5 follows immediately from Proposition 4.1, by the duality (3.10) and the change-of-variables identity $A_\alpha(X;x)(\lambda) = \tilde A_\alpha(X;x)(x-\alpha/\lambda)$ for $\alpha\in(0,\infty)$, used to establish (2.7)–(2.9). At that, $\lambda\in(0,\infty)$ is one of the two minimizers of $A_\alpha(X;x)(\lambda)$ in Proposition 4.1 if and only if $t := x-\alpha/\lambda$ is one of the two minimizers of $B_\alpha(X;p)(t)$ in Proposition 4.5.
Proposition 4.5 is illustrated by the following example, which is obtained from Example 4.2 by the same duality (3.10).
Example 4.6.
As in Example 4.2, let $\alpha=\frac12$, and let $X$ be a r.v. taking values $-\frac{27}{11}$, $-1$, $2$ with probabilities $\frac14$, $\frac14$, $\frac12$. Also let $p=\frac{\sqrt3}{2}$. Then the minimum of $B_\alpha(X;p)(t)$ over all real $t$ equals zero and is attained at each of the two points, $t=-\frac{27}{11}$ and $t=-1$, and only at these two points. The graph $\{(t, B_{1/2}(X;\frac{\sqrt3}{2})(t)) : -3\le t\le3\}$ is shown in Figure 3. The minimizing values of $t$ here, $-\frac{27}{11}$ and $-1$, are related with the minimizing values of $\lambda$ in Example 4.2, $\frac{11}{54}$ and $\frac12$, by the mentioned formula $t = x-\alpha/\lambda$ (here, with $x=0$ and $\alpha=\frac12$).
Figure 3. Illustration of Example 4.6.

4.3. Optimization of the Risk Measures $Q_\alpha(X;p)$ with Respect to $X$

As was pointed out, the variational representation of $Q_\alpha(X;p)$ given in (3.8) allows for a comparatively easy incorporation of these risk measures into more specialized optimization problems, with restrictions on the r.v. $X$. Indeed, (3.8) immediately yields the following generalization of Theorem 14 of Rockafellar and Uryasev [2]:
Theorem 4.7. (Optimization shortcut.) Take any $\alpha\in(0,\infty]$ and any $p\in(0,1)$. Let $\mathcal{Y}_\alpha$ be any subset of the set $\mathcal{X}_\alpha$ of r.v.’s defined by Formula (2.14). Then the minimization of the risk measure $Q_\alpha(X;p)$ in $X\in\mathcal{Y}_\alpha$ is equivalent to the minimization of $B_\alpha(X;p)(t)$ in $(t,X)\in T_\alpha\times\mathcal{Y}_\alpha$, in the sense that
$$\inf_{X\in\mathcal{Y}_\alpha} Q_\alpha(X;p) = \inf_{(t,X)\in T_\alpha\times\mathcal{Y}_\alpha} B_\alpha(X;p)(t). \tag{4.11}$$
The mentioned Theorem 14 in [2] is the special case of Theorem 4.7 corresponding to $\alpha=1$; recall that in this case, according to (5.1), $Q_\alpha(X;p)$ coincides with $\mathrm{CVaR}_p(X)$.
Suppose that $\alpha\in[1,\infty]$ and the set $\mathcal{Y}_\alpha$ is convex. Then, in view of Part (i) of Proposition 4.4, computing the infimum on the right-hand side of (4.11) is a problem of convex optimization, for which there are very effective algorithms.
In view of the variational representations of $P_\alpha(X;x)$ given in (2.5) and (2.7), the result similar to Theorem 4.7 obviously holds for $P_\alpha(X;x)$ as well.
When the uncertain potential losses on the assets under consideration are modeled as jointly normal r.v.’s, the optimization can be further simplified. Indeed, suppose that the column matrix $\mathbf{X} = [X_1,\dots,X_n]^T$ of the uncertain losses $X_1,\dots,X_n$ on assets $1,\dots,n$ is multivariate normal with mean vector $\boldsymbol\mu = [\mu_1,\dots,\mu_n]^T$ and $n\times n$ covariance matrix $\Sigma$; here, as usual, $T$ denotes the matrix transposition. Let $\mathbf{w} = [w_1,\dots,w_n]^T$ be the column matrix of the weights of the assets $1,\dots,n$ in the considered investment portfolio, so that the potential loss on the portfolio is $X := \mathbf{w}\cdot\mathbf{X} := \mathbf{w}^T\mathbf{X} = w_1X_1+\dots+w_nX_n$, which is normally distributed with mean $\mu = \mathbf{w}\cdot\boldsymbol\mu$ and standard deviation $\sigma = \sqrt{\mathbf{w}^T\Sigma\mathbf{w}}$. Thus, in view of Proposition 3.8, the investor is now in the Markowitz mean-variance risk-assessment framework. For instance, the problem of minimizing the risk measure $Q_\alpha(X;p)$ given the mean loss $\mu$ (which, it is hoped, is negative) is equivalent to the quadratic optimization problem of minimizing the value of the quadratic form $\mathbf{w}^T\Sigma\mathbf{w}$ over all weight “vectors” $\mathbf{w}$ satisfying the restrictions (say) $\mathbf{w}\cdot\boldsymbol\mu = \mu$, $\mathbf{w}\cdot\mathbf{1} = 1$, and $K\mathbf{w}\ge\mathbf{0}$, where $\mathbf{1} := [1,\dots,1]^T$ (with $n$ ones), $K$ is a rectangular real matrix, $\mathbf{0}$ is the zero column matrix of the appropriate height, and the inequality $K\mathbf{w}\ge\mathbf{0}$ is considered component-wise, so that the latter inequality requires some or all of the weights $w_1,\dots,w_n$ (or some of their linear combinations) to be nonnegative.
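A toy instance of this reduction can be worked out without an optimization library. The sketch below makes several simplifying assumptions that are not in the text: three assets with made-up mean losses and covariance matrix, a prescribed portfolio mean, and no inequality constraints of the form $K\mathbf{w}\ge\mathbf{0}$. With the two equality constraints, the feasible weights form a line, so the variance is a quadratic in one parameter and its minimizer is explicit. The resulting standard deviation is then converted into $Q_0$ and $Q_\infty$ by Proposition 3.8, using the standard normal values $\Phi^{-1}(1-p)$ and $\sqrt{2\ln(1/p)}$.

```python
import math
from statistics import NormalDist

# Hypothetical inputs (not from the paper): mean losses, covariance matrix, target mean, level p.
MU = [-0.08, -0.05, -0.02]
SIGMA = [[0.0400, 0.0060, 0.0020],
         [0.0060, 0.0100, 0.0010],
         [0.0020, 0.0010, 0.0025]]
TARGET_MEAN = -0.05
P = 0.05

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigma_times(w):
    return [dot(row, w) for row in SIGMA]

def variance(w):
    return dot(w, sigma_times(w))

# A particular feasible point with w_3 = 0: solve w_1*mu_1 + w_2*mu_2 = target, w_1 + w_2 = 1.
w1 = (TARGET_MEAN - MU[1]) / (MU[0] - MU[1])
w0 = [w1, 1.0 - w1, 0.0]
# Direction orthogonal to both MU and the all-ones vector (their cross product).
d = [MU[1] - MU[2], MU[2] - MU[0], MU[0] - MU[1]]
# Minimize the quadratic s -> variance(w0 + s*d) in closed form.
s_opt = -dot(d, sigma_times(w0)) / variance(d)
w_opt = [a + s_opt * b for a, b in zip(w0, d)]
sd_opt = math.sqrt(variance(w_opt))

def q_alpha_normal(mean, sd, q_alpha_z):
    """Proposition 3.8 for the normal family: Q_alpha(X; p) = mean + sd * Q_alpha(Z; p)."""
    return mean + sd * q_alpha_z

var_p = q_alpha_normal(TARGET_MEAN, sd_opt, -NormalDist().inv_cdf(P))            # Q_0 = VaR_p
q_inf = q_alpha_normal(TARGET_MEAN, sd_opt, math.sqrt(2.0 * math.log(1.0 / P)))  # Q_infinity
print(w_opt, sd_opt, var_p, q_inf)
```

With sign restrictions on the weights the closed form no longer applies in general, and a quadratic programming solver would be the natural tool.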

4.4. Additional Remarks on the Computation and Optimization

As demonstrated in Propositions 4.1 and 4.5, the computation of $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ in the case $\alpha\in(0,1)$ inherits some of the difficulties known for the case $\alpha=0$, when $Q_\alpha(X;p)$ coincides with $\mathrm{VaR}_p(X)$.
One may also note that, even when a minimizing value of $\lambda$ or $t$ in Formulas (2.5)–(2.7) or (3.8) is not identified quite perfectly, one still obtains, by those formulas, an upper bound on $P_\alpha(X;x)$ or $Q_\alpha(X;p)$ and hence on the true tail probability $\mathsf{P}(X\ge x)$ or the true quantile $Q(X;p)$, respectively. A similar remark is valid concerning the optimization shortcut (4.11).
Using variational formulas (of which Formulas (2.5), (2.7), and (3.8) are examples) to define or compute measures of risk is not peculiar to the present paper. Indeed, as mentioned previously, the special case of (3.8) with $\alpha=1$ is the well-known variational representation (5.10) of CVaR, obtained in [2,3,26]. The risk measure given by the Securities and Exchange Commission (SEC) rules (see Subsection 3.2 in [1]) is another example where the calculations are done, in effect, according to a certain minimization formula, which is somewhat implicit and complicated in that case.

5. Implications for Risk Assessment in Finance and Inequality Modeling in Economics

5.1. The Spectrum $(Q_\alpha(X;p))_{\alpha\in[0,\infty]}$ Contains VaR and CVaR

In financial literature (see, e.g., [2,26,29]), the quantile bounds $Q_0(X;p)$ and $Q_1(X;p)$ are known as the value at risk and conditional value at risk, denoted as $\mathrm{VaR}_p(X)$ and $\mathrm{CVaR}_p(X)$, respectively:
$$Q_0(X;p) = \mathrm{VaR}_p(X) \quad\text{and}\quad Q_1(X;p) = \mathrm{CVaR}_p(X); \tag{5.1}$$
here, $X$ is interpreted as a priori uncertain potential loss. The value of $Q_1(X;p)$ is also known as the expected shortfall (ES) [30], average value at risk (AVaR) [31] and expected tail loss (ETL) [32]. As indicated in [2], at least in the case when there is no atom at the quantile point $Q(X;p)$, the quantile bound $Q_1(X;p)$ is also called the “mean shortfall” [33], whereas the difference $Q_1(X;p) - Q(X;p)$ is referred to as “mean excess loss” [34,35].

5.2. The Spectrum Parameter α as a Risk Sensitivity Index

Greater values of the spectrum parameter $\alpha$ correspond to greater sensitivity to risk; cf., e.g., [36]. This is manifested, first of all, by the monotonicity of $Q_\alpha(X;p)$ in $\alpha$, as stated in Theorem 3.4. (In the normal-distribution realm, this monotonicity is expressed as the growing (with $\alpha$) weight of the standard deviation $\sigma$ of the loss $X$ in its linear combination with the mean $\mu$ in (3.12).)
Moreover, in view of the monotonicity in $X$ (also stated in Theorem 3.4) and Proposition 3.5, the sensitivity index $\alpha$ is in a one-to-one correspondence with the highest order of the stochastic dominance respected by $Q_\alpha(X;p)$.
As pointed out in the Introduction, the most popular coherent risk measure CVaR has a fixed and rather limited sensitivity to risk and thus allows of no variation in the degree of such sensitivity. In fact, one can easily construct two investment portfolios such that
(i)
one of the portfolios is clearly riskier than the other;
(ii)
this distinction is sensed (to varying degrees, depending on $\alpha$) by all the risk measures $Q_\alpha(X;p)$ with $\alpha\in(1,\infty)$;
(iii)
yet, the values of $\mathrm{CVaR}_p = Q_1(X;p)$ are the same for both portfolios.
For instance, let $X$ and $Y$ denote the potential losses corresponding to two different investment portfolios. Suppose that there are mutually exclusive events $E_1$ and $E_2$ and real numbers $p_*\in(0,1)$ and $\delta\in(0,1)$ such that (i) $\mathsf{P}(E_1) = \mathsf{P}(E_2) = p_*/2$; (ii) the loss of either portfolio is 0 if the event $E_1\cup E_2$ does not occur; (iii) the loss of the $X$-portfolio is 1 if the event $E_1\cup E_2$ occurs; and (iv) the loss of the $Y$-portfolio is $1-\delta$ if the event $E_1$ occurs, and it is $1+\delta$ if the event $E_2$ occurs. Thus, the r.v. $X$ takes values 0 and 1 with probabilities $1-p_*$ and $p_*$, and the r.v. $Y$ takes values 0, $1-\delta$, and $1+\delta$ with probabilities $1-p_*$, $p_*/2$, and $p_*/2$, respectively. Hence, $\mathsf{E}X = \mathsf{E}Y$, that is, the expected losses of the two portfolios are the same. Clearly, the distribution of $X$ is less dispersed than that of $Y$, both intuitively and also in the formal sense that $X\le_{\alpha+1}Y$ for all $\alpha\in[1,\infty]$. So, everyone will probably say that the $Y$-portfolio is riskier than the $X$-portfolio. However, for any $p\in(p_*,1)$ it is easy to see, by (3.3), that $Q_0(X;p) = 0 = Q_0(Y;p)$, and hence, in view of (5.10), $Q_1(Y;p) = \frac1p\,\mathsf{E}Y = \frac{p_*}{p} = \frac1p\,\mathsf{E}X = Q_1(X;p)$. Using also the continuity of $Q_\alpha(\cdot;p)$ in $p$, as stated in Theorem 3.4, one concludes that the $Q_1(\cdot;p) = \mathrm{CVaR}_p(\cdot)$ risk value of the riskier $Y$-portfolio is the same as that of the less risky $X$-portfolio for all $p\in[p_*,1)$. Such indifference (which may also be referred to as insufficient sensitivity to risk) may generally be considered “an unwanted characteristic”; see e.g. pages 36 and 48 in [4]. One can also perceive the lack of dependence of CVaR on $\delta$ exhibited here as a certain “flatness” of this measure of risk.
Let us now show that, in contrast with the risk measure $Q_1(\cdot;p) = \mathrm{CVaR}_p(\cdot)$, the value of $Q_\alpha(\cdot;p)$ is sensitive to risk for all $\alpha\in(1,\infty)$ and all $p\in(0,1)$; that is, for all such $\alpha$ and $p$ and for the losses $X$ and $Y$ as above, $Q_\alpha(Y;p) > Q_\alpha(X;p)$. Indeed, take any $\alpha\in(1,\infty)$. By (2.17) and (4.10), $x_{*,X}=1$, $p_{*,X}=p_*$, $x_{*,Y}=1+\delta$, $x_{**,Y}=1-\delta$, and $p_{*,Y}=p_*/2$. If $p\in(0,p_*/2]$ then, by Part (ii) of Proposition 3.1, $Q_\alpha(Y;p) = x_{*,Y} = 1+\delta > 1 = x_{*,X} = Q_\alpha(X;p)$. If now $p\in(p_*/2,1)$, then, by Formula (3.20) in [8], $t_Y := {}_{\alpha-1}Q(Y;p) \in (-\infty,x_{**,Y}) = (-\infty,1-\delta)$. Also, by a strict version of Jensen’s inequality and the strict convexity of $u^\alpha$ in $u\in[0,\infty)$, $B_\alpha(X;p)(t) = t + p^{-1/\alpha}\|(X-t)_+\|_\alpha < t + p^{-1/\alpha}\|(Y-t)_+\|_\alpha = B_\alpha(Y;p)(t)$ for all $t\in(-\infty,1-\delta]$. Therefore, by Formula (3.18) in [8] and Formula (3.8) in the present paper, $Q_\alpha(Y;p) = B_\alpha(Y;p)(t_Y) > B_\alpha(X;p)(t_Y) \ge Q_\alpha(X;p)$. Thus, it is checked that $Q_\alpha(Y;p) > Q_\alpha(X;p)$ for all $\alpha\in(1,\infty)$ and all $p\in(0,1)$.
The above example is illustrated in Figure 4, for $p_*=0.1$ and $\delta=0.6$. It is seen that the sensitivity of the measure $Q_\alpha(\cdot;p)$ to risk (reflected especially by the gap between the red and blue lines for $p\in[p_*,1)=[0.1,1)$) increases from the zero sensitivity when $\alpha=1$ to an everywhere positive sensitivity when $\alpha=2$ to an everywhere greater positive sensitivity when $\alpha=5$.
Figure 4. Sensitivity of $Q_\alpha(\cdot;p)$ to risk, depending on the value of $\alpha$: graphs $\{(p, Q_\alpha(X;p)) : 0<p<1\}$ (blue) and $\{(p, Q_\alpha(Y;p)) : 0<p<1\}$ (red) for $\alpha=1$ (left); $\alpha=2$ (middle); and $\alpha=5$ (right).
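The two-portfolio example can be reproduced numerically. The sketch below is illustrative only: the helper names are hypothetical, the level $p=0.5$ is an arbitrary choice in $[p_*,1)$, and the minimization of $B_\alpha(X;p)(t) = t + p^{-1/\alpha}\|(X-t)_+\|_\alpha$ is done by ternary search, which is legitimate here because $B_\alpha(X;p)(t)$ is convex in $t$ for $\alpha\in[1,\infty)$. The search interval is taken from Part (i) of Proposition 4.3 with $s=\max X$ and $\tilde p=(p+1)/2$.

```python
def largest_quantile(values, probs, p):
    """Q_0(X; p): the largest support point v with P(X >= v) >= p."""
    tail = 0.0
    for v, q in sorted(zip(values, probs), reverse=True):
        tail += q
        if tail >= p:
            return v
    return min(values)  # guards against rounding in the accumulated tail

def q_alpha(values, probs, alpha, p, iters=200):
    """Q_alpha(X; p) = min over t of B_alpha(X; p)(t), for alpha in [1, inf), by ternary search."""
    def b(t):
        moment = sum(q * max(0.0, v - t) ** alpha for v, q in zip(values, probs))
        return t + (moment / p) ** (1.0 / alpha)
    p_tilde = 0.5 * (p + 1.0)
    hi = b(max(values))                        # t_max = B(s) with s = max(values)
    t0 = largest_quantile(values, probs, p_tilde)
    r = (p_tilde / p) ** (1.0 / alpha)
    lo = min(t0, (r * t0 - hi) / (r - 1.0))    # t_min
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if b(m1) <= b(m2):
            hi = m2      # by convexity, a minimizer lies in [lo, m2]
        else:
            lo = m1      # by convexity, a minimizer lies in [m1, hi]
    return b(0.5 * (lo + hi))

p_star, delta, p = 0.1, 0.6, 0.5
x_vals, x_probs = [0.0, 1.0], [1.0 - p_star, p_star]
y_vals, y_probs = [0.0, 1.0 - delta, 1.0 + delta], [1.0 - p_star, p_star / 2, p_star / 2]

for alpha in (1.0, 2.0, 5.0):
    qx = q_alpha(x_vals, x_probs, alpha, p)
    qy = q_alpha(y_vals, y_probs, alpha, p)
    print(f"alpha = {alpha:g}: Q(X) = {qx:.4f}, Q(Y) = {qy:.4f}, gap = {qy - qx:.4f}")
```

For $\alpha=1$ both values equal $p_*/p$, so the gap vanishes, whereas for $\alpha=2$ and $\alpha=5$ the riskier $Y$-portfolio receives the strictly larger value.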
That $\mathrm{CVaR}_p = Q_1(\cdot;p)$ is flat, in contrast to $Q_\alpha(\cdot;p)$ with $\alpha\in(1,\infty)$, is of course rooted in the fact that $u^\alpha$ is strictly convex in $u\in[0,\infty)$ only for $\alpha\in(1,\infty)$, but not for $\alpha=1$; cf. e.g. [37], where it is shown that the normed space $L^\alpha$ is uniformly convex for $\alpha\in(1,\infty)$ (but of course not for $\alpha=1$).

5.3. Coherent and Non-Coherent Measures of Risk

Based on an extensive and penetrating discussion of methods of measurement of market and non-market risks, Artzner et al. [1] concluded that, for a risk measure to be effective in risk regulation and management, it has to be coherent, in the sense that it possesses the translation invariance, subadditivity, positive homogeneity, and monotonicity properties. In general, a risk measure, say $\hat\rho$, is a mapping of a linear space of real-valued r.v.’s on a given probability space into $\mathbb{R}$. The probability space (say $\Omega$) was assumed to be finite in [1]. More generally, one could allow $\Omega$ to be infinite, and then it is natural to allow $\hat\rho$ to take values $\pm\infty$ as well. In [1], the r.v.’s (say $Y$) in the argument of the risk measure were called risks but at the same time interpreted as “the investor’s future net worth”. Then the translation invariance was defined in [1] as the identity $\hat\rho(Y+rt) = \hat\rho(Y) - t$ for all r.v.’s $Y$ and real numbers $t$, where $r$ is a positive real number, interpreted as the rate of return. We shall, however, follow Pflug [26] (among other authors), who considers a risk measure (say $\rho$) as a function of the potential cost/loss, say $X$, and then defines the translation invariance of $\rho$, quite conventionally, as the identity $\rho(X+c) = \rho(X) + c$ for all r.v.’s $X$ and real numbers $c$. The approaches in [1,26] are equivalent to each other, and the correspondence between them can be given by the formulas $\rho(X) = r\hat\rho(Y) = r\hat\rho(-X)$, $X=-Y$, and $c=-rt$. The positive homogeneity, as defined in [1], can be stated as the identity $\rho(\lambda X) = \lambda\rho(X)$ for all r.v.’s $X$ and real numbers $\lambda\ge0$.
Corollary 5.1. For each $\alpha\in[1,\infty]$ and each $p\in(0,1)$, the quantile bound $Q_\alpha(\cdot;p)$ is a coherent risk measure, and it is not coherent for any pair $(\alpha,p)\in[0,1)\times(0,1)$.
This follows immediately from Theorem 3.4 and Proposition 3.7.
The usually least trivial of the four properties characterizing the coherence is the subadditivity of a risk measure, which, in the presence of the positive homogeneity, is equivalent to the convexity, as was pointed out earlier in this paper. As is well known and also discussed above, the value at risk measure $\mathrm{VaR}_p(X)$ is translation invariant, positive homogeneous, and monotone (in $X$), but it fails to be subadditive. Quoting from page 1458 in [2]: “The coherence of [$\mathrm{CVaR}_p(X)$] is a formidable advantage not shared by any other widely applicable measure of risk yet proposed.”
Corollary 5.1 above addresses this problem by providing an entire infinite family of coherent risk measures, indexed by $\alpha\in[1,\infty]$, including $\mathrm{CVaR}_p = Q_1(\cdot;p)$ just as one member of the family. Moreover, $\mathrm{CVaR}_p$ can now be seen as only “barely”, borderline coherent, because ($\mathrm{CVaR}_p = Q_1(\cdot;p)$ and) $\alpha=1$ is the smallest value of the sensitivity index for which the risk measure $Q_\alpha(\cdot;p)$ is coherent. One can also say that the coherence of CVaR is unstable with respect to the sensitivity index $\alpha$: $\mathrm{CVaR}_p$ is coherent, but the risk measure $Q_\alpha(\cdot;p)$ (which is arbitrarily close to $\mathrm{CVaR}_p$ when $\alpha$ is close enough to 1) is not coherent if $\alpha\in[0,1)$. Here one may also recall the discussion in Section 5.2 on CVaR’s “flatness” and indifference to risk.

5.4. Other Terminology Used in the Literature for Some of the Listed Properties of $Q_\alpha(\cdot;p)$

Theorem 3.4 provides a number of useful properties of the spectrum of risk measures $Q_\alpha(\cdot;p)$. The terminology we use to name some of these properties differs from the corresponding terminology used elsewhere.
In particular, what we refer to as the “positive sensitivity” in Theorem 3.4 corresponds to the “relevance” in [1].
Next, in the present paper the “model-independence” means that the risk measure depends on the potential loss only through the distribution of the loss, rather than on the way to model the “states of nature”, on which the loss may depend. In contrast, in [1] a measure of risk is considered “model-free” if it does not depend, not only on modeling the “states of nature”, but, to a possibly large extent, on the distribution of the loss. An example of such a “model-free” risk measure is given by the SEC rules mentioned in Section 4.4; this measure of risk depends only on the set of all possible representations of the investment portfolio in question as a portfolio of long call spreads, that is, pairs of the form (a long call, a short call). If a measure of risk is not “model-free”, then it is called “model-dependent” in [1]. The “model-independence” property is called “law-invariance” in Section 12.1.2 of [38], and a similar property is called “neutrality” on page 97 in [39].
Also in [38], the consistency property is referred to as “constancy”.

5.5. Gini-Type Mean Differences and Related Risk Measures

Yitzhaki [40] utilized the Gini mean difference, which had prior to that been mainly used as a measure of economic inequality, to construct, somewhat implicitly, a measure of risk; this approach was further developed in [41,42]. If (say) a r.v. $X$ is thought of as the income of a randomly selected person in a certain state, then the Gini mean difference can be defined by the formula
$$G_H(X):=\mathsf E\,H(|X-\tilde X|),$$
where $\tilde X$ is an independent copy of $X$ and $H\colon[0,\infty)\to\mathbb R$ is a measurable function, usually assumed to be nonnegative and such that $H(0)=0$; clearly, given the function $H$, the Gini mean difference $G_H(X)$ depends only on the distribution of the r.v. $X$. Therefore, if $H(u)$ is considered, for any $u\in[0,\infty)$, as the measure of inequality between two individuals with incomes $x$ and $y$ such that $|x-y|=u$, then the Gini mean difference $\mathsf E\,H(|X-\tilde X|)$ is the mean $H$-inequality in income between two individuals selected at random (and with replacement, thus independently of each other). The most standard choice for $H$ is the identity function $\operatorname{id}$, so that $H(u)=\operatorname{id}(u)=u$ for all $u\in[0,\infty)$. Based on the measure of inequality $G_H$, one can define the risk measure
$$R_H(X):=\mathsf E X+G_H(X)=\mathsf E X+\mathsf E\,H(|X-\tilde X|), \tag{5.2}$$
where now the r.v. $X$ is interpreted as the uncertain loss on a given investment, with the term $G_H(X)=\mathsf E\,H(|X-\tilde X|)$ then possibly interpreted as a measure of the uncertainty. Clearly, when there is no uncertainty, so that the loss $X$ is in fact a nonrandom real constant, then the measure $G_H(X)$ of the uncertainty is 0, assuming that $H(0)=0$. If $X\sim N(\mu,\sigma^2)$ (that is, $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma>0$) and $H=\kappa\operatorname{id}$ for some positive constant $\kappa$, then $R_H(X)=\mu+\frac{2\kappa}{\sqrt\pi}\,\sigma$, a linear combination of the mean and the standard deviation, so that in such a case we find ourselves in the realm of the Markowitz mean-variance risk-assessment framework; cf. (3.12).
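As a numerical sanity check of the closed form $\mu+2\kappa\sigma/\sqrt\pi$ for normal losses, the following minimal Python sketch estimates $R_H$ from a simulated sample with $H=\kappa\operatorname{id}$. The function names (`gini_mean_difference`, `risk_rh`), the parameter values, the sample size, and the seed are illustrative assumptions and do not come from the paper.

```python
import math
import random

def gini_mean_difference(xs, H=lambda u: u):
    """Empirical version of E H(|X - X~|): average over all ordered pairs,
    drawn with replacement (so a point may be paired with itself)."""
    n = len(xs)
    return sum(H(abs(a - b)) for a in xs for b in xs) / (n * n)

def risk_rh(xs, H=lambda u: u):
    """Empirical version of R_H(X) = E X + E H(|X - X~|)."""
    return sum(xs) / len(xs) + gini_mean_difference(xs, H)

# Hypothetical parameters, chosen only for illustration.
mu, sigma, kappa = 1.0, 2.0, 0.5
rng = random.Random(0)
sample = [rng.gauss(mu, sigma) for _ in range(1000)]

estimate = risk_rh(sample, H=lambda u: kappa * u)
closed_form = mu + 2 * kappa * sigma / math.sqrt(math.pi)
print(estimate, closed_form)
```

The two printed numbers should agree up to Monte Carlo error; the pairwise sum costs $O(n^2)$ operations, which is acceptable for a sketch but not for large samples.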
It is assumed that R H ( X ) is defined when both expected values in the last expression in (5.2) are defined and are not infinite values of opposite signs – so that these two expected values could be added, as needed in (5.2).
It is clear that $R_H(X)$ is translation-invariant. Moreover, $R_H(X)$ is convex in $X$ if the function $H$ is convex and nondecreasing. Further, if $H=\kappa\operatorname{id}$ for some positive constant $\kappa$, then $R_H(X)$ is also positive-homogeneous.
It was shown in [40], under an additional technical condition, that $R_H(X)$ is nondecreasing in $X$ with respect to the stochastic dominance of order 1 if $H=\frac12\operatorname{id}$. Namely, the result obtained in [40] is that if $X\le_{\mathrm{st}}Y$ and the distribution functions $F$ and $G$ of $X$ and $Y$ are such that $F-G$ changes sign only finitely many times on $\mathbb R$, then $R_{\frac12\operatorname{id}}(X)\le R_{\frac12\operatorname{id}}(Y)$. A more general result was obtained in [42], which can be stated as follows: in the case when the function $H$ is differentiable, $R_H(X)$ is nondecreasing in $X$ with respect to the stochastic dominance of order 1 if and only if $|H'|\le\frac12$. Cf. also [41]. The proof in [42] was rather long and involved; in addition, it used a previously obtained result of [43]. Here we are going to give (in Appendix A) a very short, direct, and simple proof of the more general
Proposition 5.2. The risk measure $R_H(X)$ is nondecreasing in $X$ with respect to the stochastic dominance of order 1 if and only if the function $H$ is $\frac12$-Lipschitz: $|H(x)-H(y)|\le\frac12|x-y|$ for all $x$ and $y$ in $[0,\infty)$.
In Proposition 5.2, it is not assumed that $H\ge0$ or that $H(0)=0$. Of course, if $H$ is differentiable, then the $\frac12$-Lipschitz condition is equivalent to the condition $|H'|\le\frac12$ in [42].
The risk measure $R_H(X)$ was called mean-risk (M-R) in [41].
It follows from [42] or Proposition 5.2 above that the risk measure $R_{\kappa\operatorname{id}}(X)$ is coherent for any $\kappa\in[0,\frac12]$. In fact, based on Proposition 5.2, one can rather easily show more:
Proposition 5.3. The risk measure $R_H(X)$ is coherent if and only if $H=\kappa\operatorname{id}$ for some $\kappa\in[0,\frac12]$.
It is possible to indicate a relation, albeit a rather indirect one, of the risk measure $R_H(X)$, defined in (5.2), with the quantile bounds $Q_\alpha(X;p)$. Indeed, introduce
$$\hat Q_\alpha(X;p):=\mathsf E X+p^{-1/\alpha}\,\|(X-\mathsf E X)_+\|_\alpha, \tag{5.3}$$
assuming $\mathsf E X$ exists in $\mathbb R$. By (3.8)–(3.9), $\hat Q_\alpha(X;p)$ is a majorant of $Q_\alpha(X;p)$, obtained by using $t=\mathsf E X$ in (3.8) as a surrogate of the minimizing value of $t$.
The term $p^{-1/\alpha}\,\|(X-\mathsf E X)_+\|_\alpha$ in (5.3) is somewhat similar to the Gini mean-difference term $\mathsf E\,H(|X-\tilde X|)$, at least when $\alpha=1$ and (the distribution of) the r.v. $X$ is symmetric about its mean.
Moreover, if the distribution of $X-\mathsf E X$ is symmetric and stable with index $\gamma\in(1,2]$, then $\hat Q_1(X;p)=R_{\kappa\operatorname{id}}(X)$ with $\kappa=2^{-1-1/\gamma}/p$.
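To make the majorant relation concrete, here is a minimal Python sketch that evaluates, for an equally weighted finite sample, the objective $t+\|(X-t)_+\|_\alpha/p^{1/\alpha}$, its value at $t=\mathsf E X$, and an approximation of its infimum over $t$. It assumes finite $\alpha\ge1$ and $p\in(0,1)$, so that the objective is convex in $t$. The helper names, the ternary-search minimizer, and the bracketing interval are choices made for this illustration, not the paper's computational method.

```python
def alpha_norm_pos(xs, t, alpha):
    """Empirical ||(X - t)_+||_alpha = (E (X - t)_+^alpha)^(1/alpha)."""
    n = len(xs)
    return (sum(max(x - t, 0.0) ** alpha for x in xs) / n) ** (1.0 / alpha)

def objective(xs, p, alpha, t):
    """t + ||(X - t)_+||_alpha / p^(1/alpha), for finite alpha >= 1."""
    return t + alpha_norm_pos(xs, t, alpha) / p ** (1.0 / alpha)

def q_hat(xs, p, alpha):
    """Majorant obtained by plugging t = E X into the objective."""
    return objective(xs, p, alpha, sum(xs) / len(xs))

def q_alpha(xs, p, alpha, iters=200):
    """Approximate the infimum over t by ternary search (objective convex in t)."""
    mean, hi = sum(xs) / len(xs), max(xs)
    c = p ** (-1.0 / alpha)                 # c > 1 because 0 < p < 1
    # For t <= lo the objective is >= max(xs), its value at t = max(xs),
    # so a minimizer lies in [lo, hi].
    lo = min(min(xs), (c * mean - hi) / (c - 1.0))
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(xs, p, alpha, m1) <= objective(xs, p, alpha, m2):
            hi = m2
        else:
            lo = m1
    return objective(xs, p, alpha, 0.5 * (lo + hi))

losses = [float(v) for v in range(10)]      # hypothetical sample 0, 1, ..., 9
for a in (1.0, 2.0):
    print(a, q_alpha(losses, 0.2, a), q_hat(losses, 0.2, a))
```

For this sample and $p=0.2$, the case $\alpha=1$ reproduces the average of the two largest losses, and in both cases the value at $t=\mathsf E X$ is no smaller than the minimized value.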
One may want to compare the two considered kinds of coherent measures of risk/inequality, $R_{\kappa\operatorname{id}}(X)$ for $\kappa\in[0,\frac12]$ and $Q_\alpha(X;p)$ for $\alpha\in[1,\infty]$ and $p\in(0,1)$. It appears that the latter measure is more flexible, as it depends on two parameters ($\alpha$ and $p$) rather than just one parameter ($\kappa$). Moreover, as the previously mentioned Proposition 2.7 in [8] shows, rather generally $Q_\alpha(X;p)$ retains a more or less close relation with the quantile $Q_0(X;p)$, which, recall, is the widely used value at risk (VaR). On the other hand, recall here that, in contrast with the VaR, $Q_\alpha(X;p)$ is coherent for $\alpha\in[1,\infty]$. However, both of these kinds of coherent measures appear useful, each in its own manner, representing two different ways to express risk/inequality.
Formulas (5.2) and (5.3) can be considered special instances of the general relation between risk measures and measures of inequality established in [44]. Let $\mathcal X_E$ be a convex cone of real-valued r.v.’s $X\in\mathcal X$ with a finite mean $\mathsf E X$ such that $\mathcal X_E$ contains all real constants.
Largely following [44] (see also the earlier study [45]), let us say a coherent risk measure $R\colon\mathcal X_E\to(-\infty,\infty]$ is strictly expectation-bounded if $R(X)>\mathsf E X$ for all $X\in\mathcal X_E$. (Note that here the r.v. $X$ represents the loss, whereas in [44] it represents the gain; accordingly, $X$ in this paper corresponds to $-X$ in [44]; also, in [44] the cone $\mathcal X_E$ was taken to be the space $L^2$.) In view of Theorem 3.4 and Part (vii) of Proposition 3.1, it follows that $Q_\alpha(X;p)$ is a coherent and strictly expectation-bounded risk measure if $\alpha\in[1,\infty]$. Also (cf. Definition 1 and Proposition 1 in [44]), let us say that a mapping $D\colon\mathcal X_E\to[0,\infty]$ is a deviation measure if $D$ is subadditive, positive-homogeneous, and nonnegative with $D(X)=0$ if and only if $\mathsf P(X=c)=1$ for some real constant $c$; here $X$ is any r.v. in $\mathcal X_E$. Next (cf. Definition 2 in [44]), let us say that a deviation measure $D\colon\mathcal X_E\to[0,\infty]$ is upper-range dominated if $D(X)\le\sup\operatorname{supp}X-\mathsf E X$ for all $X\in\mathcal X_E$. Then (cf. Theorem 2 in [44]), the formulas
$$D(X)=R(X-\mathsf E X)\quad\text{and}\quad R(X)=\mathsf E X+D(X) \tag{5.4}$$
provide a one-to-one correspondence between all coherent strictly expectation-bounded risk measures $R\colon\mathcal X_E\to(-\infty,\infty]$ and all upper-range dominated deviation measures $D\colon\mathcal X_E\to[0,\infty]$.
In particular, it follows that the risk measure $\hat Q_\alpha(\cdot;p)$, defined by Formula (5.3), is coherent for all $\alpha\in[1,\infty]$ and all $p\in(0,\infty)$. It also follows that $X\mapsto Q_\alpha(X-\mathsf E X;p)$ is a deviation measure. As was noted, $\hat Q_\alpha(X;p)$ is a majorant of $Q_\alpha(X;p)$. In contrast with $Q_\alpha(X;p)$, in general $\hat Q_\alpha(X;p)$ will not have such a close hereditary relation with the true quantile $Q_0(X;p)$ as e.g. the ones given in the previously mentioned Proposition 2.7 in [8]. For instance, if $\mathsf P(X\ge x)$ is like x then, by Formulas (2.13)–(2.14) in [8], $Q_\alpha(X;p)\underset{p\downarrow0}{\sim}Q_0(X;p)$ for each $\alpha\in[0,\infty]$, whereas $\hat Q_\infty(X;p)=\infty$ for all real $p>0$. On the other hand, in distinction with the definition (5.3) of $\hat Q_\alpha(X;p)$, the expression (3.8) for $Q_\alpha(X;p)$ requires minimization in $t$; however, that minimization will add comparatively little to the complexity of the problem of minimizing $Q_\alpha(X;p)$ subject to a usually large number of restrictions on $X$; cf. Theorem 4.7. Risk measures similar to (5.3) were considered in [46] in relation with the stochastic dominance of arbitrary orders.

5.6. A Lorenz-Type Parametric Family of Risk Measures

Recalling (2.29) and following [21,22,47], one may also consider F X ( α ) ( p ) as a measure of risk. Here one will need the following semigroup identity, given by Formula (8a) in [21], (cf. e.g. Remark 3.7 in [5]):
$$F_X^{(-\alpha)}(p)=\frac1{\Gamma(\alpha-\nu)}\int_0^p(p-u)^{\alpha-\nu-1}\,F_X^{(-\nu)}(u)\,du \tag{5.5}$$
whenever $0<\nu<\alpha<\infty$. The following proposition is well known.
Proposition 5.4. If the r.v. $X$ is nonnegative, then
$$F_X^{(-2)}(p)=L_X(p)=-p\operatorname{CVaR}_p(-X), \tag{5.6}$$
where $L_X$ is the Lorenz curve function, given by the formula
$$L_X(p):=\int_0^pF_X^{-1}(u)\,du.$$
Indeed, the first equality in (5.6) is the special case of the identity (5.5) with $\alpha=2$ and $\nu=1$, and the second equality in (5.6) follows by Part (i) of Theorem 3.1 in [48], identity (3.8) for $\alpha=1$, and the second identity in (5.1). Cf. Theorem 2 in [49] and [20,50].
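The relation between the Lorenz curve and CVaR can be checked directly on a small sample. The sketch below assumes an equally weighted empirical distribution, levels of the form $p=k/n$, and the convention used in this paper that $\operatorname{CVaR}_p$ averages the upper $p$-tail; under these assumptions it verifies $L_X(p)=-p\operatorname{CVaR}_p(-X)$. The function names and the sample are hypothetical.

```python
def lorenz(xs, k):
    """Empirical Lorenz curve L_X(k/n): integral of the quantile function
    over (0, k/n), i.e. the sum of the k smallest values divided by n."""
    n = len(xs)
    return sum(sorted(xs)[:k]) / n

def cvar_upper(xs, k):
    """Empirical CVaR at level p = k/n: mean of the k largest values."""
    return sum(sorted(xs, reverse=True)[:k]) / k

incomes = [1.0, 2.0, 2.0, 5.0, 10.0]        # hypothetical nonnegative sample
n = len(incomes)
for k in range(1, n + 1):
    p = k / n
    lhs = lorenz(incomes, k)
    rhs = -p * cvar_upper([-x for x in incomes], k)
    print(p, lhs, rhs)
```

Negating the sample turns the lower tail of $X$ into the upper tail of $-X$, which is why the two minus signs appear in the identity.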
Using (5.5) with $\nu=2$, $\alpha+1$ in place of $\alpha$, and $-X$ in place of $X$ together with Proposition 5.4, one has
$$-F_{-X}^{(-\alpha-1)}(p)=\frac1{\Gamma(\alpha-1)}\int_0^p(p-u)^{\alpha-2}\,u\operatorname{CVaR}_u(X)\,du \tag{5.8}$$
for any $\alpha\in(1,\infty)$. Since $\operatorname{CVaR}_u(X)$ is a coherent risk measure, it now follows that, as noted in [47], $-F_{-X}^{(-\alpha-1)}(p)$ is a coherent risk measure as well, again for $\alpha\in(1,\infty)$; by (5.6), this conclusion will hold for $\alpha=1$. However, one should remember that the expression $F_X^{(-\alpha)}(p)$ was defined only when the r.v. $X$ is nonnegative (and otherwise some of the crucial considerations above will not hold). Thus, $-F_{-X}^{(-\alpha-1)}(p)$ is defined only if $X\le0$ almost surely.

5.7. Spectral Risk Measures

In view of (5.8), the risk measure $-F_{-X}^{(-\alpha-1)}(p)$ is a mixture of the coherent risk measures $\operatorname{CVaR}_u(X)$ and thus a member of the general class of the so-called spectral risk measures [51], which are precisely the mixtures, over the values $u\in(0,1)$, of the risk measures $\operatorname{CVaR}_u(X)$; thus, all spectral risk measures are automatically coherent. However, in general such measures will lack such an important variational representation as the one given by Formula (3.8) for the risk measure $Q_\alpha(X;p)$. Of course, for any “mixing” nonnegative Borel measure $\mu$ on the interval $(0,1)$ and the corresponding spectral risk measure $\operatorname{CVaR}_\mu(X):=\int_{(0,1)}\operatorname{CVaR}_u(X)\,\mu(du)$, one can write
$$\operatorname{CVaR}_\mu(X)=\int_{(0,1)}\inf_{t\in\mathbb R}\Bigl(t+\tfrac1u\,\|(X-t)_+\|_1\Bigr)\,\mu(du), \tag{5.9}$$
in view of (5.1) and (3.8)–(3.9). However, in contrast with (3.8), the minimization (in $t\in\mathbb R$) in (5.9) needs in general to be done for each of the infinitely many values of $u\in(0,1)$. If the r.v. $X$ takes only finitely many values, then the expression of $\operatorname{CVaR}_\mu(X)$ in (5.9) can be rewritten as a finite sum, so that the minimization in $t\in\mathbb R$ will be needed only for finitely many values of $u$; cf. e.g. the optimization problem on page 8 in [47].
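For a finite sample and a mixing measure with finitely many atoms, the mixture can be computed with one minimization in $t$ per atom. The following Python sketch illustrates this under those assumptions; the dictionary `mixing`, the samples, and the function names are hypothetical. It also prints both sides of the subadditivity inequality for two losses defined on the same eight equally likely scenarios.

```python
def cvar(xs, u):
    """min over t of t + E(X - t)_+ / u. For a finite sample the objective is
    piecewise linear and convex in t, so the minimum is attained at a sample point."""
    n = len(xs)
    return min(t + sum(max(x - t, 0.0) for x in xs) / (n * u) for t in xs)

def spectral(xs, mixing):
    """Finite mixture sum_j w_j * CVaR_{u_j}(X); `mixing` maps level u_j to weight w_j."""
    return sum(w * cvar(xs, u) for u, w in mixing.items())

mixing = {0.05: 0.5, 0.25: 0.3, 0.5: 0.2}   # hypothetical mixing measure on (0, 1)
X = [3.0, -1.0, 0.5, 7.0, 2.0, -4.0, 1.0, 0.0]
Y = [-2.0, 4.0, 1.5, -3.0, 6.0, 0.0, 2.0, 1.0]
S = [x + y for x, y in zip(X, Y)]            # scenario-wise sum of the two losses

print(spectral(S, mixing), spectral(X, mixing) + spectral(Y, mixing))
```

The first printed number should not exceed the second, in line with the coherence of mixtures of CVaR.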
On the other hand, one can of course consider arbitrary mixtures in $p\in(0,1)$ and/or $\alpha\in[1,\infty)$ of the risk measures $Q_\alpha(X;p)$. Such mixtures will automatically be coherent. Also, all mixtures of the measures $Q_\alpha(X;p)$ in $p$ will be nondecreasing in $\alpha$, and all mixtures of $Q_\alpha(X;p)$ in $\alpha$ will be nonincreasing in $p$.

5.8. Risk Measures Reinterpreted as Measures of Economic Inequality

Deviation measures such as the ones studied in [44] and discussed in the paragraph containing (5.4) can be used as measures of economic inequality if the r.v. X models, say, the random income/wealth – defined as the income/wealth of an (economic) unit chosen at random from a population of such units. Then, according to the one-to-one correspondence given by (5.4), coherent risk measures R translate into deviation measures D, and vice versa.
However, the risk measures $Q_\alpha(\cdot;p)$ themselves can be used to express certain aspects of economic inequality directly, without translation into deviation measures. For instance, if $X$ stands for the random wealth, then the statement $Q_1(X;0.01)=30\,\mathsf E X$ formalizes the common kind of expression “the wealthiest 1% own 30% of all wealth”, provided that the wealthiest 1% can be adequately defined, say as follows: there is a threshold wealth value $t$ such that the number of units with wealth greater than or equal to $t$ is $0.01N$, where $N$ is the number of units in the entire population. Then (cf. (5.12)), $0.01N\,Q_1(X;0.01)=0.01N\,\mathsf E(X\mid X\ge t)=N\,\mathsf E\,X\,I\{X\ge t\}=0.30N\,\mathsf E X$, whence indeed $Q_1(X;0.01)=30\,\mathsf E X$. Expressions of economic inequality in terms of $Q_\alpha(X;p)$ that are similar in spirit can be provided for all $\alpha\in(0,\infty)$. For instance, suppose now that $X$ stands for the annual income of a randomly selected household, whereas $x$ is a particular annual household income level in question. Then, in view of (3.8)–(3.9), the inequality $Q_\alpha(X;p)\ge x$ means that for any (potential) annual household income level $t$ less than the maximum annual household income level $x_{*,X}$ in the population, the conditional $\alpha$-mean $\bigl(\mathsf E\bigl((X-t)^\alpha\mid X>t\bigr)\bigr)^{1/\alpha}$ of the excess $(X-t)_+$ of the random income $X$ over $t$ is no less than $\bigl(p/\mathsf P(X>t)\bigr)^{1/\alpha}$ times the excess $(x-t)_+$ of the income level $x$ over $t$. Of course, the conditional $\alpha$-mean $\bigl(\mathsf E\bigl((X-t)^\alpha\mid X>t\bigr)\bigr)^{1/\alpha}$ is increasing in $\alpha$. Thus, using the measure $Q_\alpha(X;p)$ of economic inequality with a greater value of $\alpha$ means treating high values of the economic variable $X$ in a more progressive/sensitive manner. One may also note here that the above interpretation of the inequality $Q_\alpha(X;p)\ge x$ is a “synthetic” statement in the sense that it provides information concerning all values of potential interest of the threshold annual household income level $t$.
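The “wealthiest 1% own 30%” reading of $Q_1(X;0.01)=30\,\mathsf E X$ can be illustrated with a toy population. The sketch below assumes an equally weighted population in which $pN$ is a whole number and no tie straddles the threshold; the function name and the wealth figures are hypothetical.

```python
def top_share_statement(wealth, p):
    """Return (Q_1(X; p) / E X, share of total wealth held by the top p fraction),
    assuming p * len(wealth) is a whole number of units."""
    n = len(wealth)
    k = round(p * n)
    top = sorted(wealth, reverse=True)[:k]
    mean = sum(wealth) / n
    q1 = sum(top) / k        # E(X | X >= t) for a threshold t separating the top k units
    return q1 / mean, sum(top) / sum(wealth)

# Hypothetical population: one unit holds 30, the other 99 units share 70 equally.
wealth = [30.0] + [70.0 / 99.0] * 99
ratio, share = top_share_statement(wealth, 0.01)
print(ratio, share)
```

In general the two returned numbers are linked by `share = p * ratio`, which is the identity $0.01N\,Q_1(X;0.01)=0.30N\,\mathsf E X$ divided through by $N\,\mathsf E X$.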
Not only the upper bounds $Q_\alpha(X;p)$ on the quantile $Q(X;p)$, but also the upper bounds $P_\alpha(X;x)$ on the tail probability $\mathsf P(X\ge x)$ may be considered measures of risk/inequality. Indeed, if $X$ is interpreted as the potential loss, then the tail probability $\mathsf P(X\ge x)$ corresponds to the classical safety-first (SF) risk measure; see e.g. [52,53].

5.9. “Explicit” Expressions of $Q_\alpha(X;p)$

In the case $\alpha=1$, an expression of $Q_\alpha(X;p)$ can be given in terms of the true $(1-p)$-quantile $Q(X;p)$:
$$Q_1(X;p)=Q(X;p)+\frac1p\,\mathsf E\bigl(X-Q(X;p)\bigr)_+. \tag{5.10}$$
That the expression for $Q_1(X;p)$ in (3.8) coincides with the one in (5.10) was established in Theorem 1 in [3] for absolutely continuous r.v.’s $X$, and then on page 273 in [26] and in Theorem 10 in [2] in general. For the readers’ convenience, let us present here the following brief proof of (5.10). For all real $h>0$ and $t\in\mathbb R$, one has
$$(X-t)_+-(X-t-h)_+=h\,I\{X>t\}-(t+h-X)\,I\{t<X<t+h\}.$$
It follows that the right derivative of the convex function $t\mapsto t+\|(X-t)_+\|_1/p$ at any point $t\in\mathbb R$ is $1-\mathsf P(X>t)/p$, which, by (3.3), is $\le0$ if $t<Q(X;p)$ and $>0$ if $t>Q(X;p)$. Hence, $Q(X;p)$ is a minimizer in $t\in\mathbb R$ of $t+\|(X-t)_+\|_1/p$, and thus (5.10) follows by (3.8). It is also seen now that any $(1-p)$-quantile of $X$ is a minimizer in $t\in\mathbb R$ of $t+\|(X-t)_+\|_1/p$ as well, and $Q(X;p)$ is the largest of these minimizers.
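The agreement between the quantile-based expression and the variational one can be checked on a sample with an atom at the quantile. This minimal sketch assumes an equally weighted empirical distribution and $p\in(0,1)$; the function names and the data are illustrative only.

```python
def upper_quantile(xs, p):
    """Q(X; p) = inf{x : P(X > x) < p} for the empirical distribution of xs.
    The infimum is a sample point because x -> P(X > x) is a right-continuous
    step function that jumps only at sample points."""
    n = len(xs)
    for x in sorted(set(xs)):
        if sum(1 for v in xs if v > x) / n < p:
            return x

def q1_explicit(xs, p):
    """Q(X; p) + E(X - Q(X; p))_+ / p."""
    q = upper_quantile(xs, p)
    return q + sum(max(v - q, 0.0) for v in xs) / (len(xs) * p)

def q1_variational(xs, p):
    """min over t of t + E(X - t)_+ / p; the objective is piecewise linear and
    convex in t, so it suffices to scan the sample points."""
    n = len(xs)
    return min(t + sum(max(v - t, 0.0) for v in xs) / (n * p) for t in set(xs))

losses = [1.0, 2.0, 2.0, 3.0, 10.0]          # hypothetical sample
print(upper_quantile(losses, 0.3), q1_explicit(losses, 0.3), q1_variational(losses, 0.3))
```

Here the top 30% of the probability mass consists of the value 10 (mass 0.2) and half of the atom at 3 (mass 0.1), so both expressions equal $(10\cdot0.2+3\cdot0.1)/0.3$.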
As was shown in [2], the expression for $Q_1(X;p)$ in (5.10) can be rewritten as a conditional expectation:
$$Q_1(X;p)=Q(X;p)+\mathsf E\bigl(X-Q(X;p)\mid X\ge Q(X;p),\,U\ge\delta\bigr)=\mathsf E\bigl(X\mid X\ge Q(X;p),\,U\ge\delta\bigr), \tag{5.11}$$
where $U$ is any r.v. which is independent of $X$ and uniformly distributed on the interval $[0,1]$, $\delta:=\delta(X;p):=d\,I\{X=Q(X;p)\}$, and $d$ is any real number in the interval $[0,1]$ such that
$$\mathsf P\bigl(X\ge Q(X;p)\bigr)-p=\mathsf P\bigl(X=Q(X;p)\bigr)\,d;$$
such a number $d$ always exists. Thus, the r.v. $U$ is used to split the possible atom of the distribution of $X$ at the quantile point $Q(X;p)$ in order to make the randomized tail probability $\mathsf P\bigl(X\ge Q(X;p),\,U\ge\delta\bigr)$ exactly equal to $p$. Of course, in the absence of such an atom, one can simply write
$$Q_1(X;p)=Q(X;p)+\mathsf E\bigl(X-Q(X;p)\mid X\ge Q(X;p)\bigr)=\mathsf E\bigl(X\mid X\ge Q(X;p)\bigr).$$
As pointed out in [2,3] and discussed in Section 4.3, a variational formula such as (3.8) has a distinct advantage over such ostensibly explicit formulas as (5.10) and (5.11), since (3.8) allows rather easy incorporation into specialized optimization problems. Nonetheless, one can obtain an extension of the representation (5.10), valid for all $\alpha\in[1,\infty)$; see Formula (4.18) and also Proposition 4.7 in [8].

6. Conclusions

Let us summarize some of the advantages of the risk/inequality measures $P_\alpha(X;x)$ and $Q_\alpha(X;p)$:
  • $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ are three-way monotonic and three-way stable: in $\alpha$, $p$, and $X$.
  • The monotonicity in $X$ is graded continuously in $\alpha$, resulting in varying, controllable degrees of sensitivity of $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ to financial risk/economic inequality.
  • $x\mapsto P_\alpha(X;x)$ is the tail-function of a certain probability distribution.
  • $Q_\alpha(X;p)$ is a $(1-p)$-percentile of that probability distribution.
  • For small enough values of $p$, the quantile bounds $Q_\alpha(X;p)$ are close enough to the corresponding true quantiles $Q(X;p)=\operatorname{VaR}_p(X)$, provided that the right tail of the distribution of $X$ is light enough and regular enough, depending on $\alpha$.
  • In the case when the loss $X$ is modeled as a normal r.v., the use of the risk measures $Q_\alpha(X;p)$ reduces, to an extent, to using the Markowitz mean-variance risk-assessment paradigm, but with a varying weight of the standard deviation, depending on the risk sensitivity parameter $\alpha$.
  • $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ are solutions to mutually dual optimization problems, which can be comparatively easily incorporated into more specialized optimization problems, with additional restrictions on the r.v. $X$.
  • $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ are effectively computable.
  • Even when the corresponding minimizer is not identified quite perfectly, one still obtains an upper bound on the risk/inequality measures $P_\alpha(X;x)$ or $Q_\alpha(X;p)$.
  • Optimal upper bounds on $P_\alpha(X;x)$ and, hence, on $Q_\alpha(X;p)$ over important classes of r.v.’s $X$ represented (say) as sums of independent r.v.’s $X_i$ with restrictions on moments of the $X_i$’s and/or sums of such moments can be given; see e.g. [7,54] and references therein.
  • The quantile bounds $Q_\alpha(X;p)$ with $\alpha\in[1,\infty]$ constitute a spectrum of coherent measures of financial risk and economic inequality.
  • The r.v.’s $X$ of which the measures $P_\alpha(X;x)$ and $Q_\alpha(X;p)$ are taken are allowed to take values of both signs. In particular, if, in a context of economic inequality, $X$ is interpreted as the net amount of assets belonging to a randomly chosen economic unit, then a negative value of $X$ corresponds to a unit with more liabilities than paid-for assets. Similarly, if $X$ denotes the loss on a financial investment, then a negative value of $X$ will obtain when there actually is a net gain.
As seen from the discussion in Section 5, some of these advantages, and especially their totality, appear to be unique to the risk measures proposed here.
Further studies involving especially the use and computational implementation of the proposed risk measures would be welcome.

Acknowledgment

I am pleased to thank Emmanuel Rio for the mentioned communication [24], which also included a reference to [29] and in fact sparked the study presented here. I am also pleased to thank the referees for useful suggestions concerning the presentation.

Appendix

A. Proofs

Proof of Proposition 2.1. This proof is not hard but somewhat technical; it can be found in the more detailed version [8] of this paper; see the proof of Proposition 1.1 there. ☐
Proof of Proposition 2.2. This too can be found in [8]; see the proof of Proposition 1.2 there. ☐
Proof of Proposition 2.3. Let $\alpha$ and a sequence $(\alpha_n)$ be indeed as in Proposition 2.3. If $x\in[x_*,\infty)$, then the desired conclusion $P_{\alpha_n}(X;x)\to P_\alpha(X;x)$ follows immediately from part (i) of Proposition 2.2. Therefore, assume in the rest of the proof of Proposition 2.3 that
$$x\in(-\infty,x_*). \tag{A1}$$
Then (4.4) takes place and, by (4.3), $\lambda_{\max,\alpha}$ is continuous in $\alpha\in(0,\infty]$. Hence,
$$\lambda^*:=\sup_n\lambda_{\max,\alpha_n}\in[0,\infty) \tag{A2}$$
and
$$P_\gamma(X;x)=\inf_{\lambda\in[0,\lambda^*]}A_\gamma(X;x)(\lambda)\quad\text{for all }\gamma\in\{\alpha\}\cup\{\alpha_n\colon n\in\mathbb N\}. \tag{A3}$$
Also, by (2.3), (2.2), the inequality (4.1) for $\alpha\in(0,\infty)$, the condition $X\in\mathcal X_\beta$, and dominated convergence,
$$A_{\alpha_n}(X;x)(\lambda)\to A_\alpha(X;x)(\lambda). \tag{A4}$$
Hence, by (2.5), $\limsup_nP_{\alpha_n}(X;x)\le\limsup_nA_{\alpha_n}(X;x)(\lambda)=A_\alpha(X;x)(\lambda)$ for all $\lambda\in[0,\infty)$, whence, again by (2.5),
$$\limsup_nP_{\alpha_n}(X;x)\le P_\alpha(X;x). \tag{A5}$$
Thus, the case $\alpha=0$ of Proposition 2.3 follows by (2.6).
If $\alpha\in(0,1]$, then for any $\kappa$ and $\lambda$ such that $0\le\kappa<\lambda<\infty$ one has
$$|A_\alpha(X;x)(\lambda)-A_\alpha(X;x)(\kappa)|\le(\lambda-\kappa)^\alpha\,\mathsf E(X-x)_+^\alpha/\alpha^\alpha+(\lambda-\kappa)^{\alpha/2}/\alpha^\alpha+\mathsf P\Bigl((x-X)_+>\frac1{\sqrt{\lambda-\kappa}}\Bigr); \tag{A6}$$
this follows because
$$\begin{aligned}
0\le(1+\lambda u/\alpha)_+^\alpha-(1+\kappa u/\alpha)_+^\alpha&\le(\lambda-\kappa)^\alpha u^\alpha/\alpha^\alpha&&\text{if }u\ge0,\\
0\le(1+\kappa u/\alpha)_+^\alpha-(1+\lambda u/\alpha)_+^\alpha&\le\min\bigl(1,(\lambda-\kappa)^\alpha|u|^\alpha/\alpha^\alpha\bigr)\le(\lambda-\kappa)^{\alpha/2}/\alpha^\alpha+I\Bigl\{|u|>\frac1{\sqrt{\lambda-\kappa}}\Bigr\}&&\text{if }u<0.
\end{aligned}$$
If now $\alpha\in(0,1)$, then (say, by cutting off an initial segment of the sequence $(\alpha_n)$) one may assume that $\beta\in(0,1)$, and then, by (A6) with $\alpha_n$ in place of $\alpha$, the sequence $\bigl(A_{\alpha_n}(X;x)(\lambda)\bigr)$ is equicontinuous in $\lambda\in[0,\infty)$, uniformly in $n$. Therefore, by (A2) and the Arzelà–Ascoli theorem, the convergence in (A4) is uniform in $\lambda\in[0,\lambda^*]$ and, hence, the conclusion $P_{\alpha_n}(X;x)\to P_\alpha(X;x)$ follows by (A3), in the case when $\alpha\in(0,1)$.
Quite similarly, the same conclusion holds if $\alpha=1=\beta$; that is, $P_\alpha(X;x)$ is left-continuous in $\alpha$ at the point $\alpha=1$ provided that $\mathsf E X_+<\infty$.
It remains to consider the case when $\alpha\in[1,\infty]$ and $\alpha_n\ge1$ for all $n$. Then, by the definition in (2.1), the functions $h_\alpha$ and $h_{\alpha_n}$ are convex and hence, by (2.3), $A_\alpha(X;x)(\lambda)$ and $A_{\alpha_n}(X;x)(\lambda)$ are convex in $\lambda\in[0,\infty)$. Then the conclusion $P_{\alpha_n}(X;x)\to P_\alpha(X;x)$ follows by Corollary 3 in [55], the condition $X\in\mathcal X_\beta$, (A2), and (A3).  ☐
Proof of Proposition 2.4. This is somewhat similar to the proof of Proposition 2.3. One difference here is the use of the uniform integrability condition, which, in view of (2.3), (4.1), and the condition $X\in\mathcal X_\alpha$, implies (see e.g. Theorem 5.4 in [15]) that for all $\lambda\in[0,\infty)$
$$\lim_{n\to\infty}A_\alpha(X_n;x)(\lambda)=A_\alpha(X;x)(\lambda); \tag{A7}$$
here, in the case when $\alpha=\infty$ and $\lambda\notin\Lambda_X$, one should also use the Fatou lemma for the convergence in distribution (see e.g. Theorem 5.3 in [15]), according to which one always has $\liminf_{n\to\infty}A_\alpha(X_n;x)(\lambda)\ge A_\alpha(X;x)(\lambda)$, even without the uniform integrability condition. In this entire proof, it is indeed assumed that $\alpha\in(0,\infty]$.
It follows from (A7) and the nonnegativity of $P_\alpha(\cdot;\cdot)$ that
$$0\le\liminf_{n\to\infty}P_\alpha(X_n;x)\le\limsup_{n\to\infty}P_\alpha(X_n;x)\le P_\alpha(X;x) \tag{A8}$$
for all real $x$; cf. (A4) and (A5).
The convergence (2.22) for $x\in(x_*,\infty)$ follows immediately from (A8) and part (i) of Proposition 2.2.
Using the same ingredients, it is easy to check Part (ii) of Proposition 2.4 as well. Indeed, assuming that $\mathsf P(X_n=x_*)\to\mathsf P(X=x_*)$ as $n\to\infty$ and using also (2.6), one has
$$\mathsf P(X=x_*)=\liminf_{n\to\infty}\mathsf P(X_n=x_*)\le\liminf_{n\to\infty}\mathsf P(X_n\ge x_*)\le\liminf_{n\to\infty}P_\alpha(X_n;x_*)\le\limsup_{n\to\infty}P_\alpha(X_n;x_*)\le P_\alpha(X;x_*)=\mathsf P(X=x_*),$$
which yields (2.22) for $x=x_*$. Also, $X_n\xrightarrow[n\to\infty]{\mathrm D}X$ implies $\limsup_{n\to\infty}\mathsf P(X_n=x_*)\le\mathsf P(X=x_*)$; see e.g. Theorem 2.1 in [15]. So, if $\mathsf P(X=x_*)=0$, then $\mathsf P(X_n=x_*)\to\mathsf P(X=x_*)$ and hence (2.22) holds for $x=x_*$, by the first sentence of Part (ii) of Proposition 2.4.
It remains to prove Part (i) of Proposition 2.4 assuming (A1). The reasoning here is quite similar to the corresponding reasoning in the proof of Proposition 2.3, starting with (A1). Here, instead of the continuity of $\lambda_{\max,\alpha}=\lambda_{\max,\alpha,X}$ in $\alpha$, one should use the convergence $\lambda_{\max,\alpha,X_n}\to\lambda_{\max,\alpha,X}$, which holds provided that $y\in(x,x_*)$ is chosen to be such that $\mathsf P(X=y)=0$. Concerning the use of inequality (A6), note that (i) the uniform integrability condition implies that $\mathsf E(X_n-x)_+^\alpha$ is bounded in $n$ and (ii) the convergence in distribution $X_n\xrightarrow[n\to\infty]{\mathrm D}X$ implies that $\sup_n\mathsf P\bigl((x-X_n)_+>\frac1{\sqrt{\lambda-\kappa}}\bigr)\to0$ as $0<\lambda-\kappa\to0$. Proposition 2.4 is now completely proved. ☐
Proof of Theorem 2.5. The model-independence is obvious from the definition (2.5). The monotonicity in $X$ follows immediately from (2.23), (2.10), and (2.7)–(2.9). The monotonicity in $\alpha$ was already given in (2.13). The monotonicity in $x$ is Part (i) of Proposition 2.1. That $P_\alpha(X;x)$ takes on only values in the interval $[0,1]$ follows immediately from (2.16). The $\alpha$-concavity in $x$ and stability in $x$ follow immediately from parts (iii) and (i) of Proposition 2.2. The stability in $\alpha$ and the stability in $X$ are Propositions 2.3 and 2.4, respectively. The translation invariance, consistency, and positive homogeneity follow immediately from the definition (2.5). ☐
Proof of Proposition 3.1.
(i) Part (i) of this proposition follows immediately from (3.2) and (2.16).
(ii) Suppose here indeed that $p\in(0,p_*]\cap(0,1)$. Then for any $x\in(x_*,\infty)$ one has $P_\alpha(X;x)=0<p$, by Part (i) of Proposition 2.2, whence, by (2.19), $x\in E_\alpha(p)$. On the other hand, for any $x\in(-\infty,x_*]$ one has $P_\alpha(X;x)\ge P_\alpha(X;x_*)=p_*\ge p$, by Part (i) of Proposition 2.1 and Part (i) of Proposition 2.2, whence $x\notin E_\alpha(p)$. Therefore, $E_\alpha(p)=(x_*,\infty)$, and the conclusion $Q_\alpha(X;p)=x_*$ now follows by the definition of $Q_\alpha(X;p)$ in (3.2).
(iii) If $x_*=\infty$, then the inequality $Q_\alpha(X;p)\le x_*$ in Part (iii) of Proposition 3.1 is trivial. If $x_*<\infty$ and $p\in(p_*,1)$, then $x_*\in E_\alpha(p)$ and hence $Q_\alpha(X;p)\le x_*$ by (3.2). Now Part (iii) of Proposition 3.1 follows from its Part (ii).
(iv) Take any $x\in(-\infty,x_*)$. Then $P_0(X;x)=\mathsf P(X\ge x)>0$. Moreover, for all $p\in(0,P_0(X;x))$ one has $x\notin E_{0,X}(p)$. Therefore and because the set $E_{0,X}(p)$ is an interval with endpoints $Q_0(X;p)$ and $\infty$, it follows that $x\le Q_0(X;p)$. Thus, for any given $x\in(-\infty,x_*)$ and for all small enough $p>0$ one has $Q_0(X;p)\ge x$ and hence, by the already established Part (iii) of Proposition 3.1, $Q_0(X;p)\in[x,x_*]$. This means that Part (iv) of Proposition 3.1 is proved for $\alpha=0$. To complete the proof of this part, it remains to refer to the monotonicity of $Q_\alpha(X;p)$ in $\alpha$ stated in (3.4) and, again, to Part (iii) of Proposition 3.1.
(v) Assume indeed that $\alpha\in(0,\infty]$. By Part (viii) of Proposition 2.2, the case $p_*=1$ is equivalent to $x_\alpha=x_*$, and in that case both mappings (3.6) and (3.7) are empty, so that Part (v) of Proposition 3.1 is trivial. So, assume that $p_*<1$ and, equivalently, $x_\alpha<x_*$. The function $(x_\alpha,x_*)\ni x\mapsto P_\alpha(X;x)$ is continuous and strictly decreasing, by Parts (iv) and (xi) of Proposition 2.2. At that, $P_\alpha(X;x_*-)=P_\alpha(X;x_*)=p_*$ by Parts (iv) and (i) of Proposition 2.2 if $x_*<\infty$, and $P_\alpha(X;x_*-)=0=p_*$ by (2.16) and (2.17) if $x_*=\infty$. Also, $P_\alpha(X;x_\alpha+)=P_\alpha(X;x_\alpha)=1$ by the condition $x_\alpha<x_*$ and Parts (iv) and (x) of Proposition 2.2 if $x_\alpha>-\infty$, and $P_\alpha(X;x_\alpha+)=1$ by (2.16) if $x_\alpha=-\infty$. Therefore, the continuous and strictly decreasing function $(x_\alpha,x_*)\ni x\mapsto P_\alpha(X;x)$ maps $(x_\alpha,x_*)$ onto $(p_*,1)$, and so, Formula (3.7) is correct, and there is a unique inverse function, say $(p_*,1)\ni p\mapsto x_{\alpha,p}\in(x_\alpha,x_*)$, to the function (3.7); moreover, this inverse function is continuous and strictly decreasing. It remains to show that $Q_\alpha(X;p)=x_{\alpha,p}$ for all $p\in(p_*,1)$. Take indeed any $p\in(p_*,1)$. Since the function $(p_*,1)\ni p\mapsto x_{\alpha,p}\in(x_\alpha,x_*)$ is inverse to (3.7) and strictly decreasing, $P_\alpha(X;x_{\alpha,p})=p$, $P_\alpha(X;x)>p$ for $x\in(x_\alpha,x_{\alpha,p})$, and $P_\alpha(X;x)<p$ for $x\in(x_{\alpha,p},x_*)$. So, by Part (i) of Proposition 2.1, $P_\alpha(X;x)>p$ for $x\in(-\infty,x_{\alpha,p})$ and $P_\alpha(X;x)<p$ for $x\in(x_{\alpha,p},\infty)$. Now the conclusion that $Q_\alpha(X;p)=x_{\alpha,p}$ for all $p\in(p_*,1)$ follows by (3.2).
(vi) Assume indeed that $\alpha\in(0,\infty]$ and take indeed any $y\in\bigl(-\infty,Q_\alpha(X;p)\bigr)$. If $P_\alpha(X;y)=1$, then the conclusion $P_\alpha(X;y)>p$ in Part (vi) of Proposition 3.1 is trivial, in view of (3.1). Therefore, w.l.o.g. $P_\alpha(X;y)<1$ and hence $y\in E_\alpha(1)=(x_\alpha,\infty)$, by (2.19) and Part (ix) of Proposition 2.2. Let now $y_p:=Q_\alpha(X;p)$ for brevity, so that $y\in(-\infty,y_p)$ and, by the already verified part (iii) of Proposition 3.1, $y_p\le x_*$. Hence, $x_\alpha<y<y_p\le x_*$. So, by Part (v) of Proposition 3.1 and Parts (iv) and (i) of Proposition 2.2,
$$P_\alpha(X;y)>\lim_{x\uparrow y_p}P_\alpha(X;x)=P_\alpha(X;y_p)\ge P_\alpha(X;x_*)=p_*, \tag{A9}$$
which yields the conclusion $P_\alpha(X;y)>p$ in the case when $p\le p_*$. If now $p>p_*$, then $p\in(p_*,1)$ and, by Part (v) of Proposition 3.1, $y_p=Q_\alpha(X;p)\in(x_\alpha,x_*)$ and $P_\alpha(X;y_p)=p$, so that the conclusion $P_\alpha(X;y)>p$ follows by (A9) in this case as well.
(vii) Part (vii) of Proposition 3.1 follows immediately from (3.6), (3.5), and Part (vii) of Proposition 2.2. ☐
Proof of Theorem 3.4. The model-independence, monotonicity in X, monotonicity in α, translation invariance, consistency, and positive homogeneity properties of Q α ( X ; p ) follow immediately from (3.2) and the corresponding properties of P α ( X ; x ) stated in Theorem 2.5.
Concerning the monotonicity of Q α ( X ; p ) in p: that Q α ( X ; p ) is nondecreasing in p ( 0 , 1 ) follows immediately from (3.3) for α = 0 and from (3.8) and (3.9) for α ( 0 , ] . That Q α ( X ; p ) is strictly decreasing in p [ p * , 1 ) ( 0 , 1 ) if α ( 0 , ] follows immediately from Part (v) of Proposition 3.1, and the verified below statement on the stability in p: Q α ( X ; p ) is continuous in p ( 0 , 1 ) if α ( 0 , ] .
The monotonicity of Q α ( X ; p ) in α follows immediately from (2.13) and (3.2).
The finiteness of Q α ( X ; p ) was already stated in Part (i) of Proposition 3.1.
The concavity of Q α ( X ; p ) in p 1 / α in the case when α ( 0 , ) follows by (3.8), since B α ( X ; p ) ( t ) is affine (and hence concave) in p 1 / α . Similarly, the concavity of Q ( X ; p ) in ln 1 p follows by (3.8), since B ( X ; p ) ( t ) is affine in ln 1 p .
The stability of Q α ( X ; p ) in p can be deduced from Proposition 3.1. Alternatively, the same follows from the already established finiteness and concavity of Q α ( X ; p ) in p 1 / α or ln 1 p (cf. the proof of [2, Proposition 13]), because any finite concave function on an open interval of the real line is continuous, whereas the mappings ( 0 , 1 ) p p 1 / α ( 0 , ) and ( 0 , 1 ) p ln 1 p ( 0 , ) are homeomorphisms.
Concerning the stability of Q α ( X ; p ) in X, take any real x x * . Then the convergence P α ( X n ; x ) P α ( X ; x ) holds, by Proposition 2.4. Therefore, in view of (2.19), if x E α , X ( p ) then eventually (that is, for all large enough n) x E α , X n ( p ) . Hence, by (3.2), for each real x x * such that x > Q α ( X ; p ) eventually one has x Q α ( X n ; p ) . It follows that lim sup n Q α ( X n ; p ) Q α ( X ; p ) . On the other hand, by Part (vi) of Proposition 3.1, for any y , Q α ( X ; p ) , one has P α ( X ; y ) > p and, hence, eventually P α ( X n ; y ) > p , which yields y E α , X n ( p ) and, hence, y Q α ( X n ; p ) . It follows that lim inf n Q α ( X n ; p ) Q α ( X ; p ) . Recalling now the established inequality lim sup n Q α ( X n ; p ) Q α ( X ; p ) , one completes the verification of the stability of Q α ( X ; p ) in X.
The stability of Q α ( X ; p ) in α is proved quite similarly, only using Proposition 2.3 in place of Proposition 2.4. Here the stipulation x x * is not needed.
Consider now the positive sensitivity property. First, suppose that α ( 0 , 1 ) . Then, for all real t < 0 , the derivative of B α ( X ; p ) ( t ) in t is less than D : = 1 ( E Y α ) 1 + 1 / α E Y α 1 , where Y : = ( X t ) + = X t > 0 . The inequality D 0 can be rewritten as the true inequality τ τ + 1 L ( 1 ) + 1 τ + 1 L ( τ ) L ( 0 ) for the convex function s L ( s ) : = ln E exp { ( 1 α ) s ln Y } , where τ : = α 1 α . Therefore, the derivative is negative and hence B α ( X ; p ) ( t ) decreases in t 0 (here, to include t = 0 , we also used the continuity of B α ( X ; p ) ( t ) in t, which follows by the condition X X α and dominated convergence). On the other hand, if t > 0 , then B α ( X ; p ) ( t ) t > 0 . Also, B α ( X ; p ) ( 0 ) > 0 by (3.9) if the condition P ( X > 0 ) > 0 holds. Recalling again the continuity of B α ( X ; p ) ( t ) in t, one completes the verification of the positive sensitivity property – in the case α ( 0 , 1 ) .
The positive sensitivity property in the case $\alpha=1$ follows by (5.10). Indeed, (5.10) yields $Q_1(X;p)\ge Q(X;p)>0$ if $Q(X;p)>0$, and $Q_1(X;p)=\frac1p\,\mathsf{E}\,X\ge0$ by the condition $X\ge0$ if $Q(X;p)=0$; moreover, one has $\mathsf{E}\,X>0$ and hence $Q_1(X;p)=\frac1p\,\mathsf{E}\,X>0$ if $Q(X;p)=0$ and $\mathsf{P}(X>0)>0$. On the other hand, by (3.3), $X\ge0$ implies $Q(X;p)\ge0$. Thus, the positive sensitivity property in the case $\alpha=1$ is verified as well. This and the already established monotonicity of $Q_\alpha(X;p)$ in $\alpha$ imply the positive sensitivity property whenever $\alpha\in[1,\infty]$.
As far as this property is concerned, it remains to verify it when $\alpha=0$, assuming that $\mathsf{P}(X>0)>p$. The sets $E:=\{x\in\mathbb{R}\colon\mathsf{P}(X>x)\le p\}$ and $E^\circ:=\{x\in\mathbb{R}\colon\mathsf{P}(X>x)<p\}$ are intervals with the right endpoint $\infty$. The condition $\mathsf{P}(X>0)>p$ means that $0\notin E$. By the right continuity of $\mathsf{P}(X>x)$ in $x$, the set $E$ contains the closure $\overline{E^\circ}$ of the set $E^\circ$. Therefore, $0\notin\overline{E^\circ}$ and hence $0<\inf E^\circ=Q_0(X;p)$, by (3.3). Thus, the positive sensitivity property is fully verified.
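The case $\alpha=0$ is easy to illustrate on a finite sample. The following minimal sketch assumes that $Q_0(X;p)$ is the infimum of the set of all real $x$ with $\mathsf{P}(X>x)<p$, as in the argument above, and that $X$ is uniformly distributed on the sample points; the function name `q0` is hypothetical.

```python
def q0(xs, p):
    """Largest (1 - p)-quantile, inf{x : P(X > x) < p}, for X uniform on the sample xs.

    Assumes 0 < p < 1 and a nonempty sample.
    """
    n = len(xs)
    for x in sorted(xs):
        # P(X > x) is a right-continuous step function that jumps only at sample
        # points, so the infimum is attained at the first sample point where the
        # tail probability drops below p.
        if sum(1 for v in xs if v > x) / n < p:
            return x
```

For the nonnegative sample `[0, 0, 0, 1]` one has $\mathsf{P}(X>0)=0.25$, so `q0` is positive for `p = 0.2` but equals zero for `p = 0.3`, in line with the role of the condition $\mathsf{P}(X>0)>p$.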
Given the positive homogeneity, the subadditivity property is easily seen to be equivalent to the convexity; cf., e.g., Theorem 4.7 in [56].
Therefore, it remains to verify the convexity property. Assume indeed that $\alpha\in[1,\infty]$. If, in addition, $\alpha<\infty$, then the function $\|\cdot\|_\alpha$ is a norm and hence convex; moreover, this function is nondecreasing on the set of all nonnegative r.v.’s. On the other hand, the function $\mathbb{R}\ni x\mapsto x_+$ is nonnegative and convex. It follows by (3.9) that $B_\alpha(X;p)(t)$ is convex in the pair $(X,t)$. So, to complete the verification of the convexity property of $Q_\alpha(X;p)$ in the case $\alpha\in[1,\infty)$, it remains to refer to the well-known and easily established fact that, if $f(x,y)$ is convex in $(x,y)$, then $\inf_y f(x,y)$ is convex in $x$; cf., e.g., Theorem 5.7 in [56].
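The convexity in $t$ also makes the bound easy to compute for a finite sample. The following is a minimal sketch, not the paper's algorithm: it assumes that (3.9) reads $B_\alpha(X;p)(t)=t+\|(X-t)_+\|_\alpha/p^{1/\alpha}$, that $X$ is uniform on the sample points, and that $\alpha\in[1,\infty)$ and $0<p<1$. The names `b_alpha` and `q_alpha` and the use of ternary search are choices made only for this illustration.

```python
def b_alpha(xs, p, alpha, t):
    """Assumed form of (3.9): t + ||(X - t)_+||_alpha / p**(1/alpha), X uniform on xs."""
    moment = sum(max(x - t, 0.0) ** alpha for x in xs) / len(xs)
    return t + (moment / p) ** (1.0 / alpha)


def q_alpha(xs, p, alpha, iters=200):
    """Approximate the infimum over t of b_alpha by ternary search.

    Ternary search is justified by convexity in t, which holds for alpha >= 1.
    """
    k = p ** (-1.0 / alpha)              # k > 1 because 0 < p < 1
    hi = max(xs)                         # b_alpha(t) = t for all t >= max(xs)
    lo = (k * min(xs) - hi) / (k - 1.0)  # for t < lo, b_alpha(t) > b_alpha(max(xs))
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if b_alpha(xs, p, alpha, m1) <= b_alpha(xs, p, alpha, m2):
            hi = m2                      # a minimizer lies in [lo, m2]
        else:
            lo = m1                      # a minimizer lies in [m1, hi]
    return b_alpha(xs, p, alpha, 0.5 * (lo + hi))
```

For the sample $1,\dots,10$ with $p=0.2$ and $\alpha=1$, the result is approximately 9.5, the average of the two largest sample points, as expected for CVaR. Evaluating `q_alpha` on paired samples and on their coordinatewise sum gives a quick numerical check of subadditivity.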
The subadditivity and hence convexity of $Q_\alpha(X;p)$ in $X$ in the remaining case $\alpha=\infty$ can now be obtained by the already established stability in $\alpha$. It can also be deduced from Lemma B.2 in [57] (cf. Lemma 2.1 in [58]) or from the main result in [23], in view of the inequality for $(L_{X_1+\cdots+X_n})^{*-1}$ given in the course of the discussion in [23] following Corollary 2.2 therein. However, a direct proof, similar to the one above for $\alpha\in[1,\infty)$, can be based on the observation that $B_\infty(X;p)(t)$ is convex in the pair $(X,t)$. Since $t\ln\frac1p$ is obviously linear in $(X,t)$, the convexity of $B_\infty(X;p)(t)$ in $(X,t)$ means precisely that for any natural number $n$, any r.v.’s $X_1,\dots,X_n$, any positive real numbers $t_1,\dots,t_n$, and any positive real numbers $\alpha_1,\dots,\alpha_n$ with $\sum_i\alpha_i=1$, one has the inequality $t\ln\mathsf{E}\,e^{X/t}\le\sum_i\alpha_i t_i\ln\mathsf{E}\,e^{X_i/t_i}$, where $X:=\sum_i\alpha_i X_i$ and $t:=\sum_i\alpha_i t_i$; but the latter inequality can be rewritten as an instance of Hölder’s inequality: $\mathsf{E}\prod_i Z_i\le\prod_i\|Z_i\|_{p_i}$, where $Z_i:=e^{\alpha_i X_i/t}$ and $p_i:=t/(\alpha_i t_i)$ (so that $\sum_i\frac1{p_i}=1$). (In particular, it follows that $B_\infty(X;p)(t)$ is convex in $t$, which is useful when $Q_\infty(X;p)$ is computed by Formula (3.8).)
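The convexity of $B_\infty(X;p)(t)$ in $t$ can likewise be used for computation. The following minimal sketch takes $B_\infty(X;p)(t)=t\ln\mathsf{E}\,e^{X/t}+t\ln\frac1p$ for $t>0$, with $X$ uniform on a finite sample. The names `b_inf` and `q_inf`, the search in $s=\ln t$, and the default bracket for $s$ are assumptions of this illustration; the bracket must contain the minimizer, which fails, for instance, for a constant sample, where the infimum is approached only as $t\downarrow0$.

```python
import math


def b_inf(xs, p, t):
    """t * ln(E exp(X/t) / p) for X uniform on xs and t > 0, via a stable log-sum-exp."""
    m = max(xs)
    mean_exp = sum(math.exp((x - m) / t) for x in xs) / len(xs)
    return m + t * (math.log(mean_exp) - math.log(p))


def q_inf(xs, p, s_lo=-10.0, s_hi=10.0, iters=200):
    """Approximate the infimum over t > 0 of b_inf by ternary search in s = ln t.

    Convexity in t makes the function unimodal in s. The bracket [s_lo, s_hi]
    is an assumption: it must contain the logarithm of the minimizing t.
    """
    for _ in range(iters):
        s1 = s_lo + (s_hi - s_lo) / 3.0
        s2 = s_hi - (s_hi - s_lo) / 3.0
        if b_inf(xs, p, math.exp(s1)) <= b_inf(xs, p, math.exp(s2)):
            s_hi = s2
        else:
            s_lo = s1
    return b_inf(xs, p, math.exp(0.5 * (s_lo + s_hi)))
```

Evaluating `b_inf` at a convex combination of two pairs (sample, $t$) and comparing with the same combination of the two values gives a direct numerical check of the Hölder-based convexity claim.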
The proof of Theorem 3.4 is now complete. ☐
Proof of Proposition 3.5. Take indeed any $\alpha\in[0,\infty)$. Let then $Y$ be a r.v. with the density function $f$ given by the formula $f(y)=c_\alpha\,y^{-\alpha-1}(\ln y)^{-2}\,\mathrm{I}\{y>2\}$ for all $y\in\mathbb{R}$, where $c_\alpha:=1\big/\int_2^\infty y^{-\alpha-1}\ln^{-2}y\,dy$. Then $Y\in\mathcal{X}_\alpha$ and, by the finiteness property stated in Theorem 3.4, $Q_\alpha(Y;p)\in\mathbb{R}$. Thus, one can find some real constant $c>Q_\alpha(Y;p)$. Let now $X=c$, for any such constant $c$. Then, by the consistency property stated in Theorem 3.4, $Q_\alpha(X;p)=c>Q_\alpha(Y;p)$. On the other hand, for any $\gamma\in(\alpha+1,\infty]$ one has $\mathsf{E}\,g_{\gamma-1;t}(X)=g_{\gamma-1;t}(c)<\infty=\mathsf{E}\,g_{\gamma-1;t}(Y)$ for all $t\in T_{\gamma-1}$ (letting here $\gamma-1:=\infty$ when $\gamma=\infty$), so that, by (2.23), $X\overset{\gamma}{\le}Y$. ☐
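The two moment facts about $Y$ used in this proof follow from elementary integrals. The following sketch spells them out for a finite exponent $\beta>\alpha$, which is an auxiliary symbol introduced here (it plays the role of $\gamma-1$ when $\gamma<\infty$).

```latex
\begin{align*}
\mathsf{E}\,Y^{\alpha}
  &= c_\alpha\int_2^\infty \frac{dy}{y\ln^2 y}
   = c_\alpha\Bigl[-\frac1{\ln y}\Bigr]_2^\infty
   = \frac{c_\alpha}{\ln 2} < \infty,\\
\mathsf{E}\,Y^{\beta}
  &= c_\alpha\int_2^\infty \frac{y^{\beta-\alpha-1}}{\ln^2 y}\,dy = \infty
  \quad\text{for } \beta>\alpha,
\end{align*}
% The second integrand is eventually larger than 1/y, because y^{beta-alpha}
% grows faster than ln^2 y, and the integral of 1/y over (2, infinity) diverges.
```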
Proof of Proposition 3.6. Consider first the case $\alpha\in(0,\infty)$. Let r.v.’s $X$ and $Y$ be in the default domain of definition, $\mathcal{X}_\alpha$, of the functional $Q_\alpha(\cdot;p)$. The condition $X\overset{\mathrm{st}}{<}Y$ and the left continuity of the function $\mathsf{P}(X\ge\cdot)$ imply that for any $v\in\mathbb{R}$, there are some $u\in(v,\infty)$ and $w\in(v,u)$ such that $\mathsf{P}(X\ge z)<\mathsf{P}(Y\ge z)$ for all $z\in[w,u]$. On the other hand, by the Fubini theorem, $\mathsf{E}(X-t)_+^\alpha=\int_{\mathbb{R}}\alpha(z-t)_+^{\alpha-1}\,\mathsf{P}(X\ge z)\,dz$ for all $t\in\mathbb{R}$. Recalling also that $X$ and $Y$ are in $\mathcal{X}_\alpha$, one has $B_\alpha(X;p)(t)<B_\alpha(Y;p)(t)$ for all $t\in\mathbb{R}$. By Proposition 4.3, $Q_\alpha(Y;p)=B_\alpha(Y;p)(t_{\mathrm{opt}})$ for some $t_{\mathrm{opt}}\in\mathbb{R}$. Therefore, $Q_\alpha(X;p)\le B_\alpha(X;p)(t_{\mathrm{opt}})<B_\alpha(Y;p)(t_{\mathrm{opt}}$