Article

Second Order Chebyshev–Edgeworth-Type Approximations for Statistics Based on Random Size Samples

by Gerd Christoph 1,*,† and Vladimir V. Ulyanov 2,3,†
1 Department of Mathematics, Otto-von-Guericke University Magdeburg, 39016 Magdeburg, Germany
2 Faculty of Computer Science, HSE University, 101000 Moscow, Russia
3 Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119991 Moscow, Russia
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2023, 11(8), 1848; https://doi.org/10.3390/math11081848
Submission received: 27 February 2023 / Revised: 3 April 2023 / Accepted: 11 April 2023 / Published: 13 April 2023
(This article belongs to the Special Issue Limit Theorems of Probability Theory)

Abstract

This article completes our studies on the formal construction of asymptotic approximations for statistics based on a random number of observations. Second order Chebyshev–Edgeworth expansions of asymptotically normally or chi-squared distributed statistics from samples with negative binomial or Pareto-like distributed random sample sizes are obtained. The results can have applications for a wide spectrum of asymptotically normally or chi-square distributed statistics. Random, non-random, and mixed scaling factors for each of the studied statistics produce three different limit distributions. In addition to the expected normal or chi-squared distributions, Student’s t-, Laplace, Fisher, gamma, and weighted sums of generalized gamma distributions also occur.

1. Introduction

To improve the convergence properties of sums of independent identically distributed random variables in the Central Limit Theorem, asymptotic expansions of distribution functions of normalized sums were considered. The history of asymptotic expansions in nonparametric statistics is presented in detail in Wallace [1], Bickel [2], and Hall [3], among others. Chebyshev–Edgeworth expansions, with which we are concerned here, are presented in great detail in Bhattacharya and Rao [4] for random vectors and in Petrov [5] for one-dimensional random variables. For instance, in Pfanzagl [6] and Bentkus et al. [7], the authors emphasize that asymptotic expansions can provide more effective approximations for asymptotic studies in statistical theory. Second order approximations of distribution functions of sums of random variables are of great importance because they take into account the skewness and kurtosis of the random variable in addition to the expected value and the variance, as in the Central Limit Theorem. In Burnashev [8], second order expansions are proved for the asymptotically normally distributed sample median $M_m$ based on a sample of size $m$ and for its MSE. Based on this, for a Laplace population with density $e^{-|x|}/2$, the actual MSE computed from exact data is compared numerically with the approximations. For the normal approximation, the influence of the remaining term is below 10% only for $m > 250$, while for the approximation with the second order expansion, the influence of the remaining term is below 10% already from $m = 8$. For a Cauchy population with smooth and heavy-tailed density $1/(\pi(1+x^2))$, for the normal approximation, the influence of the remaining term is below 10% for $m \ge 23$, while for the approximation with the second order expansion, the influence of the remaining term is below 10% already from $m = 11$.
Consequently, as Burnashev [8] pointed out, asymptotic expansions can significantly improve the exactness of statistical conclusions, even in the case of a small number of observations. The results in the abovementioned papers are based on non-random sample sizes or non-random number of observations.
When planning statistical studies, situations often arise where the sample sizes are unknown in advance and they are modeled as realizations of random variables. Many models from medicine, finance, risk theory, physics, and reliability lead to samples with random dimensions. For instance, in the papers by Nunes et al. [9,10,11], random size samples in different models of medical research were investigated in order to prevent false conclusions. In Esquível et al. [12], the authors give an informative overview of statistical inference with a random number of observations and some applications. Results for mean and variance for normally distributed samples, calculation of quantiles, and interval estimates with random sample size were also proved. Döbler [13] gives a detailed review of the literature on random sums as well as recent results on approximation in various metrics. In Schluter and Trede [14] (Theorem 1, Proposition 1), the authors show, using the convergence of a negative binomial random sum, that the growth rate of cities is Student t-distributed with 2 degrees of freedom. Their empirical investigations verify the result. The references in the above-cited papers provide further applications for random dimension sampling.
Bening et al. [15,16] proved convergence rates and asymptotic expansions for distributions of statistics $T_{N_n}$ based on samples with random dimension $N_n \ge 1$. Here, $T_m$ is a statistic based on a non-random number $m \ge 1$ of independent observations. The random variables $N_n \ge 1$ form a sequence of integer random sample sizes that depends on a natural parameter $n$ with $N_n \to \infty$ in probability as $n \to \infty$. Inequalities with a convergence rate are assumed for the approximations of the distribution functions of both the normalized statistics $T_m$ and the normalized random sample sizes $N_n$. As examples, convergence rates and first order asymptotic expansions are derived for the statistics $T_{N_n}$, where $T_m$ is an asymptotically normal statistic and the random sample size $N_n$ is either negative binomial or Pareto-like distributed.
In Christoph et al. [17], inequalities for the second order approximations of the distribution functions of normalized negative binomial and Pareto-like sample sizes were proved. Consequently, second order Chebyshev–Edgeworth approximations and the corresponding Cornish–Fisher expansions could be obtained for the distribution of the normalized arithmetic mean of a sample with normalized negative binomial or Pareto-like sample sizes where the remainders are of order $n^{-3/2}$.
The present work provides a supplement to our paper, Christoph and Ulyanov [18], where we have developed a formal second order design for asymptotic Chebyshev–Edgeworth approximations. We considered asymptotically normal statistics with sample size having negative binomial distribution as well as asymptotically chi-squared statistics with Pareto-like distributed sample sizes. In addition to the distributions of the statistic $T_m$ and the random sample size $N_n$, three scaling factors for $T_{N_n}$ are also introduced, leading to different expansions. It is the first paper to consider approximations for asymptotically chi-squared statistics based on random sample sizes. Some more applications of sampling with random sample size were also mentioned.
In the present paper, we provide similar results for asymptotically normal statistics of samples with Pareto-like distributed sample sizes and for asymptotically chi-squared statistics with sample size having negative binomial distribution.
For the reader's convenience, we list in Section 2 some notations, conditions, and statements that were also used in Christoph and Ulyanov [18]. Section 3 states the necessary approximations for the statistics $T_m$ and the sample sizes $N_n$. The dependence of the limit distributions of the scaled statistic $T_{N_n}$ on the distributions of the statistic $T_m$ and the sample size $N_n$, as well as on the scaling factors, is discussed in Section 4. Section 5 then presents the main results. As examples, we consider the same statistic $T_m$ as in Christoph and Ulyanov [18] (Corollaries 1 and 2), but with changed sample sizes. Section 6 provides the proofs of the main results, leaving three auxiliary lemmas to Appendix A. Conclusions are presented in Section 7.

2. Notation and Preliminaries

Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space on which all occurring random variables are given.
We denote the positive integers, the real axis, the integer part $[y]$ of a real $y$, and the indicator function as follows:
$$\mathbb{N}_+ = \{1, 2, \ldots\}, \quad \mathbb{R} = (-\infty, \infty), \quad y - 1 < [y] \le y \quad \text{and} \quad I_A = I_A(x) = \begin{cases} 1, & x \in A \subset \mathbb{R}, \\ 0, & x \notin A. \end{cases}$$
Let $X_1, X_2, X_3, \ldots \in \mathbb{R}$ be independent identically distributed random variables. Define the statistic
$$T_m := T_m(X_1, \ldots, X_m) \quad \text{with} \quad m \in \mathbb{N}_+,$$
based on the random sample $\{X_1, X_2, \ldots, X_m\}$ with a non-random sample size $m \in \mathbb{N}_+$.
Consider the sequence of discrete random variables $N_1, N_2, \ldots$, depending on an integer parameter $n \ge 1$. The integer $N_n \ge 1$ indicates the random dimension of the observations $X_1, \ldots, X_{N_n}$. Let us assume that the sample size $N_n$ is independent of $X_1, X_2, X_3, \ldots$ and that $N_n \to \infty$ in probability as $n \to \infty$. Define for each $n \in \mathbb{N}_+$ the statistic $T_{N_n}$ obtained from a random sample $\{X_1, X_2, \ldots, X_{N_n}\}$ by
$$T_{N_n}(\omega) := T_{N_n(\omega)}\left(X_1(\omega), X_2(\omega), \ldots, X_{N_n(\omega)}(\omega)\right) \quad \text{for each } \omega \in \Omega. \tag{1}$$
It follows from Esquível et al. [12] (Theorem 2.1.1) that the statistic $T_{N_n}$ in (1) is well-defined.
Since we want to prove second order approximations for the statistic T N n in form of inequalities, we need the corresponding assumptions for the statistic T m and for the random sample size N n as well.
For the statistic $T_m$ with $\mathbb{E}\, T_m = 0$ and the random sample sizes $N_n \in \mathbb{N}_+$ we suppose conditions on the structure of the approximating functions as well as on the convergence rate:
Assumption 1. 
There are a distribution function $F(x)$, bounded functions $f_1(x)$, $f_2(x)$ which are differentiable for all $x \ne 0$, $\gamma \in \{-1, -1/2, 0, 1/2, 1\}$, $a > 1/2$ as well as $0 < C_1 < \infty$ such that
$$\sup_x \left| \mathbb{P}\left(m^{\gamma} T_m \le x\right) - F(x) - m^{-1/2} f_1(x) - I_{\{a>1\}}(a)\, m^{-1} f_2(x) \right| \le C_1 m^{-a}, \quad m \ge 1. \tag{2}$$
Assumption 2. 
There exist a distribution function $H(y)$ with $H(0+) = 0$, a function $h_2(y)$ of bounded variation, a sequence of numbers $0 < g_n \uparrow \infty$, $b > 0$, and $0 < C_2 < \infty$ such that for $n \in \mathbb{N}_+$
$$\begin{cases} \sup_{y \ge 0} \left| \mathbb{P}\left(g_n^{-1} N_n \le y\right) - H(y) \right| \le C_2 n^{-b}, & \text{for } 0 < b \le 1, \\ \sup_{y \ge 0} \left| \mathbb{P}\left(g_n^{-1} N_n \le y\right) - H(y) - n^{-1} h_2(y) \right| \le C_2 n^{-b}, & \text{for } b > 1. \end{cases} \tag{3}$$
Remark 1. 
Assumptions 1 and 2 require inequalities for the approximations of $T_m$ and $N_n$ for all $m, n \in \mathbb{N}_+$, leading to inequalities for the approximations of $T_{N_n}$. See also Remark 5 below on Poisson and binomial random variables $N_n$. For these sample sizes, we are so far only aware of estimates of the remaining terms with small-$o$ or large-$O$ convergence rates. About the differences between inequalities and $O$-order bounds, see, e.g., Fujikoshi and Ulyanov [19] (Chapter 1).
Remark 2. 
In Bening et al. [16], these conditions are formulated more generally. Assumption 1 requires the existence of $f_1, \ldots, f_l$ with $a > l/2$ and Assumption 2 that of $h_1, \ldots, h_k$ with $b > k/2$. We restrict ourselves here, as in Christoph and Ulyanov [18], to the required approximation functions.
Assumptions 1 and 2 lead to the approximations for the distribution functions of statistics $T_{N_n}$:
Proposition 1. 
(Christoph and Ulyanov [18], Proposition 1) Let $\gamma \in \{-1, -1/2, 0, 1/2, 1\}$. The statistic $T_m$ and the sample size $N_n$ are supposed to satisfy Assumptions 1 and 2, respectively. Then,
$$\sup_{x \in \mathbb{R}} \left| \mathbb{P}\left(g_n^{\gamma} T_{N_n} \le x\right) - G_n(x, 1/g_n) \right| \le C_1 \mathbb{E}\, N_n^{-a} + (C_3 D_n + C_4)\, n^{-b}, \tag{4}$$
where $a > 0$, $b > 0$ are the convergence rates in (2) and (3),
$$G_n(x, 1/g_n) = \int_{1/g_n}^{\infty} \left( F(x y^{\gamma}) + \frac{f_1(x y^{\gamma})}{\sqrt{g_n y}} + \frac{f_2(x y^{\gamma})}{g_n y} \right) d\left( H(y) + \frac{h_2(y)}{n} \right), \tag{5}$$
$$D_n = \sup_x \int_{1/g_n}^{\infty} \left| \frac{\partial}{\partial y} \left( F(x y^{\gamma}) + \frac{f_1(x y^{\gamma})}{\sqrt{g_n y}} + \frac{f_2(x y^{\gamma})}{y\, g_n} \right) \right| dy, \tag{6}$$
and $f_1(z)$, $f_2(z)$, $h_2(y)$ are given in (2) and (3). The constants $C_1$, $C_3$, $C_4$ do not depend on $n$.
Bening et al. [16] proved general transfer theorems under the conditions indicated in Remark 2 only for the case $\gamma \ge 0$. Therefore, the proof is repeated in Christoph and Ulyanov [20] (Appendix A.1).

3. Second Order Estimates for Both the Statistics $T_m$ and the Sample Sizes $N_n$

First we consider the following statistics $T_m$ with non-random sample size $m$ and $\mathbb{E}\, T_m = 0$ with the corresponding second order approximations. Let the asymptotically normal statistic $T_m$ satisfy the following inequality:
$$\left| \mathbb{P}\left(\sqrt{m}\, T_m \le x\right) - \Phi(x) - \left( m^{-1/2} (p_0 + p_2 x^2) + m^{-1} (p_1 x + p_3 x^3 + p_5 x^5)\, I_{\{a>1\}}(a) \right) \varphi(x) \right| \le C m^{-a} \tag{7}$$
with $a > 0$, where $\Phi(x)$ denotes the standard normal distribution function with density function $\varphi(y)$:
$$\Phi(x) = \int_{-\infty}^{x} \varphi(y)\, dy, \quad x \in \mathbb{R}, \quad \text{and} \quad \varphi(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}, \quad y \in \mathbb{R}.$$
Asymptotically chi-squared distributed statistics $T_m$ satisfy the following inequality:
$$\left| \mathbb{P}\left(m\, T_m \le x\right) - G_d(x) - m^{-1} (q_1 x + q_2 x^2)\, g_d(x) \right| \le C m^{-2}, \tag{8}$$
where $G_d(x)$, $d \in \mathbb{N}_+$, denotes the chi-squared distribution function with $d$ degrees of freedom and the density function $g_d(y)$:
$$g_d(y) = \frac{1}{2^{d/2}\, \Gamma(d/2)}\, y^{(d-2)/2} e^{-y/2}, \quad y > 0, \quad \text{and} \quad G_d(x) = \mathbb{P}\left(\chi_d^2 \le x\right) = \int_0^x g_d(y)\, dy, \quad x > 0.$$
In Christoph and Ulyanov [18] (Sections 3.1 and 3.2), some examples of such statistics $T_m$ are given that satisfy (7) or (8) and, consequently, Assumption 1.
As already announced, we consider the following random sample sizes N n with the corresponding second order approximations.
The Pareto-like random sample sizes $N_n(s)$ are defined as follows:
Let $Y_j(s) \in \mathbb{N}_+$, $j = 1, 2, \ldots$, be independent discrete Pareto II random variables with parameter $s > 0$, which are discretized from continuous Lomax (Pareto II) random variables on $\mathbb{N}_+$; for a review, see, e.g., Buddana and Kozubowski [21]. For $s > 0$, define
$$\mathbb{P}\left(Y_j(s) \le k\right) = \frac{k}{s + k}, \quad N_n(s) = \max_{1 \le j \le n} Y_j(s) \quad \text{and} \quad \mathbb{P}\left(N_n(s) \le k\right) = \left( \frac{k}{s + k} \right)^n, \quad n, k \in \mathbb{N}_+. \tag{9}$$
Proposition 2. 
(Christoph and Ulyanov [18], Proposition 4) Let $N_n(s)$ be the discrete Pareto-like random variable whose distribution function is given in (9); then, for all integers $n \ge 1$ and fixed $s > 0$, we have
$$\sup_{y > 0} \left| \mathbb{P}\left( \frac{N_n(s)}{n} \le y \right) - W_s(y) - \frac{h_{2;s}(y)}{n} \right| \le \frac{C_2(s)}{n^2}, \tag{10}$$
$$W_s(y) = e^{-s/y}, \quad y > 0, \qquad h_{2;s}(y) = \frac{s\, e^{-s/y}}{2 y^2} \left( s - 1 + 2 Q_1(n y) \right), \quad y > 0, \tag{11}$$
with the jump correcting function $Q_1(y) = 1/2 - (y - [y])$, where $C_2(s) > 0$ does not depend on $n$. Furthermore,
$$\mathbb{E}\left( N_n(s) \right)^{-a} \le C(a, s)\, n^{-\min\{a, 2\}}, \tag{12}$$
with optimal bound in (12) for $0 < a \le 2$, where $a$ is the convergence rate in (7).
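The effect of the second-order term in Proposition 2 can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it assumes the reading of (9)–(11) given above, evaluates the errors only at the lattice points $y = k/n$ (where $Q_1(ny) = 1/2$), and uses hypothetical helper names such as `sup_errors`.

```python
import math


def exact_cdf(k: int, n: int, s: float) -> float:
    """Exact P(N_n(s) <= k) = (k / (s + k))**n, as in (9)."""
    return (k / (s + k)) ** n


def w_s(y: float, s: float) -> float:
    """Limit law W_s(y) = exp(-s / y) for y > 0."""
    return math.exp(-s / y)


def h2_s(ny: float, n: int, s: float) -> float:
    """Second-order term h_{2;s}(y) from (11), evaluated at y = ny / n."""
    y = ny / n
    q1 = 0.5 - (ny - math.floor(ny))  # jump-correcting function Q_1(n y)
    return s * math.exp(-s / y) * (s - 1.0 + 2.0 * q1) / (2.0 * y * y)


def sup_errors(n: int, s: float, k_max_factor: int = 20):
    """Largest first- and second-order errors over lattice points y = k / n."""
    first = 0.0
    second = 0.0
    for k in range(1, k_max_factor * n + 1):
        y = k / n
        exact = exact_cdf(k, n, s)
        first = max(first, abs(exact - w_s(y, s)))
        corrected = w_s(y, s) + h2_s(float(k), n, s) / n
        second = max(second, abs(exact - corrected))
    return first, second


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, sup_errors(n, s=2.0))
```

In this sketch the first-order error decays roughly like $1/n$ and the corrected error roughly like $1/n^2$, in line with the rate stated in (10); the restriction to lattice points is a simplification of the supremum over all $y > 0$.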
Remark 3. 
The inverse exponential random variable $W(s)$ with distribution function $H_s(y) = \mathbb{P}(W(s) \le y) = e^{-s/y}\, I_{(0,\infty)}(y)$ and rate parameter $s > 0$ is "heavy tailed" with shape parameter 1, as is $\mathbb{P}(N_n(s) \le y)$. Thus, the expected values of these two random variables do not exist.
Suppose the positive integer-valued random variable $N_n(r)$ has a (shifted by 1) negative binomial distribution with probability of success $1/n$, $n \in \mathbb{N}_+$, parameter $r > 0$, and probabilities
$$\mathbb{P}\left(N_n(r) = j\right) = \frac{\Gamma(j + r - 1)}{\Gamma(j)\, \Gamma(r)} \left( \frac{1}{n} \right)^r \left( 1 - \frac{1}{n} \right)^{j-1}, \quad j \in \mathbb{N}_+, \quad \text{and} \quad g_n = \mathbb{E}\left(N_n(r)\right) = r(n - 1) + 1. \tag{13}$$
In statistical studies, for counting models, the negative binomial and Poisson distributions are the two most important ones. In Schluter and Trede [14] (Section 2.1), the authors emphasize that the negative binomial distribution with its two parameters can typically capture over-dispersion in count data, while this is not the case with the one-parameter Poisson distribution. They proved in a more general framework
$$\lim_{n \to \infty} \sup_y \left| \mathbb{P}\left( N_n(r)/g_n \le y \right) - G_{r,r}(y) \right| = 0, \tag{14}$$
where $G_{r,r}(y)$ denotes the gamma distribution with identical scale and shape parameters $r > 0$, whose density is
$$g_{r,r}(y) = \frac{r^r}{\Gamma(r)}\, y^{r-1} e^{-r y}\, I_{(0,\infty)}(y), \quad y \in \mathbb{R}.$$
In Bening and Korolev [22] (Lemma 2.2), the result (14) was also obtained.
Proposition 3. 
(Christoph and Ulyanov [18], Proposition 3) Let $r > 0$. The discrete random variable $N_n(r)$ has probabilities and expected value $g_n$ given in (13). Then, for all $n \in \mathbb{N}_+$:
$$\sup_{y \ge 0} \left| \mathbb{P}\left( \frac{N_n(r)}{g_n} \le y \right) - G_{r,r}(y) - \frac{h_{2;r}(y)}{n} \right| \le C_2(r)\, n^{-\min\{r, 2\}}, \tag{15}$$
where $C_2(r) > 0$ does not depend on $n$ and, with the jump correcting function $Q_1(y) = 1/2 - (y - [y])$,
$$h_{2;r}(y) = \begin{cases} 0, & \text{for } r \le 1, \\ \dfrac{g_{r,r}(y)}{2r} \left( (y - 1)(2 - r) + 2 Q_1(g_n y) \right), & \text{for } r > 1. \end{cases} \tag{16}$$
Moreover, the negative moments $\mathbb{E}\left(N_n(r)\right)^{-\alpha}$ satisfy the following estimate for all $r > 0$, $\alpha > 0$:
$$\mathbb{E}\left( N_n(r) \right)^{-\alpha} \le C(r) \begin{cases} n^{-\min\{r, \alpha\}}, & r \ne \alpha, \\ \ln(n)\, n^{-\alpha}, & r = \alpha, \end{cases} \tag{17}$$
and the convergence rate in the case $r = \alpha$ cannot be improved.
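As a quick plausibility check of the shifted negative binomial model, the probabilities in (13) can be summed numerically to recover the total mass, the mean $g_n = r(n-1)+1$, and the over-dispersion mentioned above. This is an illustrative sketch under the reading of (13) given here; the function names and the truncation point `j_max` are assumptions made for the example, and the variance value $r\,n\,(n-1)$ used in the check is the standard negative binomial variance, not a formula quoted from the paper.

```python
import math


def nb_pmf(j: int, n: int, r: float) -> float:
    """P(N_n(r) = j) for j = 1, 2, ... as in (13); requires n >= 2."""
    log_p = (
        math.lgamma(j + r - 1.0)
        - math.lgamma(j)
        - math.lgamma(r)
        + r * math.log(1.0 / n)
        + (j - 1) * math.log(1.0 - 1.0 / n)
    )
    return math.exp(log_p)


def nb_moments(n: int, r: float, j_max: int = 20000):
    """Total mass, mean, and variance from a truncated sum over j = 1..j_max."""
    total = 0.0
    mean = 0.0
    second = 0.0
    for j in range(1, j_max + 1):
        p = nb_pmf(j, n, r)
        total += p
        mean += j * p
        second += j * j * p
    return total, mean, second - mean * mean


if __name__ == "__main__":
    n, r = 20, 2.5
    total, mean, var = nb_moments(n, r)
    print(total, mean, r * (n - 1) + 1, var)
```

For $n = 20$ and $r = 2.5$ the computed mean agrees with $g_n = 48.5$, and the variance is far larger than the mean, which is the over-dispersion a Poisson sample size cannot reproduce.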
Remark 4. 
Second order Chebyshev–Edgeworth expansions (10) and (15) with $r > 1$ were first proved in Christoph et al. [17] (Theorems 4 and 1). Approximations in (10) and (15) with remainder estimates $C_s / n$ or $C_r\, n^{-\min\{r, 1\}}$ are given, e.g., in Bening et al. [16] and Gavrilenko et al. [23]. In Christoph et al. [24] (Corollaries 5.4 and 6.5), leading terms for the negative moments of $N_n(r)$ and $N_n(s)$ are derived that lead to (17) and (12).
Remark 5. 
The negative binomial distribution belongs to the class of Panjer distributions, which also includes the Poisson and binomial distributions. Samples with binomial or Poisson distributed sample sizes were studied, among others, in the above-cited papers [9,10,11,12]. Convergence rate bounds for statistics based on such samples are given in Döbler [13], Korolev [25], and Bulinski and Slepov [26]. Döbler [13], Korolev and Shevtsova [27], and Sunklodas [28] obtained Berry–Esseen bounds for sums based on samples with binomial and Poisson sample sizes. To the best of the authors' knowledge, Chebyshev–Edgeworth expansions for these lattice distributed random variables have only been proven so far with bounds of small-$o$ or large-$O$ rates, see, e.g., Petrov [29] (Chapter 6, Theorem 6) or Kolassa and McCullagh [30]. Therefore, inequality (3) in Assumption 2 is not available for these sample sizes.

4. Limit Distributions of Statistics with Random Size Samples using Different Scaling Factors

We now consider the statistics $T_m$ and the sample sizes $N_n$, which are supposed to satisfy the inequalities (2) and (3) in Assumptions 1 and 2, respectively. Let us investigate the scaled statistics $g_n^{\gamma^*} N_n^{\gamma - \gamma^*} T_{N_n}$ with the sequence $g_n \uparrow \infty$ as $n \to \infty$. We analyze the two cases $\Phi$ and $G_u$ as limiting distributions $F$ in Assumption 1 with respect to the exponents $\gamma$ and $\gamma^*$: If $F = \Phi$, then $\gamma = 1/2$ and $\gamma^* \in \{-1/2, 0, 1/2\}$, while if $F = G_u$, then $\gamma = 1$ and $\gamma^* \in \{-1, 0, 1\}$. Then, conditioning on $N_n$ and using (2) and (3), we have
$$\begin{aligned} \mathbb{P}\left( g_n^{\gamma^*} N_n^{\gamma - \gamma^*} T_{N_n} \le x \right) &= \mathbb{P}\left( N_n^{\gamma} T_{N_n} \le x\, (N_n/g_n)^{\gamma^*} \right) = \sum_{m=1}^{\infty} \mathbb{P}\left( m^{\gamma} T_m \le x\, (m/g_n)^{\gamma^*} \right) \mathbb{P}(N_n = m) \\ &\overset{(2)}{\approx} \mathbb{E}\, F\left( x\, (N_n/g_n)^{\gamma^*} \right) = \int_{1/g_n}^{\infty} F(x y^{\gamma^*})\, d\,\mathbb{P}(N_n/g_n \le y) \overset{(3)}{\approx} \int_{1/g_n}^{\infty} F(x y^{\gamma^*})\, dH(y). \end{aligned} \tag{18}$$
Consequently, the limit distribution of the scaled statistic $g_n^{\gamma^*} N_n^{\gamma - \gamma^*} T_{N_n}$ is a scale mixture of the underlying $F$ with mixing distribution $H$: $\mathbb{P}\left( g_n^{\gamma^*} N_n^{\gamma - \gamma^*} T_{N_n} \le x \right) \to \int_0^{\infty} F(x y^{\gamma^*})\, dH(y)$ as $n \to \infty$. Refer to, e.g., Choy and Chan [31], Fujikoshi et al. [32] (Chapter 13), and Fujikoshi and Ulyanov [19] (Chapter 2) and the references therein.
The limiting distributions $\int_{1/g_n}^{\infty} F(x y^{\gamma^*})\, dH(y)$ therefore only arise from the leading distributions $F(x)$ and $H(y)$ in the inequalities (2) and (3) and also depend on the parameter $\gamma^*$.
In Christoph and Ulyanov [18] (Sections 5 and 6), the cases $F(x) = \Phi(x)$ with $H(y) = G_{r,r}(y)$ as well as $F(x) = G_u(x)$ with $H(y) = W_s(y)$ were considered. Now, we interchange the distributions of the random sample sizes $N_n$. We first study the limiting distributions of asymptotically normally distributed statistics with Pareto-like distributed sample sizes $N_n(s)$ and then those of asymptotically chi-squared distributed statistics with negative binomial distributed sample sizes $N_n(r)$. Since $W_s(1/n) = e^{-s n}$ and $G_{r,r}(1/g_n) \le \frac{r^{r-1}}{\Gamma(r)}\, g_n^{-r}$ hold, the integration range in the last integral in (18) can be extended from $(1/g_n, \infty)$ to $(0, \infty)$ for further investigations.

4.1. The Case $F(x) = \Phi(x)$ and $H(y) = W_s(y)$

In Christoph and Ulyanov [20,33], asymptotically normally distributed statistics $T_m$ for samples of $m$-dimensional normally distributed vectors were considered: the correlation coefficient as well as three geometric features, namely the length of a vector, the distance, and the angle between two vectors. Inequalities for second order approximations for the statistic $T_m$ are derived when the dimension $m$ is replaced by a Pareto-like distributed random dimension $N_n(s)$. For the median of a sample with random sample size $N_n(s)$, analogous results are shown in Christoph et al. [24] (Section 6). All these asymptotically normally distributed statistics $T_{N_n(s)}$ with Pareto-like random dimensions or sample sizes have the same limiting distribution.
Let $\gamma^* \in \{-1/2, 0, 1/2\}$. Since $\mathbb{E}\, N_n(s) = \infty$, we choose $g_n = n$. Then, the limit laws for
$$\mathbb{P}\left( n^{\gamma^*} N_n(s)^{1/2 - \gamma^*}\, T_{N_n(s)} \le x \right) \quad \text{are} \quad V_{\gamma^*}(x, s) = \int_0^{\infty} \Phi(x y^{\gamma^*})\, dH_s(y) = \int_0^{\infty} \Phi(x y^{\gamma^*})\, \frac{s}{y^2}\, e^{-s/y}\, dy,$$
with corresponding densities
$$v_{\gamma^*}(x, s) = \frac{s}{\sqrt{2\pi}} \int_0^{\infty} y^{\gamma^* - 2}\, e^{-\left( x^2 y^{2\gamma^*}/2 + s/y \right)}\, dy = \begin{cases} l_{1/\sqrt{s}}(x) = \dfrac{\sqrt{2s}}{2}\, e^{-\sqrt{2s}\, |x|}, & \gamma^* = \dfrac{1}{2}, \\[2mm] \varphi(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, & \gamma^* = 0, \\[2mm] s_2(x; \sqrt{s}) = \dfrac{1}{2\sqrt{2s}} \left( 1 + \dfrac{x^2}{2s} \right)^{-3/2}, & \gamma^* = -\dfrac{1}{2}. \end{cases} \tag{19}$$
Therefore, the limit distributions $V_{\gamma^*}(x, s)$ are the Laplace law $L_{1/\sqrt{s}}(x)$ with density $l_{1/\sqrt{s}}(x)$ and scale parameter $\lambda = 1/\sqrt{s}$ for $\gamma^* = 1/2$, the standard normal law $\Phi(x)$ with density $\varphi(x)$ for $\gamma^* = 0$, and, for $\gamma^* = -1/2$, the scaled Student's t-distribution $S_2(x; \sqrt{s})$ with 2 degrees of freedom and density $s_2(x; \sqrt{s})$. These scale mixture distributions $V_{\gamma^*}(x, s)$ are discussed in more detail in Christoph and Ulyanov [20] (Section 4.2).
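The three closed forms of the mixture density can be checked against a direct numerical evaluation of the integral. The sketch below is only an illustration and rests on assumptions: it reads the integral as $\frac{s}{\sqrt{2\pi}} \int_0^\infty y^{c-2} e^{-(x^2 y^{2c}/2 + s/y)}\,dy$ with scaling exponent $c \in \{1/2, 0, -1/2\}$, substitutes $y = e^t$, and applies a plain trapezoid rule; the function names and integration limits are choices made for this example.

```python
import math


def mixture_density(x: float, s: float, c: float,
                    t_min: float = -12.0, t_max: float = 40.0,
                    step: float = 0.01) -> float:
    """Trapezoid rule for the scale-mixture density after substituting y = exp(t)."""
    n_steps = int(round((t_max - t_min) / step))
    total = 0.0
    for i in range(n_steps + 1):
        t = t_min + i * step
        # y^(c-2) * exp(-(x^2 y^(2c) / 2 + s / y)) * dy, with dy = y dt
        exponent = (c - 1.0) * t - (0.5 * x * x * math.exp(2.0 * c * t)
                                    + s * math.exp(-t))
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(exponent)
    return s / math.sqrt(2.0 * math.pi) * total * step


def laplace_density(x: float, s: float) -> float:
    """Closed form for c = 1/2."""
    return math.sqrt(2.0 * s) / 2.0 * math.exp(-math.sqrt(2.0 * s) * abs(x))


def normal_density(x: float) -> float:
    """Closed form for c = 0."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def scaled_student2_density(x: float, s: float) -> float:
    """Closed form for c = -1/2 (Student's t with 2 degrees of freedom, scaled)."""
    return (1.0 + x * x / (2.0 * s)) ** (-1.5) / (2.0 * math.sqrt(2.0 * s))


if __name__ == "__main__":
    x, s = 0.7, 2.0
    print(mixture_density(x, s, 0.5), laplace_density(x, s))
    print(mixture_density(x, s, 0.0), normal_density(x))
    print(mixture_density(x, s, -0.5), scaled_student2_density(x, s))
```

The agreement of the numerical integral with the Laplace, normal, and scaled Student densities illustrates how the choice of scaling factor alone changes the limit law.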

4.2. The Case $F(x) = G_d(x)$ and $H(y) = G_{r,r}(y)$

Asymptotically chi-squared distributed statistics of samples with random sample size were considered for the first time in Christoph and Ulyanov [18] in the case of $H(y) = W_s(y) = e^{-s/y}$, $y > 0$.
Now, negative binomial distributed sample sizes $N_n(r)$ are considered. With $\gamma^* \in \{-1, 0, 1\}$ and $g_n = \mathbb{E}\, N_n(r) = r(n - 1) + 1$, the limit distributions for
$$\mathbb{P}\left( g_n^{\gamma^*} N_n(r)^{1 - \gamma^*}\, T_{N_n(r)} \le x \right) \quad \text{are} \quad V_{\gamma^*}(x; d, r) = \int_0^{\infty} G_d(x y^{\gamma^*})\, dG_{r,r}(y) = \int_0^{\infty} G_d(x y^{\gamma^*})\, \frac{r^r}{\Gamma(r)}\, y^{r-1} e^{-r y}\, dy.$$
The corresponding densities are
$$v_{\gamma^*}(x; d, r) = \frac{r^r x^{d/2 - 1}}{\Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \int_0^{\infty} y^{r + \gamma^* d/2 - 1}\, e^{-\left( x y^{\gamma^*}/2 + r y \right)}\, dy = \begin{cases} f^*(x; d, 2r) = \dfrac{\Gamma(d/2 + r)\, x^{d/2 - 1}}{\Gamma(d/2)\, \Gamma(r)\, 2^{d/2}\, r^{d/2}} \left( 1 + \dfrac{x}{2r} \right)^{-(d + 2r)/2}, & \gamma^* = 1, \\[2mm] g_d(x) = \dfrac{1}{2^{d/2}\, \Gamma(d/2)}\, x^{d/2 - 1} e^{-x/2}, & \gamma^* = 0, \\[2mm] w_{r - d/2}(x; d, r) = \dfrac{r}{\Gamma(r)\, \Gamma(d/2)} \left( \dfrac{x r}{2} \right)^{r/2 + d/4 - 1} K_{r - d/2}\left( \sqrt{2 r x} \right), & \gamma^* = -1. \end{cases} \tag{20}$$
We prove (20) for $\gamma^* = \pm 1$ in Section 6 in the proof of Theorem 2.
The scale mixtures $V_{\gamma^*}(x; d, r)$ are the (scaled by $d$) F-distribution $F^*(x; d, 2r) = F(x/d; d, 2r)$ with parameters $d \in \mathbb{N}_+$ and $r > 0$ and density $f^*(x; d, 2r) = \frac{1}{d} f\left( \frac{x}{d}; d, 2r \right)$ for $\gamma^* = 1$, the chi-squared distribution $G_d(x)$ with $d$ degrees of freedom and density $g_d(x)$ for $\gamma^* = 0$, and a gamma distribution of generalized type $W_{r - d/2}(x; d, r)$ with density $w_{r - d/2}(x; d, r)$ for $\gamma^* = -1$. The modified Bessel functions of the third kind, or Macdonald functions, $K_\lambda(u)$ also occurred in Christoph and Ulyanov [18,20] in generalized gamma and Laplace densities.
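The chi-squared scale mixtures can be checked in the same numerical way. The following sketch is an illustration under stated assumptions: it reads the density integral as $\frac{r^r x^{d/2-1}}{\Gamma(r) 2^{d/2} \Gamma(d/2)} \int_0^\infty y^{r + c\,d/2 - 1} e^{-(x y^{c}/2 + r y)}\,dy$ with scaling exponent $c \in \{1, 0, -1\}$, and for $c = -1$ it uses the half-integer case $d = 3$, $r = 2$, where the Bessel-type density reduces to $\sqrt{4x}\, e^{-\sqrt{4x}}$. All function names are hypothetical.

```python
import math


def chi2_mixture_density(x: float, d: int, r: float, c: float,
                         t_min: float = -40.0, t_max: float = 12.0,
                         step: float = 0.01) -> float:
    """Trapezoid rule for the chi-squared scale-mixture density, y = exp(t)."""
    prefactor = (r ** r * x ** (d / 2.0 - 1.0)
                 / (math.gamma(r) * 2.0 ** (d / 2.0) * math.gamma(d / 2.0)))
    n_steps = int(round((t_max - t_min) / step))
    total = 0.0
    for i in range(n_steps + 1):
        t = t_min + i * step
        # y^(r + c d/2 - 1) * exp(-(x y^c / 2 + r y)) * dy, with dy = y dt
        exponent = (r + c * d / 2.0) * t - (0.5 * x * math.exp(c * t)
                                            + r * math.exp(t))
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(exponent)
    return prefactor * total * step


def scaled_f_density(x: float, d: int, r: float) -> float:
    """Closed form for c = 1 (F-distribution with (d, 2r) d.f., scaled by d)."""
    const = (math.gamma(d / 2.0 + r)
             / (math.gamma(d / 2.0) * math.gamma(r) * (2.0 * r) ** (d / 2.0)))
    return const * x ** (d / 2.0 - 1.0) * (1.0 + x / (2.0 * r)) ** (-(d + 2.0 * r) / 2.0)


def chi2_density(x: float, d: int) -> float:
    """Closed form for c = 0."""
    return x ** (d / 2.0 - 1.0) * math.exp(-x / 2.0) / (2.0 ** (d / 2.0) * math.gamma(d / 2.0))


def w_half_d3_r2(x: float) -> float:
    """Closed form for c = -1 in the special case d = 3, r = 2."""
    return math.sqrt(4.0 * x) * math.exp(-math.sqrt(4.0 * x))


if __name__ == "__main__":
    x, d, r = 1.3, 3, 2.0
    print(chi2_mixture_density(x, d, r, 1.0), scaled_f_density(x, d, r))
    print(chi2_mixture_density(x, d, r, 0.0), chi2_density(x, d))
    print(chi2_mixture_density(x, d, r, -1.0), w_half_d3_r2(x))
```

Choosing a half-integer order avoids the need for a general Bessel function routine; for other parameter values a library implementation of $K_\lambda$ would be required.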
Remark 6. 
The Macdonald function satisfies the order-reflection formula $K_{-\lambda}(u) = K_\lambda(u)$, and $K_\lambda(u)$ may be expressed in closed form for $\lambda = m + 1/2$ with integer $m$. In Oldham et al. [34] (Formulas 51:4:1 and 26:13:3), the Macdonald functions $K_\lambda(u) = K_{-\lambda}(u)$ for $\lambda = 1/2, 3/2, 5/2, 7/2, 9/2$ are explicitly given. Using Prudnikov et al. [35] (Formulas 2.3.16.1-3), the densities $w_{r - d/2}(x; d, r) = w_{m + 1/2}(x; d, r)$ can be calculated:
$$w_{m + 1/2}(x; d, r) = \frac{r^r x^{d/2 - 1}}{\Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \begin{cases} (-1)^m \sqrt{\pi}\, \dfrac{\partial^m}{\partial r^m} \left( r^{-1/2} e^{-\sqrt{2 r x}} \right), & m = 0, 1, 2, \ldots, \\[2mm] (-2)^{|m|} \sqrt{\dfrac{\pi}{r}}\, \dfrac{\partial^{|m|}}{\partial x^{|m|}}\, e^{-\sqrt{2 r x}}, & m = 0, -1, -2, \ldots \end{cases} \tag{21}$$
Example 1. 
Some densities $w_{m + 1/2}(x; d, r)$ for $m = r - (d + 1)/2 = -2, -1, 0, 1, 2$:
$$\begin{array}{lll} m = -2 & d = 7,\ r = 2 & w_{-3/2}(x; 7, 2) = \dfrac{4x}{15} \left( 1 + \sqrt{4x} \right) e^{-\sqrt{4x}} \\[2mm] m = -1 & d = 4,\ r = 3/2 & w_{-1/2}(x; 4, 3/2) = \dfrac{3}{4} \sqrt{3x}\, e^{-\sqrt{3x}} \\[2mm] m = 0 & d = 4,\ r = 5/2 & w_{1/2}(x; 4, 5/2) = \dfrac{1}{12}\, 25 x\, e^{-\sqrt{5x}} \\[2mm] m = 0 & d = 3,\ r = 2 & w_{1/2}(x; 3, 2) = \sqrt{4x}\, e^{-\sqrt{4x}} \\[2mm] m = 1 & d = 3,\ r = 3 & w_{3/2}(x; 3, 3) = \dfrac{3}{8} \left( 6x + \sqrt{6x} \right) e^{-\sqrt{6x}} \\[2mm] m = 2 & d = 3,\ r = 4 & w_{5/2}(x; 3, 4) = \dfrac{1}{12} \left( (8x)^{3/2} + 24x + 3\sqrt{8x} \right) e^{-\sqrt{8x}}. \end{array}$$
Remark 7. 
If $m = r - (d + 1)/2$ is an integer, the distribution functions $W_{m + 1/2}(x; d, r)$ of the densities $w_{m + 1/2}(x; d, r)$ can also be calculated explicitly by substitution and integration by parts.
Example 2. 
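Distribution functions $W_\lambda(x; d, r)$ for given densities $w_\lambda(x; d, r)$ with $\lambda = \pm 1/2$:
$$w_{-1/2}\left(x; 4, \tfrac{3}{2}\right) = \frac{3}{4} \sqrt{3x}\, e^{-\sqrt{3x}} \quad \text{and} \quad W_{-1/2}\left(x; 4, \tfrac{3}{2}\right) = 1 - \frac{1}{2} \left( 2\sqrt{3x} + 3x + 2 \right) e^{-\sqrt{3x}}, \tag{22}$$
$$w_{1/2}\left(x; 4, \tfrac{5}{2}\right) = \frac{25 x}{12}\, e^{-\sqrt{5x}} \quad \text{and} \quad W_{1/2}\left(x; 4, \tfrac{5}{2}\right) = 1 - \left( \frac{(5x)^{3/2}}{6} + \frac{5x}{2} + \sqrt{5x} + 1 \right) e^{-\sqrt{5x}}, \tag{23}$$
$$w_{1/2}(x; 3, 2) = \sqrt{4x}\, e^{-\sqrt{4x}} \quad \text{and} \quad W_{1/2}(x; 3, 2) = 1 - \left( 2x + 2\sqrt{x} + 1 \right) e^{-\sqrt{4x}}. \tag{24}$$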
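The density and distribution function pairs of Example 2 can be cross-checked by numerical differentiation. The sketch below assumes the closed forms $w_{-1/2}(x;4,3/2) = \tfrac34\sqrt{3x}\,e^{-\sqrt{3x}}$, $w_{1/2}(x;4,5/2) = \tfrac{25x}{12}e^{-\sqrt{5x}}$, $w_{1/2}(x;3,2) = \sqrt{4x}\,e^{-\sqrt{4x}}$ and the distribution functions obtained from them by the substitution $u = \sqrt{2rx}$; the names in the code are chosen for this illustration only.

```python
import math


def w_a(x):  # density for d = 4, r = 3/2 (order -1/2)
    return 0.75 * math.sqrt(3.0 * x) * math.exp(-math.sqrt(3.0 * x))


def cdf_a(x):
    u = math.sqrt(3.0 * x)
    return 1.0 - 0.5 * (2.0 * u + u * u + 2.0) * math.exp(-u)


def w_b(x):  # density for d = 4, r = 5/2 (order 1/2)
    return 25.0 * x / 12.0 * math.exp(-math.sqrt(5.0 * x))


def cdf_b(x):
    u = math.sqrt(5.0 * x)
    return 1.0 - (u ** 3 / 6.0 + u * u / 2.0 + u + 1.0) * math.exp(-u)


def w_c(x):  # density for d = 3, r = 2 (order 1/2)
    return math.sqrt(4.0 * x) * math.exp(-math.sqrt(4.0 * x))


def cdf_c(x):
    return 1.0 - (2.0 * x + 2.0 * math.sqrt(x) + 1.0) * math.exp(-math.sqrt(4.0 * x))


def max_derivative_gap(density, cdf, points=(0.2, 1.0, 4.0), h=1e-6):
    """Largest gap between the density and a central difference of the cdf."""
    return max(abs((cdf(x + h) - cdf(x - h)) / (2.0 * h) - density(x))
               for x in points)


if __name__ == "__main__":
    for density, cdf in ((w_a, cdf_a), (w_b, cdf_b), (w_c, cdf_c)):
        print(max_derivative_gap(density, cdf), cdf(1e-12), cdf(1e4))
```

Each distribution function starts at 0, tends to 1, and differentiates back to its density up to the finite-difference error.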
Remark 8. 
The generalized gamma distribution $G(x; \beta, \alpha, \lambda)$ has two shape parameters $\alpha$ and $\beta$, a scale parameter $\lambda$, and the density
$$g(x; \beta, \alpha, \lambda) = \frac{|\alpha|\, \lambda^{\beta}}{\Gamma(\beta)}\, x^{\alpha\beta - 1} e^{-\lambda x^{\alpha}}, \quad x \ge 0, \quad |\alpha| > 0, \quad \beta > 0, \quad \lambda > 0. \tag{25}$$
The density (25) is given in Korolev and Zeifman [36] and Korolev and Gorshenin [37] and summarizes many known densities. Generalized gamma distributions are defined in many different ways, but they do not correspond to the ones that occur above.
Remark 9. 
The densities $w_{m + 1/2}(x; d, r)$ with integer $m = r - (d + 1)/2$ are generalized gamma densities $g(x; \beta, \alpha, \lambda)$ given in formula (25) or may be represented as linear combinations of such densities. The parameters $\alpha = 1/2$ and $\lambda = \sqrt{2r}$ apply in all densities $g(x; \beta, \alpha, \lambda)$. The parameter $\beta$ also depends on the number of derivatives $m = r - (d + 1)/2$ in the densities (21).
Example 3. 
Some linear combinations of generalized gamma densities:
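$$\begin{aligned} w_{1/2}(x; 3, 2) &= g\left(x; 3, 1/2, \sqrt{4}\right), \\ w_{3/2}(x; 3, 3) &= \frac{3}{4}\, g\left(x; 4, 1/2, \sqrt{6}\right) + \frac{1}{4}\, g\left(x; 3, 1/2, \sqrt{6}\right), \\ w_{5/2}(x; 3, 4) &= \frac{1}{2}\, g\left(x; 5, 1/2, \sqrt{8}\right) + \frac{3}{8}\, g\left(x; 4, 1/2, \sqrt{8}\right) + \frac{1}{8}\, g\left(x; 3, 1/2, \sqrt{8}\right). \end{aligned}$$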
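The linear combinations in Example 3 can be verified pointwise. This is a small sketch assuming the generalized gamma density $g(x;\beta,\alpha,\lambda) = \frac{|\alpha| \lambda^\beta}{\Gamma(\beta)} x^{\alpha\beta-1} e^{-\lambda x^\alpha}$ with $\alpha = 1/2$ and $\lambda = \sqrt{2r}$, together with the closed-form densities for $d = 3$ and $r = 2, 3, 4$; the function names are illustrative.

```python
import math


def gen_gamma_density(x: float, beta: float, alpha: float, lam: float) -> float:
    """Generalized gamma density g(x; beta, alpha, lam) for x > 0."""
    return (abs(alpha) * lam ** beta / math.gamma(beta)
            * x ** (alpha * beta - 1.0) * math.exp(-lam * x ** alpha))


def w_1_2(x):  # d = 3, r = 2
    return math.sqrt(4.0 * x) * math.exp(-math.sqrt(4.0 * x))


def w_3_2(x):  # d = 3, r = 3
    return 0.375 * (6.0 * x + math.sqrt(6.0 * x)) * math.exp(-math.sqrt(6.0 * x))


def w_5_2(x):  # d = 3, r = 4
    u = math.sqrt(8.0 * x)
    return (u ** 3 + 24.0 * x + 3.0 * u) * math.exp(-u) / 12.0


def combo_1_2(x):
    return gen_gamma_density(x, 3, 0.5, math.sqrt(4.0))


def combo_3_2(x):
    lam = math.sqrt(6.0)
    return 0.75 * gen_gamma_density(x, 4, 0.5, lam) + 0.25 * gen_gamma_density(x, 3, 0.5, lam)


def combo_5_2(x):
    lam = math.sqrt(8.0)
    return (0.5 * gen_gamma_density(x, 5, 0.5, lam)
            + 0.375 * gen_gamma_density(x, 4, 0.5, lam)
            + 0.125 * gen_gamma_density(x, 3, 0.5, lam))


if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        print(x, w_1_2(x) - combo_1_2(x), w_3_2(x) - combo_3_2(x), w_5_2(x) - combo_5_2(x))
```

Since the mixture weights in each row sum to one, every combination is again a probability density.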

5. Main Results

Inequalities for approximations of the distributions $\mathbb{P}\left( g_n^{\gamma^*} N_n^{\gamma - \gamma^*} T_{N_n} \le x \right)$ of the scaled statistics for $\gamma^* \in \{0, \pm 1/2, \pm 1\}$ will be presented. Here, $\gamma = 1/2$ and $\gamma^* \in \{0, \pm 1/2\}$ when the statistic $T_m$ is asymptotically normally distributed, or $\gamma = 1$ and $\gamma^* \in \{0, \pm 1\}$ when the normalized $T_m$ has a chi-squared limit distribution.

5.1. Asymptotically Normal Statistics $T_m$ and Pareto-like Sample Sizes $N_n(s)$

Let the asymptotically normal statistic $T_m$ satisfy inequality (7) with coefficients $p_k$ and the rate of convergence $a > 0$. The Pareto-like sample size $N_n = N_n(s)$, $s > 0$, is given in (9) and fulfills the inequality (10). For the scaling factors, select $\gamma = 1/2$ and $\gamma^* \in \{0, \pm 1/2\}$ in formula (18).
Theorem 1. 
Under the conditions given above, the following approximations apply:
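i:
Let $\gamma^* = 1/2$. The non-random scaling factor $\sqrt{n}$ for the statistic $T_{N_n(s)}$ leads to approximations by the Laplace distribution $L_{1/\sqrt{s}}(x)$ with the density $l_{1/\sqrt{s}}(x)$ stated in (19) for $\gamma^* = 1/2$:
$$\sup_x \left| \mathbb{P}\left( \sqrt{n}\, T_{N_n(s)} \le x \right) - L_{1/\sqrt{s}; n}(x) \right| \le C_s\, n^{-\min\{a, 2\}},$$
where $a > 0$ is the rate of convergence in (7) and
$$L_{1/\sqrt{s}; n}(x) = L_{1/\sqrt{s}}(x) + l_{1/\sqrt{s}}(x) \left( \frac{I_{\{a > 1/2\}}(a)}{\sqrt{n}} \left( p_2 x^2 + p_0 \left( \frac{|x|}{\sqrt{2s}} + \frac{1}{2s} \right) \right) + \frac{I_{\{a > 1\}}(a)}{n} \left( p_5 x^3 |x| \sqrt{2s} + p_3 x^3 + \left( p_1 + \frac{1 - s}{4} \right) x \left( \frac{|x|}{\sqrt{2s}} + \frac{1}{2s} \right) \right) \right).$$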
ii:
Let $\gamma^* = 0$. The random scaling factor $\sqrt{N_n(s)}$ for $T_{N_n(s)}$ leads to the normal approximation $\Phi(x)$:
$$\sup_x \left| \mathbb{P}\left( \sqrt{N_n(s)}\, T_{N_n(s)} \le x \right) - \Phi(x) - \varphi_{n,2}(x) \right| \le C_s\, n^{-\min\{a, 2\}},$$
where $a > 0$ is the rate of convergence in (7) and
$$\varphi_{n,2}(x) = \varphi(x) \left( \frac{\sqrt{\pi}\, (p_0 + p_2 x^2)}{2 \sqrt{s n}}\, I_{\{a > 1/2\}}(a) + \frac{p_1 x + p_3 x^3 + p_5 x^5}{s n}\, I_{\{a > 1\}}(a) \right).$$
iii:
Let $\gamma^* = -1/2$. The mixed scaling factor $n^{-1/2} N_n(s)$ for $T_{N_n(s)}$ results in the scaled Student's t-distribution $S_2(x; \sqrt{s})$ with density $s_2(x; \sqrt{s})$ given in (19) for $\gamma^* = -1/2$:
$$\sup_x \left| \mathbb{P}\left( n^{-1/2} N_n(s)\, T_{N_n(s)} \le x \right) - S_{n;2}(x; \sqrt{s}) \right| \le C_s\, n^{-\min\{a, 2\}},$$
where $a > 0$ is the rate of convergence in (7) and
$$S_{n;2}(x; \sqrt{s}) = S_2(x; \sqrt{s}) + s_2(x; \sqrt{s}) \left( \frac{I_{\{a > 1/2\}}(a)}{\sqrt{n}} \left( p_0 + \frac{3 p_2 x^2}{x^2 + 2s} \right) + \frac{I_{\{a > 1\}}(a)}{n} \left( \frac{3 p_1 x}{x^2 + 2s} + \frac{15 p_3 x^3}{(x^2 + 2s)^2} + \frac{105 p_5 x^5}{(x^2 + 2s)^3} + \frac{3 (s - 1) x}{4 (x^2 + 2s)} \right) \right).$$
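To make the structure of such approximations concrete, the random-scaling case (ii) can be written as a small function. This is a sketch under assumptions, not the authors' code: it takes the correction term to be $\varphi(x)\left(\frac{\sqrt{\pi}(p_0 + p_2 x^2)}{2\sqrt{sn}} I_{\{a>1/2\}} + \frac{p_1 x + p_3 x^3 + p_5 x^5}{sn} I_{\{a>1\}}\right)$, where the factors $\sqrt{\pi}/(2\sqrt{s})$ and $1/s$ are the moments $\mathbb{E}\,W(s)^{-1/2}$ and $\mathbb{E}\,W(s)^{-1}$ of the limit law $e^{-s/y}$. Function and argument names are hypothetical.

```python
import math


def std_normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def phi_n2(x: float, n: int, s: float, a: float,
           p0: float = 0.0, p1: float = 0.0, p2: float = 0.0,
           p3: float = 0.0, p5: float = 0.0) -> float:
    """Second-order correction for the random scaling factor sqrt(N_n(s))."""
    first = 0.0
    second = 0.0
    if a > 0.5:  # indicator I_{a > 1/2}
        first = math.sqrt(math.pi) * (p0 + p2 * x * x) / (2.0 * math.sqrt(s * n))
    if a > 1.0:  # indicator I_{a > 1}
        second = (p1 * x + p3 * x ** 3 + p5 * x ** 5) / (s * n)
    return std_normal_pdf(x) * (first + second)


def approx_random_scaling(x: float, n: int, s: float, a: float, **coeffs) -> float:
    """Approximation Phi(x) + phi_{n,2}(x) of P(sqrt(N_n(s)) T_{N_n(s)} <= x)."""
    return std_normal_cdf(x) + phi_n2(x, n, s, a, **coeffs)


if __name__ == "__main__":
    lam3 = 0.6  # hypothetical skewness value used only for illustration
    print(approx_random_scaling(0.8, n=40, s=1.5, a=1.0, p0=lam3 / 6.0, p2=lam3 / 3.0))
```

With $p_0 = \lambda_3/6$, $p_2 = \lambda_3/3$ and $a = 1$, the correction reduces to $\varphi(x)\sqrt{\pi}(\lambda_3 + 2\lambda_3 x^2)/(12\sqrt{sn})$, which is the form used for the Student's t-statistic with random scaling.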
As applications of Theorem 1, we now examine the Student t-distribution, the Student t-test statistic, and the sample mean as asymptotically normal statistics $T_m$ considered in Christoph and Ulyanov [18] (Section 3.1 and Corollary 1) for the case of negative binomial sample sizes $N_n = N_n(r)$.
Corollary 1. 
Let the conditions of Theorem 1 be satisfied:
i:
Let $\gamma^* = 1/2$. In the case of the Student's t-statistic $T_m = Z / \sqrt{\chi_m^2}$ with $m$ degrees of freedom estimated in [18] (Formula (18)), inequality (7) is valid with $p_0 = p_2 = p_5 = 0$, $p_1 = p_3 = -1/4$ and $a = 2$. The non-random scaling factor $\sqrt{n}$ and Pareto-like sample sizes $N_n(s)$ lead to:
$$\sup_x \left| \mathbb{P}\left( \sqrt{n}\, \frac{Z}{\sqrt{\chi^2_{N_n(s)}}} \le x \right) - L_{1/\sqrt{s}}(x) + \frac{l_{1/\sqrt{s}}(x)}{8 n} \left( 2 x^3 + x \left( 1 + |x| \sqrt{2s} \right) \right) \right| \le C_s\, n^{-2}.$$
ii:
Let $\gamma^* = 0$. Let $T_m = (\bar{X}_m - \mu) / \hat{\sigma}_m$ be the Student's t-statistic with sample mean $\bar{X}_m$ and sample variance $\hat{\sigma}_m^2$, which was considered in [18] (Formulas (21) and (20)). The first order approximation (7) with $p_0 = \lambda_3 / 6$, $p_2 = \lambda_3 / 3$, $a = 1$, the Pareto-like random sample sizes $N_n(s)$, and the random scaling factor $\sqrt{N_n(s)}$ result in:
$$\sup_x \left| \mathbb{P}\left( \sqrt{N_n(s)}\, T_{N_n(s)} \le x \right) - \Phi(x) - \varphi(x)\, \frac{\sqrt{\pi}\, (\lambda_3 + 2 \lambda_3 x^2)}{12 \sqrt{s n}} \right| \le C_s\, n^{-1}.$$
iii:
Let $\gamma^* = -1/2$. Considering the sample mean $T_m = \bar{X}_m$ estimated in [18] (Formulas (15) and (16)), one has (7) with $p_0 = -p_2 = \lambda_3 / 6$, $p_1 = \lambda_4 / 8 - 5 \lambda_3^2 / 24$, $p_3 = -\lambda_4 / 24 + 5 \lambda_3^2 / 36$, $p_5 = -\lambda_3^2 / 72$, and $a = 3/2$. With Pareto-like random sample sizes $N_n(s)$ and the mixed scaling factor $n^{-1/2} N_n(s)$, we then have
$$\sup_x \left| \mathbb{P}\left( n^{-1/2} N_n(s)\, T_{N_n(s)} \le x \right) - S_2(x; \sqrt{s}) - s_{n;2}(x; \sqrt{s}) \right| \le C_s\, n^{-3/2},$$
with
$$s_{n;2}(x; \sqrt{s}) = s_2(x; \sqrt{s}) \left( \frac{1}{\sqrt{n}} \left( \frac{\lambda_3}{6} - \frac{\lambda_3 x^2}{2 (x^2 + 2s)} \right) + \frac{1}{n} \left( \frac{(3 \lambda_4 - 5 \lambda_3^2)\, x}{8 (x^2 + 2s)} - \frac{5 (3 \lambda_4 - 10 \lambda_3^2)\, x^3}{24 (x^2 + 2s)^2} - \frac{35 \lambda_3^2 x^5}{24 (x^2 + 2s)^3} + \frac{3 (s - 1) x}{4 (x^2 + 2s)} \right) \right).$$

5.2. Asymptotically Chi-Squared Distributed $T_m$ with Negative Binomially Distributed Sample Sizes $N_n(r)$

Let the asymptotically chi-squared distributed statistics $T_m$ satisfy inequality (8) with coefficients $q_1$, $q_2$ and the rate of convergence $a = 2$. The negative binomially distributed sample sizes $N_n = N_n(r)$ with parameter $r > 0$ and success probability $1/n$ are given in (13) and fulfill the inequality (15). For the scaling factors, choose $\gamma = 1$ and $\gamma^* \in \{0, \pm 1\}$ in formula (18).
Theorem 2. 
Under the conditions given above, the following approximations apply.
i:
Let $\gamma^* = 1$. The non-random scaling factor $g_n = \mathbb{E}\, N_n(r) = r(n - 1) + 1$ for the statistics $T_{N_n(r)}$ leads to approximations by the scaled F-distribution $F^*(x; d, 2r) = F(x/d; d, 2r)$ having parameters $d \in \mathbb{N}_+$ and $r > 0$ and density $f^*(x; d, 2r) = \frac{1}{d} f\left( \frac{x}{d}; d, 2r \right)$ given in (20) with $\gamma^* = 1$:
$$\sup_x \left| \mathbb{P}\left( g_n\, T_{N_n(r)} \le x \right) - F^*(x; d, 2r) - f_n^*(x; d, 2r) \right| \le C_r \begin{cases} n^{-\min\{r, 2\}}, & r \ne 2, \\ n^{-2} \ln n, & r = 2, \end{cases}$$
where
$$f_n^*(x; d, 2r) = \frac{f^*(x; d, 2r)}{g_n}\, I_{\{r > 1\}}(r) \left( \left( q_1 + \frac{2 - r}{2} \right) \frac{x (2r + x)}{2r + d - 2} + q_2 x^2 + \frac{x (r - 2)}{2} \right).$$
ii:
For $\gamma^* = 0$ and the random scaling factor $N_n(r)$ for $T_{N_n(r)}$, the limit distribution $G_d(x)$ does not change:
$$\sup_x \left| \mathbb{P}\left( N_n(r)\, T_{N_n(r)} \le x \right) - G_d(x; n) \right| \le C_r \begin{cases} n^{-\min\{r, 2\}}, & r \ne 2, \\ n^{-2} \ln n, & r = 2, \end{cases}$$
where
$$G_d(x; n) = G_d(x) + \frac{g_d(x)}{g_n}\, I_{\{r > 1\}}(r)\, \frac{(q_1 x + q_2 x^2)\, r}{r - 1}.$$
iii:
Let $\gamma = -1$ and $r \ge 2$. The mixed scaling factor $g_n^{-1} N_n^2(r)$ at $T_{N_n(r)}$ results in a gamma distribution of generalized type $W_{r - d/2}(x; d, r)$ with density $w_{r - d/2}(x; d, r)$ given in (20) for $\gamma = -1$:
$$\sup_x \left| P\left( \frac{N_n^2(r)}{g_n}\, T_{N_n(r)} \le x \right) - W_{r - d/2;\, n}(x; d, r) \right| \le C_r \begin{cases} n^{-2}, & r > 2, \\ n^{-2} \ln n, & r = 2, \end{cases}$$
where
$$W_{r - d/2;\, n}(x; d, r) = W_{r - d/2}(x; d, r) + \frac{w_{r - d/2}(x; d, r)}{g_n}\, I_{\{r > 1\}}(r) \left( 2 q_2 r x + \frac{(r - 2)\, x}{2} + \frac{\sqrt{2 r x}}{2} \Big( 2 q_1 + 2 q_2 (d + 2 - 2r) + 2 - r \Big) \frac{K_{r - d/2 - 1}(\sqrt{2 r x})}{K_{r - d/2}(\sqrt{2 r x})} \right).$$
The restriction $r \ge 2$ in Theorem 2(iii) is of a purely proof-technical character. In Proposition 4, a result is shown for $r = 3/2$.
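The scaling factor $g_n = E N_n(r) = r(n-1) + 1$ used throughout Theorem 2 can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes that (13), which is not reproduced in this section, is the negative binomial law shifted by 1, i.e. $P(N_n(r) = j) = \frac{\Gamma(j + r - 1)}{(j-1)!\, \Gamma(r)}\, n^{-r} (1 - 1/n)^{j-1}$ for $j = 1, 2, \dots$; the helper names `nb_pmf` and `nb_mean` are hypothetical.

```python
import math

def nb_pmf(j, r, n):
    """P(N_n(r) = j), j = 1, 2, ..., for the shifted negative binomial law
    (assumed form of (13)); computed on the log scale for stability."""
    log_p = (math.lgamma(j + r - 1) - math.lgamma(j) - math.lgamma(r)
             - r * math.log(n) + (j - 1) * math.log1p(-1.0 / n))
    return math.exp(log_p)

def nb_mean(r, n, tol=1e-16, max_terms=1_000_000):
    """Return (total probability, mean) by direct summation of the pmf.
    Requires n >= 2; summation stops once the terms j * p are negligible."""
    total_prob = 0.0
    mean = 0.0
    for j in range(1, max_terms + 1):
        p = nb_pmf(j, r, n)
        total_prob += p
        mean += j * p
        if j > r * n and j * p < tol:
            break
    return total_prob, mean

r, n = 2.5, 10
total_prob, mean = nb_mean(r, n)
print(total_prob)                 # close to 1
print(mean, r * (n - 1) + 1)      # both close to g_n = r(n - 1) + 1
```

For $r = 2$, $r = 5/2$ and $r = 3/2$ the same formula gives $g_n = 2n - 1$, $(5n - 3)/2$ and $(3n - 1)/2$, the scaling constants that appear in the examples of this section.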
Remark 10. 
The function $R(u; d, r) = K_{\lambda - 1}(u) / K_{\lambda}(u)$ can be calculated explicitly for $\lambda = m + 1/2$ with integer $m = r - (d + 1)/2$. Then, for example, $R(\sqrt{3x}; 4, 3/2) = 1 + 1/\sqrt{3x}$ and $R(\sqrt{4x}; 3, 2) = 1$.
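The two values quoted in Remark 10 can be reproduced numerically. The sketch below is illustrative only: it takes $\lambda = r - d/2$ (inferred from the indices of the Macdonald functions in Theorem 2(iii)) and evaluates $K_\nu(u)$ through the standard integral representation $K_\nu(u) = \int_0^\infty e^{-u \cosh t} \cosh(\nu t)\, dt$ with a trapezoidal rule. The function names `bessel_k` and `ratio_r` are hypothetical.

```python
import math

def bessel_k(nu, u, upper=12.0, steps=24000):
    """Macdonald function K_nu(u) for u > 0 via
    K_nu(u) = int_0^inf exp(-u*cosh(t)) * cosh(nu*t) dt (trapezoidal rule)."""
    h = upper / steps
    total = 0.5 * math.exp(-u)  # t = 0 endpoint; the far endpoint is negligible
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-u * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def ratio_r(u, d, r):
    """R(u; d, r) = K_{lam-1}(u) / K_lam(u) with lam = r - d/2 (assumed)."""
    lam = r - d / 2.0
    return bessel_k(lam - 1.0, u) / bessel_k(lam, u)

x = 2.0
# d = 4, r = 3/2: lam = -1/2, argument sqrt(2 r x) = sqrt(3 x)
print(ratio_r(math.sqrt(3.0 * x), 4, 1.5), 1.0 + 1.0 / math.sqrt(3.0 * x))
# d = 3, r = 2: lam = 1/2, argument sqrt(2 r x) = sqrt(4 x)
print(ratio_r(math.sqrt(4.0 * x), 3, 2.0), 1.0)
```

The agreement reflects the closed forms $K_{1/2}(u) = \sqrt{\pi/(2u)}\, e^{-u}$ and $K_{3/2}(u) = K_{1/2}(u)(1 + 1/u)$ together with $K_{-\nu} = K_{\nu}$.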
Example 4. 
Let $\gamma = -1$ in (20), $r = 2$ and $d = 3$. Then, for an asymptotically chi-squared distributed test variable $T_m$ satisfying (8), with scale factor $N_n^2(2)/(2n - 1)$, the following estimate holds:
sup x > 0 P N n 2 ( 2 ) 2 n 1 T N n ( 2 ) x W 1 / 2 ( x ; 3 , 2 ) + w 1 / 2 ( x ; 3 , 2 ) 4 ( 2 n 1 ) 4 x ( q 2 4 x + q 1 + q 2 ) C 2 ln n n 2 ,
where $W_{1/2}(x; 3, 2)$ and $w_{1/2}(x; 3, 2)$ are specified in (24).
As applications of Theorem 2, we now examine Hotelling's $T_0^2$ statistic and normalized quotients of two independent chi-square distributed random variables as asymptotically chi-square distributed statistics. They were considered in Christoph and Ulyanov [18] (Section 3.2 and Corollary 2), where the sample sizes $N_n = N_n(s)$ had a Pareto-like distribution.
Corollary 2. 
Let the conditions of Theorem 2 be fulfilled:
i:
Let $\gamma = 1$. Consider Hotelling's generalized $T_0^2$-statistic $T_0^2 = T_m = \operatorname{tr}(S_q S_m^{-1})$ with independently distributed random matrices $S_q$ and $S_m$ having Wishart distributions $W_p(q, I_p)$ and $W_p(m, I_p)$, respectively. Then, inequality (8) holds with limit distribution $G_d(x)$, $d = pq$, $q_1 = -(p + 1 - q)/2$ and $q_2 = -(p + 1 + q)/(2d + 4)$. The non-random scaling factor $g_n = E N_n(r)$ at $T_{N_n(r)}$ leads to
$$\sup_x \left| P\left( g_n T_{N_n(r)} \le x \right) - F^*(x; pq, 2r) - f^*_n(x; pq, 2r) \right| \le C_r \begin{cases} n^{-\min\{r, 2\}}, & r \ne 2, \\ n^{-2} \ln n, & r = 2, \end{cases} \tag{27}$$
where the scaled F-distribution $F^*(x; pq, 2r)$ with density $f^*(x; pq, 2r)$ is given in (20) for $\gamma = 1$ and
$$f^*_n(x; pq, 2r) = \frac{f^*(x; pq, 2r)}{g_n}\, I_{\{r > 1\}}(r) \left( \left( -\frac{p + 1 - q}{2} - \frac{2 - r}{2} \right) \frac{x (2r + x)}{2r + pq - 2} - \frac{(p + 1 + q)\, x^2}{2pq + 4} + \frac{x (2 - r)}{2} \right). \tag{28}$$
ii:
Let $\gamma = 0$, let $\chi_d^2$ and $\chi_m^2$ be independent and let $T_m = \chi_d^2 / \chi_m^2$ be scale mixtures satisfying inequality (8) with coefficients $q_1 = (d - 2)/2$ and $q_2 = -1/2$. Random degrees of freedom $N_n(r)$ instead of $m$ and random scaling factor $N_n(r)$ lead to
$$\sup_{x > 0} \left| P\left( N_n(r)\, T_{N_n(r)} \le x \right) - G_d(x; n) \right| \le C_r \begin{cases} n^{-\min\{r, 2\}}, & r \ne 2, \\ n^{-2} \ln n, & r = 2, \end{cases}$$
where
$$G_d(x; n) = G_d(x) + \frac{g_d(x)}{2 g_n}\, I_{\{r > 1\}}(r)\, \left( (d - 2)\, x - x^2 \right) \frac{r}{r - 1}.$$
iii:
Let $\gamma = -1$. The statistics $T_m = \chi_4^2 / \chi_m^2$ satisfy the inequality (8) with the limiting distribution $G_4(x)$ and the coefficients $q_1 = 1$ and $q_2 = -1/2$. The mixed scaling factor $g_n^{-1} N_n^2(r)$ at $T_{N_n(r)}$ results in a limiting gamma distribution of generalized type $W_{r - d/2}(x; d, r)$. Only if $r - (d + 1)/2 = m$ is an integer can the involved Macdonald functions $K_{r - d/2}(\sqrt{2 r x})$ be calculated explicitly. Since $d = 4$, we choose $r = 5/2$ and find $r - (d + 1)/2 = 0$. Then, uniformly in $x > 0$:
$$\left| P\left( \frac{N_n^2(5/2)}{(5n - 3)/2}\, \frac{\chi_4^2}{\chi^2_{N_n(5/2)}} \le x \right) - W_{1/2}(x; 4, 5/2) + \frac{w_{1/2}(x; 4, 5/2)}{2 (5n - 3)} \left( 9x - \sqrt{5x} \right) \right| \le C_{3/2}\, n^{-3/2},$$
where $W_{1/2}(x; 4, 5/2)$ and $w_{1/2}(x; 4, 5/2)$ are specified in (23).
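The coefficients $q_1 = (d - 2)/2$ and $q_2 = -1/2$ for the quotient $\chi_d^2 / \chi_m^2$ can be motivated by a formal second-order Taylor expansion. The following is a heuristic sketch only, not the proof given in [18]. It assumes that, by independence, $P(m T_m \le x) = E\, G_d(x Y_m)$ with $Y_m = \chi_m^2 / m$, and it uses $E\, Y_m = 1$ and $E (Y_m - 1)^2 = 2/m$.

```latex
\begin{align*}
P(m T_m \le x) &= E\, G_d(x Y_m), \qquad Y_m = \chi_m^2 / m, \\
G_d(x Y_m) &\approx G_d(x) + x\, g_d(x)\, (Y_m - 1) + \tfrac{1}{2}\, x^2 g_d'(x)\, (Y_m - 1)^2, \\
g_d'(x) &= g_d(x) \left( \frac{d - 2}{2x} - \frac{1}{2} \right), \\
E\, G_d(x Y_m) &\approx G_d(x) + \frac{x^2 g_d'(x)}{m}
  = G_d(x) + \frac{1}{m} \left( \frac{d - 2}{2}\, x - \frac{1}{2}\, x^2 \right) g_d(x).
\end{align*}
```

Comparing the last line with the form $G_d(x) + m^{-1} (q_1 x + q_2 x^2)\, g_d(x)$ gives the stated values of $q_1$ and $q_2$; for $d = 4$ this reduces to $q_1 = 1$, $q_2 = -1/2$.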
Remark 11. 
In Monakhov [38], an estimate analogous to (27) is shown, but with 11 approximation terms in the corresponding formula (28). Instead of (8) with $q_1 = -(p + 1 - q)/2$, $q_2 = -(p + 1 + q)/(2d + 4)$ and $d = pq$, the following equivalent inequality is used; see Fujikoshi et al. [39] (Theorem 4.1(ii)):
$$\sup_x \left| P\left( m \operatorname{tr}(S_q S_m^{-1}) \le x \right) - G_d(x) - \frac{d}{4m} \Big( a_0 G_d(x) + a_1 G_{d+2}(x) + a_2 G_{d+4}(x) \Big) \right| \le C\, m^{-2},$$
where $a_0 = q - p - 1$, $a_1 = -2q$, $a_2 = q + p + 1$ with $a_0 + a_1 + a_2 = 0$ and $d = pq$.
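The equivalence of the two forms mentioned in Remark 11 can be made explicit. The sketch below relies only on the standard chi-squared identities $g_{d+2}(x) = (x/d)\, g_d(x)$ and $G_{d+2}(x) = G_d(x) - 2 g_{d+2}(x)$, and takes $a_0 = q - p - 1$, $a_1 = -2q$, $a_2 = q + p + 1$ (so that the coefficients sum to zero).

```latex
\begin{align*}
G_{d+2}(x) &= G_d(x) - \frac{2x}{d}\, g_d(x), \qquad
G_{d+4}(x) = G_d(x) - \frac{2x}{d}\, g_d(x) - \frac{2x^2}{d (d + 2)}\, g_d(x), \\
\frac{d}{4m} \sum_{j=0}^{2} a_j\, G_{d+2j}(x)
  &= \frac{d}{4m} \left( (a_0 + a_1 + a_2)\, G_d(x) - (a_1 + a_2)\, \frac{2x}{d}\, g_d(x)
     - a_2\, \frac{2x^2}{d (d + 2)}\, g_d(x) \right) \\
  &= \frac{1}{m} \left( \frac{q - p - 1}{2}\, x - \frac{q + p + 1}{2d + 4}\, x^2 \right) g_d(x).
\end{align*}
```

The last line has the form $m^{-1} (q_1 x + q_2 x^2)\, g_d(x)$. For $p = 1$, where $d = q$, the coefficients reduce to $(d - 2)/2$ and $-1/2$, in agreement with the chi-squared quotient in Corollary 2(ii).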
Proposition 4. 
Let $\gamma = -1$. Consider the statistics $T_m = \chi_4^2 / \chi_m^2$, satisfying the inequality (8) with the limiting distribution $G_4(x)$, the coefficients $q_1 = 1$ and $q_2 = -1/2$ and the mixed scaling factor $g_n^{-1} N_n^2(r)$ at $T_{N_n(r)}$. If $r = 3/2$ and $d = 4$, then $r - (d + 1)/2 = -1$, $g_n = (3n - 1)/2$ and, uniformly in $x > 0$:
$$\left| P\left( \frac{N_n^2(3/2)}{(3n - 1)/2}\, \frac{\chi_4^2}{\chi^2_{N_n(3/2)}} \le x \right) - W_{-1/2}(x; 4, 3/2) + \frac{w_{-1/2}(x; 4, 3/2)}{2 (3n - 1)} \left( 7x + \sqrt{3x} + 1 \right) \right| \le C_{3/2}\, n^{-3/2},$$
where $W_{-1/2}(x; 4, 3/2)$ and $w_{-1/2}(x; 4, 3/2)$ are specified in (22).

6. Proofs

For the proofs of Theorems 1 and 2, we use Proposition 1. The statistics $T_m$ and the sample size $N_n$ are either asymptotically normally and discretely Pareto-like distributed (i.e., $F = \Phi$ and $H = W_s$) or asymptotically chi-squared and negatively binomially distributed (i.e., $F = G_d$ and $H = G_{r,r}$). In both cases, the quantity $D_n$ defined in (6) is uniformly bounded for all $n \in \mathbb{N}_+$; see Christoph and Ulyanov [18] (Lemma A1). Next, the bounds that are required in (4) for the negative moments of the sample sizes, $E (N_n(s))^{-a}$ and $E (N_n(r))^{-a}$, are provided by (12) and (17). Furthermore, it follows from Christoph and Ulyanov [18] (Proposition 2 and Lemma A2) that in both cases the domain of integration of the integrals in the function $G_n(x, 1/g_n)$ defined in (5) can be extended from $(1/g_n, \infty)$ to $(0, \infty)$:
$$\sup_x \left| G_n(x, 1/g_n) - G_{n,2}(x) \right| \le C\, g_n^{-b},$$
where $b = 2$ if $F = \Phi$ and $H = W_s$, or $b = \min\{r, 2\}$ if $F = G_d$ and $H = G_{r,r}$, respectively, and
$$G_{n,2}(x) = \begin{cases} \displaystyle \int_0^\infty F(x y^\gamma)\, dH(y), & \text{for } 0 < b \le 1/2, \\[2mm] \displaystyle \int_0^\infty \left( F(x y^\gamma) + \frac{f_1(x y^\gamma)}{\sqrt{g_n y}} \right) dH(y) =: G_{n,1}(x), & \text{for } 1/2 < b \le 1, \\[2mm] \displaystyle G_{n,1}(x) + \int_0^\infty \frac{f_2(x y^\gamma)}{g_n y}\, dH(y) + \int_0^\infty \frac{F(x y^\gamma)}{n}\, dh_2(y), & \text{for } b > 1. \end{cases} \tag{29}$$
We still have to calculate the integrals in (29) that contain $f_1$, $f_2$, and $h_2$, respectively.
Proof of Theorem 1. 
We now consider $F = \Phi$, $H = W_s$ and $\gamma \in \{0; \pm 1/2\}$. Here, $f_1(x y^\gamma) = (p_0 + p_2 x^2 y^{2\gamma})\, \varphi(x y^\gamma)$, $f_2(x y^\gamma) = (p_1 x y^\gamma + p_3 x^3 y^{3\gamma} + p_5 x^5 y^{5\gamma})\, \varphi(x y^\gamma)$ and we divide the function $h_2(y) = h_{2;s}(y)$ given in (11) into two parts: $h^*_{2;s}(y) = s (s - 1)\, e^{-s/y} / (2 y^2)$ and $h^{**}_{2;s}(y) = s\, Q_1(n y)\, y^{-2} e^{-s/y}$. The densities of the limit distributions $V_\gamma(x; d, r) = \int_0^\infty \Phi(x y^\gamma)\, dW_s(y)$ were given in (20). If $\gamma = 1/2$, to calculate the integrals in (29) involving $f_1(x \sqrt{y})$, $f_2(x \sqrt{y})$ and $h^*_{2;s}(y)$ we use Prudnikov et al. [35] (Formulas 2.3.16.2 and 2.3.16.3):
$$\begin{aligned} \int_0^\infty y^{m - 1/2}\, e^{-p y - q/y}\, dy &= (-1)^m \sqrt{\pi}\, \frac{\partial^m}{\partial p^m} \left( p^{-1/2} e^{-2 \sqrt{p q}} \right), \\ \int_0^\infty y^{-m - 1/2}\, e^{-p y - q/y}\, dy &= (-1)^m \sqrt{\frac{\pi}{p}}\, \frac{\partial^m}{\partial q^m}\, e^{-2 \sqrt{p q}}, \end{aligned} \qquad m = 0, 1, 2, \dots, \quad p, q > 0, \tag{30}$$
for $p = x^2/2 > 0$, $q = s > 0$ and $m = 0, 1, 2$, respectively. The corresponding integral with $h^{**}_{2;s}(y)$ was estimated in Christoph et al. [17] (see Proof of Theorem 5) by $c(s)\, e^{-\sqrt{\pi s n / 2}} \le C(s)\, n^{-2}$.
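In the case of $\gamma = 0$, we obtain $\int_0^\infty \Phi(x)\, dh_2(y) = \Phi(x) \left( h_2(\infty) - \lim_{y \downarrow 0} h_2(y) \right) = 0$. To calculate the integrals with $f_1(x)$ and $f_2(x)$ we use [35] (Formula 2.3.3.1) with $\alpha = 3/2, 2$ and $q = s$:
$$\int_0^\infty y^{-\alpha - 1}\, e^{-q/y}\, dy \overset{1/y = z}{=} \int_0^\infty z^{\alpha - 1}\, e^{-q z}\, dz = \Gamma(\alpha)\, q^{-\alpha}, \qquad \alpha > 0, \quad q > 0. \tag{31}$$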
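As a worked instance of (31), consider the first-order term for $\gamma = 0$. This is a sketch under two assumptions read off from the surrounding formulas rather than from restated definitions: the limit law is $W_s(y) = e^{-s/y}$, so that $dW_s(y) = s\, y^{-2} e^{-s/y}\, dy$, and the first-order term enters as $f_1(x)/\sqrt{n y}$, i.e. (29) with $g_n = n$.

```latex
\begin{align*}
\int_0^\infty \frac{1}{\sqrt{n y}}\, dW_s(y)
  &= \frac{s}{\sqrt{n}} \int_0^\infty y^{-3/2 - 1}\, e^{-s/y}\, dy
   = \frac{s}{\sqrt{n}}\, \Gamma(3/2)\, s^{-3/2}
   = \frac{\sqrt{\pi}}{2 \sqrt{s n}}, \\
f_1(x) \int_0^\infty \frac{1}{\sqrt{n y}}\, dW_s(y)
  &= \left( p_0 + p_2 x^2 \right) \varphi(x)\, \frac{\sqrt{\pi}}{2 \sqrt{s n}} .
\end{align*}
```

With $p_0 = \lambda_3/6$ and $p_2 = \lambda_3/3$ this equals $\varphi(x)\, \sqrt{\pi}\, (\lambda_3 + 2 \lambda_3 x^2) / (12 \sqrt{s n})$, the correction term in the Student t-statistic example with random scaling factor.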
If $\gamma = -1/2$, the integrals with $f_1(x/\sqrt{y})$, $f_2(x/\sqrt{y})$ and $h^*_{2;s}(y)$ are calculated using (31) with $\alpha = 3/2, 5/2, 7/2, 9/2$ and $q = s + x^2/2$. From Christoph and Ulyanov [20] (see Proof of Theorem 8), it follows that $n^{-1} \sup_x \left| \int_0^\infty \Phi(x/\sqrt{y})\, dh^{**}_{2;s}(y) \right| \le C(s)\, n^{-2}$, and Theorem 1 is proved. □
Proof of Theorem 2. 
Now, we consider the case $F(x) = G_d(x)$, $H(y) = G_{r,r}(y)$ and $\gamma \in \{0; \pm 1\}$. This combination has not yet been studied in the literature. Only for $\gamma = 1$ is there a result by Monakhov [38]; see Remark 11 above. Then, $f_1(x y^\gamma) = 0$, $f_2(x y^\gamma) = (q_1 x y^\gamma + q_2 x^2 y^{2\gamma})\, g_d(x y^\gamma)$ and we divide the function $h_2(y) = h_{2;r}(y)$ given in (16) into two parts: $h^*_{2;r}(y) = (2r)^{-1} g_{r,r}(y)\, (y - 1)(2 - r)$ and $h^{**}_{2;r}(y) = r^{-1} g_{r,r}(y)\, Q_1(g_n y)$.
For $\gamma = 1$, the density $v_1(x; d, r)$ in (20) and the integrals in (29) with $f_2(x y)$ and $h^*_{2;r}(y)$ are computed with (31) for $\alpha = r + d/2$, $r + d/2 - 1$. The integral with $h^{**}_{2;r}(y)$ is estimated in (A1) in Lemma A1. Together with the inequality $|1/g_n - 1/(r n)| \le \max\{2, r\}\, (r - 1)\, (r n)^{-2}$, we get (26).
In the case of $\gamma = 0$, we obtain $\int_0^\infty G_d(x)\, dh_2(y) = G_d(x) \left( h_{2;r}(\infty) - \lim_{y \downarrow 0} h_{2;r}(y) \right) = 0$. To calculate the integrals with $f_2(x)$, we use (31) with $\alpha = r - 1$ and $q = r$.
If $\gamma = -1$, the density $v_{-1}(x; d, r)$ in (20) and the integrals with $f_2(x/y)$ and $h^*_{2;r}(y)$ are calculated using Prudnikov et al. [35] (Formula 2.3.16.1):
$$\int_0^\infty y^{\alpha - 1}\, e^{-p y - q/y}\, dy = 2\, (q/p)^{\alpha/2}\, K_\alpha(2 \sqrt{p q}), \qquad p, q > 0,$$
with $\alpha = r - d/2$, $r - d/2 - 1$, $r - d/2 - 2$, $p = r$ and $q = x/2$. We use the order-reflection formula $K_{-\alpha}(u) = K_\alpha(u)$ and the recursion formula; see Oldham et al. [34] (Chapter 51.5):
$$K_{r - d/2 - 2}(\sqrt{2 r x}) = K_{d/2 + 2 - r}(\sqrt{2 r x}) = \frac{2 (d/2 - r + 1)}{\sqrt{2 r x}}\, K_{d/2 - r + 1}(\sqrt{2 r x}) + K_{d/2 - r}(\sqrt{2 r x}).$$
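Both identities used for $\gamma = -1$ can be checked numerically. The sketch below is illustrative and not part of the proof: it evaluates the left-hand side of Formula 2.3.16.1 after the substitution $y = e^t$, evaluates $K_\nu$ through the integral representation $K_\nu(u) = \int_0^\infty e^{-u \cosh t} \cosh(\nu t)\, dt$, and tests the three-term recursion in the form $K_{\nu+1}(u) = K_{\nu-1}(u) + (2\nu/u) K_\nu(u)$. The function names and the chosen parameter values are hypothetical.

```python
import math

def bessel_k(nu, u, upper=12.0, steps=24000):
    """Macdonald function K_nu(u), u > 0, from its integral representation
    (trapezoidal rule; the integrand is even and decays double-exponentially)."""
    h = upper / steps
    total = 0.5 * math.exp(-u)  # t = 0 endpoint
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-u * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def lhs_integral(alpha, p, q, half_width=30.0, steps=12000):
    """int_0^inf y^(alpha-1) exp(-p*y - q/y) dy, computed with y = exp(t)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(alpha * t - p * math.exp(t) - q * math.exp(-t))
    return total * h

def rhs_closed_form(alpha, p, q):
    """2 (q/p)^(alpha/2) K_alpha(2 sqrt(p q))."""
    return 2.0 * (q / p) ** (alpha / 2.0) * bessel_k(alpha, 2.0 * math.sqrt(p * q))

r, d, x = 2.5, 4, 1.2          # p = r, q = x/2 as in the proof
for shift in (0, 1, 2):        # alpha = r - d/2, r - d/2 - 1, r - d/2 - 2
    alpha = r - d / 2.0 - shift
    print(alpha, lhs_integral(alpha, r, x / 2.0), rhs_closed_form(alpha, r, x / 2.0))

nu, u = d / 2.0 - r + 1.0, math.sqrt(2.0 * r * x)
print(bessel_k(nu + 1.0, u), (2.0 * nu / u) * bessel_k(nu, u) + bessel_k(nu - 1.0, u))
```

Each printed pair should agree to many digits; the last line is the recursion from the proof with $\nu = d/2 - r + 1$ and argument $\sqrt{2 r x}$.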
The integral with $h^{**}_{2;r}(y)$ is estimated in (A4) in Lemma A2, and Theorem 2 is proved. □
Proof of Proposition 4. 
We consider $\gamma = -1$, $r = 3/2$, $d = 4$ and $g_n = (3n - 1)/2$. The integrals in (29) with $f_2(x/y)$ and $h^*_{2;r}(y)$ are calculated using (30) with $m = 1, 2, 3$, $p = r$ and $q = x/2$. The integral with $h^{**}_{2;r}(y)$ is estimated in (A5) in Lemma A3, and Proposition 4 is proved. □

7. Conclusions

The common goal of the present work and that of Christoph and Ulyanov [18] is to develop formal second order Chebyshev–Edgeworth expansions for sample statistics with random sample sizes. Corresponding expansions are assumed for the statistics with non-random sample sizes as well as for the random sample sizes. The statistics examined are asymptotically normally distributed and, for the first time in this setting, also asymptotically chi-squared distributed. The random sample sizes have negative binomial or Pareto-like distributions. The formal construction of the approximating functions allows the results to be used for a whole family of asymptotically normal or chi-squared distributed statistics. The Student t-distribution with m degrees of freedom, the one-sample Student t-test statistic, and the sample mean are considered as examples of asymptotically normal statistics. Hotelling's generalized $T_0^2$ statistic and scale mixtures of a normalized quotient of two independent chi-squared random variables were studied as examples of asymptotically chi-squared distributed statistics. In addition, random, non-random, and mixed scaling factors for the statistics are considered, which have a significant influence on the limit distributions. The limit laws are scale mixtures of the normal with mixing gamma or chi-squared with mixing inverse exponential distributions. In addition to the normal distribution and the chi-square distribution, there are a variety of limit distributions: the Laplace, the scaled Student t-, the scaled Fisher, the generalized gamma, and linear combinations of generalized gamma distributions.
The remaining terms in the approximations of the scaled statistics are estimated by inequalities.

Author Contributions

Conceptualization, G.C. and V.V.U.; methodology, V.V.U. and G.C.; formal analysis, G.C. and V.V.U.; investigation, G.C. and V.V.U.; writing—original draft, G.C. and V.V.U.; writing—review and editing, V.V.U. and G.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding. It was carried out within the project, “Analysis of the quality of approximations in the statistical analysis of multivariate observations” of the Magdeburg University, the program of the Moscow Center for Fundamental and Applied Mathematics, Lomonosov Moscow State University, and HSE University Basic Research Programs.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank the Editor for his support and the Reviewers for their appropriate comments which have improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Auxiliary Lemmas

Lemma A1. 
Let $r > 1$; then
$$|J_1(x)| = \left| \int_0^\infty G_d(x y)\, dh^{**}_{2;r}(y) \right| \le c(r, d)\, g_n^{-(r - 1)} \quad \text{with} \quad h^{**}_{2;r}(y) = r^{-1} g_{r,r}(y)\, Q_1(g_n y). \tag{A1}$$
Proof of Lemma A1. 
We use the Fourier series expansion of the jump correcting function $Q_1(y)$ at all non-integer points $y$; see Prudnikov et al. [35] (Formula 5.4.2.9 for $a = 0$):
$$Q_1(y) = \frac{1}{2} - (y - [y]) = \sum_{k=1}^\infty \frac{\sin(2 \pi k y)}{k \pi}, \qquad y \ne [y], \tag{A2}$$
and Prudnikov et al. [35] (Formula 2.5.31.4):
$$\int_0^\infty y^{\alpha - 1}\, e^{-p y} \sin(b y)\, dy = \frac{\Gamma(\alpha)}{(b^2 + p^2)^{\alpha/2}}\, \sin\big( \alpha \arctan(b/p) \big) \quad \text{with} \quad \alpha > -1, \quad b, p > 0. \tag{A3}$$
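The two tools (A2) and (A3) can be checked numerically. The sketch below is a sanity check under illustrative parameter values, not part of the proof: it compares a partial sum of the Fourier series with $Q_1(y) = 1/2 - (y - [y])$ at non-integer points, and compares a trapezoidal approximation of the integral in (A3) with the closed form. All function names are hypothetical.

```python
import math

def q1(y):
    """Jump-correcting function Q_1(y) = 1/2 - (y - [y])."""
    return 0.5 - (y - math.floor(y))

def q1_fourier(y, terms=20000):
    """Partial sum of (A2); converges slowly, like 1/terms, away from integers."""
    return sum(math.sin(2.0 * math.pi * k * y) / (k * math.pi)
               for k in range(1, terms + 1))

def sine_integral_numeric(alpha, p, b, upper=40.0, steps=80000):
    """Trapezoidal approximation of int_0^upper y^(alpha-1) e^(-p y) sin(b y) dy.
    Assumes alpha > 1, so the integrand vanishes at y = 0."""
    h = upper / steps
    total = 0.0
    for i in range(1, steps):
        y = i * h
        total += y ** (alpha - 1.0) * math.exp(-p * y) * math.sin(b * y)
    total += 0.5 * upper ** (alpha - 1.0) * math.exp(-p * upper) * math.sin(b * upper)
    return total * h

def sine_integral_closed(alpha, p, b):
    """Right-hand side of (A3)."""
    return (math.gamma(alpha) * (b * b + p * p) ** (-alpha / 2.0)
            * math.sin(alpha * math.atan(b / p)))

for y in (0.3, 2.75):
    print(y, q1(y), q1_fourier(y))

alpha, p, b = 2.5, 1.5, 3.0
print(sine_integral_numeric(alpha, p, b), sine_integral_closed(alpha, p, b))
```

In the proof of Lemma A1 the role of $b$ is played by $2 \pi k g_n$, which grows with $n$; the closed form in (A3) then shows directly how the integral decays in $g_n$.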
Integration by parts in the integral $J_1(x)$, using (A2), interchanging sum and integral and applying (A3) with $\alpha = r + d/2 - 1$, $p = r + x/2$ and $b = 2 \pi k g_n$ leads to
$$\begin{aligned} J_1(x) &= -\frac{r^{r-1} x^{d/2}}{\Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \int_0^\infty y^{r + d/2 - 2}\, Q_1(g_n y)\, e^{-(r + x/2) y}\, dy \\ &= -\frac{r^{r-1} x^{d/2}}{\pi\, \Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \sum_{k=1}^\infty \frac{1}{k} \int_0^\infty y^{r + d/2 - 2}\, e^{-(r + x/2) y} \sin(2 \pi k g_n y)\, dy \\ &= -\frac{r^{r-1}\, \Gamma(r + d/2 - 1)}{\pi\, \Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \sum_{k=1}^\infty \frac{a_k(x; n)}{k} \end{aligned}$$
with
$$a_k(x; n) = \frac{x^{d/2} \sin\big( (r + d/2 - 1) \arctan\big( 2 \pi k g_n / (r + x/2) \big) \big)}{\big( (2 \pi k g_n)^2 + (r + x/2)^2 \big)^{(r + d/2 - 1)/2}}.$$
Now, we split the exponent $(r + d/2 - 1)/2 = (r - 1)/2 + d/4$ and obtain
$$|a_k(x; n)| \le \frac{x^{d/2}}{(2 \pi k g_n)^{r - 1}\, (r + x/2)^{d/2}} \le \frac{2^{d/2}}{(2 \pi k g_n)^{r - 1}}.$$
Since $r > 1$, we find uniformly in $x \ge 0$
$$|J_1(x)| \le c_1(r, d)\, g_n^{-(r - 1)} \sum_{k=1}^\infty k^{-r} = c(r, d)\, g_n^{-(r - 1)}$$
and Lemma A1 is proved. □
Lemma A2. 
Let $r \ge 2$; then
$$|J_1(x)| = \left| \int_0^\infty G_d(x/y)\, dh^{**}_{2;r}(y) \right| \le c(r, d)\, g_n^{-1} \quad \text{with} \quad h^{**}_{2;r}(y) = r^{-1} g_{r,r}(y)\, Q_1(g_n y). \tag{A4}$$
Proof of Lemma A2. 
Integrating by parts in the integral $J_1(x)$, using the Fourier series expansion (A2) and interchanging sum and integral, we find
$$J_1(x) = \frac{r^{r-1} x^{d/2}}{\Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \int_0^\infty y^{r - d/2 - 2}\, Q_1(g_n y)\, e^{-(r y + x/(2y))}\, dy = \frac{r^{r-1}}{\pi\, \Gamma(r)\, 2^{d/2}\, \Gamma(d/2)} \sum_{k=1}^\infty \frac{J_{k,n}(x)}{k}$$
with $J_{k,n}(x) = \int_0^\infty x^{d/2}\, y^{r - d/2 - 2}\, e^{-(r y + x/(2y))} \sin(2 \pi k g_n y)\, dy$.
In the literature, we have only found integrals of the type $J_{k,n}(x)$ with power functions $y^{-1/2}$ and $y^{-3/2}$. Therefore, we integrate by parts in the integral $J_{k,n}(x)$:
$$J_{k,n}(x) = -\frac{1}{2} \int_0^\infty \Big( (d - 2r + 4)\, f_1(x, y) + 2 r\, f_2(x, y) - f_3(x, y) \Big)\, e^{-(r y + x/(2y))}\, \frac{\cos(2 \pi k g_n y)}{2 \pi k g_n}\, dy,$$
where $f_1(x, y) = x^{d/2} y^{r - d/2 - 3}$, $f_2(x, y) = x^{d/2} y^{r - d/2 - 2}$ and $f_3(x, y) = x^{d/2 + 1} y^{r - d/2 - 4}$.
Since $r \ge 2$ and $d \ge 1$ we obtain $y^{r - 2} e^{-r y/2} \le c_r$ and $(x/y)^{(d - 1)/2} e^{-x/(4y)} \le c_d$. Using (30) with $m = 0, 1, 2$, $p = r/2$, and $q = x/4$ we find
$$\int_0^\infty f_1(x, y)\, e^{-(r y + x/(2y))}\, dy \le c_r c_d\, x^{1/2} \int_0^\infty y^{-3/2}\, e^{-(r y/2 + x/(4y))}\, dy = c_r c_d\, 2 \sqrt{\pi}\, e^{-\sqrt{r x/2}} \le C_1(r, d),$$
$$\int_0^\infty f_2(x, y)\, e^{-(r y + x/(2y))}\, dy \le c_r c_d\, x^{1/2} \int_0^\infty y^{-1/2}\, e^{-(r y/2 + x/(4y))}\, dy = c_r c_d\, \sqrt{2 \pi x / r}\, e^{-\sqrt{r x/2}} \le C_2(r, d),$$
$$\int_0^\infty f_3(x, y)\, e^{-(r y + x/(2y))}\, dy \le c_r c_d\, x^{3/2} \int_0^\infty y^{-5/2}\, e^{-(r y/2 + x/(4y))}\, dy = c_r c_d\, 2 \sqrt{\pi}\, \big( \sqrt{2 r x} + 2 \big)\, e^{-\sqrt{r x/2}} \le C_3(r, d)$$
and
$$|J_{k,n}(x)| \le \frac{1}{4 \pi k g_n} \Big( |d - 2r + 4|\, C_1(r, d) + 2 r\, C_2(r, d) + C_3(r, d) \Big) \le \frac{C(r, d)}{k\, g_n}.$$
Hence,
$$|J_1(x)| \le \frac{r^{r-1}}{\pi\, \Gamma(r)\, 2^{d/2}\, \Gamma(d/2)}\, \frac{\pi^2}{6\, g_n}\, C(r, d) \le \frac{c(r, d)}{g_n}.$$
Lemma A2 is proved. □
Lemma A3. 
Let $\gamma = -1$, $r = 3/2$, $d = 4$ and $g_n = (3n - 1)/2$; then
$$|J_1(x)| = \left| \int_0^\infty G_d(x/y)\, dh^{**}_{2;3/2}(y) \right| \le c(3/2, 4)\, g_n^{-1/2} \quad \text{with} \quad h^{**}_{2;3/2}(y) = (2/3)\, g_{3/2,3/2}(y)\, Q_1(g_n y). \tag{A5}$$
Proof of Lemma A3. 
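Integrating by parts in the integral $J_1(x)$, using the Fourier series expansion (A2) and interchanging sum and integral, we find
$$J_1(x) = \frac{\sqrt{3/2}\; x^2}{4\, \Gamma(3/2)} \int_0^\infty y^{-5/2}\, Q_1(g_n y)\, e^{-(3y/2 + x/(2y))}\, dy = \frac{\sqrt{3/2}}{4 \pi\, \Gamma(3/2)} \sum_{k=1}^\infty \frac{J_{k,n}(x)}{k}$$
with
$$J_{k,n}(x) = x^2 \int_0^\infty y^{-5/2}\, e^{-(3y/2 + x/(2y))} \sin(2 \pi k g_n y)\, dy.$$
Using Prudnikov et al. [35] (Formula 2.5.37.3), with the real constants $p > 0$, $q > 0$ and $b > 0$, we obtain
$$\int_0^\infty y^{-3/2}\, e^{-p y - q/y} \sin(b y)\, dy = \sqrt{\frac{\pi}{q}}\; e^{-2 \sqrt{q}\, z_+} \sin(2 \sqrt{q}\, z_-) \quad \text{and} \quad 2 z_\pm^2 = \sqrt{p^2 + b^2} \pm p. \tag{A6}$$
It was shown in Christoph et al. [17] (Proof of Theorem 5) that Leibniz's integral rule allows differentiation with respect to $q$ under the integral sign in (A6). Therefore,
$$\int_0^\infty y^{-5/2}\, e^{-p y - q/y} \sin(b y)\, dy = \frac{\sqrt{\pi}}{2}\, e^{-2 \sqrt{q}\, z_+} \Big( q^{-3/2} \sin(2 \sqrt{q}\, z_-) + 2 q^{-1} z_+ \sin(2 \sqrt{q}\, z_-) - 2 q^{-1} z_- \cos(2 \sqrt{q}\, z_-) \Big).$$
Since $0 < z_- \le z_+$, $p = 3/2$, $q = x/2$, $b = 2 \pi k g_n$, $k \ge 1$ and $g_n \ge 1$ we find $z_+ \ge \sqrt{\pi k g_n}$,
$$|J_{k,n}(x)| \le \frac{x^2 \sqrt{\pi}}{2}\, e^{-\sqrt{2x}\, z_+} \left( \frac{2 \sqrt{2}}{x^{3/2}} + \frac{8}{x}\, z_+ \right) = \frac{\sqrt{\pi}}{z_+}\, e^{-\sqrt{2x}\, z_+} \left( \sqrt{2x}\, z_+ + 4 x z_+^2 \right) \le \frac{e^{-1} + 8 e^{-2}}{\sqrt{k\, g_n}}$$
and
$$|J_1(x)| \le \frac{\sqrt{3/2}}{4 \pi\, \Gamma(3/2)} \sum_{k=1}^\infty \frac{e^{-1} + 8 e^{-2}}{k^{3/2} \sqrt{g_n}}.$$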
Lemma A3 is proved. □

References

  1. Wallace, D.L. Asymptotic approximations to distributions. Ann. Math. Statist. 1958, 29, 635–654.
  2. Bickel, P.J. Edgeworth expansions in nonparametric statistics. Ann. Statist. 1974, 2, 1–20.
  3. Hall, P. The Bootstrap and Edgeworth Expansion; Springer Series in Statistics; Springer: New York, NY, USA, 1992.
  4. Bhattacharya, R.N.; Ranga Rao, R. Normal Approximation and Asymptotic Expansions; Wiley: New York, NY, USA, 1976.
  5. Petrov, V.V. Limit Theorems of Probability Theory, Sequences of Independent Random Variables; Clarendon Press: Oxford, UK, 1995.
  6. Pfanzagl, J. Asymptotic expansions related to minimum contrast estimators. Ann. Statist. 1973, 1, 993–1026.
  7. Bentkus, V.; Götze, F.; van Zwet, W.R. An Edgeworth expansion for symmetric statistics. Ann. Statist. 1997, 25, 851–896.
  8. Burnashev, M.V. Asymptotic expansions for median estimate of a parameter. Theory Probab. Appl. 1997, 41, 632–645.
  9. Nunes, C.; Capistrano, G.; Ferreira, D.; Ferreira, S.S.; Mexia, J.T. Exact critical values for one-way fixed effects models with random sample sizes. J. Comput. Appl. Math. 2019, 354, 112–122.
  10. Nunes, C.; Capistrano, G.; Ferreira, D.; Ferreira, S.S.; Mexia, J.T. Random sample sizes in orthogonal mixed models with stability. Comp. Math. Methods 2019, 1, e1050.
  11. Nunes, C.; Mário, A.; Ferreira, D.; Moreira, E.M.; Ferreira, S.S.; Mexia, J.T. An algorithm for simulation in mixed models with crossed factors considering the sample sizes as random. J. Comput. Appl. Math. 2022, 404, 113463.
  12. Esquível, M.L.; Mota, P.P.; Mexia, J.T. On some statistical models with a random number of observations. J. Stat. Theory Pract. 2016, 10, 805–823.
  13. Döbler, C. New Berry-Esseen and Wasserstein bounds in the CLT for non-randomly centered random sums by probabilistic methods. ALEA Lat. Am. J. Probab. Math. Stat. 2015, 12, 863–902.
  14. Schluter, C.; Trede, M. Weak convergence to the Student and Laplace distributions. J. Appl. Probab. 2016, 53, 121–129.
  15. Bening, V.E.; Galieva, N.K.; Korolev, V.Y. On rate of convergence in distribution of asymptotically normal statistics based on samples of random size. Ann. Math. Inform. 2012, 39, 17–28.
  16. Bening, V.E.; Galieva, N.K.; Korolev, V.Y. Asymptotic expansions for the distribution functions of statistics constructed from samples with random sizes. Inform. Appl. 2013, 7, 75–83. (In Russian)
  17. Christoph, G.; Monakhov, M.M.; Ulyanov, V.V. Second-order Chebyshev-Edgeworth and Cornish-Fisher expansions for distributions of statistics constructed with respect to samples of random size. J. Math. Sci. 2020, 244, 811–839, Translated from Zap. Nauchnykh Semin. POMI 2017, 466, 167–207.
  18. Christoph, G.; Ulyanov, V.V. Chebyshev–Edgeworth-type approximations for statistics based on samples with random sizes. Mathematics 2021, 9, 775.
  19. Fujikoshi, Y.; Ulyanov, V.V. Non-Asymptotic Analysis of Approximations for Multivariate Statistics; Springer: Singapore, 2020.
  20. Christoph, G.; Ulyanov, V.V. Second order expansions for high-dimension low-sample-size data statistics in random setting. Mathematics 2020, 8, 1151, Reprinted in Special Issue: Stability Problems for Stochastic Models: Theory and Applications; Zeifman, A., Korolev, V., Sipin, A., Eds.; MDPI: Basel, Switzerland, 2021; pp. 259–286.
  21. Buddana, A.; Kozubowski, T.J. Discrete Pareto distributions. Econ. Qual. Control. 2014, 29, 143–156.
  22. Bening, V.E.; Korolev, V.Y. On the use of Student’s distribution in problems of probability theory and mathematical statistics. Theory Probab. Appl. 2005, 49, 377–391.
  23. Gavrilenko, S.V.; Zubov, V.N.; Korolev, V.Y. The rate of convergence of the distributions of regular statistics constructed from samples with negatively binomially distributed random sizes to the Student distribution. J. Math. Sci. 2017, 220, 701–713.
  24. Christoph, G.; Ulyanov, V.V.; Bening, V.E. Second order expansions for sample median with random sample size. ALEA Lat. Am. J. Probab. Math. Stat. 2022, 19, 339–365.
  25. Korolev, V. Bounds for the rate of convergence in the generalized Rényi theorem. Mathematics 2022, 10, 4252.
  26. Bulinski, A.; Slepov, N. Sharp estimates for proximity of geometric and related sums distributions to limit laws. Mathematics 2022, 10, 4747.
  27. Korolev, V.; Shevtsova, I. An improvement of the Berry-Esseen inequality with applications to Poisson and mixed Poisson random sums. Scand. Actuar. J. 2012, 2012, 81–105.
  28. Sunklodas, J.K. On the normal approximation of a binomial random sum. Lith. Math. J. 2014, 54, 356–365.
  29. Petrov, V.V. Sums of Independent Random Variables; Akademie-Verlag: Berlin, Germany, 1975.
  30. Kolassa, J.E.; McCullagh, P. Edgeworth series for lattice distributions. Ann. Statist. 1990, 18, 981–985.
  31. Choy, T.B.; Chan, J.E. Scale mixtures distributions in statistical modelling. Aust. N. Z. J. Stat. 2008, 50, 135–146.
  32. Fujikoshi, Y.; Ulyanov, V.V.; Shimizu, R. Multivariate Statistics. High-Dimensional and Large-Sample Approximations; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2010.
  33. Christoph, G.; Ulyanov, V.V. Random dimension low sample size asymptotics. In Recent Developments in Stochastic Methods and Applications; Shiryaev, A.N., Samouylov, K.E., Kozyrev, D.V., Eds.; Springer Proceedings in Mathematics & Statistics; Springer International Publishing: Cham, Switzerland, 2021; Volume 371, pp. 215–228. ISBN 978-3-030-83266-7 and 978-3-030-83266-0.
  34. Oldham, K.B.; Myland, J.C.; Spanier, J. An Atlas of Functions, 2nd ed.; Springer Science + Business Media: New York, NY, USA, 2009.
  35. Prudnikov, A.P.; Brychkov, Y.A.; Marichev, O.I. Integrals and Series, Vol. 1: Elementary Functions, 3rd ed.; Gordon & Breach Science Publishers: New York, NY, USA, 1992.
  36. Korolev, V.Y.; Zeifman, A.I. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388.
  37. Korolev, V.Y.; Gorshenin, A. Probability models and statistical tests for extreme precipitation based on generalized negative binomial distributions. Mathematics 2020, 8, 604.
  38. Monakhov, M.M. Chebyshev–Edgeworth expansions for distributions of generalized Hotelling-type statistics based on random size samples. Inform. Primen. [Informatics Its Appl.] 2021, 15, 72–81. (In Russian)
  39. Fujikoshi, Y.; Ulyanov, V.V.; Shimizu, R. L1-norm error bounds for asymptotic expansions of multivariate scale mixtures and their applications to Hotelling’s generalized T02. J. Multivar. Anal. 2005, 96, 1–19.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Christoph, G.; Ulyanov, V.V. Second Order Chebyshev–Edgeworth-Type Approximations for Statistics Based on Random Size Samples. Mathematics 2023, 11, 1848. https://doi.org/10.3390/math11081848
