Next Article in Journal
On Λ-Fractional Analysis and Mechanics
Next Article in Special Issue
An Effective Approximation Algorithm for Second-Order Singular Functional Differential Equations
Previous Article in Journal
(ω,c)-Periodic Solutions to Fractional Differential Equations with Impulses
Previous Article in Special Issue
Approximate Flow Friction Factor: Estimation of the Accuracy Using Sobol’s Quasi-Random Sampling

Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

# Generalized Beta Prime Distribution Applied to Finite Element Error Approximation

by
1,* and
Franck Assous
2
1
Jean Le Rond d’Alembert Institute, Sorbonne University, 4 Place Jussieu, CEDEX 05, 75252 Paris, France
2
Department of Mathematics, Ariel University, Ariel 40700, Israel
*
Author to whom correspondence should be addressed.
Axioms 2022, 11(3), 84; https://doi.org/10.3390/axioms11030084
Submission received: 2 December 2021 / Revised: 17 January 2022 / Accepted: 18 February 2022 / Published: 22 February 2022

## Abstract

:
In this paper, we propose a new family of probability laws based on the Generalized Beta Prime distribution to evaluate the relative accuracy between two Lagrange finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 )$. Usually, the relative finite element accuracy is based on the comparison of the asymptotic speed of convergence, when the mesh size h goes to zero. The new probability laws we propose here highlight that there exists, depending on h, cases where the $P k 1$ finite element is more likely accurate than the $P k 2$ element. To confirm this assertion, we highlight, using numerical examples, the quality of the fit between the statistical frequencies and the corresponding probabilities, as determined by the probability law. This illustrates that, when h goes away from zero, a finite element $P k 1$ may produce more precise results than a finite element $P k 2$, since the probability of the event “$P k 1$ is more accurate than $P k 2$” becomes greater than 0.5. In these cases, finite element $P k 2$ is more likely overqualified.

## 1. Introduction

Recently, we proposed a new vision to consider error estimates applied to finite element approximations (see [1,2,3]).
Indeed, we followed the emerging field called probabilistic numerics whose founding principles and aims consist of providing a statistical treatment of the errors and/or approximations that are made en route to the output of a deterministic numerical method, particularly the discretized solution of a partial differential equation (see, for example, [4,5,6]).
Then, each method that belongs to the field of probabilistic numerics introduces a probability measure on the approximated solution of a given classical numerical method.
One of the main applications of probabilistic numerics leads to the possibility for quantifying uncertainties contained in numerical approximations due to several causes, like numerical errors, randomness of the mesh generator, and so on.
In the same spirit, we introduced a probability measure of the classical error estimate considered as a random variable to derive two probability laws. This allowed us to model the relative accuracy between two Lagrange finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 ) ,$ analyzed as a random variable.
This new point of view enabled us to complete the classical results, which are limited to the convergence analysis. Indeed, the relative accuracy between two finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 )$ is usually obtained by comparing the asymptotic rate of convergence when the mesh size h goes to zero. This enables one to conclude that the $P k 2$ finite element is more accurate than $P k 1$, since $h k 2$ goes faster to zero than $h k 1$.
The probability laws we derived confirmed our intuition (see also our previous approaches in [7]), that when h is set to a fixed value, greater than an undetermined threshold, the asymptotic behavior recalled above does not necessarily apply. Indeed, in the error estimates, the upper bound of the approximation error is an unknown constant that depends, among others, on a given semi-norm of the unknown exact solution. Hence, the numerical comparison of error approximations between $P k 1$ and $P k 2$ cannot be achieved.
As a consequence, determining the smallest of the two concerned approximation errors remains an open question.
Starting from there, we first consider a given approximation error as a positive number whose position is unknown within an interval defined by the unknown upper bound of the corresponding error estimate. Then, we consider this position as the result of a random variable. Indeed, the approximation error depends on the approximation, hence, on quantitative uncertainties which are related to the mesh generator process.
In addition to the new insights we obtained from these probability laws, we also implemented practical cases [8], which enabled us to evaluate the quality of the fit between these probability laws and the corresponding statistical frequencies. We show that these two probability laws globally behave like the corresponding statistical frequencies. However, the fit was sometimes not precise enough. Next, we identified the reasons for this lack of accuracy: it basically originates in the excessive rigidity of the probabilistic assumptions we made to derive these laws.
Thus, we developed a new generation of probabilistic models based on the Generalized Beta Prime distribution (GBP). These models enabled us to derive, under probabilistic acceptable hypotheses, the probability law of the relative accuracy between two Lagrange finite elements.
In this paper, we introduce the probabilistic framework built to obtain this law; we justify its use and show that it fits well with several examples. This confirms, in a significant manner, the relevance of considering error estimates as random variables in a suitable probabilistic environment.
The paper is organized as follows. In Section 2, we recall the mathematical problem we consider and a corollary of the Bramble-Hilbert lemma, from which the previous probabilistic laws [8] have been derived. In Section 3, we show a typical numerical result we obtained combining numerical statistics and these probability laws. Then, in Section 4, we derive the new probability law that enable one to evaluate the relative error accuracy between two finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 )$ relying on the GBP distribution. Finally, in Section 5, we show, using several examples, the high quality of the fit between the GBP probabilistic law and the corresponding statistical frequencies. Concluding remarks follow.

## 2. Abstract Problem and Finite Element Error Estimates

We consider an open-bounded and non-empty subset $Ω$ of $R n$, and denote by $Γ$ its boundary, assumed to be $C 1 −$ piecewise. We also introduce a Hilbert space V endowed with a norm $. V$, and a bilinear, continuous and $V −$ elliptic form $a ( · , · )$ defined on $V × V$. Finally, $l ( · )$ denotes a linear continuous form defined on V.
Let $u ∈ V$ be the unique solution to the second order elliptic variational formulation:
In this paper, we restrict ourselves to the simple case where V is the usual Sobolev space of distributions $H 1 ( Ω )$. More general cases can be found in [2].
Now, for a given $h > 0$, let us introduce $V h$, a finite-dimensional subset of V, and consider $u h ∈ V h$ an approximation of u, the solution to the approximate variational formulation:
In what follows, we are interested in evaluating error estimates for the finite element method. Hence, we first assume that domain $Ω$ is exactly covered by a regular $T h$ mesh composed by $N s$ n-simplexes $K j , ( 1 ≤ j ≤ N s ) ,$ which respects the classical rules for regular discretization (see, for example, ref. [9] for the bidimensional case or [10,11] in $R n$). We also denote by $P k ( K j )$ the space of polynomial functions defined on a given n-simplex $K j$ of degree less than or equal to k, ($k ≥$ 1).
Let us first briefly recall the result of [11], which is the starting point of our study. Let $∥ . ∥ 1$ be the classical norm in $H 1 ( Ω )$ and $| . | k + 1$ the semi-norm in $H k + 1 ( Ω )$, and h the mesh size, namely the largest diameter of the elements of mesh $T h$. We then have:
Lemma 1.
Suppose that there exists an integer $k ≥ 1$ such that approximation $u h$ of $V h$ is a continuous piecewise function composed by polynomials that belong to $P k ( K j ) , ( 1 ≤ j ≤ N s )$.
Then, if the exact solution u to (1) belongs to $H k + 1 ( Ω )$, we have the following error estimate:
$∥ u h − u ∥ 1 ≤ C k h k | u | k + 1 ,$
where $C k$ is a positive constant independent of h.
Let us now consider two families of Lagrange finite elements $P k 1$ and $P k 2$ corresponding to a set of values $( k 1 , k 2 ) ∈ N 2$ such that $0 < k 1 < k 2$.
The two corresponding inequalities given by (3), assuming that the solution u to (1) belongs to $H k 2 + 1 ( Ω )$, are:
$∥ u h ( k 1 ) − u ∥ 1 , Ω ≤ C k 1 h k 1 | u | k 1 + 1 , Ω ,$
$∥ u h ( k 2 ) − u ∥ 1 , Ω ≤ C k 2 h k 2 | u | k 2 + 1 , Ω ,$
where $u h ( k 1 )$ and $u h ( k 2 )$, respectively, denote the $P k 1$ and $P k 2$ Lagrange finite element approximations of u.
Now, if we consider a given mesh for the finite element of $P k 2$, which contains that of $P k 1$, then, for the particular class of problems where the variational formulation (1) is equivalent to a minimization formulation (see for example [9]), one can show that the approximation error of $P k 2$ is always lower than that of $P k 1$, and that $P k 2$ is more accurate than $P k 1$ for all values of mesh size h.
Next, for a given mesh size value of h, we consider two independent meshes for $P k 1$ and $P k 2$ built by a mesh generator (see Figure 1).
Usually, in order to compare the relative accuracy between these two finite elements, one asymptotically considers inequalities (4) and (5) and concludes that, when h goes to zero, $P k 2$ is more accurate than $P k 1$, since $h k 2$ goes faster to zero than $h k 1$.
However, for a given application, h has a fixed value and this method of comparison is no more valid. Therefore, we propose a solution to determine the relative accuracy between the finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 )$, for a given value of h, for which two independent meshes are considered.
To this end, let us set:
$β k 1 = C k 1 h k 1 | u | k 1 + 1 , Ω and β k 2 = C k 2 h k 2 | u | k 2 + 1 , Ω ,$
that allows us to rewrite (4) and (5) as:
$∥ u h ( k 1 ) − u ∥ 1 , Ω ≤ β k 1 ,$
$∥ u h ( k 2 ) − u ∥ 1 , Ω ≤ β k 2 .$
Now, as explained in [3], there is no a priori available information to surely or better specify the relative position of the approximation errors $∥ u h ( k 1 ) − u ∥ 1 , Ω$ and $∥ u h ( k 2 ) − u ∥ 1 , Ω$, which respectively belong to the interval $[ 0 , β k 1 ]$ and $[ 0 , β k 2 ]$, where $β k i$ are unknown since $C k i$ and $| u | k i + 1 , Ω$ are unknown quantities.
In [3], we explained why quantitative uncertainties have to be handled using finite element computations. This originates in the way the mesh grid generator builds the mesh used to compute the approximation $u h ( k i ) , ( i = 1 , 2 )$; this leads to a partial non-control of the mesh, even for a given maximum mesh size. Indeed, most meshes are generated by a mesh generator based on a Delaunay-Voronoi method. As a consequence, the corresponding point generation, and/or the treatment of ambiguities inevitably contains some randomness. Therefore, the generated grid is a priori random, and so is the corresponding approximation $u h ( k i ) ( i = 1 , 2 )$.
For these reasons, we introduced in [3] a probabilistic framework that enables one to consider the possible values of the norm $∥ u h ( k ) − u ∥ 1 , Ω$ as a random variable, which is defined as follows:
• For a fixed value of the mesh size h, a random trial corresponds to the grid generation and the associated approximation $u h ( k )$.
• The probability space $Ω$, therefore, contains all the possible results for a given random trial, namely, all of the possible grids that the mesh generator may construct, or equivalently, all of the corresponding associated approximations $u h ( k )$.
Then, for a fixed value of k, we define the random variable $X ( k )$ as follows:
$X ( k ) : Ω → [ 0 , β k ]$
$ω ≡ u h ( k ) ↦ X ( k ) ( ω ) = X ( k ) ( u h ( k ) ) = ∥ u h ( k ) − u ∥ 1 , Ω .$
In what follows, for simplicity, we will set: $X ( k ) ( u h ( k ) ) ≡ X ( k ) ( h )$.
Therefore, our goal is now to evaluate the probability of the event:
$∥ u h ( k 2 ) − u ∥ 1 , Ω ≤ ∥ u h ( k 1 ) − u ∥ 1 , Ω ≡ X ( k 2 ) ( h ) ≤ X ( k 1 ) ( h ) ,$
which will enable us to estimate the more likely accuracy of the two finite elements $P k 1$ and $P k 2$, $( k 1 < k 2 )$.
Now, regarding the absence of information concerning the more or less likely values of the norm $∥ u h ( k i ) − u ∥ 1 , Ω , ( i = 1 , 2 )$ in the interval $[ 0 , β k i ] , ( i = 1 , 2 )$, we assumed in [1], that each random variable $X ( k i ) , ( i = 1 , 2 )$ had a uniform distribution on the interval $[ 0 , β k i ]$, and that they were also independent.
This probabilistic framework enabled us to obtain, in a more general context (see Theorem 3.1 in [1]), the density of the probability of the random variable Z defined by $Z = X ( k 2 ) − X ( k 1 )$ and, consequently, $P k 1 , k 2 ≡ P r o b X ( k 2 ) ≤ X ( k 1 )$. This probability corresponds to the value of the entire cumulative distribution function $F Z ( z )$ at $z = 0$, defined by:
$F Z ( z ) = ∫ − ∞ z f Z ( z ) d z ,$
with $f Z ( z )$ denoting the density function of the random variable Z.
For the purpose of the present work, by performing elementary adaptations of Theorem 3.3 in [1], we get the corresponding probability law given by:
$P k 1 , k 2 ( h ) = | ( 13 ) 1 − 1 2 h h k 1 , k 2 * k 2 − k 1 if 0 ≤ h ≤ h k 1 , k 2 * , ( 14 ) 1 2 h k 1 , k 2 * h k 2 − k 1 if h ≥ h k 1 , k 2 * ,$
where $h k 1 , k 2 *$ is defined by:
$h k 1 , k 2 * ≡ C k 1 | u | k 1 + 1 , Ω C k 2 | u | k 2 + 1 , Ω 1 k 2 − k 1 .$
The shape of this law is similar to a “sigmoid” curve, as one can see in Figure 2.
We already remarked in [3] that the probability law (13) and (14) can asymptotically—when $k 2 − k 1$ goes to infinity—lead to the limit situation we defined as the “two-steps” model (see Figure 2).
Nevertheless, we also proved in [3] that under suitable probabilistic assumptions, one can directly derive this “two-steps” probabilistic law for all non-zero integers $k 1$ and $k 2$ to finally obtain the following probability law:
$P k 1 , k 2 ( h ) = 1 if 0 < h < h k 1 , k 2 * , 0 if h > h k 1 , k 2 * .$
The next section is dedicated to the analysis of the fit between statistical data and the above probability laws (13), (14) and (16).

## 3. Numerical Statistics and Probability-Law Comparison

Here, we compare the theoretical probability law (13) and (14) and the statistics obtained from two examples of approximate variational formulations (2).
As shown in [8], the “two-steps” law (16) provides good-quality fits in several numerical cases. However, this law is a bit rough and cannot really follow the variations of the convexity of the statistical data curve (see below). For this reason, we also tested the accuracy of the fit with the “sigmoid” law given by (13) and (14).
In order to numerically check the accuracy of the fit between the probability law (13) and (15) and the corresponding statistical data produced by numerical simulations, we considered two finite elements $P k 1$ and $P k 2$ ($k 1 < k 2$).
One of the main ingredients of the method is the computation of $h k 1 , k 2 *$, as defined by (15), which is evaluated using a maximum likelihood estimator; see for instance [12].
In our case, this principle is applied as follows: for a given finite element $P k 1$, we consider N different meshes with the same (maximum) mesh size h. Then, we compute:
$max N , h ∥ u h ( k 1 ) − u ∥ 1 h k 1$
which constitutes, using estimate (4), the maximum likelihood estimator for $C k 1 | u | k 1 + 1$. Indeed, due to inequality (4), quantity $X k 1 ( h ) h k 1$ is also a uniform random variable whose support is $[ 0 , C k 1 | u | k 1 + 1 ]$.
Then, one can show [13] that for a given uniform random variable Y whose support is $[ 0 , θ ]$, $θ$ being an unknown real parameter, the maximum likelihood estimator $θ ^$ is given by:
$θ ^ = max ( Y 1 , … , Y N ) ,$
where $( Y 1 , … , Y N )$ is a sample built with independent and identically distributed random variables $( Y i ) i = 1 , N$, with the same distribution as Y.
In our case, this implies that (17) is the maximum likelihood estimator for $C k 1 | u | k 1 + 1$, since N and h each take a finite number of values.
Doing the same for another finite element $P k 2 , ( k 1 < k 2 )$, we obtain that the estimator for $h k 1 , k 2 *$, denoted $h k 1 , k 2 * ^$, is defined by:
$h k 1 , k 2 * ^ = max N , h ∥ u h ( k 1 ) − u ∥ 1 h k 1 max N , h ∥ u h ( k 2 ) − u ∥ 1 h k 2 1 / k 2 − k 1$
Then, one can easily compute the probability law given by (13) and (14), when one replaces $h k 1 , k 2 *$ by its estimator $h k 1 , k 2 * ^$.
Finally, in order to numerically check the validity of this probability law, we now compare the probabilities computed by (13), (14) and (18), with the corresponding statistical frequencies computed on the N meshes, for each fixed value h of the mesh size, as follows.
We considered for the two finite elements $P k 1$ and $P k 2$ ($k 1 < k 2$), the same number N of different meshes with the same (maximum) mesh size h. From there, we computed the approximate solution $u h ( k 1 )$ and $u h ( k 2 )$, and tested if $∥ u h ( k 2 ) − u ∥ 1 ≤ ∥ u h ( k 1 ) − u ∥ 1$. Then, we repeated the same process for different values of h, either lower or greater than $h k 1 , k 2 * ^$. This gives, as a function of h, the percentage of cases where the approximation error of $P k 2$ is lower than the approximation error of $P k 1$.
In all cases, we used package FreeFem++ [14] to compute the $P k 1$ and $P k 2$ finite element approximations. Basically, the mesh generator of FreeFem++ is based on the Delaunay-Voronoi algorithm, and allows us to fix the maximum mesh size h for N different meshes of the same domain $Ω$. Since FreeFEM++ is based on updated and modern methods of meshing, we are confident that using another mesh generator will lead to equivalent results.
Now, in order to motivate the new probability law of the next section, we recall typical results obtained in [8]. To this end, let us look at the numerical tests we implemented to analyze the relative accuracy between two finite elements.
Hence, we consider the following classical elliptic problem, with obvious notations: Find $u ∈ H 1 ( Ω )$ solution to:
where, for simplicity, $Ω$ is the open unit square in $R 2$: $Ω = ] 0 , 1 [ × ] 0 , 1 [$. The associated variational formulation, which is analogous to (1), can readily be derived. According to the choice of f and g, we considered, as examples, a stiff problem, where the solution exhibits rapid variations or, alternatively, a very smooth problem.
More precisely, when the right-hand side $f ( x , y )$ is the Runge-like function, (see [15] or [16]), given by:
$f ( x , y ) = 1 1 + α x 2 1 1 + α y 2 ,$
$α$ being a real parameter, we derived in [8], the exact solution of (19) according to a suitable choice of the Dirichlet condition, parametered by g.
Then, considering 100 meshes for each value of h and $α = 3000$ in the Runge function, we plot in Figure 3 the statistical frequencies corresponding to “$P 3$ more accurate that $P 1$” (blue line), together with the corresponding “sigmoid” probability law (orange line).
As one can see, the fit between the statistics and the probability law is not satisfactory. The same gaps were observed for other simulations with different sets of parameter values, and for several pairs of finite elements $P k 1$ and $P k 2$.
We now illustrate the probabilistic law for a very smooth solution to the variational problem (1). To build such a case, we choose the right-hand side f of problem (19) as:
$f ( x , y ) = 2 π 2 sin ( π x ) cos ( π y )$, so that $u ( x , y ) = sin ( π x ) cos ( π y )$ is the exact solution of the problem, provided that the Dirichlet boundary condition g is taken as the trace of $u ( x , y )$ on the boundary $∂ Ω$ (for more details, see [8]).
As previously, we first compute $h k 1 , k 2 * ^$ defined by (15), then we compute the probabilistic model recalled above. After that, we compare these results to the statistical frequencies.
As an example, we consider the finite elements $P 2$ and $P 3$, the corresponding results being depicted in Figure 4.
As one can see, the comparison with the sigmoid law in (13) and (14) gives only a global trend and is not accurate.
Here, as for the previous Runge example, this lack of accuracy will be improved by the new probabilistic laws derived in the next section. We will show how to enrich the “sigmoid” model, which actually only depends on one parameter, namely, $h k 1 , k 2 *$.

## 4. A New Probability Law Based on GBP

This section is devoted to the new probabilistic law we derived in order to evaluate the relative error accuracy between two finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 )$. To motivate this new approach, let us make some remarks:
1.
In the previous works that enabled us to derive the “Sigmoid” probability law, we had made some assumptions (i.e., uniform distribution and independency). Since we now aim at obtaining a better fit between the probabilistic law and the statistical data, we will now relax the hypothesis of uniformity we had made on the densities of the random variables $X ( k i ) ( h ) , ( i = 1 , 2 )$, (see [1,2,3]).
2.
In order to obtain more degrees of freedom for the shape of these densities, let us first consider the random variable $Z = X ( k 2 ) − X ( k 1 )$. Our goal is to obtain, for the cumulative distribution function $F Z$ defined by (12) at point $z = 0$, a curve whose shape looks like a “Sigmoid”. We thus have to enrich our modeling process by adding more degrees of freedom, the “Sigmoid” probability law (13) and (14) containing only one parameter $h k 1 , k 2 *$. For this reason, we now consider a density $f Z$ for the random variable Z, whose associated cumulative distribution function $F Z$ at point $z = 0$ will include two exogenous parameters; these will be statistically estimated.
Keeping in mind these remarks, we begin by introducing the probability density function $f X$ of the normalized Beta random variable X (see [17]), defined by:
$∀ x ∈ [ 0 , 1 ] : f X ( x ; p , q ) ≡ x p − 1 ( 1 − x ) q − 1 ∫ 0 1 u p − 1 ( 1 − u ) q − 1 d u = Γ ( p + q ) Γ ( p ) Γ ( q ) x p − 1 ( 1 − x ) q − 1 ,$
with p and q, two parameters of shape that belong to $R + *$, and $Γ ( . )$, the classical Gamma function.
Among the numerous properties of the Beta distribution, let us mention the one that particularly motivates us into considering it. Depending on the two parameters p and q, the shapes of the corresponding cumulative distributions are very rich. In particular, they include the shape of the “Sigmoid” curve we are looking for to fit the statistics we get in this numerical example.
However, one cannot directly apply the Beta distribution to get the probability law we are looking for. More specifically, two properties have to be taken into account:
1.
The support of the Beta density $f X$ of the random variable X, denoted $S u p p X$, is included in $[ 0 , 1 ]$. Nevertheless, that of the random variable Z is $[ − β k 1 , β k 2 ]$, since $Z = X ( k 2 ) − X ( k 1 )$ and $S u p p X ( k i ) ⊂ [ 0 , β k i ] , ( i = 1 , 2 )$. This will make us use a suitable transformation of the density $f X$ to guarantee the correct support of the density $f Z$ of Z.
2.
We are looking for a probability law for the event $X ( k 2 ) ≤ X ( k 1 )$ as a function of h, that belongs to $[ 0 , + ∞ [$. Consequently, we have to apply another transformation to the density $f Z$ to guarantee this property for the support of h.
To achieve these transformations, we establish the following results:
Lemma 2.
Let Z be the random variable defined by $Z ≡ X ( k 2 ) − X ( k 1 )$, where $X ( k i ) ( i = 1 , 2 )$ are defined by (10). Let $X ↪ B ( p , q )$ be the Beta distribution parameterized by two given shape parameters $( p , q ) ∈ R + * 2$. Assume that Z is given by:
$Z = − β k 1 + ( β k 1 + β k 2 ) X .$
Then, the probability density function $f Z$ of Z can be written as:
$f Z ( z ) = Γ ( p + q ) Γ ( p ) Γ ( q ) β k 1 p − 1 β k 2 q − 1 ( β k 1 + β k 2 ) p + q − 1 1 + z β k 1 p − 1 1 − z β k 2 q − 1 𝟙 [ − β k 1 , β k 2 ] ( z ) ,$
where $𝟙 [ − β k 1 , β k 2 ]$ is the indicator function of the interval $[ − β k 1 , β k 2 ]$.
Proof.
As we already noticed, the support of $f Z$ is clearly within $[ − β k 1 , β k 2 ]$ as soon as those of $X ( k i )$ are in $[ 0 , β k i ]$.
Now, let us evaluate the cumulative distribution function $F Z ( z )$ of the random variable Z as defined by (21):
$F Z ( z ) = P r o b Z ≤ z = P r o b − β k 1 + ( β k 1 + β k 2 ) X ≤ z$
$= P r o b X ≤ z + β k 1 β k 1 + β k 2 = Γ ( p + q ) Γ ( p ) Γ ( q ) ∫ 0 z + β k 1 β k 1 + β k 2 u p − 1 ( 1 − u ) q − 1 d u .$
Then, by deriving $F Z ( z )$ as given by (24), we obtain the expression (22) of the density $f Z$. □
Remark 1.
Let us now give an interpretation of this result. As mentioned above, in our previous works (see, for example, [1,3]), we assumed the random variables $X ( k i )$ to be uniformly distributed on their support $[ 0 , β k i ]$. This enabled us to derive the density $f Z$ of the random variable Z (see Theorem 3.1 in [1]).
Here, Lemma 2 can be interpreted regarding the new assumption implicitly made on the random variables $X ( k i )$. Indeed, if we rewrite Z as:
$Z = β k 2 X − ( β k 1 − β k 1 X ) ,$
by setting:
$X ( k 1 ) ≡ β k 1 − β k 1 X a n d X ( k 2 ) = β k 2 X ,$
we observe that the support of $X ( k i )$ belongs to $[ 0 , β k i ]$, on the one hand, and that the difference $X ( k 2 ) − X ( k 1 )$ is equal to Z, on the other hand.
In other words, the choice to write Z given by (21) as a dimensional Beta distribution on $[ − β k 1 , β k 2 ]$ (which corresponds to altering the location and scale of the standard Beta distribution), accordingly modifies the hypothesis of uniformity of the two variables $X ( k i )$ to the one of a non-dimensional Beta distribution.
We are now capable to derive the new probability distribution $P k 1 , k 2 ( h )$ in order to evaluate which, between two Lagrange finite elements $P k 1$ and $P k 2 , ( k 1 < k 2 )$, is the more likely accurate. This is the purpose of the following Theorem.
Theorem 1.
Let $X ( k i ) ( i = 1 , 2 ) ,$ be the two random non-dimensional Beta distribution variables defined by (25) and Z the corresponding random variable defined by (21).
Then, $P k 1 , k 2 ( h )$ is the cumulative distribution function of a Generalized Beta Prime random variable H, whose density of probability $f H$, is defined by four parameters $( p , q , k 2 − k 1 , h k 1 , k 2 * )$, and we have:
$P k 1 , k 2 ( h ) = P r o b { H ≥ h } = ∫ h + ∞ f H ( s ; q , p , k 2 − k 1 , h k 1 , k 2 * ) d s ,$
where:
$f H ( s ; q , p , k 2 − k 1 , h k 1 , k 2 * ) = Γ ( p + q ) Γ ( p ) Γ ( q ) ( k 2 − k 1 ) h k 1 , k 2 * s h k 1 , k 2 * q ( k 2 − k 1 ) − 1 1 + s h k 1 , k 2 * k 2 − k 1 − p − q .$
Proof.
On the one hand, we are looking for a random variable H whose $[ 0 , + ∞ [$, which is associated to $P k 1 , k 2 ( h )$ equal to $F Z ( 0 )$. On the other hand, (24) was derived by the help of the dimensionless Beta distribution $B ( p , q )$, whose support belongs to $[ 0 , 1 ]$. Hence, we set the following change of variable in (24):
$s = u 1 − u ,$
and we get:
$F Z ( z ) = Γ ( p + q ) Γ ( p ) Γ ( q ) ∫ 0 z + β k 1 β k 2 − z s p − 1 ( 1 + s ) − p − q d s .$
Now, taking $z = 0$ in (28), and using $β k i ( i = 1 , 2 )$, given by (6) and $h k 1 , k 2 *$ by (15), we obtain:
$F Z ( 0 ) = P k 1 , k 2 ( h ) = Γ ( p + q ) Γ ( p ) Γ ( q ) ∫ 0 β k 1 β k 2 s p − 1 ( 1 + s ) − p − q d s ,$
$= Γ ( p + q ) Γ ( p ) Γ ( q ) ∫ 0 ( h k 1 , k 2 * / h ) k 2 − k 1 s p − 1 ( 1 + s ) − p − q d s .$
A last change of variables, $t = 1 s$, in (30) leads to:
$F Z ( 0 ) = Γ ( p + q ) Γ ( p ) Γ ( q ) ∫ ( h / h k 1 , k 2 * ) k 2 − k 1 + ∞ t q − 1 ( 1 + t ) − p − q d t .$
Finally, by considering the probability of the complementary event considered in (31) as a function of h (namely, $1 − P k 1 , k 2 ( h )$), and after derivation with respect to h, we get the density $f H$ defined in (27), which is a GBP probability density function parameterized by $( q , p , k 2 − k 1 , h k 1 , k 2 * )$, see for example [17].
In other words, we have obtained:
$P r o b X ( k 1 ) ≤ X ( k 2 ) = 1 − P r o b X ( k 2 ) ≤ X ( k 1 ) ( h ) = ∫ o h f H ( s ; q , p , k 2 − k 1 , h k 1 , k 2 * ) d s ,$
and $P k 1 , k 2 ( h )$ in (26) is deduced by complementarity. □
Remark 2.
1.
From (26) and (27), or equivalently from (30), we can obtain the asymptotic behavior of the probability of the event $X ( k 1 ) ≤ X ( k 2 )$ as the mesh size h goes to 0.
Clearly, $P r o b X ( k 1 ) ≤ X ( k 2 )$ goes to 0 with h. In other words, we have found with the new probabilistic law the usual result, which claims that the finite element $P k 2$ is more accurate than $P k 1$, since from (4) and (5), $h k 2$ goes faster to zero than $h k 1$ when $k 1 < k 2$.
Here, by (30), the same property is expressed in terms of probability, namely: the event $X ( k 1 ) ≤ X ( k 2 )$ occurs almost never, or equivalently, the event $X ( k 2 ) ≤ X ( k 1 )$ is an almost surely one, since its probability is equal to 1.
2.
From (26), using the positivity of the density $f H$, we conclude that $P k 1 , k 2 ( h )$ is a decreasing function of h. This property was already observed with the “Sigmoid” model in [1,3] where uniformity of the random variables $X ( k i ) , ( i = 1 , 2 ) ,$ was assumed.
In other words, this property claims that the event is true only for small h, and the more h increases the less likely this property is. Moreover, when h becomes large, asymptotically the event occurs almost never.

## 5. Numerical Comparisons

This section is devoted to the numerical comparison of the GBP probabilistic law (26), as derived in Theorem 1, with the statistical frequencies, computed as explained in Section 3.
Concerning the mesh generation, as explained above, we have considered N different meshes with the same (maximum) mesh size h. In these conditions, if we denote by $T h$ one of the N meshes, each other mesh is constructed as a random perturbation of $T h$ obtained by moving and/or adding internal vertices, in such a way that h remains constant, the number of points on the boundary being constant.
Let us also explain how we proceeded to determine the four parameters $( q , p , k 2 − k 1 , h k 1 , k 2 * )$ of density $f H$. First of all, we considered and fixed two finite elements, for example $P 1$ and $P 3$, which corresponds to $k 2 − k 1 = 2$.
Regarding the three other parameters, $p , q$ and $h k 1 , k 2 *$, we have implemented a least squares optimization (see for instance [18]) to minimize the norm $L 2$ of the difference between the statistical frequencies and the corresponding values, computed by the GBP model. This was computed using the Excel solver.
The results we obtained are depicted in Figure 5 for $N = 500$ different meshes. As one can see, the least squares algorithm found the optimal parameters p, q and $h k 1 , k 2 *$ such that the fit between the statistical frequencies and the GBP law (26) is very satisfactory.
This is clearly due to the the richness and flexibility of this distribution, which provides two supplementary degrees of freedom, p and q, with respect to the “Sigmoid” law. This also validates a posteriori the approach presented in Section 4, which describes the randomness values of the approximation errors $∥ u h ( k i ) − u ∥ 1 , Ω , ( i = 1 , 2 ) ,$ within their respective interval $[ 0 , β k i ] , ( i = 1 , 2 )$.
Another example that enables one to appreciate the accuracy of the fit may be obtained by comparing the results implemented with the two Lagrange finite elements $P 1$ and $P 4$, also for $N = 500$. In this case, as one can see in Figure 6, the “Sigmoid” law only describes the global trend of the statistical frequencies, whereas the GBP law perfectly fits the corresponding data.
As a last illustration, let us present the results obtained when comparing the $P 2$ and $P 3$ finite elements. The difference in the quality of the fit between the “Sigmoid” law and the GBP law becomes even more important than in the previous case. Indeed, in this case, $k 2 − k 1$ is equal to one and the “Sigmoid” law becomes a linear function of h when $h ≤ h k 1 , k 2 *$ in formula (13). In addition, the GBP law once again fit the statistical frequencies very well (see Figure 7).

## 6. Conclusions

In this paper, we derived a new family of probabilistic laws to compare the accuracy between two Lagrange finite elements $P k 1$ and $P k 2 ( k 1 < k 2 )$. Based on a new probabilistic approach introduced in [1], we extended these previous results to improve the fit between the statistical frequencies obtained by implementing practical cases and the corresponding probabilities.
We recall, in Section 3, the main problems we were confronted with regarding the accuracy of the previous probabilistic laws we derived in [1,2,3], by identifying the gaps we observed with the corresponding statistical frequencies.
Afterwards, in Section 4 we derived the new probabilistic law based on the Generalized Beta Prime law. In Section 5, we showed the very good fit we obtained between the numerical results and the new probability law. This corrected the main observed deficiencies described above with simpler laws.
Finally, the new probabilistic law, together with our statistical validation, confirmed the validity of considering the approximation errors as random variables defined in a well-adapted probabilistic framework. Of course, this probabilistic approach is not restricted to Lagrange $P k$ finite elements and can be extended to any other finite elements, like rectangular $Q k$. It can also be generalized to any types of approximation in the following sense: given a class of numerical schemes and their corresponding approximation error estimates, we should be able to order them not only in terms of asymptotic rate of convergence but also by evaluating the most probably accurate one.

## Author Contributions

Conceptualization, J.C.; methodology, J.C.; software, F.A.; writing—original draft preparation, J.C.; writing—review and editing, F.A.; supervision, J.C. All authors have read and agreed to the published version of the manuscript.

## Funding

This research received no external funding.

Not applicable.

Not applicable.

Not applicable.

## Acknowledgments

The authors want to warmly dedicate this research to pay homage to the memory of André Avez and Gérard Tronel who largely promote the passion of research and teaching in mathematics.

## Conflicts of Interest

The author declares no conflict of interest.

## References

1. Chaskalovic, J. A probabilistic approach for solutions of determinist PDE’s as well as their finite element approximations. Axioms 2021, 10, 349. [Google Scholar] [CrossRef]
2. Chaskalovic, J.; Assous, F. Explicit k-dependence for Pk finite elements in Wm,p error estimates: Application to probabilistic laws for accuracy analysis. Appl. Anal. 2021, 100, 2825–2843. [Google Scholar] [CrossRef]
3. Chaskalovic, J.; Assous, F. A new mixed functional-probabilistic approach for finite element accuracy. Comput. Methods Appl. Math. 2020, 20, 799–813. [Google Scholar] [CrossRef]
4. Abdulle, A.; Garegnani, G. A probabilistic finite element method based on random meshes: A posteriori error estimators and Bayesian inverse problems. Comput. Methods Appl. Mech. Eng. 2021, 384, 113961. [Google Scholar] [CrossRef]
5. Hennig, P.; Osborne, M.A.; Girolami, M. Probabilistic numerics and uncertainty in computations. Proc. R. Soc. A Math. Phys. Eng. Sci. 2015, 471, 20150142. [Google Scholar] [CrossRef] [PubMed] [Green Version]
6. Oates, C.J.; Sullivan, T.J. A modern retrospective on probabilistic numerics. Stat. Comput. 2019, 29, 1335–1351. [Google Scholar] [CrossRef] [Green Version]
7. Assous, F.; Chaskalovic, J. Data mining techniques for scientific computing: Application to asymptotic paraxial approximations to model ultra-relativistic particles. J. Comput. Phys. 2011, 230, 4811–4827. [Google Scholar] [CrossRef]
8. Chaskalovic, J.; Assous, F. Numerical validation of probabilistic laws to evaluate finite element error estimates. Math. Model. Anal. 2021, 26, 684–695. [Google Scholar] [CrossRef]
9. Chaskalovic, J. Mathematical and Numerical Methods for Partial Differential Equations; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
10. Ciarlet, P.G. The Finite Element Method for Elliptic Problems; Classics in Applied Mathematics; SIAM: Philadelphia, PA, USA, 2002. [Google Scholar]
11. Raviart, P.A.; Thomas, J.M.; Thomas, J.M. Introduction à L’analyse Numérique des Équations aux Dérivées Partielles; Elsevier Masson: Paris, France, 1982. [Google Scholar]
12. Rossi, R.J. Mathematical Statistics: An Introduction to Likelihood Based Inference; John Wiley and Sons: New York, NY, USA, 2018. [Google Scholar]
13. Jeune, M.L. Théorie de L’estimation Paramétrique Ponctuelle; Springer: Paris, France, 2010. [Google Scholar]
14. Hecht, F. New development in FreeFem++. J. Numer. Math. 2012, 20, 251–265. [Google Scholar] [CrossRef]
15. Runge, C. Über empirische Funktionen und die Interpolation zwischen äquidistanten Ordinaten. Zeitschrift für Mathematik und Physik 1901, 46, 224–243. [Google Scholar]
16. Epperson, J.F. On the Runge Example. Am. Math. Mon. 1987, 94-4, 329–341. [Google Scholar] [CrossRef]
17. Gavin, E. Crooks, Field Guide to Continuous Probability Distributions; Berkeley Institute for Theoretical Science: Berkeley, CA, USA, 2019. [Google Scholar]
18. Lawson, C.L.; Hanson, R.J. Solving Least Squares Problems; SIAM in Applied Mathematics: Philadelphia, PA, USA, 1995. [Google Scholar]
Figure 1. Superposition of two independent meshes generated with the same mesh size h.
Figure 1. Superposition of two independent meshes generated with the same mesh size h.
Figure 2. General shape of probabilistic law (13) and (14) together with the limit Heaviside case (16).
Figure 2. General shape of probabilistic law (13) and (14) together with the limit Heaviside case (16).
Figure 3. $P 1$ versus $P 3$ for the Runge function with $α = 3000$. Comparison between the statistical frequencies (blue) and the “Sigmoid law” (orange).
Figure 3. $P 1$ versus $P 3$ for the Runge function with $α = 3000$. Comparison between the statistical frequencies (blue) and the “Sigmoid law” (orange).
Figure 4. $P 2$ versus $P 3$ for the smooth case. Comparison between the statistical frequencies (blue) and the two probabilistic “Sigmoid” laws (orange).
Figure 4. $P 2$ versus $P 3$ for the smooth case. Comparison between the statistical frequencies (blue) and the two probabilistic “Sigmoid” laws (orange).
Figure 5. $P 1$ versus $P 3$ for the Runge function with $α = 3000$. Comparison between the statistical frequencies (blue) and the GBP law (red).
Figure 5. $P 1$ versus $P 3$ for the Runge function with $α = 3000$. Comparison between the statistical frequencies (blue) and the GBP law (red).
Figure 6. $P 1$ versus $P 4$. Up: comparison between the statistical frequencies (blue) and the Sigmoid (orange). Down: comparison between the statistical frequencies (blue) and the GBP law (red).
Figure 6. $P 1$ versus $P 4$. Up: comparison between the statistical frequencies (blue) and the Sigmoid (orange). Down: comparison between the statistical frequencies (blue) and the GBP law (red).
Figure 7. $P 2$ versus $P 3$. Up: comparison between the statistical frequencies (blue) and the Sigmoid (orange). Down: comparison between the statistical frequencies (blue) and the GBP law (red).
Figure 7. $P 2$ versus $P 3$. Up: comparison between the statistical frequencies (blue) and the Sigmoid (orange). Down: comparison between the statistical frequencies (blue) and the GBP law (red).
 Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Share and Cite

MDPI and ACS Style

Chaskalovic, J.; Assous, F. Generalized Beta Prime Distribution Applied to Finite Element Error Approximation. Axioms 2022, 11, 84. https://doi.org/10.3390/axioms11030084

AMA Style

Chaskalovic J, Assous F. Generalized Beta Prime Distribution Applied to Finite Element Error Approximation. Axioms. 2022; 11(3):84. https://doi.org/10.3390/axioms11030084

Chicago/Turabian Style

Chaskalovic, Joël, and Franck Assous. 2022. "Generalized Beta Prime Distribution Applied to Finite Element Error Approximation" Axioms 11, no. 3: 84. https://doi.org/10.3390/axioms11030084

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.