Article

The Optimal Shape Parameter for the Least Squares Approximation Based on the Radial Basis Function

School of Mathematics Science & Key Laboratory of Mathematics, Informatics and Behavioral Semantics, Ministry of Education, Beihang University, Beijing 100191, China
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(11), 1923; https://doi.org/10.3390/math8111923
Submission received: 20 August 2020 / Revised: 22 October 2020 / Accepted: 22 October 2020 / Published: 2 November 2020
(This article belongs to the Special Issue Approximation Theory and Methods 2020)

Abstract
The radial basis function (RBF) is a class of approximation functions commonly used in interpolation and least squares. The RBF is especially suitable for scattered data approximation and high-dimensional function approximation. The smoothness and approximation accuracy of the RBF are affected by its shape parameter. There has been some research on the shape parameter, but research on the optimal shape parameter of RBF-based least squares is scarce. This paper proposes a way to measure the optimal shape parameter of the least squares approximation based on the RBF and an algorithm to solve for that parameter. The method treats the shape parameter as an optimization variable of the least squares problem, so that the linear least squares problem becomes nonlinear. A dimensionality reduction is applied to the nonlinear least squares problem in order to simplify the objective function. To solve the optimization problem efficiently after the dimensional reduction, derivative-free optimization is adopted. The numerical experiments indicate that the proposed method is efficient and reliable. Multiple kinds of RBFs are tested and compared. The experiments show that the RBF least squares with the optimal shape parameter is much better than the polynomial least squares. The method is successfully applied to the fitting of real data.

1. Introduction

The radial basis function (RBF) is a class of approximation functions that is widely used nowadays in many fields. It is generally believed that the RBF method was originally introduced in Hardy’s work [1]. Since then, there has been much theoretical research on RBF approximation and many applications of it. Two valuable books [2,3] are recommended here. Radial basis functions fall into two groups: global RBFs and compactly supported RBFs [2,3,4,5].
The RBF approximation function space $P = \mathrm{span}\{\Phi_1(x), \ldots, \Phi_M(x)\}$ on $\mathbb{R}^d$ can be obtained from a one-dimensional radial function $\phi(r)$ $(r > 0)$ and given center points $\{y_j\}_{j=1}^{M}$ in $\mathbb{R}^d$. Here, $\Phi_j(x) = \phi(\|x - y_j\|)$, $1 \le j \le M$. Because the number and the distribution of the center points are almost arbitrary, $P$ is flexible and easily constructed. Therefore, the RBF approximation is suitable for scattered data approximation and high-dimensional function approximation.
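To make this construction concrete, here is a minimal sketch (ours, in Python with NumPy; the paper's own experiments use Matlab, and the function names and the Gaussian example are our choices, not from the paper):

```python
import numpy as np

def rbf_basis(x, centers, phi):
    """Evaluate Phi_j(x) = phi(||x - y_j||) for all points x and centers y_j."""
    # x: (N, d) evaluation points; centers: (M, d); result: (N, M) matrix
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return phi(dist)

# Example with a Gaussian radial function phi(r) = exp(-r^2):
# B = rbf_basis(points, centers, lambda r: np.exp(-r ** 2))
```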
Almost all radial functions have a shape parameter. The shape parameter determines the shape of the basis function and hence the approximation function space $P$. Therefore, the selection of the shape parameter affects both the approximation accuracy of the RBF and the computational stability. An important problem is how to define and select a parameter that achieves the optimal approximation accuracy for the RBF approximation. This problem is called the selection of the optimal shape parameter in this paper.
The research on the selection of optimal shape parameters is mainly focused on the RBF finite difference, the RBF least squares and the RBF interpolation.
Feng and Duan [6] discussed the influence of the shape parameter of RBF finite differences on the approximation error and gave the optimal shape parameter of the multiquadric RBF finite difference, which yields a convergence order of the RBF finite difference twice that of the polynomial finite difference on the same stencil. Liu, C.S. and Liu, D.J. [7] proposed an energy gap functional (EGF) for the elliptic partial differential equation and defined an optimal shape parameter that minimizes the EGF. Chen et al. [8] studied the optimal shape parameter in the Kansa method with the multiquadric RBF. They proposed a sample solution for every PDE problem and applied the optimal shape parameter to the Kansa method for the PDE problem.
Majdisova and Skala [9,10] examined the RBF least squares with real data and synthetic data. They tried to improve the approximation by changing the shape parameters and defined the optimal shape parameters by abundant experiments.
Rippa [11] employed the idea of cross validation from statistics and constructed a cost function for the RBF interpolation; he obtained the optimal shape parameter by minimizing the cost function. This method is known as the leave-one-out cross validation method (abbr. LOOCV) in the RBF interpolation. Fasshauer and Zhang [12] combined the idea of the LOOCV method with approximate moving least squares (abbr. AMLS) and obtained the optimal shape parameter in AMLS based on RBF. Azarboni et al. [13] extended Rippa’s method, changing the “one out” into “two out”, and proposed the leave-two-out cross validation method (abbr. LTOCV) in the RBF interpolation, which can also be used in AMLS. Luh also performed a series of studies on the selection of optimal shape parameters in the RBF interpolation. One of Luh’s important works [14] proposes an error upper bound for the multiquadric RBF interpolation; Luh obtained the optimal shape parameter by minimizing the error upper bound with respect to the shape parameter.
All of the above studies proposed a way of measuring the optimal shape parameters for their respective research problems and provided a method to obtain those parameters.
However, research on the optimal shape parameter of the least squares based on the RBF is scarce. Most of the existing research in this field obtains the optimal shape parameter by experiments and trials. Therefore, this paper focuses on how to define a measure of the optimal shape parameter in the RBF least squares approximation and how to obtain that parameter efficiently.
The rest of the paper is organized as follows. In Section 2, we introduce the nonlinear least squares problem and define the optimal shape parameter. To simplify the nonlinear least squares problem, we perform a dimensional reduction on it in Section 3. To solve the reduced problem efficiently, Powell’s DFO method is introduced and applied in Section 4. Section 5 presents various numerical experiments on convergence, efficiency, and curved surface fitting. Section 6 presents the conclusion and a possible generalization.

2. The Nonlinear Least Squares with Optimal Shape Parameter

The least squares method is a classical and widely used approximation method. The general form of the RBF least squares is as follows: for a sampling function $f(x): \Omega \to \mathbb{R}$ and given sampled data $\{(x_i, f_i)\}_{i=1}^{N}$, the RBFs $\{\Phi_j(x;\alpha)\}_{j=1}^{M}$ are obtained from given center points $\{y_j\}_{j=1}^{M}$ and a radial function $\phi(r;\alpha): \mathbb{R}_+ \to \mathbb{R}$. Here, $\Phi_j(x;\alpha) = \phi(\|x - y_j\|;\alpha)$, $1 \le j \le M$. The following optimization problem

$$\min_{c \in \mathbb{R}^M} \sum_{i=1}^{N} \Big( f_i - \sum_{j=1}^{M} c_j \cdot \Phi_j(x_i;\alpha) \Big)^2 \qquad (1)$$

is solved in the approximation function space $P = \mathrm{span}\{\Phi_1(x;\alpha), \ldots, \Phi_M(x;\alpha)\}$, where $\alpha$ is a given parameter called the shape parameter, $c$ is the coefficient vector, and $c_j$ is the $j$-th component of $c$.
Since this paper focuses on the selection of the shape parameter, we assume that the solution to problem (1) always exists and is unique; hence, $M \le N$. We denote the solution to problem (1) by $\bar{c}$, and the corresponding approximation function is $\bar{s}(x;\alpha) = \sum_{j=1}^{M} \bar{c}_j \cdot \Phi_j(x;\alpha)$.
Research has been conducted on the RBF least squares approximation. Fasshauer discussed the solvability conditions and approximation error bounds of problem (1) in his book Meshfree Approximation Methods with Matlab [3]. Majdisova and Skala demonstrated the excellent performance of the RBF least squares in applications in [9,10].
The basic idea of the least squares is that the smaller the objective function of problem (1) is, the closer the approximation function $\bar{s}(x;\alpha)$ is to the sampling function $f(x)$. Hence, the optimal value of problem (1) under a certain shape parameter $\alpha$, i.e., $\sum_{i=1}^{N} (f_i - \bar{s}(x_i;\alpha))^2$, can be used to measure the shape parameter. We change the shape parameter from a constant into a variable and add it to the optimization problem, obtaining the following problem:

$$\min_{c \in \mathbb{R}^M,\ \alpha \in D} \sum_{i=1}^{N} \Big( f_i - \sum_{j=1}^{M} c_j \cdot \Phi_j(x_i;\alpha) \Big)^2, \qquad (2)$$

where $D$ is the domain of definition of the shape parameter or a given restricted interval.
The number of optimization variables thus increases from $M$ to $M+1$, and the corresponding least squares problem changes from linear to nonlinear.
We denote the solution to problem (2) by $(c^*, \alpha^*)$ and refer to $\alpha^*$ as the optimal shape parameter for the RBF least squares. The approximation function $s^*(x) = \sum_{j=1}^{M} c_j^* \cdot \Phi_j(x;\alpha^*)$ is the closest function to $f(x)$ under the $l_2$ norm.
According to the references [15,16], the classical methods for solving such a least squares problem are the Gauss–Newton method and methods derived from it, such as the Levenberg–Marquardt method, the structural Newton method, and the generalized Gauss–Newton method. However, all of these methods require the derivative, or the generalized derivative, of the objective function.
The derivative properties of different types of radial basis functions vary greatly, and M can be huge in some applications and research. These two points make it unwise to solve problem (2) directly using the Gauss–Newton method.

3. The Dimensional Reduction of Nonlinear Least Squares

As described in the previous section, it is not recommended to solve problem (2) directly. According to the literature [17], problem (2) is a separable nonlinear least squares problem, and there is an equivalent problem

$$\min_{\alpha \in D} R_2(\alpha), \qquad (3)$$

where

$$R_2(\alpha) := \min_{c \in \mathbb{R}^M} \sum_{i=1}^{N} \Big( f_i - \sum_{j=1}^{M} c_j \cdot \Phi_j(x_i;\alpha) \Big)^2. \qquad (4)$$
It is worth noting that, after the dimensional reduction, problem (3) becomes one-dimensional, but no explicit expression of $R_2(\alpha)$ is available.
Remark 1.
For a given $\alpha$, calculating $R_2(\alpha)$ amounts to solving a classical linear least squares problem, so it is relatively easy to obtain the value of $R_2(\alpha)$. However, for the variable $\alpha$, it is difficult to obtain an explicit expression of $R_2(\alpha)$ with respect to $\alpha$. This is particularly difficult when $M$ is large.
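To make Remark 1 concrete, the following sketch (ours, in Python with NumPy; the helper name and interface are our choices) evaluates $R_2(\alpha)$ at a given $\alpha$ by a single linear least squares solve:

```python
import numpy as np

def eval_R2(alpha, x, y, f, phi):
    """Evaluate R_2(alpha) of (4) by solving the inner linear least squares problem."""
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)  # ||x_i - y_j||
    A = phi(dist, alpha)                        # N x M collocation matrix Phi(alpha)
    c, *_ = np.linalg.lstsq(A, f, rcond=None)   # best coefficients for this alpha
    r = f - A @ c
    return float(r @ r)                         # squared l2 residual
```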
Golub and Pereyra proved the equivalence between problem (2) and problem (3) in [17]. We now state an important theorem from [17].
We use $F$ to denote the vector $[f_1, \ldots, f_N]^T$ and define the matrix

$$\Phi(\alpha) := \begin{pmatrix} \Phi_1(x_1;\alpha) & \cdots & \Phi_M(x_1;\alpha) \\ \vdots & \ddots & \vdots \\ \Phi_1(x_N;\alpha) & \cdots & \Phi_M(x_N;\alpha) \end{pmatrix}. \qquad (5)$$

We use $R(c, \alpha)$ to denote the objective function of problem (2), namely

$$R(c, \alpha) = \sum_{i=1}^{N} \Big( f_i - \sum_{j=1}^{M} c_j \cdot \Phi_j(x_i;\alpha) \Big)^2. \qquad (6)$$
Theorem 1.
$R(c, \alpha)$ and $R_2(\alpha)$ are defined as in (6) and (4), respectively. We assume that, in the open set $\Omega \subset \mathbb{R}$, $\Phi(\alpha)$ has constant rank $r \le \min(M, N)$.
(a) 
If $\hat{\alpha}$ is a critical point (or a global minimizer) of $R_2(\alpha)$ in $\Omega$, and

$$\hat{c} = \Phi(\hat{\alpha})^{+} F, \qquad (7)$$

then $(\hat{c}, \hat{\alpha})$ is a critical point of $R(c, \alpha)$ (or a global minimizer for $\alpha \in \Omega$) and $R(\hat{c}, \hat{\alpha}) = R_2(\hat{\alpha})$.
(b) 
If $(\hat{c}, \hat{\alpha})$ is a global minimizer of $R(c, \alpha)$ for $\alpha \in \Omega$, then $\hat{\alpha}$ is a global minimizer of $R_2(\alpha)$ in $\Omega$ and $R_2(\hat{\alpha}) = R(\hat{c}, \hat{\alpha})$. Furthermore, if there is a unique $\hat{\alpha}$ among the minimizing pairs of $R(c, \alpha)$, then $\hat{\alpha}$ must satisfy (7).
The proof of Theorem 1 is rather involved, so it is omitted here; interested readers can refer to [17]. We have assumed that the solution to problem (1) always exists and is unique, so the condition of Theorem 1 always holds. Therefore, problem (2) is equivalent to problem (3), and the solution to problem (3) is our optimal shape parameter for the RBF least squares.
Remark 2.
The symbol $\Phi(\hat{\alpha})^{+}$ in (7) denotes the Moore–Penrose generalized inverse of $\Phi(\hat{\alpha})$. When the basis functions are linearly independent and $M \le N$, the rank of $\Phi(\alpha)$ is $M$, and $\hat{c}$ in (7) is the solution to problem (1) corresponding to $\hat{\alpha}$. Theorem 1 indicates that, if the radial basis functions we use are linearly independent, problem (2) is equivalent to problem (3). Hence, we can obtain the optimal shape parameter by solving problem (3).
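As a sketch of how (7) is used in practice (our illustration; `Phi_hat` and `F` are assumed to hold the matrix $\Phi(\hat{\alpha})$ and the sample vector):

```python
import numpy as np

# c_hat = Phi(alpha_hat)^+ F; when Phi(alpha_hat) has full column rank,
# this equals the solution of the linear least squares problem (1) at alpha_hat
c_hat = np.linalg.pinv(Phi_hat) @ F
```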

4. The Solution of Optimal Shape Parameter

We have proposed the definition of the optimal shape parameter and the corresponding optimization problem in Section 2. The dimension of the equivalent problem (3) is reduced from M + 1 to 1 in Section 3. Although the dimension of the problem has been reduced, how to solve problem (3) efficiently still needs to be discussed.
Golub and Pereyra conducted research on separable nonlinear least squares problems such as problem (2) in [17]. They point out that it is feasible to use the Gauss–Newton method to solve problem (2) and problem (3) when the initial value is close to the solution. Their experiments show that, when $M$ is relatively small, the efficiency of the Gauss–Newton method in solving (3) is equivalent to that in solving (2). If $M$ is substantially increased, obtaining the explicit expression of $R_2(\alpha)$ becomes extremely difficult, thereby increasing the cost of solving problem (3) with the Gauss–Newton method.
The problems that Cotnoir and Terzić discussed in [18] are rooted in practical problems, naturally involving quite large N and M. They believe that it is difficult to solve a problem like (3) by using gradient-based methods, including the Gauss–Newton method, when M or N is large. Hence, they suggest using the direct search method to solve problem (3). Among the direct methods, they recommend the genetic algorithm (GA) method.
In order to obtain the optimal shape parameter efficiently and accurately when $N$ or $M$ is relatively large, we use a derivative-free optimization (DFO) method instead of the above methods to solve problem (3). Among the various DFO methods, we choose Powell’s UOBYQA method [19] for problem (3). Powell’s UOBYQA replaces the objective function with its quadratic interpolant at each iteration; intuitively, the method converges at a faster-than-linear rate. Among several classic DFO methods, the performance of Powell’s UOBYQA was outstanding in our trials.
The UOBYQA method is designed for optimization problems of arbitrary dimension, and the construction and updating of the interpolation point set is complicated for high-dimensional problems. Because our problem (3) is one-dimensional, we can simplify the construction and updating of the interpolation point set of the original UOBYQA method. Hence, we propose a matching algorithm that combines the characteristics of problem (3) with the UOBYQA method.

4.1. Powell’s UOBYQA Method

UOBYQA, short for unconstrained optimization by quadratic approximation, is a trust-region derivative-free optimization method for solving unconstrained optimization problems. Its essence is to replace the objective function with its quadratic interpolation in each trust region, so as to solve the subproblem of the trust region.
For an objective function $F(x)$, the UOBYQA method begins with an initial point $x_0$, an interpolation set $N_0$, and a trust-region radius $r_0$. The set $N_0$ must guarantee that the quadratic interpolant of $F(x)$ exists and is unique. At each step, $x_k$, $N_k$ and $r_k$ are updated. At step $(k+1)$, let $m_k(x)$ be the quadratic interpolant of $F$ on $N_k$, let $x_k$ be the current iterate, and let $r_k$ be the current trust-region radius. The trust-region subproblem is then

$$\min_{\|x - x_k\|_2 \le r_k} m_k(x).$$

Let $\tilde{x}$ denote the solution to the subproblem, and calculate the ratio

$$\rho = \frac{F(x_k) - F(\tilde{x})}{m_k(x_k) - m_k(\tilde{x})}.$$

If $\rho > 0$, we set $x_{k+1} = \tilde{x}$ and update the interpolation set to $N_{k+1}$; otherwise $x_{k+1} = x_k$ and $N_{k+1} = N_k$. The update of the trust-region radius $r_{k+1}$ depends on the size of $\rho$.
The UOBYQA method has two obvious advantages. Firstly, it needs neither the derivative of the objective function nor even its explicit expression; it only requires the value of $F(x)$ at any $x$. Secondly, for the sequence $\{x_k\}_{k=0}^{\infty}$ produced by the UOBYQA method, the corresponding function values $\{F(x_k)\}_{k=0}^{\infty}$ form a monotonically decreasing sequence.
The disadvantages of the UOBYQA method are also obvious. If $x \in \mathbb{R}^d$, the interpolation set $N_k$ at each step has $\frac{1}{2}(d+1)(d+2)$ points, and these points must satisfy specific distribution conditions. Hence, the difficulty of updating $N_k$ increases with $d$. If $d$ is relatively large, the interpolation of the objective function at each step is difficult. Therefore, the UOBYQA method is not efficient for high-dimensional problems.

4.2. A Modified UOBYQA Method

It is worth noting that problem (3) is one-dimensional, so it is well suited to the UOBYQA method. For problem (3), $N_k$ consists of three distinct points, which guarantees that the quadratic interpolant exists and is unique. However, problem (3) may not be an unconstrained optimization problem. Generally, the $D$ in problem (3) is $\mathbb{R} \setminus \{0\}$ or $\mathbb{R}_+ \setminus \{0\}$. According to the characteristics of the radial basis function, $D = \mathbb{R} \setminus \{0\}$ can be converted to $D = \mathbb{R}_+ \setminus \{0\}$. When $D = \mathbb{R}_+ \setminus \{0\}$, problem (3) is an unconstrained optimization problem.
In conclusion, for different kinds of RBFs, D in problem (3) can be processed accordingly, making the problem an unconstrained problem, so that the UOBYQA method can solve problem (3) efficiently.
Since problem (3) is one-dimensional, the quadratic interpolation is easy. In order to make the interpolation more accurate and the update of the set $N_k$ easier, the set is specially constructed at each step as $N_k = \{\alpha_k - \frac{1}{2} r_k,\ \alpha_k,\ \alpha_k + \frac{1}{2} r_k\}$. These three points, together with the endpoints, always divide the trust region $[\alpha_k - r_k, \alpha_k + r_k]$ into four equal parts.
Applying Powell’s UOBYQA method to problem (3) with this small modification, we obtain the following algorithm (Algorithm 1); a code sketch follows the listing:
Algorithm 1 DFO algorithm
Input: objective function $R_2(\alpha)$, trust-region radius $r$, initial value $\alpha_0$
Output: optimal shape parameter $\alpha^*$
1: Set $k = 0$;
2: Set $D = 1$;
3: while $D > 10^{-6}$ do
4:     Set $\alpha_a = \alpha_k - \frac{1}{2}r$, $\alpha_b = \alpha_k + \frac{1}{2}r$;
5:     Set $N_k = \{(\alpha_a, R_2(\alpha_a)), (\alpha_k, R_2(\alpha_k)), (\alpha_b, R_2(\alpha_b))\}$;
6:     Calculate the quadratic interpolant $m_k(\alpha)$ of $R_2(\alpha)$ on $N_k$;
7:     $\tilde{\alpha} = \arg\min_{\alpha \in [\alpha_k - r,\ \alpha_k + r]} m_k(\alpha)$;
8:     $\rho = \dfrac{R_2(\alpha_k) - R_2(\tilde{\alpha})}{m_k(\alpha_k) - m_k(\tilde{\alpha})}$;
9:     if $\rho > 0.75$ then
10:        $r = 2r$;
11:    end if
12:    if $\rho < 0.5$ then
13:        $r = \frac{1}{4}r$;
14:    end if
15:    if $\rho > 0$ then
16:        $\alpha_{k+1} = \tilde{\alpha}$;
17:        $D = |\alpha_{k+1} - \alpha_k|$;
18:        $k = k + 1$;
19:    else
20:        $\alpha_{k+1} = \alpha_k$;
21:        $k = k + 1$;
22:    end if
23: end while
24: $\alpha^* = \alpha_k$;
25: return $\alpha^*$
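The following is a minimal Python sketch of Algorithm 1 (our transcription under the update rules above; `r2` is any routine that evaluates $R_2(\alpha)$, e.g., by a linear least squares solve as in Remark 1, and the iteration cap is our safeguard, not part of the original algorithm):

```python
import numpy as np

def quad_min_on_interval(xs, fs, lo, hi):
    """Interpolate a parabola through three points and minimize it on [lo, hi]."""
    a, b, c = np.polyfit(xs, fs, 2)        # exact interpolation for 3 points
    cands = [lo, hi]
    if a > 0:
        v = -b / (2.0 * a)                 # vertex of the parabola
        if lo < v < hi:
            cands.append(v)
    vals = [a * t * t + b * t + c for t in cands]
    return cands[int(np.argmin(vals))]

def dfo(r2, alpha0, r=1.0, tol=1e-6, max_iter=200):
    """Algorithm 1: modified UOBYQA for the one-dimensional problem (3)."""
    alpha, d, k = float(alpha0), 1.0, 0
    while d > tol and k < max_iter:
        xs = np.array([alpha - r / 2.0, alpha, alpha + r / 2.0])
        fs = np.array([r2(t) for t in xs])
        coeffs = np.polyfit(xs, fs, 2)
        a_new = quad_min_on_interval(xs, fs, alpha - r, alpha + r)
        # m_k(alpha_k) = fs[1], since the interpolant passes through that point
        pred = fs[1] - np.polyval(coeffs, a_new)
        rho = (fs[1] - r2(a_new)) / pred if pred != 0 else 0.0
        if rho > 0.75:
            r *= 2.0
        if rho < 0.5:
            r /= 4.0
        if rho > 0:
            d = abs(a_new - alpha)
            alpha = a_new
        k += 1
    return alpha
```

Combined with the `eval_R2` sketch from Remark 1, a call such as `dfo(lambda a: eval_R2(a, x, y, f, phi), 1.0)` would return an approximate minimizer of (3).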
Remark 3.
The above algorithm is a trust-region method. Compared with classical line search methods, the algorithm can converge quickly and has a wide range of convergence. The algorithm avoids complicated derivative calculations and does not require an explicit expression of $R_2(\alpha)$; thus, it is easy to apply to different kinds of basis functions or sampling points. For a different problem, all we need to do is replace the objective function $R_2(\alpha)$, without any of the additional preparation that the Gauss–Newton method requires. In short, the above algorithm is simple and convenient to implement when solving problems such as problem (3), and it has wide applicability.
In the following sections, we use “the DFO method” to denote Algorithm 1.

5. Numerical Experiments

To demonstrate the feasibility and efficiency of our DFO method, we conduct a series of numerical experiments. In these experiments, the default experimental data are sampled from Franke’s test function, whose expression is

$$f(x,y) = \frac{3}{4} \exp\Big( -\frac{(9x-2)^2 + (9y-2)^2}{4} \Big) + \frac{3}{4} \exp\Big( -\frac{(9x+1)^2}{49} - \frac{9y+1}{10} \Big) - \frac{1}{5} \exp\big( -(9x-4)^2 - (9y-7)^2 \big) + \frac{1}{2} \exp\Big( -\frac{(9x-7)^2 + (9y-3)^2}{4} \Big), \quad (x,y) \in [0,1] \times [0,1].$$
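For convenience, here is a direct Python transcription of the standard form of Franke’s function (our sketch):

```python
import numpy as np

def franke(x, y):
    """The standard Franke test function on [0, 1] x [0, 1]."""
    t1 = 0.75 * np.exp(-((9 * x - 2) ** 2 + (9 * y - 2) ** 2) / 4.0)
    t2 = 0.75 * np.exp(-((9 * x + 1) ** 2) / 49.0 - (9 * y + 1) / 10.0)
    t3 = -0.2 * np.exp(-((9 * x - 4) ** 2) - (9 * y - 7) ** 2)
    t4 = 0.5 * np.exp(-((9 * x - 7) ** 2 + (9 * y - 3) ** 2) / 4.0)
    return t1 + t2 + t3 + t4
```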
In all experiments, $N$ denotes the number of sampling points and $M$ the number of basis functions. Our sampling points and center points are all quasi-scattered Halton points [3,20] in $(0,1) \times (0,1)$. The default initial value of our DFO method is $\alpha_0 = 1$ unless otherwise specified.
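One way to generate such Halton points (a sketch using SciPy’s quasi-Monte Carlo module; the paper’s actual point sets are provided in the Supplementary Materials):

```python
from scipy.stats import qmc

sampler = qmc.Halton(d=2, seed=0)   # 2-D Halton sequence (scrambled by default)
points = sampler.random(5000)       # 5000 quasi-scattered points in [0, 1)^2
```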
We use the error between the approximation function and the sampled values,

$$R_2(\alpha) = R(c, \alpha) = \sum_{i=1}^{N} \Big( f_i - \sum_{j=1}^{M} c_j \cdot \Phi_j(x_i;\alpha) \Big)^2, \qquad (8)$$

to measure the shape parameter $\alpha$, where $c$ is the solution to the classical least squares problem corresponding to $\alpha$. This formula not only measures the shape parameter, but also shows how well the approximation function approximates the sampling function.
All experiments were run on an Intel Core i7-8700 CPU with a base frequency of 3.20 GHz, using Matlab R2019a.
We have made the sample point sets and center point sets of each experiment available, so that interested readers can reproduce the experiments. These data are in the Supplementary Materials.

5.1. The Comparison of the DFO Method and the Gauss–Newton Method

In this section, the radial basis function is the thin plate spline (TPS)

$$\phi(r) = (\alpha r)^2 \cdot \ln(\alpha r), \quad \alpha > 0.$$
As mentioned in Section 3, if $M$ is relatively large, obtaining the objective function $R_2(\alpha)$ of problem (3) explicitly is difficult. If $M$ is particularly large, the computational cost of obtaining the expression and derivative of $R_2(\alpha)$ can exceed that of solving problem (3). Hence, in this section the Gauss–Newton method is used to solve problem (2), and the DFO method is used to solve problem (3). Although the two methods solve different problems, Formula (8) can be used to measure the shape parameters obtained in both ways.
The initial value of the Gauss–Newton method is $(c_0, \alpha_0)$, where $c_0 = \alpha_0 (1, \ldots, 1)^T$.
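As a stand-in for this Gauss–Newton baseline, one can apply SciPy’s Levenberg–Marquardt solver, a Gauss–Newton-type method, to the joint residuals of problem (2) (our sketch; `build_Phi` is an assumed helper returning the $N \times M$ matrix $\Phi(\alpha)$):

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, build_Phi, f):
    """Residual vector of problem (2); params stacks [c_1, ..., c_M, alpha]."""
    c, alpha = params[:-1], params[-1]
    return f - build_Phi(alpha) @ c

# M = number of basis functions, alpha0 = initial shape parameter
# x0 = np.append(alpha0 * np.ones(M), alpha0)   # c_0 = alpha0 * (1, ..., 1)^T
# sol = least_squares(residuals, x0, method="lm", args=(build_Phi, f))
```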

5.1.1. The Convergence Comparison

The classical Gauss–Newton method is locally convergent: it converges quickly when the initial point is near a critical point. Our DFO method, by contrast, is a trust-region method, whose range of convergence is theoretically wider than that of line search methods. Hence, we first compare the convergence of the Gauss–Newton method and the DFO method.
In this experiment, $N = 5000$, and $M$ takes the values 50, 60, 70, and 80. We first set $\alpha_0 = 1$. The results of the two methods in these cases are shown in Table 1.
The results in Table 1 indicate that the $\alpha^*$ obtained by the two methods can differ in the same case. Judging by the size of $R_2(\alpha^*)$, the $\alpha^*$ obtained by the DFO method is more likely to be the optimal shape parameter than that obtained by the Gauss–Newton method. To further understand this result, for the case $M = 50$ we numerically plot $R_2(\alpha)$ around the two $\alpha^*$ values obtained by the two methods in Figure 1.
In Figure 1, the red asterisk marks the $\alpha^*$ obtained by the Gauss–Newton method, and the red circle marks the $\alpha^*$ obtained by the DFO method. We can see that, starting from $\alpha_0 = 1$, the DFO method can “climb” over a local maximum to reach a minimum on the left, while the Gauss–Newton method can only reach a minimum near the initial value.
We then set $\alpha_0 = 0.5$. The results of the two methods are shown in Table 2.
According to Table 2, the results of the DFO method are almost identical to those of the Gauss–Newton method. Comparing Table 2 with Table 1, the DFO method reaches the same minimum point from two different initial values. This shows that our DFO method is insensitive to the choice of initial value, while the Gauss–Newton method depends on it.

5.1.2. The Comparison of Computational Efficiency

We first set $\alpha_0 = 1$. To compare the computational efficiency of the DFO method with that of the Gauss–Newton method, Table 3 lists their computational times in cases where the $\alpha^*$ obtained by the two methods are almost the same. In this experiment, $N = 5000$, and $M$ takes the values 10, 20, 30, and 40.
The results in Table 3 indicate that, when $M$ is small, the Gauss–Newton method takes less time than the DFO method. However, as $M$ increases, the computational time of the DFO method becomes significantly lower than that of the Gauss–Newton method. Such results are predictable: the problem that the Gauss–Newton method solves is $(M+1)$-dimensional, while the problem that the DFO method solves is one-dimensional. This shows that our dimensional reduction is meaningful.
We then set $\alpha_0 = 0.5$, an initial value nearer to the $\alpha^*$ in Table 3. The computational times are listed in Table 4.
Comparing Table 4 with Table 3, the computational time of the DFO method is hardly affected by the initial value, while that of the Gauss–Newton method is heavily affected. The efficiency of the DFO method is thus less dependent on the initial value.
According to Table 4, even the nearer initial value cannot make the Gauss–Newton method faster than the DFO method. Combining these results with those in Table 3, we conclude that, when $M$ is large, the computational efficiency of the DFO method is higher than that of the Gauss–Newton method.
Remark 4.
Ideally, we would use the Gauss–Newton method to solve problem (3) in order to compare it with the DFO method. However, obtaining the expression of $R_2(\alpha)$ and its derivative is difficult; the time needed to obtain $R_2(\alpha)$ and $R_2'(\alpha)$ alone exceeds the times shown in Table 3. Hence, we use the Gauss–Newton method to solve problem (2) instead of problem (3).

5.2. The Results of the DFO Method with a Larger Amount of Data

In practice, we often need to approximate large-scale data and use a large number of basis functions in order to obtain better approximation results; that is, both $N$ and $M$ are generally large. Hence, we test the DFO method with larger $N$ and $M$. The RBF is still the TPS. Here $N = 10{,}000$ and $M$ takes the values 25, 50, 100, 200, 400, and 800. As in the previous experiment, we use $R_2(\alpha^*)$ to show the approximation effect. In this experiment, we plot $R_2(\alpha^*)$ and the computational time of the DFO method against $M$.
Figure 2 shows the relationship between the error $R_2(\alpha^*)$ and $M$. Figure 3 shows the relationship between the computational time and $M$.
Figure 2 indicates that, for the same sampling data, the error $R_2(\alpha^*)$ decreases rapidly as $M$ increases. In other words, the RBF least squares with a large $M$ and our optimal shape parameter can achieve an excellent approximation. Figure 3 indicates that the relationship between the computational time and $M$ is approximately linear. Hence, the DFO method and our optimal shape parameter are suitable for the case of a large $M$.

5.3. The Comparison of Results on Different Radial Bases

In order to compare the results on different radial bases adequately, we conduct experiments in two settings. As in the previous experiment, we use $R_2(\alpha^*)$ to show the approximation effect. First, we compare $R_2(\alpha^*)$ and the computational time on different radial bases under two sizes of sampling datasets. Second, we add some new test functions and show the results of the different radial bases.
Although there are many kinds of RBFs, our experiments use only four commonly used kinds, shown in Table 5.
We compare the approximation effects and computational time of the DFO method with these basis functions in every experiment.
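For reference, the four radial functions listed in Table 5 might be coded as follows (a sketch in Python/NumPy, ours; we take the TPS value at $r = 0$ to be its limit 0, and read the CS-RBF as Wendland’s $\phi_{3,1}$):

```python
import numpy as np

def tps(r, a):   # thin plate spline (a*r)^2 ln(a*r), with value 0 at r = 0
    ar = a * r
    return np.where(ar > 0, ar ** 2 * np.log(np.where(ar > 0, ar, 1.0)), 0.0)

def mq(r, a):    # multiquadric
    return np.sqrt(r ** 2 + a ** 2)

def iq(r, a):    # inverse quadric, as listed in Table 5
    return 1.0 / np.sqrt(r ** 2 + a ** 2)

def cs(r, a):    # compactly supported Wendland phi_{3,1}
    t = np.maximum(1.0 - a * r, 0.0)
    return t ** 4 * (4.0 * a * r + 1.0)
```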

5.3.1. The Comparison of Two Sizes of Sampling Datasets

The number of sampling points is $N = 5000$; they are quasi-scattered Halton points. The number of center points $M$ takes the values 10, 20, …, 100. Figure 4 shows the relationship between $R_2(\alpha^*)$ and $M$ for the four kinds of basis functions.
Figure 5 shows the relationship between the computational time and M corresponding to the four kinds of basis functions.
In Figure 4 and Figure 5, the red polyline represents TPS-RBF, the green polyline represents MQ-RBF, the blue polyline represents IQ-RBF, and the black polyline represents CS-RBF [2].
Figure 4 indicates that, whatever the basis function, $R_2(\alpha^*)$ decreases rapidly as $M$ increases. When $M$ is small, the differences among the $R_2(\alpha^*)$ values of the four basis functions are significant: the $R_2(\alpha^*)$ of IQ is the lowest, and that of TPS is the highest. When $M$ becomes large, the $R_2(\alpha^*)$ values of all four basis functions decrease rapidly. In most cases, the approximation effect of IQ is the best, and that of TPS is the worst.
Figure 5 indicates that, when $M$ is small, the differences among the computational times of the four kinds of basis functions are not significant. As $M$ increases, the differences become obvious. In most cases, the computational time of IQ is the lowest; however, at $M = 100$, the computational time of IQ becomes relatively high.
We then reduce $N$ from 5000 to 2500, keeping everything else unchanged. The results of this experiment are shown in Figure 6 and Figure 7.
Comparing Figure 6 and Figure 7 with Figure 4 and Figure 5, we find that the conclusions are similar. The performance of IQ in our experiments is excellent compared with the other three kinds of basis functions.
Based on the above analysis, we find that IQ performs best among the four kinds of RBFs. Hence, we recommend IQ-RBF for high-precision problems.

5.3.2. The Comparison of the Four New Test Functions

In this section, we add the following four new test functions:

$$g_1(x,y) = \frac{\tanh(9y - 9x) + 1}{9}, \qquad g_2(x,y) = \frac{1.25 + \cos(5.4 y)}{6 + 6 (3x - 1)^2},$$
$$g_3(x,y) = \frac{1}{3} \exp\big( -5.0625 \left( (x - 0.5)^2 + (y - 0.5)^2 \right) \big), \qquad g_4(x,y) = \frac{1}{3} \exp\big( -20.25 \left( (x - 0.5)^2 + (y - 0.5)^2 \right) \big).$$
They are commonly used to test the performance of RBF approximation, and they are used only in this section.
In this section, the number of sampling points is always $N = 5000$, and $M$ takes the values 10, 20, …, 100. The approximation effects and the computational times of the DFO method are shown in Figure 8 and Figure 9, respectively. Each figure has four subfigures corresponding to the four test functions.
Figure 8 shows that MQ and IQ perform better than TPS and CSRBF. In most cases, the $R_2(\alpha^*)$ of IQ is the lowest among the four kinds of basis functions. However, as shown in Figure 8d, when $M = 100$, the $R_2(\alpha^*)$ of $g_4(x,y)$ based on IQ indicates that the DFO method may have failed.
Figure 9c shows a special case in which the DFO method may take longer than expected. In most cases, the computational time of every basis function is low, and the relationship between the computational time and $M$ is approximately linear, although we cannot yet prove this in theory. According to Figure 9, CSRBF takes the longest time in most cases, IQ performs well on $g_1(x,y)$ and $g_4(x,y)$, and TPS performs well on $g_2(x,y)$ and $g_3(x,y)$.
Combining the results of Section 5.3.1 with those of this section: although IQ-RBF may occasionally give unexpected results in approximation effect or computational time, it is the most recommended of the four basis functions in most cases.

5.4. Curved Surface Fitting

In this section, we present the application of the RBF least squares with the optimal shape parameter to the surface fitting of simulated data and real data. On the simulated data, we compare the approximation effect of the RBF least squares with that of the polynomial least squares of the same dimension.

5.4.1. Curved Surface Fitting on Simulated Data

In this experiment, the simulated data were sampled from Franke’s test function. Figure 10 is the image of Franke’s test function.
We sampled Franke’s test function at $N = 5000$ points to construct the simulated data, which are used for the least squares approximation with a polynomial basis and with IQ-RBF. The polynomial space is the bivariate polynomial space of total degree at most 8; its dimension is $\frac{(8+1)(8+2)}{2} = 45$. Accordingly, the IQ-RBF approximation space is constructed with $M = 45$ basis functions.
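As an illustration of this polynomial setup (a sketch, ours; the monomial ordering and helper name are our choices), the design matrix can be assembled as:

```python
import numpy as np

def poly_basis(x, y, degree=8):
    """Bivariate monomials x^i y^j with i + j <= degree; 45 columns for degree 8."""
    cols = [x ** i * y ** j for i in range(degree + 1) for j in range(degree + 1 - i)]
    return np.column_stack(cols)

# coefficients of p(x, y) by a linear least squares fit on the sampled data:
# coeffs, *_ = np.linalg.lstsq(poly_basis(xs, ys), fs, rcond=None)
```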
The approximation function of the polynomial least squares is $p(x,y)$. The approximation function $p(x,y)$ and the error function $|p(x,y) - f(x,y)|$ on $[0,1] \times [0,1]$ are shown in Figure 11 and Figure 12, respectively.
According to Figure 11 and Figure 12, although $p(x,y)$ approximates the surface of the simulated data with a certain accuracy, the error at some points is relatively large. A comparison between Figure 10 and Figure 11 shows an obvious difference between $p(x,y)$ and $f(x,y)$. In Figure 12, the maximum error is 0.2577, and the average error is 0.0132.
The optimal shape parameter of IQ-RBF on these data is $\alpha^* = 0.3047$, and the corresponding optimal approximation function of the IQ-RBF least squares is $s^*(x,y)$. The approximation function $s^*(x,y)$ and the error function $|s^*(x,y) - f(x,y)|$ on $[0,1] \times [0,1]$ are shown in Figure 13 and Figure 14, respectively.
Figure 10, Figure 11 and Figure 13 show that $s^*(x,y)$ is closer to $f(x,y)$ than $p(x,y)$. Figure 14 indicates that the error $|s^*(x,y) - f(x,y)|$ oscillates strongly, but its values are small. In Figure 14, the maximum error is 0.0463, and the average error is 0.0049.
Comparing Figure 12 with Figure 14, both the maximum error and the average error of $s^*(x,y)$ are lower than those of $p(x,y)$.
Table 6 shows the errors of the least squares approximation for the bivariate polynomial spaces of total degree at most 6 through 10, together with the errors of the IQ-RBF approximation whose approximation space has the same dimension as the corresponding polynomial space.
According to Table 6, the behavior observed for total degree 8 is representative of all the tested degrees. Therefore, we conclude that the curved surface fitting by the RBF least squares with the optimal shape parameter is better than that by the least squares based on polynomial bases.

5.4.2. Curved Surface Fitting on Real Data

The real dataset is obtained from GPS data of Mount Veľký Rozsutec in the Malá Fatra through the online tool GPSVisualizer (http://www.gpsvisualizer.com/elevation). The data comprise 24,190 points in total and are shown in Figure 15.
We use only IQ-RBF to fit the real data and select $M = 500$ points in the region as centers to generate the RBF approximation function space. We start our DFO method from the initial value $\alpha_0 = 1$. In this case, the optimal shape parameter is $\alpha^* = 424.8005$, and the fitted surface is shown in Figure 16.
Comparing Figure 15 with Figure 16, we find that the RBF least squares with the optimal shape parameter yields a smooth and continuous approximation. The peaks, ridges, and ravines of the mountain are well reconstructed. This approximation is acceptable.
We tabulate the error of the RBF least squares approximation for different shape parameters in Table 7. The error is the root mean square error (RMSE).
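For reference, the RMSE here is understood in the usual sense over the $N$ data points:

$$\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \big( f_i - s^*(x_i) \big)^2 }.$$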
The initial value is far from our optimal shape parameter $\alpha^* = 424.8005$, and the error at the initial value $\alpha_0 = 1$ is very high. The results in Table 7 show that the optimal shape parameter significantly reduces the error, and our DFO method is effective even when started from a bad initial value.

6. Conclusions

This paper investigates the selection of shape parameters in the RBF least squares. The approximation error of the least squares is used to measure the shape parameter, and we propose a corresponding optimization problem (2) to obtain the optimal shape parameter. In order to reduce the difficulty in solving the problem, an equivalent one-dimensional optimization problem (3) is obtained through variable separation. According to the characteristics of problem (3), we propose to solve it by using the derivative-free optimization method.
Among the various DFO methods, we choose Powell’s UOBYQA method, and we propose our DFO method by combining the characteristics of problem (3) with Powell’s method in Section 4.
We test our method in different cases through numerical experiments, where it outperforms the classical Gauss–Newton method. The performance of our DFO method on the least squares approximation based on IQ-RBF is excellent. The surface fitting by the RBF least squares with the optimal shape parameter is better than that by the least squares based on polynomial bases.
The approximation function $s^*(x)$ with the optimal shape parameter approximates the sampling function more closely than the normal least squares approximation. Hence, the optimal shape parameter can be applied in various applications of least squares. For example, when solving partial differential equations with least squares, we can approximate the exact solution better with the method proposed in Section 4; likewise, when fitting a surface, we can approximate the sampled surface better with the RBF least squares with the optimal shape parameter.
The way this paper defines the optimal shape parameter directly suggests the generalization in which each basis function has an independent shape parameter, in which case problem (2) becomes a $2M$-dimensional optimization problem. Such a generalization would undoubtedly improve the approximation effect, but the difficulty of solving problem (2) would further increase. How to solve the $2M$-dimensional nonlinear optimization efficiently is worth considering.

Supplementary Materials

The following are available online at https://www.mdpi.com/2227-7390/8/11/1923/s1.

Author Contributions

Conceptualization, R.F. and S.Z.; methodology, R.F. and S.Z.; software, S.Z. and A.H.; validation S.Z. and A.H.; formal analysis, S.Z.; investigation, S.Z.; resources, R.F., S.Z. and A.H.; data curation, R.F., S.Z. and A.H.; writing—original draft preparation, S.Z.; writing—review and editing, R.F. and S.Z.; visualization, S.Z.; supervision, R.F.; project administration, R.F.; funding acquisition, R.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No.12071019).

Acknowledgments

We would like to thank Majdisova and Skala of the University of West Bohemia in the Czech Republic for providing the Mount Veľký Rozsutec dataset.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hardy, R.L. Multiquadric equations of topography and other irregular surfaces. J. Geophys. Res. 1971, 76, 1905–1915. Available online: https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/JB076i008p01905 (accessed on 20 May 2017). [CrossRef]
  2. Wendland, H. Scattered Data Approximation; Cambridge Monographs on Applied and Computational Mathematics; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar] [CrossRef]
  3. Fasshauer, G.E. Meshfree Approximation Methods with Matlab; World Scientific: London, UK, 2007; Available online: https://www.worldscientific.com/doi/pdf/10.1142/6437 (accessed on 15 July 2017). [CrossRef]
  4. Wu, Z.; Schaback, R. Local error estimates for radial basis function interpolation of scattered data. IMA J. Numer. Anal. 1993, 13, 13–27. Available online: https://academic.oup.com/imajna/article-pdf/13/1/13/2481996/13-1-13.pdf (accessed on 20 July 2020). [CrossRef]
  5. Wu, Z. Multivariate compactly supported positive definite radial functions. Adv. Comput. Math. 1995, 4, 283–292. [Google Scholar] [CrossRef]
  6. Feng, R.; Duan, J. High Accurate Finite Differences Based on RBF Interpolation and its Application in Solving Differential Equations. J. Sci. Comput. 2018, 76, 1785–1812. [Google Scholar] [CrossRef]
  7. Liu, C.S.; Liu, D. Optimal shape parameter in the MQ-RBF by minimizing an energy gap functional. Appl. Math. Lett. 2018, 86, 157–165. [Google Scholar] [CrossRef]
  8. Chen, W.; Hong, Y.; Lin, J. The sample solution approach for determination of the optimal shape parameter in the Multiquadric function of the Kansa method. Comput. Math. Appl. 2018, 75, 2942–2954. [Google Scholar] [CrossRef]
  9. Majdisova, Z.; Skala, V. Radial basis function approximations: Comparison and applications. Appl. Math. Model. 2017, 51, 728–743. [Google Scholar] [CrossRef] [Green Version]
  10. Majdisova, Z.; Skala, V. Big geo data surface approximation using radial basis functions: A comparative study. Comput. Geosci. 2017, 109, 51–58. [Google Scholar] [CrossRef] [Green Version]
  11. Rippa, S. An algorithm for selecting a good value for the parameter c in radial basis function interpolation. Adv. Comput. Math. 1999, 11, 193–210. [Google Scholar] [CrossRef]
  12. Fasshauer, G.E.; Zhang, J.G. On choosing “optimal” shape parameters for RBF approximation. Numer. Algorithms 2007, 45, 345–368. [Google Scholar] [CrossRef]
  13. Azarboni, H.R.; Keyanpour, M.; Yaghouti, M. Leave-Two-Out Cross Validation to optimal shape parameter in radial basis functions. Eng. Anal. Bound. Elem. 2018, 100, 204–210. [Google Scholar] [CrossRef]
  14. Luh, L.T. The Mystery of the Shape Parameter III. Appl. Comput. Harmon. Anal. 2016, 40, 186–199. [Google Scholar]
  15. Nocedal, J.; Wright, S.J. Numerical Optimization, 2nd ed.; Springer Series in Operations Research and Financial Engineering; Springer: New York, NY, USA, 2006; p. xxii+664. [Google Scholar]
  16. Beck, A. Introduction to Nonlinear Optimization; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2014; Available online: https://epubs.siam.org/doi/pdf/10.1137/1.9781611973655 (accessed on 1 September 2017). [CrossRef]
  17. Golub, G.; Pereyra, V. The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems Whose Variables Separate. SIAM J. Numer. Anal. 1973, 10, 413–432. [Google Scholar] [CrossRef]
  18. Cotnoir, C.M.; Terzić, B. Decoupling linear and nonlinear regimes: An evaluation of efficiency for nonlinear multidimensional optimization. J. Glob. Optim. 2017, 68, 663–675. [Google Scholar] [CrossRef]
  19. Powell, M.J.D. UOBYQA: Unconstrained optimization by quadratic approximation. Math. Program. 2002, 92, 555–582. [Google Scholar] [CrossRef]
  20. Renka, R.; Brown, R. Algorithm 792: Accuracy tests of ACM algorithms for interpolation of scattered data in the plane. ACM Trans. Math. Softw. 1999, 25, 78–94. [Google Scholar] [CrossRef]
Figure 1. The image of $R_2(\alpha)$.
Figure 2. The variation of $R_2(\alpha^*)$ with $M$.
Figure 3. The variation of the computational time with $M$.
Figure 4. The variation of $R_2(\alpha^*)$ with $M$ for the four kinds of basis functions under $N = 5000$.
Figure 5. The variation of the computational time with $M$ for the four kinds of basis functions under $N = 5000$.
Figure 6. The variation of $R_2(\alpha^*)$ with $M$ for the four kinds of basis functions under $N = 2500$.
Figure 7. The variation of the computational time with $M$ for the four kinds of basis functions under $N = 2500$.
Figure 8. The $R_2(\alpha^*)$ of the four test functions for the four kinds of basis functions.
Figure 9. The computational time of the DFO method for the four kinds of basis functions on the four test functions.
Figure 10. Franke's test function on $[0,1] \times [0,1]$.
Figure 11. The approximation function $p(x,y)$ on $[0,1] \times [0,1]$.
Figure 12. The error function $|p(x,y) - f(x,y)|$ on $[0,1] \times [0,1]$.
Figure 13. The approximation function $s^*(x,y)$ on $[0,1] \times [0,1]$.
Figure 14. The error function $|s^*(x,y) - f(x,y)|$ on $[0,1] \times [0,1]$.
Figure 15. The data of Mount Veľký Rozsutec.
Figure 16. The surface fitting with real data.
Table 1. The results of the Gauss–Newton method and the DFO method for different M values with α₀ = 1.

                             M = 50    M = 60    M = 70    M = 80
R₂(α*)   DFO                 0.2307    0.1812    0.1156    0.0928
         Gauss–Newton        0.3156    0.2540    0.1910    0.1072
α*       DFO                 0.0350    0.0840    0.3971    0.4590
         Gauss–Newton        0.9489    0.9338    0.9215    0.9271
Table 2. The results of the Gauss–Newton method and the DFO method for different M values with α₀ = 0.5.

                             M = 50    M = 60    M = 70    M = 80
R₂(α*)   DFO                 0.2307    0.1812    0.1160    0.0928
         Gauss–Newton        0.2307    0.1812    0.1160    0.092
α*       DFO                 0.0350    0.0840    0.3971    0.4590
         Gauss–Newton        0.0350    0.0840    0.3971    0.4590
Table 3. The comparison of computational time (unit: s) with α₀ = 1.

                             M = 10    M = 20    M = 30    M = 40
t (s)    DFO                 0.3824    0.7489    0.8911    1.2423
         Gauss–Newton        0.2628    0.6813    5.3983    2.1273
α*       (both methods)      0.6073    0.6027    0.2354    0.4456
Table 4. The comparison of computational time (unit: s) with α₀ = 0.5.

                             M = 10    M = 20    M = 30    M = 40
t (s)    DFO                 0.3117    0.7575    0.9185    1.2580
         Gauss–Newton        0.3346    9.7267    1.3497    1.5833
α*       (both methods)      0.6073    0.6027    0.2354    0.4456
Table 5. The four kinds of RBFs.

RBF                               φ(r)                                      Domain of Definition
Thin plate spline (TPS)           $(\alpha r)^2 \ln(\alpha r)$              α > 0
Multi-Quadric (MQ)                $(r^2 + \alpha^2)^{1/2}$                  α > 0
Inverse-Quadric (IQ)              $1/(r^2 + \alpha^2)^{1/2}$                α > 0
CS-RBF ($\phi_{3,1}(r)$)          $(1 - \alpha r)_+^4 (4\alpha r + 1)$      α > 0
Table 6. The errors of the least squares approximation based on different polynomial spaces and the RBF space of the same dimension.

                          Polynomial Basis               RBF
Dimension  Total Degree   Max Error    Average Error     Max Error    Average Error
28         6              0.2448       0.0266            0.0831       0.0118
36         7              0.1737       0.0187            0.0674       0.0078
45         8              0.2577       0.0132            0.0463       0.0049
55         9              0.1493       0.0105            0.0468       0.0037
66         10             0.1068       0.0069            0.0353       0.0022
Table 7. The RMSE of the RBF least squares approximation for different shape parameters.

α        1          10         100        200       400       α* = 424.8005   600
RMSE     69.8787    49.4357    10.4753    5.3136    4.0813    4.0787          4.8043
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
