Article

Non-Convex Optimization: Using Preconditioning Matrices for Optimally Improving Variable Bounds in Linear Relaxations

Victor Reyes 1,* and Ignacio Araya 2
1 Escuela de Informática y Telecomunicaciones, Universidad Diego Portales, Santiago 8370068, Chile
2 Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso 2340000, Chile
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(16), 3549; https://doi.org/10.3390/math11163549
Submission received: 11 July 2023 / Revised: 8 August 2023 / Accepted: 10 August 2023 / Published: 17 August 2023
(This article belongs to the Special Issue Mathematical Modeling, Optimization and Machine Learning)

Abstract: The performance of branch-and-bound algorithms for solving non-convex optimization problems greatly depends on convex relaxation techniques. These generate convex regions which are used for improving the bounds of variable domains. In particular, convex polyhedral regions can be represented by a linear system A.x = b. Bounds of variable domains can then be improved by minimizing and maximizing each variable in the linear system. Optimally reducing or contracting variable domains in linear systems, however, is an expensive task: it requires solving up to two linear programs for each variable (one for each variable bound). Suboptimal strategies, such as preconditioning, may offer satisfactory approximations of the optimal reduction at a lower cost. In non-square linear systems, a preconditioner P can be chosen such that P.A is close to a diagonal matrix. Thus, the projection of the equivalent system P.A.x = P.b over x, by using an iterative method such as Gauss–Seidel, can significantly improve the contraction. In this paper, we show how to generate an optimal preconditioner, i.e., a preconditioner that helps the Gauss–Seidel method to optimally reduce the variable domains. Despite the cost of generating the preconditioner, it can be re-used in sub-regions of the search space without losing too much effectiveness. Experimental results show that, when used for reducing domains in non-square linear systems, the approach is significantly more effective than Gauss-based elimination techniques. Finally, the approach also shows promising results when used as a component of a solver for non-convex optimization problems.

1. Introduction

Non-convex optimization refers to the process of finding the minimum of a nonlinear function within a non-convex region, if it exists. In such regions, the function might have multiple local minima, maxima, or saddle points [1,2]. Usually, these kinds of problems can be defined as follows:
$$\min_{x \in \mathbf{x}} \; f(x) \quad \text{s.t.} \quad g(x) \leq 0, \tag{1}$$
with x ∈ R^n the set of variables varying in the box x (an interval x_i = [x̲_i, x̄_i] defines the set of reals x_i such that x̲_i ≤ x_i ≤ x̄_i; a box x is a Cartesian product of intervals x_1 × ⋯ × x_i × ⋯ × x_n), f: R^n → R a real-valued objective function, and g: R^n → R^m a set of inequality constraints. Notice that f and g may be non-convex functions. Interval-based branch-and-bound (B&B) techniques are commonly used for solving non-convex global optimization or constraint satisfaction problems. Still, the approach relies heavily on convex relaxation techniques [3]. Convex relaxations are used to transform the original non-convex problem into a convex one. This involves generating a convex and generally polyhedral region that contains the optimal solution of the original problem. The polyhedral region is represented by a non-square linear system A.x = b, where A ∈ R^{m×n}, x ∈ R^n, and b ∈ R^m. The vector x includes some variables from the original problem and auxiliary variables, with unbounded domains, corresponding to the inequalities. Once the relaxation is generated, the objective is to reduce or contract the variable domains of the original problem. Lower and upper bounds of variable domains can be found by minimizing and maximizing each variable of the linear system. The method, which solves the 2n linear programs, is called optimization-based bound tightening (obbt [4]) or PolytopeHull [5], and it is used by several global optimization solvers such as αBB [6], ANTIGONE [7], Couenne [8], LaGO [9], SCIP [10] and IbexOpt [5,11]. Due to its expensiveness, obbt is mostly applied at the root node and, within the search tree, only with limited frequency or based on its success rate. ANTIGONE, for instance, measures the success of obbt by the reduction in the box volume and disables it for all child nodes once the rate of reduction drops below a given threshold. The method is expensive, and some improvements have been proposed in order to: (1) reduce the number of linear programs to be solved; (2) accelerate the convergence of the simplex algorithm; and (3) generate projection inequalities that approximate the contraction performed by the 2n linear programs [12].
Machine learning techniques have also been applied to reduce the expensiveness of obbt. In [13], the authors propose a deep neural network (DNN) capable of predicting, from a convex relaxation of an AC optimal power flow problem, the subset of variables whose bound tightening can still contribute the best improvement to the relaxation. The results show promising outcomes for these kinds of problems, demonstrating a 6.3× speed-up in obbt run times. In another study [14], a different machine learning technique, deep value-based reinforcement learning, was used to enhance the simplex method. This was achieved by combining two well-known pivoting rules (Dantzig and Steepest Edge [15,16]) and using the algorithm's current status to decide when to switch between them. This approach also showed promising results for solving complex problems.
In this work, we deal with the same problem as obbt, i.e., we want to improve the domain bounds of a variable vector x by using a non-square linear system A . x = b . In other words, we want to find a minimal box x such that all the solutions of the system belong to x .
An interval variant of the Gauss–Seidel algorithm can be used for contracting x (i.e., reducing the domains of the variables). However, it does not work well without proper preconditioning [17]. In non-square systems, a preconditioning matrix P can be chosen such that P.A is close to a diagonal matrix. Thus, the projection of the equivalent system P.A.x = P.b over x, by using an iterative method such as Gauss–Seidel, can be significantly improved. Gauss–Seidel applies the following operation for contracting the domain of each variable x_k:
$$x_k \leftarrow x_k \cap \frac{1}{\hat a_{ik}}\left(b_i - \sum_{j=1,\, j \neq k}^{n} \hat a_{ij}\, x_j\right), \quad i \in \{1, \dots, m\} \tag{2}$$
where a ^ i j are the coefficients of the matrix A ^ = P . A .
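As a concrete illustration, the following minimal Python sketch implements step (2) for intervals represented as plain (lo, hi) pairs; the helper names are ours and do not come from any particular interval library, and outward rounding (which a rigorous interval implementation would need) is deliberately ignored.

```python
# Minimal sketch of the Gauss-Seidel contraction step (2); intervals are
# (lo, hi) tuples, A_hat is the preconditioned matrix P.A (list of rows),
# and b is a vector of reals. Outward rounding is ignored in this sketch.

def imul(a, iv):
    """Multiply the interval iv = (lo, hi) by the scalar a."""
    lo, hi = a * iv[0], a * iv[1]
    return (min(lo, hi), max(lo, hi))

def gauss_seidel_step(A_hat, b, x, i, k):
    """Contract x[k] with row i: x_k <- x_k ∩ (b_i - sum_{j!=k} a_ij*x_j)/a_ik."""
    lo, hi = b[i], b[i]
    for j, xj in enumerate(x):
        if j == k:
            continue
        t = imul(A_hat[i][j], xj)
        lo, hi = lo - t[1], hi - t[0]      # interval subtraction
    a = A_hat[i][k]
    if a == 0.0:
        return x[k]                        # this row gives no projection on x_k
    proj = imul(1.0 / a, (lo, hi))
    # intersection; an empty result (lo > hi) proves the box infeasible
    return (max(x[k][0], proj[0]), min(x[k][1], proj[1]))
```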
Techniques used for preconditioning non-square matrices include the Gauss–Jordan elimination method [18], which can provide a preconditioning matrix P by recording the row (or column) operations performed on A. This method constructs a pseudo-diagonal matrix in an iterative way, by selecting a subset of m variables as pivots. Usually, the current maximum absolute value of A is selected as the pivot in each step. In [19], the authors propose to select the pivot by using five priority rules (e.g., selecting columns with at least two values, selecting rows with fewer values, etc.). Another technique is the least squares method [20]. This technique constructs the system A^T·A.x = A^T·b, which provides us with the solution x = A^T(A·A^T)^{-1}·b, known as the Moore–Penrose pseudoinverse solution. However, the inverse (A·A^T)^{-1} may not exist if A is not full rank. In that case, the well-known singular-value decomposition (SVD) method [21] can be used in order to determine the pseudoinverse of A.
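For instance, a least-squares preconditioner can be obtained in a few lines with NumPy; this is only a sketch of the idea described above, using np.linalg.pinv, which internally relies on the SVD and therefore also covers the rank-deficient case:

```python
import numpy as np

# Sketch: Moore-Penrose preconditioner for an underdetermined system A.x = b.
# For a full-row-rank A, pinv(A) equals A^T (A A^T)^{-1}; otherwise the SVD
# machinery inside np.linalg.pinv still yields a pseudoinverse.
rng = np.random.default_rng(0)
A = rng.uniform(-10.0, 10.0, size=(3, 5))   # m = 3 equations, n = 5 variables

P = np.linalg.pinv(A)       # n x m preconditioner
A_hat = P @ A               # n x n; a projector onto the row space of A,
                            # close to (but not exactly) a diagonal matrix
```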
In this work, we propose three methods which construct preconditioning matrices in order to deal with non-square linear systems. Two of them are based on solving linear programs. The first one constructs an n × m matrix P by solving n linear programs. The solution of the i-th linear program corresponds to the i-th row of P and minimizes the size of the projection over the variable x_i (right part in (2)). The second method constructs a 2n × m matrix P by solving 2n linear programs. The solutions of the first n linear programs correspond to rows in P that maximize the lower bound of the projection over each of the variables. The other n linear programs minimize the upper bounds of the projections. Furthermore, this P is able to improve the domain bounds of the variables optimally, i.e., it leads to the smallest box that contains the feasible region. Finally, we realized that the problem of finding the optimal preconditioning matrix is equivalent to finding the dual feasible solutions of the 2n linear programs solved by obbt. An equivalent method is proposed in [12], where the authors, instead of generating a preconditioning matrix, directly generate a set of redundant inequalities for improving the bounds of the variables.
We also propose a heuristic for constructing preconditioners based on the Gauss–Jordan pivoting. It takes into account the current variable domains and constructs preconditioners that, when used with a Gauss–Seidel approach, offer a better contraction compared to the ones generated by other state-of-the-art heuristics.
The paper is organized as follows. Section 2 provides basic notions related to interval arithmetic and some remarks about linear systems. In Section 3 we present an example of using preconditioning for contracting a linear system. In Section 4 we describe in detail the three contributions of this paper. Section 5 reports the experimental results. Finally, Section 6 presents our conclusions and future work.

2. Background

In this section, we introduce some basic concepts related to interval arithmetic and interval linear systems. For more details and definitions, refer to [22].

2.1. Intervals

An interval x_i = [x̲_i, x̄_i] defines the set of reals x_i s.t. x̲_i ≤ x_i ≤ x̄_i, where x̲_i and x̄_i are floating-point numbers. The size or width of x_i is defined as wid(x_i) = x̄_i − x̲_i. mid(x_i) denotes the midpoint of x_i, i.e., mid(x_i) = (x̲_i + x̄_i)/2. A box x = (x_1, …, x_n) represents the Cartesian product of intervals x_1 × ⋯ × x_n. The size of a box is wid(x) = max_{x_i ∈ x} wid(x_i). The perimeter of a box is per(x) = Σ_{i=1}^n wid(x_i). The hull of a set of vectors in R^n corresponds to the minimal box containing all of these vectors.
Interval arithmetic defines the extension of unary and binary operators; for instance,
$$\begin{aligned}
x_1 + x_2 &= [\underline{x}_1 + \underline{x}_2,\; \overline{x}_1 + \overline{x}_2]\\
x_1 - x_2 &= [\underline{x}_1 - \overline{x}_2,\; \overline{x}_1 - \underline{x}_2]\\
x_1 \cdot x_2 &= [\min(\underline{x}_1\underline{x}_2,\, \underline{x}_1\overline{x}_2,\, \overline{x}_1\underline{x}_2,\, \overline{x}_1\overline{x}_2),\; \max(\underline{x}_1\underline{x}_2,\, \underline{x}_1\overline{x}_2,\, \overline{x}_1\underline{x}_2,\, \overline{x}_1\overline{x}_2)]\\
\log(x_1) &= [\log(\underline{x}_1),\; \log(\overline{x}_1)]
\end{aligned} \tag{3}$$
A function f: IR^n → IR is factorable if it can be computed in a finite number of simple steps, using unary and binary operators. An interval function f: IR^n → IR is said to be an extension of a real factorable function f to intervals if
$$\forall\, \mathbf{x} \in \mathbb{IR}^n, \quad f(\mathbf{x}) \supseteq \{ f(x) \mid x \in \mathbf{x} \}.$$
The optimal image f_opt(x) is the sharpest interval containing the image of f over x. There are several kinds of extensions; in particular, the natural extension f_N corresponds to mapping a real n-dimensional function f to intervals by using interval arithmetic.

2.2. Linear Systems

A linear system A.x = b is a set of m > 1 equations over a set of n > 1 variables. Without loss of generality, we consider A as a matrix of real coefficients with m rows and n columns; b ∈ R^m corresponds to an m-dimensional vector of real values and x ∈ R^n corresponds to the vector of variables. Initial domains of the variables are represented by a box x, which is generally reduced or contracted in order to converge to the hull of the solutions. Linear systems can be classified into three types: square, overdetermined and underdetermined.
Square systems are the most common and most studied type; in them, the number of linearly independent equations is equal to the number of variables. Cheap suboptimal methods can be used for contracting x. For instance, the Gauss–Seidel algorithm updates, at each step, the box x by performing the contraction step (2).
None of these methods works well without proper preconditioning. A good, but computationally expensive, preconditioning matrix when A is square corresponds to P = A^{-1}, i.e., the inverse of A. If A is not singular (nor numerically close to singular), then P.A will correspond to the identity matrix, and the problem can be directly solved: x = P.b.
The second type of linear system belongs to the overdetermined category. In this case, the number of equations is greater than the number of variables. As it is not possible to compute the inverse matrix (m ≠ n), other suboptimal methods, such as the Gauss–Jordan elimination technique [18], can be used for generating a preconditioner P through the row operations performed by the method. Gauss–Jordan elimination is a variant of the Gaussian elimination technique. This method is usually used to solve linear systems and to find the inverse of any invertible matrix. Gauss–Jordan transforms an n × n sub-matrix of A into a pseudo-identity matrix. The algorithm selects an element a_ij (known as the pivot) of A and, by performing row operations, leaves a_ij equal to 1 and the other elements of column j equal to 0. Notice that no element of column j can be selected as a pivot in later iterations.
A more recent technique, known as the subsquares approach, is proposed in [23]. This method sequentially extracts square n × n systems from the original overdetermined system, performing a contraction of x by using each one of them. As there are $\binom{m}{n}$ possible square subsystems, the authors propose a heuristic to select only a fraction of them. Even if it does not compute the hull, it gives good results compared to other classical approaches.
Finally, underdetermined linear systems have more variables than equations (i.e., m < n). In this case, if we apply Gauss–Jordan elimination, a matrix P.A = [I R] is generated, where I corresponds to an identity matrix of size m × m and R represents a residual m × (n − m) matrix.
As P . A is not diagonal, the contraction is not optimal. Small values for the residual matrix are preferred in order to obtain better contractions. Thus, the order in which the pivots are selected is crucial. A reasonable and widely used strategy for the pivoting process corresponds to selecting, in each iteration of the Gauss–Jordan elimination, the current maximum absolute value of the matrix A [24].
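A sketch of this max-absolute-value pivoting, written so that the row operations are mirrored in a matrix P, is shown below; this is our own illustrative implementation, not the one used in the paper's experiments.

```python
import numpy as np

def gauss_jordan_preconditioner(A):
    """Gauss-Jordan elimination with max-|a_ij| pivoting; returns P such
    that P @ A is the pseudo-diagonal reduced matrix."""
    m, n = A.shape
    M = A.astype(float).copy()
    P = np.eye(m)                          # accumulates the row operations
    free_rows, used_cols = list(range(m)), set()
    for _ in range(m):
        # pivot rule: current maximum absolute value among admissible cells
        best, bi, bj = 0.0, -1, -1
        for i in free_rows:
            for j in range(n):
                if j not in used_cols and abs(M[i, j]) > best:
                    best, bi, bj = abs(M[i, j]), i, j
        if bi < 0:
            break                          # numerically rank-deficient
        piv = M[bi, bj]
        M[bi] /= piv; P[bi] /= piv         # make the pivot equal to 1
        for i in range(m):
            if i != bi:
                f = M[i, bj]
                M[i] -= f * M[bi]          # cancel column bj in other rows
                P[i] -= f * P[bi]          # mirror the operation in P
        free_rows.remove(bi); used_cols.add(bj)
    return P
```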
In this paper we focus on dealing with this last type of linear systems, i.e., the underdetermined ones.

3. Example: Contracting a Linear System

In order to explain preconditioning and to describe the new proposed methods, we will consider an underdetermined linear system example. First, notice that a system of constraints a ≤ A.x + c ≤ b can be represented by a linear system A.x = y, where y ∈ [a − c, b − c] is an auxiliary vector of interval variables. A.x = y is equivalent to the linear system A′.x′ = 0, where x′ = (x, y) and A′ = [A −I], with I an identity matrix of size m × m. Thus, in the following, and without loss of generality, we deal with the problem A.x = 0, with A ∈ R^{m×n} and x ∈ R^n.
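Building this homogeneous form is mechanical; here is a short NumPy sketch (the interval bounds on y would be stored separately as the domains of the new variables):

```python
import numpy as np

def augment(A):
    """Return A' = [A  -I] so that A.x = y becomes A'.(x, y) = 0."""
    m = A.shape[0]
    return np.hstack([A, -np.eye(m)])
```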
Example 1.
Consider the following underdetermined linear system (coefficients and domains were randomly generated):
$$\begin{pmatrix} 7.31 & -6.95 & -5.28 & 4.90 & 0.09 \\ 1.01 & -3.03 & -5.77 & 8.12 & 9.43 \\ -6.72 & 1.34 & -5.25 & 9.96 & 1.09 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
where the domains are x_1 = [−1.565, 2.880], x_2 = [0.478, 4.463], x_3 = [−1.038, 6.032], x_4 = [0.048, 3.615] and x_5 = [−1.076, 2.647].
If we apply the Gauss–Seidel steps (2) directly, no contraction is achieved in any of the five variables. On the other hand, if we apply the Gauss–Jordan technique by pivoting on the maximum absolute value of A in each iteration, we obtain the following preconditioning matrix:
$$P = \begin{pmatrix} 0.091 & 0.004 & -0.048 \\ -0.069 & 0.113 & -0.058 \\ 0.069 & -0.009 & 0.073 \end{pmatrix}$$
Consequently, for P . A we obtain
$$P.A = \begin{pmatrix} 1 & -0.714 & -0.253 & 0 & 0 \\ 0 & 0.059 & 0.017 & 0 & 1 \\ 0 & -0.353 & -0.699 & 1 & 0 \end{pmatrix}$$
By applying contraction steps (2) on the new system P.A.x = P.b, we obtain a contraction in some domains: x_1 → [0.078, 2.880], x_2 → [0.478, 4.400], x_3 → [−1.038, 4.922] and x_5 → [−0.352, −0.009].

4. Toward an Optimal Contraction of Non-Square Linear Systems

In this section we describe in detail three new approaches for dealing with non-square linear systems A . x = b . All of them construct a preconditioning matrix P which attempts to improve the projection performed by the Gauss–Seidel contraction step (2).
The first proposal corresponds to an improvement of the Gauss-pivot selection heuristic by taking into account information of the box x . The second and third proposals aim to construct the preconditioning matrix by solving linear programs.

4.1. Improving the Gauss-Pivoting Heuristic

At the end of Section 2.2, we explained that the Gauss–Jordan technique can be used for generating a preconditioning matrix P for the system A.x = b. From (2) we can see that, in order to increase the likelihood of contracting a variable x_k, the interval evaluation of $\frac{1}{\hat a_{ik}} \left( b_i - \sum_{j,\, j \neq k} \hat a_{ij}\, x_j \right)$, where â_ij are the coefficients of the matrix Â = P.A, should be as tight as possible. The width of this interval is
$$\frac{1}{|\hat a_{ik}|} \sum_{j,\, j \neq k} |\hat a_{ij}| \cdot \text{wid}(x_j) \tag{4}$$
When applying Gauss–Jordan elimination, an indirect way of reducing this size is by pivoting the variable which maximizes the value of | a ^ i k | in each iteration. When doing this, in a way, we are selecting the row i for contracting x k : | a ^ i k | is the largest value in the row i, thus, according to (4), when | a ^ i k | is large, it is more likely we will obtain a tight projection over x k . In addition, the Gauss–Jordan method removes the coefficient related to this variable from the other rows, benefiting the contraction over the other variables. Notice that the P . A matrix in Example 1 has a distinct property. In each row, the pivoted value is 1 and corresponds to the largest absolute value. As a result, other values in the corresponding columns are canceled out, providing a beneficial projection over the other variables.
Following the same idea, we think that it is also relevant to take into account the width of the domain of the next pivoting variable. For instance, if the width of x_k in (2) is too small, the likelihood of contracting this variable will be small too. On the contrary, if x_k is large, it is more likely to contract its domain. If we normalize the variable domains to intervals e_j = [−1, 1], i.e., $e_j := \frac{x_j - \text{mid}(x_j)}{\text{wid}(x_j)}$, then the width of the projection over e_k is equal to:
$$\frac{1}{|\hat a_{ik}| \cdot \text{wid}(x_k)} \sum_{j,\, j \neq k} |\hat a_{ij}| \cdot \text{wid}(x_j) \tag{5}$$
Thus, we propose, as a pivoting heuristic, to select the element | a ^ i k | , such that | a ^ i k | · wid ( x k ) is maximized.
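In code, the only change with respect to the max-absolute-value rule sketched in Section 2.2 is the pivot score; a hypothetical helper (with intervals as (lo, hi) pairs) would be:

```python
def pivot_score(M, x, i, j):
    """Score of candidate pivot a_ij: |a_ij| * wid(x_j)."""
    return abs(M[i, j]) * (x[j][1] - x[j][0])
```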
Example 2.
If we use the Gauss–Jordan elimination method, using as the pivoting rule the value that maximizes the product |a_ij| · wid(x_j), in the example we obtain the following preconditioner:
$$P = \begin{pmatrix} -0.067 & 0.113 & -0.056 \\ -0.098 & 0.013 & -0.105 \\ 0.066 & 0.008 & -0.075 \end{pmatrix}$$
Consequently, for P . A we obtain
$$P.A = \begin{pmatrix} 0 & 0.050 & 0 & 0.025 & 1 \\ 0 & 0.505 & 1 & -1.428 & 0 \\ 1 & -0.586 & 0 & -0.361 & 0 \end{pmatrix}$$
Once we perform Equation (2), we obtain additional contraction compared to just pivoting on the cell with the largest absolute value: x_1 → [0.297, 2.880] ⊂ [0.078, 2.880] and x_5 → [−0.319, −0.025] ⊂ [−0.352, −0.009].

4.2. Linear-Based Preconditioning

Despite offering good projections over x , the Gauss–Jordan-based preconditioning methods rarely lead to optimal contractions. In this section, we describe two methods that directly focus on optimizing the projection over variables.

4.2.1. Minimizing the Size of the Interval Projection

First, we attempt to construct a preconditioning vector p = (p_1, p_2, …, p_m) such that the projection of the system p.A.x = 0 over the interval x_k has a minimum size. The values of the vector â = p.A are computed as:
$$\hat a_j = \sum_{i=1}^{m} p_i \cdot a_{ij}, \quad j = 1, \dots, n, \tag{6}$$
where a i j are the coefficients of matrix A. Thus, taking into account the interval projection size (4), a preconditioning vector p for minimizing this size (related to a variable x k ) can be generated by solving the following linear program:
$$\begin{aligned} \text{minimize} \quad & \sum_{j=1,\, j \neq k}^{n} |\hat a_j| \cdot \text{wid}(x_j) \\ \text{s.t.} \quad & \hat a_k = 1 \\ & \hat a_j = \sum_{i=1}^{m} p_i \cdot a_{ij}, \quad j = 1, \dots, n \end{aligned} \tag{7}$$
Notice that, by adding the constraint â_k = 1, we can remove the quotient 1/|â_ik| of Formula (4) from the objective function.
For constructing the preconditioning matrix P, we have to solve the linear program for each variable x_k that we want to contract and include the preconditioning vectors as rows in P.
In order to deal with the absolute values inside the objective function, we replace |â_j| by auxiliary variables u_j. Then, we add the constraints u_j ≥ â_j and u_j ≥ −â_j, and we solve the equivalent linear program by using the simplex algorithm.
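A possible assembly of this LP for scipy.optimize.linprog is sketched below; the variable layout z = (p_1..p_m, â_1..â_n, u_1..u_n) and all names are our own, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import linprog

def min_size_preconditioner_row(A, widths, k):
    """Solve LP (7) for variable k; returns the preconditioning vector p."""
    m, n = A.shape
    N = m + 2 * n                       # z = (p, a_hat, u)
    c = np.zeros(N)
    c[m + n:] = widths                  # minimize sum_j wid(x_j) * u_j ...
    c[m + n + k] = 0.0                  # ... excluding j = k
    # equalities: a_hat_j - sum_i p_i a_ij = 0, and a_hat_k = 1
    A_eq = np.zeros((n + 1, N)); b_eq = np.zeros(n + 1)
    for j in range(n):
        A_eq[j, m + j] = 1.0
        A_eq[j, :m] = -A[:, j]
    A_eq[n, m + k] = 1.0; b_eq[n] = 1.0
    # inequalities encoding u_j >= |a_hat_j|
    A_ub = np.zeros((2 * n, N)); b_ub = np.zeros(2 * n)
    for j in range(n):
        A_ub[j, m + j] = 1.0;      A_ub[j, m + n + j] = -1.0     #  a_hat_j <= u_j
        A_ub[n + j, m + j] = -1.0; A_ub[n + j, m + n + j] = -1.0 # -a_hat_j <= u_j
    bounds = [(None, None)] * (m + n) + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m]                    # the preconditioning vector p for x_k
```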
Example 3.
For contracting x 1 of Example 1, we would generate the following linear program:
$$\begin{aligned} \text{minimize} \quad & 4.45 u_1 + 3.98 u_2 + 7.07 u_3 + 3.57 u_4 + 3.72 u_5 \\ \text{s.t.} \quad & \hat a_1 = 1 \\ & u_r \geq \hat a_r, \quad r = 1, \dots, 5 \\ & u_r \geq -\hat a_r, \quad r = 1, \dots, 5 \\ & \hat a_1 = 7.31 p_1 + 1.01 p_2 - 6.72 p_3 \\ & \hat a_2 = -6.95 p_1 - 3.03 p_2 + 1.34 p_3 \\ & \hat a_3 = -5.28 p_1 - 5.77 p_2 - 5.25 p_3 \\ & \hat a_4 = 4.90 p_1 + 8.12 p_2 + 9.96 p_3 \\ & \hat a_5 = 0.08 p_1 + 9.43 p_2 + 1.09 p_3, \end{aligned}$$
with optimal solution p* = (0.066, 0.008, −0.075). By solving the linear programs related to the other variables, we would obtain the following preconditioner P:
$$P = \begin{pmatrix} 0.066 & 0.008 & -0.075 \\ -0.127 & -0.006 & 0.068 \\ -0.098 & 0.013 & -0.105 \\ 0.023 & -0.011 & 0.098 \\ -0.067 & 0.113 & -0.056 \end{pmatrix}$$
Finally, P . A is
$$P.A = \begin{pmatrix} 1 & -0.586 & 0 & -0.361 & 0 \\ -1.400 & 1 & 0.354 & 0 & 0 \\ 0 & 0.505 & 1 & -1.428 & 0 \\ -0.495 & 0 & -0.574 & 1 & 0 \\ 0 & 0.050 & 0 & 0.025 & 1 \end{pmatrix}$$
Compared to the pivoting heuristic in Example 2, the preconditioner obtained by solving linear programs offers additional contraction on x_2 → [0.478, 4.400] ⊂ [0.478, 4.463] and x_5 → [−0.316, −0.025] ⊂ [−0.319, −0.025].

4.2.2. Minimizing/Maximizing the Upper/Lower Bound of the Interval Projection

Extending the idea of the previous section, we now attempt to construct preconditioning vectors p = (p_1, p_2, …, p_m) such that the projection of the system p.A.x = 0 over the interval x_k has a minimum (maximum) upper (lower) bound. Considering that â = p.A, the upper bound of the projection of â.x = 0 over the interval x_k is
$$-\frac{1}{\hat a_k} \sum_{j,\, j \neq k}^{n} \underline{\hat a_j\, x_j} \tag{8}$$
Minimizing (8) is equivalent to maximizing $\sum_{j,\, j \neq k}^{n} \underline{\hat a_j\, x_j}$ with â_k = 1. We can replace $\underline{\hat a_j\, x_j}$ by auxiliary variables w_j and two inequalities: w_j ≤ â_j·x̲_j and w_j ≤ â_j·x̄_j. Finally, we obtain the following linear program, equivalent to minimizing (8):
$$\begin{aligned} \text{maximize} \quad & \sum_{j=1,\, j \neq k}^{n} w_j \\ \text{s.t.} \quad & \hat a_k = 1 \\ & \hat a_j = \sum_{i=1}^{m} p_i \cdot a_{ij}, \quad j = 1, \dots, n \\ & w_j \leq \hat a_j\, \underline{x}_j, \quad j = 1, \dots, n;\; j \neq k \\ & w_j \leq \hat a_j\, \overline{x}_j, \quad j = 1, \dots, n;\; j \neq k \end{aligned} \tag{9}$$
An opposite and analogous reasoning can be performed in order to obtain a linear problem for maximizing the lower bound of the projection of p . A . x = 0 over x k .
Then, solving the linear programs results in obtaining preconditioning vectors p. By means of Gauss–Seidel projections, each of these vectors is capable of improving one of the bounds of an interval x k . The obtained vectors p can be included in a preconditioning matrix P (duplicated vectors can be discarded).
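Under the same variable layout as the sketch in Section 4.2.1, LP (9) differs only in the objective and in the inequalities on the auxiliary variables w_j. Again, this is a hedged sketch of how one might assemble it, not the authors' code:

```python
import numpy as np
from scipy.optimize import linprog

def upper_bound_preconditioner_row(A, lo, hi, k):
    """Solve LP (9) for variable k; z = (p, a_hat, w)."""
    m, n = A.shape
    N = m + 2 * n
    c = np.zeros(N)
    c[m + n:] = -1.0                    # maximize sum w_j == minimize -sum w_j
    c[m + n + k] = 0.0                  # j = k is excluded from the objective
    A_eq = np.zeros((n + 1, N)); b_eq = np.zeros(n + 1)
    for j in range(n):
        A_eq[j, m + j] = 1.0
        A_eq[j, :m] = -A[:, j]          # a_hat_j = sum_i p_i a_ij
    A_eq[n, m + k] = 1.0; b_eq[n] = 1.0 # a_hat_k = 1
    A_ub = np.zeros((2 * n, N)); b_ub = np.zeros(2 * n)
    for j in range(n):
        A_ub[j, m + n + j] = 1.0;     A_ub[j, m + j] = -lo[j]      # w_j <= a_hat_j*lo_j
        A_ub[n + j, m + n + j] = 1.0; A_ub[n + j, m + j] = -hi[j]  # w_j <= a_hat_j*hi_j
    bounds = [(None, None)] * N
    bounds[m + n + k] = (0.0, 0.0)      # w_k is unused; pin it to zero
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m]
```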
Example 4.
By solving the two linear programs for each variable in Example 1, we obtain the following preconditioning matrix (recall that each row corresponds to a solution vector p of a linear program):
$$P = \begin{pmatrix} 0.066 & 0.008 & -0.075 \\ -0.127 & -0.006 & 0.068 \\ -0.098 & 0.013 & -0.105 \\ 0.023 & -0.011 & 0.098 \\ -0.067 & 0.113 & -0.056 \\ 0.069 & -0.009 & 0.073 \\ -0.061 & 0.113 & -0.062 \end{pmatrix}$$
The first five rows were obtained by minimizing the upper bounds of the projections, while the last two were obtained by maximizing the lower bounds of the projections. The missing rows correspond to duplicated ones. Finally, we obtain the following matrix P . A :
$$P.A = \begin{pmatrix} 1 & -0.586 & 0 & -0.361 & 0 \\ -1.400 & 1 & 0.354 & 0 & 0 \\ 0 & 0.505 & 1 & -1.428 & 0 \\ -0.495 & 0 & -0.574 & 1 & 0 \\ 0 & 0.050 & 0 & 0.025 & 1 \\ 0 & -0.353 & -0.699 & 1 & 0 \\ 0.083 & 0 & -0.003 & 0 & 1 \end{pmatrix}$$
Compared to the preconditioner obtained in Section 4.2.1, the preconditioner in this section offers additional contraction on x_5 → [−0.245, −0.025] ⊂ [−0.316, −0.025].
Proposition 1.
Let p be an optimal solution of the linear program (9). Then, by using the system p.A.x = 0 and Gauss–Seidel, we can optimally improve the upper bound of the interval x_k.
Proof. 
The optimal upper bound of an interval domain x_k is equivalent to the maximum value of x_k subject to the constraint system A.x = 0, i.e.,
$$\begin{aligned} \text{maximize} \quad & x_k \\ \text{s.t.} \quad & \sum_{j=1}^{n} a_{ij} \cdot x_j = 0, \quad i = 1, \dots, m \\ & \underline{x}_j \leq x_j \leq \overline{x}_j, \quad j = 1, \dots, n \end{aligned} \tag{10}$$
We first consider the dual problem of (10). Let π ∈ R^m be the vector associated with the equality constraints, l ∈ R^n the vector associated with the bound constraints x̲_j ≤ x_j, and u ∈ R^n the vector associated with the bound constraints x_j ≤ x̄_j. Thus, the dual problem of (10) can be stated as follows:
$$\begin{aligned} \text{minimize} \quad & \sum_{i=1}^{m} 0 \cdot \pi_i + \sum_{j=1,\, j \neq k}^{n} \underline{x}_j \cdot l_j - \sum_{j=1,\, j \neq k}^{n} \overline{x}_j \cdot u_j \\ \text{s.t.} \quad & \sum_{i=1}^{m} a_{ij} \cdot \pi_i + l_j - u_j = 0, \quad j = 1, \dots, n;\; j \neq k \\ & \sum_{i=1}^{m} a_{ik} \cdot \pi_i = -1 \\ & l, u \geq 0, \quad \pi \ \text{free} \end{aligned} \tag{11}$$
By defining p_i = −π_i and â_j = Σ_{i=1}^m p_i · a_{ij}, and by developing the previous linear program, we obtain
$$\begin{aligned} \text{maximize} \quad & \sum_{j=1,\, j \neq k}^{n} (\overline{x}_j \cdot u_j - \underline{x}_j \cdot l_j) \\ \text{s.t.} \quad & -\hat a_j + l_j - u_j = 0, \quad j = 1, \dots, n;\; j \neq k \\ & \sum_{i=1}^{m} p_i \cdot a_{ij} = \hat a_j, \quad j = 1, \dots, n;\; j \neq k \\ & \hat a_k = 1 \\ & l, u \geq 0, \quad p \ \text{free} \end{aligned} \tag{12}$$
Let w_j = x̄_j·u_j − x̲_j·l_j, for j ≠ k. Notice that, if the first constraint is multiplied by x̄_j, we obtain the following result:
$$\begin{aligned} \overline{x}_j \cdot \hat a_j + \overline{x}_j \cdot u_j &= \overline{x}_j \cdot l_j \quad && /\ \text{adding } -\underline{x}_j \cdot l_j \\ \overline{x}_j \cdot \hat a_j + w_j &= \overline{x}_j \cdot l_j - \underline{x}_j \cdot l_j \\ w_j &= -\overline{x}_j \cdot \hat a_j + l_j\,(\overline{x}_j - \underline{x}_j), \end{aligned}$$
as l_j ≥ 0, we can deduce that w_j ≥ −x̄_j·â_j. Using the same procedure, but multiplying by x̲_j instead, it can be deduced that w_j ≥ −x̲_j·â_j. Finally, we reach the same linear program stated in (9).
As finding the best preconditioning vector p for projecting over the upper bound of x_k is equivalent to the dual linear problem of finding an optimal upper bound of x_k, then, according to the duality theorem, the values of the optimal solutions are the same. In other words, by using the preconditioned system p.A.x = 0 and the Gauss–Seidel procedure, we can optimally improve the upper bound of x_k.    □
Proposition 1 can be extended to lower bounds in a straightforward way. It is important to highlight that an equivalent proposition was derived in [12] by directly using the duality theory of linear programming.

5. Experiments

For validating our approach, we first compare the different strategies for contracting variable domains related to linear systems (see Section 5.1). Then, we observe the contraction power of a preconditioner P in boxes which are smaller than the box used for generating P (see Section 5.2). Finally, we include the preconditioning-based strategies into a non-convex optimization solver and compare the results with those of a standard strategy (see Section 5.3).
For the first experiments, we generated several sets of benchmark instances by using a random linear system generator (https://github.com/vareyesr/linear-generator, accessed on 1 October 2022). The generator constructs rectangular linear systems A . x = b with n variables and m constraints ( n > m ). Each constraint i has the following structure:
$$\sum_{j=1}^{n} a_{ij}\, x_j = b_i,$$
where a_ij corresponds to a real value between −10 and 10.
Without loss of generality, we set b to the null vector (i.e., x equal to the null vector is always a solution of the problem). Additionally, for all the performed experiments, the number of variables was fixed at n = 20. The number of constraints m varied from 12 to 19. For each value of m, 20 systems were generated. The bounds of each variable domain were initially set to random values uniformly distributed in the range [−50, 0] for the lower bound and [0, 50] for the upper bound. It is important to note that each variable domain included the value 0 to prevent empty solutions or manifolds.
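A sketch of such a generator (our reconstruction of the setup described above, not the code in the linked repository) is:

```python
import numpy as np

def random_system(n=20, m=15, seed=0):
    """Random A.x = b with b = 0 and domains that always contain 0."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-10.0, 10.0, size=(m, n))
    lo = rng.uniform(-50.0, 0.0, size=n)   # lower bounds in [-50, 0]
    hi = rng.uniform(0.0, 50.0, size=n)    # upper bounds in [0, 50]
    return A, np.zeros(m), lo, hi
```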
All the strategies explained in the previous sections were incorporated into Ibex 2.8.9 [25], a C++ state-of-the-art library for constraint processing over real numbers.

5.1. Contracting Power

Figure 1 reports a comparison between the different strategies. The plot on the left side shows, on problems with different numbers of constraints, the average relative width of the most contracted interval w.r.t. the width of the optimally contracted interval. Considering that an input box is contracted to x′, the corresponding relative width is wid(x′_i)/wid(x*_i), where i = argmin_{i ∈ {1,…,n}} wid(x′_i) and x* is the optimally contracted box. The optimal contraction is performed by obbt.
On the other hand, the right plot shows the average relative perimeter of the contracted box, w.r.t. the perimeter of the optimally contracted box. The relative perimeter is computed analogously to the relative width.
All the strategies perform the Gauss–Seidel procedure for contracting x on the system P . A . x = 0 . The strategy Gauss max constructs the preconditioning matrix P by using the Gauss elimination method with the maximum heuristic, while the strategy Gauss max-diam constructs P by using the heuristic that takes into account the size of the interval domains x i . The strategy LP min-size constructs P by solving the linear programs (7) that minimize the size of the projection intervals. Recall that the strategy that constructs P by solving the linear programs (9) is optimal; thus, its results are not reported in the plots.
From the figure, we can see that the linear-based preconditioner is the closest to obbt. The largest difference in terms of relative width/perimeter between all the strategies occurs when the number of constraints is 17.
On the other hand, when the number of rows is 19, all the strategies are optimal. The reason is that, in this case, all the preconditioners behave like A^{-1}, as the number of rows of A is almost equal to the number of columns.
A similar situation occurs when the number of constraints is low. In this case, it seems that the initial domains are close to the global hull-consistency [26]. Global hull-consistency occurs when each domain bound is part of a solution.
Even if LP min-size is the one with the best contraction among the three strategies, its contraction is suboptimal; this is because the strategy uses only one preconditioning vector p for improving both bounds of each interval. Additionally, we can see that by taking into account the domain sizes in the Gauss pivoting heuristic (Gauss max-diam), we reach a significantly better contraction compared to its counterpart.

5.2. Sustainability

In a second series of experiments, we evaluate the sustainability of the approaches. That is, we want to know how long in the search we could use the same preconditioner P without losing too much effectiveness in contraction. In addition to the strategies used in the first series of experiments, we have included the strategy LP-opt, which constructs a preconditioning matrix that offers the same contraction power as obbt.
The experiment involves first generating a preconditioning matrix P using the same set of benchmarks and initial box  x as the previous experiment. Then, we arbitrarily and randomly reduce this box to a fraction of its original width ( 50 % , 10 % and 1 % ). Plots in Figure 2 report the average relative width obtained by the contraction performed by the strategies by using the reduced box. In this way, we simulate, in a certain way, what happens in an iteration of a B&B solver after some subdivisions and domain reductions of the initial box x . The optimal contraction is performed by obbt on the reduced box.
In the plots, we can observe that as the widths decrease, the strategies move away from the optimal contraction achieved by the reference strategy. Additionally, it is evident that the linear-based strategies consistently outperform the Gauss-based ones. Similar to the previous experiment, the Gauss-based strategy considering the size of the interval domains performs better than its counterpart, even when the size of the box is small.
When the width of the reduced box is 50% of the original width, we can see that the contraction performed by LP-opt is the best among all the strategies. However, when the width is small (10% or 1% of the original width), LP min-size is more effective than LP-opt. We think that, as the preconditioning vectors p generated by the strategy LP min-size focus on improving both bounds of a variable at the same time, they are probably more adaptable to changes in the interval domain bounds.

5.3. Non-Convex Optimization Problems

As a final series of experiments, we implemented a preconditioning-based method for filtering variable domains (obbt-gs α) and included it in an interval B&B global optimizer: IbexOpt [11]. In general, interval B&B methods solve non-convex optimization problems, such as (1), by performing a branch-and-bound schema from an initial node [3]. At each iteration, a box from the list of remaining boxes is selected and processed. Once such a selection is performed, the box is divided into two or more sub-boxes by using a bisection strategy. New boxes are then treated by one or several contractors. Contractors attempt to remove inconsistent values from the bounds of the intervals without losing solutions; this process is known as contraction. Contraction algorithms can be primarily divided into three categories: interval analysis contractors (e.g., Interval Newton [27]), constraint programming contractors (e.g., constraint propagation methods such as HC4/FBBT or 3BCID), and linear-relaxation-based contractors, such as obbt. Finally, the upper bounding method is applied. This method consists in finding feasible solutions in the box to be used for pruning the search space.
Thus, in the following experiment, we propose to use obbt-gs α instead of obbt as the linear-relaxation-based contractor of IbexOpt. Algorithm 1 shows the method. As input, obbt-gs α receives the current box x , the objective function f and the functions g related to the constraint system g ( x ) 0 . It returns a contracted box x c .
First, if the matrix P.A has not been created, or certain conditions (which will be explained later) are met, then the constraint system is linearized with a traditional linearization technique (e.g., AF2 [28] or XNewton [5]), resulting in a linear system A.x = b (note that b is an interval vector). Then, the method obbt contracts the box and generates the preconditioning matrix P by using the strategy LP-opt. The matrices P.A and P.b are computed and stored for future use. The variable x_p (the last preconditioned box) is updated to the current box x. On the other hand, if the conditions are not met, then the much cheaper Gauss–Seidel method is applied, using the stored P.A and P.b, for contracting the current box x instead of obbt.
Algorithm 1: The obbt-gs α contractor for reducing box domains.
The preconditioning matrix is recomputed, and the box is contracted by using obbt, if one of the following conditions is met (a sketch of this control flow is given after the list):
  • P.A has not been created.
  • It is not the turn for applying Gauss–Seidel: the user-defined parameter F indicates the frequency of applying Gauss–Seidel instead of obbt. For example, if F = 1/5, then GS_turn(F) returns true once every 5 calls.
  • The space related to the current box is not contained in the space used for computing the current preconditioning matrix, i.e., x ⊄ x_p. This occurs when the algorithm finishes a branch of the search tree and starts another one.
  • The box is too small compared to the last one used for preconditioning, i.e., wid(x) < α · wid(x_p), with 0 ≤ α ≤ 1 a user-defined parameter.
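The following Python-style sketch summarizes this control flow; every helper (gs_turn, linearize, obbt_with_duals, gauss_seidel, the box predicates) and the cached state are our own hypothetical names, standing in for the actual Ibex components:

```python
# Hedged sketch of the obbt-gs_alpha contractor; all helpers are assumed.
def obbt_gs_alpha(x, f, g, state, F, alpha):
    refresh = (
        state.PA is None                       # P.A has not been created yet
        or not gs_turn(F)                      # this call is an obbt turn
        or not contained_in(x, state.x_p)      # left the preconditioned region
        or wid(x) < alpha * wid(state.x_p)     # box too small w.r.t. x_p
    )
    if refresh:
        A, b = linearize(f, g, x)              # e.g., AF2 or XNewton relaxation
        x_c, P = obbt_with_duals(A, b, x)      # optimal contraction + duals
        state.PA, state.Pb, state.x_p = P @ A, P @ b, x_c
        return x_c
    return gauss_seidel(state.PA, state.Pb, x) # cheap reuse of the preconditioner
```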
Example 5.
Consider the following simple example to illustrate the algorithm:
$$\begin{aligned} \min \quad & f(x) = 0.4 x_1^2 + 2 x_1 + x_2 \\ \text{subject to:} \quad & g(x) = -x_1^3 + 5 x_1^2 - 10(x_1 + x_2) \leq 0. \end{aligned}$$
Suppose that, after performing some search, we reach a node where x_1 = [−5, 5] and x_2 = [−5, 5], and that the current best solution x̃ has a value of f(x̃) = 3. Since we are not interested in feasible suboptimal solutions, the additional constraint f(x) ≤ 3 is also considered in the system.
When Algorithm 1 processes the box, it linearizes the constraints. Figure 3-left graphically illustrates the example. Constraints are projected on the box, i.e., we plot f(x) = 3 and g(x) = 0. The feasible region corresponds to the area below the objective function curve and above the constraint curve. The optimal solution of the problem, i.e., x* = (−0.51, 0.66), is also plotted. Notice that g(x) is not convex and its linearization is only valid inside the box.
Straight lines represent the linearization of the constraints corresponding to
$$\begin{aligned} -2x_1 - x_2 &\in [-3, +\infty) \quad (\text{linearization of } f(x) \leq 3) \\ x_1 + x_2 &\in [0, +\infty) \quad (\text{linearization of } g(x) \leq 0). \end{aligned}$$
Next, obbt is applied to contract the variables x_1 and x_2 (see Figure 3-right). That is, bounds are found by maximizing and minimizing each variable over the linear system. The corresponding dual optimal solutions are used for generating the preconditioning vectors p_k for each bound (notice that this is equivalent to solving the linear program (9), according to the proof of Proposition 1).
Finally, the matrix P allows us to construct the linear system  P . A . x = P . b :
$$\begin{pmatrix} 1 & 1 & 0 & -1 \\ 0 & 1 & -1 & -2 \\ 1 & 0 & 1 & 1 \\ 2 & 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
Notice that w_1 ∈ [−3, +∞) and w_2 ∈ [0, +∞) correspond to auxiliary variables. Applying Gauss–Seidel on this system reaches the same contraction as obbt in the current box, and the system will be re-used in subsequent iterations of the algorithm (line 10 of the algorithm).
The set of instances (42 in total) was selected from the COCONUT benchmarks [29] for global optimization (https://arnold-neumaier.at/glopt/coconut/Benchmark/Benchmark.html, accessed on 1 December 2022). We selected all the instances solved by the default strategy (i.e., IbexOpt using obbt) in a time period between 2 and 3600 s. Details of these instances are shown in Table 1, where n represents the number of variables, m represents the number of constraints, and #nonlinear represents the number of nonlinear constraints. It is important to note that all of these benchmarks (1) have a nonlinear objective function, and (2) are differentiable except at certain points where discontinuities may exist.
In Figure 4, we report the average percentage difference in the number of processed boxes (left) and in CPU time (right) of the strategies w.r.t. the default strategy, for different values of F (the percentage difference is computed as (s − s*)/s* · 100, where s corresponds to the value reported by a strategy and s* to that of the default strategy). Notice that a negative value denotes that the strategy reduced its time or its number of boxes compared to the default strategy, while positive values indicate an increase. We consider the following strategies: obbt-gs α for different values of α, and obbt-gs. obbt-gs does not take into account the condition wid(x) < α · wid(x_p), i.e., α = 0.
From the results, we can see that as we increase the frequency of applying obbt (i.e., we reduce the value of F), the average percentage differences, evidently, approach 0. Although obbt-gs α seems to be less effective in contraction than obbt, when the value of F is adequately chosen, we may reach an improvement in CPU time. For instance, when F = 1/4, the increase in the search tree size is compensated by a reduction in the CPU time required for processing nodes, and we reach an average percentage difference of −6.6% (α = 10^{-3}).
In Figure 5, we report the relative gains of obbt-gs α w.r.t. the reference strategy, with different configurations of the parameters. The best results are reported when α = 10^{-3} and F = 1/4. When the condition wid(x) < α · wid(x_p) is not considered, i.e., α = 0, the best results are reported when F = 1/5. Additionally, obbt with F = 1/b consists in applying obbt b out of b+1 times, i.e., it is equivalent to obbt-gs with α = 0 but without using Gauss–Seidel.
Finally, Table 2 reports results on the most relevant instances. In this table, we considered benchmark instances which were solved by the reference strategy in a time greater than 2 s and having a difference, in terms of both the number of boxes and CPU time, of at least 10% compared with the other strategies (16 instances). The columns labeled Boxes (CPU) report the number of boxes (CPU time in seconds) required by each strategy. The table also presents the percentage difference in CPU time (Δt) compared to the default strategy. The strategy obbt-gs uses the parameter values F = 1/5 and α = 0.0, i.e., the best configuration found when α is fixed to 0.0. The strategy obbt-gs α uses the parameter values F = 1/4 and α = 10^{-3}, i.e., the best configuration found for the parameters. The last row reports the average relative gain in CPU time, for the considered set of instances, compared with the reference strategy obbt.
First, notice that obbt-gs α reports, on average, the greatest reductions in terms of CPU time (an average percentage difference of −12.7%). This shows that a well-preconditioned matrix can provide a better balance between contraction power and CPU cost compared to using the traditional obbt strategy, which, although it offers optimal contraction, is much more expensive. On the other hand, if we compare the results with obbt-gs, the parameter α seems to be important; that is, it seems important to update the preconditioning matrix when the boxes are too small compared to the box used in the previous update.
In some instances, such as ex6_2_8, ex8_5_2_1, hs100 and hs113, we can observe important gains in terms of CPU time, even if the number of boxes increases considerably w.r.t. the obbt strategy. This is due to the fact that the worsening of the contraction effectiveness is highly compensated by the time complexity reduction of the Gauss–Seidel method. On the other hand, notice that the reference strategy outperforms obbt-gs and obbt-gs α by more than 10% in CPU time in only two instances (dualc8 and chembis). In these cases, the time complexity reduction of the Gauss–Seidel method is not enough to compensate for the significant increase in the number of boxes.

6. Conclusions

In this work, we propose three methods for generating preconditioning matrices for non-square linear systems A.x = b. These preconditioners can be used for improving the contraction achieved by iterative methods such as the Gauss–Seidel algorithm. The first method generates a preconditioner by using a Gauss–Jordan elimination that takes into account the width of the intervals for selecting the next pivot. In this way, we select the row i maximizing |a_ik| · wid(x_k) for contracting x_k, instead of simply selecting the row maximizing |a_ik| as in previous approaches. Additionally, we have presented two preconditioners generated by solving linear programs. They are focused, respectively, on minimizing the interval size and on optimizing the bounds of the variable domains.
The experiments show promising results. On the one hand, by using the preconditioner based on Gauss–Jordan elimination, we obtain a better contraction than using its counterpart which selects the maximum coefficient of A for pivoting. On the other hand, the preconditioners generated by solving linear programs outperform the ones based on Gauss–Jordan elimination.
We also show that, by using the preconditioner focused on optimizing the interval bounds of a box x, we reach an optimal contraction of this box. In addition, when a smaller box x′ ⊆ x is contracted, although the preconditioner does not offer an optimal contraction, it is still better than the Gauss–Jordan-based strategies. Finally, we propose a simple contractor (obbt-gs α) that replaces obbt calls by Gauss–Seidel iterations on the system P.A.x = P.b when the current box is similar to the box used for generating the last available preconditioner P (and linearization A.x = b). Otherwise, obbt is applied normally and a new preconditioner (and linearization) is generated for future calls to the method. obbt-gs α shows promising results when included in a solver for non-convex optimization problems.
As a future work, we plan to design a more intelligent mechanism for updating P. It should update P only when it is needed, e.g., when some relevant coefficients in A or some relevant bounds of variable domains suffer significant changes. To achieve this, we propose exploring the use of deep learning mechanisms or machine learning algorithms to determine when it is necessary to update P. By training a model on historical data and monitoring changes in the problem structure, we can identify key indicators that trigger the need for a new preconditioner. This would optimize the usage of the obbt-gs α method, reducing unnecessary overheads while ensuring improved convergence rates when relevant changes occur.
Additionally, we aim to investigate the impact of different preconditioning techniques on a broader range of non-convex optimization problems. Understanding how the proposed preconditioners perform on various problem classes and problem sizes will provide valuable insights into their versatility and effectiveness in different scenarios.

Author Contributions

Validation, V.R.; Investigation, V.R. and I.A.; Writing—original draft, V.R. and I.A. All authors have read and agreed to the published version of the manuscript.

Funding

Victor Reyes is supported by Fondecyt project 11230225, and Ignacio Araya is supported by Fondecyt project 1200035.

Data Availability Statement

IbexOpt can be downloaded from https://github.com/ibex-team/ibex-lib (accessed on 1 October 2022). The benchmarks used in Section 5.1 and Section 5.2 were generated by using https://github.com/vareyesr/linear-generator (accessed on 1 October 2022). The benchmarks used in Section 5.3 can be found at https://arnold-neumaier.at/glopt/coconut/Benchmark/Benchmark.html (accessed on 1 December 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bertsekas, D. Nonlinear Programming, 3rd ed.; Athena Scientific Optimization and Computation Series; Athena Scientific: Belmont, MA, USA, 2016. Available online: http://www.athenasc.com/nonlinbook.html (accessed on 1 August 2023).
  2. Boyd, S.P.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
  3. Araya, I.; Reyes, V. Interval branch-and-bound algorithms for optimization and constraint satisfaction: A survey and prospects. J. Glob. Optim. 2016, 65, 837–866.
  4. Locatelli, M.; Schoen, F. Global Optimization: Theory, Algorithms and Applications; SIAM: Philadelphia, PA, USA, 2013.
  5. Araya, I.; Trombettoni, G.; Neveu, B. A contractor based on convex interval Taylor. In Proceedings of the International Conference on Integration of Artificial Intelligence (AI) and Operations Research (OR) Techniques in Constraint Programming, Nantes, France, 28 May–1 June 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 1–16.
  6. Adjiman, C.S.; Dallwig, S.; Floudas, C.A.; Neumaier, A. A global optimization method, αBB, for general twice-differentiable constrained NLPs—I. Theoretical advances. Comput. Chem. Eng. 1998, 22, 1137–1158.
  7. Misener, R.; Floudas, C.A. ANTIGONE: Algorithms for continuous/integer global optimization of nonlinear equations. J. Glob. Optim. 2014, 59, 503–526.
  8. Belotti, P.; Lee, J.; Liberti, L.; Margot, F.; Wächter, A. Branching and bounds tightening techniques for non-convex MINLP. Optim. Methods Softw. 2009, 24, 597–634.
  9. Nowak, I.; Vigerske, S. LaGO: A (heuristic) branch and cut algorithm for nonconvex MINLPs. Cent. Eur. J. Oper. Res. 2008, 16, 127–138.
  10. Achterberg, T. SCIP: Solving constraint integer programs. Math. Program. Comput. 2009, 1, 1–41.
  11. Trombettoni, G.; Araya, I.; Neveu, B.; Chabert, G. Inner regions and interval linearizations for global optimization. In Proceedings of the AAAI, San Francisco, CA, USA, 7–11 August 2011.
  12. Gleixner, A.M.; Berthold, T.; Müller, B.; Weltge, S. Three enhancements for optimization-based bound tightening. J. Glob. Optim. 2017, 67, 731–757.
  13. Cengil, F.; Nagarajan, H.; Bent, R.; Eksioglu, S.; Eksioglu, B. Learning to accelerate globally optimal solutions to the AC Optimal Power Flow problem. Electr. Power Syst. Res. 2022, 212, 108275.
  14. Suriyanarayana, V.; Tavaslioglu, O.; Patel, A.B.; Schaefer, A.J. DeepSimplex: Reinforcement Learning of Pivot Rules Improves the Efficiency of Simplex Algorithm in Solving Linear Programming Problems. 2019. Available online: https://openreview.net/pdf?id=SkgvvCVtDS (accessed on 29 July 2023).
  15. Forrest, J.J.; Goldfarb, D. Steepest-edge simplex algorithms for linear programming. Math. Program. 1992, 57, 341–374.
  16. Dantzig, G.B.; Orden, A.; Wolfe, P. The generalized simplex method for minimizing a linear form under linear inequality restraints. Pac. J. Math. 1955, 5, 183–195.
  17. Niki, H.; Kohno, T.; Morimoto, M. The preconditioned Gauss–Seidel method faster than the SOR method. J. Comput. Appl. Math. 2008, 219, 59–71.
  18. Hansen, E.; Walster, G.W. Solving overdetermined systems of interval linear equations. Reliab. Comput. 2006, 12, 239–243.
  19. Ceberio, M.; Granvilliers, L. Solving nonlinear equations by abstraction, Gaussian elimination, and interval methods. In Proceedings of the International Workshop on Frontiers of Combining Systems, Santa Margherita Ligure, Italy, 8–10 April 2002; Springer: Berlin/Heidelberg, Germany, 2002; pp. 117–131.
  20. Abdi, H. The method of least squares. Encycl. Meas. Stat. 2007, 1, 530–532.
  21. Golub, G.H.; Reinsch, C. Singular value decomposition and least squares solutions. In Linear Algebra; Springer: Berlin/Heidelberg, Germany, 1971; pp. 134–151.
  22. Jaulin, L.; Kieffer, M.; Didrit, O.; Walter, É. Interval Analysis; Springer: Berlin/Heidelberg, Germany, 2001.
  23. Horáček, J.; Hladík, M. Subsquares approach—A simple scheme for solving overdetermined interval linear systems. In Proceedings of the International Conference on Parallel Processing and Applied Mathematics, Warsaw, Poland, 8–11 September 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 613–622.
  24. Domes, F.; Neumaier, A. Rigorous filtering using linear relaxations. J. Glob. Optim. 2012, 53, 441–473.
  25. Chabert, G.; Jaulin, L. Contractor programming. Artif. Intell. 2009, 173, 1079–1100.
  26. Benhamou, F.; Goualard, F.; Granvilliers, L.; Puget, J.F. Revising hull and box consistency. In Proceedings of the International Conference on Logic Programming, Las Cruces, NM, USA, 29 November–4 December 1999; Citeseer: State College, PA, USA, 1999.
  27. Moore, R.E. Methods and Applications of Interval Analysis; SIAM: Philadelphia, PA, USA, 1979.
  28. Ninin, J.; Messine, F.; Hansen, P. A reliable affine relaxation method for global optimization. 4OR 2015, 13, 247–277.
  29. Shcherbina, O.; Neumaier, A.; Sam-Haroud, D.; Vu, X.H.; Nguyen, T.V. Benchmarking global optimization and constraint satisfaction codes. In Proceedings of the Global Optimization and Constraint Satisfaction: First International Workshop COCOS 2002, Valbonne-Sophia Antipolis, France, 15–18 October 2002; Springer: Berlin/Heidelberg, Germany, 2003; pp. 211–222.
Figure 1. Average relative sizes of the contracted boxes w.r.t. the optimal contracted boxes. (left) Relative width of the most contracted variable; (right) relative perimeter after contraction.
Figure 2. Relative width of the most contracted interval on linear systems with a different number of constraints. Each strategy uses the initial box x for generating the preconditioning matrix P. Then, the contraction is performed on a randomly generated box x′ ⊆ x with (a) 50%, (b) 10% and (c) 1% of the width of x.
Figure 3. Example showing the constraints f(x) ≤ 3 and g(x) ≤ 0 projected over a box x = x_1 × x_2. Straight lines represent the linearization of the functions over the box. The point (−0.51, 0.66) corresponds to the solution minimizing f(x) subject to the constraint. On the right figure we can see, in blue, the box generated after applying obbt over the initial box and the linear system.
Figure 4. Performance profile. Comparison between the results reported by different configurations of α for the obbt-gs strategy: (left) percentage of contraction obtained by a strategy given a certain number of calls (F); (right) percentage of CPU time spent by a strategy.
Figure 5. Summary of the best results, i.e., obbt-gs with an α of 10^{-3} and obbt-gs without the α parameter. Additionally, obbt represents the curve where only obbt is applied (without Gauss–Seidel). (left) Percentage of contraction obtained by a strategy given a certain number of calls (F); (right) percentage of CPU time spent by a strategy.
Table 1. Details of the benchmark instances used in the experiments (n: number of variables; m: number of constraints; #nonlinear: number of nonlinear constraints).

Benchmark | n | m | #nonlinear
avgasa | 8 | 10 | 0
chembis | 11 | 4 | 0
dipigri | 7 | 4 | 4
dixchlng | 10 | 5 | 5
dualc8 | 8 | 15 | 0
ex2_1_7 | 20 | 10 | 0
ex2_1_8 | 24 | 20 | 0
ex2_1_9 | 10 | 1 | 0
ex5_4_4 | 27 | 19 | 6
ex6_1_3 | 12 | 9 | 6
ex6_1_3bis | 6 | 3 | 0
ex6_2_10 | 6 | 3 | 0
ex6_2_11 | 3 | 1 | 0
ex6_2_12 | 4 | 2 | 0
ex6_2_14 | 4 | 2 | 0
ex6_2_8 | 3 | 1 | 0
ex7_2_8 | 8 | 4 | 4
ex7_3_4bis | 7 | 14 | 2
ex7_3_5bis | 4 | 6 | 2
ex8_1_3 | 2 | 2 | 0
ex8_4_4-1 | 17 | 12 | 12
ex8_4_4bis | 5 | 4 | 0
ex8_4_5 | 15 | 11 | 11
ex8_4_5bis | 4 | 1 | 0
ex8_5_1 | 6 | 5 | 2
ex8_5_1-1 | 6 | 5 | 2
ex8_5_1bis | 7 | 6 | 2
ex8_5_2_1 | 6 | 5 | 2
ex8_5_4 | 5 | 4 | 2
ex8_5_5 | 5 | 4 | 2
ex8_5_6 | 6 | 4 | 2
hhfair | 27 | 25 | 6
hs056 | 7 | 4 | 4
hs100 | 7 | 4 | 4
hs113 | 10 | 8 | 8
hs119 | 16 | 8 | 0
hydro | 30 | 24 | 6
meanvar | 7 | 2 | 0
schwefel5 | 5 | 5 | 0
schwefel5-abs | 5 | 5 | 0
srcpm | 38 | 20 | 0
srcpm-1 | 39 | 20 | 0
Table 2. CPU times (in seconds) and number of boxes for the reference strategy obbt, obbt-gs (with parameter values F = 1/5 and α = 0.0) and obbt-gs α (with parameter values F = 1/4 and α = 10^{-3}).

Instance | obbt Boxes | obbt CPU | obbt-gs Boxes | obbt-gs CPU | obbt-gs Δt | obbt-gs α Boxes | obbt-gs α CPU | obbt-gs α Δt
ex6_2_12 | 7854 | 10.8 | 9854 | 11.1 | 3.3% | 10,290 | 10.1 | −6.2%
ex6_2_8 | 31,793 | 41.5 | 45,649 | 38.2 | −8.2% | 46,079 | 32.6 | −21.7%
ex8_4_4bis | 77,506 | 114 | 114,091 | 99.1 | −13.1% | 120,760 | 87.1 | −26.6%
ex8_5_2_1 | 9616 | 23.6 | 13,587 | 21.6 | −8.7% | 14,853 | 19.2 | −18.9%
schwefel5-abs | 6329 | 8.6 | 9485 | 6.27 | −27.1% | 11,283 | 7.97 | −7.3%
ex6_1_3 | 16,553 | 122 | 22,837 | 112.4 | −7.9% | 21,875 | 93.1 | −23.8%
srcpm | 337 | 7.02 | 597 | 6.3 | −10.1% | 600 | 6.30 | −10.3%
ex2_1_7 | 2515 | 17.9 | 2939 | 16.3 | −9.0% | 2609 | 15.1 | −15.8%
ex8_5_1 | 4135 | 10.1 | 5072 | 7.07 | −30.1% | 4899 | 8.41 | −16.8%
dixchlng | 1787 | 8.30 | 1935 | 6.71 | −19.7% | 2158 | 6.51 | −22.1%
hs100 | 2667 | 6.75 | 3753 | 5.71 | −15.4% | 4050 | 4.8 | −28.3%
hs113 | 4769 | 19.3 | 6817 | 16.3 | −15.9% | 7593 | 15.0 | −22.5%
hhfair | 1609 | 21.9 | 2132 | 18.9 | −13.8% | 2185 | 25.2 | 15.4%
dualc8 | 190,092 | 532 | 331,281 | 749 | 40.9% | 322,698 | 611 | 14.9%
chembis | 483,627 | 1425 | 802,579 | 1725 | 21.1% | 500,690 | 1462 | 2.6%
ex8_4_5 | 16,934 | 216 | 27,942 | 204 | −5.8% | 17,368 | 191 | −11.5%
 | | | | | avg: −8.5% | | | avg: −12.7%