Article

Research on Cloud Task Scheduling Algorithm with Conflict Constraints Based on Branch-and-Price

1
School of Information Science and Engineering, Yunnan University, Kunming 650504, China
2
School of Mathematics and Statistics, Yunnan University, Kunming 650504, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(13), 7505; https://doi.org/10.3390/app13137505
Submission received: 24 May 2023 / Revised: 13 June 2023 / Accepted: 23 June 2023 / Published: 25 June 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The low-energy task scheduling of cloud computing systems is a key issue in the field of cloud computing. Nevertheless, existing works on task scheduling rarely consider the conflict relationship between tasks and focus on heuristic and other approximate algorithms. Thus, minimizing energy consumption under antiaffinity constraints between tasks and designing an efficient exact algorithm for task scheduling is a major challenge. This paper abstracts the problem into a multidimensional bin packing model with conflict constraints. The model is decomposed by the Lagrangian relaxation principle and the Dantzig–Wolfe decomposition principle. Moreover, we propose an exact algorithm based on branch-and-price. The algorithm benefits from a new initial solution generation scheme based on maximal cliques and the dominant resource proportion, and from a multipattern branching strategy. The efficiency of the proposed branch-and-price algorithm is verified by a number of numerical experiments.

1. Introduction

With the rapid development of cloud computing technology, the huge energy consumption of cloud computing has attracted wide attention. Cloud data centers are estimated to account for 1.4% of global energy consumption, and this figure grows by approximately 12% annually [1]. Currently, many studies in cloud computing focus on task scheduling algorithms with the goal of reducing energy consumption [2,3]. Most of the existing works use heuristic algorithms, random algorithms, and intelligent algorithms to find approximately optimal solutions of the problem [4,5].
To reduce the energy consumption of cloud computing systems, an effective and common method is to schedule tasks to as few virtual machines (VMs) as possible [6]. However, existing studies pay little attention to the stability of the cloud computing system and, in particular, do not consider the antiaffinity constraints between tasks. Kubernetes [7] first introduced the node antiaffinity feature, which can improve the usability and robustness of applications. Similarly, OpenStack [8] provides an antiaffinity scheduler filter. In a cloud computing system, an antiaffinity constraint expresses that two tasks interfere with each other: when they are deployed on the same VM, the performance of each may be degraded by the other. For example, Chi et al. [9] experimentally verified that deploying a disk-intensive task to a physical machine with multiple disk-intensive tasks will lead to resource contention and serious performance degradation. Therefore, when scheduling tasks, it is necessary to avoid deploying mutually interfering tasks on the same VM. This problem is equivalent to the multidimensional bin packing problem with conflicts (MDBPPC).
The MDBPPC model originates from the classic bin packing model, but because it is difficult to solve, it has received little attention. Existing research focuses on the bin packing problem with conflicts (BPPC) with a single resource dimension. Jansen [10] first introduced conflict relationships into the traditional bin packing model and represented them as a graph. Christman et al. [11] studied the bin packing problem with stability, which is a variant of BPPC. Note that previous works have focused on algorithm design for single-dimension resource requirements and have not considered multidimensional resource requirements.
Therefore, this paper studies the cloud computing task scheduling problem with antiaffinity constraints, and aims to minimize the number of cloud computing VMs to reduce the energy consumption of the cloud computing system. The main contributions of this work are summarized as follows:
  • We formulate the low-energy task scheduling problem with antiaffinity constraints in cloud computing as an MDBPPC problem. To reduce the solution difficulty, we decompose the problem using the Lagrangian relaxation principle and the Dantzig–Wolfe decomposition principle.
  • To obtain the optimal integer solution, we design an efficient branch-and-price algorithm with several improvements. First, we introduce the idea of the dominant resource proportion into the process of initial solution generation. Second, we use an efficient branching rule based on multiple patterns. Third, we use the hot start and upper bound strategy to improve the branch-and-price algorithm.
  • We conduct numerical simulations on three scales to evaluate the correctness and effectiveness of our proposed algorithm. Moreover, we verify the effectiveness of the initial solution strategy and the branching strategy.
The remainder of this paper is organized as follows. Section 2 summarizes the related work. Section 3 describes the cloud computing task scheduling problem with conflict constraints. Section 4 decomposes the original model. Section 5 proposes a task scheduling algorithm based on branch-and-price, and Section 6 gives the experimental results. Section 7 concludes the paper.

2. Related Works

Recently, many works have focused on reducing the energy consumption of task scheduling in cloud computing. Beloglazov et al. [12], as pioneers, designed an energy-efficient cloud computing architecture model based on CPU and proposed an energy-efficient resource allocation algorithm, but did not consider hardware resources other than CPU. Garg et al. [13] considered multiple resources; they first designed a mathematical model to measure the energy consumption in the process of task execution, and then proposed a scheduling algorithm that can minimize the average energy consumption of each task. Mekala et al. [14] classified tasks and VMs, and scheduled tasks to corresponding VMs based on the classification results. This scheduling algorithm can both improve resource utilization and reduce energy consumption. Zhang et al. [15] weighed the energy consumption against the performance of the cloud computing system. They abstracted the problem into a bi-objective integer programming problem and designed a two-stage heuristic algorithm. Liang et al. [16] designed a dynamic hybrid resource deployment algorithm to minimize the energy consumption of data centers by improving the average utilization rate of physical machines. The algorithm is based on the K-means clustering algorithm in unsupervised learning and the KNN classification algorithm in supervised learning. Note that the previous works are mostly based on approximation algorithms. Their solutions often lack theoretical guarantees and can only be evaluated by numerical experiments, which limits their general applicability.
In terms of the algorithm research in BPPC, Gendreau et al. [17] studied the lower bound of the problem and designed various heuristic algorithms. Capua et al. [18] proposed an iterated local search algorithm. This metaheuristic algorithm can enhance the search procedure with a variety of local and large neighborhoods. Elhedhli et al. [19] proposed an exact algorithm based on branch-and-price. They utilized the structure of conflict constraints to design an efficient branching strategy. This strategy can preserve the model of subproblems after branching, thus greatly improving the efficiency of the algorithm. Wei et al. [20] designed a new label-setting scheme that can effectively solve the pricing problem and accelerate algorithm execution speed. Ekici [21] considered a variant model of BPPC, which allowed fragments of items. He proved that the proposed new model was still NP-hard and designed a heuristic algorithm based on a lower bounding scheme. It is noteworthy that the existing studies focus on the algorithm design for single-dimension resource requirements.

3. System Model and Problem Formulation

We assume that there are $m$ VMs with the same configuration in the cloud computing system, and let $M = \{1, 2, \ldots, m\}$ be the set of VMs. Each VM has $k$ types of hardware resources, such as CPU, memory, and disk. Let $R = \{1, 2, \ldots, k\}$ be the set of hardware resource types, and let the vector $c = (c_1, c_2, \ldots, c_k)$ represent the resource capacity of each VM. Meanwhile, there are $n$ ($n \le m$) tasks waiting to be deployed in the cloud computing system. Let $J = \{1, 2, \ldots, n\}$ represent the set of tasks. For each task $j$ ($j \in J$), let $d_j = (d_{j1}, d_{j2}, \ldots, d_{jk})$ represent the resource requirement vector of task $j$, where $d_{jr}$ represents the requirement of the $j$th task for the $r$th resource. In a cloud computing environment, there may be interference between tasks. If mutually interfering tasks are deployed on the same VM, their performance may be affected. Therefore, when scheduling tasks, it is important to avoid scheduling mutually interfering tasks to the same VM. The interference relationship between tasks is characterized by a conflict graph $G = (V, E)$. Each vertex in the vertex set $V$ corresponds to a task in the task set $J$, and the conflict relationships between tasks correspond to the edge set $E$. For any two vertices $j, j' \in V$, $(j, j') \in E$ indicates that there is a conflict relationship between task $j$ and task $j'$; thus, they cannot be deployed on the same VM. The objective of task scheduling in this problem is to use as few VMs as possible. The problem can be described by the following integer programming model:
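$$z = \min \sum_{i \in M} y_i \tag{1}$$
$$\text{s.t.} \quad \sum_{j \in J} d_{jr} \cdot x_{ij} \le c_r \cdot y_i, \quad \forall i \in M,\ r \in R \tag{2}$$
$$\sum_{i \in M} x_{ij} = 1, \quad \forall j \in J \tag{3}$$
$$x_{ij} + x_{ij'} \le 1, \quad \forall (j, j') \in E,\ i \in M \tag{4}$$
$$x_{ij},\ y_i \in \{0, 1\}, \quad \forall j \in J,\ i \in M \tag{5}$$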
The objective $z$ is to minimize the number of cloud computing VMs deployed. The decision variable $x_{ij}$ indicates whether task $j$ is scheduled to VM $i$, where $x_{ij} = 1$ indicates that task $j$ is executed on VM $i$, and $x_{ij} = 0$ indicates that it is not. The decision variable $y_i$ indicates whether any task is executed on the $i$th VM, where $y_i = 1$ indicates that there is and $y_i = 0$ otherwise. Formula (2) indicates that the resources used on any VM cannot exceed its capacity for any type of resource. Formula (3) ensures that each task is executed on exactly one VM. Formula (4) ensures that conflicting tasks cannot be executed on the same VM. Formula (5) ensures that the decision variables $x_{ij}$ and $y_i$ can only take the values 0 or 1.
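To make the model concrete, the following minimal Python sketch checks constraints (2)–(4) for a given assignment and finds the minimum number of VMs by plain enumeration. It is only an illustration for tiny instances, not the algorithm proposed in this paper. The function names are hypothetical, and the five-task data are an assumption read off the SP model and the conflict cliques of the worked example in Section 5, with tasks indexed from 0.

```python
from itertools import product


def is_feasible(assign, demands, capacity, conflicts):
    """Check constraints (2)-(4); assign[j] is the VM index of task j."""
    # Constraint (4): conflicting tasks must not share a VM.
    if any(assign[j] == assign[k] for j, k in conflicts):
        return False
    # Constraint (2): per-VM load must not exceed capacity for any resource.
    load = {}
    for j, vm in enumerate(assign):
        used = load.setdefault(vm, [0] * len(capacity))
        for r, d in enumerate(demands[j]):
            used[r] += d
            if used[r] > capacity[r]:
                return False
    return True


def min_vms_bruteforce(demands, capacity, conflicts, m):
    """Enumerate all m^n assignments; constraint (3) holds by construction."""
    best = None
    for assign in product(range(m), repeat=len(demands)):
        if is_feasible(assign, demands, capacity, conflicts):
            used = len(set(assign))  # objective (1): number of VMs in use
            if best is None or used < best:
                best = used
    return best


# Assumed example data: (CPU, memory) demands, capacity (5, 5),
# conflict edges from the cliques {1,2,3}, {2,4}, {3,5} (0-based here).
DEMANDS = [(2, 1), (1, 3), (2, 4), (3, 2), (4, 2)]
CAPACITY = (5, 5)
CONFLICTS = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)]
```

Enumeration grows as $m^n$ and is therefore usable only for very small instances, which is the motivation for the decomposition in the next section.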

4. Model Decomposition

Since the above problem generalizes both the classical bin packing problem and the graph coloring problem, it is quite difficult to solve: when the conflict graph has no edges, the problem reduces to the classic multidimensional bin packing problem, and when the resource requirement of every task is zero, so that the capacity constraints never bind, the problem reduces to the graph coloring problem. Both the bin packing problem and the graph coloring problem have been proven to be NP-hard [22], so the time required to solve such problems exactly grows rapidly with the size of the problem. Therefore, this paper first relaxes the tightly coupled constraint (3) of the original model based on the Lagrangian relaxation principle [23] to obtain the Lagrangian relaxation problem of the original problem. Second, we decompose the relaxation problem based on the Dantzig–Wolfe decomposition principle [24]. Thus, we obtain a master problem with a relatively simple structure and a multidimensional bin packing subproblem with conflict constraints.
We introduce a Lagrange multiplier vector $\mu = (\mu_1, \mu_2, \ldots, \mu_n)$ to perform Lagrangian relaxation on constraint (3). Let $z(\mu)$ denote the optimal value of the relaxation problem. The model of the Lagrangian relaxation problem is as follows:
$$z(\mu) = \min \sum_{i \in M} y_i - \sum_{i \in M} \sum_{j \in J} \mu_j \cdot x_{ij} + \sum_{j \in J} \mu_j \quad \text{s.t.: } (2), (4), (5)$$
The problem can be decomposed into $m$ identical multidimensional bin packing problems with conflict constraints, each of which is called a subproblem (SP), so that $z(\mu) = m \cdot z_i(\mu) + \sum_{j \in J} \mu_j$. The SP model is as follows:
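$$(\text{SP}) \quad z_i(\mu) = \min\ y - \sum_{j \in J} \mu_j \cdot x_j$$
$$\text{s.t.} \quad \sum_{j \in J} d_{jr} \cdot x_j \le c_r \cdot y, \quad \forall r \in R \tag{7}$$
$$x_j + x_{j'} \le 1, \quad \forall (j, j') \in E \tag{8}$$
$$x_j,\ y \in \{0, 1\}, \quad \forall j \in J \tag{9}$$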
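The separability behind this decomposition can be spelled out in a few lines. The following derivation is a sketch that uses only the symbols defined above:

```latex
\begin{aligned}
z(\mu) &= \min \Big( \sum_{i \in M} y_i
          - \sum_{i \in M} \sum_{j \in J} \mu_j x_{ij} \Big)
          + \sum_{j \in J} \mu_j \\
       &= \sum_{i \in M} \min \Big( y_i - \sum_{j \in J} \mu_j x_{ij} \Big)
          + \sum_{j \in J} \mu_j \\
       &= m \cdot z_i(\mu) + \sum_{j \in J} \mu_j .
\end{aligned}
```

The second line holds because, once constraint (3) is relaxed, the remaining constraints (2), (4), and (5) only couple variables that share the same VM index $i$. The third line uses the fact that all $m$ VMs have the same configuration, so every per-VM minimum takes the same value $z_i(\mu)$.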
Assume that there are $H$ feasible solutions that satisfy constraints (7)–(9), and that each feasible solution is represented by the vector $(x_1^h, x_2^h, \ldots, x_n^h, y^h)^T$, $h = 1, 2, \ldots, H$. Therefore, $z_i(\mu) = \min_{h = 1, 2, \ldots, H} \big( y^h - \sum_{j \in J} \mu_j \cdot x_j^h \big)$, and the optimal Lagrangian lower bound of the original problem is
$$\max_{\mu} \Big( m \cdot z_i(\mu) + \sum_{j \in J} \mu_j \Big) = \max_{\mu} \Big( m \cdot \min_{h = 1, 2, \ldots, H} \Big( y^h - \sum_{j \in J} \mu_j \cdot x_j^h \Big) + \sum_{j \in J} \mu_j \Big)$$
We define the variable $\theta = \min_{h = 1, 2, \ldots, H} \big( y^h - \sum_{j \in J} \mu_j \cdot x_j^h \big)$. Since $(0, 0, \ldots, 0, 0)^T$ is a feasible solution to the SP, $\theta \le 0$. We can obtain the following Lagrangian master problem (LMP):
$$(\text{LMP}) \quad \max \Big( \sum_{j \in J} \mu_j + m \cdot \theta \Big)$$
$$\text{s.t.} \quad \theta + \sum_{j \in J} \mu_j \cdot x_j^h \le y^h, \quad h = 1, 2, \ldots, H$$
$$\theta \le 0$$
Taking the dual of the LMP, we obtain the Dantzig–Wolfe master problem (DMP) as follows:
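$$(\text{DMP}) \quad \min \sum_{h = 1}^{H} y^h \cdot a_h$$
$$\text{s.t.} \quad \sum_{h = 1}^{H} a_h \le m$$
$$\sum_{h = 1}^{H} x_j^h \cdot a_h = 1, \quad \forall j \in J$$
$$a_h \ge 0, \quad h = 1, 2, \ldots, H$$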
From the above models, it can be seen that the optimal solution of SP provides a lower bound $m \cdot z_i(\mu) + \sum_{j \in J} \mu_j$ for $z(\mu)$. The DMP is equivalent to the original relaxation problem, but because the DMP has too many columns (coefficients) to enumerate one by one, we iteratively add effective columns through the column generation algorithm to obtain the restricted master problem (RMP). The optimal solution of RMP provides an upper bound for $z(\mu)$. Therefore, we iteratively solve SP and RMP until the upper and lower bounds of $z(\mu)$ are equal; the result at that point is the optimal solution of the Lagrangian relaxation problem $z(\mu)$.

5. Cloud Task Scheduling Algorithm Based on Branch-and-Price

The optimal solution of the Lagrange relaxation problem obtained through column generation is usually not an integer solution. To obtain the optimal integer solution, it is necessary to use a branch-and-bound algorithm to branch the fractional variables, and obtain the optimal integer solution through a limited number of branches. Therefore, we design a branch-and-price algorithm that combines the branch-and-bound algorithm and column generation algorithm to address the issues and characteristics of the proposed model. The branch-and-price algorithm for task scheduling in clouds (BPTSC) involves three main parts when iteratively searching in the solution space: the generation of the initial solution (column) of the root node, column generation for solving the Lagrangian relaxation optimal solution corresponding to the node, and the branch-and-bound stage. The following subsections will focus on the algorithm design of these three parts.

5.1. Generation of Initial Solution

5.1.1. Algorithmic Procedure

The quality of the initial solution has a significant impact on the performance of the branch-and-price algorithm. A good initial solution can effectively reduce the number of iterations in the column generation solution process, thereby improving the execution speed of the branch-and-price algorithm. The generation strategy of the initial solution for the root node and child nodes in the branch-and-price algorithm search tree is different. The strategy for the root node usually needs to be constructed based on the characteristics of the model, while the child nodes can use the feasible columns on the parent node as its initial columns. Therefore, the quality of the initial solution generation algorithm at the root node is crucial.
To improve the solving speed of the algorithm, we first use the improved Bron–Kerbosch algorithm proposed in [25] to identify all maximal cliques of the task conflict graph $G$ at the root node. Then, we replace the pairwise conflict constraints in the subproblems with conflict constraints based on these maximal cliques, thus reducing the number of constraints and accelerating the algorithm. Meanwhile, the improved Bron–Kerbosch algorithm is also used to find the complete partition of the maximal independent set of tasks. According to the characteristics of multidimensional resources in our proposed model, we introduce the idea of the dominant resource proportion [26,27,28,29] from multiresource fair allocation to improve the well-known first fit decreasing algorithm [30]. The improved algorithm allocates VMs to the tasks in each independent set partition. The definition of the dominant resource proportion is as follows:
$$dr_j = \max_{r \in R} \frac{d_{jr}}{c_r}$$
A larger $dr_j$ indicates that task $j$ has a higher requirement for its dominant resource. The steps of the initial solution generation algorithm at the root node are as follows:
Step 1 Use the improved Bron–Kerbosch algorithm to determine all task conflict maximal cliques and all complete partitions of the task maximal independent set.
Step 2 If all task-independent sets have been processed, terminate the algorithm. Otherwise, retrieve the first unprocessed independent set, and sort its tasks in nonincreasing order of the task dominant resource proportion.
Step 3 Enable a VM for the current independent set.
Step 4 Sequentially traverse the tasks in the current independent set, considering the VMs enabled for this set in ascending order of index number. Schedule task $j$ to the VM with the smallest index number whose remaining capacity of each type of resource is not less than task $j$'s demand for this type of resource. If none of the enabled VMs can host the task, return to Step 3. Once all tasks of the independent set have been deployed, return to Step 2.
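Steps 2 to 4 can be sketched in Python as follows. This is a minimal illustration under several assumptions: the independent-set partition from Step 1 is taken as given, every task fits on an empty VM, and the names `dominant_share` and `initial_columns` are hypothetical. The example data used with it are read off the worked example in Section 5 (capacity (5, 5) and demands taken from the SP model there).

```python
def dominant_share(demand, capacity):
    """Dominant resource proportion: max over resources of demand / capacity."""
    return max(d / c for d, c in zip(demand, capacity))


def initial_columns(partition, demands, capacity):
    """First fit decreasing by dominant share within each independent set.

    partition -- list of independent sets of task ids (assumed precomputed)
    demands   -- dict mapping task id to its tuple of resource demands
    Returns one column (x_1, ..., x_n, y) per enabled VM.
    """
    tasks = sorted(demands)
    columns = []
    for independent_set in partition:
        # Step 2: nonincreasing order of dominant resource proportion.
        order = sorted(independent_set,
                       key=lambda j: dominant_share(demands[j], capacity),
                       reverse=True)
        vms = []  # VMs enabled for this set, as (free capacity, hosted tasks)
        for j in order:
            # Step 4: first enabled VM (lowest index) with enough capacity.
            for free, hosted in vms:
                if all(f >= d for f, d in zip(free, demands[j])):
                    break
            else:
                # Step 3: no enabled VM fits, so enable a new one.
                free, hosted = list(capacity), set()
                vms.append((free, hosted))
            hosted.add(j)
            for r, d in enumerate(demands[j]):
                free[r] -= d
        columns += [tuple(int(t in hosted) for t in tasks) + (1,)
                    for _, hosted in vms]
    return columns


# Assumed example data (task id -> (CPU, memory)), VM capacity (5, 5).
EXAMPLE_DEMANDS = {1: (2, 1), 2: (1, 3), 3: (2, 4), 4: (3, 2), 5: (4, 2)}
EXAMPLE_PARTITION = [[1, 4, 5], [2], [3]]
```

Note that VMs are not shared across independent sets in this sketch, which follows the wording of Step 3 (one fresh VM per independent set).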

5.1.2. Example

Consider a cloud computing system with five VMs, and each VM has five cores of CPU and 5 GB of memory. There are five tasks in the cloud computing system that need to be scheduled, and Table 1 provides the resource requirement information of the tasks. Figure 1 shows the relationship graph between tasks, where Figure 1a is the conflict graph of the tasks. To illustrate the algorithm more intuitively, Figure 1b is provided as the complementary graph of Figure 1a, which is known as the compatibility graph. There is no conflict relationship between any two connected tasks in Figure 1b, and they can be scheduled to the same VM.
The initial solution of the root node is generated as follows. First, the improved Bron–Kerbosch algorithm is used to find all the task conflict maximal cliques $\{1, 2, 3\}$, $\{2, 4\}$, $\{3, 5\}$ and the complete partition of the task maximal independent set $\{1, 4, 5\}$, $\{2\}$, $\{3\}$. Take the first independent set $\{1, 4, 5\}$. Since $dr_1 = 2/5$, $dr_4 = 3/5$, and $dr_5 = 4/5$, the task list after sorting is $\{5, 4, 1\}$. Enable VM 1 and sequentially traverse the task list $\{5, 4, 1\}$. The available resources of VM 1 at this time are $(5, 5)$, which can meet the resource requirements of task 5. Therefore, task 5 is scheduled to be executed on VM 1. The available resources of VM 1 become $(1, 3)$, which cannot meet the resource requirements of task 4. Therefore, VM 2 is enabled and task 4 is scheduled to be executed on VM 2. Since VM 1 cannot meet the resource requirements of task 1 but VM 2 can, task 1 is scheduled to VM 2 for execution. At this time, all tasks in the first independent set are scheduled. Similarly, VM 3 is enabled for the second independent set $\{2\}$, and task 2 is scheduled to VM 3 for execution. VM 4 is enabled for the third independent set $\{3\}$, and task 3 is scheduled to VM 4 for execution. Therefore, the objective value of the generated initial feasible solution is 4, and the generated initial feasible columns are $(0, 0, 0, 0, 1, 1)^T$, $(1, 0, 0, 1, 0, 1)^T$, $(0, 1, 0, 0, 0, 1)^T$, $(0, 0, 1, 0, 0, 1)^T$.
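The cliques in this example can be reproduced with the textbook Bron–Kerbosch recursion. The sketch below is the basic version without pivoting, not the improved variant of [25]; the helper names are hypothetical, and the edge list is the conflict graph implied by the cliques listed above.

```python
def build_adjacency(vertices, edges):
    """Undirected adjacency map {vertex: set of neighbours}."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bron_kerbosch(adj, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique of the graph as a frozenset.

    r -- current clique, p -- candidate vertices, x -- excluded vertices.
    """
    if p is None:
        p = frozenset(adj)
    if not p and not x:
        yield r  # r cannot be extended, so it is maximal
        return
    for v in list(p):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p = p - {v}
        x = x | {v}


# Conflict graph of the example (assumed from the cliques in the text).
CONFLICT_EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 5)]
CONFLICT_ADJ = build_adjacency(range(1, 6), CONFLICT_EDGES)
```

Running the same routine on the complementary (compatibility) graph enumerates the maximal independent sets of the conflict graph.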

5.2. Column Generation Algorithm

5.2.1. Algorithmic Procedure

One of the cores of branch-and-price algorithms is to use column generation to find the optimal solution of the Lagrangian relaxation problem at each node. The solution process of the column generation algorithm on each node is as follows:
Step 1 Check whether the current node is the root node. If it is, the feasible columns generated by the initial solution are used as the initial columns of the node. Otherwise, use the feasible columns of the parent node that satisfy the branch constraint as the initial columns of the current node to obtain RMP.
Step 2 Solve RMP and obtain a set of dual solutions, then pass the dual solutions to SP.
Step 3 Solve SP and calculate the lower bound $m \cdot z_i(\mu) + \sum_{j \in J} \mu_j$ of the Lagrangian relaxation problem. Check whether the lower bound is equal to the upper bound, i.e., the optimal value of RMP. If they are equal, the algorithm terminates. Otherwise, add the columns generated by SP to RMP and return to Step 2.

5.2.2. Example

The process of using the column generation algorithm to solve the optimal solution of the Lagrangian relaxation problem of the root node in Section 5.1.2 is as follows:
First, we use the initial columns generated by Section 5.1.2 as the initial columns of the root node and obtain the following RMP model:
$$\begin{aligned} \min\ & a_1 + a_2 + a_3 + a_4 \\ \text{s.t.:}\ & a_1 + a_2 + a_3 + a_4 \le 5 \\ & a_2 = 1 \\ & a_3 = 1 \\ & a_4 = 1 \\ & a_2 = 1 \\ & a_1 = 1 \\ & a_h \ge 0, \quad h = 1, 2, \ldots, 4 \end{aligned}$$
Solving this RMP model gives an objective value of 4, which means that the upper bound of the Lagrangian relaxation master problem is 4, with $a_1 = 1$, $a_2 = 1$, $a_3 = 1$, $a_4 = 1$. Meanwhile, the dual solution is $\theta = 0$, $\mu_1 = 1$, $\mu_2 = 1$, $\mu_3 = 1$, $\mu_4 = 0$, $\mu_5 = 1$. Substitute the dual solution into the SP model and add conflict constraints based on the maximal cliques found in Section 5.1.2 to obtain the following SP model:
$$\begin{aligned} z_i(\mu) = \min\ & y - (x_1 + x_2 + x_3 + x_5) \\ \text{s.t.:}\ & 2x_1 + x_2 + 2x_3 + 3x_4 + 4x_5 \le 5y \\ & x_1 + 3x_2 + 4x_3 + 2x_4 + 2x_5 \le 5y \\ & x_1 + x_2 + x_3 \le 1 \\ & x_2 + x_4 \le 1 \\ & x_3 + x_5 \le 1 \\ & x_j,\ y \in \{0, 1\}, \quad j = 1, 2, \ldots, 5 \end{aligned}$$
Solving this SP model gives the objective value $z_i(\mu) = -1$; thus, the lower bound $m \cdot z_i(\mu) + \sum_{j \in J} \mu_j = 5 \cdot (-1) + 4$ is $-1$. We obtain the solution $x_1 = 0$, $x_2 = 1$, $x_3 = 0$, $x_4 = 0$, $x_5 = 1$, $y = 1$; that is, the generated new column is $(0, 1, 0, 0, 1, 1)^T$. Since the upper bound is 4 and the lower bound is $-1$, which are not equal, column generation must continue. Add the newly generated column to the RMP model and obtain the following RMP model:
$$\begin{aligned} \min\ & a_1 + a_2 + a_3 + a_4 + a_5 \\ \text{s.t.:}\ & a_1 + a_2 + a_3 + a_4 + a_5 \le 5 \\ & a_2 = 1 \\ & a_3 + a_5 = 1 \\ & a_4 = 1 \\ & a_2 = 1 \\ & a_1 + a_5 = 1 \\ & a_h \ge 0, \quad h = 1, 2, \ldots, 5 \end{aligned}$$
Solving this RMP model gives an objective value of 3, with $a_1 = 0$, $a_2 = 1$, $a_3 = 0$, $a_4 = 1$, $a_5 = 1$. Meanwhile, the dual solution is $\theta = 0$, $\mu_1 = 1$, $\mu_2 = 0$, $\mu_3 = 1$, $\mu_4 = 0$, $\mu_5 = 1$. Substituting the dual solution into the SP model, the objective of the obtained SP model is $z_i(\mu) = \min\ y - (x_1 + x_3 + x_5)$, and the constraints are the same as those of the previous SP model. Solving this SP model gives an objective value of 0, so the lower bound of the Lagrangian relaxation master problem is $5 \cdot 0 + 3 = 3$. At this time, the upper and lower bounds of the Lagrangian relaxation master problem are both equal to 3. Therefore, the optimal value of the Lagrangian relaxation problem is 3, and the algorithm ends.
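The two pricing steps of this example can be checked with a small enumeration-based sketch. It is only a stand-in for illustration: the experiments in this paper solve SP with CPLEX, whereas the hypothetical `price` function below simply enumerates all binary vectors, which is viable only for a handful of tasks. The dual values are those stated in the example, and tasks are indexed from 0.

```python
from itertools import product


def price(mu, demands, capacity, cliques):
    """Solve SP by enumeration: minimise y - sum_j mu_j * x_j.

    Returns (objective value, column), where column = (x_1, ..., x_n, y).
    """
    n = len(demands)
    # The all-zero column is always feasible and has objective value 0.
    best_value, best_column = 0, (0,) * (n + 1)
    for x in product((0, 1), repeat=n):
        if not any(x):
            continue
        # Capacity constraints with y = 1.
        if any(sum(d[r] * xj for d, xj in zip(demands, x)) > capacity[r]
               for r in range(len(capacity))):
            continue
        # Clique-based conflict constraints: at most one task per clique.
        if any(sum(x[j] for j in clique) > 1 for clique in cliques):
            continue
        value = 1 - sum(m_j * xj for m_j, xj in zip(mu, x))
        if value < best_value:
            best_value, best_column = value, x + (1,)
    return best_value, best_column


def lagrangian_bound(m, z_i, mu):
    """Lower bound m * z_i(mu) + sum_j mu_j."""
    return m * z_i + sum(mu)


SP_DEMANDS = [(2, 1), (1, 3), (2, 4), (3, 2), (4, 2)]
SP_CAPACITY = (5, 5)
SP_CLIQUES = [(0, 1, 2), (1, 3), (2, 4)]  # cliques {1,2,3}, {2,4}, {3,5}
```

With the first dual vector $(1, 1, 1, 0, 1)$ the sketch returns the column $(0, 1, 0, 0, 1, 1)$ with value $-1$; with the second dual vector $(1, 0, 1, 0, 1)$ no column has a negative value, so the bounds meet at 3.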

5.3. Branch Strategy and Node Selection Strategy

5.3.1. Algorithmic Procedure

The solution obtained by solving the Lagrange relaxation problem through column generation is usually a fractional solution, which is the lower bound of the original integer programming problem. Therefore, we need to use an efficient branching strategy to obtain the optimal integer solution. A reasonable branching strategy should effectively partition the solution space of the problem, ensuring that feasible integer solutions can be found after a limited number of branches. However, the variable-based branching strategy, which is commonly used in traditional branch-and-bound algorithms, is not suitable for branch-and-price algorithms. Therefore, this paper adopts a subproblem-model-based branching strategy proposed in [31].
Assume that a node in the branch-and-price tree has completed the column generation process and generated a fractional solution. The column matrix composed of all columns generated by the Lagrange relaxation problem of this node is as follows:
$$M = (x^h)_{h = 1, 2, \ldots, H} = \begin{pmatrix} x_1^1 & x_1^2 & \cdots & x_1^H \\ x_2^1 & x_2^2 & \cdots & x_2^H \\ \vdots & \vdots & \ddots & \vdots \\ x_n^1 & x_n^2 & \cdots & x_n^H \end{pmatrix}$$
The column matrix $M' = (x^h)_{h = 1, 2, \ldots, H'}$ ($H' \le H$) is a submatrix of $M$ that consists of the columns with $a_h > 0$. Search the column matrix $M'$ to find a row pair $(j, j')$ and a column pair $(h, h')$ that meet the pattern shown in Figure 2. Add the constraint $x_j = x_{j'}$ to the subproblem of the left branch, and use the columns in $M$ that satisfy $x_j = x_{j'}$ as the initial columns of the left branch node of the current node. On the right branch, add the constraint $x_j + x_{j'} \le 1$ to the subproblem, that is, add edge $(j, j')$ to the conflict graph $G$, and select the columns in $M$ that satisfy $x_j + x_{j'} \le 1$ as the initial columns of the right branch node of the current node. According to reference [31], if no pattern as shown in Figure 2 exists, then the solution must be an integer solution. Because the number of rows and columns is finite, the branch-and-price algorithm will terminate after a finite number of branches. As the left branches are easier to solve, this paper adopts a depth-first search node selection strategy.

5.3.2. Example

Assume that, at a noninteger node, the column matrix $M'$ composed of the columns with $a_h > 0$ is as follows:
$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix}$$
We find that row pair $(1, 2)$ and column pair $(1, 2)$ meet the pattern shown in Figure 2. Therefore, add the constraint $x_1 = x_2$ to the subproblem of the left branch and the constraint $x_1 + x_2 \le 1$ to the subproblem of the right branch.
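The pattern search can be sketched in a few lines of Python. Figure 2 is not reproduced here, so the pattern test below is an assumption inferred from the worked examples in this section: a row pair and a column pair match when the corresponding 2×2 submatrix contains exactly three ones, i.e., one column covers both tasks and the other covers exactly one of them. The function name is hypothetical.

```python
from itertools import combinations


def find_patterns(matrix):
    """Return all ((j, j'), (h, h')) matches, using 1-based indices.

    A match is a 2x2 submatrix with exactly three ones (assumed pattern).
    """
    hits = []
    n_rows, n_cols = len(matrix), len(matrix[0])
    for j, k in combinations(range(n_rows), 2):
        for h, g in combinations(range(n_cols), 2):
            ones = (matrix[j][h] + matrix[j][g]
                    + matrix[k][h] + matrix[k][g])
            if ones == 3:
                hits.append(((j + 1, k + 1), (h + 1, g + 1)))
    return hits


# Column matrix of the example above (rows are tasks, columns are patterns).
EXAMPLE_MATRIX = [
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
```

On the example matrix, the first match returned is row pair (1, 2) with column pair (1, 2), the pair used for branching above.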

5.4. Optimization of the Algorithm

To further improve the efficiency of the branch-and-price algorithm, we provide implementation details on branching strategy and the strategies for generating integer solutions.

5.4.1. Branch Strategy Based on Multiple Patterns

(a) Algorithmic Procedure
The branching strategy can be extended to more general situations [31]. Assume that $P$ ($P \ge 1$) row pairs $(j_1^p, j_2^p)$, $p = 1, 2, \ldots, P$, satisfy the pattern shown in Figure 2 at the current node. Then we generate $P + 1$ subnodes for this node, where the following constraints are added for the leftmost branch:
$$x_{j_1^1} = x_{j_2^1}, \quad x_{j_1^2} = x_{j_2^2}, \quad \ldots, \quad x_{j_1^P} = x_{j_2^P}$$
The branching constraints added to the remaining $P$ branches, one per branch, are as follows:
$$x_{j_1^p} + x_{j_2^p} \le 1, \quad p = 1, 2, \ldots, P$$
(b) Example
By searching the column matrix in Section 5.3.2, the following row and column pairs that meet the pattern shown in Figure 2 can be found: row pair $(1, 2)$ and column pair $(1, 2)$, row pair $(1, 4)$ and column pair $(2, 3)$, row pair $(2, 4)$ and column pair $(1, 2)$, row pair $(2, 4)$ and column pair $(2, 3)$, row pair $(3, 4)$ and column pair $(2, 3)$, row pair $(3, 4)$ and column pair $(3, 4)$. We can find three different pairs of rows: $(1, 2)$, $(2, 3)$, and $(3, 4)$. Therefore, four branches are generated for this node, where constraints $x_1 = x_2$, $x_2 = x_3$, $x_3 = x_4$ are added to the first branch, constraint $x_1 + x_2 \le 1$ is added to the second branch, constraint $x_2 + x_3 \le 1$ is added to the third branch, and constraint $x_3 + x_4 \le 1$ is added to the fourth branch.

5.4.2. Hot Start and Upper Bound Strategy for Generating Integer Solutions

Since the feasible domain of a child node is contained within the feasible domain of its parent node, the objective value of the parent node is a lower bound for the child node. Therefore, the objective value $z_{\text{father}}$ of the parent node can be used to initialize the Lagrangian lower bound of the child node. This corresponds to adding the constraint $\sum_{h = 1}^{H} y^h \cdot a_h \ge z_{\text{father}}$ to the master problem of the child nodes. Meanwhile, to reduce the number of iterations within each subnode, we use the value of the best integer solution found so far to replace the parameter $m$ in the constraint $\sum_{h = 1}^{H} a_h \le m$ of the DMP.
On the other hand, a good upper bound can effectively compress the solution space and improve the algorithm's execution speed. Since the columns of the final integer solution must be the columns with $a_h = 1$, the larger the value of $a_h$ at a noninteger node, the more likely the corresponding column is to be included in the optimal solution. Therefore, at each noninteger node, we sort all columns in descending order of $a_h$. Each column corresponds to the task allocation of one VM. We traverse these columns in order and add a column to the set of feasible columns if it covers new tasks and none of the already-covered tasks. After traversing all columns, if a task has not yet been covered, we enable a new VM for it. When all tasks are covered, a feasible solution is generated. If the objective value of this feasible solution is less than the current upper bound, the upper bound is updated to this value.
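The rounding heuristic described above can be sketched as follows. The function name is hypothetical, ties in $a_h$ are broken by column order, and each uncovered task is given its own new VM, which is one simple reading of the last step. The fractional values in the usage data are made up for illustration and are not taken from the experiments.

```python
def rounding_upper_bound(columns, a, n_tasks):
    """Greedy construction of a feasible solution from a fractional one.

    columns -- list of 0/1 task vectors (one per generated column)
    a       -- fractional RMP values a_h, aligned with columns
    Returns (number of VMs used, list of task sets, one per VM).
    """
    # Descending a_h; Python's sort is stable, so ties keep column order.
    order = sorted(range(len(columns)), key=lambda h: a[h], reverse=True)
    covered, chosen = set(), []
    for h in order:
        tasks = {j for j in range(n_tasks) if columns[h][j] == 1}
        # Keep the column only if it covers new tasks and no covered task.
        if tasks and not (tasks & covered):
            chosen.append(tasks)
            covered |= tasks
    # Any task still uncovered gets a new VM of its own.
    for j in range(n_tasks):
        if j not in covered:
            chosen.append({j})
            covered.add(j)
    return len(chosen), chosen


# Columns of the matrix in Section 5.3.2 (0-based tasks), hypothetical a_h.
UB_COLUMNS = [(0, 1, 0, 0), (1, 1, 0, 1), (0, 0, 1, 1), (0, 0, 1, 0)]
UB_VALUES = [0.4, 0.6, 0.5, 0.5]
```

Because every kept column comes from the subproblem, it already satisfies the capacity and conflict constraints, so the constructed solution is feasible whenever each single task fits on an empty VM.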

5.5. Specific Implementation Steps of the Algorithm

Figure 3 shows the flow chart of BPTSC, and the overall process of the algorithm is as follows:
Step 1 Using the initial solution generation algorithm, identify all task conflict maximal cliques, and generate the initial solution and the upper bound to form the root node.
Step 2 Initialize the branch-and-price search tree and add the root node.
Step 3 Select a node by the depth-first strategy and initialize the problem model with the information of this node.
Step 4 Solve the Lagrangian relaxation problem through column generation. If the solution is an integer solution, go to Step 6; otherwise, go to Step 5.
Step 5 Calculate the feasible integer solution generated by this node, and if its value is less than the current upper bound, update the upper bound. Compare the optimal value of the Lagrangian relaxation at this node with the upper bound. If it is less than the upper bound, branch based on the above branching strategy and add the new nodes to the set of unsolved nodes.
Step 6 If the set of unsolved nodes is not empty, return to Step 3. Otherwise, the current upper bound is the optimal integer solution, and the algorithm terminates.

6. Experimental Results and Analysis

The experiments in this paper are run on an Intel(R) Core(TM) i5-4590 CPU @ 3.30 GHz. The branch-and-price algorithm is implemented in C++, and the multidimensional bin packing subproblem with conflict constraints and the RMP model are solved by the CPLEX 12.6.3 solver.

6.1. Experimental Setup

To verify the correctness and effectiveness of BPTSC, we refer to the numerical experiment of the one-dimensional packing problem with conflict constraints in [11] for experimental design. The experimental settings are as follows:
  • The experiment is divided into three scales based on the number of tasks: small-scale (30 tasks), medium-scale (100 tasks), and large-scale (200 tasks).
  • There are a total of 200 cloud computing VMs, each with three hardware resources: CPU, memory, and storage. The resource capacity of each VM is $c = (100, 300, 3000)$.
  • The CPU requirement of each task is drawn from a uniform distribution on [1, 64], the memory requirement from a uniform distribution on [8, 128], and the storage requirement from a uniform distribution on [100, 1000].
  • Generate a conflict graph $G$ according to the settings in [11], and add conflict relationships between tasks based on the conflict density $\delta$ ($\delta = 0.1, 0.2, \ldots, 0.9$). Specifically, assign to each task $j$ a feature value $q_j$ drawn from a uniform distribution on [0, 1]. If tasks $j$ and $j'$ satisfy $(q_j + q_{j'})/2 \le \delta$, add a conflict relationship between tasks $j$ and $j'$.
  • Each scale is further divided into nine groups by increasing conflict density $\delta$. To make the results representative, each group includes 20 instances, and the reported results are averages over these 20 instances. The total number of instances is 540.
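The settings above can be turned into a small instance generator. The following Python sketch is an illustration only, not the authors' generator: the names `generate_instance`, `VM_CAPACITY`, and `DEMAND_RANGES` are hypothetical, resource requirements are assumed to be integers, and the conflict test is assumed to be $(q_j + q_{j'})/2 \le \delta$.

```python
import random
from itertools import combinations

VM_CAPACITY = (100, 300, 3000)                       # (CPU, memory, storage) per VM
DEMAND_RANGES = ((1, 64), (8, 128), (100, 1000))     # per-resource demand intervals


def generate_instance(num_tasks, delta, seed=0):
    """Return (demands, conflicts) for one random instance.

    demands[j] is the (CPU, memory, storage) requirement of task j.
    conflicts is a set of pairs (i, j) with i < j that must not share a VM.
    """
    rng = random.Random(seed)
    # Assumption: requirements are integers drawn uniformly from each interval.
    demands = [
        tuple(rng.randint(lo, hi) for lo, hi in DEMAND_RANGES)
        for _ in range(num_tasks)
    ]
    # Feature value q_j in [0, 1) for every task.
    q = [rng.random() for _ in range(num_tasks)]
    # Assumption: tasks i and j conflict when the mean of their features is at most delta.
    conflicts = {
        (i, j)
        for i, j in combinations(range(num_tasks), 2)
        if (q[i] + q[j]) / 2 <= delta
    }
    return demands, conflicts


demands, conflicts = generate_instance(num_tasks=30, delta=0.5, seed=1)
print(len(demands), len(conflicts))
```

Looping this generator over the three scales, the nine density values, and 20 seeds per group would give the 540 instances described above.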

6.2. Experimental Results

To evaluate the efficiency of the BPTSC algorithm proposed in this paper, Table 2 summarizes its evaluation results. As the number of tasks increases, the difficulty of the problem increases sharply. As the problem scale increases from 30 to 200 tasks, the average number of nodes increases by approximately 15 times, the average number of columns increases by more than 8 times, and the average execution time increases by about 25 times.
Notably, the model is the most difficult to solve when the conflict density is between 10% and 30%. In this case, the key constraint of the model is the multidimensional resource capacity constraint. This makes the solution space of the search tree very large, requiring the generation of a large number of branch nodes and columns. When the conflict density is between 70% and 90%, the model is the easiest to solve. In this case, the key constraint is the task conflict constraint; thus, the solution space of the model is small, and the optimal solution is obtained after generating only a small number of nodes and columns.
As expected, the value of the optimal solution increases as the density of the conflict graph increases. When the conflict density is 10%, the value of the optimal solution is usually quite similar to that of the case without conflict constraints. However, for high-density instances (90%), about 90% of the tasks are scheduled to different VMs.
As shown in the fourth and fifth columns of Table 2, most feasible columns are generated at the root node. On average, the columns generated at the root node account for more than 2/3 of the total number of columns, and as a result the root node accounts for roughly half or more of the total execution time. From the sixth and seventh columns of Table 2, the Lagrange lower bound obtained at the root node is very close to the optimal solution, reaching over 98% of it on average.
In addition, we conduct ultra-large-scale (1000 tasks) experiments. The experimental results show that when the conflict density is high (70–90%), the execution time of BPTSC is still very short. In other cases, the execution time usually exceeds our time limit (1 h).
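The growth factors and shares quoted in this subsection can be rechecked from the "Average" rows of Table 2. The short Python sketch below does this arithmetic; the dictionary layout and the helper names `growth` and `share` are hypothetical, while the numbers are copied from the table.

```python
# "Average" rows of Table 2, keyed by the number of tasks.
AVERAGES = {
    30: {"nodes": 21.43, "root_cols": 56.29, "total_cols": 66.83,
         "root_value": 17.23, "optimal": 17.45,
         "root_time": 4.89, "total_time": 8.14},
    100: {"nodes": 124.00, "root_cols": 207.76, "total_cols": 306.48,
          "root_value": 55.54, "optimal": 55.88,
          "root_time": 40.02, "total_time": 72.95},
    200: {"nodes": 340.07, "root_cols": 432.53, "total_cols": 641.54,
          "root_value": 111.31, "optimal": 111.53,
          "root_time": 105.38, "total_time": 214.60},
}


def growth(metric, small=30, large=200):
    """Factor by which an averaged metric grows from the small to the large scale."""
    return AVERAGES[large][metric] / AVERAGES[small][metric]


def share(numerator, denominator):
    """Per-scale ratio of two averaged metrics, e.g. root-node columns over total columns."""
    return {n: row[numerator] / row[denominator] for n, row in AVERAGES.items()}


for metric in ("nodes", "total_cols", "total_time"):
    print(f"{metric}: x{growth(metric):.1f} from 30 to 200 tasks")
for num, den in (("root_cols", "total_cols"),
                 ("root_time", "total_time"),
                 ("root_value", "optimal")):
    ratios = ", ".join(f"{n}: {r:.1%}" for n, r in share(num, den).items())
    print(f"{num}/{den} -> {ratios}")
```

Because these are ratios of group averages, they describe each scale as a whole; individual density groups can deviate from them.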
BPTSC is an exact algorithm, whereas the heuristic algorithms commonly used as benchmarks are approximate and cannot guarantee an optimal solution. Thus, we use CPLEX as the benchmark for comparing efficiency. CPLEX directly solves the original model without using the Lagrange relaxation principle or the Dantzig–Wolfe decomposition principle. Meanwhile, to verify the effectiveness of the initial solution generation strategy of BPTSC, Figure 4a–c show the number of nodes, number of columns, and execution time, respectively, of BPTSC and of the branch-and-price algorithm that uses the descending first fit strategy, denoted FFBP in Figure 4. Figure 4c also compares the execution time of the two branch-and-price algorithms with that of CPLEX.
From Figure 4a,b, the initial solution generation strategy of BPTSC outperforms FFBP in terms of the number of nodes and the number of columns, especially when the number of tasks is large. From Figure 4c, when the number of tasks is small, all three algorithms run efficiently. For medium-scale instances, BPTSC is superior to FFBP in finding the initial solution, and both have much shorter execution times than solving the model directly with CPLEX. For large-scale instances, since most of the instances solved by CPLEX exceed our time limit (1 h), only the two branch-and-price algorithms are compared. The result is similar to the medium-scale case: BPTSC is superior to FFBP, which verifies the effectiveness of the initial solution generation strategy of BPTSC.
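For readers unfamiliar with the baseline, a descending first fit heuristic for multidimensional packing with conflicts can be sketched as follows. This is a minimal illustration, not the FFBP code used in the experiments: the function name `first_fit_decreasing` is hypothetical, and sorting tasks by their largest capacity-normalized demand is an assumption, since the ordering rule is not specified here.

```python
def first_fit_decreasing(demands, conflicts, capacity):
    """Assign tasks to VMs; return a list of task-index lists, one per opened VM."""
    # Assumption: tasks are ordered by their largest demand relative to capacity.
    def dominant_share(demand):
        return max(d / c for d, c in zip(demand, capacity))

    order = sorted(range(len(demands)),
                   key=lambda j: dominant_share(demands[j]),
                   reverse=True)
    conflict_pairs = {frozenset(pair) for pair in conflicts}
    vms = []  # each VM: {"tasks": [...], "load": [...]}

    for j in order:
        for vm in vms:
            fits = all(load + d <= c
                       for load, d, c in zip(vm["load"], demands[j], capacity))
            clashes = any(frozenset((j, k)) in conflict_pairs for k in vm["tasks"])
            if fits and not clashes:
                vm["tasks"].append(j)
                vm["load"] = [load + d for load, d in zip(vm["load"], demands[j])]
                break
        else:
            # No open VM can take the task, so open a new one.
            vms.append({"tasks": [j], "load": list(demands[j])})
    return [vm["tasks"] for vm in vms]


capacity = (100, 300, 3000)
demands = [(60, 100, 1000), (50, 100, 1000), (40, 100, 1000), (30, 50, 500)]
print(first_fit_decreasing(demands, {(2, 3)}, capacity))
print(first_fit_decreasing(demands, {(1, 3)}, capacity))
```

The number of VMs opened by such a heuristic is a valid upper bound for the root node, which is the role the initial solution plays in both FFBP and BPTSC.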
Figure 5 shows the results of using three different branch modes (1, 5, and 10) for different task scales; the corresponding algorithms are denoted BPTSC-1, BPTSC-5, and BPTSC-10. Although the total number of nodes may increase in the multibranch modes (BPTSC-5 and BPTSC-10), the number of generated feasible columns decreases, resulting in a significant reduction in the execution time of the algorithm. The algorithm is most efficient when the branch mode is 5, which verifies the effectiveness of the multipattern-based branching strategy proposed in this paper.

7. Conclusions

This paper investigates the task scheduling problem in cloud computing systems with the goal of reducing energy consumption. Because current research does not consider antiaffinity constraints between tasks, we formulate the problem as a multidimensional bin packing model with conflict constraints. The model is then decomposed by the Lagrange relaxation principle and the Dantzig–Wolfe decomposition principle. Moreover, because exact algorithms with theoretical support are lacking for the task scheduling problem in cloud computing, we design a branch-and-price algorithm and improve it in several ways. Finally, multiple sets of experiments verify that the proposed algorithm performs well in terms of execution time and efficiency. In future work, we will extend the task scheduling problem by adding time as a new type of resource, so that tasks can specify the start and end times for using VMs. In addition, to better solve ultra-large-scale problems in practice, we will study approximation algorithms for the problem proposed in this paper.

Author Contributions

Conceptualization, N.X., W.L., J.Z. and X.Z.; methodology, W.L.; software, N.X. and J.Z.; validation, W.L., J.Z. and X.Z.; formal analysis, W.L. and X.Z.; investigation, N.X.; resources, J.Z.; data curation, J.Z.; writing—original draft preparation, N.X.; writing—review and editing, W.L., J.Z. and X.Z.; visualization, N.X.; supervision, X.Z.; project administration, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Nos. 62062065, 12071417, 61962061), the Education Foundation of Yunnan Province of China (2022J002) and the Program for Excellent Young Talents, Yunnan, China.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bharany, S.; Sharma, S.; Khalaf, O.I.; Abdulsahib, G.M.; Al Humaimeedy, A.S.; Aldhyani, T.H.; Maashi, M.; Alkahtani, H. A systematic survey on energy-efficient techniques in sustainable cloud computing. Sustainability 2022, 14, 6256.
  2. Garg, N.; Singh, D.; Goraya, M.S. Energy and resource efficient workflow scheduling in a virtualized cloud environment. Clust. Comput. 2021, 24, 767–797.
  3. Zade, B.M.H.; Mansouri, N.; Javidi, M.M. SAEA: A security-aware and energy-aware task scheduling strategy by Parallel Squirrel Search Algorithm in cloud environment. Expert Syst. Appl. 2021, 176, 114915.
  4. Houssein, E.H.; Gad, A.G.; Wazery, Y.M.; Suganthan, P.N. Task scheduling in cloud computing based on meta-heuristics: Review, taxonomy, open challenges, and future trends. Swarm Evol. Comput. 2021, 62, 100841.
  5. Hussain, M.; Wei, L.F.; Lakhan, A.; Wali, S.; Ali, S.; Hussain, A. Energy and performance-efficient task scheduling in heterogeneous virtualized cloud computing. Sustain. Comput. Inform. Syst. 2021, 30, 100517.
  6. Ghafari, R.; Kabutarkhani, F.H.; Mansouri, N. Task scheduling algorithms for energy optimization in cloud environment: A comprehensive review. Clust. Comput. 2022, 25, 1035–1093.
  7. Kubernetes. Available online: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/ (accessed on 7 June 2023).
  8. OpenStack. Available online: https://specs.openstack.org/openstack/cinder-specs/specs/juno/affinity-antiaffinity-filter.html (accessed on 7 June 2023).
  9. Chi, R.; Qian, Z.; Lu, S. Be a good neighbour: Characterizing performance interference of virtual machines under xen virtualization environments. In Proceedings of the 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), Hsinchu, Taiwan, 16–19 December 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 257–264.
  10. Jansen, K. An approximation scheme for bin packing with conflicts. J. Comb. Optim. 1999, 3, 363–377.
  11. Christman, A.; Chung, C.; Jaczko, N.; Westvold, S.; Yuen, D.S. Robustly assigning unstable items. J. Comb. Optim. 2022, 44, 1556–1577.
  12. Beloglazov, A.; Abawajy, J.; Buyya, R. Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Gener. Comput. Syst. 2012, 28, 755–768.
  13. Garg, N.; Singh, D.; Singh Goraya, M. Deadline aware energy-efficient task scheduling model for a virtualized server. SN Comput. Sci. 2021, 2, 169.
  14. Mekala, M.; Viswanathan, P. CTRV: Resource based task consolidation approach in cloud for green computing. Distrib. Parallel Databases 2023, 41, 157.
  15. Zhang, X.; Liu, X.; Li, W.; Zhang, X. Trade-off between energy consumption and makespan in the mapreduce resource allocation problem. In Proceedings of the Artificial Intelligence and Security: 5th International Conference, New York, NY, USA, 26–28 July 2019; Springer: Cham, Switzerland, 2019; pp. 239–250.
  16. Liang, B.; Wu, D.; Wu, P.; Su, Y. An energy-aware resource deployment algorithm for cloud data centers based on dynamic hybrid machine learning. Knowl.-Based Syst. 2021, 222, 107020.
  17. Gendreau, M.; Laporte, G.; Semet, F. Heuristics and lower bounds for the bin packing problem with conflicts. Comput. Oper. Res. 2004, 31, 347–358.
  18. Capua, R.; Frota, Y.; Ochi, L.S.; Vidal, T. A study on exponential-size neighborhoods for the bin packing problem with conflicts. J. Heuristics 2018, 24, 667–695.
  19. Elhedhli, S.; Li, L.; Gzara, M.; Naoum-Sawaya, J. A branch-and-price algorithm for the bin packing problem with conflicts. INFORMS J. Comput. 2011, 23, 404–415.
  20. Wei, L.; Luo, Z.; Baldacci, R.; Lim, A. A new branch-and-price-and-cut algorithm for one-dimensional bin-packing problems. INFORMS J. Comput. 2020, 32, 428–443.
  21. Ekici, A. Bin packing problem with conflicts and item fragmentation. Comput. Oper. Res. 2021, 126, 105113.
  22. Garey, M.R.; Johnson, D.S. Computers and Intractability; Freeman: San Francisco, CA, USA, 1979; Volume 174.
  23. Geoffrion, A.M. Lagrangean relaxation for integer programming. In Approaches to Integer Programming; Springer: Berlin/Heidelberg, Germany, 2009; pp. 82–114.
  24. Desaulniers, G.; Desrosiers, J.; Solomon, M.M. Column generation; Springer Science & Business Media: New York, NY, USA, 2006; Volume 5.
  25. Tomita, E.; Tanaka, A.; Takahashi, H. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 2006, 363, 28–42.
  26. Liu, X.; Zhang, X.; Li, W.; Zhang, X. Swarm optimization algorithms applied to multi-resource fair allocation in heterogeneous cloud computing systems. Computing 2017, 99, 1231–1255.
  27. Zhang, J.; Yang, X.; Xie, N.; Zhang, X.; Vasilakos, A.V.; Li, W. An online auction mechanism for time-varying multidimensional resource allocation in clouds. Future Gener. Comput. Syst. 2020, 111, 27–38.
  28. Zhang, J.; Xie, N.; Yang, X.; Zhang, X.; Li, W. Strategy-proof mechanism for time-varying batch virtual machine allocation in clouds. Clust. Comput. 2021, 24, 3709–3724.
  29. Zhang, J.; Chi, L.; Xie, N.; Yang, X.; Zhang, X.; Li, W. Strategy-proof mechanism for online resource allocation in cloud and edge collaboration. Computing 2022, 104, 383–412.
  30. Elhedhli, S. Ranking lower bounds for the bin-packing problem. Eur. J. Oper. Res. 2005, 160, 34–46.
  31. Ryan, D.M.; Foster, B.A. An integer programming approach to scheduling. In Computer Scheduling of Public Transport Urban Passenger Vehicle and Crew Scheduling; Elsevier: Amsterdam, The Netherlands, 1981; pp. 269–280.
Figure 1. Relationship graph between tasks. (a) Conflict graph. (b) Compatibility graph.
Figure 2. Pattern for generating fractional solutions.
Figure 3. The flow chart of BPTSC.
Figure 4. Performance comparison with FFBP. (a) Number of nodes. (b) Number of columns. (c) Execution time.
Figure 5. Performance comparison of three different branch patterns. (a) Number of nodes. (b) Number of columns. (c) Execution time.
Table 1. Task resource requirements.

| Task Number | CPU | Memory |
|---|---|---|
| 1 | 2 | 1 |
| 2 | 1 | 3 |
| 3 | 2 | 4 |
| 4 | 3 | 2 |
| 5 | 4 | 2 |
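Table 2. Detailed results of BPTSC.

| Number of Tasks | Conflict Density δ | Nodes | Root Node Columns | Total Columns | Root Node Value | Optimal Solution | Root Node Time (s) | Total Time (s) |
|---|---|---|---|---|---|---|---|---|
| 30 | 0.1 | 48.40 | 76.75 | 107.45 | 10.88 | 11.30 | 12.25 | 21.23 |
| | 0.2 | 31.05 | 82.55 | 98.20 | 10.86 | 11.40 | 10.53 | 14.23 |
| | 0.3 | 29.85 | 72.20 | 87.85 | 11.52 | 11.65 | 7.66 | 12.96 |
| | 0.4 | 28.65 | 66.05 | 77.90 | 12.95 | 13.20 | 5.98 | 10.55 |
| | 0.5 | 20.80 | 54.45 | 65.70 | 15.72 | 15.95 | 3.58 | 6.11 |
| | 0.6 | 16.65 | 47.40 | 52.15 | 18.48 | 18.60 | 2.38 | 5.05 |
| | 0.7 | 5.85 | 40.05 | 41.90 | 21.99 | 22.15 | 1.02 | 1.73 |
| | 0.8 | 8.80 | 34.70 | 36.85 | 25.30 | 25.40 | 0.38 | 0.74 |
| | 0.9 | 2.80 | 32.45 | 33.50 | 27.40 | 27.40 | 0.24 | 0.64 |
| | Average | 21.43 | 56.29 | 66.83 | 17.23 | 17.45 | 4.89 | 8.14 |
| 100 | 0.1 | 276.35 | 261.10 | 474.25 | 35.95 | 36.50 | 63.27 | 159.52 |
| | 0.2 | 269.20 | 253.55 | 465.35 | 36.25 | 36.65 | 69.56 | 147.46 |
| | 0.3 | 133.45 | 269.65 | 470.55 | 37.51 | 38.05 | 75.33 | 143.70 |
| | 0.4 | 115.30 | 287.35 | 359.60 | 39.88 | 40.15 | 65.52 | 79.21 |
| | 0.5 | 107.75 | 205.50 | 284.65 | 48.95 | 49.05 | 35.78 | 53.06 |
| | 0.6 | 111.15 | 181.50 | 245.75 | 58.83 | 59.20 | 24.78 | 40.02 |
| | 0.7 | 68.05 | 162.85 | 195.65 | 70.88 | 71.10 | 16.51 | 22.10 |
| | 0.8 | 25.55 | 139.30 | 150.40 | 80.24 | 80.65 | 8.34 | 9.95 |
| | 0.9 | 9.20 | 109.00 | 112.10 | 91.38 | 91.60 | 1.09 | 1.49 |
| | Average | 124.00 | 207.76 | 306.48 | 55.54 | 55.88 | 40.02 | 72.95 |
| 200 | 0.1 | 637.70 | 605.30 | 962.65 | 72.32 | 72.35 | 157.71 | 447.21 |
| | 0.2 | 565.80 | 539.45 | 934.20 | 72.60 | 73.00 | 175.76 | 385.43 |
| | 0.3 | 367.05 | 566.30 | 903.30 | 73.25 | 73.65 | 185.33 | 321.84 |
| | 0.4 | 364.60 | 511.25 | 742.20 | 79.42 | 79.75 | 154.04 | 256.64 |
| | 0.5 | 349.95 | 431.80 | 621.30 | 101.98 | 102.20 | 106.64 | 196.77 |
| | 0.6 | 335.05 | 381.65 | 532.45 | 121.33 | 121.35 | 79.60 | 152.03 |
| | 0.7 | 295.10 | 353.75 | 506.25 | 135.41 | 135.50 | 59.47 | 125.47 |
| | 0.8 | 106.85 | 277.95 | 329.20 | 162.02 | 162.25 | 24.52 | 37.17 |
| | 0.9 | 38.50 | 225.35 | 242.30 | 183.43 | 183.75 | 5.34 | 8.826 |
| | Average | 340.07 | 432.53 | 641.54 | 111.31 | 111.53 | 105.38 | 214.60 |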
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xie, N.; Li, W.; Zhang, J.; Zhang, X. Research on Cloud Task Scheduling Algorithm with Conflict Constraints Based on Branch-and-Price. Appl. Sci. 2023, 13, 7505. https://doi.org/10.3390/app13137505
