Research on Resource Allocation of Autonomous Swarm Robots Based on Game Theory

He, Zixiang; Sun, Yi; Feng, Zhongyuan

doi:10.3390/electronics12204370

by Zixiang He *, Yi Sun and Zhongyuan Feng

School of Communication and Information Engineering, Xi’an University of Science and Technology, Xi’an 710054, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(20), 4370; https://doi.org/10.3390/electronics12204370

Submission received: 13 September 2023 / Revised: 15 October 2023 / Accepted: 20 October 2023 / Published: 22 October 2023

(This article belongs to the Special Issue Advanced Technologies in Autonomous Robotic System)

Abstract:

To address resource allocation optimization for autonomous swarm robots in emergency situations, this paper abstracts the problem as a two-stage extensive-form game. The participants are categorized as either resource-providing robots or resource-consuming robots. The strategies of the resource-providing robots are resource production and pricing, whereas the strategy of the resource-consuming robots is the quantity to purchase given the resource price. In the first stage of the game, the resource-providing robots play a Cournot game to determine resource production according to market supply and demand conditions. In the second stage, the resource-providing robots and the resource-consuming robots play a price game, in which utility functions are established for the swarm robots to seek the optimal pricing and purchasing strategies. Through mathematical derivation, this paper demonstrates that a unique Nash equilibrium exists in the constructed game. The optimal strategies of the game are then solved using backward induction and a distributed iterative search algorithm. Finally, simulations verify the validity of the game model. This study concludes that the designed game mechanism enables both sides to reach equilibrium and achieve optimal resource allocation.

Keywords:

autonomous swarm robots; game theory; resource allocation; Cournot game; price game; Nash equilibrium

1. Introduction

Sudden accident scenarios can be complex and varied, and the scene after a disaster may differ greatly from what was known about it beforehand. The overall structure and local details are unknown, and the environmental state continues to evolve. This presents serious challenges for initial rescue work, as direct access to the scene poses the risk of secondary disasters. To reduce the risk of injury or death to rescuers, using robots instead of personnel to conduct preliminary information detection has become an inevitable choice [1,2].

In high-risk pre-rescue operations, robots are used in lieu of human personnel to access disaster sites independently. These robots carry critical equipment such as communication devices and power sources to establish rescue infrastructure efficiently and provide the necessary communication and computing services. This significantly improves the preparedness of rescue teams and their overall efficiency. Early rescue robots often took the form of a single robot [3], but such traditional robots have significant limitations. Firstly, a single robot has limited operational coverage. In complex emergency rescue sites, a single robot might not be able to perform its full role and therefore cannot meet the demand for wide-area operations. Furthermore, the scalability of a single-robot system is poor. Because many functional modules are integrated into one unit, the structure of a single-robot system is complex and lacks fault tolerance. Consequently, the failure of a key component may cause the entire system to crash, which is particularly critical in emergency response environments. Single robots are also less able to adapt to complex dynamic environments, which limits their environmental resilience.

To resolve these dilemmas, swarm robots can be used instead of single robots. The application of swarm robots is typically regarded as more beneficial than relying solely on individual robots [4,5]. Swarm robots [6], comprising multiple individual robots with varying structures, functions, and resources, can achieve broader task coverage and coordinate more intricate operations. They can expand their operational scope and reduce functional complexity through collaboration and specialization, and they can be extended by incorporating new individual robots, reflecting a high degree of flexibility, scalability, and stability [7]. Autonomous swarm robots can surpass the limitations of single robots because they are easy to scale up and, more importantly, because collaboration between swarm members produces a synergistic effect that completes complex tasks that are difficult for a single robot to achieve [8].

Swarm robots show great potential in rescue missions, and one of their core technical problems is how to allocate and schedule limited resources, such as computational and energy resources, efficiently among swarm members. Improper resource allocation directly reduces the efficiency of the swarm and may even cause mission failure. From the control perspective, resource allocation methods are divided into two categories: centralized and decentralized [9]. In a centralized scheme, a single central node plans the resource allocation strategy based on the overall situation. This approach has the advantages of high allocation efficiency and easy access to the global optimum [10,11]. Specifically, the central node collects information from all nodes in the system and calculates the optimal allocation of resources across the entire network. With a holistic view of available resources and demands, it can efficiently determine the strategy that maximizes resource usage and meets needs. However, pre-rescue emergency scenarios are often dynamic, and the central robot faces challenges in obtaining global information in such environments. In addition, the central robot is a single point of failure: if it fails or lacks complete situational awareness, the entire system is affected [12]. The decentralized scheme does not rely on a fixed central node, thus eliminating single-point dependence. Instead, each robot makes autonomous decisions based on its own environmental sensors and on information shared by other robots in the swarm, while also factoring in its individual priorities and workload. This distributed decision-making model enhances the robustness and stability of the swarm system, since the failure of any individual node will not cause the whole system to fail. Without centralized control, a decentralized architecture is better suited to the unpredictable and time-sensitive needs of emergency response operations in complex and dynamically changing environments.

Game theory studies the rational choices made by multiple decision-makers in competitive or cooperative situations [13]. It can model the strategic interactions between individuals and find Nash equilibrium solutions, and it has been widely used to solve resource optimization problems [14,15]. In this paper, the resource allocation problem of swarm robots is abstracted as a two-stage autonomous swarm robot resource allocation game model that contains both a Cournot game and a price game. The autonomous swarm robots in the model are divided into two categories: resource-providing robots, which supply resources, and resource-consuming robots, which demand resources. This division allows the game model to analyze the dynamic resource interactions between robots that need to share limited resources. The resource-providing robot is the producer in the game model and provides computational resources for the resource-consuming robots; the resource-consuming robot is the consumer and purchases and consumes computational resources. In the first stage of the game, the resource-providing robots play the Cournot game, in which they simultaneously determine their respective production of computing resources according to market demand, each maximizing its own revenue, to arrive at the optimal production level. In the second stage, the resource-providing and resource-consuming robots play a price game to allocate resources based on the production set in the first stage. In addition, the resource-providing robots share local resource demand information among themselves to realize synergy on the supply side. Throughout the game, the pricing objective of the provider and the purchasing objective of the consumer are both to maximize revenue, and each robot makes rational decisions in accordance with this objective [16]. This scheme adapts flexibly to dynamic changes in the emergency environment and optimizes resource allocation.

In this paper, the Cournot game and the price game are combined to form a two-stage combination game model. The specific contributions are as follows. Firstly, this paper proposes a two-stage game model to solve the problem of optimal resource allocation of an autonomous swarm robotic system in emergency scenarios: in the first stage, the Cournot game is used to determine the resource production, and in the second stage, the price game is played to solve the optimal pricing and purchasing strategies. Secondly, this paper proves the existence of the Nash equilibrium and solves the game model using the Lagrangian method and backward induction. Thirdly, this paper designs a distributed iterative search algorithm that initializes the participants’ strategies and makes the game converge to equilibrium through iterative updating, and it describes the algorithm flow. Finally, this paper carries out simulation verification, comparing the convergence of the algorithm under different initial conditions and the dynamic changes in participants’ strategies and utilities. The results verify the effectiveness and robustness of the game model and algorithm.

2. Related Works

The game-theoretic approach in economics has emerged as an effective modeling and analytical tool in the problem of resource allocation in swarm robotics systems. Many researchers have produced significant pieces of work in this area, which are categorized into the following three main areas:

2.1. Research on Output Game between Resource Providers

The resource allocation problem in swarm robotics systems can usually be classified as a decision problem between resource providers or between resource providers and consumers. In the field of economics, researchers often use the Cournot game to describe the decision problem between resource providers regarding the output.

Smirnov, A., Levashova, T., Pashkin, M. et al. [17] abstracted the competition between firms producing homogeneous goods in a network environment as a Cournot game and explored how to allocate the output to the relevant market. Von Neumann, J. and Morgenstern, O. [18] investigated the power allocation problem in a non-orthogonal multiple access wireless uplink IoT environment, modeling it as a Cournot game and taking into account factors such as the popularity of uploaded content, power consumption constraints, and energy costs. By deriving the concavity of the user device utility function, they prove the existence and uniqueness of the Nash equilibrium solution. The results show that their proposed scheme outperforms the traditional orthogonal multiple access and fixed power selection strategies in terms of system throughput and utility performance.

On the other hand, Nash, J. [19] modeled the allocation of power resources between a generating company operating in islanding mode and generation users as a Cournot game and calculated the optimal power generation ratio using Nash equilibrium points. The researchers devised a distributed cooperative control algorithm based on consensus theory to ensure that each generating device generates electricity in accordance with the overall consumption load in order to obtain the greatest benefit and stable operation of the microgrid. The optimal generation ratio allocation based on the Nash equilibrium point eliminates the steady-state frequency deviation of each device in the microgrid, thereby guaranteeing the users high-quality electricity. The study’s findings from simulation experiments demonstrate that the control algorithm can fulfill the study’s objectives. The above studies aimed to solve the resource allocation problem through the implementation of the Cournot game model. However, limited research has specifically examined the pricing challenge within the context of resource allocation among swarm robotics systems.

The Cournot game focuses on output decisions between firms in a market, where each firm determines its optimal output based on the output of its competitors in order to maximize its profit or utility. The pricing problem is different: instead of responding to its competitors’ output, a firm determines its optimal pricing strategy based on its competitors’ pricing. Thus, although the above literature has successfully solved the problem of resource allocation through the Cournot game, it has not specifically studied the pricing problem.

2.2. Research on the Price Game between Resource Providers and Consumers

For resource allocation and pricing problems between resource providers and consumers, scholars typically utilize the price game research method. The price game model investigates how both parties can form optimal strategies in a competitive environment to maximize their economic benefits, ultimately reaching the Nash equilibrium state.

In the work of Venkateswarararao et al. [20], researchers constructed a price game model to solve the UAV edge computing resource allocation problem. According to the network environment and resources, the UAV base station first decides the price of the bandwidth to maximize its own utility, and then the mobile base station determines the amount of bandwidth to be purchased according to the pricing of the UAV base station to satisfy the quality of service requirements of the user devices. The appropriate resource pricing and computational offloading allocation strategies are obtained through an iterative process. Simulations verify the feasibility of the algorithm.

Xiao, Y., Peng, Y., Lu, Q. et al. [21] compared and analyzed three dynamic pricing mechanisms (proportional allocation mechanism, uniform pricing mechanism, and differentiated pricing mechanism seeking fairness) for edge computing resource allocation in IoT environments, and proposed a BID-PRAM model that can overcome the limitations of auction-based pricing schemes. In addition, they described UNI-PIM and FAID-PRIM as price game models between a single leader and multiple followers. The researchers analyzed the numerical results, identified the strengths and weaknesses of each model, and developed guidelines for multiple pricing scenarios for edge computing service providers.

In the work of Adamson, G. et al. [22], researchers proposed a dynamic price game multimode spectrum-sharing solution based on 5G-VANET. The solution considers the cellular base station’s revenue and network-wide throughput and develops corresponding access pricing strategies based on different spectrum-sharing modes. By dynamically changing the selection through evolutionary gaming, the simulation results show that the algorithm can significantly improve the total transmission rate of VANET and provide an efficient spectrum-sharing mechanism compared to the random selection method.

The work of Bimpikis, K. et al. [23] focused on pricing and resource management in IoT systems for mobile edge computing. They modeled resource management and pricing in the system to create a stochastic price game model. The system consists of a set of ground parties and an airborne base station for forwarding tasks from peers to a blockchain server. The stability of the proposed model is demonstrated through simulation results and it is established that the decisions of the ground parties are the best response to the optimal strategies of the base stations.

The above literature has used price game models to explore pricing strategies between resource providers and consumers to solve the resource allocation problem. However, these studies do not take into account the competitive relationship between resource providers, which makes it difficult to determine a reasonable total amount of resources, and the modeling is only conducted from the pricing perspective, with insufficient discussion of the resource production problem.

2.3. Research on the Combination Game

The problem of resource allocation often involves competition between resource providers and also includes the interaction between resource providers and resource consumers. When these two situations co-exist, a singular game model is inadequate for analyzing and resolving the problem. Combination games offer a practical solution to this challenge. The theoretical framework related to combination games aids in analyzing decision making and behavior among multiple players and highlighting the interactions and influences among them. In the literature, various dynamic combination game models have been proposed to address resource allocation in diverse fields.

He, S. and Wang, W. [24] used a dynamic combination game model to solve the problem of caching resource allocation for a leader and multiple followers. In the first stage of the game, the Stackelberg game model is used, in which multiple mobile network operators are considered as the leader, which adjusts its pricing strategy according to the changes in the decision making of the followers to maximize its revenue. Subsequently, in the second stage of the Cournot game, the followers compete among themselves for the amount of space available for the mobile network operator’s small base station caches in order to improve the quality of service for the users and maximize their own revenue. The study verifies the existence of Nash equilibrium through numerical analysis and identifies the optimal strategies of the leader and the followers and their corresponding optimal prices.

The work by Zhou et al. [25] explored the renewable energy allocation problem and designed a non-cooperative Stackelberg–Cournot game model. The model represents the relationship between the dominant player in renewable energy generating capacity and local renewable energy and storage investors as a two-layer Stackelberg–Cournot game. The dominant player uses market power to determine the allocation of renewable energy, while the competitive behavior among followers is abstracted as a Cournot subgame. The feasibility of the model is verified by simulation results.

In the work of Nguyen et al. [26], in the field of prefabricated building supply chains, the researchers proposed a combination game model that nests the Cournot model within the Stackelberg model. The model solves a decision-making problem between an upstream component manufacturing company and two downstream contractors. The model takes into account the impact of the contractors’ self-manufacturing or outsourcing decisions based on the profitability of the supply chain firms and the supply chain as a whole and achieves an equilibrium solution for output, price, and profitability by accounting for the payoff functions of the different firms. The results of the study show the feasibility of the solution.

Baek, B., Lee, J., Peng, Y. et al. [27] proposed a two-stage game model based on Cournot’s game and Stackelberg’s game for solving the decision-making problem of container shipping companies in the emerging blockchain market. The model considers the decision making of container shipping companies to enter the blockchain market and utilizes the Cournot game and Stackelberg game to decide the freight rate strategy; it also introduces the spread factor to reflect the impact of the initial strategy on the blockchain strategy. The validity of the model is verified by simulation results.

In summary, previous research has made some progress in studying resource allocation and pricing issues using a price game model, from an economic perspective. However, there are certain limitations when only considering price competition. This fails to capture the competitive relationships among multiple resource-providing robots in emergency scenarios and struggles to determine the appropriate total quantity of resources. To address these limitations, this study builds upon the existing price game model by incorporating the analysis of the Cournot game model.

3. System Model

Establishing an efficient rescue support system is crucial for effectively conducting subsequent rescue operations in emergency scenarios. In this section, we first employ Cournot game analysis to examine the competition between resource-providing robots in terms of output, determining the total quantity of resources and their initial pricing. Compared to using the Stackelberg game model or the Bertrand game model, the Cournot game model provides a more accurate description of the interaction between resource-providing robots in terms of output competition. By directly modeling the output competition, the Cournot game model enables improved coordination of total output and better alignment with market demand. In contrast, the Bertrand game model assumes price competition rather than output competition, which is not suitable for the scenario under consideration. Moreover, the integration of the Cournot game model offers a deeper insight into the competitive dynamics among the robots and facilitates the determination of optimal resource quantities.

Due to the closed nature of emergency scenarios, with the system isolated from the outside world, limited computing resources are concentrated on resource-providing robots. This underscores the significance of rational resource allocation and pricing within the system. To address this, we proceed to the second phase of the game by constructing a price game model that analyzes the pricing game between resource-providing robots and resource-consuming robots, with the aim of achieving optimal resource allocation.

3.1. Network Model

In the study of autonomous swarm robots, we model the swarm robot system as a system containing M resource-providing robots and N resource-consuming robots, where the set of resource-providing robots is denoted as \mathcal{M} = \{1, 2, \dots, M\} and the set of resource-consuming robots is denoted as \mathcal{N} = \{1, 2, \dots, N\}. In this system, the resource-providing robots act as the producers and provide computational resources to the resource-consuming robots. The computing resources provided by different resource-providing robots produce the same unit revenue for resource-consuming robots, and a price mechanism is introduced into the resource allocation process to quantify the degree of demand for resources by resource-providing robots and resource-consuming robots. Because price regulation can improve the efficiency of resource utilization and maximize overall revenue, the swarm robot system needs to establish an effective price mechanism between the resource-providing robots and the resource-consuming robots.

Modeling the autonomous swarm robot system as a graph-theoretic model can aid in understanding the game relationship between robots more deeply. The autonomous swarm robot system is modeled as a network graph model containing a set of resource-providing robots (Sv) and a set of resource-consuming robots (Cm). In this model, the resource-providing robots (Sv) and resource-consuming robots (Cm) are the nodes of the network graph, and the edges denote the existence of game behaviors between the robots on both sides of the edges. From Figure 1, it can be seen that in the autonomous swarm robot system, there is a game relationship between all robots. This means that the behavior and strategy choices of each robot will be influenced by other robots, and their decision-making relationships with each other will generate complex dynamics in the whole system.
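The graph view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_game_graph` and the edge labels are hypothetical, the node names reuse the Sv and Cm labels from the text, and consumer-to-consumer edges are left out on the reading that resource-consuming robots do not compete with each other directly (an assumption stated in Section 3.2). It is not a reproduction of Figure 1.

```python
from itertools import combinations, product


def build_game_graph(num_providers, num_consumers):
    """Return provider nodes, consumer nodes, and a dict mapping each edge to its game type."""
    providers = [f"Sv{i}" for i in range(1, num_providers + 1)]
    consumers = [f"Cm{j}" for j in range(1, num_consumers + 1)]
    edges = {}
    # Providers compete with each other on output (Cournot game).
    for a, b in combinations(providers, 2):
        edges[(a, b)] = "cournot"
    # Every provider plays the price game with every consumer.
    for s, c in product(providers, consumers):
        edges[(s, c)] = "price"
    return providers, consumers, edges


if __name__ == "__main__":
    providers, consumers, edges = build_game_graph(3, 4)
    for (u, v), game in edges.items():
        print(u, v, game)
```

With 3 providers and 4 consumers, the sketch yields 3 Cournot edges among providers and 12 price-game edges between the two groups.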

3.2. Combination Game Model

In this system, multiple resource-providing robots provide computational resources for multiple resource-consuming robots. For the problem of output allocation among multiple producers, the Cournot game is generally used in economics for output decision analysis. In swarm robots, the resource-providing robots can obtain higher revenues through the output game and can ensure price consistency through a uniform pricing strategy, thus avoiding price wars and resource wastage. Similarly, resource-consuming robots need to balance cost and revenue when purchasing computational resources, so that the purchased resources generate sufficient revenue for themselves. This requires an effective pricing mechanism between resource-providing robots and resource-consuming robots to achieve efficient use of resources and maximize overall revenue.

In order to achieve this goal, market mechanisms can be introduced to coordinate transactions between resource-providing robots and resource-consuming robots. The market mechanism coordinates the supply and demand of resources through prices so that the prices of resources are balanced with the supply and demand. In the market mechanism, resource-providing robots can submit the supply quantity and price of resources to the market, while resource-consuming robots can submit the demand quantity and price of resources to the market. The market determines the price of resources by matching the supply and demand of buyers and sellers, i.e., reaching the Nash equilibrium point of the two, so as to realize the efficient allocation of resources and maximize the overall benefits.

In this model, resource-providing robots earn revenue by selling computational resources, while resource-consuming robots must buy these resources to perform tasks and earn revenue. It is assumed that resource-providing robots and resource-consuming robots make decisions rationally, that is, they choose the strategy that maximizes their own benefits. It is also assumed that there is no direct competition between the resource-consuming robots, so each can decide independently how many resources to purchase. Both parties aim to maximize their own benefits until they reach an equilibrium state in which no robot has an incentive to unilaterally alter its strategy given the strategies of the other participants.

For the resource-providing robot, its maximum revenue can be obtained by solving the maximum value of the revenue function, which takes the form of the following equation:

\max U_{c} (P, Q_{n})

(1)

where U_c is the revenue of the resource-providing robot, P is the pricing of the computational resources, and Q_n is the number of computational resources sold by the resource-providing robot.

Similarly, for the resource-consuming robot, the problem of maximizing its revenue can be expressed in the following form:

\max U_{n} (P, Q_{n})

(2)

where U_n denotes the revenue of the resource-consuming robot. To maximize this revenue, the resource-consuming robot must trade off the cost of the computational resources it purchases against the revenue generated by performing tasks with them.

3.2.1. Cournot Model between Resource-Providing Robots

For the competitive scenario in an autonomous swarm robotics system with multiple resource-providing robots, we adopt the Cournot model, in which the resource-providing robots compete on output in order to maximize their revenue. In this model, each resource-providing robot needs to consider both its own resource cost and the impact of its competitors’ output decisions when formulating its own output strategy.

Game Composition

The composition elements of the Cournot game model proposed in this paper are specified as follows:

Participants: M resource-providing robots;
Strategy set: output strategies of resource-providing robots;
Utility function: the utility function of the ith resource-providing robot is denoted by Π_i.

Utility function structure

Resource-providing robots produce the same computational resources, which have the same selling price p, the same unit production cost c_0, and the same revenue for resource-consuming robots. Each resource-providing robot i simultaneously determines its own production x_i, and together these give the total production X. The market price p then decreases linearly as the total production increases.

From the above analysis, the total output of the producer is as follows:

X = \sum_{i = 1}^{M} x_{i}

(3)

Price is determined by the following equation:

p = m - α \sum_{i = 1}^{M} x_{i}

(4)

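where m and α are constants: m is the price when the total production is zero, and α is the rate at which the price falls as the total production increases.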

The total cost of the ith resource-providing robot is as follows:

C_{i} = c_{0} \times x_{i}

(5)

Due to the uniform pricing strategy, each resource-providing robot sells at the price p. The revenue function of the ith resource-providing robot can therefore be expressed as follows:

Π_{i} = (p - c_{0}) \times x_{i} = (m - α \sum_{j = 1}^{M} x_{j} - c_{0}) \times x_{i}

(6)

Then, the optimal decision for the resource-providing robot should be the following:

\max Π_{i} (p, x_{i})

(7)

The game equilibrium is solved using backward induction by taking the first-order partial derivatives of the payoff functions of the M resource-providing robots with respect to their own production and setting them equal to zero:

\left\{\begin{matrix} \frac{\partial Π_{1}}{\partial x_{1}} = m - 2 α x_{1} - α \sum_{i = 2}^{M} x_{i} - c_{0} = 0 \\ \frac{\partial Π_{2}}{\partial x_{2}} = m - 2 α x_{2} - α (x_{1} + \sum_{i = 3}^{M} x_{i}) - c_{0} = 0 \\ \frac{\partial Π_{3}}{\partial x_{3}} = m - 2 α x_{3} - α (\sum_{i = 1}^{2} x_{i} + \sum_{i = 4}^{M} x_{i}) - c_{0} = 0 \\ \vdots \\ \frac{\partial Π_{M}}{\partial x_{M}} = m - 2 α x_{M} - α \sum_{i = 1}^{M - 1} x_{i} - c_{0} = 0 \end{matrix}\right.

(8)

The production of each resource-providing robot, expressed as its best response to the production of the others, can be found as follows:

\left\{\begin{matrix} x_{1} = \frac{m - α \sum_{i = 2}^{M} x_{i} - c_{0}}{2 α} \\ x_{2} = \frac{m - α (x_{1} + \sum_{i = 3}^{M} x_{i}) - c_{0}}{2 α} \\ x_{3} = \frac{m - α (\sum_{i = 1}^{2} x_{i} + \sum_{i = 4}^{M} x_{i}) - c_{0}}{2 α} \\ \vdots \\ x_{M} = \frac{m - α \sum_{i = 1}^{M - 1} x_{i} - c_{0}}{2 α} \end{matrix}\right.

(9)

According to the optimization conditions, the Nash equilibrium production of this Cournot game model can be found as follows:

x_{1} = x_{2} = \dots = x_{M} = \frac{m - c_{0}}{(M + 1) α}

(10)
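The step from the first-order conditions to the equilibrium production can be made explicit. The short derivation below is a sketch that uses only the symbols already defined: it rewrites each condition in Equation (8) in terms of the total production X from Equation (3), which shows that every robot produces the same amount, and then sums over the M robots.

```latex
\begin{aligned}
& m - α x_{i} - α X - c_{0} = 0, \quad i = 1, \dots, M \\
& \Rightarrow x_{i} = \frac{m - c_{0}}{α} - X \quad \text{(the same value for every } i \text{)} \\
& \Rightarrow X = \sum_{i = 1}^{M} x_{i} = \frac{M (m - c_{0})}{α} - M X \\
& \Rightarrow X = \frac{M (m - c_{0})}{(M + 1) α}, \qquad x_{i} = \frac{X}{M} = \frac{m - c_{0}}{(M + 1) α}
\end{aligned}
```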

The analysis of the above formula shows that the production of computational resources by each resource-providing robot is inversely related to the number of resource-providing robots. The total production X = M x_{i} = \frac{M (m - c_{0})}{(M + 1) α}, in contrast, increases with M: it is never less than \frac{m - c_{0}}{2 α} (its value at M = 1) and approaches, but never reaches, \frac{m - c_{0}}{α}. This indicates that the total production will not be scaled down indefinitely, because a total production that is too low would make the market price too high.

After solving for the Nash equilibrium production, one can determine the initial price p_0 of the computing resource from Equation (4) as follows:

p_{0} = m - \frac{M}{M + 1} (m - c_{0})

(11)
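The first-stage result can be checked numerically. The following is a minimal sketch, assuming identical unit cost c_0 for all robots as in Equation (6); the function names and the parameter values are hypothetical and chosen only for illustration. It evaluates the closed forms in Equations (10) and (11) and compares them with a sequential best-response sweep over the system in Equation (9).

```python
def cournot_equilibrium(m, alpha, c0, M):
    """Closed-form symmetric equilibrium from Equations (10) and (11)."""
    x = (m - c0) / ((M + 1) * alpha)   # per-robot production, Equation (10)
    total = M * x                      # total production X
    p0 = m - alpha * total             # initial price via Equation (4)
    return x, total, p0


def best_response_sweeps(m, alpha, c0, M, sweeps=200):
    """Solve the system in Equation (9) by updating one robot at a time."""
    x = [0.0] * M
    for _ in range(sweeps):
        for i in range(M):
            others = sum(x) - x[i]
            # Equation (9): robot i's best response to the others' output
            x[i] = (m - alpha * others - c0) / (2 * alpha)
    return x


# Hypothetical parameter values, for illustration only
x_star, X, p0 = cournot_equilibrium(m=100.0, alpha=2.0, c0=10.0, M=3)
x_iter = best_response_sweeps(m=100.0, alpha=2.0, c0=10.0, M=3)
print(x_star, X, p0)
print(x_iter)
```

The sweep updates the robots one after another rather than all at once. Simultaneous updates are known to oscillate for three or more symmetric producers, so the sequential order is a deliberate choice in this sketch.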

3.2.2. Price Game between Resource-Providing Robots and Resource-Consuming Robots

After the resource-providing robot determines the initial pricing, the resource-consuming robot seeks a balance between cost and benefit: based on the resource-providing robot’s pricing of the computing resources, it checks that the purchased computing resources will bring sufficient benefit before deciding on its own purchasing strategy, Q_n. In the course of the game, the resource-providing robot and the resource-consuming robot keep adjusting their strategies until they reach Nash equilibrium. At Nash equilibrium, no participant has an incentive to change its strategy, because doing so will not improve its return. When the price game reaches Nash equilibrium, an effective price mechanism is established between the resource-providing robots and the resource-consuming robots, which helps to achieve the efficient use of resources while maximizing the total revenue of the entire swarm robot system.

Game Composition

Three elements of the price game are specified as follows:

Participants: M resource-providing robots and N resource-consuming robots;
Strategy set:
- Pricing of resource-providing robots P;
- The purchase quantity Q_n of the resource-consuming robots;
Utility function: The utility function of the resource-providing robot is denoted by U_c. The utility function of the resource-consuming robot is denoted by U_n.

Utility Function Structure

Inspired by the work of Jiang et al. [28], the system model accounts for the fact that, in practice, the computing resources of the resource-providing robots are purchased with different frequencies, i.e., they differ in popularity. The request probability with which the resource-consuming robots purchase the computing resources is T_c, which obeys the Zipf distribution:

T_{c} = \frac{c^{- ξ}}{\sum_{k = 1}^{c} k^{- ξ}}

(12)

where ξ is the popularity of the computing resource. When ξ is larger, the computing resource is more popular; when ξ is smaller, the computing resource is less popular.

In robot resource allocation scenarios, capturing each robot’s preference and selection tendency for resources from limited information is the key to understanding robot behavior and optimizing resource allocation. To better model the variability in the purchase requests of resource-consuming robots for computing resources, and inspired by the research on preference modeling by Yasir et al. [29], this paper designs an individual content popularity formula that combines the quality of resource evaluation and the number of historical purchases:

T_{u} = e_{1} \frac{G_{i}^{a v g}}{G_{i}} + e_{2} \frac{H_{i}}{H}

(13)

where G_i^avg denotes the average evaluation by resource-consuming robot m of the computing resources provided by resource-providing robot i, G_i is the full evaluation of the computing resources provided by resource-providing robot i, H_i denotes the number of times resource-consuming robot m purchases computing resources i, H is the total number of times resource-consuming robot m purchases computing resources, and e₁ and e₂ are the weighting coefficients of the evaluation term and the purchase-count term, respectively.

In addition, by weighting the resource popularity and the popularity based on individual preference, the purchase probability T of resource-consuming robot m on computing resource i can be constructed, as shown in Equation (14):

T = k_{1} T_{c} + k_{2} T_{u}

(14)

where k₁ and k₂ denote the weights of resource popularity and individual content popularity, respectively. k₁ > k₂ indicates that the resource-providing robot values the overall popularity of resources more; otherwise, it indicates that the resource-consuming robot pursues individual interests more. Therefore, T can reasonably reflect the purchasing interest of resource-consuming robots on popular computing resources.
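Equations (12) to (14) can be evaluated with a few lines of code. The sketch below is illustrative only: the function names and all numeric inputs are hypothetical, and the Zipf term is implemented exactly as printed in Equation (12), with the sum running up to c (a textbook Zipf law would normalize over the total number of resources instead).

```python
def zipf_popularity(c, xi):
    """Equation (12) as printed: c^(-xi) / sum_{k=1}^{c} k^(-xi)."""
    return c ** (-xi) / sum(k ** (-xi) for k in range(1, c + 1))


def individual_popularity(g_avg, g_full, h_i, h_total, e1, e2):
    """Equation (13): weighted evaluation ratio plus weighted purchase-history ratio."""
    return e1 * g_avg / g_full + e2 * h_i / h_total


def purchase_probability(t_c, t_u, k1, k2):
    """Equation (14): weighted mix of resource popularity and individual popularity."""
    return k1 * t_c + k2 * t_u


# Hypothetical inputs, for illustration only
t_c = zipf_popularity(c=2, xi=1.0)
t_u = individual_popularity(g_avg=4.0, g_full=5.0, h_i=3, h_total=10, e1=0.5, e2=0.5)
T = purchase_probability(t_c, t_u, k1=0.6, k2=0.4)
print(t_c, t_u, T)
```

With k1 larger than k2, as in this example, the overall resource popularity T_c dominates the resulting purchase probability.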

Combined with the above analysis, the resource-consuming robots can adjust their purchase quantity Q_n according to their own situation, the computing resource pricing P, and the purchase probability T.

The Keynesian utility function is a concise and effective analytical tool for studying consumption behavior patterns and consumer decisions. It uses a logarithmic form to model the decreasing contribution of increasing consumption to utility; that is, each additional unit of consumption makes a smaller marginal contribution to total utility. Similarly, in autonomous swarm robot systems, the utility that resource-consuming robots obtain from computing resources follows this law of diminishing marginal utility. Inspired by the work of Christensen et al. [30] and the above analysis, a logarithmic function is used to construct the revenue that the resource-consuming robots obtain after purchasing computing resources:

U_{i} = T β \sum_{n = 1}^{N} \ln (1 + Q_{n})

(15)

The function is monotonically increasing with respect to Q_n, and its marginal utility decreases. Here, β is the benefit coefficient of the resource-consuming robot (a constant).

The revenue function of the resource-consuming robots is the difference between the revenue obtained from the purchased computing resources and the purchase cost, where the purchase cost is influenced by the robot’s own preferences and cost coefficient, i.e.,

U_{n} = T β \sum_{n = 1}^{N} \ln (1 + Q_{n}) - ρ \sum_{n = 1}^{N} Q_{n} ϕ P

(16)

where ϕ is the pricing adjustment factor influenced by the preferences of the resource-consuming robot and ρ is a cost factor.

The revenue per unit of computing resource for the resource-providing robot is the difference between its pricing and its cost; so, the revenue function for the resource-providing robot can be derived as follows:

U_{c} = \sum_{m = 1}^{M} ρ (P - C) Q_{n}

(17)

4. Proof and Solving

In the interaction between the resource-consuming robot and the resource-providing robot, if the resource price set by the resource-providing robot is excessively high, the resource-consuming robot will refrain from purchasing resources. Conversely, if the resource price set by the resource-providing robot is too low, the resource-providing robot will be unable to generate sufficient profit. This situation gives rise to a non-cooperative game between the two parties. To address this game problem, the optimal decisions of the resource-providing robot and the resource-consuming robot can be determined by analyzing the Nash equilibrium point and the optimal solution, so as to maximize the benefits for both parties involved.

In this paper, we employ the Lagrange multiplier method to determine the optimal purchasing strategy for a resource-consuming robot. By incorporating Lagrange multipliers, we merge the robot’s constraints with the objective function to derive the optimal solution. The optimal service resource pricing is solved using backward induction, i.e., starting from the last stage and reasoning backward, the optimal pricing strategy of the service resource is derived by analyzing the profit maximization problem of the resource-providing robot. By combining these two methods, the model’s equilibrium state can be solved effectively, providing theoretical support for further study of resource allocation problems.

4.1. Nash Equilibrium Existence Proof

In the first phase of the combination game, a Cournot game is played between the resource-providing robots to determine the total production and the initial price. In the subsequent phases, the resource-consuming robots need to purchase these computational resources, and a non-cooperative game is formed between them. According to the Nash equilibrium theory, there exists a unique Nash equilibrium solution when the resource pricing of the resource-providing robots is kept constant. Under the condition of this solution, no party can gain more by changing its strategy.

First, we can compute the first-order partial derivatives of the revenue function of the resource-consuming robot with respect to its set of strategies to obtain the following:

\frac{\partial U_{n}}{\partial Q_{n}} = \frac{(k_{1} \frac{c^{- ξ}}{\sum_{k = 1}^{c} k^{- ξ}} + k_{2} (e_{1} \frac{G_{i}^{avg}}{G_{i}} + e_{2} \frac{H_{i}}{H})) β}{1 + Q_{n}} - ρ ϕ P

(18)

Then, its second-order partial derivative is obtained:

\frac{\partial^{2} U_{n}}{\partial Q_{n}^{2}} = - \frac{(k_{1} \frac{c^{- ξ}}{\sum_{k = 1}^{c} k^{- ξ}} + k_{2} (e_{1} \frac{G_{i}^{avg}}{G_{i}} + e_{2} \frac{H_{i}}{H})) β}{{(1 + Q_{n})}^{2}}

(19)

The analysis, utilizing Equation (19), indicates that the utility function’s second-order partial derivative with respect to the resource-consuming robot’s set of strategies is negative. This implies that the utility function is a concave function and demonstrates the existence of Nash equilibrium in the game model. This result guarantees the existence of an optimal solution for the game between the resource-consuming robots and the resource-providing robots.
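The concavity argument can be sanity-checked numerically. The sketch below is only an illustration under assumed values: it takes the single-robot term of Equation (16), estimates its second derivative by a central finite difference, and compares the estimate with the analytic expression in Equation (19). The parameter values and helper names are hypothetical.

```python
import math


def consumer_utility(q, T, beta, rho, phi, P):
    """Single-robot term of Equation (16): T*beta*ln(1 + q) - rho*phi*P*q."""
    return T * beta * math.log(1.0 + q) - rho * phi * P * q


def second_difference(f, q, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at q."""
    return (f(q + h) - 2.0 * f(q) + f(q - h)) / (h * h)


# Hypothetical parameter values, for illustration only
params = dict(T=0.42, beta=100.0, rho=1.0, phi=1.0, P=8.4)


def u(q):
    return consumer_utility(q, **params)


for q in (0.5, 2.0, 10.0):
    numeric = second_difference(u, q)
    analytic = -params["T"] * params["beta"] / (1.0 + q) ** 2  # Equation (19)
    print(q, numeric, analytic)
```

The linear cost term drops out of the second derivative, so the sign depends only on T and β, both of which are positive in the model.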

4.2. Game Model Solving

According to economic theory, a rational resource-consuming robot will follow the principle of utility maximization and choose the purchase strategy that maximizes its own utility under the given resource pricing conditions. From this perspective, the procurement of a resource-consuming robot can be considered as a mathematical optimization problem for maximizing utility. That is, the resource-consuming robot must identify an optimal strategy for maximizing its revenue based on a given pricing strategy. With this analysis, the problem can be transformed into the following:

\max U_{n} = T β \sum_{n = 1}^{N} \ln (1 + Q_{n}) - ρ \sum_{n = 1}^{N} Q_{n} ϕ P

(20)

Using the Lagrange multiplier method, problem (20) is transformed into the following:

Γ = U_{n} + λ Q_{n} + γ (1 - Q_{n})

(21)

In emergency scenarios, the system is isolated from the outside world and resources are scarce, necessitating consideration of the constraints placed on resource-consuming robots in analyzing the problem. In the actual situation, the total amount of purchases of resource-consuming robots should be equal to the total production X of resource-providing robots, and so the production constraint is as follows:

\sum_{n = 1}^{N} Q_{n} = X

(22)

At the same time, the amount spent on the purchase of a resource-consuming robot cannot exceed its maximum budget Ω:

\sum_{n = 1}^{N} P Q_{n} \leq Ω

(23)

Based on the given information, the Karush–Kuhn–Tucker (KKT) conditions can be expressed as follows:

\{\begin{matrix} \frac{\partial Γ}{\partial Q_{n}} = 0 \\ κ (\sum_{n = 1}^{N} Q_{n} - X) = 0 \\ ν (\sum_{n = 1}^{N} P Q_{n} - Ω) = 0 \\ λ Q_{n} = 0, γ (1 - Q_{n}) = 0 \\ λ, γ, ν, κ \geq 0 \end{matrix}

(24)

The solution Q_n of the above equation is the optimal solution of the combination game proposed in this paper and is as follows:

Q_{n} = \frac{(k_{1} \frac{c^{- ξ}}{\sum_{k = 1}^{c} k^{- ξ}} + k_{2} (e_{1} \frac{G_{i}^{avg}}{G_{i}} + e_{2} \frac{H_{i}}{H})) β}{(ρ ϕ - ν) P - λ + γ - κ} - 1

(25)

The obtained Q_n represents the quantity of computing resources that the resource-consuming robot rationally chooses to purchase according to its preferences and the pricing it faces; it is the strategic choice of the resource-consuming robot. Q_n serves as the strategic variable of the resource-consuming robots in the game model and interacts with the pricing strategy P of the resource-providing robots. Firstly, the resource-providing robots determine their optimal pricing P* according to the change in Q_n, to maximize their revenue; secondly, the resource-consuming robot determines its optimal purchasing quantity Q_n* according to P, to maximize its utility. Finally, the game reaches Nash equilibrium when neither the resource-providing robots nor the resource-consuming robots can increase their utility by changing strategy unilaterally. Therefore, Q_n is the key variable that connects the resource supply and demand sides and realizes the equilibrium of the game. As a strategic choice made by the resource-consuming party according to economic rationality, changes in its value push the game towards an optimal solution and ultimately affect the progress and results of the game.
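As a rough illustration of Equation (25), the sketch below evaluates Q_n for given multipliers and checks that, when all multipliers are zero, the result maximizes the single-robot utility term of Equation (16). The function names, the default multiplier values of zero, and the numeric inputs are assumptions made for this example, not values from the paper.

```python
import math


def optimal_purchase(T, beta, rho, phi, P, lam=0.0, gamma=0.0, kappa=0.0, nu=0.0):
    """Equation (25): Q_n = T*beta / ((rho*phi - nu)*P - lam + gamma - kappa) - 1."""
    denom = (rho * phi - nu) * P - lam + gamma - kappa
    if denom <= 0:
        raise ValueError("denominator must be positive for a finite purchase quantity")
    return T * beta / denom - 1.0


def single_robot_utility(q, T, beta, rho, phi, P):
    """Single-robot term of Equation (16)."""
    return T * beta * math.log(1.0 + q) - rho * phi * P * q


# Hypothetical inputs; all multipliers are zero, i.e. no constraint is active
args = dict(T=0.42, beta=100.0, rho=1.0, phi=1.0, P=8.4)
q_star = optimal_purchase(**args)
print(q_star)
print(single_robot_utility(q_star, **args))
```

Raising P in this sketch lowers q_star, which is the price sensitivity that the provider's pricing step relies on.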

After determining the purchasing strategy Q_n of the resource-consuming robots, the resource-providing robots will adjust their pricing strategy accordingly to maximize their revenue. To solve the optimal pricing of the resource-providing robots, the backward induction method is needed.

First, find the maximum benefit of the resource-providing robot:

\max U_{c} = \sum_{m = 1}^{M} ρ (P - C) Q_{n}

(26)

Taking the derivative of the above equation with respect to P gives the following:

\frac{\partial U_{c}}{\partial P} = ρ [Q_{n} + (P - C) \frac{\partial Q_{n}}{\partial P}]

(27)

Setting the above equation equal to zero, the optimal pricing strategy for the resource-providing robot is solved as follows:

P = C - \frac{Q_{n}}{\frac{\partial Q_{n}}{\partial P}}

(28)
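As a worked special case, and only under the assumption that every multiplier in Equation (25) vanishes (λ = γ = κ = ν = 0, so that no constraint is active), Equation (28) can be solved by hand. Writing T for the purchase probability of Equation (14), the steps are:

```latex
\begin{aligned}
Q_{n} &= \frac{T β}{ρ ϕ P} - 1, \qquad
\frac{\partial Q_{n}}{\partial P} = - \frac{T β}{ρ ϕ P^{2}} \\
P &= C - \frac{Q_{n}}{\partial Q_{n} / \partial P}
   = C + \left( \frac{T β}{ρ ϕ P} - 1 \right) \frac{ρ ϕ P^{2}}{T β}
   = C + P - \frac{ρ ϕ P^{2}}{T β} \\
P^{2} &= \frac{C T β}{ρ ϕ}
\quad \Longrightarrow \quad
P^{*} = \sqrt{\frac{C T β}{ρ ϕ}}, \qquad
Q_{n}^{*} = \sqrt{\frac{T β}{ρ ϕ C}} - 1
\end{aligned}
```

This closed form holds only for the unconstrained case. When the production or budget constraint is active, the multipliers themselves depend on P and Q_n, and the relation between the two strategies is no longer explicit.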

It should be noted that the purchasing strategy Q_n of the resource-consuming robots and the pricing strategy P of the resource-providing robots have a bidirectional, dynamically changing implicit dependency: a change in Q_n causes an adjustment in P and vice versa. This interdependence makes it impossible to directly analyze and solve the optimal pricing P* of the resource-providing robots, which is an NP-hard problem. Specifically, as the resource-consuming robots adjust their purchasing strategy Q_n to maximize their utility, the change affects the differential equation of the pricing strategy used by the resource-providing robots and alters the original optimal pricing P. The resource-providing robots then modify their pricing P based on the new Q_n, which in turn affects the utility maximization problem of the resource-consuming robots and leads to a further change in Q_n. As Q_n and P are mutually constrained, it is challenging to obtain the optimal pricing P* through a direct solution. Hence, an iterative updating mechanism is devised to resolve the optimal pricing of the resource-providing robots through repeated iterations. In each iteration round, the resource-providing robots modify the pricing P based on the updated Q_n, and a new Q_n is generated from the interdependence between Q_n and P in the current round. Through continuous iteration, the pricing strategies of the resource-providing robots eventually converge to the optimal solution of the system as a whole. The specific iteration formula is as follows:

P_{t + 1} = C - \frac{Q_{n}}{\frac{\partial Q_{n}}{\partial P_{t}}}

(29)

where t is the number of iterations and P_t denotes the resource pricing of the resource-providing robot at the tth iteration. The following stopping conditions are set: 1. The revenue of both the resource-providing robots and the resource-consuming robots reaches its maximum value at iteration t + 1. 2. The change in the pricing of the resource-providing robots between two consecutive iterations is less than the set convergence threshold: |P_{t+1} − P_t| < ε. 3. The number of iterations exceeds 10,000. As soon as any one of these conditions is met, the iterative process stops; otherwise, the next iteration cycle begins.

The process of the distributed iterative algorithm (Algorithm 1) is as follows:

There are M resource-providing robots in the initialization phase, setting the initial pricing policy P₀; there are N resource-consuming robots, setting the initial purchasing policy Q₀. After entering the main loop, the following pseudocode is executed:

Algorithm 1: Distributed Iterative Game Pricing Algorithm

Input:
M: Number of resource-providing robots
N: Number of resource-consuming robots
P0: Initial pricing strategies of resource-providing robots
Q0: Initial purchasing strategies of resource-consuming robots
Output:
P*: Optimal pricing strategies of resource-providing robots
Q*: Optimal purchasing strategies of resource-consuming robots

t = 0 // initialize the iteration counter
while t < 10000 do
    for each i in M do
        Pi = AdjustPricingStrategy(Pi, Q, Equation (17)) // update pricing strategy based on Equation (17) and Q
        Broadcast Pi to all resource-consuming robots
    end for
    for each j in N parallel do
        kj = 0 // initialize inner iteration counter
        while Qj not Nash equilibrium do
            Qj = AdjustPurchasingStrategy(Qj, P, Equation (16)) // update purchasing strategy based on Equation (16) and P
            kj ++
        end while
        Broadcast Qj to all resource-providing robots
    end for
    if the change in every Pi since the previous round is less than ε then break // pricing has converged
    t ++ // update the iteration counter
end while
return P*, Q* // the pricing strategy P* of the resource-providing robots and the purchasing strategy Q* of the resource-consuming robots at this point

The iterative calculation process reflects the dynamic game and continuous adaptation of resource supply and demand strategies, which simulates the dynamic adjustment process of supply and demand in the actual economic environment and can effectively obtain the Nash equilibrium solution of the resource price game.
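The alternating structure of Algorithm 1 can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' implementation: it assumes a single resource-providing robot with constant unit cost C, three resource-consuming robots, and zero multipliers in Equation (25), so that each consumer's response is A/P − 1 with A standing for Tβ/(ρϕ). All names and numeric values are hypothetical.

```python
def purchase_quantity(A, P):
    """Equation (25) with all multipliers set to zero; A stands for T*beta/(rho*phi)."""
    return A / P - 1.0


def purchase_slope(A, P):
    """Derivative of purchase_quantity with respect to the price P."""
    return -A / (P * P)


def iterate_pricing(A_list, C, P0, eps=1e-9, max_iter=10_000):
    """Alternate consumer and provider updates until the price change is below eps."""
    P = P0
    for t in range(1, max_iter + 1):
        # Consumers react to the broadcast price
        Q = [purchase_quantity(A, P) for A in A_list]
        dQ_dP = sum(purchase_slope(A, P) for A in A_list)
        # Provider update, Equation (29): P_{t+1} = C - Q / (dQ/dP_t)
        P_next = C - sum(Q) / dQ_dP
        if abs(P_next - P) < eps:
            P = P_next
            break
        P = P_next
    # Final consumer response at the converged price
    Q = [purchase_quantity(A, P) for A in A_list]
    return P, Q, t


# Hypothetical inputs, for illustration only
P_star, Q_star, rounds = iterate_pricing(A_list=[20.0, 25.0, 30.0], C=4.0, P0=8.0)
print(P_star, Q_star, rounds)
```

Under these assumptions the update reduces to P_{t+1} = C + P_t − N P_t² / ΣA, whose fixed point is the square root of C times the mean of the A values, so the example settles near a price of 10 well before the 10,000-iteration cap.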

5. Simulation Analysis and Discussion

The PyCharm platform is utilized to conduct comprehensive simulations of the combination game involving autonomous swarm robots in a networked environment. In order to simplify the calculation, we assumed that there were two resource-providing robots and three resource-consuming robots in the network environment. The specific simulation parameters are listed in Table 1.

5.1. Comparison of Utility between Cournot Game and Stackelberg Game

Figure 2 compares the impact of the Cournot, Stackelberg, and Bertrand game models on the total utility of the resource-providing robots as parameter m changes in the Nash equilibrium state. By adjusting the parameter m, we can simulate various market demand functions. Drawing upon economic principles, we can calculate the total utility of resource-providing robots under three distinct oligopoly game scenarios, with the price parameter α set to 2. The figure shows that the Cournot model achieves higher total utility than the Stackelberg model across all m values. Under the Cournot model, the interactive mechanism between output decisions is realized through information sharing. Each robot considers the influence of others when deciding the output, leading to a collective equilibrium of output that meets the collaboration needs for disaster rescue. However, under the Stackelberg model, output decisions are independent, failing to form collaborative adjustments. Secondly, the Bertrand model exhibits the lowest total utility across all m values. This may be because the Bertrand model only considers price competition, ignoring the output interaction between robots. In emergencies, actual resource output is critical for meeting demand and optimizing effectiveness. Therefore, the Bertrand model’s limitations lead to lower total utility. Additionally, as m increases, the total utility of the three models also increases since market demand increases with m. This aligns with economics, because as market demand increases, robots can better meet demand, increasing total utility. These results reflect the Cournot model’s advantages in resource allocation and pricing for swarm robots to achieve higher total utility.
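For orientation, the ordering between the Cournot and Stackelberg models can be reproduced with textbook duopoly formulas. The sketch below assumes the linear inverse demand of Equation (4), two providers with the same unit cost c_0, and total profit as the utility measure; these are assumptions of this example, since the exact utility definitions behind Figure 2 are not restated here, and the value of c_0 is hypothetical.

```python
def cournot_total_profit(m, alpha, c0):
    """Two symmetric providers choosing quantities simultaneously."""
    return 2.0 * (m - c0) ** 2 / (9.0 * alpha)


def stackelberg_total_profit(m, alpha, c0):
    """The leader commits to a quantity first; the follower best-responds."""
    leader = (m - c0) ** 2 / (8.0 * alpha)
    follower = (m - c0) ** 2 / (16.0 * alpha)
    return leader + follower


# alpha = 2 as in the text; c0 and the m values are hypothetical
for m in (50.0, 100.0, 150.0):
    print(m, cournot_total_profit(m, 2.0, 10.0), stackelberg_total_profit(m, 2.0, 10.0))
```

Both totals grow with the square of (m − c_0), and the Cournot total exceeds the Stackelberg total by a constant factor of 32/27 in this setting.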

5.2. Analysis of Pricing Changes

Figure 3 displays the pricing change process with the number of iterations for resource-providing robot 1 and resource-providing robot 2 when the initial pricing is set at 10. The robots adjust their pricing strategies based on the possible pricing strategies of each other. With increasing iterations, the participants progressively approach the optimal response strategy under rational expectations. Eventually, when resource-providing robot 1 is priced at around 8.4, and resource-providing robot 2 is priced at around 8.8, the pricing of both robots converges and reaches the Nash equilibrium state. At this point, the revenue of both parties is maximized. This equilibrium represents the optimal pricing combination that maximizes the revenue of both parties, and any unilateral change in the pricing strategy of either participant will lead to a decrease in their own utility. Although the strategies of the participants are dynamically adjusted during the game process, Figure 3 demonstrates that ultimately, the game mechanism drives both parties’ strategies to a stable state. This process confirms that the designed combination game model successfully promotes the convergence of participants’ strategies to the Nash equilibrium stabilization point.

Figure 4 illustrates the pricing relationship for an initial pricing of 8. After modifying the initial pricing condition to P1 = P2 = 8, the iterative game solving of pricing for the resource-providing robots is repeated. As shown in Figure 4, the Nash equilibrium points of resource-providing robot 1 and resource-providing robot 2 are still around (8.4, 8.8). This observation supports the correctness and robustness of the algorithm: under different initial conditions, the designed combination game model drives the participants’ strategies to converge to similar stable equilibrium points.

5.3. Utility Analysis of Combination Game and Price Game of Swarm Robots

Figure 5 illustrates the comparison of changes in utility for resource-consuming robots with the number of iterations in the established model, specifically comparing the combination game and the traditional price game strategy. By examining the utility function of the resource-consuming robot, it becomes evident that its utility is dependent on the pricing strategy of the resource-providing robot. Prior to the attainment of Nash equilibrium by resource-providing robot 1 and resource-providing robot 2, the revenue of the resource-consuming robot experiences fluctuations. These fluctuations arise from the process of adjusting the pricing strategies of resource-providing robot 1 and resource-providing robot 2. Once the pricing strategies of both resource-providing robots converge to stable points, the revenue of the resource-consuming robot tends to stabilize as well. This observation highlights that the stability of the pricing strategies employed by the resource-providing robots leads to stable revenue for the resource-consuming robots when the Nash equilibrium value is reached. Furthermore, when compared with the traditional price game utilizing a distributed iterative search algorithm, it becomes apparent that under the combination game strategy proposed in this article, the resource-consuming robots achieve higher returns in the Nash equilibrium state.

Figure 6 demonstrates the changing trend in revenue for resource-providing robots as the number of iterations progresses in the established model and the traditional dynamic price game model. It is evident from the figure that the income of resource-providing robots fluctuates before reaching Nash equilibrium. Similarly to the scenario depicted in Figure 5, the utility of a resource-providing robot is influenced by its own pricing strategy as well as the pricing strategies of other participants. Once the pricing strategies of resource-providing robot 1 and resource-providing robot 2 stabilize at the Nash equilibrium point, the utility of the resource-providing robot also tends to stabilize. Comparing this with the price game model, it is observed that when the Nash equilibrium is attained, the income of the resource-providing robot under the combination game strategy based on the Cournot game is also significantly higher.

5.4. Comparison of Utility with Different Numbers of Robots

According to the results shown in Figure 7, when the number of resource-consuming robots is six, the utility of the resource-providing robots increases significantly, while the utility of the resource-consuming robots decreases. These findings align with real-life emergency scenarios, where the closed nature of such situations and limited resources lead to higher selling prices when the number of consumers increases. As a result, resource-consuming robots are negatively affected as they have to pay higher prices to obtain the necessary resources, thus reducing their utility. On the other hand, resource-providing robots benefit from the price increase as it boosts their revenue from providing resources, resulting in a substantial increase in their earnings. This observation is consistent with the basic principle of economics that holding output constant and increasing market demand will lead to an increase in price.

The data in Figure 8 illustrate that, with constant demand, the individual utility of each resource-providing robot follows a monotonic declining pattern as the number of resource-providing robots increases. This observation aligns with the fundamental logic of neoclassical economics, where oversupply leads to price reductions, thus impacting profits. However, solely relying on the decline in the individual utility of resource-providing robots is insufficient to determine whether increasing the number of robots is harmful to the overall system. A more comprehensive and long-term perspective is required for evaluation. On the one hand, increasing the number of supply robots can expand the production scale of the system and better meet the demands of the consumer side. From this perspective, it is beneficial for the development of the system. On the other hand, despite the decrease in individual utility, the overall increase in the number of resource-providing robots enhances their total utility level. This indicates that from a macro-economic viewpoint, augmenting the supply power is not necessarily detrimental.

The simulation results depicted in Figure 7 and Figure 8 provide evidence that supports the constructed game model’s ability to accurately describe and analyze the dynamics of the supply and demand relationship. These findings validate the model’s rationality and correctness, affirming its reliability as a tool for understanding resource allocation scenarios.

5.5. Discussion

In summary, this study highlights the advantages of the Cournot game model and the proposed combination game strategy for resource allocation in emergency scenarios involving swarm robots. The Cournot game model consistently outperforms the other game models in terms of total utility. This is attributed to the collaborative adjustments enabled by information sharing, which are lacking in the Stackelberg model, and the disregard for output interaction in the Bertrand model. The two-stage resource allocation game model, combining the Cournot game and price game, effectively guides participants towards maximizing utility and achieving optimal resource allocation. When the initial pricing is modified, the game process still drives participants’ strategies to converge to similar Nash equilibrium states, supporting the correctness and robustness of the designed game mechanism. Simulation results demonstrate the continuous adjustment of participants’ strategies during the iterative process, leading to dynamic changes in their utility. As strategies stabilize at the Nash equilibrium point, participants’ utility tends to stabilize as well. This process aligns with the core conclusion of non-cooperative game theory, which states that utility varies with strategy adjustments and remains constant when a stable equilibrium is reached. Compared to traditional price games based on distributed iterative algorithms, the proposed combination game strategy yields significantly higher utility for both resource-providing and resource-consuming robots under the Nash equilibrium state. When the number of robots is changed, the observed changes in the utility of resource-providing robots and resource-consuming robots are consistent with economic principles and support the rationality and correctness of the constructed model describing the supply and demand relationship.

6. Conclusions

This paper has constructed a two-stage resource allocation game model that addresses the optimal resource allocation problem for autonomous swarm robot systems in complex and dynamic emergency rescue scenarios. The model incorporates both the Cournot game and the price game. In the first stage, multiple resource-providing robots within the group engage in a Cournot game to determine their resource output and pricing. In the second stage, these resource-providing robots engage in a price game with resource-consuming robots, considering output constraints, to determine the optimal pricing and purchasing strategies. The model takes into account the strategic interaction between resource providers and consumers and mathematically proves the existence of a unique Nash equilibrium solution. To address the interdependence of pricing strategies among resource-providing robots, an inverse distributed iterative algorithm is designed to solve the game equilibrium. Simulation results confirm the effectiveness and robustness of the algorithm. The research demonstrates that the proposed game model captures the dynamic game relationship between resource-providing robots and resource-consuming robots in emergency scenarios. The game mechanism designed in this study aims to maximize the utility of both resource providers and consumers, facilitating optimal resource allocation through collaborative adjustments in resource output and pricing. Furthermore, this study highlights the advantages of the Cournot game and the combination game in addressing the resource allocation and utility maximization problems for swarm robots in emergency scenarios. Compared to the Stackelberg game and Bertrand game, the Cournot game more accurately describes the output–game relationship among resource-providing robots in emergency scenarios, facilitating the coordination of total output and market demand.
In contrast to the traditional dynamic price game, the combination game considers the strategic interaction between resource providers and consumers, enabling joint adjustments in resource output and pricing to achieve optimal resource allocation. This collaborative game-driven mechanism is well suited to adapting to demand changes in complex and dynamic environments. The model successfully describes changes in supply and demand and their impact on the utility of the participants. The impact of quantity on efficiency is complex, and a short-term decline does not necessarily mean that it is detrimental to the system. The experimental results verify the rationality and reliability of the game model in describing the supply and demand relationship. Future work will focus on considering coupling constraints for different types of resources to address even more complex physical environments.

Author Contributions

Conceptualization, Z.H., Y.S. and Z.F.; Methodology, Z.H. and Z.F.; Validation, Y.S.; Formal analysis, Z.H. and Z.F.; Writing—Original draft, Z.H.; Writing—Review & editing, Z.H. and Y.S.; Visualization, Z.H.; Supervision, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Network model diagram.

Figure 2. Utility of three different oligopoly games.

Figure 3. Pricing relationship at initial pricing of 10.

Figure 4. Pricing relationship diagram for initial pricing of 8.

Figure 5. Change in utility of resource-consuming robots.

Figure 6. Change in utility of resource-providing robots.

Figure 7. Utility of robots when the number of resource-consuming robots is 6.

Figure 8. Utility of resource-providing robots with different numbers of resource-providing robots.

Table 1. Specific simulation parameters.

Notation	Description	Value
M	Number of resource-providing robots	2
N	Number of resource-consuming robots	3
T_c	Probability that a resource-consuming robot purchases a computing resource	0.8
G_i^avg	Average evaluation of computational resources provided by resource-providing robots i and resource-consuming robots m	0.9
G_i	Full evaluation of the computational resources provided for the resource-providing robot i	0.45
H_i	Number of times resource-consuming robot m purchases computational resource i	2
H	Total number of purchases of computational resources by the resource-consuming robot m	4
e₁	Weighting factor for the number of evaluations	0.25
e₂	Weighting factor for the number of purchases	1
k₁	Weighting of resource prevalence	0.5
k₂	Weighting of resource prevalence	0.5
α	Price parameter	4
m	Price parameter	40
c₀	Unit cost coefficient	2
γ	Lagrange factor	9.8
λ	Lagrange factor	1
ν	Lagrange factor	1
κ	Lagrange factor	0.75
ρ	Cost coefficient	1.2
ϕ	Pricing modifier influenced by resource-consuming robot preferences	0.5
β₁	Coefficient of gain	11
β₂	Coefficient of gain	10.8

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
