Article

Q-Learning-Aided Offloading Strategy in Edge-Assisted Federated Learning over Industrial IoT

1 School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China
2 Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(7), 1706; https://doi.org/10.3390/electronics12071706
Submission received: 23 February 2023 / Revised: 23 March 2023 / Accepted: 28 March 2023 / Published: 4 April 2023

Abstract:
Federated learning (FL) is a key solution to realizing a cost-efficient and intelligent Industrial Internet of Things (IIoT). To improve training efficiency and mitigate the straggler effect of FL, this paper investigates an edge-assisted FL framework over an IIoT system by combining it with a mobile edge computing (MEC) technique. In the proposed edge-assisted FL framework, each IIoT device with weak computation capacity can offload partial local data to an edge server with strong computing power for edge training. In order to obtain the optimal offloading strategy, we formulate an FL loss function minimization problem under the latency constraint in the proposed edge-assisted FL framework by optimizing the offloading data size of each device. An optimal offloading strategy is first derived in a perfect channel state information (CSI) scenario. Then, we extend the strategy into an imperfect CSI scenario and accordingly propose a Q-learning-aided offloading strategy. Finally, our simulation results show that our proposed Q-learning-based offloading strategy can improve FL test accuracy by about 4.7% compared to the conventional FL scheme. Furthermore, the proposed Q-learning-based offloading strategy achieves performance similar to the optimal offloading strategy and always outperforms the conventional FL scheme under different system parameters, which validates the effectiveness of the proposed edge-assisted framework and Q-learning-based offloading strategy.

1. Introduction

With the rapid development of 5G and beyond-industrial information technology, the Industrial Internet of Things (IIoT) has an essential role in the deployment of Industry 4.0 to realize smart manufacturing processes due to its ubiquitous sensing and computation capability [1,2,3]. Combined with artificial intelligence (AI) techniques, IIoT can monitor, collect, exchange, analyze, and transmit data directly to drive an unprecedented level of efficiency, productivity, and performance in industrial operations [4]. However, in traditional centralized training methods, the AI function is placed in the central cloud for further data learning and processing, which leads to some critical limitations. First, in order to transfer massive real-time industrial data to the cloud center, the transmission of a huge volume of data will inevitably lead to a severe bandwidth burden and cause huge communication overhead. Second, sending raw industrial data from IIoT devices to the cloud center may cause privacy concerns [5].
Luckily, as an emerging distributed learning framework, federated learning (FL) is a promising solution to the above issues [6]. Specifically, FL allows IIoT devices to collaboratively train a shared global model through local updates on devices without sharing raw local data with the central server [7]. Instead, each device only needs to send its model parameters to the central server. Therefore, the communication overhead is reduced, and privacy is protected. Nevertheless, the performance of federated learning is limited by the straggler effect [8] due to the heterogeneity of IIoT devices: some devices with weak computation capacity or severe channel conditions may require a longer training time, which may significantly prolong the FL completion time and reduce FL training efficiency as well as learning performance.
Recently, mobile edge computing (MEC) has been proposed to provide edge computation capability for end devices, which can significantly reduce the latency of task execution [9,10]. Combined with MEC technology, devices with weak local computation capacity can choose to offload part of their local data to the edge server, relieving the computation burden and reducing the model training time, thus effectively avoiding the straggler effect. However, directly applying MEC to FL still faces some challenges. First, most existing offloading strategies in MEC aim to reduce computation latency or energy consumption, while the objective of federated learning is different; typically, it is to improve learning performance. Second, the general optimization objective of FL is to minimize the training loss, which is often an implicit function; the complicated relationship between the objective and the optimization variable makes it challenging to obtain the optimal offloading strategy. Third, federated learning involves the iterative exchange of model parameters across multiple communication rounds, and determining the offloading strategy over these rounds is also challenging.
In view of the above discussion, the optimal offloading strategy for improving learning performance in edge-assisted FL over IIoT systems remains an open question. Most existing work on FL mainly focuses on resource allocation and device scheduling [11,12,13,14,15,16,17,18,19,20,21,22], while the abundant computation power of the edge server is ignored. In addition, most existing offloading strategies are studied in MEC networks [23,24,25,26], and the combination and joint optimization of MEC and FL techniques is still an unsolved question. As for the existing MEC-assisted FL research [27,28,29], most previous work mainly focuses on optimizing resource allocation and the offloading strategy to reduce FL latency [27,28] or a weighted FL cost [29], while the improvement of FL learning performance is ignored. Moreover, the investigations in prior research [27,28,29] assume a perfect CSI system, whereas practical IIoT networks always exhibit imperfect CSI. How to incorporate channel state information into the system design of an edge-assisted FL network is still a challenge and motivates the work in this paper.
Therefore, in order to fill this research gap, in this paper, we propose an edge-assisted federated learning framework to effectively mitigate the straggler effect in IIoT networks and investigate how to decide the optimal offloading strategy for optimizing FL performance in IIoT networks. First, we formulate an offloading data size optimization problem with the objective of minimizing FL training loss while satisfying the latency constraint. Then, we propose an optimal offloading strategy to maximize the number of devices that can complete the FL process according to the perfect channel state information (CSI). Furthermore, we consider a practical IIoT scenario with imperfect CSI and propose a Q-learning-based offloading strategy to improve FL performance. Finally, the simulation results are provided to demonstrate the advantages of the proposed edge-assisted FL schemes compared to the benchmark schemes. The main contributions of this paper are summarized as follows:
  • We propose an edge-assisted federated learning framework in IIoT networks, which can mitigate the straggler effect and improve the training efficiency of FL;
  • We formulate an FL training loss minimization problem by optimizing the offloading data size under the latency constraint. We then derive a CSI-based offloading strategy to solve the optimization problem. In addition, we extend the edge-assisted FL framework into an imperfect CSI scenario and propose a Q-learning-based offloading strategy;
  • We provide numerical simulation results to validate the effectiveness of the proposed scheme.
The rest of this paper is organized as follows: The related work is summarized in Section 2. Section 3 discusses the system model and problem formulation. Section 4 derives two offloading strategies to solve the problem. Then, the simulation results and analyses are given in Section 5. Finally, Section 6 concludes this paper.

2. Related Work

2.1. FL over Wireless Network

There has been extensive research on performing efficient federated learning over the wireless network, including resource allocation [11,12,13,14], device schedules [15,16,17], and joint resource and device schedules [18,19,20,21,22]. Specifically for resource allocation, Yao et al. (2021) [11] developed a joint computation and power control problem to minimize the overall energy consumption of FL within constraint latency. Pham et al. (2022) [12] investigated UAV-enabled wireless-powered communications for FL networks and developed an iterative algorithm for energy-efficient FL. Luo et al. (2020) [13] proposed a cost-efficient hierarchical federated edge learning framework and particularly studied the joint edge association and resource allocation problem to simultaneously minimize the total FL time and energy consumption. Zhao et al. (2022) [14] proposed two novel bandwidth allocation schemes to optimize the FL network in practical communication scenarios, in which both instantaneous CSI and statistical CSI were considered. In addition, the authors proposed a particle swarm optimization method to effectively obtain the optimal bandwidth allocation strategy in statistical CSI scenarios.
As for the device schedule, Yang et al. (2020) [15] proposed and compared three different scheduling policies and theoretically derived their effectiveness in terms of convergence behavior. Zhang et al. (2022) [16] designed a novel optimized probabilistic device scheduling policy in FL to minimize communication time. Ren et al. (2020) [17] proposed a novel importance- and channel-aware scheduling policy in a federated learning framework in which both the channel state and update importance were considered. Furthermore, the authors then extended the scheduling policy into a multiple-device scenario to achieve faster model convergence and higher learning accuracy.
In [18,19,20,21,22], the joint resource allocation and device scheduling problem is well investigated. Specifically, Chen et al. (2020) [18] studied the joint learning, wireless resource allocation, and user selection problem to minimize FL training time. Chen et al. (2022) [19] proposed a jointly optimized communication efficiency and resource scheme for FL over wireless IoT networks and divided the original problem into two subproblems, client scheduling and resource allocation, to reduce the complexity. Chen et al. (2021) [20] provided a comprehensive study of FL performance over wireless networks and studied the joint user selection and resource allocation problem to minimize the training loss of FL. Shi et al. (2021) [21] formulated a joint bandwidth allocation and scheduling problem to maximize the accuracy of the trained model within the latency constraint. Moreover, Yu et al. (2022) [22] investigated the joint client selection and resource allocation problem to realize a tradeoff between learning performance and energy consumption. Different from the above research, this work considers the combination of MEC and FL techniques and proposes two offloading strategies in an edge-assisted FL framework to reduce training latency and mitigate the straggler effect of FL, thus improving learning performance.

2.2. Mobile Edge Computing

MEC has attracted great attention in recent years due to its edge computation capacity to perform computing tasks for local devices [23,24,25,26]. Bozorgchenani et al. (2021) [23] investigated the computation sharing problem in MEC, and a multi-objective optimization problem that jointly considered the energy consumption and task processing delay of devices was solved to obtain the optimal offloading decisions. Sheng et al. (2020) [24] proposed a multiuser partial computation offloading scheme to minimize the weighted sum energy consumption. Wang et al. (2019) [25] proposed a cooperative offloading framework to increase UE’s computing capacity and formulated an energy consumption minimization problem. Tang et al. (2022) [26] formulated a task offloading problem to minimize the long-term cost of MEC systems and accordingly proposed a model-free deep reinforcement learning-based algorithm to obtain the optimal offloading decisions of devices. Different from the above research, this work considers the optimal offloading strategy in MEC-assisted FL systems with the objective of minimizing the training loss of FL.

2.3. MEC-Assisted Federated Learning

There are several studies on combining MEC and FL technologies [27,28,29]. Ji et al. (2021) [27] proposed a threshold-based offloading strategy for edge-assisted FL systems and formulated a delay minimization problem to reduce the completion time of FL. Nguyen et al. (2022) [28] proposed a novel blockchain-based FL scheme in multi-server edge computing, and a system latency minimization problem was formulated by jointly considering offloading and wireless resource allocation. Zhao et al. (2022) [29] investigated an MEC-assisted hierarchical FL system and formulated a multi-objective optimization problem to jointly capture latency, energy consumption, and model accuracy. Different from the above research, this paper considers the optimal offloading strategy in MEC-assisted FL systems to improve the learning performance of FL, extends the scenario to the practical imperfect CSI case, and derives a Q-learning-based offloading strategy.
In relation to the above discussion, the studies in [11,12,13,14,15,16,17,18,19,20,21,22] only consider resource allocation and device scheduling in FL, and the strong computation power of edge servers in MEC is ignored. As for the existing research on task offloading in MEC systems [23,24,25,26], most existing offloading strategies aim to minimize energy consumption or task delay in MEC systems, which is not suitable for federated learning systems. Although MEC-assisted FL is studied in [27,28,29], the majority of the work aims to reduce the latency of FL or the weighted cost, while the optimization of FL performance is ignored. In addition, these studies assume that edge servers have the channel state information of devices, while in practical IIoT scenarios, perfect CSI is unknown to both edge servers and devices. To fill this research gap, in this paper, we consider the offloading strategy in MEC-assisted FL systems. Along a different line, we work on obtaining the optimal offloading strategy in both perfect and imperfect CSI scenarios to further improve FL performance.

3. System Model and Problem Formulation

3.1. Federated Learning Model

As shown in Figure 1, we consider an IIoT scenario consisting of two layers, i.e., the end layer and the edge layer. The end layer has $N$ IIoT devices, denoted by the set $\mathcal{N} = \{1, 2, \ldots, N\}$. Each device $n$ holds a local dataset $\mathcal{D}_n$ with $D_n$ samples for local training in the FL framework. The edge layer has an edge server to aggregate the local models from all IIoT devices. The goal of FL is to cooperatively update an optimal global model to minimize the global loss function $F(\omega)$:

$$\arg\min_{\omega} F(\omega) = \arg\min_{\omega} \sum_{n=1}^{N} \frac{D_n}{D} F_n(\omega_n) \quad (1)$$

where $D = \sum_{n \in \mathcal{N}} D_n$ denotes the total number of data samples of all IIoT devices, and $F_n(\omega_n)$ is the loss function of device $n$ over its data samples, i.e.,

$$F_n(\omega_n) = \frac{1}{D_n} \sum_{i=1}^{D_n} f(\omega, x_{n,i}, y_{n,i}) \quad (2)$$

where $(x_{n,i}, y_{n,i})$ is the input–output pair of the $i$-th data sample in device $n$, and $f(\omega, x_{n,i}, y_{n,i})$ is the loss function capturing the error of the local model, which may differ across learning tasks [15].
The IIoT devices and edge server cooperatively update the global model in each FL training round, which is detailed below:
1. Model broadcast: In round $t$, the edge server first broadcasts the global model $\omega^{t-1}$ to all IIoT devices. Note that the downlink time for model broadcast is not considered in our work, as in [13,30], because the edge server usually has higher transmit power and bandwidth;
2. Local updating: Each device then updates the received model for a certain number of local iterations using its local dataset:

$$\omega_n^t = \omega^{t-1} - \eta \nabla F_n(\omega^{t-1}) \quad (3)$$

where $\eta$ is the learning rate. The updated local model $\omega_n^t$ is then uploaded to the edge server for global aggregation;
3. Model aggregation: After receiving the local models from all devices, the edge server aggregates them into a global model:

$$\omega^t = \frac{1}{D} \sum_{n \in \mathcal{N}} D_n \omega_n^t \quad (4)$$

The updated global model $\omega^t$ is then broadcasted to all devices for the next round $t+1$ of training. The global model is iteratively updated according to (4).
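The update and aggregation steps above can be sketched in a few lines. This is a minimal illustration under assumed NumPy weight vectors and a toy gradient function (`grad_fn` is hypothetical), not the authors' implementation:

```python
import numpy as np

def local_update(w_global, grad_fn, eta=0.1, local_iters=5):
    # Local updating, Eq. (3): gradient steps on the local loss F_n.
    w = w_global.copy()
    for _ in range(local_iters):
        w = w - eta * grad_fn(w)
    return w

def aggregate(local_models, sample_counts):
    # Model aggregation, Eq. (4): data-size-weighted average of local models.
    D = sum(sample_counts)
    return sum((D_n / D) * w_n for w_n, D_n in zip(local_models, sample_counts))

# Toy example: each device pulls the global model toward its own optimum.
optima = [np.array([1.0, 1.0]), np.array([3.0, 3.0])]
w = np.zeros(2)
local_models = [local_update(w, lambda v, o=o: v - o) for o in optima]
w = aggregate(local_models, [100, 300])
```

Devices holding more data (here, 300 vs. 100 samples) pull the aggregate proportionally harder, exactly as the $D_n/D$ weights in (4) prescribe.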

3.2. Proposed Edge-Assisted FL Model

In a practical IIoT scenario, due to the heterogeneity of IIoT devices, some devices with limited resources (e.g., low local computation capacity) may have a longer local training time and, accordingly, a larger FL completion time, i.e., the straggler effect [27]. When these straggling devices fail to finish the FL training process within the latency constraint, the global model performance seriously decreases. Nevertheless, the edge server typically has stronger computation resources than IIoT devices; thus, it is desirable to employ the sufficient computation resources of the edge server to assist straggling devices with edge training. Specifically, in the proposed edge-assisted FL framework, a straggling device $n$ can choose to offload part of its local dataset $\mathcal{D}_n$, whose data size is denoted by $L_n$, $n \in \mathcal{N}$, to the edge server for edge training. Hence, the offloading time of device $n$ is calculated by:

$$T_n^{\text{off}} = \frac{L_n}{R_n} \quad (5)$$
where $R_n$ is the uplink rate from device $n$ to the edge server. Note that we consider FDMA for uplink transmission, in which each device is allocated an equal-bandwidth resource block. Therefore, the uplink rate is given by:

$$R_n = B_n \log_2\left(1 + \frac{P_n h_n}{\sigma_n^2}\right) \quad (6)$$
where $B_n$ and $P_n$ are the bandwidth and transmit power of device $n$, respectively, $h_n$ is the channel gain from device $n$ to the edge server, and $\sigma_n^2$ is the power of the additive white Gaussian noise (AWGN) at the edge server. After the offloading stage, the edge server (with the data offloaded from the $N$ devices) and each device (with its remaining data) simultaneously perform model training. Denoting the computing capacity of device $n$ as $f_n$, the local training time of device $n$ in our proposed edge-assisted FL framework is given by:

$$T_n^{\text{loc}} = \frac{I_l C_n (D_n - L_n)}{f_n} \quad (7)$$

where $I_l = \kappa \log(1/\theta)$ is the local iteration number with local model accuracy $\theta \in (0, 1)$ and learning-model-dependent constant $\kappa$ [13], and $C_n$ is the number of CPU cycles for computing one data sample of device $n$. Each device uploads its updated local model $\omega_n$ to the edge server for aggregation. Denoting the data size of the local model as $s_n$, the model upload time of device $n$ is:

$$T_n^{\text{upl}} = \frac{s_n}{R_n} = \frac{s_n}{B_n \log_2\left(1 + \frac{P_n h_n}{\sigma_n^2}\right)} \quad (8)$$
Note that the model training time in the edge server can be neglected due to its strong computation power [13]. When the edge server has received all local models from the devices, it aggregates the global model as:

$$\omega = \frac{1}{D}\left[\sum_{n \in \mathcal{N}} (D_n - L_n)\,\omega_n + \sum_{n \in \mathcal{N}} L_n\,\omega_e\right] \quad (9)$$
where $\omega_n$ is the local model of device $n$, and $\omega_e$ is the model trained at the edge server. Hence, the total time of device $n$ in our proposed edge-assisted FL framework is:

$$T_n = T_n^{\text{off}} + I_g\left(T_n^{\text{loc}} + T_n^{\text{upl}}\right) \quad (10)$$

where $I_g = \frac{\delta}{1-\theta}\log(1/\varepsilon)$ is the global aggregation number required to achieve global accuracy $\varepsilon$ [13]. The latency of each device in the proposed edge-assisted FL framework is also shown in Figure 2.
In order to ensure the efficiency of FL, each device must meet the latency constraint:

$$T_n = T_n^{\text{off}} + I_g\left(T_n^{\text{loc}} + T_n^{\text{upl}}\right) \le \Delta T, \quad \forall n \in \mathcal{N} \quad (11)$$
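Under the timing model of Eqs. (5)–(10), the total latency of a device can be evaluated directly. The sketch below uses illustrative parameter values only (all numbers are assumptions, with data sizes and CPU cycles expressed per generic data unit):

```python
import math

def uplink_rate(B_n, P_n, h_n, sigma2):
    # Eq. (6): Shannon-capacity uplink rate of device n.
    return B_n * math.log2(1 + P_n * h_n / sigma2)

def total_latency(L_n, D_n, R_n, I_l, I_g, C_n, f_n, s_n):
    t_off = L_n / R_n                      # Eq. (5): offloading time
    t_loc = I_l * C_n * (D_n - L_n) / f_n  # Eq. (7): local training time
    t_upl = s_n / R_n                      # Eq. (8): model upload time
    return t_off + I_g * (t_loc + t_upl)   # Eq. (10): total FL time

# Illustrative check: when I_g * I_l * C_n / f_n > 1 / R_n, offloading
# more data strictly reduces the total latency of the device.
t_no_offload = total_latency(0, 1000, 1e4, 5, 10, 100, 1e6, 1e4)
t_full_offload = total_latency(1000, 1000, 1e4, 5, 10, 100, 1e6, 1e4)
```

With these toy numbers, $T_n$ drops from 15 to about 10.1 time units when all data are offloaded, which is the mechanism the offloading strategy of Section 4 exploits.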

3.3. Problem Formulation

Based on the above subsections, in order to achieve a more accurate global model in our proposed edge-assisted FL framework, an offloading data size optimization problem is formulated, with the objective of minimizing the global loss function of FL within the total FL completion time constraint. The optimization problem can be written as follows:
$$\min_{\{L_1, \ldots, L_N\}} F(\omega) \quad (12)$$
$$\text{s.t.}\ \text{C1:}\ 0 \le L_n \le D_n, \quad \forall n \in \mathcal{N}, \quad (12a)$$
$$\text{C2:}\ T_n^{\text{off}} + I_g\left(T_n^{\text{loc}} + T_n^{\text{upl}}\right) \le \Delta T, \quad \forall n \in \mathcal{N} \quad (12b)$$
where constraint (12a) indicates the range of offloading data size of each device, and constraint (12b) is the latency constraint in FL of each device.

4. Proposed Offloading Strategy

It is obvious that the loss function in (12) may differ with the specific learning task, which makes Problem (12) intractable. Therefore, inspired by the fact that more available model parameters from devices in the training process can improve FL performance [19,31], we turn to maximizing the number of devices that can successfully finish model training and upload within the latency constraint. Problem (12) can be rewritten as:
$$\max_{\{L_1, \ldots, L_N\}} N_s \quad (13)$$
$$\text{s.t.}\ \text{C1:}\ 0 \le L_n \le D_n, \quad \forall n \in \mathcal{N}, \quad (13a)$$
$$\text{C2:}\ T_n^{\text{off}} + I_g\left(T_n^{\text{loc}} + T_n^{\text{upl}}\right) \le \Delta T, \quad \forall n \in \mathcal{N} \quad (13b)$$
where $N_s$ denotes the number of IIoT devices (collected in the set $\mathcal{N}_s$) that successfully upload their local models within the latency constraint. In the following two subsections, we propose two offloading strategies to solve Problem (13).

4.1. Offloading Strategy with Perfect CSI

When the edge server has the perfect CSI of each IIoT device in each training round, an offloading strategy can be derived from each device's CSI to solve Problem (13). To maximize the number of successful IIoT devices and thereby improve the training performance, as many IIoT devices as possible must finish model training and upload within the given delay constraint $\Delta T$. Specifically, in order to meet the delay constraint in (13b), each device can choose to offload an appropriate amount of local data to the edge server for edge training, thus reducing the total delay $T_n$. Note that the offloading strategy depends on the channel information and computation capacity of each device.

4.1.1. Feasibility Analysis

Before giving the offloading strategy, we first analyze the following two conditions:
  • Case 1: If $I_g\left(\frac{I_l C_n D_n}{f_n} + \frac{s_n}{R_n}\right) \le \Delta T$ holds, the optimal offloading size is $L_n^* = 0$. In this case, IIoT device $n$ can finish model training and upload within the FL latency constraint all by itself, without the help of the edge server. Therefore, it will not offload data to the edge server, for system efficiency;
  • Case 2: If $I_g\left(\frac{I_l C_n D_n}{f_n} + \frac{s_n}{R_n}\right) > \Delta T$ holds, IIoT device $n$ cannot finish model training and upload under the latency constraint and needs the help of the edge server for edge training.
Hereafter, we discuss the offloading strategy in Case 2, in which devices cannot finish the model training by themselves.

4.1.2. Offloading Strategy

Specifically in Case 2, combined with (5) and (11), each device needs to meet the following delay constraint:
$$\frac{L_n}{R_n} + I_g\left(\frac{I_l C_n (D_n - L_n)}{f_n} + \frac{s_n}{R_n}\right) \le \Delta T \quad (14)$$
from which we find that:
$$L_n\left(\frac{1}{R_n} - \frac{I_g I_l C_n}{f_n}\right) \le \Delta T - \frac{I_g I_l C_n D_n}{f_n} - \frac{I_g s_n}{R_n} \quad (15)$$
For device $n$ to complete the FL training exactly at the latency bound $\Delta T$, the required offloading size is:

$$\tilde{L}_n = \frac{\Delta T - \frac{I_g I_l C_n D_n}{f_n} - \frac{I_g s_n}{R_n}}{\frac{1}{R_n} - \frac{I_g I_l C_n}{f_n}} \quad (16)$$
Theorem 1.
The optimal offloading data size $L_n^*$ of device $n$ can be written as:

$$L_n^* = \begin{cases} 0, & \text{if } \frac{1}{R_n} \ge \frac{I_g I_l C_n}{f_n}, \\ \tilde{L}_n, & \text{if } \frac{1}{R_n} < \frac{I_g I_l C_n}{f_n} \text{ and } \tilde{L}_n \in [0, D_n], \\ 0, & \text{if } \frac{1}{R_n} < \frac{I_g I_l C_n}{f_n} \text{ and } \tilde{L}_n > D_n \end{cases} \quad (17)$$
Proof of Theorem 1.
We prove Theorem 1 in the following two conditions:
1. When $\frac{1}{R_n} \ge \frac{I_g I_l C_n}{f_n}$ holds (when equality holds, (15) reduces to $0 \le$ a negative number, which is infeasible, so $L_n^* = 0$ as well; we consider the strict case below), we find that:

$$L_n \le \frac{\Delta T - \frac{I_g I_l C_n D_n}{f_n} - \frac{I_g s_n}{R_n}}{\frac{1}{R_n} - \frac{I_g I_l C_n}{f_n}} = \tilde{L}_n \quad (18)$$

One can observe in (18) that the numerator satisfies $\Delta T - \frac{I_g I_l C_n D_n}{f_n} - \frac{I_g s_n}{R_n} < 0$ (Case 2) and the denominator satisfies $\frac{1}{R_n} - \frac{I_g I_l C_n}{f_n} > 0$, which leads to $\tilde{L}_n < 0$. This contradicts constraint (13a) and indicates that offloading data to the edge server cannot bring any delay reduction for device $n$; the completion time $T_n$ is monotonically increasing with respect to $L_n$, so the optimal offloading data size is $L_n^* = 0$. The reason is that the channel condition of device $n$ is poor relative to its computation capacity; thus, offloading data to the edge server would even increase the completion time $T_n$;
2. When $\frac{1}{R_n} < \frac{I_g I_l C_n}{f_n}$ holds, similarly to Condition 1, we find that $L_n \ge \tilde{L}_n > 0$. This indicates that the device prefers to offload its local data to the edge server, thus reducing the completion time. Note that two cases exist in Condition 2:
  • When $\tilde{L}_n \in [0, D_n]$, combined with (13a), the feasible set of $L_n$ is $\tilde{L}_n \le L_n \le D_n$, and the optimal offloading data size is $L_n^* = \tilde{L}_n$;
  • When $\tilde{L}_n > D_n$, even if the device offloads all its data to the edge server, it cannot complete the training task within the delay constraint; thus, the offloading data size is $L_n^* = 0$.
Combining the two conditions above, we accordingly derive Equation (17), which completes the proof. □
Based on the above discussion, we can obtain the edge-assisted offloading strategy in the perfect CSI scenario from Theorem 1; the strategy is detailed in Algorithm 1.
Algorithm 1: Edge-Assisted Offloading Strategy with Perfect CSI
Input: The system parameters $I_l$, $C_n$, $D_n$, $s_n$, $f_n$, $P_n$, $h_n$, $\sigma_n^2$, $\Delta T$
Output: Optimal offloading data size $L_n^*$ and the set $\mathcal{N}_s$.
Initialize: the set $\mathcal{N}_s = \mathcal{N}$.
for $n = 1 : N$ do
  if $I_g\left(\frac{I_l C_n D_n}{f_n} + \frac{s_n}{R_n}\right) \le \Delta T$ then
    set $L_n^* = 0$
  else
    if $\frac{1}{R_n} \ge \frac{I_g I_l C_n}{f_n}$ then
      delete $n$ from $\mathcal{N}_s$, and set $L_n^* = 0$
    else
      calculate $\tilde{L}_n$ according to (16)
      if $0 \le \tilde{L}_n \le D_n$ then
        set $L_n^* = \tilde{L}_n$
      else
        delete $n$ from $\mathcal{N}_s$, and set $L_n^* = 0$
      end if
    end if
  end if
end for
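Per device, Theorem 1 (and hence Algorithm 1) reduces to a constant-time case analysis. The sketch below is an assumed Python rendering of Eq. (17) with illustrative parameter values only; it returns the offloading size together with a flag indicating whether the device stays in $\mathcal{N}_s$:

```python
def optimal_offload(dT, I_g, I_l, C_n, f_n, R_n, D_n, s_n):
    """Closed-form offloading decision of Theorem 1 / Eq. (17)."""
    # Case 1: the device meets the deadline on its own -> no offloading.
    if I_g * (I_l * C_n * D_n / f_n + s_n / R_n) <= dT:
        return 0.0, True
    coeff = 1.0 / R_n - I_g * I_l * C_n / f_n
    if coeff >= 0:
        # Offloading cannot reduce latency (poor channel vs. computation).
        return 0.0, False
    # Eq. (16): minimum offloading size that meets the deadline exactly.
    L_tilde = (dT - I_g * I_l * C_n * D_n / f_n - I_g * s_n / R_n) / coeff
    if 0.0 <= L_tilde <= D_n:
        return L_tilde, True
    return 0.0, False  # even full offloading cannot meet the deadline
```

With the toy numbers $\Delta T = 12$, $I_g = 10$, $I_l = 5$, $C_n = 100$, $f_n = 10^6$, $R_n = 10^4$, $D_n = 1000$, $s_n = 10^4$, the device falls into Case 2 and offloads $\tilde{L}_n \approx 612$ data units, finishing exactly at the deadline.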

4.2. Q-Learning-Based Offloading Strategy with Imperfect CSI

In the above subsection, each device takes the optimal offloading strategy according to its computation capacity and its channel condition to the edge server, which incurs severe communication overhead due to channel estimation (e.g., dedicated feedback channels). Note that in practical implementations of FL, IIoT devices may not have information about the network environment and the channel quality in each training round [14]. Thus, the offloading strategy in the above subsection is not applicable. To address this challenge, we resort to reinforcement learning to obtain the optimal offloading strategy. Reinforcement learning [32,33] is characterized by the continuous interaction between agents and the environment: relying on the feedback of actual rewards or punishments, agents update their actions step by step. Therefore, in this subsection, we first model Problem (13) as a Markov decision process (MDP) and then propose a Q-learning framework-aided offloading strategy.

4.2.1. Q-Learning Framework

In the proposed Q-learning framework, each device is viewed as an independent agent and maintains its own Q-table. The external environment of each agent mainly consists of its computation capacity and channel condition. The learning process in which each agent interacts with the environment is modeled as an MDP, denoted by the 4-tuple $(\mathcal{S}, \mathcal{A}, \mathcal{R}, \gamma)$, where $\mathcal{S}$ is the state space, $\mathcal{A}$ is the action space, $\mathcal{R}$ is the set of rewards received when an agent takes an action and transitions from the current state to the next state, and $\gamma \in [0, 1]$ is the discount factor. We detail the state, action, and reward of our Q-learning framework in the following, which is also shown in Figure 3:
  • State: The set of states of all devices is defined as the discrete set $\mathcal{S} = \{S_1^t, \ldots, S_n^t, \ldots, S_N^t\}$, where $S_n^t \in \{s_1, s_2\}$ indicates the level of the model update time in FL of device $n$ at time step $t$. Specifically, the state of device $n$ at time step $t$ is defined as:

    $$S_n^t = \begin{cases} s_1, & T_n \le \Delta T, \\ s_2, & T_n > \Delta T, \end{cases} \quad (19)$$

    where $\Delta T$ is the maximum latency constraint of FL;
  • Action: The action set of the $N$ IIoT devices can be written as the discrete set $\mathcal{A} = \{A_1^t, \ldots, A_n^t, \ldots, A_N^t\}$, where $A_n^t$ is the action of device $n$ at time step $t$. The action taken by each agent at each epoch is defined as the offloading data size. As the offloading data size $L_n$ is a continuous variable, we equally divide the offloading data range from 0 to $D_n$ into $X$ values. The action of device $n$ can therefore be written as:

    $$A_n^t \in \{a_1, a_2, \ldots, a_x, \ldots, a_X\} \quad (20)$$

    where $a_x$ is the selected offloading data size of device $n$;
  • Reward: The immediate reward of the agent is determined by the state transition from state $S_n^t \in \mathcal{S}$ to the next state $S_n^{t+1} \in \mathcal{S}$ by taking action $A_n^t \in \mathcal{A}$. The design principle of the reward function of each agent is mainly based on the model update time: if the action is more beneficial for the agent (i.e., the model update time is closer to $\Delta T$), a higher reward value is received by the agent; otherwise, the agent receives a penalty value. Specifically, the reward function of agent $n$ is given by:

    $$r_n = \begin{cases} 1 - \frac{1}{k_n}(T_n - \Delta T)^2, & \text{if } T_n \le \Delta T, \\ C, & \text{otherwise}, \end{cases} \quad (21)$$

    where $k_n$ is a normalization coefficient, and $C$ is a negative constant.
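The reward rule of Eq. (21) can be written as a one-liner; the sketch below is an assumed illustration (the default values of $k_n$ and $C$ are placeholders):

```python
def reward(T_n, dT, k_n=100.0, C=-1.0):
    # Eq. (21): reward grows as the completion time T_n approaches the
    # deadline dT from below; deadline violations earn the penalty C (< 0).
    if T_n <= dT:
        return 1.0 - (T_n - dT) ** 2 / k_n
    return C
```

For example, with $k_n = 100$ a device finishing at $T_n = 7$ (deadline 10) earns 0.91, one finishing at $T_n = 5$ earns only 0.75, and any violation earns the fixed penalty.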

4.2.2. Training Process and Proposed Algorithm

During the training process, the agent first observes its current state from the environment and then takes an action according to the action selection mechanism. In this work, in order to balance the exploitation of the current best Q-value with the exploration of potentially better options during the training process of the proposed Q-learning framework, we adopt an $\varepsilon$-greedy policy in the action selection mechanism, i.e., a random action with probability $\varepsilon$ and the best action with probability $1 - \varepsilon$:

$$A^t = \begin{cases} \text{random action}, & \text{with probability } \varepsilon, \\ \arg\max_{A^t \in \mathcal{A}} Q(S^t, A^t), & \text{with probability } 1 - \varepsilon, \end{cases} \quad (22)$$
where $Q(S^t, A^t)$ denotes the Q-value of action $A^t$ taken by the agent under state $S^t$ at time step $t$. After taking the action, the agent receives a reward $r^{t+1}$ from the environment and transitions from state $S^t$ to $S^{t+1}$. Finally, the agent calculates the Q-value and updates its Q-table as follows:

$$Q(S^t, A^t) \leftarrow Q(S^t, A^t) + \alpha\left[r^{t+1} + \gamma \max_{A' \in \mathcal{A}} Q(S^{t+1}, A') - Q(S^t, A^t)\right] \quad (23)$$

where $\alpha \in (0, 1]$ denotes the learning rate, and $\gamma$ is the discount factor. Each agent maintains a Q-table after the training process, which determines the optimal action to take in each possible state after the whole learning procedure. The whole process is detailed in Algorithm 2.
Algorithm 2: Q-Learning-Based Offloading Strategy with Imperfect CSI
Input: System parameters $I_l$, $C_n$, $D_n$, $s_n$, $P_n$, $\sigma_n^2$, $\Delta T$.
Output: Optimal Q-table $Q(S, A)$.
Initialize: set $Q(S, A) = 0$, $\forall S \in \mathcal{S}$, $\forall A \in \mathcal{A}$; set $\delta = 1$;
repeat
  randomly select an initial state $S \in \mathcal{S}$ for each device;
  set $t = 1$;
  while $t \le \xi$ do
    observe the current state $S^t$;
    choose an action $A^t \in \mathcal{A}$ according to the $\varepsilon$-greedy policy in (22);
    observe the next state $S^{t+1} \in \mathcal{S}$;
    if $S^{t+1} = s_1$ then
      calculate $r^{t+1} = 1 - \frac{1}{k_n}(T_n - \Delta T)^2$ as in (21); break;
    else if $S^{t+1} = s_2$ then
      calculate $r^{t+1} = C$ as in (21);
    end if
    update the state $S^t \leftarrow S^{t+1}$;
    calculate the Q-value according to (23);
    set $t = t + 1$;
  end while
  set $\delta = \delta + 1$;
until $\delta \ge \Phi_{\max}$
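The core of Algorithm 2, i.e., the $\varepsilon$-greedy selection of Eq. (22) and the tabular update of Eq. (23), can be sketched as follows. This is a generic tabular Q-learning skeleton with assumed data structures, not the authors' code:

```python
import random
from collections import defaultdict

def epsilon_greedy(Q, state, actions, eps):
    # Eq. (22): explore with probability eps, otherwise exploit the best action.
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # Eq. (23): one-step temporal-difference update of the Q-table.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Minimal usage: one agent, two states (s1: deadline met, s2: violated).
Q = defaultdict(float)
q_update(Q, "s2", 0, 1.0, "s1", actions=[0, 1])
```

Each device (agent) would keep its own such table, calling `epsilon_greedy` once per time step and `q_update` after observing the resulting state and reward, as the loop in Algorithm 2 prescribes.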

4.2.3. Complexity Analysis

First, the computational complexity of Algorithm 1 is analyzed as follows: in Algorithm 1, each device determines its optimal offloading data size directly from Equation (17), so the computational complexity of Algorithm 1 is $O(N)$. Then, the computational complexity of Algorithm 2 is analyzed as follows: the maximum iteration number of the proposed Q-learning algorithm is $\Phi_{\max}$, and the maximum number of time steps in each iteration is $\xi$; the maximum number of update operations is therefore $\xi \Phi_{\max}$. Taking all $N$ agents into account, the computational complexity of Algorithm 2 is $O(N \xi \Phi_{\max})$. We note that, although the computational complexity of the proposed Q-learning-based offloading strategy in Algorithm 2 is higher than that of Algorithm 1, it does not need channel estimation in the implementation phase; thus, the communication overhead is effectively reduced. In addition, the proposed Q-learning-based offloading strategy can achieve similar performance to the optimal benchmark scheme in terms of FL accuracy, as will be shown in Section 5.

5. Simulation Results and Analysis

In this section, we evaluate the performance of the proposed edge-assisted offloading strategy in the IIoT system through experimental simulations, where Python 3.7 was used as the simulation tool and the learning framework was PyTorch 1.4.0. The proposed algorithm was executed on a computer with an Intel i5-7300HQ CPU @ 2.5 GHz and 8 GB of memory. First, we provide the settings of the simulation parameters. Then, the performance comparisons of the proposed algorithm and the benchmark schemes are introduced and analyzed.

5.1. Simulation Setting

The number of IIoT devices N was set to 50, and the wireless channel was modeled as h_n = g_n ϖ_n, with Rayleigh fading and shadowing considered, where g_n is the frequency- and distance-dependent large-scale fading effect that consists of path loss and shadowing, and ϖ_n is the shadowing standard deviation [34]. The number of CPU cycles for training one data sample was set as C_n = 4 × 10^5 cycles/sample [22]. The local and edge accuracies were set to 0.1 and 0.5, respectively [29]. The other relevant parameters used for the wireless setting are listed in Table 1 [29,34].
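As an illustration of how the wireless parameters above translate into an uplink rate, the following sketch combines the stated path-loss model (140.7 + 36.7 log10(d [km])), a 4 dB log-normal shadowing term, and Rayleigh small-scale fading into a Shannon-rate estimate. The way these terms are combined and the helper name `uplink_rate` are our assumptions for illustration, not the paper's exact channel implementation.

```python
import math
import random

def uplink_rate(d_km, bandwidth_hz=1e6, tx_power_w=0.5, noise_dbm=-20.0):
    """Shannon-rate estimate for one device at distance d_km (in km)."""
    path_loss_db = 140.7 + 36.7 * math.log10(d_km)   # Table 1 path-loss model
    shadowing_db = random.gauss(0.0, 4.0)            # shadowing, 4 dB std dev
    small_scale = random.expovariate(1.0)            # Rayleigh fading: |h|^2 ~ Exp(1)
    channel_gain = small_scale * 10 ** (-(path_loss_db + shadowing_db) / 10)
    noise_w = 10 ** (noise_dbm / 10) / 1000          # convert dBm to watts
    snr = tx_power_w * channel_gain / noise_w
    return bandwidth_hz * math.log2(1 + snr)         # achievable rate, bits/s
```

With the default arguments matching Table 1 (B_n = 1 MHz, P_n = 0.5 W, σ_n² = −20 dBm), doubling the bandwidth doubles the rate under this model, which is the trend discussed for Figure 7 below.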
As in [35], we used the well-known MNIST dataset to evaluate the learning performance of the proposed strategy. The MNIST dataset contains a training set of 60,000 samples and a test set of 10,000 samples of handwritten digits from 0 to 9, and it is widely used in the related literature [13,14,15,16,17]. The MNIST dataset was evaluated under a non-IID data distribution. Specifically, the data were first sorted by digit label and divided into 200 shards of size 300, and four shards were assigned to each of the 50 IIoT devices; thus, each device only had examples of, at most, four digits. The learning model was a convolutional neural network (CNN) with two 5 × 5 convolution layers, a fully connected layer with 512 units and ReLU activation, and a final softmax output layer. Furthermore, to evaluate the effectiveness of the proposed edge-assisted FL framework and Q-learning-based offloading strategy, we considered our proposed scheme and two benchmark schemes for comparison, listed as follows:
  • Proposed Q-learning-based offloading strategy (PQB-OS): In our proposed Q-learning-based offloading strategy, each IIoT device dynamically adjusts its offloading data size to satisfy the total delay constraint according to Algorithm 2;
  • Proposed offloading strategy with perfect CSI (POS-CSI) [27]: In this benchmark scheme, each IIoT device can choose the optimal offloading strategy based on the perfect CSI according to Algorithm 1, which can be seen as the optimal solution;
  • FL without an edge-assisted framework (FL w.o.) [35]: In this benchmark scheme, each IIoT device can only perform FL without the help of the edge server, which is referred to as the standard FL method.
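The non-IID partition described above (sort by label, cut into 200 shards of 300 samples, deal 4 shards to each of the 50 devices) can be sketched as follows. Synthetic labels stand in for the MNIST training set, and the function name is illustrative.

```python
import random

def non_iid_partition(labels, num_devices=50, shards_per_device=4):
    """Sort by label, cut into equal shards, assign shards_per_device shards per device."""
    num_shards = num_devices * shards_per_device          # 200 shards
    shard_size = len(labels) // num_shards                # 300 samples per shard
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    shards = [order[k * shard_size:(k + 1) * shard_size] for k in range(num_shards)]
    random.shuffle(shards)                                # random shard-to-device assignment
    return [sum(shards[k * shards_per_device:(k + 1) * shards_per_device], [])
            for k in range(num_devices)]

# Synthetic stand-in for the 60,000 MNIST training labels
labels = [i % 10 for i in range(60000)]
device_indices = non_iid_partition(labels)
```

Each device ends up with 1200 sample indices drawn from at most four digit classes, which is what makes the resulting local distributions non-IID.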

5.2. Results and Discussion

Figure 4 describes the test accuracy of the proposed scheme and the two benchmark schemes versus communication rounds with non-IID data distribution. One can observe that the test accuracy of all schemes converges as the number of communication rounds increases. In addition, both the PQB-OS and POS-CSI schemes outperform the FL w.o. scheme, improving accuracy by about 4.7% and 10.6%, respectively. The reason is that, in these two schemes, IIoT devices can offload partial data to the edge server and thereby reduce the FL time T_n to satisfy the latency constraint, which increases the number of participating devices N_s and thereby improves the test accuracy. These phenomena demonstrate the effectiveness of the edge-assisted FL framework and the offloading strategy in these two edge-assisted schemes.
The training loss comparison of the proposed scheme and the two benchmark schemes versus communication rounds is shown in Figure 5. As expected, with the increasing number of communication rounds, the training loss of all three schemes decreases until convergence. In addition, it is clear that the proposed PQB-OS scheme performs better than the FL w.o. scheme. This is because, in the proposed PQB-OS scheme, devices with a long training latency can offload partial data to the edge server, thus reducing their FL completion time. Therefore, the number of participating devices increases and the training loss is reduced, which is consistent with the previous experiment. Moreover, we observe that POS-CSI always outperforms the proposed PQB-OS scheme. The reason is that the POS-CSI scheme is based on perfect CSI and can be regarded as the optimal offloading strategy.
Next, in Figure 6, we varied the latency constraint ΔT to see its effect on test accuracy. The latency constraint was increased from 30 to 90 s. Overall, with the increase in the latency constraint ΔT, the test accuracy of the proposed PQB-OS scheme and the two benchmark schemes shows an increasing trend. This is because, as the latency constraint ΔT becomes larger, all devices have a higher probability of finishing model training and uploading within the latency constraint threshold. Hence, more devices can successfully send their models to the edge server for aggregation, thereby improving the learning performance. Moreover, the proposed PQB-OS scheme and the benchmark POS-CSI scheme always show a higher test accuracy than the FL w.o. scheme over a large range of ΔT. The reason is that, in these two schemes, IIoT devices can offload partial data to the edge server to reduce the model training time, and thus have a higher probability of finishing the FL process within the latency constraint. This phenomenon further demonstrates the effectiveness of the proposed edge-assisted FL framework and the proposed Q-learning-based offloading scheme.
In addition, we changed the bandwidth B_n to see its influence on test accuracy. The corresponding results are shown in Figure 7. Apparently, the POS-CSI scheme has the best test accuracy, while the FL w.o. scheme displays the worst. Furthermore, as the bandwidth B_n becomes larger, the test accuracy of all three schemes shows an increasing trend. To explain, as B_n increases, each device has a higher transmission rate, and the corresponding upload latency is therefore reduced, so more devices can successfully upload their models to the edge server within the latency constraint.
Next, in Figure 8, we adjusted the transmit power P_n of each device to see its influence on test accuracy. We find that both the POS-CSI and the proposed PQB-OS schemes are superior to the FL w.o. scheme at different values of transmit power. Furthermore, with the increase in transmit power P_n, all schemes show an increasing trend in FL test accuracy. The reason is that all devices have a higher uplink rate, and the model upload time is thereby reduced. In addition, in the two edge-assisted schemes, each device can offload partial data to the edge server with lower latency, which further reduces the total latency. Overall, the phenomena in Figure 8 demonstrate the effectiveness of the proposed Q-learning-based offloading strategy for enhancing learning accuracy.
Finally, in Figure 9, we compared the accuracy of the proposed offloading strategy to the benchmark schemes by varying the number of data samples D_n from 180 to 240. Our proposed Q-learning-based offloading strategy always achieves higher test accuracy than the FL w.o. scheme, which validates the effectiveness of the proposed edge-assisted FL framework as well as the proposed offloading strategy. In addition, with the increase in D_n, the accuracy of all schemes presents a decreasing trend. The reason is that each device needs more time for local training; thus, T_n^loc and T_n become larger accordingly, the number of devices N_s that satisfy the latency constraint ΔT decreases, and learning performance is degraded. Furthermore, the proposed PQB-OS scheme can achieve an accuracy similar to the POS-CSI scheme with lower practical communication overhead, which demonstrates the practicality and effectiveness of the proposed Q-learning-based offloading strategy.

6. Conclusions

In this paper, we proposed an edge-assisted federated learning framework over the Industrial IoT by combining it with an MEC technique. In the proposed framework, each straggling IIoT device can offload partial local data to the edge server for edge training and thereby mitigate the straggler problem of FL. To obtain the optimal offloading strategy, we formulated and solved an FL loss function minimization problem under latency and resource constraints. In addition, we extended the analysis to a practical imperfect CSI scenario and proposed a Q-learning-based offloading strategy to solve the problem accordingly. Finally, the numerical results show that our proposed Q-learning-based offloading strategy can effectively improve the test accuracy of FL by about 4.7% compared to the benchmark FL w.o. scheme. Furthermore, the proposed Q-learning-based scheme achieves performance similar to the POS-CSI scheme and always outperforms the benchmark FL w.o. scheme under different system parameters, which validates the effectiveness of the proposed edge-assisted FL framework and the Q-learning-based offloading strategy. In future work, we will jointly study heterogeneous resource allocation, such as bandwidth and transmit power, as well as the computation capacity of devices in edge-assisted FL frameworks, to further improve FL learning performance.

Author Contributions

Conceptualization, S.W. and L.Z.; methodology, S.W.; software, S.W.; validation, S.W., H.X. and L.Z.; formal analysis, S.W.; investigation, S.W.; resources, L.Z.; data curation, S.W.; writing—original draft preparation, S.W.; writing—review and editing, L.Z.; visualization, S.W.; supervision, H.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported, in part, by the National Natural Science Foundation of China under Grant 62101174; in part by the Hebei Natural Science Foundation under Grant F2022402001, Grant A2020402013, and Grant F2021402005; in part by the Open Fund of Chongqing Engineering Research Center of Intelligent Sensing Technology and Microsystem under Grant D2021337; and in part by the Collaborative Education Project of Industry–Academia Partnership from Ministry of Education of China under Grant 202102250008.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, K.; Zhu, Y.; Maharjan, S.; Zhang, Y. Edge intelligence and blockchain empowered 5G beyond for the industrial Internet of Things. IEEE Netw. 2019, 33, 12–19.
  2. Zhou, H.; She, C.; Deng, Y.; Dohler, M.; Nallanathan, A. Machine learning for massive industrial internet of things. IEEE Wirel. Commun. 2021, 28, 81–87.
  3. Zhang, L.; Zhao, H.; Hou, S.; Zhao, Z.; Xu, H.; Wu, X.; Wu, Q.; Zhang, R. A survey on 5G millimeter wave communications for UAV-assisted wireless networks. IEEE Access 2019, 7, 117460–117504.
  4. Sun, W.; Liu, J.; Yue, Y. AI-enhanced offloading in edge computing: When machine learning meets industrial IoT. IEEE Netw. 2019, 33, 68–74.
  5. Shen, X.S.; Huang, C.; Liu, D.; Xue, L.; Zhuang, W.; Sun, R.; Ying, B. Data management for future wireless networks: Architecture, privacy preservation, and regulation. IEEE Netw. 2021, 35, 8–15.
  6. Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated learning: Challenges, methods, and future directions. IEEE Signal Process. Mag. 2020, 37, 50–60.
  7. Nguyen, D.C.; Ding, M.; Pathirana, P.N.; Seneviratne, A.; Li, J.; Niyato, D.; Poor, H.V. Federated learning for industrial internet of things in future industries. IEEE Wirel. Commun. 2021, 28, 192–199.
  8. Vu, T.T.; Ngo, D.T.; Ngo, H.Q.; Dao, M.N.; Tran, N.H.; Middleton, R.H. Joint resource allocation to minimize execution time of federated learning in cell-free massive MIMO. IEEE Internet Things J. 2022, 9, 21736–21750.
  9. Tran, T.X.; Pompili, D. Joint task offloading and resource allocation for multi-server mobile-edge computing networks. IEEE Trans. Veh. Technol. 2019, 68, 856–868.
  10. Abbas, N.; Zhang, Y.; Taherkordi, A.; Skeie, T. Mobile edge computing: A survey. IEEE Internet Things J. 2018, 5, 450–465.
  11. Yao, J.; Ansari, N. Enhancing federated learning in fog-aided IoT by CPU frequency and wireless power control. IEEE Internet Things J. 2021, 8, 3438–3445.
  12. Pham, Q.V.; Le, M.; Huynh-The, T.; Han, Z.; Hwang, W.J. Energy-efficient federated learning over UAV-enabled wireless powered communications. IEEE Trans. Veh. Technol. 2022, 71, 4977–4990.
  13. Luo, S.; Chen, X.; Wu, Q.; Zhou, Z.; Yu, S. HFEL: Joint edge association and resource allocation for cost-efficient hierarchical federated edge learning. IEEE Trans. Wirel. Commun. 2020, 19, 6535–6548.
  14. Zhao, Z.; Xia, J.; Fan, L.; Lei, X.; Karagiannidis, G.K.; Nallanathan, A. System optimization of federated learning networks with a constrained latency. IEEE Trans. Veh. Technol. 2022, 71, 1095–1100.
  15. Yang, H.H.; Liu, Z.; Quek, T.Q.; Poor, H.V. Scheduling policies for federated learning in wireless networks. IEEE Trans. Commun. 2020, 68, 317–333.
  16. Zhang, M.; Zhu, G.; Wang, S.; Jiang, J.; Liao, Q.; Zhong, C.; Cui, S. Communication-efficient federated edge learning via optimal probabilistic device scheduling. IEEE Trans. Wirel. Commun. 2022, 21, 8536–8551.
  17. Ren, J.; He, Y.; Wen, D.; Yu, G.; Huang, K.; Guo, D. Scheduling for cellular federated edge learning with importance and channel awareness. IEEE Trans. Wirel. Commun. 2020, 19, 7690–7703.
  18. Chen, M.; Poor, H.V.; Saad, W.; Cui, S. Convergence time optimization for federated learning over wireless networks. IEEE Trans. Wirel. Commun. 2020, 20, 2457–2471.
  19. Chen, H.; Huang, S.; Zhang, D.; Xiao, M.; Skoglund, M.; Poor, H.V. Federated learning over wireless IoT networks with optimized communication and resources. IEEE Internet Things J. 2022, 9, 16592–16605.
  20. Chen, M.; Yang, Z.; Saad, W.; Yin, C.; Poor, H.V.; Cui, S. A joint learning and communications framework for federated learning over wireless networks. IEEE Trans. Wirel. Commun. 2021, 20, 269–283.
  21. Shi, W.; Zhou, S.; Niu, Z.; Jiang, M.; Geng, L. Joint device scheduling and resource allocation for latency constrained wireless federated learning. IEEE Trans. Wirel. Commun. 2021, 20, 453–467.
  22. Yu, L.; Albelaihi, R.; Sun, X.; Ansari, N.; Devetsikiotis, M. Jointly optimizing client selection and resource management in wireless federated learning for internet of things. IEEE Internet Things J. 2022, 9, 4385–4395.
  23. Bozorgchenani, A.; Mashhadi, F.; Tarchi, D.; Monroy, S.A.S. Multi-objective computation sharing in energy and delay constrained mobile edge computing environments. IEEE Trans. Mob. Comput. 2021, 20, 2992–3005.
  24. Sheng, M.; Wang, Y.; Wang, X.; Li, J. Energy-efficient multiuser partial computation offloading with collaboration of terminals, radio access network, and edge server. IEEE Trans. Commun. 2020, 68, 1524–1537.
  25. Wang, K.; Huang, P.Q.; Yang, K.; Pan, C.; Wang, J. Unified offloading decision making and resource allocation in ME-RAN. IEEE Trans. Veh. Technol. 2019, 68, 8159–8172.
  26. Tang, M.; Wong, V.W.S. Deep reinforcement learning for task offloading in mobile edge computing systems. IEEE Trans. Mob. Comput. 2022, 21, 1985–1997.
  27. Ji, Z.; Chen, L.; Zhao, N.; Chen, Y.; Wei, G.; Yu, F.R. Computation offloading for edge-assisted federated learning. IEEE Trans. Veh. Technol. 2021, 70, 9330–9344.
  28. Nguyen, D.C.; Hosseinalipour, S.; Love, D.J.; Pathirana, P.N.; Brinton, C.G. Latency optimization for blockchain-empowered federated learning in multi-server edge computing. IEEE J. Sel. Areas Commun. 2022, 40, 3373–3390.
  29. Zhao, T.; Li, F.; He, L. DRL-based joint resource allocation and device orchestration for hierarchical federated learning in NOMA-enabled industrial IoT. IEEE Trans. Ind. Inform. 2022.
  30. Yang, Z.; Chen, M.; Saad, W.; Hong, C.S.; Shikh-Bahaei, M. Energy efficient federated learning over wireless communication networks. IEEE Trans. Wirel. Commun. 2021, 20, 1935–1949.
  31. Zeng, Q.; Du, Y.; Huang, K.; Leung, K.K. Energy-efficient resource management for federated edge learning with CPU-GPU heterogeneous computing. IEEE Trans. Wirel. Commun. 2021, 20, 7947–7962.
  32. Min, M.; Xiao, L.; Chen, Y.; Cheng, P.; Wu, D.; Zhuang, W. Learning-based computation offloading for IoT devices with energy harvesting. IEEE Trans. Veh. Technol. 2019, 68, 1930–1941.
  33. Zhang, L.; Ma, X.; Zhuang, Z.; Xu, H.; Sharma, V.; Han, Z. Q-learning aided intelligent routing with maximum utility in cognitive UAV swarm for emergency communications. IEEE Trans. Veh. Technol. 2022, 72, 3707–3723.
  34. Zhang, L.; Wang, H.; Xue, H.; Zhang, H.; Liu, Q.; Niyato, D.; Han, Z. Digital twin-assisted edge computation offloading in industrial Internet of Things with NOMA. TechRxiv 2022.
  35. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A.Y. Communication-efficient learning of deep networks from decentralized data. Artif. Intell. Stat. 2017, 54, 1273–1282.
Figure 1. An illustration of the system model and the proposed edge-assisted FL framework.
Figure 2. An illustration of system latency in the proposed edge-assisted FL framework.
Figure 3. The Q-learning framework for proposed edge-assisted FL.
Figure 3. The Q-learning framework for proposed edge-assisted FL.
Electronics 12 01706 g003
Figure 4. Test accuracy of proposed scheme and two benchmark schemes for MNIST dataset under non-IID distribution, when Δ T = 50   s , B n = 1   MHz , P n = 0.5   W , and D n = 200 .
Figure 5. Training loss of proposed scheme and two benchmark schemes for MNIST dataset under non-IID distribution, when Δ T = 50   s , B n = 1   MHz , P n = 0.5   W , and D n = 200 .
Figure 6. Test accuracy performance of different schemes with different latency constraint Δ T with MNIST dataset for non-IID data distribution, when B n = 1   MHz , P n = 0.5   W , and D n = 200 .
Figure 7. Test accuracy performance of different schemes with different bandwidth B n with MNIST dataset for non-IID data distribution, when Δ T = 50   s , P n = 0.5   W , and D n = 200 .
Figure 8. Test accuracy performance of different schemes with different power P n with MNIST dataset for non-IID data distribution, when Δ T = 50   s , B n = 1   MHz , and D n = 200 .
Figure 9. Test accuracy versus data sample number of different schemes with MNIST dataset for non-IID data distribution, when Δ T = 50   s , B n = 1   MHz , and P n = 0.5   W .
Table 1. Simulation parameters.
Parameter | Value
Transmit power of device, P_n | 0.5 W
Bandwidth of device, B_n | 1 MHz
Variance of the AWGN, σ_n² | −20 dBm
Upload model data size, s_n | 100 Kbits
Number of data samples, D_n | 200
Shadowing standard deviation, ϖ_n | 4 dB
Path loss model | 140.7 + 36.7 log10(d [km])
