Trajectory and Phase Shift Optimization for RIS-Equipped UAV in FSO Communications with Atmospheric and Pointing Error Loss

Jia, Haocheng; Chen, Gaojie; Huang, Chong; Dang, Shuping; Chambers, Jonathon A.

doi:10.3390/electronics12204275

by Haocheng Jia ¹, Gaojie Chen ², Chong Huang ²,*, Shuping Dang ³ and Jonathon A. Chambers ¹

¹ School of Engineering, University of Leicester, Leicester LE1 7RH, UK

² Institute for Communication Systems (ICS), 5GIC & 6GIC, University of Surrey, Guildford GU2 7XH, UK

³ Department of Electrical and Electronic Engineering, University of Bristol, Bristol BS8 1UB, UK

* Author to whom correspondence should be addressed.

Electronics 2023, 12(20), 4275; https://doi.org/10.3390/electronics12204275

Submission received: 13 September 2023 / Revised: 7 October 2023 / Accepted: 10 October 2023 / Published: 16 October 2023

(This article belongs to the Special Issue Mobile Networking: Latest Advances and Prospects)

Abstract:

This paper proposes a new framework for reconfigurable intelligent surface (RIS)-equipped unmanned aerial vehicles (UAVs) in free-space optical (FSO) communication. To ensure practicality, we consider atmospheric loss caused by fog, which leads to an inhomogeneous medium for laser propagation. In addition, we incorporate the pointing error loss caused by the power fraction on the photodetector (PD) into the system and derive a closed-form expression for the elliptical beam footprint in the pointing error loss. We then propose a leading angle assisted particle swarm optimization (PSO) method to efficiently optimize the numerical results of pointing error loss. Furthermore, after obtaining these numerical results as a precondition, the UAV trajectory is optimized using the proximal policy optimization (PPO) method to achieve the maximum average capacity. Numerical simulations demonstrate that the proposed optimization method achieves greater efficiency and accuracy compared to the decode-and-forward (DF) relay and deep Q-learning (DQN) methods.

Keywords:

free space optical communications; reconfigurable intelligent surfaces; unmanned aerial vehicle; particle swarm optimization; deep reinforcement learning

1. Introduction

Free space optical (FSO) communication has been researched extensively as a potential technology for sixth generation (6G) wireless communication networks [1]. Unlike radio frequency (RF) communication, FSO communication uses a laser beam to carry information, which brings several benefits, including a high transmission rate, resistance to eavesdropping owing to the directivity of the beam, and freedom from spectrum crowding [2]. However, two problems affect the performance of FSO communication. The first is that FSO communication requires a line-of-sight (LOS) link between the light source (LS) and the photodetector (PD), whereas in practice this link can be blocked by obstacles [3]. The second is its susceptibility to a varying propagation environment caused by rain, fog, and other low-visibility atmospheric phenomena [4].

To deal with non-LOS scenarios and maximize the throughput, the authors in [5] utilized a fixed-position relay. Usually, this type of relay is deployed on a building to assist non-LOS transmission. Following that, inter-relay cooperation was proposed in [6] to reduce the outage probability of FSO communication. However, because the relay is fixed in position, the link between the transmitter and receiver lacks flexibility. To tackle this issue, the UAV was introduced as an operating platform to assist communication and improve the flexibility and efficiency of the system. For example, in [7], the authors considered a UAV decode-and-forward relay in a downlink maritime communication scenario and solved the average achievable rate maximization problem. Furthermore, the authors in [8] proposed a multi-UAV-assisted solution to maximize the capacity of a cognitive network. Motivated by the use of UAVs in RF communications, existing works have also considered UAV relays for improving FSO communication. For example, in [9], the authors introduced a dual-hop space-air-ground integrated RF/FSO network in which, with the assistance of a UAV, the satellite could link the terrestrial users with low outage probability. In [10], a similar scenario was proposed for multiple users on the ground using quadrature amplitude modulation. Besides facilitating the link between space and air, the UAV was utilized in [11] to assist FSO communication between a non-LOS ground transmitter and receiver deployed in an urban area. Therefore, the deployment of UAVs for FSO communication is a valid solution for high-data-rate applications, especially under complex circumstances. However, power consumption and additional signal processing are significant problems for a relay-equipped UAV platform [12], which limits both performance and UAV operating endurance.

To reduce power consumption and processing delay, reconfigurable intelligent surface (RIS) technology has been widely utilized in wireless networks, because an RIS can control wireless signal transmission through passive phase adjustment at low cost [13]. In [14], the feasibility of applying the RIS to new-generation RF communication was described, where the channel is turned into a controllable system to which optimization can be applied. Furthermore, in [15], the phase shifts of the RIS were designed and improved by the joint transmit beamforming method. Beyond RF communication, the work in [16] indicated that the RIS and its rotational degrees of freedom could also be utilized in FSO communication. Moreover, the work in [17] showed that the RIS can be regarded as a mirror with a phase shift function in FSO communication, which eliminates the time delay and power consumption caused by signal processing. The works in [16,17] showed that an RIS mounted at a fixed position could solve the LOS requirement of FSO communication. However, the flexibility and coverage of a fixed RIS are limited. Therefore, in [18], a UAV was deployed to improve the flexibility and coverage of FSO communication. The work in [19] indicated that the RIS can be mounted on a UAV platform to capture the advantages of both and maximize the system throughput. However, the UAV was set to hover at a fixed position, without considering a dynamic trajectory. To further clarify the novelty of this work, the main differences between the proposed work and existing works are given in Table 1.

In order to mitigate atmospheric loss and pointing error loss, UAV trajectory optimization is considered for UAV-assisted FSO communication. The work in [20] shows that machine learning can be utilized in communication scenarios. In [21], a traditional UAV trajectory optimization method based on geometric analysis was proposed to find the shortest path and the best cellular network coverage in a connectivity-constrained system. Furthermore, in [22], a 2D UAV trajectory was designed by conventional non-convex optimization under different conditions such as speed limitation, hovering duration requirement, and coverage requirement. Because learning methods identify trends and patterns in data much more easily, their efficiency and accuracy can be improved continuously by gaining experience [23]. Therefore, when the optimization problem is non-convex or NP-hard, an AI or learning-powered method is a potential option for UAV trajectory optimization [24]. For example, in [25], trajectory design and power allocation based on typical machine learning optimization were formulated to maximize the instantaneous sum rate of multiple UAVs. However, conventional machine learning is limited when processing raw data such as the UAV flight status [26]. The deep reinforcement learning (DRL) method, which is constructed from multiple processing layers to learn representations of data, is therefore more efficient for trajectory optimization in wireless communication scenarios. DRL algorithms have been proposed to solve the instability problem and to improve data efficiency. In [27], a policy-based learning method was proposed in which an estimator of the policy gradient is added and stochastic gradient ascent is exploited to achieve the maximum rewards. These features motivated the choice of our optimization method.
In [28], a DRL method was proposed as an online altitude control and scheduling policy for UAVs to obtain instantaneous channel state information (CSI) in real time along with any adjustment to the deployment altitude. Moreover, in [29], a multi-UAV network was constructed in which the problem is non-convex with sophisticated states and an individual UAV may not know the reward functions of the other UAVs; DRL was utilized there as an improved clip and count-based algorithm for the multi-agent scenario, which enables each UAV to select its policy in a distributed manner. Beyond altitude control and multi-agent optimization, in [30], a DRL method was proposed to optimize the throughput of new-generation terahertz communication, including the UAV-to-ground-station association, transmit power, and trajectory optimization problems. In [31], DRL showed its high applicability to a complicated scenario in which multi-UAV trajectory planning is required and both the geographical fairness of the user equipment and the overall energy consumption need to be optimized.

Motivated by these works, the RIS-equipped UAV for FSO communication under the influence of atmospheric and pointing error loss will be investigated in this paper. To overcome the physical impacts and improve performance, the optimization proposed for pointing error loss uses a novel leading angle assisted particle swarm optimization (PSO) method, which is competent in efficiently finding the optimal continuous phase shifts. Furthermore, we consider a PPO optimization for determining the UAV trajectory in FSO communication. The main contributions of this work are as follows:

We propose a new RIS-equipped UAV FSO communication technique in a non-LOS scenario with atmospheric and pointing error loss and derive a closed-form expression with the laser beam incident upon the PD having non-orthogonality status. An ellipse beam footprint geometrical model is considered to express the power density on the PD.
Based on the proposed framework, we derive the closed-form expression for the physical impacts and propose a novel leading angle assisted PSO method to optimize the pointing error loss as a precondition. Then, the PPO method is introduced to optimize the UAV trajectory, which reduces the complicated physical impacts and maximizes the average capacity.
Simulation results verify that the leading angle assisted PSO and PPO methods are efficient and indicate that the UAV tends to fly to the area with the maximum value of average capacity and avoids the fog. Furthermore, the proposed RIS-assisted strategy improves the average capacity significantly compared to the conventional decode-and-forward (DF) relay-assisted UAV networks.

The remainder of this paper is organized as follows. In Section 2, the system and problem formulation are described, where we state our aim and model the FSO channel coefficients. After confirming the optimization target, Section 3 presents the first optimization, which uses the PSO method to optimize the phase shifts of the RIS and minimize the impact of pointing error loss. Then, Section 4 presents the second optimization, which uses the PPO learning method to optimize the UAV trajectory. The details of the UAV operation under these two optimizations, with analysis and discussion, are shown in Section 5. Finally, the conclusion of the work is given in Section 6.

Notations: We use lower-case, boldface lower-case letters to represent scalars and vectors, respectively. Furthermore,

\| \cdot \|

is the norm of a vector;

\erf (\cdot)

denotes the error function and

\exp (\cdot)

denotes the exponential function;

“ \cdot ”

and

“ \times ”

denote the dot and cross product between vectors, respectively.

2. System and Problem Formulation

In this paper, we consider a 3D Cartesian coordinate system for FSO communication networks, as shown in Figure 1. A laser source (LS) acting as the transmitter aims to communicate with a photodetector (PD) receiver through a Gaussian laser beam, and a UAV carries the RIS as a reflecting platform. We assume that the UAV can acquire the information needed in our scenario, such as the positions of the LS and PD [32]. We also assume that there is no direct link between the LS and PD, that the LS is located at the origin

O (0, 0, 0)

, and that the PD has a circular detection aperture with radius a, the center of which is located at

S (x_{s}, y_{s}, 0)

. For practical consideration, we assume the beam is affected by fog, which means the atmospheric conditions vary during the beam propagation. We model a fog that can severely affect the system’s Channel State Information (CSI), and treat fog as a spherical entity with radius

r_{f}

for the sake of simplicity and initial simulations [33] (Note: In the ‘Optimization Results’ section of this paper, we also address fog of irregular shape for practicality and present the corresponding results), located at

F (x_{f}, y_{f}, z_{f})

(Note: The UAV’s ability to ascertain the location of the fog is facilitated by certain technologies and methods, which can include meteorological data services, onboard atmospheric sensors, or advanced imaging techniques [34,35]). To satisfy the aiming conditions from the LS to the PD via the RIS, we assume that the rotation of the LS keeps tracking the UAV (Note: Laser tracking and aiming can be achieved by using precise pointing, acquisition and tracking technology [36]. We assume that the UAV is deployed and controlled by a separate ground station to operate the communication task in this area. This paper does not consider the orientation of the UAV, because the phase shift change caused by the orientation can be added to the optimal phase shifts of the RIS once the flight orientation of the UAV is known). Moreover, we assume that the area of the RIS is large enough to cover the beam footprint from the LS [37]. Because the laser beam undergoes specular reflection at the RIS, the reflective properties of the RIS determine the path of the light, and this perspective is central to our methodology. We therefore consider the status of the RIS instead of that of the UAV and introduce the concept of leading angles, which represent the rotation status of the RIS. As shown in Figure 2, the leading angles can be expressed by

θ_{1}

and

θ_{2}

, which stand for the rotations about the x-axis and y-axis in the counterclockwise direction. This approach integrates UAV status information, such as the angle of arrival (AoA) at the UAV, into the leading angles of the RIS. Therefore, the phase shifts of the RIS at the nth time slot can be set as

Θ [n] (θ_{1} [n], θ_{2} [n]) \in (- π, π) .

(1)

We consider the 3D flight operation for the UAV with the nth time slot,

n \in \{1, 2, \dots, N\}

. The set of RIS phase shifts over all time slots can be denoted as

Φ = \{Θ [n], n = 1, 2, \dots, N\}

. The coordinates of the UAV in each time slot are restricted to positive integer waypoints in the flight zone, which are denoted as

G [n] (x_{g} [n], y_{g} [n], z_{g} [n])

, where

G [1]

and

G [N]

denote the initial and final positions of the UAV. We set the velocity of the UAV as

V [n]

at the nth time slot, and the maximum and minimum velocities of the UAV are set as

V_{\max}

and

V_{\min}

, respectively. The operating height of the UAV is limited to the range from

H_{l}

to

H_{h}

. According to the above description, we have the following constraints:

\begin{matrix} Φ = \{Θ [n], n = 1, 2, \dots, N\}, \\ - π \leq Θ [n] \leq π, \\ G [1] = (x_{g} [1], y_{g} [1], z_{g} [1]), \\ G [N] = (x_{g} [N], y_{g} [N], z_{g} [N]), \\ V_{\min} \leq V [n] \leq V_{\max}, \ n = 1, 2, \dots, N, \\ H_{l} \leq z_{g} [n] \leq H_{h} . \end{matrix}

(2)

The channel of the system is characterized by the channel coefficient from the LS to the PD via the RIS at the nth time slot,

h [n]

, which is represented by (Note: Here, the atmospheric turbulence h_{t} is a random variable that depends only on constants of the environment; most papers on UAV and FSO communication optimization, such as [38,39], do not take it into consideration)

h [n] = h_{t} h_{a} [n] h_{p} [n],

(3)

where

h_{t}

is the atmospheric turbulence coefficient, which is related to the wind speed (through the refractive index) and to the laser wavelength [40]. The probability density function (PDF) of

h_{t}

can be denoted as [41]

f (h_{t}) = \frac{2 {(α β)}^{\frac{α + β}{2}} h_{t}^{\frac{α + β}{2} - 1} K_{α - β} (2 \sqrt{α β h_{t}})}{Γ (α) Γ (β)},

(4)

where

K_{α - β} (.)

is the modified Bessel function of the second kind, the

Γ (.)

is the Gamma function, and the

α

and

β

denote the numbers of small and large turbulence cells as follows,

\begin{matrix} α = \left( \exp \left( \frac{0.49 σ_{R}^{2}}{(1 + 1.11 σ_{R}^{12/5})^{7/6}} \right) - 1 \right)^{- 1}, \\ β = \left( \exp \left( \frac{0.51 σ_{R}^{2}}{(1 + 0.69 σ_{R}^{12/5})^{5/6}} \right) - 1 \right)^{- 1}, \end{matrix}

(5)

respectively, where the Rytov variance

σ_{R}^{2}

can be calculated as

σ_{R}^{2} = 0.5 k^{\frac{7}{6}} C_{N}^{2} L^{\frac{11}{6}}

,

C_{N}^{2}

is the index of refraction structure parameter, and the wave number

k = \frac{2 π}{λ_{f}}

is obtained from the optical wavelength

λ_{f}

.
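As a minimal numerical sketch of Equation (5) and the Rytov variance above, the following Python functions evaluate α and β. The function names and the example values of the structure parameter and link distance are illustrative assumptions; only the formulas are taken from the text, and L is interpreted here as the propagation distance in metres.

```python
import math


def rytov_variance(c_n2, wavelength_m, distance_m):
    """sigma_R^2 = 0.5 * k^(7/6) * C_N^2 * L^(11/6), with k = 2*pi/lambda_f."""
    k = 2.0 * math.pi / wavelength_m
    return 0.5 * k ** (7.0 / 6.0) * c_n2 * distance_m ** (11.0 / 6.0)


def gamma_gamma_parameters(sigma_r2):
    """Return (alpha, beta) of Equation (5) for a given Rytov variance."""
    s = sigma_r2 ** (6.0 / 5.0)  # sigma_R^(12/5) written as (sigma_R^2)^(6/5)
    alpha = 1.0 / (math.exp(0.49 * sigma_r2 / (1.0 + 1.11 * s) ** (7.0 / 6.0)) - 1.0)
    beta = 1.0 / (math.exp(0.51 * sigma_r2 / (1.0 + 0.69 * s) ** (5.0 / 6.0)) - 1.0)
    return alpha, beta


# Assumed example: C_N^2 = 1e-15 m^(-2/3), 1550 nm laser, 1 km link.
sigma_r2 = rytov_variance(1e-15, 1550e-9, 1000.0)
alpha, beta = gamma_gamma_parameters(sigma_r2)
```

A small Rytov variance, as in this assumed example, yields large α and β, which corresponds to weak turbulence.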

The

h_{a} [n]

is the atmospheric loss caused by fog, which can be given by

h_{a} [n] = exp (- σ_{c} d_{c} [n] - σ_{f} d_{f} [n]),

(6)

where

σ_{c}

and

σ_{f}

denote the atmospheric attenuation coefficients in clear air and fog [42], respectively, which can be given as

σ_{c} = \frac{3.91}{L_{v}} {(\frac{λ}{550})}^{- p_{c}}

and

σ_{f} = \frac{3.91}{L_{v}} {(\frac{λ}{550})}^{- p_{f}}

where

p_{c} = \begin{cases} 1.6, & L_{v} > 50, \\ 1.3, & 6 < L_{v} < 50, \end{cases}

(7)

and

p_{f} = \begin{cases} 0.16 L_{v} + 0.34, & 1 < L_{v} < 6, \\ L_{v} - 0.5, & 0.5 < L_{v} < 1, \\ 0, & L_{v} < 0.5, \end{cases}

(8)

where

L_{v}

denotes the visibility range in air. Furthermore,

d_{c} [n]

and

d_{f} [n]

denote the laser propagation distances in clear air and fog at the nth time slot, respectively, which can be obtained as in Appendix A.
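Equations (6)-(8) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the simulation code of the paper: the visibility and distances are assumed to be in km and the wavelength in nm, the boundary values of the visibility (exactly 50, 6, 1 and 0.5), which Equations (7) and (8) leave open, are assigned to the lower branch, and all function names are hypothetical.

```python
import math


def size_exponent(visibility_km):
    """Exponent p_c or p_f from Equations (7) and (8)."""
    if visibility_km > 50:
        return 1.6
    if visibility_km > 6:
        return 1.3
    if visibility_km > 1:
        return 0.16 * visibility_km + 0.34
    if visibility_km > 0.5:
        return visibility_km - 0.5
    return 0.0


def attenuation(visibility_km, wavelength_nm=1550.0):
    """Attenuation coefficient 3.91 / L_v * (lambda / 550)^(-p), assumed per km."""
    p = size_exponent(visibility_km)
    return 3.91 / visibility_km * (wavelength_nm / 550.0) ** (-p)


def atmospheric_loss(d_clear_km, d_fog_km, vis_clear_km, vis_fog_km,
                     wavelength_nm=1550.0):
    """h_a = exp(-sigma_c * d_c - sigma_f * d_f), Equation (6)."""
    sigma_c = attenuation(vis_clear_km, wavelength_nm)
    sigma_f = attenuation(vis_fog_km, wavelength_nm)
    return math.exp(-sigma_c * d_clear_km - sigma_f * d_fog_km)
```

For a fixed total path length, moving part of the path from clear air into fog lowers the returned coefficient, which is the effect the trajectory optimization tries to avoid.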

Because the UAV operates in 3D, the incident laser beam is not perpendicular to the PD, and the footprint of the Gaussian beam on the PD plane is an ellipse, as shown in Figure 3. The pointing error loss

h_{p} [n]

is the fraction of the laser power collected by the PD, obtained by integrating the beam power intensity

I (ρ [n]) = \frac{2}{π W_{x} [n] W_{y} [n]} \exp \left( - \frac{2 \| ρ [n] \|^{2}}{W_{x} [n] W_{y} [n]} \right)

(9)

over the detector. In the polar coordinate system of the detector plane, with radius from 0 to a and angle from 0 to

2 π

, we can express the pointing error loss as

\begin{matrix} h_{p} [n] & = \int_{A} I (\| ρ [n] - r [n] \|) \, d ρ [n] \\ & = \int_{0}^{a} \int_{0}^{2 π} \frac{2}{π W_{y} [n] W_{x} [n]} \exp \left( \frac{- 2 (ρ [n] - r [n])^{2}}{W_{y} [n] W_{x} [n]} \right) ρ [n] \, d ϕ [n] \, d ρ [n], \end{matrix}

(10)

where A is the detector area, I is the normalized spatial distribution of the transmitted intensity,

ρ [n]

is the radial vector which starts from the beam center and ends at any point on the PD,

r [n]

is the pointing error vector starting from the center of the PD and ending at the center of the beam footprint with norm

r [n]

.

After some derivations, we can obtain the closed-form expression of pointing error loss as follows:

h_{p} [n] = \frac{1}{W_{y} [n] W_{x} [n]} \left[ r [n] \sqrt{2 π W_{y} [n] W_{x} [n]} \left( \erf \left( \frac{\sqrt{2} r [n]}{\sqrt{W_{y} [n] W_{x} [n]}} \right) + \erf \left( \frac{\sqrt{2} (a - r [n])}{\sqrt{W_{y} [n] W_{x} [n]}} \right) \right) + W_{y} [n] W_{x} [n] \left( \exp \left( \frac{- 2 r^{2} [n]}{W_{y} [n] W_{x} [n]} \right) - \exp \left( \frac{- 2 (a - r [n])^{2}}{W_{y} [n] W_{x} [n]} \right) \right) \right],

(11)

where

W_{y} [n]

and

W_{x} [n]

denote two different beamwidths of the beam footprint, which can be calculated by using a rotation matrix and the law of specular reflection as in Appendix B.
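The closed form in Equation (11) can be checked numerically against the radial integral in Equation (10) as written. The sketch below is only such a consistency check: the function names, the midpoint-rule integration and the example values of r, a and the beamwidths are assumptions introduced here for illustration.

```python
import math


def pointing_loss_closed_form(r, a, w_x, w_y):
    """Equation (11) for scalar pointing error r and aperture radius a."""
    w = w_x * w_y
    root_w = math.sqrt(w)
    erf_sum = (math.erf(math.sqrt(2.0) * r / root_w)
               + math.erf(math.sqrt(2.0) * (a - r) / root_w))
    exp_diff = math.exp(-2.0 * r * r / w) - math.exp(-2.0 * (a - r) ** 2 / w)
    return (r * math.sqrt(2.0 * math.pi * w) * erf_sum + w * exp_diff) / w


def pointing_loss_numeric(r, a, w_x, w_y, steps=20000):
    """Midpoint-rule evaluation of the double integral in Equation (10)."""
    w = w_x * w_y
    h = a / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        total += rho * math.exp(-2.0 * (rho - r) ** 2 / w)
    # The angular integral gives 2*pi, so the prefactor 2/(pi*w) becomes 4/w.
    return 4.0 / w * total * h


# Assumed example values (metres): offset 2 cm, aperture 10 cm, beamwidths 15 cm and 20 cm.
closed = pointing_loss_closed_form(0.02, 0.1, 0.15, 0.2)
numeric = pointing_loss_numeric(0.02, 0.1, 0.15, 0.2)
```

For r = 0 the closed form reduces to 1 - exp(-2a²/(W_x W_y)), the familiar power fraction of a centred Gaussian beam inside a circular aperture.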

According to the intensity modulation and direct detection (IM/DD) method as in [43], the average capacity between the LS and PD can be obtained from the probability density function (PDF) of the received electrical signal-to-noise ratio (SNR),

f (γ [n])

, as

c [n] = \log_{2} (1 + ϵ γ [n]),

(12)

where

ϵ = \frac{e}{2 π}

[44], and the received SNR

γ [n]

can be obtained by

γ [n] = \frac{2 P_{t}^{2} μ^{2} h_{a}^{2} [n] h_{p}^{2} [n] h_{t}^{2}}{σ^{2}},

(13)

where

P_{t}

is the transmitting power from the LS, and

σ^{2}

is the Gaussian noise variance, and

μ

is the detector responsivity.
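Equations (12) and (13), together with the objective of averaging the capacity over time slots, can be written as a short Python sketch. The function names are hypothetical, and the inputs are assumed to be the per-slot coefficients already computed from Equations (3)-(11).

```python
import math


def received_snr(p_t, mu, h_a, h_p, h_t, noise_var):
    """gamma = 2 * P_t^2 * mu^2 * h_a^2 * h_p^2 * h_t^2 / sigma^2, Equation (13)."""
    return 2.0 * (p_t * mu * h_a * h_p * h_t) ** 2 / noise_var


def capacity(snr):
    """c = log2(1 + eps * gamma) with eps = e / (2 * pi), Equation (12)."""
    eps = math.e / (2.0 * math.pi)
    return math.log2(1.0 + eps * snr)


def average_capacity(snr_per_slot):
    """Mean of c[n] over the N time slots."""
    return sum(capacity(g) for g in snr_per_slot) / len(snr_per_slot)
```

Because the SNR depends on the squared product of the loss coefficients, halving either h_a or h_p reduces the SNR by a factor of four.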

Then, to overcome the atmospheric and pointing error loss, and maximize the average capacity by optimizing the phase shifts of the RIS

Φ = \{Θ [n], n = 1, 2, \dots, N\}

, and the trajectory of the UAV

G = \{G [n], n = 1, 2, \dots, N\}

, we formulate a problem as follows:

\begin{matrix} (P 1) : & \max_{Φ, G} \frac{1}{N} \sum_{n = 1}^{N} c [n] \\ & \text{s.t.} \ (1), (2) . \end{matrix}

(14)

From the formulated optimization problem, it can be seen that the phase shift constraint (1) and the UAV trajectory constraint (2) are independent. The problem can therefore be divided into two independent subproblems: one optimizes the phase shifts, and the other optimizes the UAV trajectory to maximize the average capacity. In the following section, the phase shift optimization of the RIS is proposed.

3. PSO-Based Optimization of RIS Phase Shifts

To achieve the maximum average capacity, the maximum SNR

γ [n]

should be reached, which translates to maximizing the atmospheric and pointing error loss coefficients

h_{a} [n]

and

h_{p} [n]

. Therefore, we exploit particle swarm optimization (PSO) to optimize the pointing error loss at each position of the UAV during its operation. Because the UAV operates in 3D, the incident laser beam is not perpendicular to the PD, and the footprint of the Gaussian beam on the PD plane is an ellipse. Furthermore, we consider the pointing error loss

h_{p} [n]

in this work to model the effect of the fluctuation, which is equivalent to considering imperfect CSI at the PD. By iteratively adjusting leading angles of RIS in each iteration

m \in \{1, 2, \dots, M\}

of the PSO, the value of

h_{p} [n]

will be maximized as a precondition in each UAV position, denoted as

\begin{matrix} (P 2) : & \max_{Φ, G} h_{p} [n] \\ & \text{s.t.} \ (1), (2) . \end{matrix}

(15)

This method decouples the phase shift optimization problem from the UAV trajectory optimization problem, which reduces the original problem to two independent subproblems. In this paper, the PSO algorithm is used to obtain high-precision continuous leading angles of the phase shifts that give the best RIS status. The steps of the PSO are as follows:

3.1. Initialization

Set the initial conditions of the PSO, starting by generating the population

p \in \{1, 2, \dots, P\}

of particle swarms. The following description of the algorithm applies to each swarm, and we index the loop iterations of the PSO algorithm by

m \in \{1, 2, \dots, M\}

. Then the initial velocities of each swarm can be set as

v_{1} [1]

and

v_{2} [1]

. After that, the corresponding update parameters, namely the local and global acceleration coefficients

c_{1}

and

c_{2}

, are also defined.

3.2. Calculate the Leading Angles

The leading angles

θ_{1}^{*}

and

θ_{2}^{*}

are the RIS phase shifts for which the center of the beam is aligned with the center of the PD. They are used as the starting point to accelerate the optimization, and the calculation method can be found in Appendix C.

3.3. Set Personal and Global Best

After calculating the leading angles of phase shifts, we set the initial personal best

θ_{P_{1}} = θ_{1}^{*}

and

θ_{P_{2}} = θ_{2}^{*}

, and likewise the initial global best

θ_{G_{1}}

and

θ_{G_{2}}

.

3.4. PSO Main Loop

At the beginning of the main loop of the PSO algorithm, the swarm status is updated, starting from the initial settings, as

\begin{matrix} θ_{1} [m + 1] = θ_{1} [m] + v_{1} [m], \\ θ_{2} [m + 1] = θ_{2} [m] + v_{2} [m] . \end{matrix}

(16)

After updating the swarm status at the beginning of each loop, the swarm velocity is updated using a random parameter

s = rand (\cdot)

between 0 and 1 and the constants

c_{1}

and

c_{2}

as

\begin{matrix} v_{1} [m + 1] = v_{1} [m] + c_{1} s (θ_{P_{1}} [m] - θ_{1} [m]) + c_{2} s (θ_{G_{1}} [m] - θ_{1} [m]), \\ v_{2} [m + 1] = v_{2} [m] + c_{1} s (θ_{P_{2}} [m] - θ_{2} [m]) + c_{2} s (θ_{G_{2}} [m] - θ_{2} [m]) . \end{matrix}

(17)

After the swarm position and velocity are both updated, the current pointing error loss can be evaluated for further comparison. Furthermore, the cross boundary treatment [45] is applied so that the searching swarm cannot move outside the search zone, denoted as

θ_{1} [m] = \begin{cases} \min (θ_{\max}, 2 θ_{\min} - θ_{1} [m]), & θ_{1} [m] < θ_{\min}, \\ \max (θ_{\min}, 2 θ_{\max} - θ_{1} [m]), & θ_{1} [m] > θ_{\max}, \end{cases}

(18)

θ_{2} [m] = \begin{cases} \min (θ_{\max}, 2 θ_{\min} - θ_{2} [m]), & θ_{2} [m] < θ_{\min}, \\ \max (θ_{\min}, 2 θ_{\max} - θ_{2} [m]), & θ_{2} [m] > θ_{\max}, \end{cases}

(19)

where the search boundary is

Θ [n] (θ_{1} [n], θ_{2} [n]) \in (- π, π)

, and normally the search zone is set to include all the phase shifts that make the beam footprint overlap with the PD. After confirming that the swarm is inside the search boundary, the local best position of the swarm is updated. The update compares the target function output at the current swarm position with that at the local best position, and replaces the local best pointing error loss if it is less than the current output. The global best is updated similarly: by comparing the global best and local best pointing error loss outputs, the most optimized phase shifts and their corresponding pointing error loss can be found. The whole algorithm is written as pseudocode in Algorithm 1.

Algorithm 1: PSO

1: input the parameter set of the PSO $P, M, c_{1}, c_{2}, v_{1}, v_{2}$
2: derive the rotation quaternion and its rotation matrix.
3: extract the leading angles of phase shifts.
4: input the initial local and global best positions $θ_{P_{1}}, θ_{P_{2}}, θ_{G_{1}}, θ_{G_{2}}$ .
5: for $m = 1, 2, \dots, M$
6:   for $p = 1, 2, \dots, P$
7:     update the swarm status and swarm velocity as (16) and (17).
8:     evaluate the current pointing error loss $h_{p} [m]$
9:     implement the cross boundary treatment as (18) and (19).
10:     compare the current swarm output and the local best swarm output.
11:     if $h_{p} (θ_{1} [m], θ_{2} [m]) > h_{p} (θ_{P_{1}} [m], θ_{P_{2}} [m])$
12:       update the current angles and output as personal best angles and output
13:     end if
14:     compare the personal best swarm output and the global best swarm output.
15:     if $h_{p} (θ_{P_{1}} [m], θ_{P_{2}} [m]) > h_{p} (θ_{G_{1}} [m], θ_{G_{2}} [m])$
16:       update the personal angles and output as global best angles and output
17:     end if
18:   end for
19: end for
20: output the global best outcome as the maximum pointing error loss $h_{p}$.
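The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used for the reported results: the objective `toy_loss_coefficient` is a hypothetical stand-in for the pointing error loss as a function of the two angles, and the initial scatter of particles around the leading angles, the initial velocity range, the particle count and the acceleration constants are all assumptions.

```python
import math
import random


def reflect(theta, lo, hi):
    """Cross boundary treatment of Equations (18) and (19)."""
    if theta < lo:
        return min(hi, 2.0 * lo - theta)
    if theta > hi:
        return max(lo, 2.0 * hi - theta)
    return theta


def pso_leading_angle(objective, lead, n_particles=20, n_iters=100,
                      c1=1.5, c2=1.5, lo=-math.pi, hi=math.pi, seed=0):
    """Maximize objective(theta1, theta2), seeding all bests with the leading angles."""
    rng = random.Random(seed)
    # Assumption: particles start close to the leading angles with small velocities.
    pos = [[reflect(lead[d] + rng.uniform(-0.1, 0.1), lo, hi) for d in range(2)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-0.05, 0.05) for _ in range(2)] for _ in range(n_particles)]
    p_best = [list(lead) for _ in range(n_particles)]
    p_val = [objective(*lead)] * n_particles
    g_best, g_val = list(lead), p_val[0]
    for _ in range(n_iters):
        for i in range(n_particles):
            s = rng.random()  # single random parameter s, as in Equation (17)
            for d in range(2):
                old = pos[i][d]
                pos[i][d] = reflect(old + vel[i][d], lo, hi)   # Eq. (16), then (18)/(19)
                vel[i][d] += (c1 * s * (p_best[i][d] - old)
                              + c2 * s * (g_best[d] - old))    # Eq. (17)
            val = objective(*pos[i])
            if val > p_val[i]:
                p_best[i], p_val[i] = list(pos[i]), val
            if p_val[i] > g_val:
                g_best, g_val = list(p_best[i]), p_val[i]
    return g_best, g_val


def toy_loss_coefficient(theta1, theta2):
    """Hypothetical smooth objective with its maximum of 1 at (0.3, -0.2)."""
    return math.exp(-((theta1 - 0.3) ** 2 + (theta2 + 0.2) ** 2))


best_angles, best_value = pso_leading_angle(toy_loss_coefficient, (0.6, -0.5))
```

Because the personal and global bests are seeded with the leading angles, the returned value can never be worse than the value at the leading angles. Equation (17) has no inertia weight, so practical implementations often add one or clamp the velocity to keep the swarm from oscillating.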

4. PPO-Based Optimization of the UAV Trajectory

After solving the formulated problem

P 2

to obtain the optimized pointing error loss based on the leading angles, the remaining constraints of the optimization problem depend only on the trajectory of the UAV. Then, when the optimal phase shift is given, problem

P 1

can be rewritten as

\begin{matrix} (P 3) : & \max_{G} \frac{1}{N} \sum_{n = 1}^{N} c [n] \\ & \text{s.t.} \ (2) . \end{matrix}

(20)

To solve

P 3

as maximizing the average capacity based on the maximum

h_{p} [n]

and UAV flight constraints, a PPO algorithm is introduced for planning the operating trajectory of the UAV. As shown in Figure 4, the proposed PPO-based trajectory optimization algorithm is based on the Markov decision process (MDP) and improved by the prioritized experience replay (PER) method, which is introduced in the following.

MDP Formulation

The deep reinforcement learning (DRL) method is based on the MDP, which includes the state space, action space, and reward design. The MDP of this system considers time steps with the index of time slots n with the upper limitation N, where

n \in \{1, 2, \dots, N\}

.

State Space

The state-space S includes the UAV’s operating zone limitations as the input of the PPO, consisting of all accessible UAV positions in the whole terrain, and also includes the remaining operation time of the UAV.

Action Space

The action space A includes the current direction, velocity, and coordinates of the UAV.

Reward Design

The reward function is formulated as the average capacity of the FSO communication. As the UAV can be at any 3D waypoint

G [n]

, there is a corresponding average capacity

c [n]

.

4.1. PPO Algorithm

Learning Algorithm

In this section, we present the proposed PPO learning algorithm for UAV trajectory design, and the steps are written as pseudocode in Algorithm 2. PPO is an on-policy, model-free reinforcement learning algorithm in which decision-making is based on data collected from the most up-to-date policy. The goal of the PPO model is to enable the agent to execute optimal actions that maximize long-term cumulative rewards. Consequently, in the PPO model, the selected action may not be the optimal choice for the current time slot, but it aims to be the optimal choice for pursuing long-term benefits. The central controller, which controls the UAV, acts as the agent in this process.

At each time slot, the agent observes a state from the state space S, which consists of the coordinates of the RIS-equipped UAV. The action space A represents the set of available actions. Based on the current state, the agent selects an action according to the stochastic policy

π_{υ_{k}}

, where the selected action is denoted as

a_{t}

. After executing the actions, the agent receives a reward determined by the average capacity under the current FSO connectivity condition.

To prevent the surrogate objective from being driven by an excessively large policy update, we consider the probability ratio between the new and current policies, denoted as

r_{t} (υ_{k}) = \frac{π_{υ_{k + 1}} (a_{t} | s_{t})}{π_{υ_{k}} (a_{t} | s_{t})},

(21)

where

υ_{k}

is the policy parameter that encodes the movement rules established for the UAV. The current policy

π_{υ_{k}}

is updated to the new policy

π_{υ_{k + 1}}

via the PPO-clip update

υ_{k + 1} = \arg \max L^{clip} (υ_{k}),

(22)

that is, by maximizing the clipped surrogate objective

L^{clip} (υ_{k})

, which can be denoted as

L^{clip} (υ_{k}) = {\hat{E}}_{t} \left[ \min \left( r_{t} (υ_{k}) {\hat{D}}_{t}, \ clip (r_{t} (υ_{k}), 1 - ϱ, 1 + ϱ) {\hat{D}}_{t} \right) \right],

(23)

where the clip function can be denoted as

clip (ϱ, {\hat{D}}_{t}) = \begin{cases} 1 + ϱ, & {\hat{D}}_{t} \geq 0, \\ 1 - ϱ, & {\hat{D}}_{t} < 0, \end{cases}

(24)

and

ϱ

is the clipping hyperparameter,

{\hat{E}}_{t}

denotes the empirical expectation over time steps, and

{\hat{D}}_{t}

is the estimator of the advantage function. The conventional update between two iterations relies on the Monte Carlo approximation, where the surrogate objective is maximized, denoted as

υ_{k + 1} = \arg \max L (υ_{k}) .

(25)

When the advantage ${\hat{D}}_{t}$ of the state-action pair is positive, the objective reduces to

L^{pos} (υ_{k}) = {\hat{E}}_{t} [\min (\frac{π_{υ_{k + 1}} (a_{t} | s_{t})}{π_{υ_{k}} (a_{t} | s_{t})}, (1 + ϱ)) {\hat{D}}_{t}],

(26)

and when the advantage ${\hat{D}}_{t}$ of the state-action pair is negative, the objective reduces to

L^{neg} (υ_{k}) = {\hat{E}}_{t} [\max (\frac{π_{υ_{k + 1}} (a_{t} | s_{t})}{π_{υ_{k}} (a_{t} | s_{t})}, (1 - ϱ)) {\hat{D}}_{t}] .

(27)

After updating the policy by maximizing the PPO-clip objective, the value function needs to be fitted by regression on mean-squared error.

Algorithm 2: PPO

1: initialize the environment
2: initialize the policy parameter $υ_{0}$ and value function parameters $η_{0}$
3: for $k = 0, 1, 2, \dots, K$ do
4:   collect the set of trajectories $P_{k} = υ_{i}$ by running policy $π_{k} = π (υ_{k})$ in the environment
5:   compute rewards ${\hat{R}}_{t} = c [n]$
6:   compute advantage estimates ${\hat{D}}_{t}$ based on the current value function $V_{η_{k}}$
7:   update the policy by maximizing the PPO-clip objective, $υ_{k + 1} = \arg \max L^{clip} (υ_{k})$
8:   fit the value function parameters $η_{k + 1}$ by regression on the mean-squared error via gradient descent
9: end for
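The clipped surrogate objective in (23) can be evaluated on a batch of samples as in the following sketch. It is an illustration under stated assumptions rather than the training code used in the paper: the inputs are log-probabilities of the taken actions under the new and the current policy, and the default `clip_eps=0.2` is only a commonly used value, since the value of $ϱ$ is not given here.

```python
import numpy as np


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Batch estimate of the clipped surrogate objective in (23).

    logp_new, logp_old : log pi(a_t | s_t) under the new / current policy
    advantages         : advantage estimates D_hat_t
    clip_eps           : clipping hyperparameter (assumed default)
    """
    ratio = np.exp(np.asarray(logp_new, dtype=float)
                   - np.asarray(logp_old, dtype=float))      # r_t in (21)
    adv = np.asarray(advantages, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # The element-wise minimum removes any incentive to push the ratio
    # outside [1 - eps, 1 + eps] in the direction favoured by the advantage.
    return float(np.mean(np.minimum(unclipped, clipped)))
```

For a single sample with ratio 1.5 and positive advantage 1, the function returns 1.2, which matches (26). For ratio 0.5 and advantage -1 it returns -0.8, which matches (27).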

5. Optimization Results

This section shows the optimized UAV trajectories under different conditions. The parameters used in the numerical results and their values are listed in Table 2. The parameter settings follow practical scenarios. For example, the visibility range in fog follows the research on weather effects on FSO communication [46]. The selected laser wavelength is 1550 nm, which most FSO scenarios use [47], and we take $h_{t} = 0.91$ for weak turbulence, when the wind is not strong, for this type of laser [48,49]. The UAV settings follow the height limitation in civil aviation law [50] and the flight performance of consumer drones [51]. Based on [36,37], we also set the minimum and maximum heights of the UAV as $H_{l} = 60$ m and $H_{h} = 90$ m, and the maximum speed of the UAV as 1.5 m/s. (Note: To simplify the optimization problem, as in [21,22,25,52], the acceleration of the UAV is not considered in this paper; we note that it is affected by the minimum and maximum velocity and the total flight time.)

We use a decode-and-forward (DF) relay-assisted FSO system as a comparison benchmark to show the performance gain of the proposed scheme. We assume that the UAV has both FSO PD and transmitter functions. Each hop's pointing error is modeled with a circular beam footprint, and the misalignment between the footprint and the PD is set to zero, which gives the best case for the benchmark scenario. This benchmark also shows the comparison for the case in which the PSO results are not obtained as a precondition for trajectory optimization. Moreover, the deep Q-network (DQN) is used as the trajectory optimization benchmark in this paper.

Figure 5 compares the trajectories under the RIS and DF scenarios without fog for different flight times. The UAV flies at maximum speed to reach the area with maximum capacity under both the RIS and DF scenarios. The average capacity of the PPO-based trajectory over 150 time slots is 4.05 bits/s/Hz. With a shorter flight time, the UAV cannot reach the optimal location that achieves the maximum capacity, so the average capacity with 100 time slots is 3.91 bits/s/Hz for PPO. Furthermore, the average capacities of the DF relay trajectories under zero-misalignment pointing error are 2.64 bits/s/Hz and 2.52 bits/s/Hz for 150 and 100 time slots, respectively. The comparison in Table 3 shows that the average capacity of the proposed RIS-assisted FSO system is significantly higher than that of the DF relay-assisted system. Moreover, compared with the DQN benchmark, the proposed PPO-based trajectory optimization achieves a higher average capacity. This result is consistent with PPO avoiding the value overestimation of Q-networks and converging efficiently to a better-performing policy.
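To put the Table 3 comparison in relative terms, the gains can be computed directly from the reported values. The snippet below is only a convenience calculation on the published numbers; the dictionary layout and the helper name `relative_gain` are our own.

```python
# Average capacities in bits/s/Hz, copied from Table 3.
TABLE3 = {
    ("PPO", 150): {"RIS": 4.05, "DF": 2.64},
    ("PPO", 100): {"RIS": 3.91, "DF": 2.52},
    ("DQN", 150): {"RIS": 3.87, "DF": 2.49},
    ("DQN", 100): {"RIS": 3.77, "DF": 2.46},
}


def relative_gain(value, baseline):
    """Fractional improvement of `value` over `baseline`."""
    return (value - baseline) / baseline


# RIS versus DF relay, PPO trajectory, 150 time slots (about 53.4%).
ris_vs_df = relative_gain(TABLE3[("PPO", 150)]["RIS"], TABLE3[("PPO", 150)]["DF"])

# PPO versus DQN, RIS scenario, 150 time slots (about 4.7%).
ppo_vs_dqn = relative_gain(TABLE3[("PPO", 150)]["RIS"], TABLE3[("DQN", 150)]["RIS"])

print(f"RIS over DF: {ris_vs_df:.1%}, PPO over DQN: {ppo_vs_dqn:.1%}")
```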

Figure 6 shows the trajectories in the presence of fog with radius $r_{f} = 15$ m for flight times of 150, 100, and 50 time slots. The UAV tends to avoid the fog while passing close to the fog zone and diving to the lowest altitude to obtain the maximum average capacity. As a result, the avoidance and diving actions of the PPO-based optimization bring the average capacity to 3.92 bits/s/Hz and 3.77 bits/s/Hz for 150 and 100 time slots, respectively, while DQN achieves only 3.77 bits/s/Hz and 3.64 bits/s/Hz, respectively. However, with only 50 time slots, the limited flight time prevents the UAV from diving, and the average capacity is only 1.37 bits/s/Hz. The average capacity comparison is given in Table 4.

Figure 7 compares the results for 150 time slots under different fog conditions, with fog radii of 15 m and 20 m. Compared to DQN, PPO achieves better performance because it avoids the overestimation in Q-networks. Under the same total flight time, all of the trajectories tend to avoid entering the fog even when the fog radius changes, and they also tend to descend to the lowest altitude limit. Table 5 compares the average capacity of the trajectories under the different fog conditions. Because the UAV avoids the fog, it moves away from the area with maximum average capacity. The highest average capacity is therefore reached when there is no fog, and the fog with the largest radius causes the lowest average capacity.

Figure 8 compares the trajectories for flight times of 100 and 60 time slots in the presence of fog with $r_{f} = 15$ m located at $(0, 20, 65)$ m. The starting and ending points of the UAV are also adjusted to $(-20, 30, 80)$ m and $(20, 30, 80)$ m, respectively. As shown in Table 6, the PPO-based optimization method maintains the avoiding and diving behavior of the UAV trajectory, yielding average capacities of 3.91 bits/s/Hz and 3.09 bits/s/Hz for 100 and 60 time slots, respectively. The DQN strategy, while effective, performs worse, with average capacities of 3.48 bits/s/Hz and 3.02 bits/s/Hz for 100 and 60 time slots, respectively. The results underline the efficiency of the proposed PPO-based trajectory optimization and are consistent with its ability to avoid overestimation.

As shown in Figure 9 and detailed in Table 7, the PPO optimization method still outperforms DQN under irregular fog, achieving average capacities of 4.31 bits/s/Hz and 3.73 bits/s/Hz for flight times of 150 and 100 time slots, respectively. In comparison, the DQN method reaches only 3.61 bits/s/Hz and 3.51 bits/s/Hz for the same flight times. These results reaffirm the advantage of the PPO method in efficiently learning and converging to a UAV flight strategy. Moreover, a longer flight time (150 time slots compared to 100) gives the UAV more options for path selection, leading to better performance in both the PPO and DQN cases.

For all results, under the same conditions, the average capacity increases as the UAV's flight time extends, because the UAV has more scope to maneuver to the positions that best mitigate the pointing error loss. In these regions of airspace, given the 3D spatial relationship among the LS, UAV, and PD, the laser beam footprint on the PD is closer to circular, which raises the upper bound of the pointing error loss coefficient. Thus, when the UAV is close to these optimal positions during its flight, the average capacity of the FSO link improves markedly.

6. Conclusions

In this paper, we proposed an RIS-equipped UAV deployed in a non-LOS FSO communication scenario. We considered a practical scenario with atmospheric and pointing error loss. Our method combined PSO for RIS phase shift optimization with PPO for UAV trajectory optimization. Specifically, with the PSO method, the pointing error loss was efficiently addressed by optimizing the phase shifts of the RIS, which assures the FSO communication quality for all UAV flight times. Then, the UAV trajectory that maximizes the average capacity for a given flight time was found. PPO was chosen because its training is more stable than that of other learning-based optimization methods: it limits the policy update at each step, which reduces the possibility of policy divergence. Our results showed that the approach was effective and performed better than the benchmark methods we considered. Several aspects remain for future work. The PSO and PPO optimization methods could be extended to scenarios with multiple UAVs or multiple links, and other communication impairments could be considered as well. Other algorithms, or combinations of methods, could also be explored for further improvement.

Author Contributions

Methodology, C.H.; Writing – original draft, H.J.; Writing – review & editing, S.D. and J.A.C.; Supervision, G.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LOS	Line of sight
UAV	Unmanned Aerial Vehicle
RIS	Reconfigurable Intelligent Surface
FSO	Free Space Optical
PSO	Particle Swarm Optimization
PPO	Proximal Policy Optimization
LS	Laser source
PD	Photodetector

Appendix A. Propagation Distance in Clean Air and Fog

The calculation of all distances is determined by the position of the RIS center at the $n$th time slot (Note: The time index $n$ is omitted below, unless necessary, for notational convenience). We define the clean air propagation distance as

d_{c} = d_{c_{1}} + d_{c_{2}},

(A1)

where $d_{c_{1}}$ and $d_{c_{2}}$ denote the clean air propagation distances from the LS to the RIS and from the RIS to the PD, respectively. Then, the propagation distance in fog can be denoted as

d_{f} = d_{O G} + d_{G S} - d_{c},

(A2)

where $d_{O G}$ and $d_{G S}$ denote the lengths of the beamlines from the LS to the RIS and from the RIS to the PD, respectively, which are shown in Figure A1. The distance from the fog center to the center of the RIS is written as $d_{F G}$. We denote the distances between the fog center and the two beamlines as $d_{F I}$ and $d_{F J}$. Furthermore, the points $M_{u}$, $N_{u}$, $N_{f}$, and $M_{f}$ are the intersection points of the beamlines $d_{O G}$ and $d_{G S}$ with the spherical fog surface.

Figure A1. Illustration of the propagation across fog.

The different situations for the propagation distances depend on the relative position between the UAV and the fog. The conditions and the corresponding propagation distances are given in (A3) and (A4).

d_{c_{1}} = \begin{cases} d_{O G}, & d_{F I} ⩾ r_{f} \\ d_{O G}, & (d_{F I} < r_{f}) \cap (d_{O G} ⩽ d_{O M_{u}}) \\ d_{O G} - d_{O M_{u}}, & (d_{F I} < r_{f}) \cap (d_{O M_{u}} < d_{O G} ⩽ d_{O M_{f}}) \\ d_{O G} - d_{M_{u} M_{f}}, & (d_{F I} < r_{f}) \cap (d_{O M_{f}} < d_{O G}), \end{cases}

(A3)

d_{c_{2}} = \begin{cases} d_{G S}, & d_{F J} ⩾ r_{f} \\ d_{G S}, & (d_{F J} < r_{f}) \cap (d_{G S} ⩽ d_{G N_{u}}) \\ d_{G S} - d_{G N_{u}}, & (d_{F J} < r_{f}) \cap (d_{G N_{u}} < d_{G S} ⩽ d_{G N_{f}}) \\ d_{G S} - d_{N_{u} N_{f}}, & (d_{F J} < r_{f}) \cap (d_{G N_{f}} < d_{G S}), \end{cases}

(A4)
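As a geometric cross-check of the fog and clean air distances, the part of each beamline that lies inside a spherical fog region can be computed by a direct segment–sphere intersection. The sketch below is not the case analysis of (A3) and (A4) and is not the authors' code. It assumes a spherical fog of radius $r_{f}$ and straight beamlines, and the function names are our own. The split between fog and clean air follows (A2).

```python
import numpy as np


def fog_path_length(p0, p1, fog_center, r_f):
    """Length of the part of the segment p0 -> p1 that lies inside the fog sphere."""
    p0, p1, c = (np.asarray(v, dtype=float) for v in (p0, p1, fog_center))
    d = p1 - p0
    seg_len = np.linalg.norm(d)
    if seg_len == 0.0:
        return 0.0
    u = d / seg_len
    # Solve |p0 + t*u - c|^2 = r_f^2 for t, the distance along the beamline.
    oc = p0 - c
    b = np.dot(oc, u)
    disc = b * b - (np.dot(oc, oc) - r_f ** 2)
    if disc <= 0.0:
        return 0.0                      # the beamline misses or only grazes the fog
    root = np.sqrt(disc)
    t_in, t_out = -b - root, -b + root
    # Clip the chord to the finite segment [0, seg_len].
    lo, hi = max(t_in, 0.0), min(t_out, seg_len)
    return max(hi - lo, 0.0)


def propagation_distances(ls, ris, pd, fog_center, r_f):
    """Return (d_c, d_f): clean air and fog distances over both hops."""
    d_og = np.linalg.norm(np.subtract(ris, ls))
    d_gs = np.linalg.norm(np.subtract(pd, ris))
    d_f = (fog_path_length(ls, ris, fog_center, r_f)
           + fog_path_length(ris, pd, fog_center, r_f))
    d_c = d_og + d_gs - d_f             # rearranged form of (A2)
    return d_c, d_f
```

Clipping the chord to the segment covers the cases where the RIS lies before, inside, or beyond the fog without enumerating them separately.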

\begin{aligned} W_{y} [n] &= \frac{\sqrt{{(x_{O^{'}} [n] - x_{S^{'}} [n])}^{2} + {(y_{O^{'}} [n] - y_{S^{'}} [n])}^{2} + z_{O^{'}} {[n]}^{2}}}{\sin (π - ∠ κ - \frac{δ}{2})} + \frac{\sqrt{{(x_{O^{'}} [n] - x_{S^{'}} [n])}^{2} + {(y_{O^{'}} [n] - y_{S^{'}} [n])}^{2} + z_{O^{'}} {[n]}^{2}}}{\sin (∠ κ - \frac{δ}{2})}, \\ W_{x} [n] &= 2 \tan (\frac{δ}{2}) \sqrt{{(x_{O^{'}} [n] - x_{S^{'}} [n])}^{2} + {(y_{O^{'}} [n] - y_{S^{'}} [n])}^{2} + z_{O^{'}} {[n]}^{2}}, \end{aligned}

(A5)

Appendix B. Beamwidth Calculation

The beamwidth can be calculated using the rotation matrix and the method of specular reflection. In Figure A2, we set $u_{OG}$ as the unit vector of the incident laser center line $O G$, whose reflected beamline is $G S^{'}$, and set the unit vector of the reflected laser center line as $u_{{GS}^{'}}$. We set the initial normal unit vector of the RIS surface as $u_{R} = (x_{R}, y_{R}, z_{R}) = (0, 0, - 1)$, because the RIS is parallel to the ground surface at the beginning. Then, we denote the normal vector after the first rotation as $u_{R^{'}} = (x_{R^{'}}, y_{R^{'}}, z_{R^{'}})$ and the normal vector after the second rotation as $u_{R^{″}} = (x_{R^{″}}, y_{R^{″}}, z_{R^{″}})$. The two rotation angles $θ_{1}$ and $θ_{2}$ describe the RIS rotations around the x-axis and y-axis, respectively. The angles are taken as positive in the anti-clockwise direction when looking toward the origin. The rotations of the RIS can be expressed as rotations of the normal unit vector of the RIS surface, according to the rotation matrix, as shown below

\begin{aligned} x_{R^{'}} &= x_{R}, \\ y_{R^{'}} &= y_{R} \cos θ_{1} - z_{R} \sin θ_{1}, \\ z_{R^{'}} &= y_{R} \sin θ_{1} + z_{R} \cos θ_{1}, \\ x_{R^{″}} &= z_{R^{'}} \sin θ_{2} + x_{R^{'}} \cos θ_{2}, \\ y_{R^{″}} &= y_{R^{'}}, \\ z_{R^{″}} &= z_{R^{'}} \cos θ_{2} - x_{R^{'}} \sin θ_{2} . \end{aligned}

(A6)

Figure A2. The visual LS and the project center of laser.

After obtaining the normal vector of the RIS after the rotations, according to the law of specular reflection [53], the unit vector $u_{{GS}^{'}}$ of the reflected beam center line can be denoted as

u_{{GS}^{'}} = 2 (u_{R^{″}} \cdot u_{OG}) u_{R^{″}} - u_{OG},

(A7)

Then, the problem is transformed into finding the beam footprint center $S^{'} (x_{S^{'}} [n], y_{S^{'}} [n], 0)$, which is a line–plane intersection problem between the reflected beamline $O^{'} G S^{'}$ and the ground plane.
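The rotation in (A6), the reflection in (A7), and the line–plane intersection can be chained as in the following sketch. It is an illustration under assumptions rather than the authors' implementation. In particular, (A7) as written yields a downward reflected beam when the incident unit vector is taken to point from the RIS back toward the LS, so the example adopts that sign convention. The RIS position is a hypothetical value.

```python
import numpy as np


def rotate_normal(u_r, theta1, theta2):
    """Apply (A6): rotate about the x-axis by theta1, then about the y-axis by theta2."""
    x, y, z = u_r
    x1 = x
    y1 = y * np.cos(theta1) - z * np.sin(theta1)
    z1 = y * np.sin(theta1) + z * np.cos(theta1)
    x2 = z1 * np.sin(theta2) + x1 * np.cos(theta2)
    y2 = y1
    z2 = z1 * np.cos(theta2) - x1 * np.sin(theta2)
    return np.array([x2, y2, z2])


def reflect(u_og, normal):
    """Specular reflection as written in (A7)."""
    return 2.0 * np.dot(normal, u_og) * normal - u_og


def footprint_center(ris_center, u_reflected):
    """Intersect the ray ris_center + t * u_reflected (t > 0) with the plane z = 0."""
    ris_center = np.asarray(ris_center, dtype=float)
    if u_reflected[2] >= 0.0:
        return None                     # the reflected beam never reaches the ground
    t = -ris_center[2] / u_reflected[2]
    return ris_center + t * u_reflected


ls = np.array([0.0, 0.0, 0.0])          # LS coordinate from Table 2
ris = np.array([0.0, 60.0, 80.0])       # hypothetical RIS position
u_og = (ls - ris) / np.linalg.norm(ls - ris)   # assumed to point from the RIS to the LS
normal = rotate_normal((0.0, 0.0, -1.0), 0.0, 0.0)
s_prime = footprint_center(ris, reflect(u_og, normal))
```

With an unrotated RIS, the footprint center is the mirror image of the LS about the RIS, here at (0, 120, 0) m.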

Then, according to the geometrical analysis of the line $O^{'} G S^{'}$ along the reflected beam, the beamwidth can be calculated from the visual LS coordinate $O^{'} (x_{O^{'}} [n], y_{O^{'}} [n], z_{O^{'}} [n])$ and the beam footprint center $S^{'} (x_{S^{'}} [n], y_{S^{'}} [n], 0)$, as expressed in (A5), where the divergence angle $δ$ of the Gaussian laser beam can be denoted as [49],

δ = \frac{2 λ}{π W_{0}},

(A8)

where $λ$ is the wavelength of the laser used for communication and $W_{0}$ is the initial beam waist, which can be taken as equal to the radius of the LS.
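A short numerical example shows the orders of magnitude in (A8) and in the $W_{x}$ term of (A5). The wavelength is the 1550 nm value from Table 2. The beam waist of 1 mm and the 100 m distance between $O^{'}$ and $S^{'}$ are hypothetical values chosen only for illustration.

```python
import math


def divergence_angle(wavelength, beam_waist):
    """Divergence angle of a Gaussian beam, (A8), in radians."""
    return 2.0 * wavelength / (math.pi * beam_waist)


def beamwidth_x(delta, dist):
    """W_x from (A5); dist is the distance between O' and S' in metres."""
    return 2.0 * math.tan(delta / 2.0) * dist


delta = divergence_angle(1550e-9, 1e-3)   # assumed W_0 = 1 mm
w_x = beamwidth_x(delta, 100.0)           # assumed |O'S'| = 100 m
print(f"delta = {delta:.3e} rad, W_x = {w_x:.4f} m")
```

Under these assumptions the divergence is just under 1 mrad and the footprint width along x is just under 10 cm.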

Appendix C. Leading Angles of Phase Shifts Calculation

The leading angles of the phase shifts are the rotation angles for which the center of the laser beam is aligned with the center of the PD under the current RIS coordinate and rotation status. As shown in Figure A2, the rotations can be expressed by $θ_{1}$ and $θ_{2}$, which denote the counterclockwise rotations about the x-axis and y-axis, respectively. The leading angles are therefore obtained by inverting the rotation matrix.

First, the rotated normal unit vector $u_{R}^{*}$ can be solved from the specular reflection equation

u_{GS} = 2 (u_{R}^{*} \cdot u_{OG}) u_{R}^{*} - u_{OG},

(A9)

where the unit vector $u_{GS}$ points from the center of the RIS to the center of the PD. It takes the place of $u_{{GS}^{'}}$, the laser unit vector from the center of the RIS to the beam projection point. After solving (A9) for $u_{R}^{*}$, the problem reduces to finding the rotation angles that take the vector $u_{R}$ to the vector $u_{R}^{*}$.

Second, we use the quaternion rotation method [54] to solve this problem. A quaternion is a 4-tuple written formally as $q = q_{1} + q_{2} i + q_{3} j + q_{4} k$, where $q_{m}, m \in \{1, 2, 3, 4\}$ are real numbers and $i, j, k$ are the imaginary units. The axis-angle expression of the rotation between $u_{R}$ and $u_{R}^{*}$ can be denoted as a rotation of the unit vector $u_{R}$ by the angle $ξ$ around the unit axis vector $ν = ν_{x} i + ν_{y} j + ν_{z} k$, which can be written as the quaternion

q = \cos (\frac{ξ}{2}) + (ν_{x} i + ν_{y} j + ν_{z} k) \sin (\frac{ξ}{2}),

(A10)

where $ξ = \arccos (u_{R} \cdot u_{R}^{*})$ and $ν = \frac{u_{R} \times u_{R}^{*}}{| u_{R} \times u_{R}^{*} |}$. Thus, $q_{m}$ can be extracted from (A10).

Third, after finding the elements $q_{m}$ of the quaternion, the rotation phase shifts can be expressed by the quaternion elements as

\begin{aligned} θ_{2}^{*} &= - \arcsin (2 (q_{2} q_{4} - q_{1} q_{3})), \\ θ_{1}^{*} &= \arctan (\frac{2 (q_{3} q_{4} + q_{1} q_{2})}{\cos θ_{2}^{*}}, \frac{1 - 2 q_{2}^{2} - 2 q_{3}^{2}}{\cos θ_{2}^{*}}) . \end{aligned}

(A11)
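The three steps of this appendix can be followed numerically with the sketch below. It is a minimal illustration, not the authors' code. It assumes that the two-argument arctangent in (A11) corresponds to `numpy.arctan2`, that $u_{OG}$ points from the RIS back toward the LS so that the solution of (A9) is the normalized bisector of $u_{OG}$ and $u_{GS}$, and it does not handle antiparallel vectors or the singular case $\cos θ_{2}^{*} = 0$. The function names are our own.

```python
import numpy as np


def solve_normal(u_og, u_gs):
    """Unit normal satisfying (A9): the normalized bisector of u_og and u_gs."""
    s = np.asarray(u_og, dtype=float) + np.asarray(u_gs, dtype=float)
    return s / np.linalg.norm(s)


def axis_angle_quaternion(u_r, u_r_star):
    """Quaternion (q1, q2, q3, q4) of (A10) rotating u_r onto u_r_star."""
    u_r = np.asarray(u_r, dtype=float)
    u_r_star = np.asarray(u_r_star, dtype=float)
    xi = np.arccos(np.clip(np.dot(u_r, u_r_star), -1.0, 1.0))
    axis = np.cross(u_r, u_r_star)
    norm = np.linalg.norm(axis)
    if norm < 1e-12:
        # Parallel vectors: no rotation needed (antiparallel case not handled).
        return np.array([1.0, 0.0, 0.0, 0.0])
    nu = axis / norm
    return np.concatenate(([np.cos(xi / 2.0)], nu * np.sin(xi / 2.0)))


def leading_angles(q):
    """Leading angles (theta1, theta2) from the quaternion elements, (A11)."""
    q1, q2, q3, q4 = q
    theta2 = -np.arcsin(2.0 * (q2 * q4 - q1 * q3))
    c = np.cos(theta2)
    theta1 = np.arctan2(2.0 * (q3 * q4 + q1 * q2) / c,
                        (1.0 - 2.0 * q2 ** 2 - 2.0 * q3 ** 2) / c)
    return theta1, theta2
```

For a target normal that differs from $u_{R} = (0, 0, -1)$ only by a rotation of 0.3 rad about the x-axis, `leading_angles` returns approximately (0.3, 0), in agreement with (A6).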

References

1. Corral, F.V.; Cuenca, C.; Soto, I. Design of an Optical Wireless Network using Free Space Optics Technology (FSO), in 5G/6G Networks Environment. In Proceedings of the 2021 IEEE ICA-ACCA, Valparaíso, Chile, 22–26 March 2021.
2. Chowdhury, M.Z.; Shahjalal, M.; Ahmed, S.; Jang, Y.M. 6G Wireless Communication Systems: Applications, Requirements, Technologies, Challenges, and Research Directions. IEEE Open J. Commun. Soc. 2020, 1, 957–975.
3. Boluda-Ruiz, R.; García-Zambrana, A.; Castillo-Vázquez, B.; Castillo-Vázquez, C. Ergodic Capacity Analysis of Decode-and-Forward Relay-Assisted FSO Systems Over Alpha–Mu Fading Channels Considering Pointing Errors. IEEE Photonics J. 2016, 8, 7900611.
4. Zedini, E.; Soury, H.; Alouini, M.S. On the Performance Analysis of Dual-Hop Mixed FSO/RF Systems. IEEE Trans. Wirel. Commun. 2016, 15, 3679–3689.
5. Najafi, M.; Jamali, V.; Schober, R. Optimal Relay Selection for the Parallel Hybrid RF/FSO Relay Channel: Non-Buffer-Aided and Buffer-Aided Designs. IEEE Trans. Commun. 2017, 65, 2794–2810.
6. Abou-Rjeily, C.; Noun, Z. Impact of Inter-Relay Co-Operation on the Performance of FSO Systems with Any Number of Relays. IEEE Trans. Wirel. Commun. 2016, 15, 3796–3809.
7. Zhang, J.; Liang, F.; Li, B.; Yang, Z.; Wu, Y.; Zhu, H. Placement optimization of caching UAV-assisted mobile relay maritime communication. China Commun. 2020, 17, 209–219.
8. Ji, B.; Li, Y.; Cao, D.; Li, C.; Mumtaz, S.; Wang, D. Secrecy Performance Analysis of UAV Assisted Relay Transmission for Cognitive Network with Energy Harvesting. IEEE Trans. Veh. Technol. 2020, 69, 7404–7415.
9. Xu, G.; Song, Z. Performance Analysis of a UAV-Assisted RF/FSO Relaying Systems for Internet of Vehicles. IEEE Internet Things J. 2022, 9, 5730–5741.
10. Singya, P.K.; Alouini, M.S. Performance of UAV assisted Multiuser Terrestrial-Satellite Communication System over Mixed FSO/RF Channels. IEEE Trans. Aerosp. Electron. Syst. 2021, 58, 781–796.
11. Dabiri, M.T.; Khankalantary, S.; Piran, M.J.; Ansari, I.S.; Uysal, M.; Saad, W.; Hong, C.S. UAV-Assisted Free Space Optical Communication System with Amplify-and-Forward Relaying. IEEE Trans. Veh. Technol. 2021, 70, 8926–8936.
12. Shafique, T.; Tabassum, H.; Hossain, E. Optimization of Wireless Relaying with Flexible UAV-Borne Reflecting Surfaces. IEEE Trans. Commun. 2021, 69, 309–325.
13. Cheng, Y.; Peng, W.; Huang, C.; Alexandropoulos, G.C.; Yuen, C.; Debbah, M. RIS-Aided Wireless Communications: Extra Degrees of Freedom via Rotation and Location Optimization. IEEE Trans. Wirel. Commun. 2022, 21, 6656–6671.
14. ElMossallamy, M.A.; Zhang, H.; Song, L.; Seddik, K.G.; Han, Z.; Li, G.Y. Reconfigurable Intelligent Surfaces for Wireless Communications: Principles, Challenges, and Opportunities. IEEE Trans. Cogn. Commun. Netw. 2020, 6, 990–1002.
15. Wang, J.; Wang, H.; Han, Y.; Jin, S.; Li, X. Joint Transmit Beamforming and Phase Shift Design for Reconfigurable Intelligent Surface Assisted MIMO Systems. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 354–368.
16. Ajam, H.; Najafi, M.; Jamali, V.; Schober, R. Channel Modeling for IRS-Assisted FSO Systems. In Proceedings of the 2021 IEEE Wireless Communications and Networking Conference (WCNC), Nanjing, China, 29 March–1 April 2021; pp. 1–7.
17. Najafi, M.; Schober, R. Intelligent Reflecting Surfaces for Free Space Optical Communications. In Proceedings of the 2019 IEEE GLOBECOM, Waikoloa, HI, USA, 9–13 December 2019.
18. Jia, H.; Zhong, J.; Janardhanan, M.N.; Chen, G. Ergodic Capacity Analysis for FSO Communications with UAV-Equipped IRS in the Presence of Pointing Error. In Proceedings of the 2020 IEEE 20th International Conference on Communication Technology (ICCT), Nanning, China, 28–31 October 2020.
19. Nguyen, M.H.T.; Garcia-Palacios, E.; Do-Duy, T.; Dobre, O.A.; Duong, T.Q. UAV-Aided Aerial Reconfigurable Intelligent Surface Communications with Massive MIMO System. IEEE Trans. Cogn. Commun. Netw. 2022, 8, 1828–1838.
20. Simeone, O. A Very Brief Introduction to Machine Learning with Applications to Communication Systems. IEEE Trans. Cogn. Commun. Netw. 2018, 4, 648–664.
21. Yang, D.; Dan, Q.; Xiao, L.; Liu, C.; Cuthbert, L. An Efficient Trajectory Planning for Cellular-Connected UAV under the Connectivity Constraint. China Commun. 2021, 18, 136–151.
22. Hu, Y.; Yuan, X.; Xu, J.; Schmeink, A. Optimal 1D Trajectory Design for UAV-Enabled Multiuser Wireless Power Transfer. IEEE Trans. Commun. 2019, 67, 5674–5688.
23. Mitchell, T.M. Machine Learning; McGraw-Hill: New York, NY, USA, 1997.
24. Chen, X.; Wang, Z.; Hua, Q.; Shang, W.L.; Luo, Q.; Yu, K. AI-Empowered Speed Extraction via Port-Like Videos for Vehicular Trajectory Analysis. IEEE Trans. Intell. Transp. Syst. 2023, 24, 4541–4552.
25. Liu, X.; Liu, Y.; Chen, Y.; Hanzo, L. Trajectory Design and Power Control for Multi-UAV Assisted Wireless Networks: A Machine Learning Approach. IEEE Trans. Veh. Technol. 2019, 68, 7957–7969.
26. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436.
27. Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
28. Samir, M.; Assi, C.; Sharafeddine, S.; Ghrayeb, A. Online Altitude Control and Scheduling Policy for Minimizing AoI in UAV-Assisted IoT Wireless Networks. IEEE Trans. Mob. Comput. 2022, 21, 2493–2505.
29. Dai, C.; Zhu, K.; Hossain, E. Multi-Agent Deep Reinforcement Learning for Joint Decoupled User Association and Trajectory Design in Full-Duplex Multi-UAV Networks. IEEE Trans. Mob. Comput. 2022, 22, 6056–6070.
30. Hassan, S.S.; Min Park, Y.; Tun, Y.K.; Saad, W.; Han, Z.; Hong, C.S. TO: THz-Enabled Throughput and Trajectory Optimization of UAVs in 6G Networks by Proximal Policy Optimization Deep Reinforcement Learning. In Proceedings of the ICC 2022-IEEE International Conference on Communications, Seoul, Republic of Korea, 16–20 May 2022; pp. 5712–5718.
31. Wang, L.; Wang, K.; Pan, C.; Xu, W.; Aslam, N.; Hanzo, L. Multi-Agent Deep Reinforcement Learning-Based Trajectory Planning for Multi-UAV Assisted Mobile Edge Computing. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 73–84.
32. Niu, Y.; Jin, X.; Wen, Z. Research on Command and Control Technology in Small-UAV Defense System. In Proceedings of the 2021 International Conference on Computer Information Science and Artificial Intelligence (CISAI), Kunming, China, 17–19 September 2021; pp. 673–677.
33. Company, N.S. Advances in Real-time Rendering in 3D Graphics and Games. 3D Graph. 2009.
34. Houston, A.L.; PytlikZillig, L.M.; Walther, J.C. National Weather Service Data Needs for Short-term Forecasts and the Role of Unmanned Aircraft in Filling the Gap: Results from a Nationwide Survey. Bull. Am. Meteorol. Soc. 2021, 102, E2106–E2120.
35. Cao, Q.; Zhao, Z.; Zeng, Q.; Wang, Z.; Long, K. Real-time vehicle trajectory prediction for traffic conflict detection at unsignalized intersections. J. Adv. Transp. 2021, 2021, 8453726.
36. Kim, D.; Jo, K.; Lee, M.; Sunwoo, M. L-Shape Model Switching-Based Precise Motion Tracking of Moving Vehicles Using Laser Scanners. IEEE Trans. Intell. Transp. Syst. 2018, 19, 598–612.
37. Ajam, H.; Najafi, M.; Jamali, V.; Schmauss, B.; Schober, R. Modeling and Design of IRS-Assisted Multi-Link FSO Systems. IEEE Trans. Commun. 2022, 70, 3333–3349.
38. Wang, Y.; Chen, J. Online 3D Placement for UAV-aided Free-space Optical Communication under Shadowing. In Proceedings of the 2022 31st Wireless and Optical Communications Conference (WOCC), Shenzhen, China, 11–12 August 2022; pp. 91–96.
39. Lee, J.H.; Park, K.H.; Ko, Y.C.; Alouini, M.S. Throughput Maximization of Mixed FSO/RF UAV-Aided Mobile Relaying with a Buffer. IEEE Trans. Wirel. Commun. 2021, 20, 683–694.
40. Barillé, R. Turbulence and Related Phenomena; IntechOpen: Rijeka, Croatia, 2019.
41. Al-Habash, M.; Andrews, L.; Philips, R. Mathematical Model for the Irradiance PDF of a Laser Beam Propagating through Turbulent Media. Opt. Eng. 2001, 40, 1554–1562.
42. Kaushal, H.; Jain, V.; Kar, S. Free Space Optical Communication; Optical Networks; Springer: Harsana, India, 2017.
43. Lapidoth, A.; Moser, S.M.; Wigger, M.A. On the Capacity of Free-Space Optical Intensity Channels. IEEE Trans. Inf. Theory 2009, 55, 4449–4461.
44. Advanced Optical Wireless Communication Systems; Cambridge University Press: Cambridge, UK, 2012.
45. Juárez-Castillo, E.; Acosta-Mesa, H.G.; Mezura-Montes, E. Empirical study of bound constraint-handling methods in Particle Swarm Optimization for constrained search spaces. In Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), Donostia, Spain, 5–8 June 2017; pp. 604–611.
46. Nadeem, F.; Kvicera, V.; Awan, M.S.; Leitgeb, E.; Muhammad, S.S.; Kandus, G. Weather Effects on Hybrid FSO/RF Communication Link. IEEE J. Sel. Areas Commun. 2009, 27, 1687–1697.
47. Farid, A.A.; Hranilovic, S. Outage Capacity Optimization for Free-Space Optical Links with Pointing Errors. J. Light. Technol. 2007, 25, 1702–1710.
48. Alkholidi, A.; Altowij, K. Effect of Clear Atmospheric Turbulence on Quality of Free Space Optical Communications in Western Asia. In Optical Communications Systems; Das, N., Ed.; IntechOpen: Rijeka, Croatia, 2012; Volume 2.
49. Majumdar, A. Advanced Free Space Optics (FSO); Springer: New York, NY, USA, 2015; Volume 186.
50. The UK Civil Aviation Authority. The Drone and Model Aircraft Code; The UK Civil Aviation Authority: London, UK, 2022.
51. DJI Official Website. The Flight Performance Comparison of Consumer Drones. Available online: https://www.dji.com/uk/products/compare-consumer-drones (accessed on 1 October 2022).
52. Mei, H.; Yang, K.; Liu, Q.; Wang, K. 3D-Trajectory and Phase-Shift Design for RIS-Assisted UAV Systems Using Deep Reinforcement Learning. IEEE Trans. Veh. Technol. 2022, 71, 3020–3029.
53. Sunday, D. Practical Geometry Algorithms: With C++ Code; Amazon Digital Services LLC-KDP: Seattle, WA, USA, 2021.
54. Kuipers, J.B. Quaternions and Rotation Sequences: A Primer with Applications to Orbits, Aerospace, and Virtual Reality; Princeton University Press: Princeton, NJ, USA, 1999.

Figure 1. The position and orientation of the system in the considered coordinates.

Figure 2. The visual LS and the project center of laser.

Figure 3. The ellipse beam footprint and circular PD with error vector.

Figure 4. The PPO network.

Figure 5. The comparison of the UAV trajectories between PPO and DQN methods for different flight times without fog.

Figure 6. The comparison of the UAV trajectories between PPO and DQN methods for different flight times with fog $r_{f} = 15$ m.

Figure 7. The comparison of the UAV trajectories between PPO and DQN methods for 150 time slots under different fog conditions.

Figure 8. The comparison of the UAV trajectories between PPO and DQN methods with different flight time slots, where $G [1] = (- 20, 30, 80)$ m and $G [N] = (20, 30, 80)$ m.

Figure 9. The comparison of the UAV trajectories between PPO and DQN methods with irregular fog.

Table 1. Comparisons of some existing works, the ✗ stands for not included, the ✓ stands for included, and the N/A stands for not applicable.

Ref.	Channel Medium	Fog Impact Modelling	No Transmission Delay	RIS Optimization	UAV Trajectory Design
[5]	FSO/RF	✗	✗	N/A	N/A
[6]	FSO	✗	✗	N/A	N/A
[7]	RF	N/A	✗	N/A	✗
[8]	RF	N/A	✗	N/A	✗
[9]	FSO/RF	✗	✗	N/A	✗
[10]	FSO/RF	✗	✗	N/A	✗
[11]	FSO	✗	✗	N/A	✗
[12]	RF	N/A	✓	✓	✗
[13]	RF	N/A	✓	✓	✗
[16]	FSO	✗	✓	✓	N/A
[17]	FSO	✗	✓	✓	N/A
[18]	FSO	✗	✓	✓	✗
This paper	FSO	✓	✓	✓	✓

Table 2. Simulation Parameters.

Parameter	Value
Visibility range clean/fog	200/0.3 $km$
Transmitting power	45 $dBm$
LS coordinate	(0, 0, 0) $m$
PD coordinate	(0, 100, 0) $m$
Laser wavelength	1550 $nm$
Receiver radius	0.05 $m$
Detector responsivity	0.5
Atmospheric turbulence	0.91
UAV height limitation	60–90 $m$
UAV velocity limitation	1.5 $m / s$

Table 3. Comparison of 150 and 100 flight times without fog.

Flight Time	RIS (bits/s/Hz)	DF (bits/s/Hz)
PPO $N = 150$	4.05	2.64
PPO $N = 100$	3.91	2.52
DQN $N = 150$	3.87	2.49
DQN $N = 100$	3.77	2.46

Table 4. Comparison of 150, 100, and 50 flight times with $r_{f} = 15$ m fog.

Flight Time	Average Capacity (bits/s/Hz)
PPO $N = 150$	3.92
PPO $N = 100$	3.77
PPO $N = 50$	1.37
DQN $N = 150$	3.77
DQN $N = 100$	3.64
DQN $N = 50$	1.17

Table 5. Comparison for different fog conditions under 150 time slots.

Fog Radii (m)	Average Capacity (bits/s/Hz)
PPO No fog	4.04
PPO $r_{f} = 15$	3.92
PPO $r_{f} = 20$	3.19
DQN No fog	3.87
DQN $r_{f} = 15$	3.77
DQN $r_{f} = 20$	2.97

Table 6. Comparison of 100 and 60 flight times with $r_{f} = 15$ m fog.

Flight Time	Average Capacity (bits/s/Hz)
PPO $N = 100$	3.91
PPO $N = 60$	3.09
DQN $N = 100$	3.48
DQN $N = 60$	3.02

Table 7. Comparison of 150 and 100 flight times with irregular fog.

Flight Time	Average Capacity (bits/s/Hz)
PPO $N = 150$	4.31
PPO $N = 100$	3.73
DQN $N = 150$	3.61
DQN $N = 100$	3.51

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
