Intelligent Design of Hairpin Filters Based on Artificial Neural Network and Proximal Policy Optimization

Ye, Yunong; Wu, Yifan; Chen, Jiayu; Su, Guodong; Wang, Junchao; Liu, Jun

doi:10.3390/app13169379


by Yunong Ye ^1,†, Yifan Wu ^2,†, Jiayu Chen ^2, Guodong Su ^2, Junchao Wang ^2,* and Jun Liu ^2,*

^1 Information Science Academy, China Electronics Technology Group Corporation, Beijing 100041, China

^2 Zhejiang Provincial Key Laboratory of Large Scale Integrated Circuit Design, School of Electronic Information, Hangzhou Dianzi University, Hangzhou 310018, China

^* Authors to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Appl. Sci. 2023, 13(16), 9379; https://doi.org/10.3390/app13169379

Submission received: 25 July 2023 / Revised: 11 August 2023 / Accepted: 16 August 2023 / Published: 18 August 2023


Abstract:

Microstrip filters are widely used in high-frequency circuit design for signal frequency selection. However, designing these filters often requires extensive trial and error to achieve the desired performance metrics, leading to significant time costs. In this work, we propose an automated design flow for hairpin filters, a specific type of microstrip filter. We employ artificial neural network (ANN) modeling techniques to predict the circuit performance of hairpin filters, and leverage the efficiency of low-cost models to deploy reinforcement learning agents. Specifically, we use the proximal policy optimization (PPO) reinforcement learning algorithm to learn abstract design actions for the filters, allowing us to achieve automated optimization design. Through simulation results, we demonstrate the effectiveness of the proposed approach. By optimizing the geometric dimensions, we significantly improve the performance metrics of hairpin filters, and the trained agent successfully meets our specified design goals within 5 to 15 design steps. This work serves as a conceptual validation attempt to apply reinforcement learning techniques and pre-trained ANN models to automate MMIC filter design. It exhibits clear advantages in terms of time-saving and performance efficiency when compared to other optimization algorithms.

Keywords:

hairpin filter; reinforcement learning; proximal policy optimization

1. Introduction

In recent years, the rapid development of wireless communication technology has driven the rapid growth in demand for various custom microwave circuits. As a core component of wireless communication systems, microwave circuits not only need to provide high-frequency signal processing and transmission functions but also face higher performance and integration requirements. However, the design and manufacturing of custom microwave circuits is not an easy task. It requires engineers to possess deep expertise and experience, as well as advanced simulation and testing equipment. Additionally, the development cycle of custom microwave circuits is long and the cost is high, requiring a trade-off between performance and cost. Therefore, in different application scenarios, engineers need to carefully analyze the requirements and select suitable microwave circuit solutions to meet the requirements of various scenarios. The design process of microwave filters serves as a typical example.

Microstrip filters are devices that utilize microstrip line structures to select signal frequencies. They are known for their compact structure, small size, low cost, and ease of integration and manufacturing [1], making them a critical component in modern wireless communication systems [2]. The hairpin filter is a classic design of microstrip filters, which enhances the filter’s compactness and further reduces its size by using a folded resonator structure [3].

To ensure an effective design, microstrip filter designers typically go through several crucial steps [4,5,6]:

(1): Determine the design specifications of the circuit and represent them using transfer functions.
(2): Synthesize the transfer functions using an ideal lumped-element network.
(3): Because the physical size of circuit elements is comparable to the wavelength at microwave frequencies, transform the lumped circuit implementation into a distributed circuit implementation and confirm the equivalence between the lumped synthesis network and the desired distributed circuit structure.
(4): In the optimization stage, adjust the dimensions of the actual filter structure to achieve the desired frequency response.

The final step often requires engineers to iterate and modify the physical dimensions of the filter to meet the specified design requirements. This iterative process relies heavily on the engineer’s expertise and design experience [7], which can be tedious and time-consuming. It also leaves engineers less time for more innovative circuit structure development. Therefore, research on automating this design process continues to grow. Additionally, this process is constrained by the computational resources and time costs associated with electromagnetic simulation algorithms [8]. Depending on the complexity of the electromagnetic model, the computation time can range from minutes to hours [9]. These two factors are the main reasons for the longer development cycle of monolithic microwave integrated circuits (MMICs) compared to other IC designs.

Developing more efficient Electronic Design Automation (EDA) software to meet the growing market demands is one of the research directions that most researchers are focusing on [10]. To overcome the challenges of CPU-intensive electromagnetic (EM) simulation calculations and automated synthesis of microwave circuits, researchers have begun exploring the potential applications of machine learning techniques in design automation and computer-aided design, such as artificial neural networks (ANN) and heuristic optimization algorithms.

The main contributions of this paper are as follows:

(1): For the design of hairpin filters, we transform the design process into a continuous reinforcement learning (RL) problem and propose an RL framework based on the PPO algorithm.
(2): With guidance from practical design experience, we appropriately design a reward function to stabilize and accelerate the training process of RL.
(3): To reduce the time cost of repetitive success or failure in the RL training process, we introduce a forward neural network model trained through supervised learning to replace the CPU-intensive EM simulation.

The remaining structure of the article is as follows. Section 2 introduces the related work on neural networks and heuristic algorithms. In Section 3, we present the algorithm framework, design a method for hairpin filter design based on PPO, and provide detailed explanations for the training settings of the forward network model for hairpin filters, as well as the design of RL states, actions, and reward functions, with a specific focus on the reward function design. To validate the effectiveness of the proposed design method, we conducted a series of experiments, and the experimental results are presented and discussed in Section 4. Finally, Section 5 concludes the article, discussing the achievements, limitations, and future research plans.

2. Related Work

2.1. Modeling and Applications of Artificial Neural Networks

Artificial neural networks (ANNs) are a type of machine learning algorithm inspired by biological neurons. They simulate the mechanism of brain structure and external stimulus response, can capture any nonlinear relationship with high precision, and fit and estimate functions, thus forming mathematical mapping between any input–output relationship. Once the ANN model is trained, it can quickly and accurately complete calculations with almost negligible overhead. Therefore, artificial neural network technology is considered a possible solution, training the ANN model with the simulation results of relevant electromagnetic simulation software in order to reduce the cost of simulation calculations by replacing expensive EM models with ANN models [11,12].

Artificial neural network technology has been widely applied in the modeling and design of microwave components or circuits, including the modeling of on-chip square spiral inductors [13], the development of efficient and accurate models of CPW-related components [14], and the modeling of small-signal and large-signal models of field-effect transistors combined with equivalent circuit modeling and ANN modeling technology [15]. It also involves research directions such as the optimization of filter design [16,17,18], power amplifiers [19], and microstrip antennas [20]. Another design idea based on ANN modeling is called “neural network inverse modeling” [21], from which there have been some research results [22,23,24]. Unlike the forward model of a microwave device, where the input is the geometric design parameters, the input of the neural network inverse model is electrical parameters and the output is the geometric design parameters of the device. Although the inverse model of the neural network can produce design solutions faster than traditional optimization methods, the multivalued problem that the input and output are not uniquely mapped makes the inverse modeling much more difficult than the forward one [25].

Neural network modeling technology is not only used in the field of microwave device modeling; ANN modeling technology is also employed to predict the boundary conditions of the concentration distribution of a microfluidic concentration gradient generator (a device used to generate gradient concentrations in microfluidic chips [26]), which is widely used in chemistry, biology, medicine, and other fields. Compared to traditional methods, this approach achieved an acceleration of over 300 times while maintaining an accuracy rate of 93.71%.

In this work, we utilize neural network modeling techniques to establish an ANN surrogate model for hairpin filters, replacing expensive EM simulations.

2.2. Application of Nature-Inspired Heuristic Algorithms

In recent years, heuristic intelligent algorithms have been widely applied in various fields. These algorithms utilize optimization methods such as particle swarm optimization algorithm [27], genetic algorithm [28], and reinforcement learning [29] for the automatic design and optimization of neural network parameters.

Ref. [30] proposes a hybrid algorithm, MPA-SCA, for parameter selection of hybrid active power filters (HAPF) in power plants. This algorithm combines the Marine Predator Algorithm (MPA) and Sine Cosine Algorithm (SCA) to optimize the performance of HAPF. The proposed algorithm is tested in various industrial power plants and the results show that it outperforms other algorithms in mitigating power harmonics. The study concludes that the proposed algorithm has the potential to be a valuable tool for designing and optimizing HAPFs in the power industry.

Ref. [31] introduces a novel metaheuristic approach based on cuckoo search for deep learning-based depression prediction. This method is tested on a Twitter post dataset, achieving an accuracy of 87.5%. The results demonstrate the potential of this approach in utilizing social media data for predicting depression.

Heuristic algorithms are widely applied in the field of microwave/RF as well. Due to the lack of analytical solutions or well-defined mathematical formulas in microwave circuit design optimization methods and the involvement of a highly complex optimization environment, these heuristic algorithms provide a flexible exploratory approach and have the ability to escape local optima. Therefore, they are well-suited to global optimization design of microwave circuits.

Ref. [32] applied the particle swarm optimization algorithm to microwave circuit design, specifically optimizing microstrip couplers and single parallel short-circuit matching circuits. Simulation tests confirmed the effectiveness of this method. Ref. [33] proposed a universal precise design method for RF circuits based on the NSGA-II genetic algorithm, addressing the problem of multiple conflicting objectives in RF circuit design using a low noise amplifier as an example. Refs. [34,35] proposed an automatic tuning framework for cavity filters based on reinforcement learning algorithms. This framework was tested in a simulation environment and verified its applicability under various custom tuning tasks. Ref. [36] utilized the deep deterministic policy gradient algorithm to automatically adjust and design the topology structure of high-sensitivity microwave microfluidic sensors, resulting in a significant improvement in sensor sensitivity.

However, these heuristic intelligent search algorithms still require a large number of iterative optimizations to explore the entire design space. In microwave circuit design, each iteration of optimization requires evaluation based on electromagnetic simulations, resulting in significant computational overhead. In this work, we address this issue by utilizing neural network modeling techniques. Additionally, building upon previous research, this study explores the extension and application of the PPO reinforcement learning algorithm in microstrip filter design tasks. Furthermore, we accelerate the training process by incorporating a reward function based on engineer’s expertise, thereby improving learning efficiency.

3. Overview of the Design Method

The object of our study is a fifth-order hairpin filter, as shown in Figure 1. The hairpin filter achieves a more compact design by folding parallel-coupled line resonators into a “U” shape [37], essentially functioning as a coupled bandpass filter.

Its operation is based on the characteristics of microstrip lines and the propagation principles of electromagnetic waves. A microstrip line is a conductive wire fabricated on a dielectric substrate, possessing certain inductance and capacitance. When multiple microstrip lines are arranged in parallel, electromagnetic coupling occurs between them. This coupling causes a phase difference in the propagation of electromagnetic waves between the microstrip lines, resulting in filtering effects. Specifically, when an input signal passes through the hairpin filter, it is influenced by the coupling between the microstrip lines. Signals of different frequencies propagate at different speeds between the microstrip lines, leading to changes in the phase difference at the output end. By properly designing the parameters and arrangement of the microstrip lines, the phase difference within specific frequency ranges can be selectively increased or decreased, achieving filtering effects on the signal. Additionally, the inductance and capacitance characteristics of the microstrip lines also contribute to signal filtering. The inductance and capacitance of microstrip lines are frequency-dependent, and the propagation speed and impedance of signals on microstrip lines vary with different frequencies. By appropriately designing the inductance and capacitance of the microstrip lines, signals within specific frequency ranges can be selectively impeded or transmitted, achieving filtering effects.

By adjusting the geometric parameters and spacing of the microstrip lines, the resonant frequency and bandwidth of the hairpin filter can be controlled, selectively impeding or transmitting signals within specific frequency ranges, achieving filtering effects. As shown in Figure 1, the arm lengths ($L_{1}$, $L_{2}$, $Lt$), gaps ($d_{1}$, $d_{2}$), and line width ($W$) are all related to the frequency response of the filter, affecting its stopband attenuation and selectivity.

3.1. Framework

Figure 2 illustrates the design flowchart of the optimization strategy proposed for the hairpin filter. We propose a method for designing and optimizing the hairpin filter using neural network modeling and reinforcement learning techniques. In this method, we train an ANN model using EM simulation data of the hairpin filter. The trained ANN model is then used as a substitute for EM simulation and an RL agent is established to interact with the design environment of the hairpin filter. The design solutions are evaluated using the ANN model and reasonable rewards are given to the agent, allowing it to learn the strategies for designing the hairpin filter during this process. Therefore, the entire design process can be divided into two stages.

The initial stage, known as the ANN modeling stage, involves simulating and modeling the electrical behavior of the hairpin filter. To acquire different physical characteristics of the filter under various designs, we scan the geometric parameters that might influence its performance within a chosen design space and conduct simulations. We then create a database that maps the relationship between the geometric and performance parameters, serving as the foundation for training the ANN model. This enables us to develop a cost-effective model that provides the circuit performance, specifically the S-parameters in this instance, with significantly lower computational and time costs compared to traditional electromagnetic simulation software.

In the second stage, namely the reinforcement learning phase, we employ the PPO algorithm. PPO retains the concepts of the actor network and critic network within the Actor–Critic (AC) architecture: the actor network is a policy function that determines what action to take based on the current state of the environment to achieve the maximum reward; the critic network’s task is to estimate the value of the current state, minimizing the error between the predicted state value function and the actual reward. The actor makes decisions based on the critic’s evaluation and optimizes the policy through feedback from the environment. The critic updates the value function based on the actual reward and the action chosen by the actor, providing a more accurate value assessment. Through this mechanism, the agent continuously interacts with the environment by taking actions, and optimizes its own strategy by observing the current state and feedback from the environment, ultimately leading the agent to adopt a strategy that aligns with our expectations and completes the optimization of the hairpin filter.

Our proposed method is not limited to hairpin filters. This method can be extended to the sizing design of other microwave circuits as well.

3.2. Training an ANN for Predicting the S-Parameters of Hairpin Filters

In this section, we delve into the specifics of the training phase for the artificial neural network applied to the hairpin filter. Given that the S-parameters of the microstrip filter precisely delineate its passband, stopband, insertion loss, and return loss frequency response characteristics, we designated the S-parameters as our research objective. We established an artificial neural network model to decipher the nonlinear correlation between the geometric design parameters of the hairpin filter and its S-parameters (amplitude). By varying the geometric design parameters of the hairpin filter, we investigated a potential design space, as illustrated in Table 1. We employed the electromagnetic simulation software, HFSS, to acquire the corresponding return loss and insertion loss of the filter. In this phase, we simulated 1759 distinct combinations of geometric parameters. Our goal was to uniformly distribute the sampled design solutions across the design space to ensure our samples effectively encapsulate the frequency characteristics of the hairpin filter within this design space.

The quality of the dataset significantly influences the performance of the final neural network model. Thus, we initially cleanse the acquired simulation data by eliminating outliers and duplicate entries. Subsequently, each simulation dataset is shuffled to form pairs of design-frequency-S parameter data, amounting to a total of 118,698 data pairs. Ultimately, the acquired dataset is standardized. We have engineered an artificial neural network model to predict the S parameters of the hairpin filter. The model’s input layer comprises eight geometric design parameters of the hairpin filter and simulated frequency values. Four hidden layers are constructed, each containing 128 neurons, with the ReLU function serving as their activation function. Furthermore, the amplitude (expressed in dB) of S11 and S21 of the designed filter at the simulated frequency point functions as the output of the artificial neural network, as depicted in Figure 3.
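The network topology described above can be sketched as a plain forward pass. This is only an illustration under stated assumptions: the weights are random placeholders rather than trained values, the linear output layer is our assumption for a regression on dB values, and the names `init_network` and `forward` are hypothetical.

```python
import math
import random

N_GEOMETRY = 8               # geometric design parameters
N_INPUTS = N_GEOMETRY + 1    # plus the simulated frequency value
N_HIDDEN = 128               # neurons per hidden layer
N_HIDDEN_LAYERS = 4
N_OUTPUTS = 2                # S11 and S21 amplitudes in dB


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a trained model would load fitted values."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def init_network(seed=0):
    """Build the 9 -> 128 -> 128 -> 128 -> 128 -> 2 layer stack."""
    rng = random.Random(seed)
    sizes = [N_INPUTS] + [N_HIDDEN] * N_HIDDEN_LAYERS + [N_OUTPUTS]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def forward(network, x):
    """Map one standardized input vector to [S11_dB, S21_dB]."""
    activation = list(x)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        # ReLU on hidden layers; the output layer is left linear (assumption).
        activation = z if index == last else [max(0.0, v) for v in z]
    return activation


network = init_network()
sample = [0.0] * N_GEOMETRY + [0.5]   # hypothetical standardized input
prediction = forward(network, sample)
```

Because the frequency is an input, one call yields the response at a single frequency point; a full S-parameter curve would be obtained by evaluating the model once per frequency.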

During the training phase, we define the accuracy using the coefficient of determination, $R^{2}$, as given in Formula (1):

$$R^{2} = 1 - \frac{\sum_{i = 1}^{m} (f(x_{i}) - y_{i})^{2}}{\sum_{i = 1}^{m} (y_{i} - \bar{y})^{2}}, \tag{1}$$

where m refers to the total number of items in the current training or testing batch, while i stands for the index of each item. $f(x_{i})$ is the neural network’s predicted value when the input is $x_{i}$, and $y_{i}$ is the actual observed value (i.e., the result of the simulation) for $x_{i}$. $\bar{y}$ represents the average of the actual observed values for the current batch. A larger $R^{2}$ value indicates that the model is more successful at explaining the observed data.

The loss, on the other hand, is defined using the mean squared error (MSE), as given in Formula (2):

$$Loss = \frac{1}{m} \sum_{i = 1}^{m} (f(x_{i}) - y_{i})^{2}, \tag{2}$$

where m represents the total number of items in the current training or testing batch, while i represents the index of each item within the batch. $x_{i}$ refers to the input of the neural network, that is, the geometric dimensions of the hairpin filter together with the simulated frequency value. $f(x_{i})$ and $y_{i}$ correspond to the predicted value associated with $x_{i}$ and the actual label value, respectively.
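Both metrics are straightforward to compute for one batch. The sketch below is a minimal illustration, assuming the standard form of the coefficient of determination (residual sum of squares over the total sum of squares about the batch mean); the function names and the sample dB values are hypothetical.

```python
def r2_score(predicted, observed):
    """Coefficient of determination for one batch.

    Assumes the observed values are not all identical, otherwise the
    total sum of squares is zero and the ratio is undefined.
    """
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((p - y) ** 2 for p, y in zip(predicted, observed))
    ss_tot = sum((y - mean_observed) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def mse_loss(predicted, observed):
    """Mean squared error between predictions and simulation labels."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


# Hypothetical S-parameter amplitudes in dB (not taken from the paper).
observed = [-20.0, -15.0, -10.0, -5.0]
predicted = [-19.0, -15.0, -11.0, -5.0]

accuracy = r2_score(predicted, observed)   # 1 - 2/125 = 0.984
loss = mse_loss(predicted, observed)       # 2/4 = 0.5
```

A perfect prediction gives $R^{2} = 1$ and a loss of 0, while a model that always outputs the batch mean gives $R^{2} = 0$.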

3.3. The PPO in the Reinforcement Learning Algorithm

This section outlines the foundational concepts of reinforcement learning and the basic principles of the PPO algorithm. Reinforcement learning can be formulated as a Markov Decision Process (MDP), wherein the current decision depends only on the present state and not on previous states. Thus, in the reinforcement learning process, we define a sequence of states, actions, and rewards at each time step, $\tau = \{s_{1}, a_{1}, r_{1}, s_{2}, a_{2}, r_{2}, \dots, s_{T}, a_{T}, r_{T}\}$, where T denotes the termination time step of the learning process. The objective of reinforcement learning is to maximize future rewards, which are discounted using a discount factor $\gamma$. We define the discounted return as $R_{t} = \sum_{t' = t}^{T} \gamma^{t' - t} r_{t'}$. PPO is a type of policy gradient algorithm. Its goal is to maximize the discounted return by directly optimizing the parameters $\theta$ of the policy $\pi_{\theta}$. The most widely used estimator for computing policy gradients is given in Equation (3):

$$\hat{g} = \hat{E}_{t} [\nabla_{\theta} \log \pi_{\theta}(a_{t} \mid s_{t}) \hat{A}_{t}], \tag{3}$$

where $\hat{A}_{t}$ represents the estimate of the advantage function at the current time step, $\hat{E}_{t}[\dots]$ signifies the empirical average over a finite batch of samples, and $\hat{g}$ can be obtained by differentiating the loss function $L^{PG}(\theta)$ with respect to the parameter $\theta$:

$$L^{PG}(\theta) = \hat{E}_{t} [\log \pi_{\theta}(a_{t} \mid s_{t}) \hat{A}_{t}], \tag{4}$$

where the symbols have the same meanings as in Equation (3): $\pi_{\theta}(a_{t} \mid s_{t})$ represents the probability of taking action $a_{t}$ in state $s_{t}$, and $\hat{A}_{t}$ represents the estimate of the advantage function at the current time step.

However, this updating method of policy gradients is susceptible to excessively updating the policy parameters, which disrupts the stability of the training process. To address this issue, PPO introduces a clipping mechanism to limit the magnitude of updates between the new and old policies, enabling a more stable policy update. Initially, we calculate the probability ratio between the new and old policies, denoted as $r_{t}(\theta)$, with Formula (5):

$$r_{t}(\theta) = \frac{\pi_{\theta}(a_{t} \mid s_{t})}{\pi_{\theta_{old}}(a_{t} \mid s_{t})}, \tag{5}$$

where $\pi_{\theta}(a_{t} \mid s_{t})$ represents the probability of selecting action $a_{t}$ by the current policy in state $s_{t}$, while $\pi_{\theta_{old}}(a_{t} \mid s_{t})$ represents the probability of selecting action $a_{t}$ by the original policy before the update in state $s_{t}$. Based on $r_{t}(\theta)$, PPO introduces a new objective function, denoted as $L^{CLIP}(\theta)$:

$$L^{CLIP}(\theta) = \hat{E}_{t} [\min(r_{t}(\theta) \hat{A}_{t}, \operatorname{clip}(r_{t}(\theta), 1 - \epsilon, 1 + \epsilon) \hat{A}_{t})], \tag{6}$$

where $\epsilon$ is a hyperparameter. The second term inside the $\min$, $\operatorname{clip}(r_{t}(\theta), 1 - \epsilon, 1 + \epsilon) \hat{A}_{t}$, modifies the surrogate objective by clipping the probability ratio, which removes the incentive for moving $r_{t}(\theta)$ outside the range $[1 - \epsilon, 1 + \epsilon]$. Finally, the minimum of the unclipped and clipped terms is taken. This method is easier to implement and more effective compared to the approach proposed by trust region policy optimization (TRPO) [38].
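The clipped objective in Equation (6) can be illustrated on a small batch. The sketch below is not the implementation used in this work; the function names are hypothetical, and $\epsilon = 0.2$ is only a commonly used default, since no value is stated here.

```python
import math


def probability_ratio(log_prob_new, log_prob_old):
    """r_t(theta) from Equation (5), computed from log-probabilities."""
    return math.exp(log_prob_new - log_prob_old)


def clip(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Empirical average of min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    terms = []
    for r, a in zip(ratios, advantages):
        unclipped = r * a
        clipped = clip(r, 1.0 - epsilon, 1.0 + epsilon) * a
        terms.append(min(unclipped, clipped))
    return sum(terms) / len(terms)


# Positive advantage: the gain is capped once the ratio exceeds 1 + eps.
capped_gain = ppo_clip_objective([1.5], [1.0])       # 1.2 instead of 1.5
# Negative advantage: the term is held at (1 - eps) * A once the ratio drops below 1 - eps.
capped_loss = ppo_clip_objective([0.5], [-1.0])      # -0.8 instead of -0.5
# Negative advantage with a large ratio: the unclipped term is kept.
full_penalty = ppo_clip_objective([1.5], [-1.0])     # -1.5
```

Taking the minimum makes the objective a pessimistic bound: changes in the ratio that would improve the objective are ignored beyond the clipping range, while changes that make it worse are still fully counted.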

In Algorithm 1, we provide the pseudocode for the specific steps of designing a hairpin filter using the PPO reinforcement learning algorithm. Algorithm 1 also gives additional details regarding the implementation and deployment of the ANN forward model in our proposed method.

3.4. Reinforcement Learning Approach for Optimizing Hairpin Filters

In this study, we utilized the PPO reinforcement learning framework within a continuous action space to execute the task. Herein, we delineate the reinforcement learning configuration employed for the hairpin filter design task, ensuring the reinforcement learning methodology is suited to the task at hand. The reinforcement learning environment is defined by states, actions, and rewards, with a specific focus on the reward definition. A well-structured reward can steer the agent towards learning more effective strategies and directly influence the speed of convergence and the final performance of the algorithm. The amalgamation of reinforcement learning and artificial neural network techniques markedly diminishes the time spent in the iterative search for the optimal policy. Nonetheless, it remains vital to accurately assess the agent’s behavior within the environment. Constructing a logical reward function that aligns with the practical design requirements is a fundamental aspect of reinforcement learning.

The ANN model is deployed as a substitute for an EM simulator, such as the Ansys High-Frequency Structure Simulator (HFSS), as discussed in the preceding section. The ANN model provides a more rapid response to the agent’s actions than solvers, thereby facilitating more efficient learning.

State: The state is represented by concatenating two types of feature vectors: (1) a vector comprising the geometric design parameters of the hairpin filter, namely [$W$, $Lt$, $L_{1}$, $L_{2}$, $d_{1}$, $d_{2}$, $tw$, $scpwg$], which are normalized based on the upper and lower limits of each parameter’s design space; and (2) the S-parameters (S11 and S21) of the corresponding designed filter at specific frequency points within the simulation frequency range. In this instance, we design within a simulation frequency band from 60 GHz to 125 GHz with a 1 GHz interval, resulting in 66 frequency points for each of S11 and S21, that is, 132 S-parameter values in total.
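The state construction can be sketched as follows. The parameter bounds and S-parameter values below are hypothetical placeholders (the actual design space is the one given in Table 1), min-max scaling to [0, 1] is our assumption for the normalization, and `build_state` is an illustrative name.

```python
PARAMETER_NAMES = ["W", "Lt", "L1", "L2", "d1", "d2", "tw", "scpwg"]

# 60 GHz to 125 GHz inclusive with a 1 GHz step gives 66 frequency points.
FREQUENCIES_GHZ = list(range(60, 126))


def normalize(value, low, high):
    """Min-max scaling to [0, 1] using the design-space limits (assumption)."""
    return (value - low) / (high - low)


def build_state(design, bounds, s11_db, s21_db):
    """Concatenate normalized geometry with the S11 and S21 curves."""
    geometry = [normalize(design[name], *bounds[name]) for name in PARAMETER_NAMES]
    return geometry + list(s11_db) + list(s21_db)


# Hypothetical limits and design values, for illustration only.
bounds = {name: (10.0, 50.0) for name in PARAMETER_NAMES}
design = {name: 30.0 for name in PARAMETER_NAMES}

# Placeholder curves; in the proposed flow these come from the ANN model.
s11_db = [-12.0] * len(FREQUENCIES_GHZ)
s21_db = [-3.0] * len(FREQUENCIES_GHZ)

state = build_state(design, bounds, s11_db, s21_db)
```

With 8 geometric parameters and 132 S-parameter values, the resulting state vector has 140 entries.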

Algorithm 1 Automatic design of hairpin filters based on PPO.

Initialize the actor policy networks $\pi_{\theta}$ and $\pi_{\theta_{old}}$ and the critic value network $V(s; w)$.
Initialize the replay buffer D.
Randomly initialize the design of the hairpin filter.
for episode = 1 to M do
    Call the ANN forward model to predict the S-parameters of the current hairpin filter design.
    Initialize the state $s_{1}$.
    for step = 1 to T do
        Select action $a_{t}$ according to the current policy $\pi_{\theta}$.
        Adjust the design of the hairpin filter by performing the corresponding action.
        Calculate the S-parameters of the newly designed hairpin filter with the ANN model.
        Generate the new state $s_{t + 1}$ and compute the reward $r_{t}$.
        Store the transition ($s_{t}$, $a_{t}$, $r_{t}$, $s_{t + 1}$) in D.
        if data volume of D ≥ minimum batch size then
            Estimate advantages $\hat{A}_{t} = \sum_{t' > t} \gamma^{t' - t} r_{t'} - V(s_{t}; w)$.
            for k = 1 to K do
                $\pi_{\theta_{old}} \leftarrow \pi_{\theta}$
                Calculate $L(\theta)$ based on Equation (6).
                Update the parameters of the policy network $\pi_{\theta}$: $\theta \leftarrow \theta + \alpha \nabla_{\theta} L(\theta)$.
                Update the critic network by minimizing the loss $L(w) = \frac{1}{T} \sum_{t = 1}^{T} (\hat{A}_{t})^{2}$: $w \leftarrow w - \beta \nabla_{w} L(w)$.
            end for
        end if
    end for
end for
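The advantage-estimation step of Algorithm 1 can be sketched as a backward pass over one episode. This is a minimal illustration under stated assumptions: the return includes the reward at step t, following the definition of $R_{t}$ in Section 3.3 (the sum in Algorithm 1 is written over later steps only), and the rewards and critic values below are hypothetical.

```python
def discounted_returns(rewards, gamma):
    """R_t = r_t + gamma * R_{t+1}, accumulated from the last step backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def estimate_advantages(rewards, values, gamma):
    """Advantage as discounted return minus the critic's state value."""
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]


# Hypothetical three-step episode.
rewards = [1.0, 1.0, 1.0]
values = [1.0, 1.0, 1.0]          # critic predictions V(s_t; w)

returns = discounted_returns(rewards, gamma=0.5)            # [1.75, 1.5, 1.0]
advantages = estimate_advantages(rewards, values, gamma=0.5)  # [0.75, 0.5, 0.0]
```

A positive advantage means the episode turned out better than the critic expected from that state, so the corresponding action is reinforced by the policy update.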

Action: The agent’s actions govern its interaction with the environment. This framework employs a continuous action space defined as $[-1, 1]^{m}$, where m corresponds to the number of geometric parameters for designing the hairpin filter. In this paper, m = 8. For instance, $a_{t}^{i} = 1$ signifies an increase in the upper limit value of the $i$th geometric parameter size in the current design space of the hairpin filter.
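One possible way to map such an action onto a design is sketched below. The mapping itself is our assumption, not a detail given in this paper: each action component is scaled to a fraction of the parameter’s range and the result is clipped to the design-space limits. The name `apply_action`, the `max_step_fraction` parameter, and the bounds are all hypothetical.

```python
def apply_action(design, action, bounds, max_step_fraction=0.1):
    """Shift each parameter by action * max_step_fraction * range, then clip.

    Assumed mapping for illustration only; the step size is hypothetical.
    """
    new_design = []
    for value, a, (low, high) in zip(design, action, bounds):
        a = max(-1.0, min(1.0, a))               # keep the action in [-1, 1]
        step = a * max_step_fraction * (high - low)
        new_design.append(max(low, min(high, value + step)))
    return new_design


# Hypothetical single-parameter examples with limits 0 and 10.
increased = apply_action([5.0], [1.0], [(0.0, 10.0)])    # full positive step
saturated = apply_action([9.5], [1.0], [(0.0, 10.0)])    # clipped at the upper limit
decreased = apply_action([5.0], [-0.5], [(0.0, 10.0)])   # half negative step
```

Clipping keeps every candidate design inside the region on which the ANN model was trained, where its predictions can be expected to be reliable.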

Reward: The design of a hairpin filter is typically associated with electrical characteristics such as passband range, insertion loss, and return loss. Consequently, the reward function must reflect the quality of the relevant design criteria. Taking return loss as an example, it gauges how well the filter is matched for signal transmission within the passband. Generally, designers aim for the S11 of the filter to be below a certain threshold within the passband and above a certain threshold within the stopband, as depicted in Figure 4.

The assessment of return loss (S11) can be achieved by assigning weighted scores to the performance at each frequency point in both the passband and the stopband. Specifically, within the passband, a straightforward approach involves calculating the score for each frequency point by determining the distance between a specific standard value l (for example, −15 dB) and the S11 at the corresponding frequency point. If the S11 surpasses the standard value at this point, the score is set to be negative; conversely, if the S11 is below the standard value, the score is positive. To avoid the tendency of the system to overly focus on individual frequency points with high scores while neglecting the overall performance across the entire frequency range, we can apply the function $g(x) = 1 - e^{-(1 + x)}$ to scores greater than 0 to limit score inflation. Consequently, the score $F_{p}(f)$ for a single frequency point f within the passband can be represented by Equation (7):

$$F_{p}(f) = \begin{cases} l - S_{11}(f) & \text{if } S_{11}(f) > l \\ g(l - S_{11}(f)) & \text{if } S_{11}(f) \leq l \end{cases} \tag{7}$$

where $S_{11}(f)$ denotes the dB value of S11 corresponding to the frequency point f, l represents the standardized metric within the passband, and $F_{p}(f)$ stands for the evaluation score for a single frequency point f.

The overall performance score of the entire passband, represented by

F_{p}

, can be derived by calculating the average score of all frequency points within the passband, as demonstrated in Equation (8):

F_{p} = \frac{1}{n} \sum_{i = 0}^{n - 1} F_{p} (f_{i}),

(8)

where n denotes the number of frequency points on the passband, i is the index of the frequency point, $F_{p}(f_{i})$ represents the score estimation at the frequency point $f_{i}$, and $F_{p}$ signifies the overall score estimation after taking into account all the frequency points on the passband.
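As a minimal sketch of Equations (7) and (8), the passband score could be computed as shown below. The function names (`g`, `passband_point_score`, `passband_score`) and the sample S11 values are illustrative assumptions rather than the authors' code; only the formulas and the example standard value l = −15 dB come from the text.

```python
import math

def g(x):
    """Saturating function g(x) = 1 - exp(-(1 + x)) used to limit score inflation."""
    return 1.0 - math.exp(-(1.0 + x))

def passband_point_score(s11_db, l=-15.0):
    """Equation (7): score of a single passband frequency point (values in dB)."""
    margin = l - s11_db           # positive when S11 lies below the standard value l
    if s11_db > l:
        return margin             # specification violated: negative, unbounded penalty
    return g(margin)              # specification met: positive score bounded below 1

def passband_score(s11_values_db, l=-15.0):
    """Equation (8): average of the per-point scores over the passband."""
    scores = [passband_point_score(s, l) for s in s11_values_db]
    return sum(scores) / len(scores)

# Hypothetical S11 samples (dB) at three passband frequency points.
samples = [-20.0, -15.0, -10.0]
print(passband_score(samples))
```

Because the positive branch saturates while the negative branch grows linearly, a single badly violated frequency point lowers the average more than a very good point can raise it.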

Similarly, we can calculate the performance score $F_{s}$ within the stopband. By combining these two scores with appropriate weights, we can compute the overall performance score $F_{S_{11}}$ for the return loss:

F_{S_{11}} = β_{1} F_{p} + β_{2} F_{s},

(9)

where $β_{1}$ and $β_{2}$ are the weighting coefficients, $F_{p}$ represents the composite score on the passband, $F_{s}$ represents the composite score on the stopband, and $F_{S_{11}}$ represents the composite score for S11 after combining both the stopband and passband scores.
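The weighted combination in Equation (9) can be sketched as follows. The text does not spell out the stopband formula, so this sketch assumes it mirrors Equation (7) with the inequality reversed (S11 should exceed the stopband threshold). The thresholds of −15 dB and −1 dB follow Figure 4, while the equal weights and all function names are hypothetical.

```python
import math

def bounded(x):
    """g(x) = 1 - exp(-(1 + x)), the saturating function from the passband definition."""
    return 1.0 - math.exp(-(1.0 + x))

def band_score(s11_values_db, threshold_db, want_below):
    """Average per-point score; want_below=True for the passband, False for the stopband."""
    total = 0.0
    for s in s11_values_db:
        # Signed margin: non-negative when the point satisfies the specification.
        margin = (threshold_db - s) if want_below else (s - threshold_db)
        total += bounded(margin) if margin >= 0 else margin
    return total / len(s11_values_db)

def s11_score(passband_db, stopband_db, beta1=0.5, beta2=0.5,
              l_pass=-15.0, l_stop=-1.0):
    """Equation (9): F_S11 = beta1 * F_p + beta2 * F_s (weights are assumed values)."""
    f_p = band_score(passband_db, l_pass, want_below=True)
    f_s = band_score(stopband_db, l_stop, want_below=False)  # assumed mirror of Eq. (7)
    return beta1 * f_p + beta2 * f_s

# Hypothetical S11 samples (dB) in the passband and the stopband.
print(s11_score([-18.0, -16.0], [-0.5, -0.8]))
```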

If we want to consider the performance score of the insertion loss (S21) as well, a similar method can be applied; the detailed process is omitted here. We can then obtain the overall performance score of the filter as $F_{total} = γ_{1} F_{S_{11}} + γ_{2} F_{S_{21}}$, where $F_{S_{21}}$ represents the score of the insertion loss, taking into account both the stopband and passband performance. Here, $γ_{1}$ and $γ_{2}$ are weighting coefficients.

Additionally, we introduce the concepts of short-term objective and ultimate objective to provide additional rewards $r_{extra}$:

r_{extra} = \begin{cases} 10 & \text{if } E_{g} \text{ is achieved} \\ 1 & \text{if } E_{s} \text{ is achieved} \\ 0 & \text{otherwise} \end{cases},

(10)

where $E_{s}$ represents the short-term objective and $E_{g}$ represents the ultimate objective. Essentially, these short-term and ultimate objectives are the same as the design objectives mentioned earlier, but the indicators of the short-term objectives are easier to achieve than those of the ultimate objectives. If the current design meets or surpasses the short-term objective $E_{s}$, a reward of 1 is given to the system. If the ultimate objective $E_{g}$ is met or exceeded, the system receives a reward of 10. If neither objective is achieved, no additional reward is provided.

The ultimate reward function R is defined as follows:

R = η \cdot F_{total} + r_{extra},

(11)

where $η$ is a weighting factor, $F_{total}$ is the previously defined score for the overall performance of the filter, and $r_{extra}$ is the additional reward.
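Equations (10) and (11) can be sketched in a few lines. The function names, the boolean flags, and the default η = 1 are assumptions made for illustration; the bonus values 10 and 1 are taken from Equation (10). The ultimate objective is tested first because a design that satisfies it would normally also satisfy the easier short-term objective.

```python
def extra_reward(meets_ultimate, meets_short_term):
    """Equation (10): bonus for reaching the ultimate (E_g) or short-term (E_s) objective."""
    if meets_ultimate:
        return 10.0
    if meets_short_term:
        return 1.0
    return 0.0

def reward(f_total, meets_ultimate, meets_short_term, eta=1.0):
    """Equation (11): R = eta * F_total + r_extra (eta = 1.0 is an assumed default)."""
    return eta * f_total + extra_reward(meets_ultimate, meets_short_term)

# Hypothetical design that meets only the short-term objective.
print(reward(0.4, meets_ultimate=False, meets_short_term=True))
```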

4. Results and Discussion

4.1. Training of the Proposed ANN

We processed the simulation data from the filters, yielding a total of 118,698 pairs of design-frequency-S parameter data. These pairs were randomly shuffled, with 80% (94,958 pairs) designated as the training dataset and the remaining 20% (23,740 pairs) as the testing dataset. The artificial neural network (ANN) was trained over 200 epochs using an initial learning rate of $1 \times 10^{-3}$, which was decayed by a factor of 0.9 every five epochs. The ANN's training outcomes are depicted in Figure 5a. To show the fluctuations in the loss function and accuracy during the later stages of training more clearly, the plot is restricted to the range of 0 to 5 for the loss function and 96.5% to 100% for the accuracy. The loss and accuracy curves show that the neural network model converged to a low error level in a relatively brief period and then progressively stabilized. The error converged to 0.4507, with an accuracy of 99.76% on the training set. On the testing set, the accuracy ultimately stabilized at 99.4%, with no significant overfitting detected.
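The training setup described above can be illustrated with a short sketch of the dataset split and the step-decay learning rate. The function name `step_decay_lr` and the choice of 0-indexed epochs (so that the first decay takes effect at epoch 5) are assumptions; the text only states the initial rate, the decay factor, and the decay interval.

```python
def step_decay_lr(epoch, base_lr=1e-3, factor=0.9, every=5):
    """Learning rate at a 0-indexed epoch when decaying by `factor` every `every` epochs."""
    return base_lr * factor ** (epoch // every)

# 80/20 split of the 118,698 data pairs reported in the text.
total_pairs = 118_698
test_pairs = round(0.2 * total_pairs)
train_pairs = total_pairs - test_pairs
print(train_pairs, test_pairs)

# Learning rate at a few selected epochs of the 200-epoch run.
for epoch in (0, 4, 5, 199):
    print(epoch, step_decay_lr(epoch))
```

Under these assumptions, the learning rate has been decayed 39 times by the final epoch.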

We selected a test group to compare the prediction results of the ANN model with the EM simulation results. As shown in Figure 5b, the predictions of the ANN model closely match the actual EM simulation results and accurately follow the trends of S11 and S21. This suggests that the ANN model can replace the EM simulation software in predicting the electrical performance of the hairpin filter in our experiments.

We further examined the absolute error distribution of the ANN model on the testing dataset. The absolute error distributions for S11 and S21 are depicted in Figure 5c,d, respectively, with the vertical axes on a logarithmic scale. For S11, 19,186 data points, approximately 80.82% of the entire testing dataset, have absolute errors under 0.2; 95.0% of the data exhibit absolute errors less than 1, and the fraction of data pairs with absolute errors less than 3 reaches 97.96%. For S21, around 75.88% of the data display errors less than 0.5, while 91.27% of the data possess prediction errors within 1. This suggests that the ANN model has effectively learned the nonlinear relationship between geometric dimensions and electrical behavior. As depicted in Figure 5b, the ANN model agrees closely with the actual simulation results and reproduces the frequency response characteristics of the designed filter. Consequently, we infer that the established ANN model is well suited for predicting the behavior of the hairpin filter and can replace the electromagnetic simulation kernel for this purpose.
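The error statistics quoted above are cumulative fractions of the absolute prediction error. A minimal sketch of how such fractions could be computed is given below; the function name and the four sample values are hypothetical, and only the counts 19,186 and 23,740 come from the text.

```python
def fraction_within(predicted, reference, tolerance):
    """Share of samples whose absolute prediction error is below `tolerance`."""
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return sum(e < tolerance for e in errors) / len(errors)

# Hypothetical ANN predictions and EM reference values.
ann = [-15.1, -20.4, -9.0, -30.0]
em = [-15.0, -20.0, -12.5, -30.05]
for tol in (0.2, 1.0, 3.0):
    print(tol, fraction_within(ann, em, tol))

# Consistency check of the reported S11 figure: 19,186 of 23,740 test pairs.
print(round(19_186 / 23_740 * 100, 2))
```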

4.2. The Experimental Results of Reinforcement Learning

To assess the proposed hairpin filter design optimization method, we carried out the design over a frequency band of 60 GHz to 125 GHz. Within the passband of 85 GHz to 100 GHz, we stipulated that the return loss S11 should be below −11 dB and the insertion loss S21 should be above −4 dB. Within the stopbands of 60 GHz to 79 GHz and 105 GHz to 125 GHz, our objectives were for S11 to exceed −1 dB and for S21 to be less than −25 dB. Each episode consisted of 100 optimization steps.
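The design goals above can be written down as data and checked against a sampled frequency response. The sketch below is one possible encoding; the tuple layout, the `meets_goals` helper, and the sample responses are assumptions, while the band edges and dB limits are those stated in the text. Frequencies outside the listed bands (for example, 79 GHz to 85 GHz) are left unconstrained.

```python
# Each goal: (band start in GHz, band stop in GHz, parameter, comparison, limit in dB).
GOALS = [
    (85.0, 100.0, "S11", "below", -11.0),
    (85.0, 100.0, "S21", "above", -4.0),
    (60.0, 79.0, "S11", "above", -1.0),
    (60.0, 79.0, "S21", "below", -25.0),
    (105.0, 125.0, "S11", "above", -1.0),
    (105.0, 125.0, "S21", "below", -25.0),
]

def meets_goals(response, goals=GOALS):
    """Check a response given as a list of (freq_GHz, S11_dB, S21_dB) samples."""
    for f_lo, f_hi, name, sense, limit in goals:
        for freq, s11, s21 in response:
            if not (f_lo <= freq <= f_hi):
                continue  # sample lies outside this band
            value = s11 if name == "S11" else s21
            ok = value < limit if sense == "below" else value > limit
            if not ok:
                return False
    return True

# Hypothetical responses sampled at three frequencies.
good = [(70.0, -0.5, -30.0), (90.0, -15.0, -2.0), (110.0, -0.4, -28.0)]
bad = [(70.0, -0.5, -30.0), (90.0, -8.0, -2.0), (110.0, -0.4, -28.0)]
print(meets_goals(good), meets_goals(bad))
```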

We trained the reinforcement learning agent in five training runs and tracked the average reward the agent obtained throughout these design processes, as illustrated in Figure 6a. The shaded region represents the observed range of rewards (maximum and minimum values) across the five runs. Over the 6000 training episodes, the reward increased steadily and then remained near a stable level. From this trend, we conclude that the agent gradually learned a strategy that maximizes the reward. It also indicates that, as more episodes are completed, the design results of the hairpin filter approach or reach the design targets for return loss and insertion loss.
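The mean curve and the shaded range in Figure 6a correspond to per-episode statistics taken across the training runs. A minimal sketch, assuming each run is stored as a list of per-episode rewards of equal length (the function name and the sample numbers are hypothetical):

```python
def reward_envelope(runs):
    """Return per-episode (mean, minimum, maximum) reward across several training runs."""
    per_episode = list(zip(*runs))  # group the rewards of all runs by episode
    mean = [sum(vals) / len(vals) for vals in per_episode]
    low = [min(vals) for vals in per_episode]
    high = [max(vals) for vals in per_episode]
    return mean, low, high

# Hypothetical rewards from two runs over three episodes.
runs = [[1.0, 2.0, 4.0], [3.0, 6.0, 8.0]]
mean, low, high = reward_envelope(runs)
print(mean, low, high)
```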

Figure 6b records the number of tuning steps the agent needed to reach the design goal for the first time in each training episode. If the design goal is not reached in the current episode, the number of tuning steps is recorded as 100. From the 2000th episode onward, the agent begins to reach the design goal, and as training continues, it reaches the goal in progressively fewer steps.
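The bookkeeping behind Figure 6b can be sketched as follows, assuming each episode is summarized as a list of booleans that say whether the design goal was met after each tuning step. The function name and this representation are illustrative assumptions; the cap of 100 steps is taken from the text.

```python
MAX_STEPS = 100  # optimization steps per episode, as stated in the text

def steps_to_goal(goal_reached_per_step, max_steps=MAX_STEPS):
    """Return the 1-based step at which the goal is first reached, else `max_steps`."""
    for step, reached in enumerate(goal_reached_per_step[:max_steps], start=1):
        if reached:
            return step
    return max_steps

# Hypothetical episode in which the goal is first met at the third step.
print(steps_to_goal([False, False, True, True]))
```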

We demonstrate the fluctuations in the S-parameter curves during the testing experiment. In Figure 7, we validate the design proposals generated by the PPO algorithm using HFSS simulation software in the process of optimizing the filter. It can be observed from the figure that the initial design of the hairpin filter did not meet our performance requirements. However, the frequency response curves showed a clear optimization trend during the algorithm design process, improving the frequency response characteristics within the specified frequency range. In the experiment, the design proposals provided by the algorithm ultimately met the performance requirements and achieved the predetermined design goals for S11 and S21 parameters. In most cases, the trained agent was able to complete the design within approximately 10 steps. Additionally, to verify the actual effectiveness of the ANN in the design process, we provide the ANN evaluation results for the final design proposal in Figure 7.

4.3. Discussions

The results demonstrate the effectiveness of the PPO algorithm in designing hairpin filters. The trained PPO agent can modify the geometric dimensions of the filter to achieve the desired frequency response.

In the design of microstrip filters, the topology design of the filter structure is relatively mature, and most structures can be found in handbooks and the literature. However, engineers often face challenges in adjusting the size parameters of the filter. In the past, this process could only be completed manually by experienced engineers, which was expensive and time-consuming and lengthened the development cycle. In this work, the agent learns filter design experience by interacting with the design environment and replaces human engineers in executing this strategy. Additionally, we use a neural network model in place of the EM simulation commonly used by engineers, which speeds up the learning and design process and reduces computational costs. Table 2 compares the optimization targets, methods, and time costs reported in related work in recent years. Among all the compared methods, our proposed method accomplishes the design task in the shortest CPU operation time.

However, it is important to note that the neural network model is trained using EM simulation data, which means that the accuracy of the ANN model cannot surpass that of the EM simulation. This may result in the agent missing out on certain reasonable designs or providing designs that do not meet the specifications, which is one limitation of this method. This could potentially be addressed by improving the model and increasing the training dataset. In the majority of cases, using the ANN model as a replacement for the EM simulation is still reasonable.

Furthermore, the discussions in this paper are limited to exploring fixed topology circuit structures. These circuit structures are reasonable but also restrict the exploration space of the agent. A potential research direction is to use artificial intelligence techniques to explore new topology structures, which may be irregular but potentially more reasonable.

5. Conclusions

In this paper, we propose a novel method for optimizing the geometric dimensions of filters. This method utilizes artificial neural network modeling techniques to replace EM simulation and employs the PPO reinforcement learning algorithm to learn design strategies through trial and error. We validate this method on a hairpin filter and demonstrate its ability to correctly adjust structural parameters to meet design specifications. Compared to other related research directions, our method shows significant advantages in terms of time efficiency. Finally, we believe that this method has the potential to be applied to the design of other microwave circuits.

Author Contributions

Conceptualization, J.W.; methodology, Y.Y., Y.W., J.C., G.S. and J.W.; writing—original draft preparation, Y.Y. and Y.W.; writing—review and editing, J.W. and J.L.; supervision, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under grant number 62206081.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yousefi, M.; Aliakbarian, H.; Sadeghzadeh, R. Design and integration of a high-order hairpin bandpass filter with a spurious suppression circuit. In Proceedings of the 2015 Loughborough Antennas & Propagation Conference (LAPC), Loughborough, UK, 2–3 November 2015; pp. 1–4.
2. Ahmed, R.; Emiri, S.; İmeci, Ş.T. Design and analysis of a bandpass hairpin filter. In Proceedings of the 2018 International Applied Computational Electromagnetics Society Symposium (ACES), Denver, CO, USA, 25–29 March 2018; pp. 1–2.
3. Deng, P.H.; Lin, Y.S.; Wang, C.H.; Chen, C.H. Compact microstrip bandpass filters with good selectivity and stopband rejection. IEEE Trans. Microw. Theory Tech. 2006, 54, 533–539.
4. Hunter, I.C.; Billonet, L.; Jarry, B.; Guillon, P. Microwave filters-applications and technology. IEEE Trans. Microw. Theory Tech. 2002, 50, 794–805.
5. Levy, R.; Snyder, R.V.; Matthaei, G. Design of microwave filters. IEEE Trans. Microw. Theory Tech. 2002, 50, 783–793.
6. Su, J.; Cai, J.; Zheng, X.; Sun, L. A Fast Two-Tone Active Load-Pull Algorithm for Assessing the Non-linearity of RF Devices. Chin. J. Electron. 2022, 31, 25–32.
7. Miraftab, V.; Yu, M. Innovative combline RF/microwave filter EM synthesis and design using neural networks. In Proceedings of the 2007 International Symposium on Signals, Systems and Electronics, Montreal, QC, Canada, 30 July–2 August 2007; pp. 1–4.
8. Rizzoli, V.; Costanzo, A.; Masotti, D.; Lipparini, A.; Mastri, F. Computer-aided optimization of nonlinear microwave circuits with the aid of electromagnetic simulation. IEEE Trans. Microw. Theory Tech. 2004, 52, 362–377.
9. Fedeli, A.; Montecucco, C.; Gragnani, G.L. Open-source software for electromagnetic scattering simulation: The case of antenna design. Electronics 2019, 8, 1506.
10. Wei, Y.; Liu, J.; Sun, D.; Su, G.; Wang, J. From Netlist to Manufacturable Layout: An Auto-Layout Algorithm Optimized for Radio Frequency Integrated Circuits. Symmetry 2023, 15, 1272.
11. Goasguen, S.; El-Ghazaly, S.M. A practical large-signal global modeling simulation of a microwave amplifier using artificial neural network. IEEE Microw. Guid. Wave Lett. 2000, 10, 273–275.
12. Watson, P.M.; Gupta, K.C. EM-ANN models for microstrip vias and interconnects in dataset circuits. IEEE Trans. Microw. Theory Tech. 1996, 44, 2495–2503.
13. Singh, A.; Kaur, A. Synthesis of on-chip square spiral inductors for RFIC’s using artificial neural network toolbox and particle swarm optimization. Adv. Electron. Electr. Eng 2013, 3, 933–940.
14. Watson, P.M.; Gupta, K.C. Design and optimization of CPW circuits using EM-ANN models for CPW components. IEEE Trans. Microw. Theory Tech. 1997, 45, 2515–2523.
15. Gao, J.; Shen, L.; Luo, D. High frequency HEMT modeling using artificial neural network technique. In Proceedings of the 2015 IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), Ottawa, ON, Canada, 11–14 August 2015; pp. 1–3.
16. Roshani, S.; Heshmati, H.; Roshani, S. Design of a Microwave Lowpass–Bandpass Filter using Deep Learning and Artificial Intelligence. J. Inst. Electron. Comput. 2021, 3, 1–16.
17. Cao, Y.; Reitzinger, S.; Zhang, Q.J. Simple and efficient high-dimensional parametric modeling for microwave cavity filters using modular neural network. IEEE Microw. Wirel. Components Lett. 2011, 21, 258–260.
18. Kabir, H.; Wang, Y.; Yu, M.; Zhang, Q.J. High-dimensional neural-network technique and applications to microwave filter modeling. IEEE Trans. Microw. Theory Tech. 2009, 58, 145–156.
19. Liu, Z.; Hu, X.; Liu, T.; Li, X.; Wang, W.; Ghannouchi, F.M. Attention-based deep neural network behavioral model for wideband wireless power amplifiers. IEEE Microw. Wirel. Components Lett. 2019, 30, 82–85.
20. Sağık, M.; Altıntaş, O.; Ünal, E.; Özdemir, E.; Demirci, M.; Çolak, Ş.; Karaaslan, M. Optimizing the gain and directivity of a microstrip antenna with metamaterial structures by using artificial neural network approach. Wirel. Pers. Commun. 2021, 118, 109–124.
21. Jin, J.; Feng, F.; Zhang, Q.J. An overview of neural network techniques for microwave inverse modeling. In Proceedings of the 2021 IEEE International Symposium on Radio-Frequency Integration Technology (RFIT), Hualien, Taiwan, 25–27 August 2021; pp. 1–2.
22. Pan, G.; Wu, Y.; Yu, M.; Fu, L.; Li, H. Inverse modeling for filters using a regularized deep neural network approach. IEEE Microw. Wirel. Compon. Lett. 2020, 30, 457–460.
23. Gosal, G.; Almajali, E.; McNamara, D.; Yagoub, M. Transmitarray antenna design using forward and inverse neural network modeling. IEEE Antennas Wirel. Propag. Lett. 2015, 15, 1483–1486.
24. Kabir, H.; Wang, Y.; Yu, M.; Zhang, Q.J. Neural network inverse modeling and applications to microwave filter design. IEEE Trans. Microw. Theory Tech. 2008, 56, 867–879.
25. Zhang, C.; Jin, J.; Na, W.; Zhang, Q.J.; Yu, M. Multivalued neural network inverse modeling and applications to microwave filters. IEEE Trans. Microw. Theory Tech. 2018, 66, 3781–3797.
26. Zhang, N.; Liu, Z.; Wang, J. Machine-learning-enabled design and manipulation of a microfluidic concentration gradient generator. Micromachines 2022, 13, 1810.
27. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
28. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197.
29. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing atari with deep reinforcement learning. arXiv 2013, arXiv:1312.5602.
30. Ali, S.; Bhargava, A.; Saxena, A.; Kumar, P. A Hybrid Marine Predator Sine Cosine Algorithm for Parameter Selection of Hybrid Active Power Filter. Mathematics 2023, 11, 598.
31. Jawad, K.; Mahto, R.; Das, A.; Ahmed, S.U.; Aziz, R.M.; Kumar, P. Novel Cuckoo Search-Based Metaheuristic Approach for Deep Learning Prediction of Depression. Appl. Sci. 2023, 13, 5322.
32. Ülker, S. Particle swarm optimization application to microwave circuits. Microw. Opt. Technol. Lett. 2008, 50, 1333–1336.
33. Fallahpour, M.B.; Hemmati, K.D.; Parsayan, A.; Golmakani, A. Multi objective optimization of a LNA using genetic algorithm based on NSGA-II. In Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, Bandung, Indonesia, 17–19 July 2011; pp. 1–4.
34. Wang, Z.; Yang, J.; Hu, J.; Feng, W.; Ou, Y. Reinforcement learning approach to learning human experience in tuning cavity filters. In Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics (ROBIO), Zhuhai, China, 6–9 December 2015; pp. 2145–2150.
35. Wang, Z.; Ou, Y.; Wu, X.; Feng, W. Continuous reinforcement learning with knowledge-inspired reward shaping for autonomous cavity filter tuning. In Proceedings of the 2018 IEEE International Conference on Cyborg and Bionic Systems (CBS), Shenzhen, China, 25–27 October 2018; pp. 53–58.
36. Wang, B.X.; Zhao, W.S.; Wang, D.W.; Wang, J.; Li, W.; Liu, J. Optimal design of planar microwave microfluidic sensors based on deep reinforcement learning. IEEE Sens. J. 2021, 21, 27441–27449.
37. Esa, M.; Thayaparan, D.; Abdullah, M.; Malik, N.A.; Murad, N.A. Miniaturized microwave modified Koch fractal Hairpin Filter with harmonic suppression. In Proceedings of the 2010 IEEE Asia-Pacific Conference on Applied Electromagnetics (APACE), Port Dickson, Malaysia, 9–11 November 2010; pp. 1–4.
38. Schulman, J.; Levine, S.; Abbeel, P.; Jordan, M.; Moritz, P. Trust region policy optimization. In Proceedings of the 32nd International Conference on Machine Learning, PMLR, Lille, France, 6–11 July 2015; pp. 1889–1897.
39. Na, W.; Liu, K.; Cai, H.; Zhang, W.; Xie, H.; Jin, D. Efficient EM optimization exploiting parallel local sampling strategy and Bayesian optimization for microwave applications. IEEE Microw. Wirel. Components Lett. 2021, 31, 1103–1106.
40. Wei, J.; Chen, W.; Wu, Q.; Wang, H. Machine Learning-Assisted Automatic Filter Synthesis with Prior Knowledge and Its Application to Single-Mode Bandpass Filter Design. In Proceedings of the 2022 International Applied Computational Electromagnetics Society Symposium (ACES-China), Xuzhou, China, 9–12 December 2022; pp. 1–3.
41. Zhao, P.; Wu, K. Homotopy optimization of microwave and millimeter-wave filters based on neural network model. IEEE Trans. Microw. Theory Tech. 2020, 68, 1390–1400.
42. Ding, D.; Zhang, X.; Zhang, J.; Cao, Y.; Bai, J.L.; Yang, J. Multiobjective optimization of microwave circuits with many structural parameters and objectives. In Proceedings of the 2019 International Conference on Microwave and Millimeter Wave Technology (ICMMT), Guangzhou, China, 19–22 May 2019; pp. 1–3.

Figure 1. This is the topology structure of the hairpin filter used for the experiment.

Figure 2. The flowchart illustrates our design process for optimizing geometric parameters. It involves two stages: the ANN modeling stage and the reinforcement learning stage.

Figure 3. Schematic diagram that describes the process of modeling the S-parameters of a hairpin filter using artificial neural network technology.

Figure 4. The design specifications for the filter, in which only return loss is considered, are represented as S-parameters. Typically, these parameters should be below a certain threshold (indicated as −15 dB in the figure) within the passband (ranging from 85 GHz to 100 GHz as shown). Conversely, within the stopband (spanning from 60 GHz to 79 GHz and 105 GHz to 125 GHz as illustrated), they should surpass a different value (noted as −1 dB in the figure).

Figure 5. (a) The training process of the neural network model in terms of loss and accuracy. (b) The predictions of the neural network model for the design of the hairpin filter and the validation results through the HFSS simulation. (c) The absolute error of the neural network model in predicting S11. (d) The absolute error of the neural network model in predicting S21.

Figure 6. (a) The blue line represents the average reward during five training runs, while the shaded area indicates the observed reward range (maximum and minimum values) across these five runs. (b) Number of steps required to reach the optimization objective.

Figure 7. Comparison of frequency response curve variations during the PPO design process of hairpin filters. The red square scatter represents the predictions of the final design made by the ANN. (a) S11, (b) S21.

Table 1. Simulation parameter space for the hairpin filter in HFSS.

Design Parameter	Lower Bound (μm)	Upper Bound (μm)	Step Size (μm)
$W$	25	30	0.1
$Lt$	350	380	1
$L_{1}$	500	600	1
$L_{2}$	70	100	0.1
$d_{1}$	10	20	0.1
$d_{2}$	10	25	0.1
$tw$	30	60	0.1
$scpwg$	6	20	0.1

Table 2. Comparison of optimization results for different methods.

Reference	Design Target	Design Method	EM/ANN Model Evaluation Counts	CPU Operation Time
[39]	Coupled-line filter	BO with local sampling	1077	5.32 h
[40]	SIW * filter	Narrowband bandpass synthesis algorithm	101	44.75 h
[41]	Waveguide filter	Homotopy optimization + ANN	1500–17,263	8.3–12 min
[42]	Microstrip filter	MOEA/D-FDTD	≈2700	9.5 days
This work	Hairpin filter	PPO + ANN	5–15	Seconds

* substrate integrated waveguide.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
