Article

CNN-Based Surrogate Models of the Electrostatic Field for a MEMS Motor: A Bi-Objective Optimal Shape Design

by Paolo Di Barba 1, Maria Evelina Mognaschi 1,* and Slawomir Wiak 2
1 Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy
2 Institute of Mechatronics and Information Systems, Lodz University of Technology, 90-924 Lodz, Poland
* Author to whom correspondence should be addressed.
Electronics 2022, 11(23), 3877; https://doi.org/10.3390/electronics11233877
Submission received: 31 October 2022 / Revised: 17 November 2022 / Accepted: 22 November 2022 / Published: 24 November 2022

Abstract

The use of a convolutional neural network to develop a surrogate model of the electric field in MEMS devices is proposed. An electrostatic micromotor is considered as the case study. In particular, different CNNs are trained for the prediction of the torque profile and the maximum torque value at a no-load condition and the radial force which could arise in case of the radial displacement of the rotor during motion. The proposed deep learning approach is able to predict the abovementioned quantities with a low error and, in particular, it allows for a decrease in the computational cost, especially in case of optimization problems based on FE models.

1. Introduction

In recent years, the impressive developments in the area of machine learning and big data have paved the way for pattern recognition in field problems [1]. In this respect, deep learning techniques have been used to build surrogate models for the field analysis of electromagnetic devices [2,3]. Convolutional Neural Networks (CNNs) are neural networks that allow one to evaluate a field quantity (either a value or a vector of values, e.g., a field distribution) starting from an image [4]. The training of these CNNs is usually conducted with numerical analyses, and in particular, a Finite Element Model (FEM) is used. Once a neural network is trained, it can be used as a surrogate model and inserted in an optimization loop [5,6]. This way, the computational burden is limited to the net training while the evaluation of the field quantities during the optimization loop is rather inexpensive.
Accordingly, in this paper, the possibility of applying a Deep Neural Network (DNN) to electric field analysis in MEMS devices is investigated. In particular, a bitmap approach, which describes the device geometry as a set of pixels, is proposed. The architecture of the selected network resembles those used for image segmentation, and it is trained by means of a set of Finite Element (FE) analyses: this way, the ground truth consists of electric field distributions over the input domains, simulated with traditional FE solvers.

2. The Case Study

The electrostatic micromotor considered as the case study [7] has 18 stator electrodes and 6 rotor teeth. It has an outer rotor radius of 60 µm and a stator radius of 63 µm (Figure 1).
The stator electrodes are supplied by a three-phase system of square voltages, equal to 100 V, while the rotor potential is floating.
A 2D Finite Element Model (FEM) of the motor was implemented. A unit axial length for the motor was considered. The driving torque T in the no-load condition was computed over a rotation angle of 60 degrees (pole pitch, see Figure 2), and its maximum value Tm was considered. Specifically, the FE model consisted of n = 101 field analyses, where a step of Δ = 0.6 degrees was considered for the rotation over 60 degrees.
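The angular sampling described above can be sketched in a few lines. This is only an illustration of the bookkeeping (101 positions, 0.6 degree step, maximum taken over the profile): the torque function below is a made-up placeholder, not an FE result, and all names are hypothetical.

```python
import math

N_POSITIONS = 101  # number of field analyses per geometry (from the text)
STEP_DEG = 0.6     # rotation step in degrees (from the text)

# Rotor positions from 0 to 60 degrees inclusive.
angles_deg = [i * STEP_DEG for i in range(N_POSITIONS)]


def placeholder_torque(angle_deg):
    """Made-up smooth profile standing in for an FE torque solution."""
    return math.sin(math.pi * angle_deg / 60.0)


# One torque value per rotor position, then the maximum over the pole pitch.
torque = [placeholder_torque(a) for a in angles_deg]
t_max = max(torque)
angle_at_max = angles_deg[torque.index(t_max)]
```

With a real FE solver, `placeholder_torque` would be replaced by one field solution per rotor position.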
Moreover, to account for friction, the side-pull effect caused by the radial displacement of the rotor during its motion was considered. In fact, the eccentric motion produces an unbalanced electric pull, which appears as a radial force acting in the direction of the shortest air gap. Based on the Maxwell stress tensor method, the radial force Fr(φ) acting on the rotor was evaluated as a function of its angular position φ, in addition to the driving torque T(φ) (Figure 2). To simulate the radial displacement, a clearance between the rotor and shaft equal to 1 µm was considered; the direction of the displacement was assumed to be fixed and independent of φ (static eccentricity).
The rotor geometry can vary depending on three parametric variables: x = [x1, x2, x3] = [R1, α, β], which are the inner rotor radius and two angles defining the geometry of the rotor tooth, respectively (see Figure 1).

3. Surrogate Model with CNN

3.1. The CNN Structure

A convolutional neural network (CNN) composed of 18 layers was used to implement the surrogate model; its architecture is reported in Table 1. This structure was proposed by MathWorks to solve a regression problem, i.e., to predict the angles of rotation of handwritten digits [8], and it has been used by the authors to successfully solve a regression problem in magnetostatics [9].
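To see how the 30 × 50 input shrinks before the fully connected layer, the layer sequence of Table 1 can be walked through as in the sketch below. It is illustrative only and rests on assumptions that the paper does not state: "size 3 × 8" is read as 3 × 3 filters with 8 channels, convolutions are assumed to use "same" padding, pooling is assumed to be 2 × 2 with stride 2, and the input is assumed to have a single channel.

```python
# Hypothetical shape walk-through of the Table 1 layer stack.
# Assumptions (not confirmed by the paper): 3x3 filters with "same" padding,
# 2x2 average pooling with stride 2, single-channel input image.

LAYERS = [
    ("conv", 8), ("bn", None), ("relu", None), ("pool", 2),
    ("conv", 16), ("bn", None), ("relu", None), ("pool", 2),
    ("conv", 32), ("bn", None), ("relu", None),
    ("conv", 32), ("bn", None), ("relu", None),
    ("dropout", 0.2),
]


def feature_shape(height, width, channels=1):
    """Return (height, width, channels) of the tensor entering the dense layer."""
    for kind, arg in LAYERS:
        if kind == "conv":
            channels = arg  # "same" padding keeps height and width unchanged
        elif kind == "pool":
            height, width = height // arg, width // arg  # stride equals pool size
        # bn, relu and dropout leave the shape unchanged
    return height, width, channels


h, w, c = feature_shape(30, 50)
n_features = h * w * c  # inputs of the fully connected layer with Np outputs
print((h, w, c), n_features)
```

Together with the input layer, the fully connected layer and the regression layer, the 15 entries of `LAYERS` give the 18 layers listed in Table 1.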
The image of a circular sector of the rotor (60 degrees) was supplied as the input of the CNN. A typical input image is shown in Figure 3: Figure 3a shows the image at the full resolution of 1455 × 969 pixels, while Figure 3b shows the low-resolution image of 30 × 50 pixels used as the input of the CNN. The reduction in resolution blurs the image, but it makes the computational cost affordable without dramatically deteriorating the model accuracy.
The output of the CNN could be the torque profile, i.e., Np = 101 points as the CNN output, or the maximum value of the torque Tm, i.e., Np = 1, or the radial force Fr, i.e., Np = 1. For each case, a CNN with the structure shown in Table 1 was trained with a proper dataset. During the CNN training, the output quantities were normalized in the range [−1,1].
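The paper does not say how the outputs were mapped to [−1,1]; a common choice is linear min-max scaling, sketched below under that assumption. The function names and the sample values are hypothetical.

```python
def normalize(values, lo, hi):
    """Map values from [lo, hi] linearly onto [-1, 1] (assumed min-max scaling)."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def denormalize(values, lo, hi):
    """Inverse of normalize: map values from [-1, 1] back to [lo, hi]."""
    return [(v + 1.0) * (hi - lo) / 2.0 + lo for v in values]


# Made-up output values, for illustration only.
y = [0.5, 1.0, 2.0, 3.5]
y_lo, y_hi = min(y), max(y)

y_norm = normalize(y, y_lo, y_hi)          # values used during training
y_back = denormalize(y_norm, y_lo, y_hi)   # values shown in the plots
```

The same `lo` and `hi`, taken from the training data, would have to be reused when denormalizing the CNN predictions.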
This way, due to the correspondence between MEMS geometry and the field map, the problem of field analysis was re-cast as a problem of pattern recognition.
In order to train the CNN, a database of solutions, based on the FE model, was created. In particular, 1200 FE solutions were collected; for each solution, the torque profile and the radial force on the rotor were stored and associated with the relevant image of the circular sector of the rotor.
To evaluate the goodness of the surrogate model, the root mean square error e was calculated on N = nv points of the validation set, namely,
$$ e = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( Y_i - Y_{i,\mathrm{pred}} \right)^2} \qquad (1) $$
where Y is the output value calculated with FEM and Ypred is the output value predicted by the CNN. This error was calculated for the values normalized in the range [−1,1].
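The error measure of Equation (1) is a plain root mean square error over the validation points. A minimal sketch follows; the sample values are made up and only serve to exercise the function.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between FEM values and CNN predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Made-up normalized values in [-1, 1], for illustration only.
y_fem = [-1.0, 0.0, 0.5, 1.0]
y_cnn = [-0.9, 0.1, 0.4, 0.9]
e = rmse(y_fem, y_cnn)
```

Because every prediction in this toy example is off by 0.1, the resulting error is 0.1 up to rounding.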

3.2. Multi-Objective Optimization Based on the Surrogate Models

A multi-objective optimization for the shape design of the MEMS motor was performed. It was based on the radius R1 and angles (α,β) as the design variables. Two objective functions were defined in terms of the three-dimensional design vector g = (R1, α, β), namely:
  • the highest value of the driving torque on a pole pitch f1(g) at no load under single-phase supply;
  • the value of radial force on the rotor f2(g) in the direction of the shortest air gap.
The problem reads: given the stator supply and rotor misalignment, find the family of rotor geometries g such that f1(g) is maximum and f2(g) is minimum.
The variation range of the design variables is shown in Table 2.
In the last decades, different optimization techniques inspired by nature have been investigated. Genetic algorithms (GAs) are a very useful and widely studied class of optimization techniques; in particular, one of the most used algorithms is the Non-Dominated Sorting Genetic Algorithm NSGA-II [10]. GAs work very well when the number of design variables is low, e.g., lower than 10; so, they are commonly used in the multi-objective optimization of, e.g., electrical machines and electromagnetic devices, because these problems are characterized by a few design degrees of freedom. In this paper, NSGA-II was applied to the shape design of the MEMS motor, which is characterized by three design variables.
In GA optimization, a population, which is a set of individuals, is evaluated at each iteration (generation) in terms of rank and crowding distance. In particular, the population is ordered in terms of Pareto front and sub-fronts and a rank is assigned to each front: the solutions belonging to the Pareto front have the lowest rank (e.g., rank = 1) while higher ranks are assigned to dominated sub-fronts. For each solution belonging to a given front, the crowding distance is calculated: the solutions aggregating in a cluster are characterized by a crowding distance value that is lower than the isolated solutions. Taking into account the crowding distance helps with obtaining regularly spaced, i.e., not clustered, solutions in the objective space.
The individuals of the current population were ordered first by rank index and then, within each front, by the crowding distance. The best (fittest) individuals were the ones characterized by a low rank and a high crowding distance.
In order to create the next generation, highly fit individuals from the current iteration were combined by a crossover operator to produce offspring.
Meanwhile, to increase variation in the search space, a mutation operator was performed at a certain probability level.
The generated offspring solutions were added to the current population. If the population size is N, the combined population therefore has size 2∙N. The combined population was then sorted by non-dominance and crowding distance and truncated to N individuals, so that the next generation again had size N.
This process continued until the stopping criterion, usually a maximum number of generations, was met.
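The ranking, crowding distance and truncation steps described above can be sketched as follows. This is a compact illustration rather than the implementation used in the paper: it uses a simple quadratic-time front extraction, treats both objectives as quantities to be minimized (the torque is negated), and the objective values are made up.

```python
import math


def dominates(a, b):
    """True if point a dominates point b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_ranks(points):
    """Rank 1 for the Pareto front, rank 2 for the next sub-front, and so on."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    rank = 1
    while remaining:
        front = {i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)}
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


def crowding_distance(points, front):
    """Crowding distance per index in front; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for k in range(len(points[front[0]])):
        order = sorted(front, key=lambda i: points[i][k])
        lo, hi = points[order[0]][k], points[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = math.inf
        if hi == lo:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (points[nxt][k] - points[prev][k]) / (hi - lo)
    return dist


def select_survivors(points, n_keep):
    """Sort by low rank, then high crowding distance, and keep n_keep indices."""
    ranks = non_dominated_ranks(points)
    crowd = {}
    for r in set(ranks):
        front = [i for i, ri in enumerate(ranks) if ri == r]
        crowd.update(crowding_distance(points, front))
    order = sorted(range(len(points)), key=lambda i: (ranks[i], -crowd[i]))
    return order[:n_keep]


# Made-up (torque, radial force) pairs for a combined population of 2N = 6.
candidates = [(1.0, 1.0), (2.0, 2.0), (3.0, 4.0), (1.5, 3.0), (2.9, 3.9), (2.5, 3.8)]
# Maximizing torque is the same as minimizing its negative.
points = [(-f1, f2) for f1, f2 in candidates]

ranks = non_dominated_ranks(points)
survivors = select_survivors(points, n_keep=3)
```

In this toy population only the fourth candidate is dominated, and the two clustered solutions near the high-torque end are the first to be discarded among the non-dominated ones because of their small crowding distance.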
To solve the design problem of the MEMS motor, N = 20 individuals and 500 iterations were used. The population size of 20 was the minimum suggested for NSGA-II [10], and it is a size that allows the Pareto front to be represented sufficiently well. On the other hand, in order to obtain a good approximation of the front in terms of avoiding clustering and improving the approximation of non-dominated solutions, i.e., possibly getting closer to the ideal front, a large number of iterations was used.
Because of the high number of objective function calls (at least 10,000) required for this optimization to converge, the use of a surrogate model is particularly suited. For this purpose, two previously trained CNNs were used: the CNN able to predict the maximum value of the torque at no load (function f1) and the one able to predict the radial force on the rotor (function f2) given the image of the circular sector of the rotor.
Because the CNN needs an image as the input, for each design vector g = (R1, α, β) generated by the optimization algorithm, an ad hoc pre-processing was performed:
  • the image is automatically drawn;
  • the image is processed in order to fit the 30 × 50 pixel size;
  • a bitmap matrix representing the image is generated;
  • the matrix is passed to the trained CNN as input.
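The resizing step of the pre-processing listed above can be sketched as a block-averaging reduction of a bitmap. The paper does not specify the resizing method, so area averaging is an assumption here; the drawing of the rotor sector is not reproduced, and a synthetic black-and-white image stands in for it.

```python
def downsample(image, out_rows, out_cols):
    """Reduce a 2D bitmap (list of rows) by averaging rectangular pixel blocks.

    Assumes the input has at least out_rows rows and out_cols columns.
    """
    in_rows, in_cols = len(image), len(image[0])
    out = []
    for r in range(out_rows):
        r0, r1 = r * in_rows // out_rows, (r + 1) * in_rows // out_rows
        row = []
        for c in range(out_cols):
            c0, c1 = c * in_cols // out_cols, (c + 1) * in_cols // out_cols
            block = [image[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Synthetic 60 x 100 bitmap: 1 to the left of column 51, 0 elsewhere.
full = [[1.0 if col < 51 else 0.0 for col in range(100)] for _ in range(60)]

# Bitmap matrix with the input size used for the CNN (30 x 50).
small = downsample(full, 30, 50)
```

Pixels that straddle the edge of the shape take intermediate grey values, which is the blurring effect of working at low resolution.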

4. Results

Three different CNNs were trained as surrogate models of the FE analyses: the torque profile was predicted first, then a CNN was trained to predict the maximum value of the torque only and, finally, the radial force was predicted by means of another CNN.
All of them were trained with 1200 samples, obtained by means of Finite Element Analyses (FEAs): 1000 samples were used for the training set while 200 were used for the validation set.
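A split of this kind can be sketched as follows. The paper does not state how the 200 validation samples were chosen, so the random shuffle and the fixed seed are assumptions made only for illustration.

```python
import random


def split_dataset(n_samples, n_train, seed=0):
    """Return shuffled index lists for the training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    return indices[:n_train], indices[n_train:]


# 1200 FE samples: 1000 for training, 200 for validation (from the text).
train_idx, val_idx = split_dataset(1200, 1000)
```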

4.1. Torque Profile Prediction

The CNN predicting the whole torque profile was trained with the FE data for 500 epochs; its error, calculated with Equation (1), was equal to e = 0.052.
The true values of Y versus the predicted ones for the validation set are plotted in Figure 4; the values plotted in Figure 4 are denormalized to the original range.
To train the CNN, 1200 FE analyses were needed. Each field analysis consisted of 101 field solutions because the rotor was rotated every 0.6 degrees over 60 degrees.
An example of the torque profile re-construction is shown in Figure 5, where two different geometries are considered.
In particular, torque A was relevant to a device with x = [4.56 × 10^−5, 46.8, 14.5], while torque B was relevant to a device with x = [2.21 × 10^−5, 26.2, 12.6]. The geometry of the two devices is shown in Figure 6.

4.2. Maximum Torque Value Prediction

The error calculated with Equation (1) for the net trained for 1000 epochs was equal to e = 0.029.
A typical root mean square error versus the iterations is shown in Figure 7.
Figure 7 shows that underfitting was avoided, because a low error was reached at the end of the training procedure; overfitting did not take place either, because the error on the validation set remained low towards convergence, as did the error on the training set. Hence, both the validation and training errors were low, and this twofold condition was considered appropriate for a reliable operation of the trained network.
The true values of Y versus the predicted ones for the validation set are plotted in Figure 8; the values plotted in Figure 8 are denormalized to the original range.
To train the CNN, 1200 FE analyses were needed. Each field analysis consisted of 101 field solutions because the rotor was rotated every 0.6 degrees over 60 degrees.

4.3. Radial Force Value Prediction

The error calculated with Equation (1) for the net trained for 500 epochs was equal to e = 0.033.
A typical root mean square error versus the iterations is shown in Figure 9.
The true values of Y versus the predicted ones for the validation set are plotted in Figure 10; the values plotted in Figure 10 are denormalized to the original range.
To train the CNN, 1200 FE analyses were needed. Each analysis consisted of one field solution considering a clearance between rotor and shaft equal to 1 µm in order to simulate the radial displacement.

4.4. Optimization Based on Surrogate Models

The results of the optimization are shown in Figure 11. Starting from 20 points, 20 arrival solutions were found, based on the objective function evaluation with the two previously trained CNNs.
In order to understand the goodness of the approximation of the Pareto front found by using the CNNs, the non-dominated solutions were re-calculated by means of FEAs. These points are highlighted with red squares in Figure 11.

5. Discussion

The results in Section 4 showed that a deep learning approach to building surrogate models of the electrostatic field for MEMS is possible and also effective.
In this paper, three different CNNs were trained as surrogate models for the whole torque profile, the maximum value of the torque and the radial force acting on the rotor in case of radial displacement, respectively.
The accuracy of the CNN prediction can be qualitatively visualized with the plots that show the true vs. predicted values (see Figure 4, Figure 8 and Figure 10). The closer the points are to the diagonal, the more accurate the CNN prediction is. For all three CNNs, the points were located very close to the diagonal, which is highlighted with a dashed red line.
Moreover, the error was quantitatively evaluated as in Equation (1): considering that the quantities were normalized in the range [−1,1], the error could be considered small for all three CNNs. The most accurate results were those obtained when the prediction of a single value was required (maximum torque or radial force), while the error for the whole torque profile prediction was slightly higher (0.052 versus 0.029 and 0.033 for the maximum torque and the radial force, respectively).
The results found for these electrostatic field-related quantities were comparable with those obtained by the authors for a magnetostatic problem; in [9], an error of 0.04 (calculated with Equation (1)) was found for the prediction of the magnetic field in ten points along a slab under investigation in the frame of a material testing problem. The same order of magnitude of error was obtained in a paper where a surrogate model for the magnetic field of a die press was built by means of a deep learning approach [11]. These results confirmed the feasibility of using a deep learning approach to implement surrogate models in low-frequency electromagnetics. Moreover, to the best of the authors’ knowledge, while surrogate models for magnetic fields are becoming rather common, this is not the case for the electric field, and in particular, for surrogate models of the electric field in MEMS: this paper aims to bridge this gap.
Finally, in this paper, the surrogate models were used to solve an optimization problem. The genetic algorithm NSGA-II was used: it is considered a kind of benchmark in the area of multi-objective optimization. The optimization was run with 20 individuals for 500 generations. Without a surrogate model, the computational burden would be 20 × 500, i.e., 10,000 FEAs. Using the surrogate models in the optimization loop means that no FEA is required during the optimization. On the other hand, the CNNs were trained with a database composed of 1200 FEAs, so the overall computational cost of running the optimization based on the surrogate models is 1200 FEAs, which means saving 88% of the computational cost with respect to the optimization based on FEAs. Hence, the use of surrogate models based on CNNs allows one to substantially reduce the computational cost.
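As a rough check of the saving quoted above, the arithmetic can be written out explicitly. The symbols N_direct and N_surr are introduced here for illustration only and do not appear in the paper; the count treats each objective evaluation as one FEA, as the text does.

```latex
\[
N_{\mathrm{direct}} = 20 \times 500 = 10{,}000, \qquad
N_{\mathrm{surr}} = 1200, \qquad
1 - \frac{N_{\mathrm{surr}}}{N_{\mathrm{direct}}} = 1 - 0.12 = 0.88
\]
```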
However, the accuracy of the results must be investigated; for this reason, the optimal solutions obtained with the surrogate models were re-calculated with the FEM. The optimal solutions found with the surrogate models approximate a Pareto front, as shown in Figure 11 (see black circles). A priori, it cannot be stated that these solutions, once re-calculated by means of the FEM, approximate a Pareto front as well. In fact, the error introduced by the CNN-based surrogate models could lead, in principle, to either a lower or a higher value of the approximated quantities. Figure 11 shows that the optimal solutions re-calculated by means of FEAs (red squares) still approximate a Pareto front, as the black circles (optimal solutions obtained with the CNNs) do. This behavior is probably due to the fact that the error of the trained CNNs for the optimal solutions led to a higher torque and a lower radial force with respect to the ones calculated with FEAs. In fact, for most of the solutions in Figure 11, the red squares are characterized by a smaller torque and a higher radial force than the corresponding black circles.
Hence, the use of CNNs as surrogate models for electric field problems in MEMS devices can lead to a good approximation of the field-related quantities, and the use of these surrogate models in optimization problems could allow one to save computational time while preserving the accuracy.

Author Contributions

Conceptualization, P.D.B. and M.E.M.; methodology, P.D.B. and M.E.M.; software, M.E.M.; validation, P.D.B. and S.W.; formal analysis, P.D.B.; resources, S.W.; data curation, M.E.M.; writing—original draft preparation, M.E.M.; writing—review and editing, P.D.B. and S.W.; supervision, P.D.B.; funding acquisition, S.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, Y.; Lei, G.; Bramerdorfer, G.; Peng, S.; Sun, X.; Zhu, J. Machine Learning for Design Optimization of Electromagnetic Devices: Recent Developments and Future Directions. Appl. Sci. 2021, 11, 1627.
  2. Khan, A.; Ghorbanian, V.; Lowther, D. Deep Learning for Magnetic Field Estimation. IEEE Trans. Magn. 2019, 55, 1–4.
  3. Khan, A.; Mohammadi, M.H.; Ghorbanian, V.; Lowther, D.A. Efficiency Map Prediction of Motor Drives Using Deep Learning. IEEE Trans. Magn. 2020, 56, 1–4.
  4. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016.
  5. Barmada, S.; Fontana, N.; Formisano, A.; Thomopulos, D.; Tucci, M. A Deep Learning Surrogate Model for Topology Optimization. IEEE Trans. Magn. 2021, 57, 1–4.
  6. Aoyagi, T.; Otomo, Y.; Igarashi, H.; Sasaki, H.; Hidaka, Y.; Arita, H. Prediction of Current-Dependent Motor Torque Characteristics Using Deep Learning for Topology Optimization. IEEE Trans. Magn. 2022, 58, 1–4.
  7. Di Barba, P.; Dughiero, F.; Mognaschi, M.E.; Savini, A.; Wiak, S. Biogeography-Inspired Multiobjective Optimization and MEMS Design. IEEE Trans. Magn. 2016, 52, 1–4.
  8. MathWorks Deep Learning Toolbox User’s Guide. Available online: https://it.mathworks.com/help/pdf_doc/deeplearning/nnet_ug.pdf (accessed on 30 September 2022).
  9. Di Barba, P.; Mognaschi, M.E.; Sieni, E.; Ziolkowski, M. Convolutional neural networks for the shape design of a magnetic core for material testing: Forward and inverse approaches. Int. J. Appl. Electromagn. Mech. 2022, 69, 389–399.
  10. Deb, K. Multi-Objective Optimisation Using Evolutionary Algorithms; Wiley: Hoboken, NJ, USA, 2001.
  11. Tucci, M.; Barmada, S.; Formisano, A.; Thomopulos, D. A regularized procedure to generate a deep learning model for topology optimization of electromagnetic devices. Electronics 2021, 10, 2185.
Figure 1. Parametric variables of the electrostatic motor.
Figure 2. Torque versus rotation angle over 60°.
Figure 3. Image of the circular sector of the rotor: full resolution (a), 30 × 50 pixels resolution (b).
Figure 4. True values versus predicted values of the torque profile.
Figure 5. A comparison between true (line) and predicted (cross) torque profiles: two geometries (device A and B) are considered.
Figure 6. Geometry of device A (a) and B (b).
Figure 7. Root mean square error for training set (blue line) and validation set (black line).
Figure 8. True values versus predicted values of the maximum torque value.
Figure 9. Root mean square error for training set (blue line) and validation set (black line).
Figure 10. True values versus predicted values of the radial force value.
Figure 11. Starting and arrival points of the multi-objective optimization based on CNN surrogate models. The optimal solutions re-calculated by means of FEAs are highlighted with red squares.
Table 1. CNN structure.
Layer Number | Layer Type
1 | Image based input (size 30 × 50)
2 | Convolution 2D (size 3 × 8)
3 | Batch normalization
4 | ReLU activation function
5 | Average pooling 2D (2 × 2, Stride)
6 | Convolution 2D (size 3 × 16)
7 | Batch normalization
8 | ReLU activation function
9 | Average pooling 2D (2 × 2, Stride)
10 | Convolution 2D (size 3 × 32)
11 | Batch normalization
12 | ReLU activation function
13 | Convolution 2D (size 3 × 32)
14 | Batch normalization
15 | ReLU activation function
16 | Dropout (20% of probability)
17 | Fully connected layer (Np outputs)
18 | Regression layer
Table 2. Variation range of the design variables.
Design Variable | Minimum Value | Maximum Value
R1 [µm] | 15 | 55
α [deg] | 10 | 55
β [deg] | 10 | 55
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
