Methodology to Solve the Multi-Objective Optimization of Acrylic Acid Production Using Neural Networks as Meta-Models

Sepulveda, Geraldine Cáceres; Ochoa, Silvia; Thibault, Jules

doi:10.3390/pr8091184

by Geraldine Cáceres Sepulveda ¹, Silvia Ochoa ² and Jules Thibault ¹,*

¹ Department of Chemical and Biological Engineering, University of Ottawa, Ottawa, ON K1N 6N5, Canada

² SIDCOP Research Group-Departamento de Ingeniería Química, Universidad de Antioquia, Medellín 050010, Colombia

* Author to whom correspondence should be addressed.

Processes 2020, 8(9), 1184; https://doi.org/10.3390/pr8091184

Submission received: 29 July 2020 / Revised: 2 September 2020 / Accepted: 16 September 2020 / Published: 18 September 2020

(This article belongs to the Collection Multi-Objective Optimization of Processes)

Abstract:

It is paramount to optimize the performance of a chemical process in order to maximize its yield and productivity and to minimize the production cost and the environmental impact. The various objectives in optimization are often in conflict, and one must determine the best compromise solution, usually using a representative model of the process. However, solving first-principle models can be computationally intensive, thus making model-based multi-objective optimization (MOO) a time-consuming task. In this work, a methodology to perform the multi-objective optimization of a two-reactor system for the production of acrylic acid, using artificial neural networks (ANNs) as meta-models, is proposed in an effort to reduce the computational time required to circumscribe the Pareto domain. The meta-model showed good agreement between the experimental data and the model-predicted values, capturing the relationships between the eight decision variables and the nine performance criteria of the process. Once the meta-model was built, the Pareto domain was circumscribed based on a genetic algorithm (GA) and ranked with the net flow method (NFM). Using the ANN surrogate model, the optimization time decreased by a factor of 15.5.

Keywords:

multi-objective optimization; acrylic acid production; Pareto domain; artificial neural networks; surrogate model

1. Introduction

Given the highly competitive market in the chemical industries and the very high capital and operating costs of chemical manufacturing plants, it is paramount to constantly determine the optimal operating conditions, while considering economic, environmental, and societal constraints. However, formulating and solving an optimization problem in chemical engineering is not a straightforward task, because engineers will invariably have to deal with numerous conflicting/competing objectives. Therefore, there is a need to use optimization strategies that will provide the decision-maker with different alternative solutions that accurately reflect the existing underlying relationships between the process variables, which can be achieved via multi-objective optimization (MOO).

One of the most common approaches used for solving MOO problems is to convert them into a single-objective optimization problem, where an aggregated single objective function is built by taking the weighted sum of the different objectives [1]. However, this approach presents important drawbacks, including: (i) the optimal solution is a single point inside the feasible region; and (ii) this solution is highly sensitive to the selection of the weights assigned to each objective. In some cases, it may not even be possible to aggregate these competing objectives into a single objective, since some of the objectives are of a qualitative nature and may be difficult to quantify numerically. Solving a MOO problem explicitly can be very expensive in terms of the required computational load given the large number of model evaluations that are required to circumscribe the Pareto domain. Usually, one develops a phenomenological model of the process, which is itself a very complex and time-consuming endeavor, and uses it with the optimization algorithm to generate the Pareto-optimal front, which contains all the non-dominated solutions to the overall problem. A solution is non-dominated when improving any one of its objectives leads to the deterioration of at least one other objective [2,3]. One can also resort to a state-of-the-art process simulator to simulate all or part of the process while facing the challenge of interfacing the simulation software with the optimization algorithm. In addition, the computation time for a single simulation run can be relatively long, such that circumscribing the Pareto domain can take a significant amount of time for each optimization scenario. 
To reduce the simulation time and expedite the determination of the Pareto domain for a given optimization scenario, some authors have successfully used meta-models, also known as surrogate models, to represent the underlying relationships existing between the input or decision process variables and the objectives [4,5,6,7,8,9,10,11].
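The notion of non-dominated solutions described above can be illustrated with a minimal sketch. It assumes that every objective is expressed in maximization form; the function names and the two-objective candidate points are hypothetical and serve only as an illustration.

```python
def dominates(a, b):
    """Return True if solution a dominates solution b (all objectives maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical two-objective candidates (both objectives maximized).
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0)]
front = non_dominated(candidates)
print(front)
```

In this example, the point (1.5, 3.0) is discarded because (2.0, 4.0) is better in both objectives. The pairwise comparison scales with the square of the number of candidates, which is one reason why a fast surrogate for the objective evaluations is attractive.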

Artificial neural networks (ANNs) are often used as meta-models due to their high plasticity to encapsulate underlying relationships within the process data. In training ANN models, the connection weights between two layers of neurons are adjusted so as to minimize the sum of squares of the errors between the experimental and predicted outputs [12]. The main advantage of using ANNs as surrogate models is their capability to effectively model very complex nonlinear behavior, even if it includes a large number of variables. Once trained, an ANN can be used to perform a large number of simulations and to determine the Pareto domain very rapidly. In this paper, a methodology for solving multi-objective optimization problems using artificial neural networks as meta-models is proposed and applied to a chemical engineering problem, namely the production of acrylic acid.

2. Description of the Acrylic Acid Production

The comprehensive optimization study of a process normally requires having an accurate and representative model. In this investigation, a first-principle based model was developed for the reactor section of an acrylic acid production plant, which was then used for generating the data in order to train and validate an ANN for solving the MOO. The first-principle based model was simulated in FORTRAN and validated successfully by comparing it with simulation results obtained from ASPEN and Honeywell UniSim. All the main considerations that were accounted for are described in the following sections.

2.1. Reactor Model

Acrylic acid (AA) plays an important role in the production of polymeric products. The worldwide production of acrylic acid reached ~3.9 million metric tons in 2009 [13]. This important chemical is used in the manufacture of superabsorbent polymers (SAPs) which are involved in a variety of applications [14]. A very large proportion of AA is converted to a wide range of esters that are applied in surface coatings, textiles, adhesives, paper treatment, polishes, plastics, and many others [13,15,16]. In a smaller proportion, it is also used for the production of detergents and flocculants [13].

Nowadays, the commercial production of AA comes from the petrochemical industry [15]. Currently, the preferred production route is by the partial vapor phase catalytic oxidation of propylene in a two-step process, as shown in Figure 1. In this process, propylene is first oxidized to acrolein, by supplying a mixture of propylene, air, and steam to the first reactor. Acrolein, which is an intermediate product, is subsequently oxidized to AA in the second reactor [16].

It is important to note that the desired and undesired reactions are highly exothermic and highly temperature-dependent, such that steam, which is fed to the first and second reactors, acts as a thermal sink to moderate the rise in temperature. Furthermore, this process relies on two compressors to bring the reactor feed to the desired operating pressure and, as a result, higher air and water vapor flowrates will significantly increase the compression work. Another important consideration is the flammability of propylene in the first reactor, which will be addressed in the following sections.

2.2. Propylene Oxidation

The propylene vapor phase oxidation occurring in the first reactor is performed in a catalytic wall reactor (CWR), where the catalyst is coated on the inner surface of the reactor and it guarantees isothermal conditions through a temperature control loop for a range of 330–430 °C [17]. The oxide catalyst annular section of the CWR is constituted of bismuth and molybdenum containing montmorillonite.

The reaction scheme considered in the model formulation for this reactor is depicted in Figure 2, in which a total of 10 reactions involving propylene, oxygen, acrolein, acrylic acid, acetaldehyde, acetic acid, formaldehyde, carbon dioxide, and carbon monoxide were considered [18]. Given the very low concentrations of formaldehyde observed in the reactor outlet mixture following the series of experiments performed by Redlingshofer et al. [17], it was assumed to be negligible in this study, and was not considered in the model. As was previously mentioned, steam is present along with propylene and oxygen in the input stream. Steam plays an important role, by increasing the selectivity towards acrolein by suppressing the formation of carbon oxides at low temperatures and contributing to the catalyst re-oxidation. In addition, steam has a dilution effect that contributes not only as a thermal sink but also to ensure the reactor is operating below the flammability limits [18,19] (more information is provided in Section 2.4).

At temperatures higher than 360 °C, oxygen has a weak influence on the formation of acrolein and the catalytic reduction by propylene will be the rate-determining step. On the other hand, at temperatures lower than 360 °C, if the oxygen concentration increases, the formation of acrolein is accelerated. As a result, the expression for the reaction rate from propylene to acrolein will change depending on whether the temperature is below or above 360 °C [18].

2.3. Acrolein Oxidation

The acrolein vapor phase oxidation kinetics for the second reactor were determined using a catalyst that contains oxides of antimony, nickel, and molybdenum [20]. In this case, carbon dioxide was the only major by-product in the reaction scheme. Thus, only the formation of acrylic acid and carbon dioxide were used to determine suitable rate expressions [20].

C_{3} H_{4} O + \frac{1}{2} O_{2} \overset{r_{10}}{\to} C_{3} H_{4} O_{2}

(R1)

C_{3} H_{4} O + \frac{7}{2} O_{2} \overset{r_{11}}{\to} 3 CO_{2} + 2 H_{2} O

(R2)

The experimental study to obtain the kinetics for the acrolein oxidation was carried out over a temperature range of 285–315 °C [20]. The vapor phase acrolein oxidation was assumed to take place in a packed bed reactor (PBR), and as the catalyst used is very specific to acrolein oxidation, the other components aside from steam and air were treated as inerts.

The complete set of reaction rates considered in the first and second reactors is given in Appendix A.

2.4. Flammability Limits

Operating outside the lower and upper flammability limits (LFL and UFL) is always recommended to avoid any potentially hazardous situation. Nevertheless, this range will tend to change depending on the temperature, pressure and inert gas concentration inside the reactor. For this reason, it is more convenient to define a minimum oxygen concentration (MOC), below which flame propagation is not possible [21]. Indeed, if the concentration of oxygen is maintained below the MOC, it will be possible to prevent fires or explosions regardless of the concentration of the fuel. The MOC value is expressed in units of volume percentage of oxygen in the mixture of fuel and air.

Propylene, which is a flammable gas present in the feed of the first reactor, has an LFL and a UFL of 2.4 and 11 vol% in air, respectively, under standard conditions [22]. At 25 °C, the MOC in air with nitrogen as the inert gas for propylene as a fuel is 11.5 vol%, while it is 14 vol% when the inert gas is carbon dioxide [21]. The MOC for the maximum possible temperature and pressure of the first reactor was calculated, and it was determined that the reactor should be operated below 7.0 vol% of oxygen. As for the second reactor, there was no need to impose a constraint to avoid flammability. If the flammability limit was satisfied in the first reactor, it was expected to be satisfied in the second reactor as well. Furthermore, water was added to the second reactor, which further diluted the reactor contents.
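As a rough illustration of the 7.0 vol% oxygen limit, the following sketch estimates the oxygen content of the feed to the first reactor from the molar flowrates of propylene, air, and steam. It assumes ideal-gas behavior (so that mole percent equals volume percent) and an oxygen mole fraction of 0.21 in air; the function names and the flowrates are hypothetical. This feed check is only a simplification, since the study itself treats the oxygen constraint along the reactor.

```python
O2_IN_AIR = 0.21          # assumed mole fraction of oxygen in dry air
O2_LIMIT_VOL_PCT = 7.0    # operating limit quoted in the text


def feed_oxygen_vol_pct(f_propylene, f_air, f_steam):
    """Oxygen content (vol%) of a propylene/air/steam feed, ideal gas assumed."""
    total = f_propylene + f_air + f_steam
    return 100.0 * O2_IN_AIR * f_air / total


def below_oxygen_limit(f_propylene, f_air, f_steam):
    """True when the feed oxygen content is below the 7.0 vol% limit."""
    return feed_oxygen_vol_pct(f_propylene, f_air, f_steam) < O2_LIMIT_VOL_PCT


# Hypothetical flowrates (any consistent molar unit).
print(feed_oxygen_vol_pct(100.0, 300.0, 600.0))
print(below_oxygen_limit(100.0, 600.0, 300.0))
```

The two hypothetical feeds show the dilution effect of steam: with the same total flowrate, replacing air by steam lowers the oxygen content of the mixture.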

3. Methodology for Solving a Multi-Objective Optimization Problem

In this investigation, the production of acrylic acid was optimized using a multi-objective optimization approach. A methodology to solve large multi-objective optimization problems is proposed in this paper, so that the disadvantage of computational time can be addressed when finding the optimal Pareto front. For this purpose, three-layer artificial neural networks are proposed as a meta-model to directly predict the objective functions.

The proposed methodology comprises six main steps, as illustrated in Figure 3. The optimization problem is first established (Step 1) by defining the set of objective functions to be minimized or maximized, and the set of decision variables (with their respective allowable ranges). Building a representative model to perform the optimization study in a timely manner relies on the data set obtained experimentally or through a comprehensive model of the process (Step 2). To obtain the most suitable set of data, an experimental design is usually adopted to gain maximum information with a minimum number of experiments. This is particularly important when solving the comprehensive model or doing experiments is very time-consuming. In this investigation, uniform design (UD) was adopted as a first approach to determine the process data (Step 3) required to build the meta-model of the process using ANN (Step 4) from the phenomenological model. Once the surrogate model is built, which consists of one ANN for each objective function, it is used to circumscribe the Pareto domain (Step 5) using a genetic algorithm. In this investigation, a large initial population of 5000 individuals was used. Upon analysis of the prediction performance of the ANNs, possible refinement of the optimization problem is considered, such as adjusting the ranges of the decision variables. Initially, the ranges of the decision variables can be set wide to ensure that a broad range of operation is encompassed. However, once the initial Pareto domain has been identified, some ranges of the decision variables can be reduced to improve the precision of the meta-model. It is then usually necessary to return to Step 1, because many of the data points used to generate the initial meta-model lie outside the new region of search. 
Finally, all Pareto-optimal solutions are ranked (Step 6) using the net flow method (NFM) where the preferences of a decision maker are embedded in relative weights and three threshold criteria to assist in the ranking of the Pareto domain. The optimal solution identified based on the ANN and NFM is then corroborated with an actual experiment or via the comprehensive first-principle based model. More details for some of these steps are addressed in the next subsections, where we describe how each step was applied to the optimization of the AA reaction section.

3.1. Definition of the Optimization Problem

The multi-objective optimization problem is based on nine objective functions (OF) impacting the economics of the process. These objective functions are presented in Table 1 with their respective nomenclature, equation, and whether they need to be maximized or minimized. The set of decision or input variables consists of eight variables, which are described in Table 2, including their respective lower and upper limits.

It is important to notice that the constraint on the molar concentration of oxygen [O₂] in the first reactor, denoted as Sum_E, was implemented as a soft constraint expressed as objective function OF₉. This objective function represents the minimization of the integration of the excess oxygen concentration above the lower flammability limit. The integration for handling the oxygen concentration as a soft constraint allows for an insight into the set of decision variables that could violate this constraint, and provides more flexibility in the optimization process.
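The soft constraint can be sketched as the integral of the oxygen excess over a limit along the reactor. The sketch below is only an illustration under stated assumptions: the oxygen profile is available at discrete axial positions, the limit is the 7.0 vol% value quoted in Section 2.4, and the trapezoidal rule is used. The function name, the profile values, and the integration variable are hypothetical; the exact form of Sum_E is the one given in Table 1.

```python
def excess_oxygen_integral(positions, o2_vol_pct, limit=7.0):
    """Trapezoidal integral of max(0, [O2] - limit) along the reactor.

    positions  : axial positions (assumed sorted, any length unit)
    o2_vol_pct : oxygen content at each position (vol%)
    limit      : oxygen limit (vol%); 7.0 is the value quoted in the text
    """
    excess = [max(0.0, c - limit) for c in o2_vol_pct]
    total = 0.0
    for k in range(1, len(positions)):
        dz = positions[k] - positions[k - 1]
        total += 0.5 * (excess[k] + excess[k - 1]) * dz
    return total


# Hypothetical profile that violates the limit only near the inlet.
print(excess_oxygen_integral([0.0, 0.5, 1.0], [8.0, 7.0, 6.0]))
# A profile that never exceeds the limit gives a zero penalty.
print(excess_oxygen_integral([0.0, 0.5, 1.0], [6.5, 6.0, 5.0]))
```

A value of zero means that the constraint is satisfied everywhere, whereas a positive value measures by how much and over what length it is violated, which is what allows the optimizer to treat it as an objective to minimize.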

The MOO problem can be expressed mathematically by the following equations:

\begin{array}{l} \max_{x} [OF(x)] = \max_{x} [-OF_{1}(x), OF_{2}(x), OF_{3}(x), OF_{4}(x), -OF_{5}(x), OF_{6}(x), OF_{7}(x), OF_{8}(x), -OF_{9}(x)] \\ \text{subject to } x_{\min} \leq x \leq x_{\max} \end{array}

(1)

3.2. Design of Experiments

Process data for the production of AA are required to build a representative model that can be rapidly used to generate the Pareto domain of the process. In this study, a set of neural networks, one for each objective function, were used as a meta-model. Historical process data obtained during the actual operation of a process is one source of information. However, for most processes, historical data usually lack generality, as they only contain information in a very restricted range of operation. This information is usually not sufficient to build a representative model, which is useful for optimization. To obtain process data over a wider range of operation, a common approach resorts to experimental design to adequately cover the range of interest for the operation of the process. An experimental design can be used to obtain process data by conducting experiments on the real process or using a comprehensive first-principle-based model of the process. An important question arises around the number of experimental points necessary and sufficient to properly develop a representative surrogate model, which in this case means training an ANN to predict each objective function based on a set of design points.

Usually, the design of experiments (DOEs) is used to partition the search domain generated by the independent variables to obtain experimental data that represent the entire domain of interest. The motivation for using DOEs is to determine the underlying relationship that may exist between the decision variables and the objective functions of the process with a restricted amount of data, since a good experimental design should minimize the number of experiments to acquire as much information as possible. Most of the experimental designs assume that the underlying model is known, as it is the case for orthogonal and optimal designs. However, the structure of the model is not always known or may be very complex and highly nonlinear [27]. For such cases, the uniform design (UD) proposed by Fang [28] may be used. UD is a design in which the design points are distributed uniformly on the experimental domain to better capture the relationship between the response and the contributing factors [29].

A large number of suggested UD design points have been tabulated and are available online, where each UD takes the form U_n(q^s) [30]. In this design, a complete table of the normalized design points can be obtained given the dimensionality of the factor space s (decision variables), the number of levels q of the factors, and the desired number of data points n. The normalized information obtained from the selected UD table is then used along with the minimum and maximum values of the decision variables to determine the actual values of the decision variables from which the objective functions are calculated. The following uniform designs were used to generate the initial set of process data, divided into learning and validation data sets, to build the ANN: U₅₀(5⁸) and U₂₀(5⁸), i.e., 50 and 20 design points, which corresponds to a ratio of approximately 70/30 for the learning and validation data points.
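The conversion of normalized design points into actual values of the decision variables can be sketched as a linear scaling between the lower and upper limits. The sketch assumes that the design table has already been normalized to the interval [0, 1]; the function name, the two-variable design, and the bounds are hypothetical and do not reproduce Table 2.

```python
def scale_design(normalized_rows, lower, upper):
    """Map design points from [0, 1] to the actual ranges of the variables."""
    return [[lo + u * (hi - lo) for u, lo, hi in zip(row, lower, upper)]
            for row in normalized_rows]


# Hypothetical bounds for two decision variables.
lower = [330.0, 1.0]
upper = [430.0, 5.0]

# Two hypothetical normalized design points.
design = scale_design([[0.0, 1.0], [0.5, 0.25]], lower, upper)
print(design)
```

The same mapping, applied in reverse, gives the inputs scaled between 0 and 1 that are presented to the neural networks.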

For more complex cases, where the dimensionality of the input space is large, and a higher number of design points are required in order to develop a good predictive model, one has to resort to another method to define the design points because tables for UD are not available beyond a certain number of design points. In the present investigation, with eight decision variables, it was necessary to use well-distributed random data. This topic will be addressed in more detail in the Results section.

3.3. Artificial Neural Networks as Meta-Models

Neural networks are a very versatile meta-model, since they have the plasticity to encapsulate the underlying relationships that exist between input and output process variables based on a number of process data. In this work, the simulated data obtained from the phenomenological process model was used to train a three-layer feedforward artificial neural network (FFANN) for each objective function defined in Section 3.1.

3.3.1. ANN Architecture

Figure 4 presents the architecture of the FFANN used in this study, which comprises three layers: the input layer, one hidden layer, and an output layer [12]. The input layer is composed of nine neurons; eight neurons corresponding to the decision variables of the MOO problem, and one bias neuron. The only function of the input neurons is to accept the input variables, normalized between 0 and 1, and fan out these scaled input variables to the processing neurons of the hidden layer. The eight decision variables, as defined in Section 3.1, are the molar flowrates of propylene (F_P), air (F_A) and steam (F_S1) to the first reactor, the flowrate of vapor water (F_S2) to the second reactor, and the operating temperature and pressure of both reactors (T₁, T₂, P₁, and P₂).

The hidden layer consists of a number of processing neurons and one bias neuron. The number of hidden neurons is chosen as a compromise to obtain an excellent prediction while avoiding overfitting. Each neuron in the hidden layer, except the bias neuron, performs two simple mathematical operations: (i) the weighted sum of all outputs of the input layer, including the bias neuron; and (ii) a nonlinear transformation of the weighted sum using a sigmoid function [11]. Multiple hidden layers could be included; however, it has been shown that a single hidden layer is generally sufficient for classifying most data sets [31]. Finally, the output neuron performs the same two mathematical operations as the neurons of the hidden layer to predict the output variable, which is also scaled between 0 and 1. The prediction of the actual output is obtained by de-normalizing the scaled output. For this case study, ANNs with a single output neuron were used to predict each objective function independently to facilitate the learning process of the networks. The meta-model therefore consists of nine ANNs in parallel, one for each of the nine optimization criteria defined in Section 3.1.
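The forward calculation of such a three-layer network can be sketched as follows. The sketch assumes that each bias neuron has a constant output of 1 and that the weights are stored with the bias weight in the last position; these conventions, the function names, and the numerical values are assumptions for illustration and are not taken from the FORTRAN program used in this work.

```python
import math


def sigmoid(z):
    """Logistic sigmoid used as the nonlinear transformation."""
    return 1.0 / (1.0 + math.exp(-z))


def ffann_predict(x_scaled, w_hidden, w_output):
    """Scaled output of a three-layer feedforward network.

    x_scaled : inputs normalized between 0 and 1
    w_hidden : one row of weights per hidden neuron (last entry = bias weight)
    w_output : weights of the output neuron (last entry = bias weight)
    """
    inputs = list(x_scaled) + [1.0]            # append the input bias neuron
    hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs)))
              for row in w_hidden]
    hidden.append(1.0)                         # append the hidden bias neuron
    return sigmoid(sum(w * h for w, h in zip(w_output, hidden)))


def denormalize(y_scaled, y_min, y_max):
    """Convert a scaled output back to its actual range."""
    return y_min + y_scaled * (y_max - y_min)


# Hypothetical network with 2 inputs and 2 hidden neurons, all weights zero.
y = ffann_predict([0.2, 0.8], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0, 0.0])
print(denormalize(y, 10.0, 20.0))
```

With all weights equal to zero, every weighted sum is zero and the sigmoid returns 0.5, so the de-normalized prediction falls at the middle of the output range. Each evaluation involves only a few multiplications and additions, which is what makes the network so much faster than the phenomenological model.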

3.3.2. Building the ANN

The experimental data set was divided into learning and validation data to train the networks using a ratio of 70/30, respectively, and 1000 random test points were used to test each ANN after validation. This “second validation” had no impact on the ANN, as it was only performed to confirm the good fit and precision of the neural networks when exposed to new input data. This additional validation was only possible because a phenomenological model of the process was available. To develop an ANN model of the process based on the series of decision variables and objectives, the connection weights were initially assigned small random values and the predicted output was calculated. The sum of squares of the differences between the actual and predicted outputs was used to change the connection weights so as to minimize its value. In this investigation, the quasi-Newton optimization method was used to determine the optimal set of connection weights. The sum of squares based on the learning data set was used to adjust the connection weights until the minimum was achieved, which usually requires a large number of iterations. Throughout the learning process, the sum of squares of the errors based on the validation data set was also calculated at each iteration, and the selected set of weights was the one associated with the minimum validation sum of squares. This procedure is an effective way to avoid overfitting and to choose an adequate number of neurons in the hidden layer [32]. Consequently, the R² value was plotted against the number of hidden neurons to determine the proper number of hidden neurons for every ANN. The program used to build the networks was coded in FORTRAN.
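The selection of the weights associated with the minimum validation sum of squares can be sketched independently of the optimizer. In the sketch below, `step` stands in for one iteration of the training method (quasi-Newton in this work) and `sse_validation` for the sum of squares on the validation data; both are hypothetical placeholders, replaced here by a one-parameter toy problem.

```python
def train_with_validation(step, sse_validation, weights, n_iter):
    """Iterate the training step and keep the weights with the lowest
    validation sum of squares encountered along the way."""
    best_weights = list(weights)
    best_sse = sse_validation(weights)
    for _ in range(n_iter):
        weights = step(weights)                 # one training iteration
        sse = sse_validation(weights)           # error on the validation set
        if sse < best_sse:
            best_weights, best_sse = list(weights), sse
    return best_weights, best_sse


# Toy problem: training drives the single weight towards 3.0, whereas the
# validation error is lowest at 2.0 (both values are hypothetical).
toy_step = lambda w: [w[0] + 0.5]
toy_sse = lambda w: (w[0] - 2.0) ** 2

best_w, best_sse = train_with_validation(toy_step, toy_sse, [0.0], 6)
print(best_w, best_sse)
```

In the toy problem, training continues past the point where the validation error is lowest, and the weights retained are those from that point rather than the final ones.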

3.3.3. Modified Garson Algorithm

Because an ANN is a black box model, one would usually expect that no information can be retrieved from the surrogate model besides its predictive ability for a set of input data. Nevertheless, knowing the connection weights of the ANNs, it is possible to determine the percentages of the relative importance of each input variable used to generate a specific output in order to perform a sensitivity analysis. For this purpose, the modified Garson method proposed by Goh [33] was used. The algorithm is given by Equation (2) [33,34,35]:

Q_{ik} = \frac{\sum_{j=1}^{L} \frac{| W_{ij} W_{jk} |}{\sum_{r=1}^{N} | W_{rj} |}}{\sum_{i=1}^{N} \sum_{j=1}^{L} \frac{| W_{ij} W_{jk} |}{\sum_{r=1}^{N} | W_{rj} |}}

(2)

where Q_{ik} represents the relative influence of the input variable i on the output variable k, W_{ij} denotes the elements of the matrix of connection weights between the input neuron i and the hidden neuron j, and W_{jk} corresponds to the connection weights between the hidden neuron j and the output neuron k. The indices i and j refer to the neurons in the input and hidden layers, respectively. The term \sum_{r=1}^{N} | W_{rj} | is the sum of the absolute values of the connection weights between the N input neurons and the hidden neuron j. L stands for the number of hidden neurons connected to the output neuron k. Equation (2) is a modified Garson algorithm, since the absolute value of each weight was used in order to avoid the counteracting influence of positive and negative values of the connection weights [34]. It is important to note that the percentages obtained with this equation are normalized, such that the relative importance of all of the inputs for one objective function adds up to 100.
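The relative importance calculation can be sketched for a network with a single output neuron. The sketch follows the modified Garson expression with absolute values and reports percentages; the function name and the small weight matrices are hypothetical, and it is assumed that no hidden neuron has all of its incoming weights equal to zero.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance (%) of each input for a single-output network.

    w_ih[i][j] : weight between input neuron i and hidden neuron j
    w_ho[j]    : weight between hidden neuron j and the output neuron
    """
    n_in, n_hid = len(w_ih), len(w_ho)
    # Sum of absolute incoming weights for each hidden neuron (assumed nonzero).
    col_sum = [sum(abs(w_ih[r][j]) for r in range(n_in)) for j in range(n_hid)]
    # Contribution of each input, summed over the hidden neurons.
    raw = [sum(abs(w_ih[i][j] * w_ho[j]) / col_sum[j] for j in range(n_hid))
           for i in range(n_in)]
    total = sum(raw)
    return [100.0 * q / total for q in raw]


# Hypothetical network with 2 inputs and 2 hidden neurons.
importance = garson_importance([[1.0, 1.0], [1.0, 3.0]], [2.0, 4.0])
print(importance)
```

For these hypothetical weights, the second input carries two thirds of the total importance because of its larger weight to the second hidden neuron.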

3.4. Optimization Algorithm

To solve the multi-objective optimization problem, gradient-free methods are known to be a good alternative [36]. Evolutionary algorithms that are based on Darwin’s theory of survival of the fittest have been widely applied to solve these types of problems [37]. The most well-known and widely used method in this category is the genetic algorithm (GA), developed by Holland in the 1970s [38]. One major drawback of these algorithms is that they require thousands of evaluations of the process model to reach the Pareto front, which consists of only non-dominated solutions. For a process model that is computationally intensive, which is common, it may require days to circumscribe the Pareto domain. In those cases, the use of a meta-model or surrogate model is very advantageous. In this investigation, ANNs were used as surrogate models for each of the objective functions.

Once trained and validated, the nine ANNs, one for each objective, are used within the MOO algorithm. To solve the optimization problem, the dual population evolutionary algorithm (DPEA), coded in FORTRAN, was used in this work [39,40]. As is the case for other GAs, this algorithm is based on the evolution of a population of individuals, each of which is a solution to the optimization problem [4]. The initial population comprised 5000 solutions that were obtained with different sets of decision variables, each decision variable being randomly generated within its permissible range. The solutions were then compared in pairs to determine the number of times a given solution was dominated. For the next generation, all currently non-dominated solutions were kept along with a fraction of the least dominated solutions. The other solutions were discarded, and the solutions retained from the previous generation were then used to produce new solutions to bring the population back to its original number of individuals. This procedure was repeated until 5000 non-dominated solutions were obtained. Using a surrogate model, the MOO converged rapidly, and the 5000 non-dominated solutions led to a well-defined Pareto domain.
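The selection step described above, counting how often each solution is dominated and retaining the non-dominated solutions with a fraction of the least dominated ones, can be sketched as follows. This is not the DPEA code itself: the objectives are assumed to be in maximization form, and the function names, the retained fraction, and the small population are hypothetical.

```python
def dominates(a, b):
    """True if a dominates b (all objectives in maximization form)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def domination_counts(objs):
    """Number of times each solution is dominated by another one."""
    return [sum(dominates(q, p) for q in objs) for p in objs]


def select_survivors(population, objs, keep_fraction=0.5):
    """Keep all non-dominated solutions plus a fraction of the least
    dominated ones (keep_fraction is a hypothetical tuning parameter)."""
    counts = domination_counts(objs)
    order = sorted(range(len(population)), key=lambda i: counts[i])
    n_nondominated = sum(1 for c in counts if c == 0)
    n_extra = int(keep_fraction * (len(population) - n_nondominated))
    return [population[i] for i in order[:n_nondominated + n_extra]]


# Hypothetical population of five solutions with two objectives.
names = ["a", "b", "c", "d", "e"]
objs = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
print(domination_counts(objs))
print(select_survivors(names, objs))
```

Solutions a, b, and d are non-dominated and always survive; c, dominated only once, is retained ahead of e, which is dominated by every other solution. In a full algorithm, the survivors would then be used to generate new individuals until the population is restored to its original size.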

3.5. Ranking of the Pareto Domain

The Pareto domain was circumscribed without any bias, that is, with no preferences given to any of the objectives, apart from specifying whether a given objective needs to be minimized or maximized. Some Pareto-optimal solutions are clearly better than others, such that a method is required to rank all Pareto-optimal solutions using preferences expressed by an expert or decision-maker. In this investigation, the net flow method (NFM) was used. This method requires an expert who has a good knowledge of the process and can give an appreciation of the nature of each criterion. This information is expressed for each objective function via four parameters, namely the relative weight (W_k), the indifference threshold (Q_k), the preference threshold (P_k), and the veto threshold (V_k) [41,42]. Using these quantitative parameters, the NFM performs a pairwise comparison of all Pareto-optimal solutions and attributes a score to each one, which then allows all of the solutions to be ranked. An interesting feature of this ranking method is its robustness, which means that changes in the weights will not result in major changes of the optimal zones [41].

4. Results

4.1. Construction of the Meta-Model

The initial attempt to develop the nine ANNs, as a surrogate model to represent each of the nine objective functions, was performed with training and validation data sets that were the design points of the uniform design. In this first attempt, 50 and 20 design points were used for the training and validation data sets, respectively. Eight decision variables and five levels were used to separate the ranges of the decision variables, which correspond to U₅₀(5⁸) and U₂₀(5⁸) for the training and validation data sets, respectively. The coefficients of determination (R²) for each of the nine objective functions (Table 1) are plotted as a function of the number of hidden neurons (Figure 5).
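The coefficient of determination used to compare the networks can be computed as in the following sketch. The usual definition, one minus the ratio of the residual sum of squares to the total sum of squares, is assumed here since the text does not state it explicitly; the function name and the data are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted values of one objective function.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(actual, predicted))
```

Evaluating this quantity on the validation data for each number of hidden neurons gives curves of the kind reported in Figure 5.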

The results of Figure 5 show that some objectives, namely the power of the two compressors (OF₁ and OF₅) and the heat recovery of the first reactor (OF₂), are relatively well predicted. However, the ANNs of the other six objectives show poorer predictions with R² values below 0.90. Furthermore, no clear trend can be observed for those OFs, although one would expect the R² value to increase as the number of neurons increases. These results suggest that, for these objectives, the number of design data points is insufficient to allow the ANN to capture the underlying relationships that exist between the decision variables and the objectives. The large number of input variables to the ANNs also points to the necessity of presenting the neural networks with richer information. Since the available tables of uniform design are limited to a relatively small number of design points, it was decided to use well-distributed random design points, which offer the possibility of using any desired number of design points.

A series of ANNs was developed for an increasing number of hidden neurons and different numbers of design points. The total number of design points was divided in an approximate ratio of 70:30 between the learning and validation data, respectively. Results for the objective functions that showed the best and worst predictions, namely the compression power of the first compressor (OF₁) and the heat recovery of the second reactor (OF₆), respectively, are presented in Figure 6. The total number of design points (training and validation) in the data set varied between 70 and 1430. The predictions for OF₁ were very good for a relatively low number of hidden neurons and a small number of training data points. Indeed, the very high R² value indicates that the predictions of the neural network for the compression power of the first compressor were independent of the number of design data points above approximately 140. For the heat recovery of the second reactor (OF₆), the coefficient of determination (R²) increased with the number of design points, whereas it was not a function of the number of hidden neurons above five neurons. This trend was more pronounced for some objectives because of the way they depend on the input (decision) variables. For example, a simple dependency prevailed for OF₁, as it was mainly correlated to the air flowrate and the operating pressure of the first reactor. In contrast, OF₆ has a much more complex dependency, as it is affected by a larger number of inputs, namely the four input flowrates and the operating temperature and pressure of the first reactor, and thereby requires more data points for the ANN to capture the underlying relationships between its inputs and this output.
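The data split and the goodness-of-fit measure used throughout this section can be sketched as follows. The random shuffling and the helper names are assumptions for illustration; the R² expression is the standard one.

```python
import numpy as np

def split_70_30(X, y, seed=0):
    """Shuffle the design points and split them roughly 70:30."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(round(0.7 * len(X)))
    train, valid = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[valid], y[valid]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

For instance, a data set of 1430 points would yield 1001 learning points and 429 validation points with this split.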

Based on the previous discussion, we performed a sensitivity analysis to extract the contribution of each neural network input to each output. Numerous techniques have been proposed to provide this information, and they partly alleviate the black-box character of neural networks. In this investigation, the modified Garson method was used [35]. The results of this sensitivity analysis are presented in Table 3 as the percentage relative importance of the eight decision variables for each objective function in the ANNs. It is important to note that the percentages in Table 3 are normalized such that each row, associated with one objective function, adds up to 100; if more variables are correlated to an output, the individual percentages will necessarily be lower. First, these results show the paramount importance of the bias neuron of the input layer, which acts in a similar way to the intercept of a linear equation, shifting the weighted sum to obtain a better fit. Some strong correlations are logically expected and indicate that the neural networks were trained adequately to capture the underlying behavior of the process. For instance, this is the case for the power of the two compressors, which is strongly correlated with the desired pressures and the pertinent flowrates. These sensitivity coefficients offer valuable insight into the effect of each decision variable on the objective functions.
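A minimal sketch of a Garson-type weight partitioning is given below. It uses the absolute values of the connection weights and treats the bias neuron as one extra input row, which is an assumption; the exact modified algorithm used in this work may differ in detail, and the function name is illustrative.

```python
import numpy as np

def garson_importance(W_ih, w_ho):
    """Relative importance (%) of each input for a single-output network.

    W_ih: (n_inputs, n_hidden) input-to-hidden weights, bias row included.
    w_ho: (n_hidden,) hidden-to-output weights.
    """
    # Contribution of input i through hidden neuron j.
    contrib = np.abs(W_ih) * np.abs(w_ho)[None, :]
    # Share of each input within each hidden neuron.
    share = contrib / contrib.sum(axis=0, keepdims=True)
    # Sum the shares over the hidden neurons and normalize to 100.
    importance = share.sum(axis=1)
    return 100.0 * importance / importance.sum()
```

By construction the returned percentages add up to 100, which matches the row normalization described for Table 3.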

A large number of ANNs were obtained to represent the nine objective functions. The final selection was made as a compromise between the following criteria: (1) the minimum number of data points for training and validating the neural networks; (2) a coefficient of determination (R²) higher than 0.9; and (3) the minimum number of neurons. The selected set of nine ANNs, one for each objective function, can now be used as the surrogate model to generate the Pareto domain and find the optimal set of operating decision variables.

Before proceeding, the quality of the predictions is examined. Figure 7a,b present the ANN predictions of the conversion of the second reactor (OF₈) and the compression power of the first compressor (OF₁), respectively, as a function of the values calculated using the phenomenological model. The green points represent the learning data, the orange points correspond to the validation data, and the grey points to the testing data. The testing data were generated using the phenomenological model by randomly selecting the decision variables within their allowable ranges, as defined in Table 2. This data set was used as a “second validation” to confirm the good fit and precision of the ANNs when exposed to new input data. As previously mentioned, it had no impact on the meta-model training. The predictions of Figure 7a,b correspond to the ANNs with the lowest and highest R² values for all data presented: 0.912 and 0.999, respectively. Predictions for the other OFs were very good as well, with R² values between these two values. The predicted conversion in the second reactor (R-101) had the majority of the points near the 45° line, but with the lowest R² value due to a few scattered points with poor predictions. When examining the sensitivity parameters of Table 3, the two objective functions that are influenced by a larger number of input variables are the conversions of the first and second reactors (OF₄ and OF₈). As mentioned before, the larger the number of decision variables to which an objective function is correlated, the more learning data points are required to obtain good predictions. As a compromise needs to be made between the R² value and the number of learning data, a value of R² above 0.9 was considered a good result.

4.2. Multi-Objective Optimization

After having obtained a good surrogate model, i.e., one consisting of nine ANNs, one for each objective function, the Pareto domain was circumscribed with the DPEA and all Pareto-optimal solutions were ranked with the NFM, using both the phenomenological model and the surrogate model for the reactor section to compare the results.
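The notion of Pareto optimality that underlies this step can be sketched with a brute-force non-domination filter. This is not the DPEA itself, only an illustration of the dominance test that any such algorithm relies on; the function name is hypothetical.

```python
import numpy as np

def pareto_mask(F, maximize):
    """Return a boolean mask of the non-dominated rows of F."""
    F = np.asarray(F, dtype=float)
    # Orient every objective so that larger is better.
    G = np.where(np.asarray(maximize), F, -F)
    mask = np.ones(len(G), dtype=bool)
    for i in range(len(G)):
        # j dominates i if it is no worse everywhere and strictly better somewhere.
        dominators = np.all(G >= G[i], axis=1) & np.any(G > G[i], axis=1)
        mask[i] = not dominators.any()
    return mask
```

With nine objectives, `maximize` would follow the Max/Min column of Table 1. The cost of this filter grows quadratically with the number of candidate solutions, which is one reason evolutionary algorithms are preferred for large populations.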

Since there are nine objective functions, the Pareto front is in fact a surface in a nine-dimensional space. To visualize the ranked Pareto domain, it is necessary to resort to two-dimensional projections. In this paper, the ranked Pareto domain is presented for four objective functions, for both the surrogate and phenomenological models. Figure 8a,b present the Pareto domain projected on the two-dimensional space of the productivity of acrylic acid (OF₇) and the heat recovery of the second reactor (OF₆), while Figure 8c,d present the projection on the plane of the conversion of propylene (OF₄) and the heat recovery of the first reactor (OF₂). Based on the NFM ranking, the Pareto domain was divided into four different regions: (i) the best solution in red; (ii) Pareto-optimal solutions ranked in the top 5%; (iii) solutions in the next 45%; and (iv) the remaining 50% of the solutions. The best-ranked solution of Figure 8b corresponds to a productivity of 1.179 kmol/m³h and a heat recovery of 10,755 kW in the second reactor. When the values of the decision variables associated with the best-ranked Pareto-optimal solution were used within the first-principle based model for comparison purposes, the values for OF₇ and OF₆ were 1.202 kmol/m³h and 11,104 kW, respectively, yielding errors in the vicinity of 2%–3%. This was also the case for the other objective functions, as shown in Table 4. When the phenomenological model is used to circumscribe the Pareto domain and the Pareto-optimal solutions are then ranked with the NFM, as depicted in Figure 8a, values of 1.2524 kmol/m³h and 11,291 kW were obtained for OF₇ and OF₆, respectively, for differences of approximately 7% and 5%. The corresponding conversion in R-100 of the best-ranked solution was 94.97% and 97.27% for the meta-model and the phenomenological model, respectively, as shown in Figure 8c,d. In contrast, a conversion of 96.25% was predicted when the decision variables of the best-ranked solution identified with the ANNs were used in the first-principle based model.
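The division of the ranked Pareto domain into four regions can be sketched as below, assuming that a higher NFM score means a better rank. The function name and the band labels are illustrative only.

```python
import numpy as np

def rank_bands(scores):
    """Label each solution as best, top 5%, next 45%, or remaining 50%."""
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    order = np.argsort(-scores, kind="stable")  # best score first
    position = np.empty(n, dtype=int)
    position[order] = np.arange(n)              # rank position of each solution
    bands = np.full(n, "remaining 50%", dtype=object)
    bands[position < 0.50 * n] = "next 45%"
    bands[position < 0.05 * n] = "top 5%"
    bands[position == 0] = "best"
    return bands
```

For a population of 5000 solutions, this gives one best solution, 249 further solutions in the top 5%, 2250 in the next 45%, and 2500 in the remaining half.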

The Pareto domains generated with the phenomenological and surrogate models were very similar, as illustrated in Figure 8. Occasionally, some minor differences between the two Pareto domains occurred. For instance, there is a small region in Figure 8a that is empty, whereas the same region is covered in Figure 8b. Fortunately, it was observed in this investigation that these discrepancies very often appear in regions where Pareto-optimal solutions are ranked relatively low. These discrepancies were usually due to the inability of the neural network to recognize intrinsic constraints that are embedded in the code of the first-principle based model and that restrict the operation within those limits. As the ANNs are built from data generated by the model, the meta-model cannot explicitly handle such constraints unless they are treated as soft constraints, as was the case for the oxygen concentration. Instead, it filled the empty region by interpolating the data provided to train the ANNs. The best-ranked solution as well as the first 5% of the Pareto-optimal solutions were well identified by the meta-model.

The similarity of the Pareto domains (Figure 8) obtained using the phenomenological and surrogate models is a clear indication that the ANNs were able to adequately predict the existing relationship between the decision variables and the objective functions. To make a more complete comparison between the two Pareto domains, the decision variables and the objective functions of the best-ranked solution of the surrogate model were normalized with respect to the best-ranked solution obtained with the phenomenological model, where a value of one was assigned to the latter. The normalized variables are presented in Figure 9a,b. These results clearly show that the use of a surrogate model to perform the MOO is a viable solution, as the great majority of the decision variables and objective functions were very close to the best-ranked solution of the phenomenological model. The steam flowrate input was the only variable that had an error above 10%, meaning that using the values obtained from the meta-model will result in 20% more steam usage than what would be required if the first-principle based model was used.

For comparison with the weighted sum method, the results of the Pareto domain were used to determine the optimal solution that, as explained earlier, is a single point in the feasible region for a single-objective optimization (SOO). All the objectives were assigned a weight of 0.1, except for OF₃ and OF₇, which were assigned a weight of 0.15. The resulting optimal solution corresponded to a solution ranked in the top 5% when using the NFM, more specifically the solution ranked 232nd out of 5000. The values of the objective functions of this solution are presented in Table 5.
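A minimal sketch of this weighted sum selection is shown below. The weights and the Max/Min directions come from the text and Table 1; the min-max scaling of each objective over the Pareto set is an assumption, since the normalization used is not stated here, and the function name is hypothetical.

```python
import numpy as np

# Max/Min direction of OF1..OF9 (Table 1) and the weights quoted in the text.
MAXIMIZE = np.array([False, True, True, True, False, True, True, True, False])
WEIGHTS = np.array([0.10, 0.10, 0.15, 0.10, 0.10, 0.10, 0.15, 0.10, 0.10])

def weighted_sum_best(F, weights=WEIGHTS, maximize=MAXIMIZE):
    """Index of the Pareto-optimal solution with the highest weighted sum."""
    F = np.asarray(F, dtype=float)
    span = F.max(axis=0) - F.min(axis=0)
    span[span == 0] = 1.0                       # avoid division by zero
    scaled = (F - F.min(axis=0)) / span         # assumed min-max scaling
    scaled = np.where(maximize, scaled, 1.0 - scaled)  # larger is better
    return int(np.argmax(scaled @ weights))
```

The nine weights add up to one (seven objectives at 0.1 and two at 0.15), so the weighted sum of the scaled objectives lies between 0 and 1.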

The computation time required to circumscribe the Pareto domain via the surrogate model was 38 s. On the other hand, obtaining the Pareto domain for the reactor section using the first-principle model took 558 s, which means that the optimization process was 15.5 times faster using the ANNs. In this particular instance, one simulation with the first-principle model was relatively fast. In other problems, it may take many days of computation time to obtain a sufficient number of Pareto-optimal solutions, and this is where the methodology proposed in this work would be most beneficial. In addition, once the surrogate model is ready, it also allows a large number of optimization scenarios to be rapidly analyzed. Even if small changes were to be made, the ANN could easily adapt to those changes.

5. Conclusions

The aim of this work was to propose an easy-to-follow methodology that would counteract the high computational load of performing multi-objective optimization using first-principle based models. After carrying out the optimization process using both the phenomenological model and the meta-model consisting of nine three-layer ANNs in parallel, one for each objective function, the computational time was reduced by a factor of 15.5 when using the meta-model. This approach can also be very useful for hypothesis testing, owing to the quick convergence of the Pareto domain using ANNs. It should also be noted that the meta-model was able to properly model the existing relationships between the decision variables and the objective functions. This was confirmed by the proper determination of the Pareto domain when comparing the optimization results of the meta-model against the ones obtained with the mathematical model. The results show that the best-ranked solution from the NFM using the ANN meta-model was very close to the optimal solution obtained with the phenomenological model. This work therefore successfully demonstrates the advantage of using ANNs as surrogate models to carry out MOO.

Author Contributions

Conceptualization, G.C.S. and J.T.; Methodology, G.C.S. and J.T.; Software, G.C.S. and J.T.; Validation, G.C.S., S.O. and J.T.; Formal Analysis, G.C.S.; Investigation, G.C.S.; Writing-Original Draft Preparation, G.C.S.; Editing, G.C.S., S.O. and J.T.; Supervision, S.O. and J.T.; Project Funding Acquisition, J.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Sciences and Engineering Research Council of Canada under the Discovery Grant Program (Grant Number RGPIN 004572).

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

AA	Acrylic acid	-
Ac	Acrolein	-
Ace	Acetaldehyde	-
AceA	Acetic acid	-
ANN	Artificial neural network	-
C_i	Molar concentration of i	kmol/m³cat
CWR	Catalytic wall reactor	-
DOE	Design of experiments	-
F_A	Molar flowrate of air	kmol/h
F_P	Molar flowrate of propylene	kmol/h
F_S	Molar flowrate of steam/water vapor	kmol/h
LFL	Lower flammability limits	vol%
MOC	Minimum oxygen concentration	vol%
MOO	Multi-objective optimization	-
$\dot{n}$	Molar flowrate	kmol/h
NFM	Net flow method	-
OF	Objective function	-
n_j	Order of reaction j	-
p_i	Partial pressure of i	bar^nj
P	Pressure	bar
PBR	Packed bed reactor	-
Prop	Propylene	-
R	Gas constant	J/mol K
R²	Coefficient of determination	-
T	Temperature	°C
UD	Uniform design	-
UFL	Upper flammability limits	vol%
V	Volume	m³
W	Catalyst weight and connection weight of the ANN	kg

Greek Symbols

$η$	Efficiency of the compressor	%
$ξ$	Extent of reaction	kmol/h

Subscripts

1	First reactor
2	Second reactor
Feed	Feed stream to first reactor
i	Index of the input layer of the ANN
j	Index of the hidden layer of the ANN
k	Index of the output layer of the ANN

Appendix A. Set of Rate Equations for Each Reaction

The parameters for the reaction rates involved in the first reactor are presented in Table A1 and the rate law equations are below.

Table A1. Experimental parameters for the rate law of acrolein formation [18].

	Constants	Value	Units
r₁	k_1Red,o	0.0628	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$
	k_1Ox,o	16,000	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}^{0.75}}]$
	α_H2O	8.2	$[\frac{1}{\mathrm{bar}}]$
r₂	k_2,o	2.3200	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}^{0.86 + 0.3}}]$
r₃	k_3,o	0.0150	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$
r₄	k_4,o	1.4700	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}^{0.73}}]$
r₅	k_5,o	0.0363	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$
	K_H2O	1.9	$[\frac{1}{\mathrm{bar}}]$
r₆	k_6,o	0.00034	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$
	K_H2O	1.9	$[\frac{1}{\mathrm{bar}}]$
r₇	k_7,o	1.3800	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$
	K_H2O,AA	55.1	$[\frac{1}{\mathrm{bar}}]$
r₈	k_8,o	0.00038	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$
r₉	k_9,o	4.75 × 10⁹	$[\frac{\mathrm{kmol}}{\mathrm{kg} \cdot \mathrm{s} \cdot \mathrm{bar}}]$

The following equations correspond to the rate law of the acrolein formation (Equations (A1) and (A2)). As previously mentioned, the expression for the reaction rate from propylene to acrolein will change depending on whether the temperature is below or above 360 °C. Below 360 °C, if the oxygen concentration increases, the formation of acrolein is accelerated. On the other hand, if the temperature is above 360 °C, the catalytic reduction by propylene will be the rate-determining step.

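$\text{If } T_{1} \geq 360\,^{\circ}\mathrm{C}: \quad r_{1} = k_{1\mathrm{Red},o} \cdot e^{-\left(\frac{39600}{RT}\right)} \cdot p_{\mathrm{Prop}}$

(A1)

$\text{If } T_{1} < 360\,^{\circ}\mathrm{C}: \quad r_{1} = k_{1\mathrm{Ox},o} \cdot e^{-\left(\frac{114000}{RT}\right)} \cdot p_{\mathrm{O_2}}^{0.75} \cdot \left(2 - e^{-\left(α_{\mathrm{H_2O}} \cdot p_{\mathrm{H_2O}}\right)}\right)$

(A2)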
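Equations (A1) and (A2) can be evaluated with the short sketch below, using the constants of Table A1. It assumes that the activation energies are in J/mol, that R = 8.314 J/(mol K), that the temperature in the exponential is absolute, and that partial pressures are in bar; the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K), assumed units

def r1_acrolein(T1_C, p_prop, p_o2, p_h2o):
    """Rate of acrolein formation, Equations (A1)-(A2), T1 in deg C."""
    T = T1_C + 273.15  # absolute temperature for the Arrhenius term
    if T1_C >= 360.0:
        # Catalyst reduction by propylene is rate-determining (A1).
        return 0.0628 * math.exp(-39600.0 / (R * T)) * p_prop
    # Catalyst re-oxidation is rate-determining (A2).
    water_term = 2.0 - math.exp(-8.2 * p_h2o)
    return 16000.0 * math.exp(-114000.0 / (R * T)) * p_o2 ** 0.75 * water_term
```

In this form the rate is first order in propylene at or above 360 °C and independent of the propylene partial pressure below that temperature, which mirrors the change of rate-determining step described above.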

The side reactions are described using power-law expressions (A3) to (A10). Equations (A6)–(A8) consider water in the adsorption term of the hyperbolic rate expressions since water suppresses the formation of both carbon oxides and leads to higher yields of acrolein and acrylic acid.

$r_{2} = k_{2,o} \cdot e^{-\left(\frac{72500}{RT}\right)} \cdot p_{\mathrm{Ac}}^{0.86} \, p_{\mathrm{O_2}}^{0.30}$

(A3)

$r_{3} = k_{3,o} \cdot e^{-\left(\frac{52400}{RT}\right)} \cdot p_{\mathrm{O_2}}$

(A4)

$r_{4} = k_{4,o} \cdot e^{-\left(\frac{86700}{RT}\right)} \cdot p_{\mathrm{O_2}}^{0.73}$

(A5)

$r_{5} = \frac{k_{5,o} \cdot e^{-\left(\frac{60900}{RT}\right)} \cdot p_{\mathrm{O_2}}}{1 + K_{\mathrm{H_2O}} \cdot p_{\mathrm{H_2O}}}$

(A6)

$r_{6} = \frac{k_{6,o} \cdot e^{-\left(\frac{38100}{RT}\right)} \cdot p_{\mathrm{Prop}}}{1 + K_{\mathrm{H_2O}} \cdot p_{\mathrm{H_2O}}}$

(A7)

$r_{7} = \frac{k_{7,o} \cdot e^{-\left(\frac{82900}{RT}\right)} \cdot p_{\mathrm{O_2}}}{1 + K_{\mathrm{H_2O,AA}} \cdot p_{\mathrm{H_2O}}}$

(A8)

$r_{8} = k_{8,o} \cdot e^{-\left(\frac{14900}{RT}\right)} \cdot p_{\mathrm{Ace}}$

(A9)

$r_{9} = k_{9,o} \cdot e^{-\left(\frac{178700}{RT}\right)} \cdot p_{\mathrm{AceA}}$

(A10)

As for the parameters of the reaction rates involved in the second reactor, they are presented in Table A2 and the rate law equations are below.

Table A2. Experimental parameters for the rate law for acrylic acid formation [20].

	Constants	Values	Units
r₁₀	k_10,o	19,436	$[\frac{1}{\mathrm{s}}]$
	K_o	9.78 × 10⁻⁶	$[\frac{1}{\mathrm{s}}]$
r₁₁	k_11,o	49,070	$[\frac{1}{\mathrm{s}}]$
	K_o	9.78 × 10⁻⁶	$[\frac{1}{\mathrm{s}}]$

The following equations correspond to the rate law of the acrylic acid and CO₂ formation:

$r_{10} = \frac{k_{10,o} \cdot e^{-\left(\frac{55019.6}{RT}\right)} \cdot C_{\mathrm{Ac}}}{1 + K_{o} \cdot e^{-\left(\frac{-31421.84}{RT}\right)} \cdot C_{\mathrm{AA}}}$

(A11)

$r_{11} = \frac{k_{11,o} \cdot e^{-\left(\frac{72006.64}{RT}\right)} \cdot C_{\mathrm{Ac}}}{1 + K_{o} \cdot e^{-\left(\frac{-31421.84}{RT}\right)} \cdot C_{\mathrm{AA}}}$

(A12)
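A short sketch of Equations (A11) and (A12), with the constants of Table A2, is given below. It assumes energies in J/mol, R = 8.314 J/(mol K), and absolute temperature; the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K), assumed units

def acrolein_oxidation_rates(T_K, C_ac, C_aa):
    """Return (r10, r11) from Equations (A11)-(A12), T_K in kelvin."""
    K_O = 9.78e-6
    # The double negative in the exponent gives exp(+31421.84 / RT).
    inhibition = 1.0 + K_O * math.exp(31421.84 / (R * T_K)) * C_aa
    r10 = 19436.0 * math.exp(-55019.6 / (R * T_K)) * C_ac / inhibition
    r11 = 49070.0 * math.exp(-72006.64 / (R * T_K)) * C_ac / inhibition
    return r10, r11
```

Because both rates share the same acrolein dependence and the same inhibition term, their ratio depends on temperature only, and increasing the acrylic acid concentration slows both reactions equally.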

References

1. Marler, R.T.; Arora, J.S. Survey of multi-objective optimization methods for engineering. Struct. Multidiscip. Optim. 2004, 26, 369–395.
2. Rondeau, T.W.; Bostian, C.W. Cognitive Techniques: Physical and Link Layers. In Cognitive Radio Technology, 2nd ed.; Bradley Department of Electrical and Computer Engineering Virginia Tech: Blacksburg, VA, USA, 2009; pp. 219–268.
3. Fleming, P. Designing control systems with multiple objectives. IEE Master Class Adv. Control Technol. 1999, 142, 4.
4. Dias, L.; Asadi, E.; da Silva, M.G.; Glicksman, L.; Antunes, C.H. Multi-objective optimization for building retrofit: A model using genetic algorithm and artificial neural network and an application. Energy Build. 2014, 81, 444–456.
5. Elmeligy, A.; Mehrani, P.; Thibault, J. Artificial Neural Networks as Metamodels for the Multiobjective Optimization of Biobutanol Production. Appl. Sci. 2018, 8, 961.
6. Farshad, F.; Iravaninia, M.; Kasiri, N.; Mohammadi, T.; Ivakpour, J. Separation of toluene/n-heptane mixtures experimental, modeling and optimization. Chem. Eng. J. 2011, 173, 11–18.
7. Nascimento, C.A.O.; Giudici, R.; Guardani, R. Neural network based approach for optimization of industrial chemical processes. Comput. Chem. Eng. 2000, 24, 2303–2314.
8. Magnier, L.; Haghighat, F. Multiobjective optimization of building design using TRNSYS simulations, genetic algorithm, and Artificial Neural Network. Build. Environ. 2010, 45, 739–746.
9. Tagliarini, G.A.; Christ, J.F.; Page, E.W. Optimization Using Neural Networks. IEEE Trans. Comput. 1991, 40, 1347–1358.
10. Altissimi, R.; Brambilla, A.; Deidda, A.; Semino, D. Optimal operation of a separation plant using artificial neural networks. Comput. Chem. Eng. 1998, 22, S939–S942.
11. Anna, H.R.S.; Barreto, A.G.; Tavares, F.W.; de Souza, M.B. Machine learning model and optimization of a PSA unit for methane-nitrogen separation. Comput. Chem. Eng. 2017, 104, 377–391.
12. Yao, X. Evolving Artificial Neural Networks. Proc. IEEE 1999, 87, 1423–1447.
13. Wittcoff, H.A.; Reuben, B.G.; Plotkin, J.S. Industrial Organic Chemicals, 3rd ed.; Wiley: Hoboken, NJ, USA, 2013.
14. Farmer, J.J. Process for Production of Acrylic Acid. Patent WO2016130993A1 International Application No. PCT/US2016/017868, 12 February 2016.
15. Xu, X.B.; Lin, J.P.; Cen, P.L. Advances in the Research and Development of Acrylic Acid Production from Biomass. Chin. J. Chem. Eng. 2006, 14, 419–427.
16. Lin, M.M. Selective oxidation of propane to acrylic acid with molecular oxygen. Appl. Catal. A Gen. 2001, 207, 1–16.
17. Redlingshöfer, H.; Kröcher, O.; Böck, W.; Huthmacher, K.; Emig, G. Catalytic Wall Reactor as a Tool for Isothermal Investigations in the Heterogeneously Catalyzed Oxidation of Propene to Acrolein. Ind. Eng. Chem. Res. 2002, 41, 1445–1453.
18. Redlingshöfer, H.; Fischer, A.; Weckbecker, C.; Huthmacher, K.; Emig, G. Kinetic Modeling of the Heterogeneously Catalyzed Oxidation of Propene to Acrolein in a Catalytic Wall Reactor. Ind. Eng. Chem. Res. 2003, 42, 5482–5488.
19. Drysdale, D. An Introduction to Fire Dynamics, 3rd ed.; Wiley: Hoboken, NJ, USA, 2011.
20. Malshe, V.C.; Chandalia, S.B. Vapour Phase Oxidation of Acrolein to Acrylic Acid on Mixed Oxides as Catalyst. J. Appl. Chem. Biotechnol. 1977, 27, 575–584.
21. Perry, R.H.; Green, D.W. Perry’s Chemical Engineers’ Handbook, 8th ed.; McGraw-Hill: New York, NY, USA, 2007.
22. Zabetakis, M.G. Flammability Characteristics of Combustible Gases and Vapors; Bureau of Mines: Washington, DC, USA, 1965.
23. Bub, G.; Mosler, J.; Maschmeyer, D.; Sabbagh, A.; Fornika, R.; Peuckert, M. Process for the Production of Acrylic Acid. U.S. Patent US7294741B2, 13 November 2007.
24. Tsuneki, O.H.; Nonoguchi, I.M.; Nishi, I.K. Process for Producing Acrolein, Acrylic Acid and Derivatives Thereof. U.S. Patent US9422377B2, 23 August 2016.
25. Wibawanta, S.A.S. Catalytic Partial Oxidation of Propylene for Acrolein Production. Ph.D. Thesis, Curtin University, Perth, Australia, 2011.
26. Shiraishi, T.; Kishiwada, S.; Shimizu, S.; Shigern, H.; Hiroshi, I.; Yoshihiko, N. Catalytic Process for the Preparation of Acrolein. U.S. Patent US3970702A, 20 July 1976.
27. Kai-Tai, F.; Lin, D.K.J. Uniform Experimental Designs and their Applications in Industry. Handb. Stat. 2003, 22, 131–170.
28. Tai, F.K. The uniform design: Application of number-theoretic methods in experimental design. Acta Math. Appl. Sin. 1980, 3, 363–372.
29. Li, R.; Lin, D.K.J.; Chen, Y. Uniform Design: Design, Analysis and Applications. Int. J. Mater. Prod. Technol. 2005, 20, 101–114.
30. Yeung, K. The Uniform Design. Hong Kong Baptist University. 2004. Available online: http://www.math.hkbu.edu.hk/UniformDesign/ (accessed on 23 August 2020).
31. Dreiseitl, S.; Ohno-Machado, L. Logistic regression and artificial neural network classification models: A methodology review. J. Biomed. Inform. 2002, 35, 352–359.
32. Liu, Y.; Starzyk, J.A.; Zhu, Z. Optimizing number of hidden neurons in neural networks. In Proceedings of the 25th IASTED International Multi-Conference: Artificial Intelligence and Applications, Innsbruck, Austria, 12–14 February 2007; pp. 121–126.
33. Goh, A.T.C. Back-propagation neural networks for modeling complex systems. Artif. Intell. Eng. 1995, 9, 143–151.
34. Zhou, B.; Vogt, R.D.; Lu, X.; Xu, C. Relative Importance Analysis of a Refined Multi-parameter Phosphorus Index Employed in a Strongly Agriculturally Influenced Watershed. Water Air Soil Pollut. 2015, 226, 25–38.
35. Garson, G.D. Interpreting neural-network connection weights. AI Expert 1991, 6, 46–51.
36. Biegler, L.T.; Grossmann, I.E. Retrospective on optimization. Comput. Chem. Eng. 2004, 28, 1169–1192.
37. Chiandussi, G.; Codegone, M.; Ferrero, S.; Varesio, F.E. Comparison of multi-objective optimization methodologies for engineering applications. Comp. Math. Appl. 2012, 63, 912–942.
38. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications To Biology, Control, and Artificial Intelligence; University of Michigan Press: Ann Arbor, MI, USA, 1975.
39. Halsall-Whitney, H.; Thibault, J. Multi-objective optimization for chemical processes and controller design: Approximating and classifying the Pareto domain. Comput. Chem. Eng. 2006, 30, 1155–1168.
40. Fettaka, S.; Gupta, Y.P.; Thibault, J. Multiobjective optimization of an industrial styrene reactor using the dual population evolutionary algorithm (DPEA). Int. J. Chem. React. Eng. 2012, 10.
41. Thibault, J. Net Flow and Rough Sets: Two Methods for Ranking the Pareto Domain. In Multi-Objective Optimization—Techniques and Applications in Chemical Engineering; World Scientific: Singapore, 2009; pp. 199–246.
42. Vandervoort, A.; Thibault, J.; Gupta, Y.P. Multi-Objective Optimization of an Ethylene Oxide Reactor. Int. J. Chem. React. Eng. 2011, 9.

Figure 1. Process flow diagram of the two-reactor section of the production of acrylic acid.

Figure 2. Reaction scheme for the propylene oxidation reactor (adapted from [18]).

Figure 3. Flowchart of the proposed methodology for solving multi-objective optimization using a three-layer artificial neural network (ANN) as meta-model.

Figure 4. Three layer feedforward artificial neural network for the acrylic acid MOO problem.

Figure 5. R² values for all the objective functions using 50 learning and 20 validation uniform design (UD) points vs. number of neurons in the hidden layer.

Figure 6. R² values for different numbers of random data points used for training and validation vs. number of neurons in the hidden layer for (a) compression power in C-100 and (b) heat recovery in R-101.

Figure 7. Predictions of (a) conversion in R-101 and (b) compression power in C-100.

Figure 8. Ranked Pareto domain with net flow method (NFM) obtained with: (a) and (c) the phenomenological model and (b) and (d) the ANNs.

Figure 9. Best ranked solution for (a) decision variables and (b) objective functions using NFM with ANN, normalized with respect to the best solution of the phenomenological model.

Table 1. Objective functions for the multi-objective optimization (MOO) of the acrylic acid production process.

Objective Function	Variable	Max/Min	Equation
Compression Power in C-100 [kW]	OF₁	Min	$\dot{W} = \frac{\dot{n} R T_{1} \left[ \left( \frac{P_{1}}{P_{Feed}} \right)^{a} - 1 \right]}{a \cdot η}$
Heat recovery in R-100 [kW]	OF₂	Max	$\dot{H}_{rxn} (T) = \sum_{j} Δ H_{rxn, j} (T) \cdot ξ_{j}$
Productivity in R-100 [kmol/m³h]	OF₃	Max	$Prod = \frac{F_{Acrolein}}{V_{1}}$
Conversion in R-100 [%]	OF₄	Max	$Conv = \left( \frac{F_{reactant, in} - F_{reactant, out}}{F_{reactant, in}} \right) \times 100$
Compression Power in C-101 [kW]	OF₅	Min	$\dot{W} = \frac{\dot{n} R T_{1} \left[ \left( \frac{P_{2}}{P_{1}} \right)^{a} - 1 \right]}{a \cdot η}$
Heat recovery in R-101 [kW]	OF₆	Max	$\dot{H}_{rxn} (T) = \sum_{j} Δ H_{rxn, j} (T) \cdot ξ_{j}$
Productivity in R-101 [kmol/m³h]	OF₇	Max	$Prod = \frac{F_{AcrylicA}}{V_{2}}$
Conversion in R-101 [%]	OF₈	Max	$Conv = \left( \frac{F_{reactant, in} - F_{reactant, out}}{F_{reactant, in}} \right) \times 100$
Excess oxygen concentration above LFL	OF₉	Min	Sum_E = 0; if [O₂] > 0.07 then Sum_E = Sum_E + ([O₂] − 0.07) · dW
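The accumulation rule given for OF₉ can be sketched as follows, assuming the oxygen fraction is available at successive catalyst-weight steps of constant size dW along the reactor. The function and argument names are hypothetical.

```python
def excess_oxygen(o2_profile, dW, limit=0.07):
    """Accumulate the oxygen excess above the limit along the catalyst bed."""
    total = 0.0
    for o2 in o2_profile:
        if o2 > limit:
            # Only the part of the profile above the limit is penalized.
            total += (o2 - limit) * dW
    return total
```

A profile that never exceeds 0.07 returns zero, so minimizing OF₉ acts as a soft constraint on the oxygen concentration rather than a hard limit.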

Table 2. Decision variables and their allowable ranges.

Decision Variables	Symbol	Min	Max	References
Molar flowrate of propylene [kmol/h]	F_P	91	203
Molar flowrate of air [kmol/h]	F_A	433	2900
Molar flowrate of steam [kmol/h]	F_S1	91	3047
Molar flowrate of water vapor [kmol/h]	F_S2	100	4000
Temperature in R-100 [°C]	T₁	330	430	[17,18]
Temperature in R-101 [°C]	T₂	285	315	[20]
Pressure in R-100 [bar]	P₁	1.05	6	[23,24,25,26]
Pressure in R-101 [bar]	P₂	3	6	[23,24,26]

Table 3. Relative importance of the input variables on the objective functions in the selected ANN according to the modified Garson method.

	Relative Importance (%)
Objectives/Decision Variables	F_P	F_A	F_S1	F_S2	T₁	T₂	P₁	P₂	Bias
OF₁	1.62	17.14	2.64	1.02	1.76	1.96	41.28	1.79	30.79
OF₂	9.07	13.75	5.23	0.79	27.76	1.93	8.22	1.33	31.90
OF₃	13.42	9.03	3.69	1.24	41.58	0.83	12.25	3.01	14.95
OF₄	5.59	18.22	7.03	4.04	36.30	1.33	19.59	2.01	5.89
OF₅	0.62	11.45	16.24	11.69	6.36	2.22	2.02	26.39	23.02
OF₆	15.24	18.19	9.69	7.66	18.37	1.83	15.42	2.61	10.99
OF₇	17.44	34.58	5.90	6.16	10.72	1.50	8.64	2.80	12.27
OF₈	8.64	38.13	8.07	6.60	13.50	3.42	6.16	7.38	8.10
OF₉	10.91	26.29	15.98	0.86	4.55	0.30	10.09	0.77	30.26

Table 4. Objective functions of the best ranked solution using NFM from the Pareto domain obtained with a population of 5000; F_P = 210.0 kmol/h, F_A = 1507.9 kmol/h, F_S1 = 206.6 kmol/h, F_S2 = 100.0 kmol/h, T₁ = 697.65 K, T₂ = 580.72 K, P₁ = 1.05 bar, and P₂ = 4.01 bar.

Objective Function	Meta-Model	Phenomenological Model	% Difference
OF₁	90.42	91.10	0.75
OF₂	23460	23700	1.02
OF₃	0.8992	0.9866	9.27
OF₄	94.97	96.25	1.34
OF₅	6581	6794	3.19
OF₆	10755	11104	3.19
OF₇	1.179	1.202	1.93
OF₈	81.67	82.97	1.58
OF₉	0.016	0.000	-
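The values in the % Difference column appear to be consistent with a symmetric percent difference, in which the absolute difference is divided by the mean of the two values. This is an inference from the tabulated numbers rather than a definition stated in the text; the helper below is a sketch under that assumption.

```python
def percent_difference(a, b):
    """Absolute difference relative to the mean of the two values, in %."""
    return abs(a - b) / ((a + b) / 2.0) * 100.0

# Meta-model vs. phenomenological model values from Table 4.
of3 = percent_difference(0.8992, 0.9866)
of7 = percent_difference(1.179, 1.202)
```

Under this assumption, `of3` and `of7` round to the 9.27 and 1.93 reported for OF₃ and OF₇.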

Table 5. Objective functions of the solution obtained using the weighted sum method from the Pareto domain; F_P = 210.0 kmol/h, F_A = 1636.5 kmol/h, F_S1 = 495.74 kmol/h, F_S2 = 100.0 kmol/h, T₁ = 628.46 K, T₂ = 570.79 K, P₁ = 3.66 bar, and P₂ = 6.00 bar.

Objective Function	Meta-Model
OF₁	2170.50
OF₂	25349
OF₃	1.1457
OF₄	100.00
OF₅	9450
OF₆	13259
OF₇	1.331
OF₈	88.50
OF₉	0.057

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

