1. Introduction
Production planning and control (PPC) plays a crucial role in mitigating disruptions such as machine failures within manufacturing organizations [1]. Machine failures, particularly those classified as “soft failures”, often arise from the gradual and irreversible damage accrued during operation [2]. An inaccurate model of a machine’s degradation trajectory can lead not only to erroneous fault predictions but also to eventual machine failures. Ultimately, unpredicted failures can result in various undesirable expenses, including the costs of lost production and of wasted materials and products [3].
The predictive maintenance literature discusses fault protection extensively, generally from two distinct perspectives. The first takes a macro (high-level) approach, focusing on broader policymaking strategies such as optimizing maintenance through reliability analysis. Typically, this approach views the system and its various subsystems as interconnected, unified entities. In contrast, the second perspective adopts a micro (low-level) approach, concentrating on improving the functionality of the field machines. The main aim here is to enhance the availability of field machinery and to accurately estimate the state of health (SoH) of the machinery involved.
The primary focus of high-level reliability analysis and maintenance optimization is to avoid system failure by determining the optimal time for maintenance based on different existing constraints. Most of these approaches employ statistical or mathematical models to substitute the physical model of deterioration to estimate the reliability of the system [4]. For instance, the authors in [5] proposed a method for the optimization of condition-based maintenance for systems under random shocks by optimizing the inspection times according to the system reliability reduction after each shock. In [6], the maintenance planning was optimized in accordance with maintenance resource constraints by deploying non-periodic inspections and minimizing the expected total cost per unit of the failure or repair. In [7,8], the authors proposed maintenance strategies for the optimization of different criteria, e.g., the maximization of availability or minimization of the maintenance cost, for the maintenance systems with constraints on resources and capacity.
Although these methods optimize maintenance based on various constraints, they rely on strong assumptions about the degradation models or the accessibility of some data, which restricts their results to particular problems. Additionally, the use of mathematical and statistical models replaces the direct relationship between the system’s structure and its degradation with data-driven statistics, bypassing the need for physically interpretable data [9,10,11]. Although eliminating expensive physical modeling saves cost, this simplification hinders the cause-and-effect reasoning that a physical model provides. This approach implies that the system’s reliability is influenced by obscure or, at most, partially understood processes that can only be approximated rather than enhanced or controlled. In this way, systems cannot actively support high-level decisions, such as delaying or scheduling failures, by taking optimal actions in the field machines.
On the other hand, with a low-level reliability analysis, the focus is on enhancing the accuracy of the SoH estimation for a single machine as a physical entity. This is usually accomplished by modeling the degradation in the system using different methods. For example, refs. [12,13] used a knowledge-based method, refs. [14,15,16,17] used physical modeling, and refs. [18,19,20,21,22] used data-driven methods for the SoH estimation or remaining useful life (RUL) prediction. Additionally, several studies have proposed methods for controlling the degradation in the machines by either physically modeling the dynamics of the system and its respective degradation [23,24,25,26] or using data-driven methods for estimating the physical parameters of the machine and its degradation [27,28,29,30].
However, to achieve a more comprehensive and flexible PPC, high-level policymaking methods should incorporate these low-level methods and work as an integrated method. Nonetheless, two reasons make existing low-level reliability analysis methods exclusive and only adaptable to some high-level decision-making methods. First, these methods try to fit the system’s degradation into predefined mathematical or physical templates, which is a strong assumption as degradation patterns may differ across parts of the system or may not even adhere to any recognizable mathematical form [31,32]. Second, similar to the high-level methods, by discarding the physical model and using machine learning methods [33,34,35], the connection to the physics of the system is not established, and valuable information regarding the type of degradation and the eventual fault is lost.
Given the gap between high-level decision-making methods and low-level machine-specific actions, the central question is whether it is possible to support high-level decisions by implementing low-level actions such that machines reach a predetermined maintenance level at the desired time.
In order to do so, the degradation in the machines must be controlled. This requires accurate identification of the relationship between degradation and machine, and proper design of the controller. The authors of [36] showed that degradation in the system can be observed through the residuals of the system’s mathematical model and its physical model. The physical relationship between degradation and system dynamics was later established in [37,38]. Here, the system was considered observable and the damage was considered an unobservable state. Following this, several researchers proposed different methods to address controller and actuator degradation under various constraints. A variable structure controller was designed in [39] to manage different performance challenges and failures. An integrated fault detection, diagnosis, and reconfigurable control scheme was presented in [40]. Methods to handle uncertainty and unobservability in degradation control were introduced in [41] for linear systems and discussed in [42] for nonlinear and singular systems. Moreover, a method for degradation control of systems with nonlinearities and missing data was suggested in [43].
Examining the evolution path, studies on degradation control generally adopt two primary approaches. Methods like those presented in [44,45] employ a robust control scheme to counteract the degradation and inefficiency of actuators. Conversely, methods such as those in [38,46] leverage the physical model of degradation when designing controllers. The main challenge for both sets of methods is their stringent assumptions regarding the degradation model, disturbances, and accessibility.
To address the challenges in degradation control, this paper proposes a method for controlling degradation by modifying optimal control schemes. Controlling degradation results in longer mean time to failure (MTTF) and reduced maintenance costs by minimizing downtimes or rerouting faults from expensive system components to more economical ones. This study controls degradation by defining it as a new virtual, controllable state of the system. This approach tackles two challenges: the strong assumptions about the degradation model that physical modeling of degradation requires, and the inaccessibility of the disturbances and the degradation model that robust controllers face. These issues are addressed using the sparse Bayesian learning method, which empirically identifies the degradation model using the system’s historical data. Furthermore, the mathematics of machine degradation are studied comprehensively to determine the means of incorporating degradation behavior into the system dynamics.
The remainder of the paper is organized as follows. Section 2 introduces the methodology of the proposed approach. Section 3 explains the procedure used for the simulation of the method. Section 4 presents the results and validation process, while Section 5 discusses the advantages and disadvantages of the method. Finally, conclusions are presented in Section 6.
2. Materials and Methods
In the proposed method, machine degradation is controlled through a four-step process. Initially, the degradation is detected, followed by the identification of the dynamics of degradation via process-controlled learning, which maps the degradation into the system’s states and inputs. Subsequently, these dynamics are incorporated into the model of the system. Finally, control is exerted over the degradation while simultaneously maintaining the quality of the output.
2.1. State-Space Model and Degradation
The state-space model (SSM) can be written as follows:
where $x$ includes the system state(s), $u$ is the system input(s), $A$ and $B$ represent the physical system parameters (considered constant in time-invariant systems), $C$ is the relationship between the output(s) and state(s) of the system, $z$ is the controlled output, $M$ configures the states to be controlled, ${\omega}_{1}$ is the process noise, and ${\omega}_{2}$ is the measurement noise.
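For orientation, a common linear state-space form that is consistent with the symbol definitions above is sketched below. It is an assumption based on the standard textbook structure and is not a reproduction of the paper's Equation (1); in particular, where the noise terms enter is assumed.

```latex
\[
\begin{aligned}
\dot{x}(t) &= A\,x(t) + B\,u(t) + \omega_{1}(t),\\
y(t) &= C\,x(t) + \omega_{2}(t),\\
z(t) &= M\,x(t).
\end{aligned}
\]
```

Under this reading, $M$ selects the states whose combination forms the controlled output $z$, while $y$ is the measured output corrupted by ${\omega}_{2}$.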
Also, degradation is defined as a trend in the recorded signal(s) or signal feature(s) of the system [47]. According to the SSM, degradation either affects the output
or the input and output at the same time:
where $t$ is time, $x$ is the system state, ${x}_{D}$ is the degraded state, $u$ is the input, ${y}_{D}$ is the system’s output affected by the degradation, and $D\left(t\right)$ is the degradation as a function of time [31].
According to (2) and (3), degradation is reflected in changes in the $y$ value. These changes arise from alterations in the system parameters ($A$, $B$, or $C$ from (1)) due to system deterioration. Conversely, controllers are engineered with the purpose of sustaining a desired output. This is under the presumption that system parameters are constant over time, which is not accurate in the face of degradation. As a result, the controller persistently seeks to match the real output with the desired output by manipulating system states through system inputs and compensating for undesired changes in system output. Over time, since the controller’s design relies on the nominal values of the system parameters, the actual controlled output begins to diverge from the desired output. Once this discrepancy surpasses a set limit, or, in other words, the system output is outside the desired tolerance range, the system is deemed to have failed and requires maintenance.
Figure 1 illustrates the process of degradation and its impact on the closed-loop system, ultimately leading to system failure.
2.2. Optimal Control
A linear quadratic regulator (LQR) is an optimal controller designed based on the SSM [48]. The quadratic criterion that the LQR minimizes is provided as follows:
where $e=z-r$, $r$ is the reference signal, $z$ is the controlled output, $e$ is the error, $u$ is the control input, and ${Q}_{1}$ and ${Q}_{2}$ are penalty matrices for the error and input signal, respectively. The optimal control signal for this controller can be written as follows:
where the optimal feedback gain, $L$, is then calculated by solving
subject to a Riccati equation:
The LQR optimizes the control problem for the infinite horizon, which means that the optimal feedback gain stays the same regardless of the inputs and outputs throughout the system’s lifetime.
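To make the infinite-horizon property concrete, the following is a minimal sketch for a scalar system, where the Riccati equation has a closed-form solution. The scalar model `dx/dt = a*x + b*u`, the function name `scalar_lqr_gain`, and the mapping of `q` and `r` to ${Q}_{1}$ and ${Q}_{2}$ are illustrative assumptions, not the machine model used in this paper.

```python
import math


def scalar_lqr_gain(a: float, b: float, q: float, r: float) -> tuple[float, float]:
    """Return (P, L) for the scalar system dx/dt = a*x + b*u.

    Cost: integral of q*x**2 + r*u**2 over an infinite horizon.
    P is the stabilizing root of the scalar Riccati equation
        2*a*P - (b**2 / r) * P**2 + q = 0,
    and the feedback law is u = -L*x with L = b*P / r.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0 and r > 0")
    P = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    L = b * P / r
    return P, L


# Example with assumed numbers: an unstable plant (a > 0) is stabilized.
P, L = scalar_lqr_gain(a=1.0, b=1.0, q=3.0, r=1.0)
closed_loop_pole = 1.0 - 1.0 * L  # a - b*L
```

The gain depends only on the nominal `a`, `b`, `q`, and `r`, so it is computed once and then stays fixed, which is the behavior described above.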
On the other hand, a model predictive controller (MPC) finds the optimal solution that will minimize a cost function at every single point in time within a set future timeframe. In other words, unlike the Riccati equation that optimizes the solution for the infinite horizon in LQR design, the MPC performs the optimization in the finite horizon. The cost function for the MPC can be expressed as follows:
where $\widehat{e}(t+i|t)=\widehat{z}(t+i|t)-r(t+i|t)$, $\widehat{z}$ is the predicted controlled output, $r$ is the desired output, $\widehat{e}$ is the predicted error, ${N}_{p}$ is the prediction horizon, ${N}_{c}$ is the control horizon, $\mathsf{\Delta}\widehat{u}$ is the predicted control increment, and ${Q}_{1}$ and ${Q}_{2}$ are penalty matrices.
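As an illustration of the quantities just defined, the sketch below evaluates a finite-horizon quadratic cost for scalar signals. It assumes the common form in which squared predicted errors are summed over the prediction horizon and squared control increments over the control horizon; the function name `mpc_cost` and the scalar weights `q1` and `q2` are hypothetical stand-ins for ${Q}_{1}$ and ${Q}_{2}$.

```python
def mpc_cost(z_pred, r_ref, du_pred, q1, q2):
    """Finite-horizon quadratic cost for scalar signals.

    z_pred, r_ref: predicted output and reference over the prediction horizon (length N_p).
    du_pred: predicted control increments over the control horizon (length N_c <= N_p).
    q1, q2: scalar penalties on the predicted error and on the control increment.
    """
    if len(z_pred) != len(r_ref):
        raise ValueError("z_pred and r_ref must cover the same prediction horizon")
    if len(du_pred) > len(z_pred):
        raise ValueError("control horizon must not exceed the prediction horizon")
    tracking = sum(q1 * (z - r) ** 2 for z, r in zip(z_pred, r_ref))
    effort = sum(q2 * du ** 2 for du in du_pred)
    return tracking + effort


# Example with assumed numbers: N_p = 2, N_c = 1.
cost = mpc_cost(z_pred=[1.0, 2.0], r_ref=[0.0, 0.0], du_pred=[1.0], q1=2.0, q2=3.0)
```

An MPC would minimize this cost over `du_pred` at every sampling instant, which is why its effective feedback changes over time while the LQR gain does not.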
Both the LQR and MPC control strategies are built upon the system’s original model, using nominal parameter values. However, degradation induces a gradual shift in the nominal parameter values, moving them away from the original values that were used for designing the controller. This drift is more than just a minor concern; it directly compromises the controller’s performance and its ability to maintain the desired level of control quality. As this drift accumulates, the gap between the system’s intended and actual outputs begins to widen. When this gap crosses a certain threshold, it signals impending system failure.
2.3. Virtual Health State
In order to have the degradation as a distinct controllable state, the system’s state space must be extended. However, since the health state is not a physical state of the machine, it should be considered a virtual state dependent on the physical states and inputs of the machine. This extended state space of the system for degradation control can be written as follows:
where $n$ is the number of outputs, $\ell$ is the number of inputs, and ${\mathit{W}}_{\mathit{x}}$ and ${\mathit{W}}_{\mathit{u}}$ denote the vectors of coefficients that map the system outputs and inputs to the derivative of degradation ($\dot{D}$).
Evidently, to be able to generate (10) to control the degradation, $\dot{D}$ should follow the specific format of
and ${\mathit{W}}_{\mathit{x}}$ and ${\mathit{W}}_{\mathit{u}}$ should be estimated according to the machine-specific degradation, which is influenced by machine-exclusive working conditions.
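The idea of appending a virtual health state can be sketched as follows. The sketch assumes that the degradation rate is linear in the signals, i.e., that $\dot{D}$ is the inner product of `W_x` with the states plus the inner product of `W_u` with the inputs, that the mapped signals coincide with the states, and that $D$ does not feed back into the physical states. The function name `extend_with_health_state` and the example numbers are hypothetical.

```python
def extend_with_health_state(A, B, W_x, W_u):
    """Append a virtual degradation state D to the pair (A, B).

    Assumed model: dD/dt = W_x . x + W_u . u, with D not feeding back into x.
    A is n x n, B is n x m, W_x has length n, W_u has length m (plain lists).
    Returns (A_ext, B_ext) of sizes (n+1) x (n+1) and (n+1) x m.
    """
    n = len(A)
    if n == 0 or any(len(row) != n for row in A) or len(B) != n:
        raise ValueError("A must be square and B must have as many rows as A")
    m = len(B[0])
    if len(W_x) != n or len(W_u) != m:
        raise ValueError("W_x and W_u must match the state and input dimensions")
    # Physical rows get a zero column for D; the new last row holds the mapping.
    A_ext = [list(row) + [0.0] for row in A] + [list(W_x) + [0.0]]
    B_ext = [list(row) for row in B] + [list(W_u)]
    return A_ext, B_ext


# Example with assumed numbers for a two-state, single-input system.
A_ext, B_ext = extend_with_health_state(
    A=[[0.0, 1.0], [-2.0, -3.0]], B=[[0.0], [1.0]], W_x=[0.5, 0.1], W_u=[0.2]
)
```

Once the pair is extended in this way, a controller designed on `A_ext` and `B_ext` can penalize the virtual state like any other state.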
2.4. Identification of $\dot{D}$
To compute the ${\mathit{W}}_{\mathit{x}}$ and ${\mathit{W}}_{\mathit{u}}$ values as outlined in (11), the initial step involves calculating ${\dot{D}}_{c}$ as a target value for degradation. This calculation should align with the system’s degradation trend, which primarily falls into two categories: linear and exponential. The focus here is solely on the rate at which the system parameters are drifting, rather than the specific form of the degradation pattern. Therefore, even if the degradation follows a different curve (say, a sinusoidal pattern), the key concern is the rate at which its amplitude increases or decreases.
In order to determine the target value for degradation, it is essential to identify the degradation trend and calculate its likelihood of being either linear or exponential simultaneously. Considering the noise, the trend of the degradation in the machine will follow
or
where ${y}_{ss}\left(c\right)$ is the steady state of controlled output at cycle $c$, ${\mu}_{D}$ is the drift of the degradation, and $V$ is the variance of the degradation.
Knowing the structure of the drifting degradation, it is possible to calculate the likelihood of observing ${y}_{ss}\left(c\right)$ given that the degradation follows a linear or exponential model. The likelihood of the degradation generated by model one can be calculated by
and the likelihood of the degradation generated by model two can be calculated by
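Then, the probability of ${y}_{ss}\left(c\right)$ being generated by each model can be expressed as
$${p}_{1}\left(c\right)=\frac{{L}_{1}\left(c\right)\,{p}_{1}(c-1)}{{L}_{1}\left(c\right)\,{p}_{1}(c-1)+{L}_{2}\left(c\right)\,{p}_{2}(c-1)}$$
and
$${p}_{2}\left(c\right)=\frac{{L}_{2}\left(c\right)\,{p}_{2}(c-1)}{{L}_{1}\left(c\right)\,{p}_{1}(c-1)+{L}_{2}\left(c\right)\,{p}_{2}(c-1)},$$
where ${L}_{i}\left(c\right)={L}_{i}\left({y}_{ss}\left(c\right)\mid {y}_{ss}(c-1)\right)$ for $i\in \{1,2\}$.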
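The update of the two model probabilities can be sketched in a few lines. The sketch assumes Gaussian likelihoods with the means used in Algorithm 1 (previous value plus a constant drift for the linear model, previous value plus an exponential increment for the exponential model); the function names and the example numbers are hypothetical.

```python
import math


def gaussian_pdf(value, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((value - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def update_model_probabilities(y_prev, y_new, c, p1, p2, mu1, mu2, v1, v2):
    """One Bayesian update of P(linear) and P(exponential) after cycle c."""
    like_lin = gaussian_pdf(y_new, y_prev + mu1, v1)
    like_exp = gaussian_pdf(y_new, y_prev + math.exp(mu2 * c), v2)
    evidence = like_lin * p1 + like_exp * p2
    if evidence == 0.0:
        # Both models assign zero density; keep the previous belief.
        return p1, p2
    return like_lin * p1 / evidence, like_exp * p2 / evidence


# Example with assumed numbers: the observed increment matches the linear drift.
p1, p2 = update_model_probabilities(
    y_prev=1.0, y_new=1.1, c=1, p1=0.5, p2=0.5, mu1=0.1, mu2=0.0, v1=1.0, v2=1.0
)
```

Because the posterior is normalized by the total evidence, the two probabilities always sum to one, and the comparison between them decides which trend is used for the target degradation.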
Knowing these probabilities, after each cycle, it is possible to find the target value of $D$. If ${p}_{1}\left(c\right)\ge {p}_{2}\left(c\right)$:
$$D\left(c\right)=ac,$$
if ${p}_{1}\left(c\right)<{p}_{2}\left(c\right)$:
$$D\left(c\right)=a{e}^{c},$$
where $a$ is an optional constant based on the limitations on desired calculation accuracy.
As the subject of interest in Equation (11) is $\dot{D}$ (the derivative of the degradation), the mapping should aim to correlate the system’s recorded features with $\dot{\widehat{D}}=\frac{d\widehat{D}}{dc}$. This derivative can be obtained as follows:
where ${T}_{c}$ is the length of cycle $c$ and ${T}_{c-1}$ is the length of cycle $c-1$.
Knowing $\dot{D}$, and having access to the records of the system states and inputs over time, it is possible to identify ${\mathit{W}}_{\mathit{x}}$ and ${\mathit{W}}_{\mathit{u}}$ in (11).
2.5. Identification of ${\mathit{W}}_{\mathit{x}}$ and ${\mathit{W}}_{\mathit{u}}$
Because the degradation rate changes much more slowly than the system inputs and outputs, the features extracted offer sufficient insights into the rate of degradation. Thus, the vector of features for each cycle can be generated using
where ${f}_{i}^{y}\left(c\right)$ is the feature of the $i$th output recorded during cycle $c$ (e.g., maximum, mean, etc.), ${f}_{j}^{u}\left(c\right)$ is the feature of the $j$th input recorded during cycle $c$, and ${Y}_{i}$ and ${U}_{j}$ are the sets including all recorded samples from output $i$ and input $j$, respectively, from ${t}_{0}$ to ${t}_{end}$ (the beginning and end times of cycle $c$). The chosen feature type may differ, provided it demonstrates monotonicity. This process is shown in Figure 2. Finally, ${\mathit{W}}_{\mathit{x}}$ and ${\mathit{W}}_{\mathit{u}}$ can be calculated using the following process:
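As a separate illustration of the per-cycle feature vector described above (not of the optimization itself), the sketch below reduces the samples recorded during one cycle to one feature per signal. The function name `cycle_features` and the sample values are hypothetical; the maximum and the mean are used because the text names them as examples.

```python
from statistics import fmean


def cycle_features(outputs, inputs, feature=max):
    """Build one feature row [f^y_1, ..., f^y_n, f^u_1, ..., f^u_l] for a single cycle.

    outputs / inputs: lists of sample lists, one inner list per output / input signal,
    holding the samples recorded between the start and the end of the cycle.
    feature: any function that reduces a list of samples to one number.
    """
    signals = list(outputs) + list(inputs)
    if any(len(samples) == 0 for samples in signals):
        raise ValueError("every signal needs at least one sample in the cycle")
    return [feature(samples) for samples in signals]


# Example with assumed samples: one output and one input recorded during a cycle.
row_max = cycle_features([[1.0, 3.0, 2.0]], [[0.5, 0.25]])
row_mean = cycle_features([[1.0, 3.0, 2.0]], [[0.5, 0.25]], feature=fmean)
```

Stacking one such row per cycle yields the feature matrix that is regressed against the target degradation rate.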
2.6. Relevance Vector Machine
The optimization in (23) can be solved using different optimization techniques. However, two main points need to be considered when choosing a method. First, as the number of states in the system may increase, parsimonious or sparse regression becomes more suitable. This approach ensures that only parameters with high confidence will have coefficients with nonzero amplitude, thereby minimizing complexity in terms of controllability and observability when a virtual state is added to the system. Second, the method must be fast, as it will be used in online control systems.
The relevance vector machine (RVM) was first introduced in [49] as a sparse Bayesian learning method. RVM calculation for linear kernels is as fast as the least squares method and provides parsimonious results due to its sparse nature. This makes it a superior choice for this optimization. The RVM structure is very similar to that of the support vector machine (SVM), which is provided as follows:
where ${N}_{s}$ is the number of samples; ${w}_{n}$ is a coefficient in $\mathit{W}$, which is the vector of coefficients; $k(\cdot,\cdot)$ is the kernel function; and $b$ is the bias parameter. The RVM defines the conditional distribution for the target value $\mathit{y}$ given the vector of covariates $\mathit{x}$, prediction outcome $\widehat{\mathit{y}}$, regression coefficients $\mathit{W}$, and a precision parameter $\psi $ as
The likelihood function for $\mathit{y}$ can be written as
The RVM introduces a prior distribution for each $w$ in $\mathit{W}$, governed by a hyperparameter $\alpha $:
where ${N}_{M}$ is the number of covariates (including bias). The hyperparameter $\alpha $ measures the precision of each ${w}_{i}$. Following Bayesian inference, the distribution of the weights becomes Gaussian and takes the following form:
in which
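For reference, the standard sparse Bayesian regression result has the form sketched below. It is stated in the notation of this section as an assumption: $\boldsymbol{\Phi}$ denotes the design matrix built from the kernel (or feature) evaluations and $\mathbf{A}$ the diagonal matrix of the hyperparameters; both symbols are introduced here for illustration only.

```latex
\[
p(\mathbf{W}\mid \mathbf{y},\boldsymbol{\alpha},\psi)
  = \mathcal{N}\!\left(\mathbf{W}\mid \mathbf{m},\boldsymbol{\Sigma}\right),
\qquad
\mathbf{m} = \psi\,\boldsymbol{\Sigma}\,\boldsymbol{\Phi}^{T}\mathbf{y},
\qquad
\boldsymbol{\Sigma} = \left(\mathbf{A} + \psi\,\boldsymbol{\Phi}^{T}\boldsymbol{\Phi}\right)^{-1},
\qquad
\mathbf{A} = \operatorname{diag}\!\left(\alpha_{1},\dots,\alpha_{N_{M}}\right).
\]
```

In this formulation, a hyperparameter $\alpha_{i}$ that grows very large during training forces the corresponding weight toward zero, which is the source of the sparsity discussed above.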
2.7. Algorithm
The flowchart of the proposed method is shown in Figure 3.
The pseudocode outlining the algorithm’s functionality is also presented in Algorithm 1.
Following each cycle during which the system’s inputs and outputs are recorded, the parameters can be updated. This allows for the recalculation of Equation (10) with the new parameters. Subsequently, a new optimal feedback is computed, capable of not only regulating the system but also managing its degradation.
Algorithm 1 Iterative Degradation State Parameters Identification
Initialize:
    ${p}_{1}\leftarrow 0.5$, ${p}_{2}\leftarrow 0.5$ ▹ Initial model probabilities
    ${\mu}_{1}\leftarrow 0$, ${\mu}_{2}\leftarrow 0$ ▹ Initial mean estimates
    ${V}_{1}\leftarrow 1$, ${V}_{2}\leftarrow 1$ ▹ Initial variance estimates
    $\alpha \leftarrow initialize$, $\beta \leftarrow initialize$ ▹ Learning rates
for each new data point ${y}_{c+1}$ do
    Calculate Likelihoods:
        ${L}_{1}\leftarrow \frac{1}{\sqrt{2\pi {V}_{1}}}\exp\left(-\frac{{({y}_{c+1}-({y}_{c}+{\mu}_{1}))}^{2}}{2{V}_{1}}\right)$
        ${L}_{2}\leftarrow \frac{1}{\sqrt{2\pi {V}_{2}}}\exp\left(-\frac{{({y}_{c+1}-({y}_{c}+\exp\left({\mu}_{2}c\right)))}^{2}}{2{V}_{2}}\right)$
    Update Probabilities:
        ${p}_{1}\left(c\right)\leftarrow \frac{{L}_{1}\times {p}_{1}(c-1)}{{L}_{1}\times {p}_{1}(c-1)+{L}_{2}\times {p}_{2}(c-1)}$
        ${p}_{2}\left(c\right)\leftarrow \frac{{L}_{2}\times {p}_{2}(c-1)}{{L}_{1}\times {p}_{1}(c-1)+{L}_{2}\times {p}_{2}(c-1)}$
        ${p}_{1}\left(c\right)\leftarrow {p}_{1}(c+1)$, ${p}_{2}\left(c\right)\leftarrow {p}_{2}(c+1)$
    Update Parameter Estimates:
        ${\mu}_{1,\mathrm{new}}\leftarrow \alpha \times {\mu}_{1}+(1-\alpha )\times ({y}_{c+1}-{y}_{c})$
        ${V}_{1,\mathrm{new}}\leftarrow \beta \times {V}_{1}+(1-\beta )\times {({y}_{c+1}-{y}_{c}-{\mu}_{1,\mathrm{new}})}^{2}$
        ${\mu}_{2,\mathrm{new}}\leftarrow \alpha \times {\mu}_{2}+(1-\alpha )\times ({y}_{c+1}-{y}_{c})$
        ${V}_{2,\mathrm{new}}\leftarrow \beta \times {V}_{2}+(1-\beta )\times {({y}_{c+1}-{y}_{c}-\exp\left({\mu}_{2,\mathrm{new}}c\right))}^{2}$
        ${\mu}_{1}\leftarrow {\mu}_{1,\mathrm{new}}$, ${V}_{1}\leftarrow {V}_{1,\mathrm{new}}$
        ${\mu}_{2}\leftarrow {\mu}_{2,\mathrm{new}}$, ${V}_{2}\leftarrow {V}_{2,\mathrm{new}}$
    if ${p}_{1}\left(c\right)\ge {p}_{2}\left(c\right)$, $D\left(c\right)=ac$
    if ${p}_{1}\left(c\right)<{p}_{2}\left(c\right)$, $D\left(c\right)=a{e}^{c}$
    Calculate $\dot{D}$ according to (20)
    $\underset{\mathit{W}}{\min}\phantom{\rule{4pt}{0ex}}(\mathit{F}{\mathit{W}}^{\mathit{T}}-\dot{\mathcal{D}})$
end for
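The parameter-update and target-selection steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under the stated update rules, not the authors' implementation; the function names are hypothetical, and `alpha` and `beta` are treated as the weights placed on the previous estimates, as the update formulas imply.

```python
import math


def update_linear(mu, var, y_prev, y_new, alpha, beta):
    """Exponentially weighted drift and variance update for the linear model."""
    increment = y_new - y_prev
    mu_new = alpha * mu + (1.0 - alpha) * increment
    var_new = beta * var + (1.0 - beta) * (increment - mu_new) ** 2
    return mu_new, var_new


def update_exponential(mu, var, y_prev, y_new, c, alpha, beta):
    """Same update for the exponential model, whose expected increment is exp(mu * c)."""
    increment = y_new - y_prev
    mu_new = alpha * mu + (1.0 - alpha) * increment
    var_new = beta * var + (1.0 - beta) * (increment - math.exp(mu_new * c)) ** 2
    return mu_new, var_new


def target_degradation(p1, p2, c, a=1.0):
    """Pick the linear target a*c when the linear model is at least as probable."""
    return a * c if p1 >= p2 else a * math.exp(c)


# Example with assumed numbers for one new data point.
mu1, v1 = update_linear(mu=0.0, var=1.0, y_prev=1.0, y_new=1.5, alpha=0.5, beta=0.5)
mu2, v2 = update_exponential(mu=0.0, var=1.0, y_prev=1.0, y_new=1.5, c=0, alpha=0.5, beta=0.5)
D_target = target_degradation(p1=0.6, p2=0.4, c=3, a=2.0)
```

Values of `alpha` and `beta` close to one make the estimates change slowly, which suits a degradation rate that varies much more slowly than the recorded signals.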

4. Results
This section first discusses the closed-loop system responses for both LQR and MPC. Next, results from the degradation simulation in the closed-loop system and the effect of penalty matrices on the system’s lifetime without a degradation controller are presented. The third part explains the calculated degradation coefficients. Then, the results from the degradation controller are detailed, followed by an explanation of why the degradation control is not a function of the penalty matrices. Finally, the output quality of regular controllers and degradation controllers is compared.
4.1. Closed-Loop Responses
Figure 5A illustrates the system’s behavior when controlled by an LQR for different penalty values. The first plot of Figure 5A specifically focuses on the piston position. Systems with higher penalty values for ${Q}_{1}$ achieve the target position more quickly at the cost of increased input force. In the steady state, it is notable that systems with higher penalty values continue to exert greater force, resulting in higher pressures within the cylinder.
Figure 5B depicts the closed-loop performance of the system under MPC, also considering varying penalty values. Although all three systems yield the same output, systems with higher penalty values require more force and, consequently, higher pressures on both sides of the cylinder in the steady state, similar to the LQR-controlled systems.
In summary, based on the behaviors observed with both the LQR and MPC control methods, higher penalty values generally result in increased energy consumption. This is to account for uncertainties about future disturbances, but it also leads to greater degradation of the system.
4.2. Effect of Penalties
After designing the degradation controller, one state (degradation) is added to the system states, changing the SSM of the system. As a result, another configurable penalty value is added to ${Q}_{1}$.
Thus, the influence of the old and new penalty values on the final result must be isolated so that the effect of the degradation controller can be studied separately.
For this reason, the system is simulated for the entire spectrum of possible penalty values.
Figure 6 shows the lifespan of the machine as a function of penalty value. The Monte Carlo simulation is performed 100 times for each penalty value. Zero lifetime for some penalty values means the controller cannot keep the output within the thresholds using that penalty value. The longest lifespan is achieved using ${Q}_{1}=10$ for all three degradation models under the LQR control scheme. The MPC behavior is similar for all degradation cases when ${Q}_{1}$ ranges from 6 to 10. As a result, ${Q}_{1}=10$ is chosen as the penalty value that yields the longest lifespan for all simulations.
4.3. Degradation Simulation
After designing the controller, the closed-loop system is simulated for several cycles. The controller is designed according to the primary model (model with initial parameter values before degradation). However, after each running cycle, the chosen physical parameters are updated in the system model ($A$, $B$, or $C$ mentioned in (1)) according to degradation models and the output recorded for each cycle. This process continues until the output exceeds the acceptable tolerance. The evolutions of the system’s parameters over time under different degradation models result in different MTTFs of the machines.
Figure 7 shows the effect of three degradation models explained in (35) on the output. The maximum MTTF with the LQR occurs in the first degradation model, as shown in Figure 7A, and it takes more than 800 cycles for the machine to fail. Meanwhile, the maximum MTTF with the MPC occurs in the second degradation model, as shown in Figure 7B, and it takes more than 100 cycles for the machine to fail.
4.4. Identification of the Dynamics of Degradation
Table 3 presents the final degradation coefficients calculated using RVM on simulated data [52].
Evidently, in the system using LQR, the pressures inside the cylinder and the input force affect machine degradation in all three models. Also, the coefficients are considerably smaller for the first degradation model compared to the other two models, which corroborates the longer machine lifetime with this model.
Meanwhile, the coefficients are negative in the second model, indicating that greater pressure and force cause less degradation. In this case, higher force and pressure mean more fresh oil and consequently less degradation; this can be observed in Table 2, where the higher external leakage (${C}_{e}$) compensates more for the oil degradation (${b}_{i}$ reduces the mean value of the degradation of ${d}_{i}$). As the secondary side of the cylinder is larger, higher pressure on the secondary side means more leakage, and this is why its coefficient is larger than the coefficient of the other side. This also explains the negative coefficients with a larger amplitude of the secondary side pressure. Furthermore, degradation affects the system faster in the third degradation model using LQR. These large positive coefficients corroborate the shorter lifetime of the machine degrading with the third model. In the machine with MPC, the analysis is not as simple and may not be possible; the reason is discussed in the next section.
Figure 8 shows the online calculation of the degradation coefficient of the input force for both controllers using the proposed algorithm. In order to show the effectiveness of the method, a continuously working mill is assumed to be affected by different degradation models. The cycle at which the degradation model affecting the machine changes is marked by the red line. Comparing these graphs shows that, in the systems with a time-invariant feedback loop (LQR), the detection of the degradation trend, and, thus, the identification of the degradation dynamics, can be accomplished faster because of the Bayesian nature of the RVM. This speed is achieved because the noise is the only factor affecting this identification. However, in the systems with a time-varying feedback loop, the identification of the degradation dynamics takes more time because the dynamic feedback loop behaves as noise for Bayesian learning.
For the next step, the system state space is updated using the calculated coefficients and according to the extended state space mentioned in (10). Then, the controllers are designed based on the new SSM, which now includes the degradation state.
4.5. Degradation Control
Figure 9 shows the trend of output recorded at the steady state of each cycle after the controller is updated. According to the degradation models, if the system output stays within the desired limits (i.e., condition $F$ in (36) is not met), the oil degradation (${R}_{\rho}$, as defined in (38)) will make maintenance necessary.
As the degradations are defined as time-dependent, the longest possible MTTF (reaching the point when hydraulic oil needs changing), considering the disturbances, will be around 1400 cycles. The plots in Figure 9 show the output affected by the new control policies in compensating for the degradation. The adverse effects of degradation models are successfully compensated, the system’s output is kept within desired limits, and the system has reached its maximum possible MTTF.
Figure 10 shows the system’s step response with and without degradation control. Unlike the normal controller that maintains pressure and input force throughout the process (Figure 10A), degradation controllers bring these three states to zero (or near zero) and keep them at their lowest point after the output settles at its desired position (Figure 10B,C).
Figure 10B shows the system’s step response using the controller designed for the first degradation model. According to Table 3, the leading causes of the degradation are the pressures on both sides of the cylinder and input force. To control this degradation model, the peak pressure is reduced considerably compared to Figure 10A, and the three states are settled at the lowest possible values throughout the process.
Figure 10C shows the control of the second degradation model. According to its coefficients from Table 3, the pressure on each side of the cylinder has a reverse effect on the degradation. This is because, as previously explained, the system is assumed to compensate for the leaked oil with new oil. As a result, this new oil compensates for the oil degradation. Additionally, the force coefficient is negative because more force means more leakage, resulting in further compensation for the degradation. To reduce the degradation, the controller increases the peak pressure (compared to the previous degradation model in Figure 10B) and maintains it for a longer time. These actions lead to more leakage, more fresh oil injection, and, as a result, less degradation.
As mentioned above, unlike the coefficients in the LQR, understanding the physics behind the calculated coefficients of the degradation state for the system using MPC may be challenging. The effect of the parameters can be seen in the control method, and the degradation control is successful (Figure 9). However, the exact physical reason may remain unknown for several reasons. First, each control strategy may degrade system parts differently. Second, this method only works based on the information from the system states. Therefore, all information about the degradation is received through the system states, and finding the exact physical reason from these records is a complex task, if not an impossible one. Third, and most important, the optimization methods used in controllers vary; e.g., the optimization for the LQR is only performed once during the design stage, maintaining the same controller behavior in all conditions. However, the MPC optimization is performed at each cycle, making it a complex task to analyze and physically interpret the result. The shift in the optimal point is readily apparent in Figure 8. In this figure, the LQR controller maintains constant coefficients throughout the process, whereas the coefficients for the MPC controller exhibit variations during the same period. It is worth noting that the physical interpretation of its dynamics is not a matter of concern. Only the physical relation of the dynamics of the degradation to the system’s states is required for degradation control.
4.6. Effect of Penalties
To verify that the improvement attributed to degradation control is not simply a consequence of higher penalty values, and to rule out the penalty values as the cause of the final conclusion, the effect of the penalty is first studied without degradation control, as shown in
Figure 6.
Figure 11 shows the effect of the penalty value on the degradation controllers. These figures are generated using Monte Carlo simulation. For each combination of
${Q}_{1}$ and
${q}_{6}$, where
${q}_{6}$ is the penalty value for the degradation state, the Monte Carlo simulation is performed 100 times.
Figure 11 plots the mean MTTF from these simulations. It can be seen that the system MTTF reaches its maximum for different combinations of
${Q}_{1}$ and
${q}_{6}$.
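The penalty sweep described above can be sketched as a small Monte Carlo loop. This is only an illustration of the procedure (a grid over two penalty values, 100 runs per combination, mean time to failure per cell). The function `simulate_time_to_failure` and its wear-rate formula are hypothetical placeholders, not the closed-loop model used in this article, and the two arguments `q1` and `q6` merely stand in for ${Q}_{1}$ and ${q}_{6}$.

```python
import random
import statistics


def simulate_time_to_failure(q1, q6, rng, threshold=1.0, dt=1.0, max_steps=10_000):
    """Toy stand-in for one closed-loop run.

    Returns the time at which accumulated degradation crosses `threshold`.
    The wear-rate formula is a hypothetical placeholder: it assumes a larger
    q1 speeds up wear and a larger q6 slows it down.
    """
    mean_rate = 0.01 * (1.0 + q1) / (1.0 + q6)
    degradation = 0.0
    for step in range(1, max_steps + 1):
        # Wear increments are random but never negative (irreversible damage).
        degradation += max(0.0, rng.gauss(mean_rate, 0.3 * mean_rate)) * dt
        if degradation >= threshold:
            return step * dt
    return max_steps * dt  # no failure within the simulated horizon


def mean_mttf_grid(q1_values, q6_values, n_runs=100, seed=0):
    """Mean time to failure for every (q1, q6) combination."""
    rng = random.Random(seed)
    grid = {}
    for q1 in q1_values:
        for q6 in q6_values:
            runs = [simulate_time_to_failure(q1, q6, rng) for _ in range(n_runs)]
            grid[(q1, q6)] = statistics.mean(runs)
    return grid


if __name__ == "__main__":
    result = mean_mttf_grid([1.0, 10.0], [0.0, 1.0, 10.0])
    for (q1, q6), mttf in sorted(result.items()):
        print(f"q1={q1:5.1f}  q6={q6:5.1f}  mean MTTF={mttf:8.1f}")
```

In an actual study, the placeholder function would be replaced by a full closed-loop simulation of the controller designed with the given penalties; only the sweep-and-average structure is meant to carry over.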
A comparison of the MTTFs of the normal machine and the machine with the degradation controller shows that the resilience to degradation is the result of including the degradation in the SSM as a controllable state, not a function of penalty value. This can be inferred because, regardless of the penalty value chosen for ${Q}_{1}$, the system never reaches the maximum possible MTTF with the controller designed without the degradation state.
4.7. Control Quality
To study the quality of the new controller and compare it with that of the normal controller, both systems are tested with similar disturbances. A first-order disturbance model is adopted based on empirical data from rolling machines, in which disturbances typically settle with minimal oscillation. The selected time constant mirrors that of the machine itself. This choice is consistent with the fact that rolling machines often have adjustable output rates to achieve the desired output quality. Moreover, the gain of the disturbance model is determined from the machine’s historical data, ensuring that disturbances do not exceed
$20\%$ of the maximum force. An essential consideration is that the disturbance data were not used as prior information in the controller design. Thus, the disturbance model affects all controllers uniformly, allowing for a generalizable outcome. The transfer function of the disturbance is provided as
excited with
$\mathcal{N}(0,1)$.
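As a rough illustration of such a disturbance signal, the sketch below assumes the generic first-order form $K_d/(\tau_d s + 1)$, discretized with a zero-order hold and driven by samples of $\mathcal{N}(0,1)$. The gain, time constant, sample time, and maximum force are hypothetical parameters chosen by the caller, not values from this article, and the final clipping to $20\%$ of the maximum force is an added safeguard assumed here rather than part of the described model.

```python
import math
import random


def simulate_disturbance(gain, tau, dt, n_steps, f_max, seed=0):
    """Discrete-time first-order lag driven by standard normal samples.

    gain, tau, dt, f_max are hypothetical illustration parameters.
    The input is held constant over each sample period (zero-order hold).
    """
    rng = random.Random(seed)
    a = math.exp(-dt / tau)      # discrete pole of the first-order lag
    b = gain * (1.0 - a)         # input coefficient giving steady-state gain `gain`
    limit = 0.2 * f_max          # assumed bound: 20% of the maximum force
    d = 0.0
    signal = []
    for _ in range(n_steps):
        w = rng.gauss(0.0, 1.0)  # excitation sample from N(0, 1)
        d = a * d + b * w
        # Clip the reported value so it never exceeds the assumed bound.
        signal.append(max(-limit, min(limit, d)))
    return signal


if __name__ == "__main__":
    d = simulate_disturbance(gain=50.0, tau=0.5, dt=0.01, n_steps=1000, f_max=100.0)
    print(f"peak |disturbance| = {max(abs(v) for v in d):.2f}")
```

Applying the same generated sequence to every controller under test keeps the comparison fair, since no controller sees the disturbance model at design time.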
Figure 12 compares the responses of the normal controller and the degradation controllers. The top plot of
Figure 12A shows that the normal and degradation controllers have identical responses to the disturbances. As shown before in
Figure 10, this response is achieved with less force and pressure usage by the degradation controller, and, more importantly, the system MTTF increases to its maximum possible value. Considering the MPC response depicted in
Figure 12B and results from
Figure 9, the MTTF of the machine reaches its maximum while the control quality also improves.
These results show that, at worst, the degradation controller matches the control quality of the controller without degradation control while consuming significantly less energy and imposing less degradation on the system. In the best case, energy consumption and imposed degradation decrease and the control quality also improves.
5. Discussion
Maintenance management can be performed at any stage, whether in the design phase or during machine operation. The goal of degradation control is to support maintenance management by steering the degradation of the machines so that they reach the maintenance threshold at a desired time. More generally, this degradation control method is designed to benefit production planning.
In this article, considerable effort is devoted to simulating degradation in a closed-loop system. In real-world scenarios, however, machines record data over time; once sufficient data are collected, the coefficients are calculated and the degradation controller can operate. The method can be used online during machine operation or offline during the design stage based on the SSM.
The main advantage of this method is its adaptivity. The machine can start working in various conditions, with different degradation, using only a controller without degradation control. After sufficient data have been recorded, the controller can be adapted in software without stopping the machine for an extended period.
On the other hand, a primary limitation of the method lies in its sensitivity to noisy signals. If the recorded signals are noisy, the training set must be enlarged to maintain accuracy because the RVM is vulnerable to noise. Another complexity arises when the system must concurrently offset several kinds of degradation (incorporated as distinct states in the SSM). In such cases (classification of degradation), historical maintenance data and an understanding of how each type affects the system become essential. This classification demands more comprehensive information about the system.
The system model used for this method is a linear SSM. While the SSM is a feasible option for demonstrating degradation control, two considerations are significant. First, real-world scenarios might limit access to certain system states. However, as most complex machinery is equipped with control systems, the states needed for optimal control are typically attainable, either directly or via state-estimation methods such as Kalman filtering. Second, the system or its degradation may display nonlinear characteristics. In such cases, the identification of degradation through the RVM is not hindered, as its focus is on the trend in degradation rather than the degradation model itself. Moreover, nonlinearities within the system model can also be managed using MPC without altering the structure of the method.
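The state-estimation remark above can be illustrated with a scalar Kalman filter. This is a minimal sketch under stated assumptions: a one-state linear model $x_{k+1} = a x_k + w_k$, $y_k = c x_k + v_k$ with known noise variances. The model, the parameter names, and the numbers in the usage example are hypothetical and are not taken from the hydraulic system studied in this article.

```python
import random


def kalman_1d(measurements, a, c, q_var, r_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k+1] = a*x[k] + w, y[k] = c*x[k] + v.

    q_var and r_var are the assumed process and measurement noise variances.
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for y in measurements:
        # Prediction step: propagate the estimate and its variance.
        x = a * x
        p = a * p * a + q_var
        # Update step: blend prediction and measurement via the Kalman gain.
        k = p * c / (c * p * c + r_var)
        x = x + k * (y - c * x)
        p = (1.0 - k * c) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    true_state = 5.0  # hypothetical constant state that is only measured with noise
    ys = [true_state + rng.gauss(0.0, 1.0) for _ in range(500)]
    est = kalman_1d(ys, a=1.0, c=1.0, q_var=1e-4, r_var=1.0)
    print(f"final estimate: {est[-1]:.3f}")
```

For a multi-state SSM the same predict and update steps apply with matrices in place of the scalars; the estimated states would then feed the controller where direct measurement is unavailable.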
Future work entails extending the control of machine degradation by considering its impact on other machines within the same production line, or on a group of robots performing a unified task. Leveraging advancements in multi-agent control along with the insights from this article, it is feasible to assess how the actions of each machine influence the degradation of others. By taking into account various production costs, such as quality, delivery time, and maintenance, it becomes possible to optimize the actions and decisions of each machine based on their cumulative effect on the entire process.