An Optimal Opportunistic Maintenance Planning Integrating Discrete- and Continuous-State Information

Wei, Fanping; Wang, Jingjing; Ma, Xiaobing; Yang, Li; Qiu, Qingan

doi:10.3390/math11153322


by Fanping Wei ¹, Jingjing Wang ², Xiaobing Ma ¹, Li Yang ¹,* and Qingan Qiu ³

¹ School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China

² School of Management Engineering, Qingdao University of Technology, Qingdao 266520, China

³ School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(15), 3322; https://doi.org/10.3390/math11153322

Submission received: 4 July 2023 / Revised: 26 July 2023 / Accepted: 27 July 2023 / Published: 28 July 2023

(This article belongs to the Special Issue Data-Driven Methods and Artificial Intelligence in Reliability and Maintenance, 2nd Edition)


Abstract:

Information-driven group maintenance is crucial to enhance the operational availability and profitability of diverse industrial systems. Existing group maintenance models have primarily concentrated on a single health criterion upon maintenance implementation, where the fusion of multiple health criteria is rarely reported. However, this is not aligned with actual maintenance planning of multi-component systems on many occasions, where multi-source health information can be integrated to support robust decision making. Additionally, how to improve maintenance effectiveness through a scientific union of both scheduled and unscheduled maintenance remains a challenge in group maintenance. This study addresses these research gaps by devising an innovative multiple-information-driven group replacement policy for serial systems. In contrast to existing studies, both discrete-state information (hidden defect) and continuous degradation information are employed for group maintenance planning, and scheduled postponed maintenance and unscheduled opportunistic maintenance are dynamically integrated for the first time to mitigate downtime loss. To be specific, inspections are equally spaced to reveal system health states, followed by the multi-level replacement implemented when either (a) the degradation of the continuously degrading unit reaches a specified threshold, or (b) the age of the multi-state unit since the defect’s identification reaches a pre-set age (delayed replacement). Such scheduling further enables the implementation of multi-source opportunistic replacement to alleviate downtime. The Semi-Markov Decision Process (SMDP) is utilized for the collaborative optimization of continuous- and discrete-state thresholds, so as to minimize the operational costs. Numerical experiments conducted on the critical structure of circulating pumps verify the model’s applicability.

Keywords: maintenance optimization; inspection planning; replacement decision making; opportunistic replacement; availability; cost-effectiveness

MSC: 90B25

1. Introduction

Preventive maintenance is a critical element of asset health management, which has a significant impact on ensuring the operational reliability and availability of industrial plants [1,2], as well as enhancing their profitability [3,4]. According to the foundation of maintenance decision making, preventive maintenance can be substantially partitioned into two types, time-based maintenance (TBM) and condition-based maintenance (CBM) [5,6]. In recent decades, with the rapid advancement of sensor and data processing technology, CBM has attracted considerable attention, with a series of field applications in areas such as navigation, rail transit, and advanced manufacturing [7,8]. The core principle of CBM is to collect multiple types of health information to support data-driven maintenance decision making, with full consideration of the system structure, operational conditions, and failure mechanisms [9,10].

Inspection, as a fundamental part of CBM, provides crucial supporting information for subsequent repair/spare part decisions [11,12]. Typically, there are two main types of health information that can be identified through inspections [13]. The first type is continuous degradation information, which includes measurements such as blade crack length, wear accumulation of rotating machinery, capacity reduction of lithium batteries, and parameter drift of electronic devices [14]. The second type is discrete health status, which is typically observed in multi-state plants. Among these, the most commonly encountered pattern during inspection activities is two-stage deterioration, where two successive and random stages occur before ultimate failure [15,16]. The former stage is called the defect initialization stage, while the latter stage is called the defect expansion stage, whose duration is also called the delay time [17,18]. Typical defects include dents, holes, cracks, out-of-control quality, etc., which are widespread in industrial plants such as bearings, pumps, medical equipment, and manufacturing lines [19]. The defective state, analogous to continuous degradation, is usually non-fatal and hidden, and its effective identification and removal are closely related to inspection activities [20,21]. This highlights the importance of scheduling and optimizing inspection plans, which significantly affect maintenance performance [22,23].

Despite the extensive research into CBM planning of critical components degrading either continuously or discretely, the existing literature on group maintenance models has primarily focused on a single failure mode, either degradation or sudden failure type [24,25]. There remains a lack of a unified group maintenance framework that examines the interaction effect between two separate failure modes [26]. For instance, Gopalan et al. [27] and Malefaki et al. [28] analyzed the degradation analysis and condition-based maintenance modeling approaches for two-component systems subject to gradual degradation. Zhang [29] studied the optimal maintenance decision with regard to group maintenance of a two-component system subject to sudden failure, with application to petrochemical enterprises. Wang et al. [6,30] extended the single failure mode to competing failures, either degradation-based or shock-induced. However, scheduled group maintenance was not considered in their work since shock-induced failure does not hinge on inspection outcome. Xu et al. [15] investigated the group maintenance optimization of generalized multi-component systems subject to imperfect inspections, which was also limited to degradation-based failure mode.

After conducting a detailed literature review, we identified four significant research gaps that remain to be addressed. Firstly, the majority of current group maintenance models made maintenance decisions solely based on a single type of health information, either accumulated degradation or hidden failure/defect. There is a scarcity of group maintenance frameworks that sufficiently utilize both continuous and discrete health information to (a) improve the robustness of maintenance decision making and (b) enhance system effectiveness [31]. However, as for realistic industrial systems such as wind turbines, gas pumps, and aircraft, both continuous degradation information and discrete-state information (for instance, defect information) are extensively seen through condition monitoring of crucial mechanical/electronic components. It is crucial to schedule group maintenance with sufficient consideration of both information arising from different failure modes, so as to develop more effective maintenance strategies [32,33]. Secondly, there are few studies that have investigated how inspection information can be used to integrate both (a) scheduled group maintenance (GM) and (b) unscheduled opportunistic maintenance (OM), with the goal of reducing downtime loss. Given that preventive actions for system components may vary depending on their health conditions, an effective integration of GM and OM is essential to capture the interaction between separate maintenance activities to alleviate maintenance downtime losses to a maximum extent [34,35]. Thirdly, most group maintenance models assume immediate maintenance, particularly for two-component systems [36]. This may be a sub-optimal option for multi-state plants since the potential of the remaining lifetime is not sufficiently exploited, leading to over-maintenance [37,38]. Also, immediate maintenance is unable to provide extra chances for opportunistic maintenance [39,40]. 
Fourthly, there is no high-efficiency algorithm to deal with such constrained maintenance interaction problems. The commonly adopted renewal–reward theory, although facilitating single-component maintenance modeling, confronts the renewal asynchrony challenges when applying to the OM-GM interaction model [41]. Simulation algorithms such as Monte Carlo, on the other hand, are challenged by the computation burden and model interpretability.

To address the foregoing research gaps, we innovatively devised an inspection-driven, multi-source group maintenance policy for serial systems subject to both continuous and discrete deterioration processes. As opposed to previous studies, the information about (a) continuous degradation accumulation and (b) discrete early warning signals is incorporated to support the implementation of mutually interacted maintenance actions such as opportunistic maintenance and delayed maintenance. Through such information interaction, the robustness and timeliness of group maintenance policies can be sufficiently ensured, which ultimately improves the system service availability. Moreover, we are the first to schedule defect-induced postponed maintenance within group-condition-based maintenance models, which allows preventive maintenance to be postponed to future inspection windows upon defect identification. As such, delayed replacement can be integrated with degradation-based replacement to constitute cost-effective group maintenance planning that sufficiently shares the set-up cost. Furthermore, we innovatively extract multiple maintenance opportunities from (a) the corrective replacement of both units, (b) the threshold-based replacement of degrading units, and (c) the delayed replacement of multi-state units. Through the scheduling of such multi-source opportunistic maintenance, the economic and structural dependence of the entire system can be fully harnessed to effectively mitigate system downtime. To the best of our knowledge, this is the first effort to integrate delayed maintenance and opportunistic maintenance within group maintenance models, which significantly promotes system profitability and availability through (a) multiple-source information fusion and (b) dynamic maintenance interaction. 
Finally, we devised a high-efficiency optimization algorithm for this multi-variable problem, which leverages the Semi-Markov Decision Process to overcome the convergence and computational-burden problems that arise with conventional renewal theory. The applicability of the proposed model was validated by numerical experiments on circulating pump systems experiencing both crack propagation and corrosive pitting processes.

To summarize, this study contributes to group maintenance optimization from the following four perspectives:

■: Constructing an innovative group maintenance framework integrating both continuous and discrete health information, which dynamically integrates (a) opportunistic maintenance and (b) delayed maintenance to significantly enhance maintenance effectiveness and availability;
■: Allowing defect removals to be postponed so as to (a) exploit the remaining lifetime potential and (b) offer extra chances for the selection and implementation of cost-effective opportunistic maintenance;
■: Scheduling multiple types of opportunistic maintenance arising from (a) threshold-based replacement, (b) delayed replacement, and (c) corrective maintenance to sufficiently control system downtime and enhance decision-making robustness;
■: Realizing the high-efficiency optimization of maintenance interaction problems via the Semi-Markov Decision Process, and demonstrating the model’s applicability through numerical experiments on a circulating pump.

The remainder of this paper is structured as follows. Section 2 introduces the problem in terms of the system structure and the unit failure mechanisms. Section 3 designs the inspection-based maintenance policy. Section 4 formulates the maintenance model under the SMDP framework. Section 5 illustrates the applicability via a numerical experiment on a circulating pump. Section 6 concludes the paper and lists some possible extensions.

2. Problem Description

This paper is mainly divided into four parts: degradation modeling, replacement policy, cost modeling, and optimization algorithm. Figure 1 is the research framework of this paper. Starting from this section, the optimal replacement policy for two-unit series systems considering discrete and continuous degradation information will be studied following the research ideas shown in the framework.

Consider a deteriorating system that consists of two critical units connected in series, so that the failure of either unit leads to an immediate system failure. The two units possess independent deterioration mechanisms throughout their lifetime. Unit 1 deteriorates continuously, and its degradation trajectory is observable through inspections. Such mechanisms are widely seen in diverse industrial plants, such as fatigue crack propagation, battery capacity reduction, and wear accumulation [27]. Unit 1 is deemed to have failed when the accumulated degradation attains a pre-set threshold $D, D > 0$, set according to industrial standards or safety constraints. Unit 2 is a multi-state unit that passes through one or more non-fatal transition states prior to ultimate failure, which can be viewed as effective early-warning signals supporting timely preventive maintenance. The hidden health information of both units can only be identified by inspections, whose diagnostic outcomes support group maintenance decision making.

In this study, the degradation process of Unit 1 is characterized by the Wiener process, a widely used stochastic process that captures non-monotone degradation behaviors and is valued for its good mathematical properties and physical interpretability [8]. Accordingly, the underlying degradation process $X (t)$ is formulated as $X (t) = X_{0} + μ (t) + σ W (t)$, where $X_{0}$ is the initial degradation, $μ (t) = μ t$ is the drift process, $σ$ represents the diffusion coefficient, and $W (t)$ represents the standard Brownian motion. A prominent property of such a process is that the degradation increments over disjoint intervals are independent and Gaussian distributed (it is the first passage time to a fixed threshold that follows an inverse Gaussian distribution). Notably, other forms of stochastic processes, such as random walk and Gamma processes, are also applicable without model restriction.
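As a minimal sketch of the degradation model just described, the following Python snippet samples the Wiener process at equally spaced inspection times by accumulating independent Gaussian increments with mean $μ Δ$ and variance $σ^{2} Δ$. The function names and all numerical values are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_wiener_path(mu, sigma, delta, n_steps, x0=0.0, rng=None):
    """Sample X(t) = x0 + mu*t + sigma*W(t) at t = 0, delta, 2*delta, ..."""
    rng = rng or random.Random()
    path = [x0]
    for _ in range(n_steps):
        # Each increment over an interval of length delta is
        # N(mu*delta, sigma^2*delta), independent of the past.
        increment = rng.gauss(mu * delta, sigma * math.sqrt(delta))
        path.append(path[-1] + increment)
    return path


def first_inspection_at_or_above(path, threshold):
    """Index of the first inspection whose reading is at or above threshold, else None."""
    for index, value in enumerate(path):
        if value >= threshold:
            return index
    return None


# Illustrative values only (assumptions, not the paper's case study).
path = simulate_wiener_path(mu=0.5, sigma=0.2, delta=1.0, n_steps=40,
                            rng=random.Random(1))
failure_index = first_inspection_at_or_above(path, threshold=10.0)
```

Because the path is only observed at inspections, the index returned is the inspection at which a threshold crossing would be detected, not the exact crossing time.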

On the other hand, the deterioration process of Unit 2 is specified as two-phase deterioration, due to its generality and representativeness in inspection-based maintenance [40]. Such a process usually defines a non-fatal, identifiable defective state, during which the unit remains operational but experiences significantly higher malfunction risk. In other words, the random sojourn of the defect propagation process $γ_{b}$ is statistically smaller than that of the defect initialization process $γ_{a}$. Representative instances of such defects include dents, holes, stripping, over-vibration, overheating, out-of-control quality, etc.

Obviously, inspections are crucial and fundamental preventive maintenance activities, as they report the hidden health state (either continuous or discrete) of both units, which supports timely, state-driven maintenance planning. In the following sections, we use the inspection outcomes to devise, formulate, and then optimize the group-level condition-based maintenance policy.

3. Maintenance Planning

The core focus of the maintenance policy is to minimize the operational cost of the entire system by considering the following factors: (a) system structure, (b) unit failure behavior, and (c) combinations of maintenance activities. In particular, for a serial system with both structural dependency and economic dependency, group maintenance is more cost-effective than individual maintenance. Group maintenance allows for the sharing of set-up costs and the utilization of unavoidable maintenance downtime. In the remainder of this section, we outline the approach for scheduling an inspection-based group maintenance policy that captures unit dependencies.

3.1. Basic Assumptions

In order to clarify the maintenance policy, some basic assumptions with regard to failure characteristics, operational conditions, and maintenance activities are outlined, with proper justifications or interpretations.

(a): Units 1 and 2 are as good as new when initially put into use. In other words, the initial degradation accumulation of Unit 1 and the virtual age of Unit 2 are equal to 0. This is a common assumption used to simplify the maintenance problem, which can be easily relaxed [26];
(b): Inspections are instantaneous, non-destructive, and perfect. In other words, both the degradation severity (Unit 1) and the defective state (Unit 2) of the system can be accurately reported. This is a widely accepted setting since well-prepared inspections, in contrast to sensor-based condition monitoring, incur negligible measurement error [7];
(c): Maintenance, either corrective, preventive, or opportunistic, returns the unit back to as-good-as-new status. This is equivalent to the effect of spare part replacement. In the rest of the section, we use maintenance and replacement interchangeably;
(d): The time to execute maintenance activities is non-negligible, and these activities require the stoppage of the entire system, and the downtime loss cannot be ignored.

3.2. Group Maintenance Scheduling

As addressed earlier, group maintenance is a more cost-effective selection for multi-component systems compared with individual maintenance [5,42], due to its capacity to (a) adequately harness unavoidable downtime and (b) save set-up and personnel costs. On the other hand, for a multi-phase plant with a defective state, it is suggested to postpone preventive maintenance when revealing the defect, instead of an immediate execution. Such postponement ensures a sufficient exploration of the remaining lifetime potential.

In this study, we designed a novel inspection-based, multi-dimensional maintenance policy, which is the union of four types of mutually interacting maintenance activities: (a) threshold-centered replacement, (b) delayed replacement, (c) opportunistic replacement, and (d) corrective replacement. In particular, opportunistic replacement is integrated with the other three types of activities to form maintenance groups, so as to enhance maintenance efficiency. Also, the provision of postponed maintenance offers more space and flexibility for opportunistic maintenance. The specific scheme of the maintenance policy is outlined below, and a specific situation of integrated maintenance activities is presented in Figure 2.

Regular inspection. Inspections are equally spaced within the lifetime horizon according to an interval of $Δ, Δ > 0$ , with a cost $C_{I}$ per time unit. The function of inspection is twofold, (a) reporting the latest degradation magnitude of Unit 1, and (b) revealing the hidden defective state of Unit 2. Inspections are perfect and instantaneous, but require stopping the system;
Threshold-centered replacement (TR). At an inspection $T_{i} = i Δ, i = 1, 2, \dots$ , if the degradation of Unit 1 exceeds a pre-specified control limit $U, 0 < U < D$ , but has not yet reached the failure threshold $D$ , TR is performed immediately with a cost $C_{p 1}$ and time $T_{p 1}$ ;
Delayed replacement (DR). If Unit 2 is found defective upon an inspection, then its preventive replacement is scheduled $H$ inspection intervals later, equivalent to $H Δ$ time units. Such replacement incurs a cost of $C_{p 2}$ and time $T_{p 2}$ ;
Corrective replacement (CR). CR is triggered when either (1) the degradation measurement of Unit 1 at an inspection exceeds the failure threshold $D$ , or (2) the defect of Unit 2 deteriorates to a complete failure. The incurred cost for each unit is $C_{c 1}$ and $C_{c 2}$ , respectively; the incurred time for each unit is $T_{c 1}$ and $T_{c 2}$ , respectively;
Opportunistic replacement (OR). OR is available for both units, whose acceptance hinges on the entire system state. Here, two situations are possible:
(a) OR of Unit 1. If Unit 2 fails unexpectedly due to defect evolution or requires replacement, Unit 1 is offered an extra chance for OR. Specifically, if the degradation accumulation exceeds $L$ , the chance is accepted and a cost $C_{o 2}$ is incurred; otherwise, no action is taken;
(b) OR of Unit 2. If the degradation of Unit 1 exceeds either the pre-set replacement threshold $U$ or the failure threshold $D$ while Unit 2 is waiting for DR, Unit 2 is offered a chance for OR. To be specific, if the time elapsed since the defect identification exceeds $V Δ, 1 \leq V < H$ , the chance for OR is accepted and a cost $C_{o 1}$ is incurred; otherwise, no action is taken;
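The rules above can be sketched as a single decision function evaluated at each inspection. This is only one reading of the policy and rests on several assumptions: the action codes follow the 0/1/2/3 convention (none, corrective, preventive, opportunistic), the state of Unit 2 is passed as a hypothetical string label, the thresholds $U$ and $D$ are treated as reached at equality, $L$ must be strictly exceeded, and the OR window of Unit 2 opens once $V$ inspection intervals have elapsed since defect identification.

```python
NO_ACTION, CORRECTIVE, PREVENTIVE, OPPORTUNISTIC = 0, 1, 2, 3


def decide_actions(x1, unit2_state, intervals_since_defect, U, L, D, H, V):
    """Return (A1, A2) at an inspection.

    x1: observed degradation of Unit 1.
    unit2_state: "normal", "defective" or "failed" (hypothetical labels).
    intervals_since_defect: inspection intervals elapsed since the defect of
        Unit 2 was identified (only used when Unit 2 is defective).
    """
    # Unit 1: corrective replacement at D, threshold-centered replacement at U.
    if x1 >= D:
        a1 = CORRECTIVE
    elif x1 >= U:
        a1 = PREVENTIVE
    else:
        a1 = NO_ACTION

    # Unit 2: corrective replacement on failure, delayed replacement after H intervals.
    if unit2_state == "failed":
        a2 = CORRECTIVE
    elif unit2_state == "defective" and intervals_since_defect >= H:
        a2 = PREVENTIVE
    else:
        a2 = NO_ACTION

    # Opportunistic replacement: use the downtime created by the other unit.
    unit1_has_chance = a1 == NO_ACTION and a2 != NO_ACTION
    unit2_has_chance = a2 == NO_ACTION and a1 != NO_ACTION
    if unit1_has_chance and x1 > L:
        a1 = OPPORTUNISTIC
    if (unit2_has_chance and unit2_state == "defective"
            and intervals_since_defect >= V):
        a2 = OPPORTUNISTIC
    return a1, a2
```

For example, with $U = 6$, $L = 4$, $D = 10$, a failed Unit 2 and a Unit 1 reading of 5 yields a corrective replacement of Unit 2 combined with an opportunistic replacement of Unit 1.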

♦: Remark. The sufficient interaction and integration of OR with TR and DR addressed in this maintenance policy is an effective and robust way to exploit unit dependencies (structural dependency and economic dependency), so as to enhance operational profitability by minimizing system downtime and downtime loss. For similar reasons, postponed maintenance is a cost-effective solution for multi-phase units/systems due to its nature of (a) avoiding excessive maintenance, (b) exploiting the remaining lifetime, (c) allowing sufficient resource preparation, and (d) offering extra chances for OR.

4. Model Formulation and Optimization

The objective of the maintenance model is to minimize the average cost per unit time $g (H, V, U, L)$ through the joint optimization of the degradation control limits $L, U$ and the delayed interval numbers $H, V$. Thus, the optimization problem can be described as

\begin{array}{l} g (H^{*}, V^{*}, U^{*}, L^{*}) = \inf g (H, V, U, L), \\ \text{subject to } 0 < L < U < D, \ 1 \leq V \leq H . \end{array}

(1)

In this study, we strive to solve the maintenance problem under the Semi-Markov Decision Process (SMDP) framework, which has been proven an efficient and steady analytical approach to renewal problems with generalized state sojourn time [9,37]. To this end, the one-step transition probabilities of each unit are calculated, based on which the system transition probabilities are derived. Then, the expected sojourn time and cost of the system are provided, and the optimal maintenance strategy minimizing the cost is searched via a random search approach.
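To illustrate the outer optimization loop only, the sketch below draws random candidates $(H, V, U, L)$ that satisfy the constraints of Equation (1) and keeps the cheapest one. The objective `toy_cost` is a made-up placeholder standing in for the SMDP-based cost $g (H, V, U, L)$; the sampling scheme, the bound `h_max`, and all names are assumptions for demonstration and do not reproduce the search procedure of the paper.

```python
import random


def sample_policy(D, h_max, rng):
    """Draw (H, V, U, L) aiming at 0 < L < U < D and 1 <= V <= H."""
    H = rng.randint(1, h_max)
    V = rng.randint(1, H)
    a, b = rng.uniform(0.0, D), rng.uniform(0.0, D)
    L, U = min(a, b), max(a, b)
    return H, V, U, L


def is_feasible(H, V, U, L, D):
    """Constraint set of Equation (1)."""
    return 0.0 < L < U < D and 1 <= V <= H


def random_search(cost, D, h_max, n_samples, seed=0):
    """Keep the best feasible sample; cost is any callable g(H, V, U, L)."""
    rng = random.Random(seed)
    best_policy, best_cost = None, float("inf")
    for _ in range(n_samples):
        policy = sample_policy(D, h_max, rng)
        if not is_feasible(*policy, D):
            continue  # guards against boundary ties such as L == U
        value = cost(*policy)
        if value < best_cost:
            best_policy, best_cost = policy, value
    return best_policy, best_cost


def toy_cost(H, V, U, L):
    """Placeholder objective for demonstration; NOT the SMDP cost of the paper."""
    return (U - 6.0) ** 2 + (L - 3.0) ** 2 + (H - 4) ** 2 + (V - 2) ** 2


best_policy, best_cost = random_search(toy_cost, D=10.0, h_max=8, n_samples=5000)
```

In an actual study, `toy_cost` would be replaced by a routine that evaluates the long-run average cost of the policy from the transition probabilities, sojourn times, and costs derived in the following subsections.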

4.1. State Transition of Unit 1

We begin with the state transition of Unit 1. To this end, we first investigate its stochastic degradation behavior. Let $\{X (t) : t \in R^{+}\}$ represent the degradation process of Unit 1. Then, the degradation trajectory, starting from brand-new status, is formulated as

X (t) = μ (t) + σ W (t),

(2)

where $W (t)$ represents the standard Brownian motion, $μ (t) = μ t$ is the drift process, and $σ$ is the diffusion coefficient.

It is well acknowledged that the Wiener process has independent increments. Therefore, for $0 = T_{0} < T_{1} < T_{2} < \dots < T_{k} < \infty$, the degradation increments $X (T_{1}) - X (T_{0})$, $X (T_{2}) - X (T_{1})$, $\dots$, $X (T_{k}) - X (T_{k - 1})$ are independent, Gaussian-distributed random variables. In particular, over the inspection interval $[T_{k - 1}, T_{k}]$, the increment $X (T_{k}) - X (T_{k - 1})$ follows $N (μ (T_{k} - T_{k - 1}), σ^{2} (T_{k} - T_{k - 1}))$, which is equivalent to $N (μ Δ, σ^{2} Δ)$.

To simplify the problem, we discretize the degradation state space of Unit 1 into $Ω_{1} = \{0, 1, 2, 3, \dots, i, \dots, F_{1} - 1, F_{1}\}$, where $F_{1}$ means the failure state, 0 represents the brand-new state, and $i$ denotes the discrete degradation state. To be clear, the degradation increment per state is $ε, 0 < ε < S$. Accordingly, Unit 1 is defined to be in state 1 when its deterioration is within $(0, ε]$. Likewise, Unit 1 is in state $i$ if its degradation is within $((i - 1) ε, i ε]$. Recall that the failure threshold of Unit 1 is $D$; in other words, $D = F_{1} ε$.

Let $P_{i, j}^{1} (t)$ denote the state transition probability of Unit 1 from state $i$ to state $j, j \geq i$ within time $t$. When Unit 1 is known to be operable at time $T_{k}$, the one-step transition probability within a single inspection interval is written as

\begin{aligned} P_{i, j}^{1} (Δ) &= P_{i, j}^{1} (T_{k + 1} - T_{k}) = P \{X (T_{k + 1}) = j ε \mid X (T_{k}) = i ε\} \\ &= P (lb < X (T_{k + 1}) - X (T_{k}) < ub) = \int_{lb}^{ub} \frac{1}{σ \sqrt{2 π Δ}} e^{- \frac{(x - μ Δ)^{2}}{2 σ^{2} Δ}} d x, \end{aligned}

(3)

where the lower bound is $lb = \max \{0, (j - i - 1) ε\}$ and the upper bound is $ub = (j - i) ε$.
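Equation (3) is the difference of two Gaussian distribution functions, so it can be evaluated with the error function rather than by numerical integration. The following sketch implements the formula exactly as stated; the function names and parameter values are illustrative assumptions. Note that, with the bounds as written, the entry for $j = i$ is zero and negative increments are not covered, so a row of these probabilities need not sum to one; how the remaining mass is assigned is a modelling choice that the sketch does not make.

```python
import math


def normal_cdf(x, mean, std):
    """Distribution function of N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def unit1_transition_prob(i, j, eps, mu, sigma, delta):
    """P^1_{i,j}(delta) following Equation (3), stated for j >= i."""
    if j < i:
        return 0.0  # Equation (3) is only defined for j >= i
    lb = max(0.0, (j - i - 1) * eps)
    ub = (j - i) * eps
    mean, std = mu * delta, sigma * math.sqrt(delta)
    return normal_cdf(ub, mean, std) - normal_cdf(lb, mean, std)


# Illustrative row of the one-step matrix starting from state i = 2
# (parameter values are assumptions, not the paper's estimates).
row = [unit1_transition_prob(2, j, eps=0.5, mu=1.0, sigma=0.3, delta=1.0)
       for j in range(2, 9)]
```

The probabilities depend on the states only through the difference $j - i$, which reflects the stationary, independent increments of the Wiener process.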

As aforementioned, replacement is required when the degradation of Unit 1 exceeds $U$ at an inspection. In particular, CR is performed immediately if the degradation exceeds $D$; otherwise, TR is performed immediately. Unit 1 is restored to as-good-as-new status after replacement. Then, the one-step transition probabilities from state $i$ to state 0 are given as

P_{i, 0}^{1} (T_{c 1}) = P (X (T_{k + 1}) = 0 \mid X (T_{k}) = i ε \geq D) = 1,

(4)

and

P_{i, 0}^{1} (T_{p 1}) = P (X (T_{k + 1}) = 0 \mid D > X (T_{k}) = i ε \geq U) = 1 .

(5)

4.2. State Transition of Unit 2

Let $\{Z (t) : t \in R^{+}\}$ denote the discrete deterioration process of Unit 2, which is partitioned into two phases. We denote its state space as $Ω_{2} = \{(0, a), (1, a), \dots, (k, a), \dots, (Θ - 1, a), (Θ, b), \dots, (Θ + V, b), \dots, (Θ + H, b), (0, F_{2}), \dots, (Θ + H, F_{2})\}$, where $(k, a)$ and $(k, b)$ represent the discrete age state of Unit 2 under the normal and defective state, respectively; $Θ$ indicates that the defect of Unit 2 is found at the $Θ$-th inspection; and $F_{2}$ denotes the ultimate failure state.

Now we calculate the state transition probabilities of Unit 2. First, consider the situation in which Unit 2 remains normal during the one-step transition from state $k$ to $k + 1$. Then, its transition probability is

P^{2}_{(k, a) (k + 1, a)} (T_{k}, T_{k + 1}) = \Pr \{Z (T_{k + 1}) = (k + 1, a) \mid Z (T_{k}) = (k, a)\} = \frac{R_{γ_{a}} ((k + 1) Δ)}{R_{γ_{a}} (k Δ)},

(6)

where $R_{γ_{a}} (\cdot) = 1 - F_{γ_{a}} (\cdot)$ is the survival function of the normal stage. Similarly, the one-step transition probability from state $k$ to $k + 1$ in the defective state is

P^{2}_{(k, b) (k + 1, b)} (T_{k}, T_{k + 1}) = \Pr \{Z (T_{k + 1}) = (k + 1, b) \mid Z (T_{k}) = (k, b)\} = \frac{R_{γ_{b}} ((k + 1) Δ)}{R_{γ_{b}} (k Δ)},

(7)

where $R_{γ_{b}} (\cdot)$ is the survival function of the defective stage. Moreover, the probability of Unit 2 transitioning from the normal state to the defective state is

\begin{aligned} P^{2}_{(k, a) (k + 1, b)} (T_{k}, T_{k + 1}) &= \Pr \{Z (T_{k + 1}) = (k + 1, b) \mid Z (T_{k}) = (k, a)\} \\ &= \frac{\int_{k Δ}^{(k + 1) Δ} f_{γ_{a}} (t) R_{γ_{b}} ((k + 1) Δ - t) d t}{R_{γ_{a}} (k Δ)} . \end{aligned}

(8)
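Equations (6) to (8) can be evaluated numerically once distributions for the two stages are chosen. The sketch below assumes Weibull sojourn times for $γ_{a}$ and $γ_{b}$ and uses a simple midpoint rule for the integral in Equation (8); both choices, along with the function names and parameter values, are assumptions made for illustration and are not specified in this section.

```python
import math


def weibull_survival(t, scale, shape):
    """R(t) = exp(-(t/scale)^shape), with R(t) = 1 for t <= 0."""
    return math.exp(-((t / scale) ** shape)) if t > 0 else 1.0


def weibull_pdf(t, scale, shape):
    """Weibull density; returns 0 for t <= 0 (midpoints below never hit t = 0)."""
    if t <= 0:
        return 0.0
    return (shape / scale) * (t / scale) ** (shape - 1) * weibull_survival(t, scale, shape)


def stay_prob(k, delta, survival):
    """Equations (6) and (7): survive one more interval given survival up to k*delta."""
    return survival((k + 1) * delta) / survival(k * delta)


def normal_to_defective_prob(k, delta, pdf_a, surv_a, surv_b, n=2000):
    """Equation (8): the defect arrives in (k*delta, (k+1)*delta] and the unit
    has not yet failed at (k+1)*delta."""
    lo, hi = k * delta, (k + 1) * delta
    h = (hi - lo) / n
    total = 0.0
    for m in range(n):
        t = lo + (m + 0.5) * h  # midpoint of the m-th sub-interval
        total += pdf_a(t) * surv_b(hi - t)
    return total * h / surv_a(lo)


# Illustrative parameters (assumed): normal stage longer-lived than defective stage.
pdf_a = lambda t: weibull_pdf(t, scale=10.0, shape=1.5)
surv_a = lambda t: weibull_survival(t, scale=10.0, shape=1.5)
surv_b = lambda t: weibull_survival(t, scale=4.0, shape=2.0)
p_stay_normal = stay_prob(3, 1.0, surv_a)
p_to_defective = normal_to_defective_prob(3, 1.0, pdf_a, surv_a, surv_b)
```

The two probabilities do not sum to one: the remainder is the probability that the defect both arrives and evolves to failure within the same inspection interval.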

4.3. System Transition Probability

As addressed, $\{X (t) : t \in R^{+}\}$ and $\{Z (t) : t \in R^{+}\}$ represent the deterioration processes of the two units. In this study, the SMDP framework is employed to solve the maintenance optimization problem. To this end, we first define the state set and action set of the system, as stated below.

System state set. Denote the system operational state as $S_{o p} = S_{1} \cup S_{2}$ , where $S_{1} = \{(i, k, a) \mid 0 \leq i ε < D, 0 \leq k \leq Θ\}$ and $S_{2} = \{(i, k, b) \mid 0 \leq i ε < D, Θ \leq k \leq Θ + H\}$ . Moreover, let the state sets $S_{3} = \{(F_{1}, k, a)\}$ and $S_{4} = \{(F_{1}, k, b)\}$ represent the states in which Unit 1 fails while Unit 2 remains normal and defective, respectively, $S_{5} = \{(i, k, F_{2}) \mid 0 \leq i ε \leq D\}$ represent the state in which Unit 2 fails while Unit 1 remains operational, and $S_{6} = \{(F_{1}, k, F_{2})\}$ represent that both units fail. Clearly, the entire system state set is the union of the above-mentioned scenarios, i.e., $S = S_{1} \cup S_{2} \cup S_{3} \cup S_{4} \cup S_{5} \cup S_{6}$ ;
System action set. A maintenance decision is made at each inspection time $T_{n} = n Δ$ . The set of maintenance actions is $\{0, 1, 2, 3\}$ , where 0 means no maintenance action, 1 means corrective replacement action, 2 means preventive replacement action (including TR of Unit 1 and DR of Unit 2), and 3 means opportunistic maintenance action. Let $A = (A_{1}, A_{2})$ represent the maintenance action taken for both units. For instance, when $i ε < U, k < Θ$ , both units are working normally, so no action is taken, and $(A_{1}, A_{2}) = (0, 0)$ ;
System state transition. Because the two units deteriorate independently, the system transition probability is obtained by multiplying the transition probabilities of Unit 1 and Unit 2. For instance, we define $P_{(i, k, a), (j, k + 1, a)} (A_{1}, A_{2})$ as the transition probability from state $(i, k, a)$ to state $(j, k + 1, a)$ when action $(A_{1}, A_{2})$ is taken. Then, the probability $P_{(i, k, a), (j, k + 1, a)} (0, 0)$ is derived as

\begin{array}{l} P_{(i, k, a), (j, k + 1, a)} (0, 0) \\ = P {X (T_{k + 1}) = j ε, Z (T_{k + 1}) = ((k + 1) Δ, a) | X (T_{k}) = i ε < U, Z (T_{k}) = (k Δ, a)} \\ = P^{1}_{i, j} (T_{k}, T_{k + 1}) * P^{2}_{(k, a), (k + 1, a)} (T_{k}, T_{k + 1}) \\ = \int_{\max (0, (j - i - 1) ε)}^{(j - i) ε} \frac{1}{σ \sqrt{2 π Δ}} e^{- \frac{{(x - μ Δ)}^{2}}{2 σ^{2} Δ}} d x * \frac{R_{γ_{a}} ((k + 1) Δ)}{R_{γ_{a}} (k Δ)} . \end{array}

(9)

Similarly, the transition probability from state $(i, k, a)$ to state $(j, k + 1, b)$ when taking action $(0, 0)$, denoted by $P_{(i, k, a), (j, k + 1, b)} (0, 0)$, is calculated as

\begin{array}{l} P_{(i, k, a), (j, k + 1, b)} (0, 0) \\ = P {X (T_{k + 1}) = j ε, Z (T_{k + 1}) = ((k + 1) Δ, b) | X (T_{k}) = i ε < L, Z (T_{k}) = (k Δ, a)} \\ = P^{1}_{i, j} (T_{k}, T_{k + 1}) * P^{2}_{(k, a), (k + 1, b)} (T_{k}, T_{k + 1}) \\ = \int_{\max (0, (j - i - 1) ε)}^{(j - i) ε} \frac{1}{σ \sqrt{2 π Δ}} e^{- \frac{{(x - μ Δ)}^{2}}{2 σ^{2} Δ}} d x * \frac{\int_{k Δ}^{(k + 1) Δ} f_{γ_{a}} (t) R_{γ_{b}} ((k + 1) Δ - t) d t}{R_{γ_{a}} (k Δ)} . \end{array}

(10)

where $(i, k, a)$ represents that, at the $k$-th inspection point, Unit 1 is in state $i$ and Unit 2 is in the normal state, and $(j, k + 1, b)$ represents that, at the $(k + 1)$-th inspection point, Unit 1 is in state $j$ and Unit 2 is in the defective state.

Equations (9) and (10) thus give the joint one-step transition probabilities of the degradation state of Unit 1 and the health state of Unit 2 over one inspection interval, for the cases in which Unit 2 stays normal and becomes defective, respectively.

Moreover, when at least one unit requires maintenance at the decision point, the system transition probability is equal to 1, since the maintained unit is restored to as-good-as-new status after replacement. For instance, when Unit 2 is experiencing CR, and Unit 1 takes the chance to execute OR, the state transition probability $P_{(i, k, F_{2}), (0, k + 1, a)} (3, 1)$ is formulated as

P_{(i, k, F_{2}), (0, k + 1, a)} (3, 1) = P \{X (T_{k + 1}) = 0, Z (T_{k + 1}) = (0, a) \mid X (T_{k}) = i ε \in (L, D), Z (T_{k}) = (k, F_{2})\} = 1 .

(11)

Other system transition probabilities can be derived in a similar manner. In the following, we specify each state transition possibility with regard to system renewals, and derive the corresponding sojourn time and cost.

4.4. Expected Sojourn Time

Under the SMDP framework, the average maintenance cost

g (H, V, U, L)

, as a function of both time-based and condition-based decision variables, is determined by two crucial indicators: (a) the expected sojourn time between two successive decision epochs, and (b) the expected incurred cost over the sojourn time. Both indicators rely on the unit-level and system-level renewal scenarios.

To be clear, let

τ_{(i, k, a)} (A_{1}, A_{2})

and

τ_{(i, k, b)} (A_{1}, A_{2})

denote the expected sojourn time starting from state

(i, k, a)

and

(i, k, b)

, where

A_{1}

and

A_{2}

represent the maintenance action taken for Unit

i, i = 1, 2

. We conclude that four possible situations exist in sojourn time analysis.

a.: No maintenance action is taken

We first focus on the case of taking no maintenance action, which can be partitioned into the following two cases: (a) Unit 1 has not reached the TR threshold, and Unit 2 is normal-working; (b) Unit 1 has not reached the OR threshold, and Unit 2 remains defective and waiting for delayed maintenance. Specifically, the expected sojourn under the first scenario is

\begin{array}{l} τ_{(i, k, a)} (0, 0) = Δ * \sum_{j = 0}^{L / ε} P_{i, j}^{1} (Δ) * \frac{1 - \int_{k Δ}^{(k + 1) Δ} f_{γ_{a}} (x) F_{γ_{b}} ((k + 1) Δ - x) d x}{R_{γ_{a}} (k Δ)} + \\ \sum_{j = 0}^{L / ε} P_{i, j}^{1} (Δ) * \frac{\int_{k Δ}^{(k + 1) Δ} \int_{x}^{(k + 1) Δ} (t - k Δ) f_{γ_{a}} (x) f_{γ_{b}} (t - x) d t d x}{R_{γ_{a}} (k Δ)}, \end{array}

(12)

where

k \leq Θ - 1, i \leq U / ε .

The first term corresponds to Unit 2 not failing within the next inspection cycle, and the second term corresponds to Unit 2 failing at a random time within the next inspection cycle. Likewise, the expected sojourn time under the second scenario is

τ_{(i, k, b)} (0, 0) = Δ * \sum_{j = 0}^{S / ε} P_{i, j}^{1} (Δ) * \frac{R_{γ_{b}} ((k + 1) Δ)}{R_{γ_{b}} (k Δ)} + \sum_{j = 0}^{S / ε} P_{i, j}^{1} (Δ) * \frac{\int_{k Δ}^{(k + 1) Δ} (t - k Δ) f_{γ_{b}} (t) d t}{R_{γ_{b}} (k Δ)},

(13)

where

k \leq Θ + H - 1, i \leq L / ε .
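The Unit 2 part of Equation (13) is the expected time until either the next inspection or the failure of the defective unit, given survival up to $k Δ$. The sketch below evaluates it numerically and cross-checks it against the standard identity that the expectation of a truncated lifetime equals the integral of its survival function. It is a minimal sketch that assumes a Weibull delay time and omits the Unit 1 probability weights; all function names are hypothetical.

```python
import math


def weibull_pdf(t, scale, shape):
    """Assumed Weibull density of the delay time."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def weibull_reliability(t, scale, shape):
    """Assumed Weibull survival function."""
    return math.exp(-((t / scale) ** shape))


def midpoint_integral(func, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(func(a + (m + 0.5) * h) for m in range(n))


def unit2_sojourn_eq13(k, delta, scale, shape):
    """Unit 2 part of Eq. (13): survival term plus failure term."""
    r_k = weibull_reliability(k * delta, scale, shape)
    survive = delta * weibull_reliability((k + 1) * delta, scale, shape) / r_k
    fail = midpoint_integral(
        lambda t: (t - k * delta) * weibull_pdf(t, scale, shape),
        k * delta, (k + 1) * delta) / r_k
    return survive + fail


def unit2_sojourn_via_survival(k, delta, scale, shape):
    """Same quantity, computed as the integral of R(t) over the interval."""
    r_k = weibull_reliability(k * delta, scale, shape)
    return midpoint_integral(
        lambda t: weibull_reliability(t, scale, shape),
        k * delta, (k + 1) * delta) / r_k


if __name__ == "__main__":
    # Illustrative inputs: delta = 1 is an assumption.
    print(unit2_sojourn_eq13(1, 1.0, 0.83, 1.65))
    print(unit2_sojourn_via_survival(1, 1.0, 0.83, 1.65))
```

Agreement between the two values is a quick sanity check on the integration limits and on the normalization by $R_{γ_{b}} (k Δ)$.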

b.: Only Unit 1 is replaced

When Unit 1 undergoes TR or CR, a decision is made on whether to execute OR on Unit 2. Here, the time to execute TR/CR is no less than that of OR, and thus we omit the OR execution time. Notably, when a maintenance action is determined at

T_{n}

, the subsequent decision point

T_{n + 1}

is the maintenance completion time, as graphically represented in Figure 2. Thus, the expected sojourn time when only Unit 1 is preventively or correctively replaced is equal to

τ_{(i, k, a)} (2, 0) = τ_{(i, k, b)} (2, 3) = T_{p 1},

(14)

and

τ_{(F_{1}, k, a)} (1, 0) = τ_{(F_{1}, k, b)} (1, 3) = T_{c 1} .

(15)

c.: Only Unit 2 is replaced

Analogously, when DR or OR is executed on Unit 2, we will determine whether to execute OR on Unit 1. The expected sojourn time with respect to such a case is given by

τ_{(i, k, b)} (0, 2) = τ_{(i, k, b)} (3, 2) = T_{p 2},

(16)

and

τ_{(i, k, F_{2})} (0, 1) = τ_{(i, k, F_{2})} (3, 1) = T_{c 2} .

(17)

d.: Both components are replaced

It is possible for both units to undergo preventive replacement at an inspection, when (a) the degradation of Unit 1 is within

(U, D)

and TR is immediate, and (b) the accumulated age of Unit 2 has reached the postponement threshold, and replacement is immediate. In this case, the expected sojourn time is the maximum of these two preventive replacement times, i.e.,

τ_{(i, k, b)} (2, 2) = \max (T_{p 1}, T_{p 2}) .

(18)

Similarly, the expected sojourn time when both units are experiencing corrective replacement is

τ_{(F_{1}, k, F_{2})} (1, 1) = \max (T_{c 1}, T_{c 2}) .

(19)

The sojourn times when one unit experiences preventive replacement while the other experiences corrective replacement are given by

τ_{(i, k, F_{2})} (2, 1) = \max (T_{p 1}, T_{c 2}),

(20)

and

τ_{(F_{1}, k, b)} (1, 2) = \max (T_{c 1}, T_{p 2}) .

(21)
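Equations (14) to (21) share one pattern: the sojourn time is the longest replacement duration among the actions taken, with OR contributing no extra time. The sketch below encodes this pattern. The action codes (0 for no action, 1 for CR, 2 for TR/DR, 3 for OR) are inferred from the equations, and the durations in the example call are hypothetical placeholders, since no values are given here.

```python
NO_ACTION, CR, PREVENTIVE, OR = 0, 1, 2, 3


def replacement_sojourn(a1, a2, t_c, t_p):
    """Expected sojourn time when at least one CR/TR/DR is performed.

    t_c and t_p are pairs (Unit 1, Unit 2) of corrective and preventive
    replacement durations. OR time is omitted, as in Eqs. (14)-(17).
    """
    def duration(action, unit):
        if action == CR:
            return t_c[unit]
        if action == PREVENTIVE:
            return t_p[unit]
        return 0.0  # no action and OR add no time of their own

    if a1 in (NO_ACTION, OR) and a2 in (NO_ACTION, OR):
        raise ValueError("use Eqs. (12)-(13) when no CR/TR/DR is performed")
    return max(duration(a1, 0), duration(a2, 1))


if __name__ == "__main__":
    # Hypothetical durations (T_c1, T_c2) and (T_p1, T_p2).
    print(replacement_sojourn(PREVENTIVE, CR, t_c=(0.5, 0.4), t_p=(0.2, 0.3)))
```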

4.5. Expected Cost

The maintenance cost arises due to the execution of inspections as well as replacements. Let

C_{(i, k, a)} (A_{1}, A_{2})

denote the maintenance cost incurred starting from state

(i, k, a)

when action

(A_{1}, A_{2})

is taken. As mentioned in Section 3, the maintenance cost can be divided into several components, including inspection cost, corrective maintenance cost, preventive maintenance cost, and opportunistic maintenance cost. Analogous to Section 4.4, there are four scenarios that need to be discussed.

a.: No maintenance action is taken

Clearly, when no maintenance action is taken, the only cost incurred is the inspection cost upon decision making. Remember that Unit 2 can either be normal or defective when experiencing inspection. Therefore, the expected cost can be constructed as

C_{(i, k, a)} (0, 0) = C_{(i, k, b)} (0, 0) = C_{I} .

(22)

b.: Only Unit 1 is replaced

When Unit 1 is experiencing TR or CR, Unit 2 is also offered a chance to execute OR. Depending on whether OR is accepted, four situations are possible:

C_{(F_{1}, k, a)} (1, 0) = C_{(F_{1}, k, b)} (1, 0) = C_{I} + C_{c 1},

(23)

C_{(F_{1}, k, b)} (1, 3) = C_{I} + C_{c 1} + C_{o 2},

(24)

C_{(i, k, a)} (2, 0) = C_{(i, k, b)} (2, 0) = C_{I} + C_{p 1},

(25)

C_{(i, k, b)} (2, 3) = C_{I} + C_{p 1} + C_{o 2} .

(26)

c.: Only Unit 2 is replaced

Similarly, when Unit 2 is experiencing DR or CR, four situations are possible:

C_{(i, k, F_{2})} (0, 1) = C_{I} + C_{c 2},

(27)

C_{(i, k, F_{2})} (3, 1) = C_{I} + C_{c 2} + C_{o 1},

(28)

C_{(i, k, b)} (0, 2) = C_{I} + C_{p 2},

(29)

C_{(i, k, b)} (3, 2) = C_{I} + C_{p 2} + C_{o 1} .

(30)

d.: Both components are replaced

Similar to scenario c, there are four possible situations depending on whether each unit is preventively or correctively replaced. The corresponding expected costs are

C_{(i, k, b)} (2, 2) = C_{I} + C_{p 1} + C_{p 2},

(31)

C_{(F_{1}, k, b)} (1, 2) = C_{I} + C_{c 1} + C_{p 2},

(32)

C_{(F_{1}, k, F_{2})} (1, 1) = C_{I} + C_{c 1} + C_{c 2},

(33)

C_{(i, k, F_{2})} (2, 1) = C_{I} + C_{p 1} + C_{c 2} .

(34)
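Equations (22) to (34) all reduce to the inspection cost plus the cost of whichever action each unit receives. The sketch below expresses this as a lookup, using the cost values of the case study in Section 5. The action codes (0 for no action, 1 for CR, 2 for TR/DR, 3 for OR) are inferred from the equations, and the names are hypothetical.

```python
NO_ACTION, CR, PREVENTIVE, OR = 0, 1, 2, 3

INSPECTION_COST = 500  # C_I
ACTION_COST = (
    {NO_ACTION: 0, CR: 12000, PREVENTIVE: 6000, OR: 4500},  # Unit 1 (main bearing)
    {NO_ACTION: 0, CR: 9000, PREVENTIVE: 4000, OR: 3000},   # Unit 2 (impeller)
)


def stage_cost(a1, a2):
    """Cost incurred at a decision epoch for the action pair (a1, a2)."""
    return INSPECTION_COST + ACTION_COST[0][a1] + ACTION_COST[1][a2]


if __name__ == "__main__":
    # Eq. (28): CR on Unit 2 with OR on Unit 1.
    print(stage_cost(OR, CR))
```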

4.6. System Objective Function

With the state transition function constructed in Section 4.3 and the expected sojourn/cost function constructed in Section 4.4 and Section 4.5, the stationary probabilities can be derived by solving the following set of equations:

\left\{ \begin{array}{l} π_{(j, k + 1, a)} = \sum_{(i, k, a) \in S} P_{(i, k, a), (j, k + 1, a)} π_{(i, k, a)}, (i, k, a) \in S \\ π_{(j, k + 1, b)} = \sum_{(i, k, a) \in S} P_{(i, k, a), (j, k + 1, b)} π_{(i, k, a)} + \sum_{(i, k, b) \in S} P_{(i, k, b), (j, k + 1, b)} π_{(i, k, b)}, (i, k, b) \in S \\ \sum_{(i, k, a) \in S} π_{(i, k, a)} + \sum_{(i, k, b) \in S} π_{(i, k, b)} = 1 \end{array} \right.

(35)

P_{(i, k, a), (j, k + 1, a)} π_{(i, k, a)}

represents the stationary probability that the system state at the k-th inspection is

(i, k, a)

and transitions to

(j, k + 1, a)

at the next point.

Accordingly, the stationary average maintenance cost is formulated as

g (H, V, U, L) = \frac{\sum_{(i, k, a) \in S} C_{(i, k, a)} π_{(i, k, a)} + \sum_{(i, k, b) \in S} C_{(i, k, b)} π_{(i, k, b)}}{\sum_{(i, k, a) \in S} τ_{(i, k, a)} π_{(i, k, a)} + \sum_{(i, k, b) \in S} τ_{(i, k, b)} π_{(i, k, b)}} .

(36)
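The structure of Equations (35) and (36) can be illustrated on a toy chain: the stationary distribution of the embedded chain weights the per-state costs and sojourn times, and their ratio gives the long-run cost rate. The two-state transition matrix, costs, and sojourn times below are hypothetical, and power iteration is only one of several ways to solve the balance equations (it assumes an irreducible, aperiodic chain).

```python
def stationary_distribution(P, n_iter=500):
    """Approximate the stationary distribution of P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(pi)
        pi = [p / total for p in pi]  # keep the normalization condition
    return pi


def average_cost(P, cost, sojourn):
    """Long-run cost rate: expected cost per epoch over expected sojourn."""
    pi = stationary_distribution(P)
    numerator = sum(c * p for c, p in zip(cost, pi))
    denominator = sum(t * p for t, p in zip(sojourn, pi))
    return numerator / denominator


if __name__ == "__main__":
    # Toy example: state 0 = "keep operating", state 1 = "replace".
    P = [[0.5, 0.5],
         [1.0, 0.0]]
    print(average_cost(P, cost=[500.0, 6500.0], sojourn=[1.0, 0.5]))
```

For this toy chain the stationary distribution is (2/3, 1/3), so the ratio can also be checked by hand.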

4.7. Solution Procedure

The proposed maintenance model contains multiple dependent decision variables, which are difficult to solve via analytical approaches. To this end, we propose a heuristic random search approach under the framework of the Ant Colony Algorithm. Ant Colony Optimization, initially proposed by Marco Dorigo, is an efficient probabilistic evolutionary algorithm used to find optimal paths in graphs [43]. It is inspired by the way ants find paths while foraging. As they move, ants deposit pheromones, whose concentration gradually weakens as the distance travelled increases. Consequently, the pheromone concentration is usually strongest near the nest or the food source, and ants choose their direction according to the pheromone concentration. The main procedure of the optimization approach is outlined below:

Step 1. Initialize the model parameters, including the degradation coefficients of Unit 1 and Unit 2, and the time and cost needed to execute CR, TR, DR, and OR. Initialize the number of ants $W$, the information heuristic factor $α (α \geq 0)$, the expectation heuristic factor $β (β \geq 0)$, the objective function $g$, and the number of iterations $N$. Initialize the solution set $(H^{*}, V^{*}, U^{*}, L^{*})$;
Step 2. Place the starting point of each ant in the current solution set. Each ant then moves to the next point $j$ with probability $P_{i j}$, and vertex $j$ is added to the current solution set;
Step 3. Calculate the objective function value $g (H^{*}, V^{*}, U^{*}, L^{*})$ for each ant;
Step 4. Update the trail (pheromone) intensity according to $τ_{i j} (t + 1) = α τ_{i j} (t) + (1 - α) Δ τ_{i j}$;
Step 5. Reduce the remaining number of iterations by one, i.e., $N = N - 1$. If $N > 0$ and there is no degradation behavior, return to Step 2. Otherwise, output the optimal parameters $(H^{*}, V^{*}, U^{*}, L^{*})$.
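The steps above can be sketched as a simplified ant-colony-style search over a discrete grid of $(H, V, U, L)$ values. This is a minimal sketch under stated assumptions, not the authors' implementation: one pheromone vector is kept per decision variable, the retention weight `rho` plays the role of the weight in the Step 4 update, the heuristic factor $β$ is omitted, and `toy_cost_rate` is a hypothetical placeholder for $g$ with an arbitrary minimum.

```python
import random


def aco_minimize(objective, domains, n_ants=20, n_iter=60, rho=0.8, seed=1):
    """Simplified ant-colony-style search on a discrete grid."""
    rng = random.Random(seed)
    pheromone = [[1.0] * len(d) for d in domains]
    best_idx, best_val = None, float("inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            # Each ant samples one grid value per variable, biased by pheromone.
            idx = [rng.choices(range(len(d)), weights=w)[0]
                   for d, w in zip(domains, pheromone)]
            val = objective(tuple(d[i] for d, i in zip(domains, idx)))
            if val < best_val:
                best_idx, best_val = idx, val
        if best_idx is None:
            continue  # no feasible candidate found yet
        # Trail update: tau <- rho * tau + (1 - rho) * deposit.
        for w, chosen in zip(pheromone, best_idx):
            for i in range(len(w)):
                deposit = 1.0 if i == chosen else 0.0
                w[i] = rho * w[i] + (1.0 - rho) * deposit
    if best_idx is None:
        return None, best_val
    return tuple(d[i] for d, i in zip(domains, best_idx)), best_val


def toy_cost_rate(x):
    """Hypothetical stand-in for g(H, V, U, L); infeasible points cost inf."""
    h, v, u, l = x
    if not (v < h and l < u):
        return float("inf")
    return 3000.0 + (h - 8) ** 2 + (v - 5) ** 2 + (u - 20.0) ** 2 + (l - 16.0) ** 2


DOMAINS = [
    list(range(2, 13)),                   # H (weeks)
    list(range(1, 12)),                   # V (weeks)
    [15.0 + 0.5 * m for m in range(15)],  # U (mm)
    [14.0 + 0.5 * m for m in range(15)],  # L (mm)
]

if __name__ == "__main__":
    print(aco_minimize(toy_cost_rate, DOMAINS))
```

In practice, `toy_cost_rate` would be replaced by an evaluation of Equation (36) for the candidate thresholds, which is the expensive part of each iteration.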

5. Numerical Experiment

This section applies the proposed maintenance framework to the crucial mechanical structure of the circulating pump. The circulating pump is a common component in larger-scale systems used for transport reaction, absorption, separation, and absorption liquid regeneration. The structure of the circulating pump consists of two safety-critical key components: the main bearing and the impeller. These two components are connected in series, meaning that if one of them fails, the entire pump will break down immediately. The primary mode of degradation for the main bearing is fatigue fracture. This type of degradation can be quantified by measuring the length of cracks that form in the bearing over time. In contrast, the degradation process for the impeller can be broken down into two distinct phases: cavitation and corrosive pitting. In particular, the degradation process that occurs during the corrosive pitting phase is typically more severe than that during the cavitation phase. This makes it intractable to monitor and detect these types of degradation processes in real time. Instead, the failure data as well as right-censored data can be employed to capture the time-scale failure characteristics.

Through the goodness-of-fit test, Weibull distributions characterize the sojourns of the defect initialization and evolution processes [43], with scales

λ_{1} = 0.78

,

λ_{2} = 0.83

and shapes

k_{1} = 0.63

,

k_{2} = 1.65

, respectively. Additionally, the bearing degradation is described by the Wiener process, which has been widely adopted to characterize the crack propagation process [18], with a failure threshold of 22.8 mm. Based on parameter estimation outcomes, the drift and diffusion coefficients of the Wiener process are

μ = 0.741

and

σ = 0.012

, respectively. The circulating pump is inspected weekly to identify its health state, including the bearing crack length and whether corrosive pitting has initiated. The cost structure is set as follows. The CR, TR, and OR costs for the main bearing are 12,000, 6000, and 4500, respectively; the CR, DR, and OR costs for the impeller are 9000, 4000, and 3000, respectively; the inspection cost is 500 per time unit.
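To get a feel for these bearing parameters, the sketch below simulates the crack length at weekly inspections and reports the first inspection at which the failure threshold is exceeded. It assumes that the crack length starts at zero, that the drift and diffusion are expressed per week, and that the state is only observed at inspections; the function name is hypothetical. Under these assumptions the mean time to reach 22.8 mm is about 22.8 / 0.741, i.e., roughly 31 weeks.

```python
import math
import random


def first_inspection_at_failure(mu, sigma, threshold, delta=1.0, seed=None,
                                max_inspections=1000):
    """Index of the first inspection at which the simulated Wiener
    degradation path is at or above the failure threshold."""
    rng = random.Random(seed)
    x = 0.0  # assumed initial crack length
    for k in range(1, max_inspections + 1):
        # Independent normal increment over one inspection interval.
        x += rng.gauss(mu * delta, sigma * math.sqrt(delta))
        if x >= threshold:
            return k
    return None  # threshold not reached within the horizon


if __name__ == "__main__":
    weeks = [first_inspection_at_failure(0.741, 0.012, 22.8, seed=s)
             for s in range(10)]
    print(weeks)
```

Because the diffusion coefficient is small relative to the drift, the simulated failure times vary very little from one run to the next.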

5.1. Optimization Results

The optimal combination of decision variables and the minimum average maintenance cost are searched for using the algorithm in Section 4.7. According to the optimization outcome, the minimum cost is obtained when (1) Unit 1 (main bearing) is preventively replaced (TR) when its crack length reaches 19.6 mm, or opportunistically replaced (OR) when the length reaches 17.1 mm; (2) Unit 2 (impeller) is preventively replaced (DR) 9 weeks after the identification of corrosive pitting, or opportunistically replaced (OR) if an opportunity arises between 7 and 9 weeks. The minimum cost under the optimal solution is 2941.5 per week.
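The resulting policy can be read as a simple decision rule at each inspection. The sketch below is one hedged interpretation of it, using the thresholds quoted in this paragraph as defaults: scheduled and corrective actions are determined first, and a replacement of one unit then opens an opportunity for the other. The function name, argument names, and string labels are hypothetical.

```python
def joint_decision(crack_mm, defect_age_weeks, unit2_failed,
                   fail=22.8, tr=19.6, or1=17.1, dr=9, or2=7):
    """Return the actions for (Unit 1, Unit 2) at an inspection.

    defect_age_weeks is None while no defect has been identified.
    """
    # Scheduled or corrective actions first, ignoring opportunities.
    if crack_mm >= fail:
        a1 = "CR"
    elif crack_mm >= tr:
        a1 = "TR"
    else:
        a1 = "none"
    if unit2_failed:
        a2 = "CR"
    elif defect_age_weeks is not None and defect_age_weeks >= dr:
        a2 = "DR"
    else:
        a2 = "none"
    # A replacement of one unit opens an opportunity for the other.
    if a1 == "none" and a2 != "none" and crack_mm >= or1:
        a1 = "OR"
    elif (a2 == "none" and a1 != "none"
          and defect_age_weeks is not None and defect_age_weeks >= or2):
        a2 = "OR"
    return a1, a2


if __name__ == "__main__":
    print(joint_decision(18.0, 9, False))
    print(joint_decision(20.0, 8, False))
```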

Clearly, the optimization outcome is affected by several health-related factors of the circulating pump, such as the crack propagation velocity/volatility of the bearing, as well as the state sojourns of the impeller. Therefore, we conducted a sensitivity analysis on the crucial coefficients representing the deterioration severity of the system. Here, we focus on the variations in the bearing degradation coefficients, as the impeller deterioration can be analyzed in a similar way. First, we consider the sensitivity of the maintenance cost to the diffusion parameter. As shown in Figure 3a, the optimal cost decreases as the diffusion increases, at a gradually diminishing rate. Moreover, as shown in Figure 3b, the optimal cost increases as the drift increases, also at a diminishing rate. From Figure 4, the drift and diffusion parameters contribute significantly and almost equally to maintenance performance. Therefore, the bearing degradation should be carefully identified and controlled.

For a better illustration of the collaboration effect, we tested the maintenance cost variation with respect to both the drift and diffusion coefficients, as indicated in Figure 4. From the diagram, a larger drift and a smaller diffusion coefficient contribute to a higher cost. This is because the diffusion coefficient affects the degradation rate, in which case the condition-based threshold will decrease, indicating that threshold-based maintenance (TR) needs to be executed more regularly, resulting in higher maintenance-induced costs.

Figure 5 indicates the maintenance cost with respect to the (a) scale parameter and (b) shape parameter of the corrosive pitting initialization process. It can be seen from the figure that with the increase in the shape parameter and the decrease in the scale parameter, the maintenance cost gradually increases. Moreover, the impact of the scale parameter on the maintenance cost is larger than that of the shape parameter. This is due to the fact that the scale parameter is more closely related to the initialization duration and affects replacement executions more significantly.

5.2. Policy Comparison

To highlight the superiority of the proposed maintenance policy in cost control, we introduce three alternate policies for comparison, either widely used in maintenance engineering or easy to implement.

■: Policy 1. TR and CR are executed for Unit 1, but OR is ignored. Unit 2 undergoes TR, DR, CR, and OR, which is aligned with the proposed policy;
■: Policy 2. DR and CR are executed for Unit 2, but OR is ignored; Unit 1 remains the same as the proposed policy;
■: Policy 3. Both units undergo TR, DR, and CR, and OR is omitted. Then, the policy reduces to a conventional preventive maintenance policy without considering unit dependencies.

The comparison result between these four maintenance policies is indicated in Table 1. Clearly, the proposed policy outperforms the other heuristic policies, reducing the cost by 11.5%, 6.8%, and 13.7%, respectively. This indicates the profitability of (a) executing OM, either PM-induced or CM-induced, and (b) allowing defect-induced maintenance to be performed, since more maintenance opportunities can be integrated to reduce downtime. Moreover, due to the existence of OM, a more radical (scheduled) maintenance policy is possible, in that preventive maintenance can be arranged less frequently.
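The relative savings can be recomputed directly from the average costs listed in Table 1. The sketch below takes each alternative policy's cost as the baseline, which is an assumption about how the percentages are defined; with a different baseline or rounding convention, the results may not reproduce the quoted figures exactly.

```python
TABLE1_COST = {
    "Proposed policy": 2941.5,
    "Policy 1": 3321.7,
    "Policy 2": 3154.9,
    "Policy 3": 3467.6,
}


def relative_saving(alternative_cost, proposed_cost):
    """Cost reduction of the proposed policy relative to an alternative."""
    return (alternative_cost - proposed_cost) / alternative_cost


if __name__ == "__main__":
    proposed = TABLE1_COST["Proposed policy"]
    for name, cost in TABLE1_COST.items():
        if name != "Proposed policy":
            print(name, round(100.0 * relative_saving(cost, proposed), 1))
```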

To test the robustness of the proposed policy, we conducted a sensitivity analysis on some critical cost parameters of the circulating pump, such as the opportunistic replacement cost and the preventive replacement cost. We first tested the variation in pump maintenance cost with respect to the OR cost of the main bearing. As indicated in Figure 6, the proposed policy outperforms the other three policies regardless of the OR cost variation. Notably, the optimal cost under Policy 1 and Policy 3 is not affected by this OR cost, because these policies do not use the corresponding maintenance opportunities. Also, the proposed maintenance policy is less sensitive to the OR cost than Policy 2.

Likewise, the sensitivity of the optimal cost to the OR cost of the impeller can be analyzed. As can be seen from Figure 7, the proposed policy is more cost-effective than the other policies and is hardly affected by the impeller OR cost. Its optimal cost is the least sensitive to the OR cost because multiple interacting maintenance actions coexist, which weakens the influence of any single maintenance action.

Finally, we tested the influence of the threshold-centered TR cost on the optimal maintenance cost. To this end, we fixed the cost of OR and varied the TR cost from 1000 to 6000. The test outcome is indicated in Figure 8. Although the optimal maintenance cost increases with the TR cost under all four maintenance policies, the cost-effectiveness of the proposed policy is not challenged by such variations, since its rate of cost increase is slightly smaller than or equivalent to that of the other policies. This indicates the robustness of the maintenance policy and framework.

6. Conclusions

An innovative maintenance optimization problem regarding a two-unit serial system with both continuous and non-continuous degradation was investigated. Unlike previous studies, three types of maintenance renewal activities, namely, threshold-based renewal, postponement renewal, and opportunistic renewal, were integrated to enhance maintenance efficiency and mitigate downtime loss, which can be quantitatively analyzed and optimized via the Semi-Markov Decision Process. The applicability and cost superiorities over other conventional maintenance policies are demonstrated through a case study of the critical mechanical structure of a circulating pump. The comparative outcomes indicate that the proposed policy outperforms some heuristic/conventional policies in downtime mitigation and cost control, possessing better model robustness.

In future research, there are several possible extensions to the current study. First, the proposed maintenance model can be applied to more general and sophisticated multi-unit systems [44,45]. Second, the system structure mode can also be extended, including but not limited to serial, parallel, standby, and voting systems, with modified maintenance policies [30]. Third, the failure interaction and load sharing between units can also be integrated into the proposed model [40,46,47]. Last but not least, the statistical properties of the proposed model can be further explored to enhance the robustness and applicability of the maintenance policy.

Author Contributions

Methodology, F.W.; Formal analysis, F.W.; Resources, X.M. and L.Y.; Writing—original draft, F.W.; Writing—review & editing, J.W., L.Y. and Q.Q.; Visualization, F.W.; Supervision, X.M. and Q.Q.; Funding acquisition, L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant no. 72101010).

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

Notations and Abbreviations

CBM	Condition-based maintenance
CDF	Cumulative distribution function
PDF	Probability density function
SMDP	Semi-Markov Decision Process
PM	Preventive maintenance
OM	Opportunistic maintenance
TR	Threshold-based replacement
DR	Delayed replacement
CR	Corrective replacement
OR	Opportunistic replacement
$Δ, Δ > 0$	Periodic inspection interval
$T_{k} = k Δ, k = 1, 2, \dots$	Execution time of the k-th inspection
$X (t), t > 0$	Degradation process characterized by Wiener process
$γ_{a}$	Random defect initialization time of Unit 2
$γ_{b}$	Random delay time of Unit 2
$D, D > 0$	Failure threshold regarding degradation of Unit 1
$U, 0 < U < D$	TR threshold regarding degradation of Unit 1
$L, 0 < L < U$	OR threshold regarding degradation of Unit 1
$H, H > 0$	Threshold for DR interval of Unit 2
$V, 0 < V < H$	Threshold for OR interval of Unit 2
$C_{I}$	Cost per inspection
$C_{c i}, i = 1, 2$	Cost per corrective replacement for Unit $i$
$C_{p i}, i = 1, 2$	Cost per preventive replacement (TR for Unit 1, DR for Unit 2)
$C_{o i}, i = 1, 2$	Cost per opportunistic replacement for Unit $i$

References

1. Yang, L.; Chen, Y.; Qiu, Q.; Wang, J. Risk Control of Mission-Critical Systems: Abort Decision-Makings Integrating Health and Age Conditions. IEEE Trans. Ind. Inform. 2022, 18, 6887–6894.
2. Wang, J.; Wang, Y.; Fu, Y. Joint Optimization of Condition-Based Maintenance and Performance Control for Linear Multi-State Consecutively Connected Systems. Mathematics 2023, 11, 2724.
3. Qiu, Q.; Maillart, L.M.; Prokopyev, O.A.; Cui, L. Optimal condition-based mission abort decisions. IEEE Trans. Reliab. 2023, 72, 408–425.
4. Wu, D.; Han, R.; Ma, Y.; Yang, L.; Wei, F.; Peng, R. A two-dimensional maintenance optimization framework balancing hazard risk and energy consumption rates. Comput. Ind. Eng. 2022, 169, 108193.
5. Peng, R. Optimal maintenance strategy for systems with two failure modes. Reliab. Eng. Syst. Saf. 2019, 188, 624–632.
6. Wang, J.; Viliam, M.; Xian, Z. Optimal condition-based and age-based opportunistic maintenance policy for a two-Unit series system. Comput. Ind. Eng. 2019, 134, 1–10.
7. Van, P.D.; Bérenguer, C. Condition-based maintenance with imperfect preventive repairs for a deteriorating production system. Qual. Reliab. Eng. Int. 2012, 28, 624–633.
8. Truong-Ba, H.; Borghesani, P.; Cholette, M.E.; Ma, L. Optimization of condition-based maintenance considering partial opportunities. Qual. Reliab. Eng. Int. 2020, 36, 529–546.
9. Sugumaran, V.; Ramachandran, K. Automatic rule learning using decision tree for fuzzy classifier in fault diagnosis of roller bearing. Mech. Syst. Signal Process. 2007, 21, 2237–2247.
10. Lei, B.; Ren, Y.; Luan, H.; Dong, R.; Wang, X.; Liao, J.; Gao, K. A Review of Optimization for System Reliability of Microgrid. Mathematics 2023, 11, 822.
11. Berrade, M.; Scarf, P.; Cavalcante, C. A study of postponed replacement in a delay time model. Reliab. Eng. Syst. Saf. 2017, 168, 70–79.
12. Zhao, X.; Sun, J.; Qiu, Q.; Chen, K. Optimal inspection and mission abort policies for systems subject to degradation. Eur. J. Oper. Res. 2021, 292, 610–621.
13. Kou, G.; Liu, Y.; Xiao, H.; Peng, R. Optimal Inspection Policy for a Three-Stage System Considering the Production Wait Time. IEEE Trans. Reliab. 2022.
14. Qiu, Q.; Cui, L. Gamma process based optimal mission abort policy. Reliab. Eng. Syst. Saf. 2019, 190, 106496.
15. Jafari, L.; Makis, V. Optimal production and maintenance policy for a partially observable production system with stochastic demand. Int. J. Ind. Syst. Eng. 2019, 13, 449–456.
16. Wang, J.; Qiu, Q.; Wang, H. Joint optimization of condition-based and age-based replacement policy and inventory policy for a two-Unit series system. Reliab. Eng. Syst. Saf. 2021, 205, 107251.
17. Castanier, B.; Grall, A.; Berenguer, C. A condition-based maintenance policy with non-periodic inspections for a two-Unit series system. Reliab. Eng. Syst. Saf. 2005, 87, 109–120.
18. Yi, K.; Xiao, H.; Kou, G.; Peng, R. Trade-off between maintenance and protection for multi-state performance sharing systems with transmission loss. Comput. Ind. Eng. 2019, 136, 305–315.
19. Cui, L.; Huang, J.; Yan, L. Degradation Models with Wiener Diffusion Processes Under Calibrations. IEEE Trans. Reliab. 2016, 65, 613–623.
20. Xie, M.; Cho, G.; Kong, Y.; Li, D.L.; Altamirano, F.; Luo, X.; Hill, J.A. Activation of Autophagic Flux Blunts Cardiac Ischemia/Reperfusion Injury. Circ. Res. 2021, 129, 435–450.
21. Wu, T.; Wei, F.; Yang, L.; Ma, X.; Hu, L. Maintenance optimization of k-out-of-n Load-Sharing Systems Under Continuous Operation. IEEE Trans. Syst. Man Cybern. Syst. 2023.
22. Huynh, K.; Castro, I.; Barros, A.; Bérenguer, C. On the Use of Mean Residual Life as a Condition Index for Condition-Based Maintenance Decision-Making. IEEE Trans. Syst. Man Cybern. Syst. 2014, 44, 877–893.
23. Qiu, Q.; Cui, L. Reliability modelling based on dependent two-stage virtual age processes. J. Syst. Eng. Electron. 2021, 32, 711–721.
24. Yang, L.; Chen, Y.; Ma, X.; Peng, R. A Prognosis-centered Intelligent Maintenance Optimization Framework under Uncertain Failure Threshold. IEEE Trans. Reliab. 2023.
25. Jardine, A.K. Optimizing condition-based maintenance decisions. In Proceedings of the Annual Reliability and Maintainability Symposium: 2002 Proceedings (Cat. No. 02CH37318), Seattle, WA, USA, 28–31 January 2002; pp. 90–97.
26. Li, S.; Chu, X. New tristate fault classification model and its threshold determination. J. Nanjing Univ. Aeronaut. Astronaut. 2008, 40, 292–296.
27. Gopalan, M.; Ramesh, T. Cost-benefit analysis of a one-server two-Unit system subject to shock and degradation. Microelectron. Reliab. 1986, 26, 499–518.
28. Malefaki, S.; Koutras, V.; Platis, A. Optimizing Availability and Performance of a Two-Unit Redundant Multi-State Deteriorating System. In Recent Advances in Multi-State Systems Reliability: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2018; pp. 71–105.
29. Ahmad, R.; Kamaruddin, S. An overview of time-based and condition-based maintenance in industrial application. Comput. Ind. Eng. 2012, 63, 135–149.
30. Levitin, G.; Xing, L.; Xiang, Y. Optimal replacement and reactivation in warm standby systems performing random duration missions. Comput. Ind. Eng. 2020, 149, 106791.
31. Barbu, V.S.; D’Amico, G.; Gkelsinis, T. Sequential interval reliability for discrete-time homogeneous semi-Markov repairable systems. Mathematics 2021, 9, 1997.
32. Dekker, R.; Wildeman, R.; Schouten, F. A review of multi-component maintenance models with economic dependence. Econom. Inst. Res. Pap. 1997, 45, 411–435.
33. D’Amico, G.; Manca, R.; Petroni, F.; Selvamuthu, D. On the computation of some interval reliability indicators for semi-Markov systems. Mathematics 2021, 9, 575.
34. Wang, W.; Liu, X.; Peng, R.; Guo, L. A delay-time-based inspection model for a two-component parallel system. In Proceedings of the 2013 Quality Reliability Risk Maintenance and Safety Engineering (QR2MSE), Chengdu, China, 15–18 July 2013; pp. 616–619.
35. Zhang, Z.; Yang, L. State-Based Opportunistic Maintenance with Multifunctional Maintenance Windows. IEEE Trans. Reliab. 2021, 70, 1481–1494.
36. Najafi, S.; Makis, V. Comparison of two maintenance policies for a two-unit series system considering general repair. Int. J. Ind. Manuf. Eng. 2020, 14, 618–623.
37. Chen, Y.; Ma, X.; Wei, F.; Yang, L.; Qiu, Q. Dynamic scheduling of intelligent group maintenance planning under usage availability constraint. Mathematics 2022, 10, 2730.
38. Qiu, Q.; Cui, L. Reliability evaluation based on a dependent two-stage failure process with competing failures. Appl. Math. Model. 2018, 64, 699–712.
39. Yang, L.; Wei, F.; Qiu, Q. Mission Risk Control via Joint Optimization of Sampling and Abort Decisions. Risk Anal. 2023.
40. Rafiee, K.; Feng, Q.; Coit, D.W. Reliability analysis and condition-based maintenance for failure processes with degradation-dependent hard failure threshold. Qual. Reliab. Eng. Int. 2017, 33, 1351–1366.
41. Xiao, H.; Yi, K.; Peng, R.; Kou, G. Reliability of a distributed computing system with performance sharing. IEEE Trans. Reliab. 2021, 71, 1555–1566.
42. Whitmore, G. Estimating degradation by a Wiener diffusion process subject to measurement error. Lifetime Data Anal. 1995, 1, 307–319.
43. Dorigo, M.; Birattari, M.; Stutzle, T. Ant colony optimization. IEEE Comput. Intell. Mag. 2006, 1, 28–39.
44. Peng, R.; He, X.; Zhong, C.; Kou, G.; Xiao, H. Preventive replacement policies with time of operations, mission durations, minimal repairs and maintenance triggering approaches. J. Manuf. Syst. 2020, 220, 108310.
45. Levitin, G.; Xing, L.; Xiang, Y. Optimal multiple replacement and maintenance scheduling in two-Unit systems. Reliab. Eng. Syst. Saf. 2021, 22, 107803.
46. Peng, R.; He, X.; Zhong, C.; Kou, G.; Xiao, H. Preventive maintenance for heterogeneous parallel systems with two failure modes. Reliab. Eng. Syst. Saf. 2022, 220, 108310.
47. Basri, E.I.; Razak, I.H.A.; Ab-Samat, H.; Kamaruddin, S. Preventive maintenance (PM) planning: A review. J. Qual. Maint. Eng. 2017, 23, 114–143.

Figure 1. Research framework.

Figure 2. An illustration of the integration for TR, OR, and CR.

Figure 3. The optimal cost with respect to the degradation coefficient of the main bearing. (a) Diffusion; (b) drift.

Figure 4. Maintenance cost with respect to drift and diffusion of the main bearing.

Figure 5. Sensitivity analysis for the first stage of Unit 2.

Figure 6. Optimal maintenance cost with respect to varying cost CO1 under four comparative policies.

Figure 7. Sensitivity analysis of maintenance costs for varying CO2.

Figure 8. Sensitivity analysis of maintenance costs for varying CP1.

Table 1. The optimal maintenance costs under four different maintenance policies.

Policies	H (Week)	V (Week)	L (mm)	U (mm)	Average Cost
Proposed policy	12	9	17.1	19.6	2941.5
Policy 1	12	9	16.5	18.4	3321.7
Policy 2	10	8	15.5	17.6	3154.9
Policy 3	9	7	15.2	16.8	3467.6

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
