Article

A Hybrid Cluster Variational Autoencoder Model for Monitoring the Multimode Blast Furnace System

College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou 310018, China
* Author to whom correspondence should be addressed.
Processes 2023, 11(9), 2580; https://doi.org/10.3390/pr11092580
Submission received: 2 August 2023 / Revised: 24 August 2023 / Accepted: 25 August 2023 / Published: 29 August 2023

Abstract

Efficient monitoring of the blast furnace system is crucial for maintaining high production efficiency and ensuring product quality. This article introduces a hybrid cluster variational autoencoder model for monitoring the blast furnace ironmaking process, which exhibits multimode behaviors. In contrast to traditional approaches, this method utilizes neural networks to learn data features and effectively handles the diverse feature types observed in different production modes. Through a clustering process within the hidden layer of the variational autoencoder, the proposed technique facilitates efficient fault detection on multimodal blast furnace data. Based on the variational autoencoder model, this study further establishes a unified monitoring index and defines a method for computing the control limit. The application of the model to real blast furnace data demonstrates its ability to accurately identify faults across diverse modes; compared with probabilistic principal component analysis based on local nearest neighbor standardization and recursive probabilistic principal component analysis, the model reduces false positives by up to 10.3% and the missed detection rate by 19.2%, achieving a false detection rate of only 0.2% and no missed detections.

1. Introduction

The iron and steel industry, a cornerstone of modern manufacturing, plays a vital economic role. Blast furnace ironmaking is pivotal within this sector, significantly affecting system stability and productivity. The blast furnace system includes subsystems such as feeding, gas treatment, hot air, injection, iron-out, and the furnace body [1]. The furnace body is central to ironmaking, housing the reactions that yield molten iron. Raw materials enter through the top distributor, while hot air, oxygen-enriched gases, and fuel are injected from the bottom [2]. Liquid iron and slag exit the lower section, and gas is collected from the upper section [3]. Various types of faults occur in blast furnace ironmaking, such as suspension, low feed, cooling, accumulation, and airflow issues, stemming from factors like raw material instability, external fluctuations, manual errors, and equipment glitches. Therefore, detecting and addressing these faults early is crucial for maintaining production quality.
Two primary fault detection approaches are employed in blast furnace ironmaking: expert system-based methods and data-driven methods [3]. Expert system-based methods rely on process knowledge and operator expertise to identify abnormal states within the blast furnace [4]. They are easy to understand and widely used, making them the most commonly employed fault detection approach; however, they require substantial effort to establish and maintain a comprehensive knowledge base. The data-driven alternative uncovers relationships among variables in historical data to construct a fault detection model. Multivariate statistical process control methods such as principal component analysis (PCA) have been used to monitor the operational state of experimental blast furnaces with significant success [5]. Other machine-learning-based approaches, including autoencoders [6] and support vector machines [7,8,9], have also been widely employed. Because blast furnace ironmaking is a traditional continuous chemical process, its data typically exhibit characteristics such as nonlinearity, multimodality, and dynamic behavior. Zhou et al. [10] employed the kernel partial least squares (KPLS) method to address these characteristics, and Liu et al. [11] developed a probabilistic monitoring framework based on probabilistic linear discriminant analysis (PLDA). Additionally, Zhang et al. [2] demonstrated the efficacy of denoising autoencoders for extracting nonlinear features, with promising results in detecting furnace cooling faults. Traditional methods for the blast furnace ironmaking process have predominantly emphasized stable operating conditions. However, due to fluctuations in raw materials and changes in product proportions, the operating conditions of the blast furnace often vary. In such cases, a multimode fault detection method for blast furnace ironmaking becomes increasingly important [12]. Zhu et al. [13,14] enhanced monitoring models using sliding window techniques, and Zhou [15] proposed a fault detection method based on an improved independent component analysis (ICA) algorithm that weighs the importance of different samples, enabling more reasonable model updates. Despite promising results on simulation datasets, the application of these methods to blast furnace ironmaking is limited because sliding window methods discard useful information from long-term historical data.
This article presents a novel fault monitoring method for modeling the multimodal blast furnace system. The proposed method combines the variational autoencoder (VAE) framework with a Gaussian mixture model (GMM) to achieve clustering and fault monitoring objectives. The VAE learns the feature representation of nonlinear data and conducts cluster analysis on potential feature representations using the mixture distribution model. The model learns the latent features and reconstructs them as the observed multimodal blast furnace system data through the hybrid cluster VAE method. For fault detection, the method utilizes normal operational data for training and infers the probability density of reconstructed data by solving the posterior probability. Based on this, a probability monitoring index and its corresponding control limit are derived. The proposed method effectively addresses the challenges of nonlinearity and multimodality, rendering it suitable for fault detection in blast furnace ironmaking systems.

2. Variational Autoencoder and Gaussian Mixture Model

The variational autoencoder (VAE), a type of directed probabilistic graphical model implemented through variational Bayesian methods, has consistently demonstrated excellent performance in uncovering the latent features of data. Figure 1 depicts the structure of the variational autoencoder. Assume that an observed dataset $X \in \mathbb{R}^{N \times m}$ has been collected, with $N$ being the number of samples and $m$ the number of variables; let $x \in \mathbb{R}^{m}$ be a data sample and $z \in \mathbb{R}^{d}$ the latent feature vector, where $d$ is the number of latent features. The variational autoencoder comprises two primary components: the decoder $p_\theta(x \mid z)$ and the encoder $q_\phi(z \mid x)$, where $\theta$ and $\phi$ denote their parameters.
The decoder $p_\theta(x \mid z)$ of the variational autoencoder is commonly referred to as the generative model. It generates the high-dimensional observed variables $x$ from the latent variables $z$ under a prior distribution assumption. Through the trained network structure, the decoder performs the task of data reconstruction. Typically, the prior distribution of the latent variables is chosen to be the standard normal distribution $p(z) = \mathcal{N}(0, I)$, and the decoder can be described as a multivariate normal distribution:
$$ p_\theta(x \mid z) = \mathcal{N}\big(\mu(z;\theta),\ \mathrm{diag}(\sigma^2(z;\theta))\big) \tag{1} $$
Here, $\mu(z;\theta)$ and $\mathrm{diag}(\sigma^2(z;\theta))$ represent the mean vector and covariance matrix, respectively.
The encoder of the variational autoencoder, also referred to as the inferential model or identification model, maps the observed data to a lower-dimensional latent space through nonlinear dimensionality reduction. The encoder operates under similar assumptions as the generative model regarding the distribution of the observed variables. The specific model is as follows:
$$ q_\phi(z \mid x) = \mathcal{N}\big(\mu(x;\phi),\ \mathrm{diag}(\sigma^2(x;\phi))\big) \tag{2} $$
To estimate the unknown parameters in the variational autoencoder (VAE) model, the evidence lower bound (ELBO) is used as the cost function. The parameters are then determined by maximizing this variational lower bound:
$$ L(\theta, \phi) = \arg\max_{\theta, \phi}\ \Big( E_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - D_{KL}\big[q_\phi(z \mid x)\,\|\,p(z)\big] \Big) \tag{3} $$
Here, $D_{KL}(q\,\|\,p)$ represents the Kullback–Leibler (KL) divergence, which serves as a metric for measuring the difference between two probability distributions. It quantifies how one distribution diverges from another, providing a means to compare the dissimilarity between probability distributions [16].
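As a concrete reference, the following is a minimal PyTorch sketch of a Gaussian-latent VAE trained with the ELBO of Equation (3). The class name, layer sizes, and the `elbo` helper are illustrative assumptions, not the networks used in this paper.

```python
# Minimal sketch of a Gaussian-latent VAE and its ELBO loss (Eq. (3)).
# Layer sizes and names are illustrative assumptions.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, m=10, d=3, hidden=26):
        super().__init__()
        # Encoder q_phi(z|x): outputs mean and log-variance of z
        self.enc = nn.Sequential(nn.Linear(m, hidden), nn.Tanh())
        self.enc_mu = nn.Linear(hidden, d)
        self.enc_logvar = nn.Linear(hidden, d)
        # Decoder p_theta(x|z): outputs mean and log-variance of x
        self.dec = nn.Sequential(nn.Linear(d, hidden), nn.Tanh())
        self.dec_mu = nn.Linear(hidden, m)
        self.dec_logvar = nn.Linear(hidden, m)

    def forward(self, x):
        h = self.enc(x)
        mu_z, logvar_z = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu_z + torch.exp(0.5 * logvar_z) * torch.randn_like(mu_z)
        g = self.dec(z)
        return self.dec_mu(g), self.dec_logvar(g), mu_z, logvar_z

def elbo(x, mu_x, logvar_x, mu_z, logvar_z):
    # Expected log-likelihood of x under the Gaussian decoder (constants dropped)
    rec = -0.5 * (logvar_x + (x - mu_x) ** 2 / logvar_x.exp()).sum(dim=1)
    # KL[q_phi(z|x) || N(0, I)] in closed form
    kl = -0.5 * (1 + logvar_z - mu_z ** 2 - logvar_z.exp()).sum(dim=1)
    return (rec - kl).mean()   # maximize this (minimize its negative)
```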
The Gaussian mixture model (GMM) can be used to classify data into different categories according to their probability distribution. It assumes that the entire dataset is generated from a mixture of Gaussian distributions. Assuming that the model consists of $K$ Gaussian components, and given the parameters $\Theta$, the probability density of the observed data $Y = \{y_1, y_2, \ldots, y_N\}$, $Y \in \mathbb{R}^{N \times m}$, can be described as follows:
$$ p(y \mid \Theta) = \sum_{k=1}^{K} p(\vartheta_k)\, p(y \mid \vartheta_k) = \sum_{k=1}^{K} \pi_k\, f(y \mid \vartheta_k) \tag{4} $$
Here, $K$ represents the number of Gaussian components, $\pi_k \geq 0$, $\sum_{k=1}^{K} \pi_k = 1$, and $\pi_k$ is the weight of the $k$-th Gaussian sub-model. $f(y \mid \vartheta_k)$ is the density of $y$ under the $k$-th sub-model, with $\vartheta_k = (\mu_k, \Sigma_k)$ and $\Theta = \{\vartheta_1, \vartheta_2, \ldots, \vartheta_K\}$. Hence, $f(y \mid \vartheta_k)$ can be written as follows:
$$ f(y \mid \vartheta_k) = \frac{1}{\sqrt{(2\pi)^m |\Sigma_k|}} \exp\left( -\frac{(y - \mu_k)^T \Sigma_k^{-1} (y - \mu_k)}{2} \right) \tag{5} $$
where $m$ is the dimension of the data. The parameters $\Theta$ and $\Pi = \{\pi_1, \pi_2, \ldots, \pi_K\}$ of the Gaussian mixture model can be estimated by maximizing the log-likelihood over the training dataset:
$$ \Theta, \Pi = \arg\max_{\Theta, \Pi}\ \log L(\Theta, \Pi; Y) \tag{6} $$
Here, the log-likelihood function is
$$ \log L(\Theta, \Pi; Y) = \sum_{n=1}^{N} \log p(y_n \mid \Theta) = \sum_{n=1}^{N} \log \left( \sum_{k=1}^{K} \pi_k\, f(y_n \mid \vartheta_k) \right) \tag{7} $$
When dealing with the parameter estimation of the Gaussian mixture model, direct differentiation cannot be easily applied to maximize the log-likelihood function. Therefore, the Expectation–Maximization (EM) algorithm can be employed to solve the maximum likelihood estimation of the parameters in the probabilistic model with hidden variables. The EM algorithm iteratively updates the parameters by alternately performing the E-step (expectation) and M-step (maximization) until convergence is achieved. This iterative process allows for the estimation of the model parameters in the presence of hidden variables.
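For illustration, a compact NumPy sketch of the EM iterations for Equations (4)–(7) is given below; in practice a library routine such as `sklearn.mixture.GaussianMixture` performs the same computation. The function name, initialization, and defaults are assumptions.

```python
# Minimal EM sketch for fitting a K-component GMM (Eqs. (4)-(7)).
import numpy as np

def fit_gmm(Y, K, n_iter=100, reg=1e-6):
    N, m = Y.shape
    rng = np.random.default_rng(0)
    pi = np.full(K, 1.0 / K)                       # mixing weights pi_k
    mu = Y[rng.choice(N, K, replace=False)]        # component means mu_k
    cov = np.stack([np.cov(Y.T) + reg * np.eye(m)] * K)
    for _ in range(n_iter):
        # E-step: responsibilities r_{nk} proportional to pi_k * f(y_n | theta_k)
        log_r = np.stack([
            np.log(pi[k])
            - 0.5 * np.linalg.slogdet(2 * np.pi * cov[k])[1]
            - 0.5 * np.einsum('ni,ij,nj->n', Y - mu[k],
                              np.linalg.inv(cov[k]), Y - mu[k])
            for k in range(K)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate pi_k, mu_k, Sigma_k from the responsibilities
        Nk = r.sum(axis=0)
        pi = Nk / N
        mu = (r.T @ Y) / Nk[:, None]
        for k in range(K):
            diff = Y - mu[k]
            cov[k] = (r[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(m)
    return pi, mu, cov
```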

3. Multimodal Process Monitoring Method Based on Hybrid Cluster Variational Autoencoder

3.1. Model Structure and Parameter Estimation of the Hybrid Cluster Variational Autoencoders

In this section, the details of the hybrid cluster variational autoencoder model are presented. The model introduces a clustering effect on the latent variables, allowing for unsupervised clustering of the hidden-layer variables. This clustering capability facilitates fault monitoring of multimodal data during the generation process of the variational autoencoder. Furthermore, an enhanced version of the variational lower bound is derived for the proposed model. Moreover, a probability index for process monitoring is introduced to further enhance the fault monitoring capability. By considering the clustering aspect of the data, the proposed model effectively addresses the objective of the multimode blast furnace system.
In contrast to the traditional variational autoencoder, the proposed model integrates the variational autoencoder with a Gaussian mixture distribution, specifically designed to handle multimodal data. The aim is to ensure that data of the same mode exhibit proximity in the hidden layer, while data from different modes remain distinct.
In the variational autoencoder model, a common assumption is that the prior distribution $p(z)$ of the latent variables follows a standard normal distribution. However, this often restricts variational autoencoders to representing unimodal distributions, thereby limiting their ability to learn complex distributions with multimodal attributes. To enhance the model’s capacity to represent various forms of multimodal data, the prior distribution of the latent variables is constrained to be a Gaussian mixture distribution. This adaptation enables the model to combine different modes, in accordance with the characteristics of the Gaussian mixture distribution, during the latent representation of the observed variables. Within the Gaussian mixture distribution, distinct data categories are stratified by introducing a parameter-controlled variable $Y$. Similarly, different categories of observed variables can be delineated by introducing a categorical parameter $c$ in the proposed hybrid cluster variational autoencoder (HCVAE) model.
Within the formulated hybrid clustering variational autoencoder architecture, the latent variables encompass both a discrete variable c , signifying the category of the observed data, and a continuous latent variable z , a constituent also present in the conventional variational autoencoder.
Firstly, we describe the decoding part of the network; in this part, the hidden layer of the model represents the latent variable z . The observed sample x is generated from z and c .
The joint probability distribution, also known as the generative model, can be expressed as follows:
$$ p(x, z, c_k) = p_{\theta_x}(x \mid z)\, p_{\theta_z}(z \mid c_k)\, p(c_k) \tag{8} $$
Given this factorization, in which $x$ depends on $c$ only through $z$, the three factors can be defined as
$$ \begin{aligned} p(c_k) &= \mathrm{Cat}(c_k \mid \pi_k) \\ p_{\theta_z}(z \mid c_k) &= \mathcal{N}\big(\mu_k(c_k; \theta_z),\ \mathrm{diag}(\sigma_k^2(c_k; \theta_z))\big) \\ p_{\theta_x}(x \mid z) &= \mathcal{N}\big(\mu_x(z; \theta_x),\ \mathrm{diag}(\sigma_x^2(z; \theta_x))\big) \end{aligned} \tag{9} $$
The categories are generated through a categorical distribution $c_k \sim \mathrm{Cat}(\Pi)$. The latent vector is then generated according to the sampled category, and the original input data are reconstructed using the decoding part of the hybrid cluster variational autoencoder, $p_{\theta_x}(x \mid z)$, by sampling the latent variable $z$. Here, $K$ is a predetermined parameter representing the number of modes in the input space. The prior probability of class $k$ ($k = 1, \ldots, K$) is denoted by $\pi_k$, where $\pi \in \mathbb{R}_+^K$, $\sum_{k=1}^{K} \pi_k = 1$, $\mathrm{Cat}(\Pi)$ is the categorical distribution controlled by the parameters $\Pi$, and the prior is the uniform distribution $p(c_k) = 1/K$. The distribution $p(z) = \sum_{k=1}^{K} p_{\theta_z}(z \mid c_k)\, p(c_k)$ can therefore be regarded as a Gaussian mixture distribution. The means $\mu_k(c_k; \theta_z)$ and variances $\mathrm{diag}(\sigma_k^2(c_k; \theta_z))$ of each mixture component are given by networks driven by the input data, and their parameters are updated during training. Then, by sampling from the distribution $p_{\theta_z}(z \mid c_k)$, the latent variable $z$ is obtained, and the mean $\mu_x(z; \theta_x)$ and variance $\mathrm{diag}(\sigma_x^2(z; \theta_x))$ are computed by the decoder networks.
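To make this generation process concrete, the following is a hedged sketch of ancestral sampling from Equations (8) and (9); `decoder` stands for any trained network returning the decoder mean and log-variance, and all names are illustrative assumptions.

```python
# Sketch of ancestral sampling from the HCVAE generative model (Eqs. (8)-(9)):
# c ~ Cat(pi), z ~ N(mu_k, diag(sigma_k^2)), x ~ p_theta_x(x | z).
import torch

def sample_generative(pi, mu_c, logvar_c, decoder, n_samples=1):
    # pi: (K,) mixing weights; mu_c, logvar_c: (K, d) per-mode Gaussian parameters
    c = torch.multinomial(pi, n_samples, replacement=True)        # pick modes
    eps = torch.randn(n_samples, mu_c.shape[1])
    z = mu_c[c] + torch.exp(0.5 * logvar_c[c]) * eps               # sample z | c
    mu_x, logvar_x = decoder(z)                                    # p(x | z) parameters
    x = mu_x + torch.exp(0.5 * logvar_x) * torch.randn_like(mu_x)  # sample x | z
    return x, z, c
```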
With Equation (9), the logarithmic likelihood of the hybrid clustering variational autoencoder can be rederived as follows:
$$ \log p(x) = \log \int_z \sum_{c} p(x, z, c_k)\, dz \;\geq\; E_{q_\phi(z, c_k \mid x)}\left[ \log \frac{p(x, z, c_k)}{q_\phi(z, c_k \mid x)} \right] = L_{ELBO}(x) \tag{10} $$
where L E L B O ( x ) is the evidence lower bound, and q ϕ ( z , c k | x ) is the encoding part of the networks, which is the same as the recognition model.
Furthermore, for the encoding part of the hybrid cluster variational autoencoder, the joint distribution q ϕ ( z , c k | x ) can be expressed as
$$ q_\phi(z, c_k \mid x) = q_{\phi_z}(z \mid x)\, q_{\phi_c}(c_k \mid x) \tag{11} $$
The encoding model consists of two parts. $q_{\phi_c}(c_k \mid x)$ is obtained by a neural network with a Softmax classifier. For the distribution $q_{\phi_z}(z \mid x)$, the coding part of the variational autoencoder, whose outputs are the mean $\mu(x; \phi_z)$ and variance $\sigma^2(x; \phi_z)$, is used to obtain the latent variables, as described by Equation (12). The model structure is illustrated in Figure 2.
$$ q_{\phi_z}(z \mid x) = \mathcal{N}\big(\mu(x; \phi_z),\ \mathrm{diag}(\sigma^2(x; \phi_z))\big) \tag{12} $$
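The two-headed inference network of Equations (11) and (12) can be sketched as follows; the class name and layer sizes are assumptions (the defaults loosely follow the settings reported in Section 4 but remain illustrative).

```python
# Sketch of the HCVAE inference network: one branch gives q_{phi_z}(z|x) through
# mean/log-variance outputs, the other gives q_{phi_c}(c|x) through a Softmax head.
import torch.nn as nn

class HCVAEEncoder(nn.Module):
    def __init__(self, m=10, d=3, K=3, hidden=26):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(m, hidden), nn.Tanh())
        self.mu = nn.Linear(hidden, d)          # mean of q(z|x)
        self.logvar = nn.Linear(hidden, d)      # log-variance of q(z|x)
        self.cls = nn.Sequential(nn.Linear(hidden, K), nn.Softmax(dim=1))

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h), self.cls(h)   # q(z|x) params, q(c|x)
```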
Based on Equations (8), (9) and (11), the evidence lower bound of $\log p(x)$ can be rewritten as
$$ \begin{aligned} L_{ELBO}(x) &= E_{q_\phi(z, c_k \mid x)}\left[ \log \frac{p(x, z, c_k)}{q_\phi(z, c_k \mid x)} \right] \\ &= E_{q_\phi(z, c_k \mid x)}\big[ \log p(x, z, c_k) - \log q_\phi(z, c_k \mid x) \big] \\ &= E_{q_\phi(z, c_k \mid x)}\big[ \log p_{\theta_x}(x \mid z) + \log p_{\theta_z}(z \mid c_k) + \log p(c_k) - \log q_{\phi_z}(z \mid x) - \log q_{\phi_c}(c_k \mid x) \big] \end{aligned} \tag{13} $$
Then, by employing the stochastic gradient variational Bayes (SGVB) estimator and the reparameterization technique, $L_{ELBO}(x)$ can be evaluated as
$$ \begin{aligned} L_{ELBO}(x) ={} & \frac{1}{L} \sum_{l=1}^{L} \sum_{m=1}^{M} \Big[ x_m \log \mu_x^{(l)}\big|_m + (1 - x_m) \log\big(1 - \mu_x^{(l)}\big|_m\big) \Big] \\ & - \frac{1}{2} \sum_{k=1}^{K} q_{\phi_c}(c_k \mid x) \sum_{d=1}^{D} \left( \log \sigma_k^2\big|_d + \frac{\sigma^2\big|_d}{\sigma_k^2\big|_d} + \frac{\big(\mu\big|_d - \mu_k\big|_d\big)^2}{\sigma_k^2\big|_d} \right) \\ & + \sum_{k=1}^{K} q_{\phi_c}(c_k \mid x) \log \frac{\pi_k}{q_{\phi_c}(c_k \mid x)} + \frac{1}{2} \sum_{d=1}^{D} \big( 1 + \log \sigma^2\big|_d \big) \end{aligned} \tag{14} $$
Here, $L$ is the number of Monte Carlo samples in the SGVB estimation, $M$ is the dimension of the data $x$, $x_m$ represents the $m$-th element of $x$, $D$ is the dimension of $\mu_k$, $\sigma_k^2$, $\mu$, and $\sigma^2$, the subscript $d$ denotes the $d$-th element, and $\pi_k$ is the prior probability of class $c_k$.
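A hedged sketch of evaluating Equation (14) for a single sample is given below, assuming Bernoulli-decoded data and pre-computed encoder/decoder outputs; the tensor names and shapes are assumptions rather than the paper's implementation.

```python
# Sketch of the per-sample ELBO of Eq. (14).
import torch

def hcvae_elbo(x, mu_x_samples, q_c, mu, logvar, mu_k, logvar_k, pi, eps=1e-10):
    # x:              (M,)   observed sample (values in [0, 1] for the Bernoulli term)
    # mu_x_samples:   (L, M) decoder means for L Monte Carlo draws of z
    # q_c:            (K,)   q_{phi_c}(c_k | x)
    # mu, logvar:     (D,)   parameters of q_{phi_z}(z | x)
    # mu_k, logvar_k: (K, D) per-mode Gaussian parameters of p(z | c_k)
    # pi:             (K,)   prior mode probabilities
    rec = (x * torch.log(mu_x_samples + eps)
           + (1 - x) * torch.log(1 - mu_x_samples + eps)).sum(dim=1).mean()
    var, var_k = logvar.exp(), logvar_k.exp()
    kl_z = 0.5 * (q_c * (logvar_k + var / var_k
                         + (mu - mu_k) ** 2 / var_k).sum(dim=1)).sum()
    kl_c = (q_c * torch.log(q_c / (pi + eps) + eps)).sum()
    ent = 0.5 * (1 + logvar).sum()
    return rec - kl_z - kl_c + ent   # to be maximized during training
```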
Once the network is constructed, an issue arises if the latent space is directly sampled and linked with the decoder as it outputs the mean and variance. This situation can lead to difficulties in the network’s backpropagation process, hindering gradient propagation. To address this, a reparameterization trick is employed, which introduces a random noise variable to reparameterize the latent variable sampling process.
In the encoding part of the variational autoencoder, the mean and the variance can be obtained from the networks:
$$ \mu_z^{(l)},\ \sigma_z^{2(l)} = f\big(z^{(l)}; \theta_z\big) \tag{15} $$
where z ( l ) is the l-th sample obtained from Monte Carlo sampling of the distribution q ϕ z ( z | x ) . By employing the reparameterization technique, the expression for z ( l ) can be given as follows:
$$ z^{(l)} = \mu_z^{(l)} + \sigma_z^{(l)} \odot \varepsilon^{(l)} \tag{16} $$
where ε ( l ) follows the standard normal distribution, and the mean μ z ( l ) and variance parameters σ z ( l ) are obtained from the encoder.
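A minimal illustration of the reparameterization step of Equation (16) is given below, showing that gradients flow back to the encoder outputs; the tensor sizes are arbitrary.

```python
# Reparameterization trick (Eq. (16)): z = mu + sigma * eps keeps the sampling
# path differentiable with respect to the encoder outputs.
import torch

mu_z = torch.zeros(3, requires_grad=True)       # encoder mean (illustrative)
logvar_z = torch.zeros(3, requires_grad=True)   # encoder log-variance
eps = torch.randn(3)                            # eps ~ N(0, I), no gradient needed
z = mu_z + torch.exp(0.5 * logvar_z) * eps      # gradients flow to mu_z, logvar_z
z.sum().backward()                              # backpropagation works through z
print(mu_z.grad, logvar_z.grad)
```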
Furthermore, when dealing with the evidence lower bound, the objective is to maximize it in order to optimize the parameters.
$$ \begin{aligned} L_{ELBO}(x) &= E_{q_\phi(z, c_k \mid x)}\left[ \log \frac{p(x, z, c_k)}{q_\phi(z, c_k \mid x)} \right] \\ &= \int_z \sum_{c} q_{\phi_c}(c_k \mid x)\, q_{\phi_z}(z \mid x) \left[ \log \frac{p_{\theta_x}(x \mid z)\, p(z)}{q_{\phi_z}(z \mid x)} + \log \frac{p(c_k \mid z)}{q_{\phi_c}(c_k \mid x)} \right] dz \\ &= \int_z q_{\phi_z}(z \mid x) \log \frac{p_{\theta_x}(x \mid z)\, p(z)}{q_{\phi_z}(z \mid x)}\, dz - \int_z q_{\phi_z}(z \mid x)\, D_{KL}\big(q_{\phi_c}(c_k \mid x) \,\|\, p(c_k \mid z)\big)\, dz \end{aligned} \tag{17} $$
In this equation, the first term is independent of $c$, and the second term is non-negative. Therefore, to maximize the evidence lower bound, $D_{KL}\big(q_{\phi_c}(c \mid x)\,\|\,p(c \mid z)\big)$ should be driven to zero. In this case, $q_{\phi_c}(c_k \mid x)$ can be expressed as follows:
$$ q_{\phi_c}(c_k \mid x) = p(c_k \mid z) = \frac{p(c_k)\, p_{\theta_z}(z \mid c_k)}{\sum_{k'=1}^{K} p(c_{k'})\, p_{\theta_z}(z \mid c_{k'})} \tag{18} $$
The above formula reasonably addresses the problem of mode identification in input data.
When estimating the maximized ELBO, the potential representation z can be obtained through sampling. At this point, it becomes possible to determine which class the input sample belongs to by examining the distribution q. By analyzing the parameters of q, one can assign the input data to a specific class or cluster based on the highest probability. This allows for classification or clustering of the data based on the learned representations in the latent space.
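The mode-assignment rule of Equation (18) reduces to computing Gaussian-mixture responsibilities for a sampled $z$; a small NumPy sketch, with all argument names assumed, is given below.

```python
# Responsibilities of Eq. (18): pi_k * N(z; mu_k, diag(sigma_k^2)) normalized over k.
import numpy as np

def mode_responsibilities(z, pi, mu_k, var_k):
    # z: (d,), pi: (K,), mu_k / var_k: (K, d)
    log_p = (np.log(pi)
             - 0.5 * np.sum(np.log(2 * np.pi * var_k), axis=1)
             - 0.5 * np.sum((z - mu_k) ** 2 / var_k, axis=1))
    log_p -= log_p.max()                 # numerical stability
    resp = np.exp(log_p)
    resp /= resp.sum()
    return resp, int(np.argmax(resp))    # posterior over modes, assigned mode index
```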

3.2. Process Monitoring Method

The proposed process monitoring approach, grounded in the HCVAE model, is constructed within an unsupervised learning framework. During the model training phase, only normal data are employed, and the network parameters are iteratively refined by optimizing the variational lower bound. For the data to be monitored, the process data are mapped into the latent variable space by the learned model $q_\phi(z, c_k \mid x)$. Subsequently, the likelihood of the input data belonging to each category is computed using Formula (18), followed by sampling from the latent variable space. The sampled data are then passed through the decoder to estimate the probability density of the observed data, which serves as the statistical index for process fault monitoring. By comparing the estimated density with control limits determined through Monte Carlo sampling, process faults can be detected.
Based on the generation process of the VAE model described earlier, the marginal probability density of the observed variable x can be calculated as
$$ p(x) = \int_z \sum_{c} p(x, z, c_k)\, dz \tag{19} $$
For a normal sample $x$ of a given mode, the probability mass of $p_\theta(x \mid z)$ is concentrated in a high-probability region. To sample from this concentrated distribution, importance sampling with the concentrated posterior distribution can be used, and the formula can be rewritten as
$$ p(x) = E_{q_{\phi_z}(z \mid x)}\left[ \frac{p(x, z, c_k)}{q_{\phi_z}(z \mid x)} \right] \tag{20} $$
It can subsequently be approximated by sampling:
$$ p(x) \approx \frac{1}{I} \sum_{i=1}^{I} \frac{p_{\theta_x}(x \mid z_i)\, p_{\theta_z}(z_i \mid c_k)\, p(c_k)}{q_{\phi_z}(z_i \mid x)} \tag{21} $$
In this formulation, $I$ is the number of importance samples, and $z_i$ denotes the $i$-th sample drawn from the concentrated posterior distribution $q_{\phi_z}(z \mid x)$. This approximation allows us to estimate the marginal probability density of the observed variable $x$ by leveraging importance sampling and exploiting the concentration of the posterior distribution.
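A possible implementation of the importance-sampling estimate in Equation (21) is sketched below; `encode`, `decode_logpdf`, and `prior_logpdf` are assumed helper functions wrapping the trained networks, not part of the original method's code.

```python
# Importance-sampling estimate of p(x) (Eq. (21)).
import numpy as np

def estimate_px(x, encode, decode_logpdf, prior_logpdf, n_samples=50):
    # encode(x) -> (mu, var) of q_{phi_z}(z|x), each of shape (d,)
    # decode_logpdf(x, z) -> log p(x|z) for each row of z, shape (n_samples,)
    # prior_logpdf(z) -> log of the GMM prior sum_k p(z|c_k) p(c_k), shape (n_samples,)
    mu, var = encode(x)
    z = mu + np.sqrt(var) * np.random.randn(n_samples, mu.shape[0])
    log_q = -0.5 * np.sum(np.log(2 * np.pi * var) + (z - mu) ** 2 / var, axis=1)
    log_w = decode_logpdf(x, z) + prior_logpdf(z) - log_q    # importance weights
    m = log_w.max()                                          # log-sum-exp for stability
    return np.exp(m) * np.mean(np.exp(log_w - m))            # estimated p(x)
```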
Drawing upon the principles of probability and statistics, the smaller the probability p ( x ) , the higher the likelihood of a fault occurrence. Therefore, p ( x ) can be utilized for fault detection. In the context of monitoring, it is crucial to establish control limits for detecting potential abnormal behavior. By defining confidence limits, we can determine the control limits of probabilities using the following integral [17]:
$$ \int_{h} p(x)\, dx = 1 - \alpha \tag{22} $$
In this equation, $h$ represents the threshold value corresponding to the confidence level $1 - \alpha$. By solving this integral equation, we can determine the control limit of the probability, enabling the identification of potential abnormal behavior in the monitored process.
As for the aforementioned formula, direct integration can be challenging, so we can approximate it by converting the integral into a Markov Chain Monte Carlo (MCMC) sampling method. In the context of the VAE mixture model, both the encoding process and decoding process can be viewed as a simple Markov chain. Through the MCMC sampling process, a large number of samples with complex distributions can be generated [18].
By utilizing the MCMC samples, we can estimate the control limit $h$ through an approximate integral: obtain $S$ samples using MCMC sampling, calculate the likelihood $p(x_j)$ of each sample $x_j$, and sort the results in descending order. The control limit is then obtained as $h = p(x_l)$, where $l = S(1 - \alpha)$.
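The control-limit procedure can be sketched as follows, assuming a sampler of normal operating data and a likelihood estimator such as the `sample_generative`/`estimate_px` sketches above; the function names and sample count are assumptions.

```python
# Sampling-based control limit: draw S samples, evaluate their likelihoods,
# sort in descending order, and take h at position l = S * (1 - alpha).
import numpy as np

def control_limit(sampler, likelihood, S=5000, alpha=0.05):
    samples = sampler(S)                                   # S generated normal samples
    p = np.sort([likelihood(x) for x in samples])[::-1]    # likelihoods, descending
    l = int(np.floor(S * (1 - alpha))) - 1                 # index of the (1-alpha) quantile
    return p[l]                                            # control limit h = p(x_l)
```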
Since p ( x ) is a non-negative number between 0 and 1, we can convert it into its negative logarithm to better visualize the monitoring results. This transformation allows us to express the monitoring index as follows:
$$ L_p(x) = -\log p(x) \tag{23} $$
The control limit is redefined as $h = -\log p(x_l)$. If $L_p(x) > h$, the sample $x$ is considered faulty.
The process monitoring strategy for blast furnace ironmaking, which employs a hybrid clustering variational autoencoder for multimode data, consists of two distinct phases: offline modeling and online monitoring. The offline modeling phase involves the following steps:
Step 1: Collect a sufficient number of samples with modal labels to construct a database for model training.
Step 2: Standardize the data to ensure the data have zero mean and unit variance.
Step 3: Construct the network proposed in Section 3 and train it using the prepared dataset.
Here are the steps involved in constructing and training the network:
  • Network Architecture: Design the architecture of the hybrid clustering variational autoencoder model. This includes defining the number of layers, the size of each layer, and the activation functions to be used. The model should have separate encoding and decoding parts, with the clustering component integrated into the hidden layer.
  • Training Data Preparation: Split the collected and standardized dataset into training and validation sets. The training set is used to update the model parameters.
  • Validation and Hyperparameter Tuning: tune hyperparameters such as the learning rate, batch size, and the number of clusters on the validation set through techniques like cross-validation or grid search.
Step 4: Compute the monitoring index and determine the control limit using Equation (22).
The online monitoring procedure consists of the following steps:
Step 1: Collect the monitored sample and standardize it.
Step 2: Input the data into the network and calculate the monitoring index using Equations (21) and (23). If $L_p(x) > h$, the sample is considered faulty; otherwise, it is considered normal. A minimal sketch of this online check is given below.
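The sketch below illustrates the online check under stated assumptions: `estimate_px` stands for a likelihood estimator like the one sketched in Section 3.2 (with the trained networks already bound), and the variable names are illustrative.

```python
# Online monitoring check: standardize a new sample with the training mean/std,
# estimate its likelihood, and flag a fault when -log p(x) exceeds the limit h.
import numpy as np

def monitor_sample(x_raw, train_mean, train_std, estimate_px, h):
    x = (x_raw - train_mean) / train_std            # Step 1: standardization
    lp = -np.log(estimate_px(x) + 1e-300)           # monitoring index L_p(x), Eq. (23)
    return lp, lp > h                                # True -> sample flagged as faulty
```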

4. Application to a Real Blast Furnace

This section focuses on the analysis of actual production data collected from a blast furnace in China. The blast furnace under consideration has an inner volume of 2500 m³. For the training process, 1500 samples of 10 process variables were used, each collected at a sampling interval of 2 min. These variables are $x_1$ (quality of blast), $x_2$ (temperature of blast), $x_3$ (pressure of blast), $x_4$ (quantity of coal powder), $x_5$ (top pressure), $x_6$ (permeability index), $x_7$ (quantity of blast oxygen), $x_8$ (coke ratio), $x_9$ (CO concentration), and $x_{10}$ (CO$_2$ concentration) [11]. The samples cover process data from three different modes; each training sample is labeled with the corresponding mode, indicating the operating state of the blast furnace during that period. The number of samples in each mode is listed in Table 1.
The test set consists of 1000 samples, also collected at a sampling interval of 2 min. The fault occurs after the first 500 sample points in the test set. The abnormality is primarily attributed to the fluctuation of the CO concentration ($x_9$) in the flue gas, caused by excessive coal powder ($x_4$) [19]. Figure 3 shows the time series of the 1000 samples.
Figure 3 clearly shows a rapid increase in variables $x_7$ and $x_8$ and a decrease in the CO concentration ($x_9$) over the last 500 points. However, the mode to which each sample belongs cannot be determined directly from the multimode time series alone.
The proposed HCVAE model adopts a network structure consisting of five layers: an input layer, two intermediate layers, a hidden layer, and an output layer. The size of the intermediate layers is set to 26. The dimension of the latent variable in the HCVAE was set to three by 5-fold cross-validation, which minimized the reconstruction error. During model training, the Adaptive Moment Estimation (ADAM) optimizer was employed with a learning rate of 0.001, the batch size was set to 32, and the activation function used in the model was the hyperbolic tangent (Tanh). After training, mode identification and fault monitoring were performed on a dataset that includes fault samples. The mode identification results are depicted in Figure 4, illustrating the successful identification of three modes. The results indicate that data belonging to the same mode exhibit concentration and continuity, mirroring the production process of blast furnace ironmaking, and that the model effectively captures mode transitions. These findings validate the reasonable and effective mode identification capability of the HCVAE model.
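The training settings reported above can be expressed as the following illustrative configuration sketch; the `HCVAE` wrapper and its `elbo` method are assumptions standing in for the encoder/decoder sketches given earlier, not code released with the paper.

```python
# Illustrative training loop matching the reported settings: Tanh activations,
# hidden width 26, latent dimension 3, Adam with learning rate 0.001, batch size 32.
import torch
from torch.utils.data import DataLoader, TensorDataset

def train(model, X_train, epochs=100):
    loader = DataLoader(TensorDataset(torch.tensor(X_train, dtype=torch.float32)),
                        batch_size=32, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        for (xb,) in loader:
            loss = -model.elbo(xb)      # maximize the ELBO = minimize its negative
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```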
Notably, the mode identification results from the 500th to 1000th samples exhibit more pronounced anomalous identification compared with the normal operating state. This suggests that faults influence the data’s process characteristics, consequently affecting the mode identification results. Regarding fault monitoring, a unified statistic for fault monitoring was constructed. Figure 5 illustrates the monitoring results with a 95% confidence limit. It can be observed that at the 500th sample point, there is a sudden increase in the statistic, and subsequent sample data consistently exceed the confidence limit, indicating a malfunction in the blast furnace ironmaking system during operation.
For comparison, the probabilistic principal component analysis based on local nearest neighbor standardization (LNSPPCA) method proposed in reference [20] and the recursive probabilistic principal component analysis (RPPCA) method proposed in reference [21] were tested. The LNSPPCA method identifies the K nearest neighbors of each sample point in the dataset and forms a neighbor set; the mean and standard deviation of the neighbor set are used to standardize the current sample, followed by fault monitoring using the PPCA method. For the blast furnace system, the number of principal components in the LNSPPCA method was set to three based on 5-fold cross-validation. The RPPCA method is grounded in the analysis of the singular values of the historical data matrix. It segments the entire process into distinct steady modes and mode transitions; as the process shifts from one mode to another, the RPPCA algorithm recursively updates the model parameters [21]. Its number of principal components was set to three based on 10-fold cross-validation. The confidence level of both methods was set to 95%. The fault monitoring results of LNSPPCA are shown in Figure 6, and those of RPPCA are shown in Figure 7.
Comparing Figure 5, Figure 6 and Figure 7, it is evident that the HCVAE monitoring yields the best detection results. Although all the methods effectively identify faults after the 500th sample, consistent with the excessive coal injection fault indicated by the dataset, a closer examination of the statistical indicators reveals that LNSPPCA exhibits a higher rate of false positives, with a monitoring error rate of 10.5% at the 95% confidence level. In contrast, the HCVAE method produces no missed detections and almost no false alarms. The proposed hybrid cluster variational autoencoder model outperforms the LNSPPCA method by accounting for the multimodality and nonlinearity of the blast furnace ironmaking system, whereas the LNSPPCA method assumes a linear subspace for the underlying data distribution, which often leads to inadequate fault detection performance on industrial data. In addition, as shown in Figure 7, the RPPCA method has a low false alarm rate, with only one of the normal samples erroneously identified as faulty; this is comparable to the HCVAE method, which exhibits a false positive rate of only 0.2%. However, the RPPCA approach proves ineffective when confronted with faulty samples, as it suffers a considerable number of missed detections. Specifically, the rate of missed detections, whereby faulty samples are incorrectly classified as normal, reaches as high as 19.2%. The efficacy of the RPPCA method relies heavily on the accuracy of its mode classification, a dependency that can contribute to a significant error rate and renders the method unsuitable for practical applications. The hybrid cluster variational autoencoder model employs nonlinear dimensionality reduction through neural networks for fault monitoring of multimodal blast furnace data, which fundamentally demonstrates its suitability for handling complex industrial process data. The experimental results validate the effectiveness of the proposed monitoring method and strategy for fault monitoring of the complex multimodal blast furnace ironmaking process.

5. Conclusions

This paper presents a novel fault monitoring method based on a hybrid cluster variational autoencoder model, aiming to address the challenge of multimode process fault monitoring in blast furnace ironmaking. In contrast to traditional methods, a neural network is used to learn the data features. As production conditions change, data from different modes frequently display distinct feature patterns. By harnessing the clustering process within the hidden layer of the variational autoencoder, the method effectively monitors faults in multimode data. Based on the proposed model, a unified monitoring index was established and a calculation method for the control limit was determined. The application of this method to blast furnace ironmaking demonstrates its effectiveness in identifying faults occurring across different modes in the multimode process. The approach properly accommodates the nonlinearity and multimodality of the blast furnace ironmaking process, leading to better fault detection capability than the comparison methods.

Author Contributions

Conceptualization, C.C. and J.C.; Software, C.C.; Formal analysis, J.C.; Writing—original draft, C.C.; Writing—review & editing, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data Availability Statement

The data are not publicly available due to sensitivity concerns.

Acknowledgments

Thanks for the care and support of all authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, J.; Hua, C.; Yang, Y.; Guan, X. Bayesian Block Structure Sparse Based T–S Fuzzy Modeling for Dynamic Prediction of Hot Metal Silicon Content in the Blast Furnace. IEEE Trans. Ind. Electron. 2017, 65, 4933–4942. [Google Scholar] [CrossRef]
  2. Zhang, T.; Wang, W.; Ye, H.; Huang, D.; Zhang, H.; Li, M. Fault detection for ironmaking process based on stacked denoising autoencoders. In Proceedings of the 2016 American Control Conference (ACC), Boston, MA, USA, 6–8 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 3261–3267. [Google Scholar]
  3. Yan, F.; Zhang, X.; Yang, C.; Hu, B.; Qian, W.; Song, Z. Data-driven modelling methods in sintering process: Current research status and perspectives. Can. J. Chem. Eng. 2022, 101, 4506–4522. [Google Scholar] [CrossRef]
  4. Otsuka, K.; Matoba, Y.; Kajiwara, Y.; Kojima, M.; Yoshida, M. A hybrid expert system combined with a mathematical model for blast furnace operation. ISIJ Int. 1990, 30, 118–127. [Google Scholar] [CrossRef]
  5. Vanhatalo, E. Multivariate process monitoring of an experimental blast furnace. Qual. Reliab. Eng. Int. 2010, 26, 495–508. [Google Scholar] [CrossRef]
  6. Qian, J.; Song, Z.; Yao, Y.; Zhu, Z.; Zhang, X. A review on autoencoder based representation learning for fault detection and diagnosis in industrial processes. Chemometr. Intell. Lab. Syst. 2022, 231, 104711. [Google Scholar] [CrossRef]
  7. Saxén, H.; Lassus, L.; Seppänen, M.; Karjalahti, T. Pattern recognition and classification of blast furnace wall temperatures. Ironmak. Steelmak. 2000, 27, 207–211. [Google Scholar]
  8. Zhang, T.; Ye, H.; Wang, W.; Zhang, H. Fault diagnosis for blast furnace ironmaking process based on two-stage principal component analysis. ISIJ Int. 2014, 54, 2334–2341. [Google Scholar]
  9. Tian, H.; Wang, A. A novel fault diagnosis system for blast furnace based on support vector machine ensemble. ISIJ Int. 2010, 50, 738–742. [Google Scholar] [CrossRef]
  10. Zhou, P.; Zhang, R.; Liang, M.; Fu, J.; Wang, H.; Chai, T. Fault identification for quality monitoring of molten iron in blast furnace ironmaking based on KPLS with improved contribution rate. Control Eng. Pract. 2020, 97, 104354. [Google Scholar] [CrossRef]
  11. Liu, Y.; Zeng, J.; Bao, J.; Xie, L. A unified probabilistic monitoring framework for multimode processes based on probabilistic linear discriminant analysis. IEEE Trans. Ind. Inform. 2020, 16, 6291–6300. [Google Scholar] [CrossRef]
  12. Zhou, B.; Ye, H.; Zhang, H.; Li, M. Process monitoring of iron-making process in a blast furnace with PCA-based methods. Control Eng. Pract. 2016, 47, 1–14. [Google Scholar] [CrossRef]
  13. Wang, L.; Yang, C.; Sun, Y.; Zhang, H.; Li, M. Effective variable selection and moving window HMM-based approach for iron-making process monitoring. J. Process Control 2018, 68, 86–95. [Google Scholar] [CrossRef]
  14. Zhou, P.; Zhang, R.; Xie, J.; Liu, J.; Wang, H.; Chai, T. Data-driven monitoring and diagnosing of abnormal furnace conditions in blast furnace ironmaking: An integrated PCA-ICA method. IEEE Trans. Ind. Electron. 2020, 68, 622–631. [Google Scholar] [CrossRef]
  15. Doersch, C. Tutorial on variational autoencoders. arXiv 2016, arXiv:1606.05908. [Google Scholar] [CrossRef]
  16. Chen, T.; Sun, Y. Probabilistic contribution analysis for statistical process monitoring: A missing variable approach. Control Eng. Pract. 2009, 17, 469–477. [Google Scholar] [CrossRef]
  17. Yang, Y.; Ma, Y.; Song, B.; Shi, H. An aligned mixture probabilistic principal component analysis for fault detection of multimode chemical processes. Chin. J. Chem. Eng. 2015, 23, 1357–1363. [Google Scholar] [CrossRef]
  18. Tang, P.; Peng, K.; Jiao, R. A process monitoring and fault isolation framework based on variational autoencoders and branch and bound method. J. Frankl. Inst. 2022, 359, 1667–1691. [Google Scholar] [CrossRef]
  19. Cai, J.; Zeng, J.; Luo, S. A state space model for monitoring of the dynamic blast furnace system. ISIJ Int. 2012, 52, 2194–2199. [Google Scholar] [CrossRef]
  20. Wang, K.; Forbes, M.G.; Gopaluni, B.; Chen, J.; Song, Z. Systematic development of a new variational autoencoder model based on uncertain data for monitoring nonlinear processes. IEEE Access 2019, 7, 22554–22565. [Google Scholar] [CrossRef]
  21. Zhang, Z.; Peng, B.; Xie, L.; Peng, L. Process monitoring based on recursive probabilistic PCA for multi-mode process. IFAC-Pap. 2015, 8, 1294–1299. [Google Scholar]
Figure 1. The structure of the variational autoencoder.
Figure 2. Schematic of the HCVAE model.
Figure 3. The test series data of 10 variables.
Figure 4. The mode identification using HCVAE for blast furnace data.
Figure 5. Monitoring results of blast furnace fault using HCVAE.
Figure 6. Monitoring results of blast furnace fault using LNSPPCA.
Figure 7. Monitoring results of blast furnace fault using RPPCA.
Table 1. The number of samples in each modality.
Modes	The Number of Samples
Mode 1	356
Mode 2	357
Mode 3	787