Article

Tri-Partition Alphabet-Based State Prediction for Multivariate Time-Series

1 School of Information and Engineering, Sichuan Tourism University, Chengdu 610100, China
2 School of Resources and Environment, University of Electronic Science and Technology of China, Chengdu 611731, China
3 School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China
4 School of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2021, 11(23), 11294; https://doi.org/10.3390/app112311294
Submission received: 15 October 2021 / Revised: 24 November 2021 / Accepted: 24 November 2021 / Published: 29 November 2021
(This article belongs to the Special Issue Soft Computing Application to Engineering Design)

Abstract:
Recently, predicting multivariate time-series (MTS) has attracted much attention as a way to obtain richer semantics with similar or better performance. In this paper, we propose a tri-partition alphabet-based state (tri-state) prediction method for symbolic MTSs. First, for each variable, the set of all symbols, i.e., the alphabet, is divided into strong, medium, and weak regions using two user-specified thresholds. With the tri-partitioned alphabet, the tri-state takes the form of a matrix: one dimension ranges over all of the variables, and the other is a feature vector containing the most likely occurring strong, medium, and weak symbols. Second, a tri-partition strategy based on the deviation degree is proposed. We introduce the piecewise and symbolic aggregate approximation techniques to polymerize and discretize the original MTS. In this way, the stronger a symbol is, the larger its deviation. Moreover, most popular numerical or symbolic similarity or distance metrics can be combined. Third, we propose an along–across similarity model to obtain the k-nearest matrix neighbors. This model considers the associations among the time stamps and the variables simultaneously. Fourth, we design two post-filling strategies to obtain a completed tri-state. The experimental results on datasets from four domains show that (1) the tri-state has greater recall but lower precision; (2) the two post-filling strategies can slightly improve the recall; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is recommended first for new datasets.

1. Introduction

Time-series analysis [1] has long been a subject that has attracted researchers from a diverse range of fields, including pattern discovery [2,3,4,5], clustering [6,7,8], classification [9,10], prediction [11], causality [12], and anomaly detection [13]. Time-series prediction is one of the most sought-after yet, arguably, the most challenging tasks [11]. It has played an important role in a wide range of fields, including the industrial [14], financial [15], health [16], traffic [17,18], and environmental [19] fields for several decades. For multivariate time-series (MTSs), existing methods inherently assume interdependencies among variables. In other words, each variable not only depends on its historical values but also on other variables. To efficiently and effectively exploit latent interdependencies among variables, many techniques such as deep learning-based ones [14,19,20,21,22], the matrix or tensor decomposition-based ones [23,24], the k-nearest neighbor (kNN)-based ones [15,17,18,21], and others [16,25,26,27] have been proposed. However, obtaining richer semantics with similar or better performances is meaningful but rare.
The trisecting–acting–outcome (TAO) model [28] of thinking in threes [29] to understand and process a whole via three distinct and related parts [30] has inspired many novel and significant theories and applications. Recently, theories such as three-way formal concept analysis [31] and three-way cognition computing [32,33] have focused on concept learning via multi-granularity from the viewpoint of cognition. The three-way fuzzy sets method [34], three-way decisions space [35], sequential three-way decisions [36], and generalized three-way decision models [37,38,39] have been proposed. Moreover, applications include the three-way recommender system [40], three-way active learning [41], three-way clustering [42], tri-partition neighborhood covering reduction [43], three-way spam filtering [44], three-way face recognition [45], and  the tri-alphabet-based sequence pattern [46]. However, the extension of TAO to MTS prediction needs to be studied in depth.
In this paper, a tri-partition alphabet-based state (tri-state) prediction method for symbolic multivariate time-series (MTS) is proposed. First, with the symbolic aggregate approximation (SAX) [47] technique, g symbols are generated from the piecewise aggregate approximation (PAA) [13] version of the MTS and the hypothesis of a probability distribution function. The most common standard normal distribution, i.e., $N(0, 1)$, is used here. Hence, the $g - 1$ breakpoints can be obtained by partitioning the area under $N(0, 1)$ into g equal parts. As these breakpoints also quantify the degree of deviation from the expectation, the two thresholds α and β ($\alpha \ge \beta > 0$) can be specified from them. Hence, if the absolute value of a symbol's breakpoint is not less than α, the symbol is called a strong element. If the absolute value is less than β, the symbol is called a weak element. Otherwise, the symbol is called a medium element. This way, for each variable of the given MTS, its alphabet, i.e., the set of symbols, is partitioned into the strong, medium, and weak regions.
Second, on the basis of the tri-partitioned alphabet, the predicted tri-state takes the form of a matrix of size $3 \times n$ (n is the number of variables). For each variable, we simultaneously predict the most likely occurring symbol from each of the strong, medium, and weak regions. The state defined by the existing work only covers one case, while the tri-state includes up to $3^n$ cases. Note that our method does not simply take the top three most likely occurring symbols as the prediction result, because the deviation degree provides some new, orthogonal information. This way, outliers are more noticeable for users.
Third, an along–across similarity model to generate the k-nearest matrix neighbors (kNMN) is presented. The along similarity considers the associations among the time stamps, while the across similarity focuses on the relations between the variables. Additionally, with the PAA- and SAX-MTSs, the most popular numerical or symbolic metrics can be combined, regardless of whether they are similarities or distances. Given a sliding window w, the PAA- and SAX-MTSs can be transformed into $m - w + 1$ temporal subsequences, called instances, where m is the number of time stamps; all instances are matrices with the shape $w \times n$. Moreover, the latest state following each instance is denoted as the decision information, called the label. With the optimal k labels from the $m - w$ labeled instances, the tri-state can finally be predicted using the traditional voting strategy.
Fourth, two post-filling strategies, called the individual and related ones, are designed to fill the possibly missing symbols of each variable. The reason the tri-state may be incomplete is that no strong, medium, or weak symbols occur after all of the matrix instances. For brevity, given a tri-state, we assume that the strong symbol of its i-th variable ($a_i$) is missing. The individual filling strategy (IFS) directly scans the historical data of $a_i$ to obtain the most frequently occurring strong symbol. The related filling strategy (RFS) considers the associations between $a_i$ and the other $n - 1$ variables: the one that is most linearly related to $a_i$ serves as its condition.
The main contributions of this paper are presented as follows:
  • Tri-state. It provides three kinds of symbols for each variable simultaneously. The proposed deviation degree-based alphabet tri-partition strategy makes the outliers more noticeable for experts. Moreover, the IFS and RFS are designed to obtain a completed tri-state.
  • Along–across similarity model. The similarities between time stamps and variables are considered simultaneously. This model provides a framework for the integration of the popular similarity or distance metrics.
  • Combination of the popular numerical or symbolic metrics. The PAA- and SAX-MTSs are simultaneously used in the above similarity model. The PAA-MTS is available for the numerical metrics, while the SAX-MTS fits the symbolic ones.
The experimental results on four real-world datasets show that (1) in terms of precision, the state is 30% to 50% higher than the three kinds of tri-states, while in terms of recall, the three kinds of tri-states are 10% higher than the state; (2) the IFS and RFS can slightly improve the recall, by approximately 1%; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is recommended first for new datasets. Note that the IFS and RFS are necessary only if the tri-state is incomplete. In other words, when the obtained tri-state is already complete, there is no difference among the three kinds of tri-states.
The rest of this paper is organized as follows. Section 2 reviews the existing work on time-series prediction. Section 3 presents the fundamental definitions of the tri-state. Section 4 proposes the algorithm for tri-state prediction. Section 5 discusses the performance of the prediction algorithm on four real-world datasets. Section 6 lists the conclusions and future work of this paper.

2. Time-Series Prediction

Various techniques have been proposed for predicting time-series. These methods can be categorized into the deep learning-based ones [14,19,20,21,22], matrix or tensor decomposition-based ones [23,24], k-nearest neighbor (kNN)-based ones [15,17,18,21], etc. [16,25,26,27].
Among the deep learning-based methods, to address the volatility problem of wind power, a forecasting model based on a convolutional neural network and LightGBM was constructed by Ju et al. [14]. Ma et al. proposed a deep learning-based method, namely, a transferred bi-directional long short-term memory model, for air-quality prediction [19]. Weytjens et al. predicted accounts receivable cash flows by employing methods applicable to companies with many customers and many transactions [22].
In terms of the matrix or tensor decomposition-based ones, Shi et al. proposed a strategy that combines low-rank Tucker decomposition into a unified framework [48]. Ma et al. proposed a deep spatial-temporal tensor factorization framework, which provides a general design for high-dimensional time-series forecasting [49]. To model the inherent rhythms and seasonality of time-series as global patterns, Chen et al. [50] proposed a low-rank autoregressive tensor completion framework for multivariate time-series data. To generalize the effect of distance and reachability, Wu et al. [51] developed an inductive graph neural network kriging model to recover data for unsampled sensors on a network graph structure.
For the kNN-based ones, Zhang et al. [15] proposed a new two-stage methodology that combines ensemble empirical mode decomposition with a multidimensional kNN model in order to simultaneously forecast the closing price and high price of stocks. Xu et al. [17] proposed an algorithm based on the kernel kNN to predict road traffic states in time-series. Yin et al. [18] proposed a multivariate predicting method and discussed its prediction performance on MTS by comparing it with the univariate time-series and the kNN nonparametric regression model. Martínez et al. [21] devised an automatic tool, i.e., one that works without human intervention, that is both effective and efficient and can be applied to accurately forecast many time series.
Other techniques were also used for MTS prediction. To handle multivariate long nonstationary time-series, Shen et al. [16] proposed a fast prediction model based on a combination of an elastic net and a higher-order fuzzy cognitive map. Chen et al. [25] proposed a weighted least squares support vector machine-based approach for univariate and multivariate time-series forecasting. To predict future outbreaks of methicillin-resistant Staphylococcus aureus, Jimenez et al. [26] proposed the use of artificial intelligence—specifically time-series forecasting techniques. The orthogonal decision tree may fail to capture the geometrical structure of data samples, so Qiu et al. [27] attempted to study oblique random forests in the context of time-series forecasting.

3. Models and Problem Statement

In this section, we first introduce the definitions of the original multivariate time-series (MTS) and its piecewise aggregate approximation (PAA) and symbolic aggregate approximation (SAX) versions. Second, we propose an along–across similarity model and the problem of state prediction. Third, we define the strategy of alphabet tri-partition and the problem of tri-partition alphabet-based state prediction. The notations are introduced in Table 1.

3.1. Data Model

The PAA and SAX versions of MTS are defined on the basis of the original numerical MTS.
Definition 1.
An original numerical MTS is the quadruple
$$\hat{S} = (\hat{T}, A, \hat{V} = \bigcup_{a \in A} \hat{V}_a, \hat{f}),$$
where $\hat{T} = \{t_1, t_2, \dots, t_M\}$ is the finite set of time points, $A = \{a_1, a_2, \dots, a_n\}$ is the finite set of variables, $\hat{V}_a \subseteq \mathbb{R}$ is the value range of variable a, and $\hat{f}: \hat{T} \times A \to \hat{V}$ is the mapping function. For brevity, $\hat{f}(t_i, a_j)$ can be denoted by $\hat{f}_{i,j}$. We further assume that $t_{i+1} - t_i = t_i - t_{i-1}$ ($2 \le i \le M - 1$).
Definition 2.
The PAA-MTS $\bar{S} = (T, A, \bar{V} = \bigcup_{a \in A} \bar{V}_a, \bar{f})$ has a form similar to that of Definition 1. However, two differences are present, namely (i) $T = \{t_1, t_2, \dots, t_m\}$ with $m < M$, and (ii) $\forall i \in [1, m]$:
$$\bar{f}_{i,a} = \frac{m}{M} \sum_{j = (i-1)\frac{M}{m} + 1}^{i \frac{M}{m}} \hat{f}_{j,a}.$$
Example 1.
Figure 1 shows an example of the transition of NO$_2$ from the original numerical MTS ($\hat{S}$) to the PAA version of the MTS ($\bar{S}$). Here, $m = 10$ and $M = 100$. This way, the dimension is reduced from 100 to 10.
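To make the PAA transformation of Definition 2 concrete, the following is a minimal sketch in Python; the function name paa and the use of NumPy are our own choices, and the sketch assumes that M is a multiple of m, as in Example 1.

```python
import numpy as np

def paa(series, m):
    """Piecewise aggregate approximation (Definition 2): compress a
    length-M series into m segment means. Assumes M is a multiple of m."""
    series = np.asarray(series, dtype=float)
    M = len(series)
    assert M % m == 0, "this sketch assumes M is a multiple of m"
    # Each PAA value is the mean of a block of M/m consecutive points,
    # which equals (m/M) times the sum over the block.
    return series.reshape(m, M // m).mean(axis=1)

# Reduce 100 points to 10 segment means, as in Example 1.
rng = np.random.default_rng(0)
print(paa(rng.standard_normal(100), 10))
```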
Definition 3.
The SAX-MTS $S = (T, A, V = \bigcup_{a \in A} V_a, f)$ also has a form similar to that of Definition 2. The only difference is that the numerical values are transformed into symbolic ones. To produce symbols with equiprobability, a set of breakpoints $D = \{\delta_1, \delta_2, \dots, \delta_{g-1}\}$ dividing the area under the probability distribution function (PDF) of each $a \in A$ is required. Therefore, let $V_a = \{\gamma_1, \gamma_2, \dots, \gamma_g\}$ contain g symbols; then, we have
$$f_{i,a} = \gamma_j, \quad \text{if } \bar{f}_{i,a} \in (\delta_{j-1}, \delta_j],$$
where $j \in [1, g]$, and $\delta_0$ and $\delta_g$ are defined as $-\infty$ and $+\infty$, respectively.
Example 2.
Table 2 shows a lookup table of breakpoints for the N ( 0 , 1 ) distribution. In practice, g can be set as an integer that is not less than 2. Notably, g = 2 means that D = { 0 } .
Example 3.
Figure 2 shows an example of the transition from the PAA-MTS to the SAX-MTS. Here, we let $D = \{-1.07, -0.57, -0.18, 0.18, 0.57, 1.07\}$, $g = 7$, and the alphabet Σ = {a, b, c, d, e, f, g}.
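The symbolization step of Definition 3 can be sketched as follows, assuming the PAA values have already been z-normalized so that the N(0, 1) breakpoints of Table 2 apply; the helper name sax is our own.

```python
import numpy as np

def sax(paa_values, breakpoints, alphabet):
    """Map each PAA value to the symbol of the interval (delta_{j-1}, delta_j]
    containing it, with delta_0 = -inf and delta_g = +inf (Definition 3)."""
    assert len(alphabet) == len(breakpoints) + 1
    # For each value, searchsorted returns the number of breakpoints below
    # it, which is exactly the index of its symbol.
    idx = np.searchsorted(breakpoints, paa_values)
    return [alphabet[i] for i in idx]

# g = 7 symbols, hence the six breakpoints of Example 3.
D = [-1.07, -0.57, -0.18, 0.18, 0.57, 1.07]
print(sax([-0.989, 0.447, 1.173], D, list("abcdefg")))  # ['b', 'e', 'g']
```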
Example 4.
Table 3 shows an example of a SAX-MTS with three variables (i.e., A = {SO$_2$ ($a_1$), NO$_2$ ($a_2$), and PM2.5 ($a_3$)}) and 10 time stamps (i.e., $T = \{t_1, t_2, \dots, t_{10}\}$). For variable SO$_2$, the symbols a and f are missing. For variable NO$_2$, the symbols b and e are missing. For variable PM2.5, the symbols a, d, and e are missing. This phenomenon is temporary and disappears once the data are big enough.

3.2. State

First, a formal description of the state is introduced. The state of the SAX-MTS is also the form taken by the prediction result.
Definition 4.
Given a SAX-MTS $S = (T, A, V = \bigcup_{a \in A} V_a, f)$: $\forall i \in [1, m]$,
$$f_{i,\cdot} = (f_{i,1}, f_{i,2}, \dots, f_{i,n})$$
is called a state of the SAX-MTS at time $t_i$. Moreover, the state of the PAA-MTS (i.e., $\bar{f}_{i,\cdot} = (\bar{f}_{i,1}, \bar{f}_{i,2}, \dots, \bar{f}_{i,n})$) is formally similar to this one.
Example 5.
With Table 3, $f_{10,\cdot}$ = (e, g, f) is a state of the SAX-MTS at time $t_{10}$. Accordingly, $\bar{f}_{10,\cdot}$ = (0.447, 1.083, 0.846) is a state of the PAA-MTS at time $t_{10}$.
Second, the state $f_{i,\cdot}$ is denoted as a known label. This way, the corresponding instance of $f_{i,\cdot}$ is defined as follows.
Definition 5.
Given an SAX-MTS $S = (T, A, V, f)$ and a sliding window $w < m$, an instance with the matrix form is
$$O_i^{w,n} = \begin{pmatrix} f_{i-w+1,1} & \cdots & f_{i-w+1,n} \\ \vdots & \ddots & \vdots \\ f_{i,1} & \cdots & f_{i,n} \end{pmatrix} = (f^{i}_{\cdot,1}, \dots, f^{i}_{\cdot,n}) = \begin{pmatrix} f^{i}_{i-w+1,\cdot} \\ \vdots \\ f^{i}_{i,\cdot} \end{pmatrix},$$
where $w \in (0, m)$, $i \in [w, m]$, $\forall j \in [1, n]$, $|f^{i}_{\cdot,j}| = w$, and $\forall j \in [i-w+1, i]$, $|f^{i}_{j,\cdot}| = n$. For brevity, $O_i^{w,n}$ can be denoted by $O_i$ when n and w are specified.
Example 6.
With Table 3, let $w = 2$ and $i = 9$; then,
$$O_9 = \begin{pmatrix} c & d & c \\ c & c & b \end{pmatrix}.$$
$f_{10,\cdot}$ = (e, g, f) is the label of $O_9$.
This way, the set of all instances can be denoted by
$$SP = \{O_w, O_{w+1}, \dots, O_m\},$$
where $|SP| = |T| - w + 1 = m - w + 1$.
Example 7.
With Table 3, let $w = 2$; then, $SP = \{O_2, O_3, \dots, O_{10}\}$. Hence, $|SP| = 10 - 2 + 1 = 9$.
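The construction of instances and labels can be sketched as follows; make_instances is a hypothetical helper, and the SAX-MTS is represented as a list of m symbol rows.

```python
def make_instances(sax_rows, w):
    """Build the w x n matrix instances O_w..O_m of Definition 5, together
    with the label of each instance that has one (the state one step later)."""
    m = len(sax_rows)
    instances = {}   # i -> O_i, the rows at times t_{i-w+1}..t_i
    labels = {}      # i -> f_{i+1,.}, defined for i in [w, m-1]
    for i in range(w, m + 1):            # instance indices are 1-based
        instances[i] = [sax_rows[t] for t in range(i - w, i)]
        if i < m:
            labels[i] = sax_rows[i]      # the state at time t_{i+1}
    return instances, labels

# Table 3 with w = 2: O_9 and its label f_{10,.} = (e, g, f); |SP| = 9.
rows = [tuple("bab"), tuple("bab"), tuple("bcb"), tuple("ddb"), tuple("gff"),
        tuple("gfg"), tuple("eff"), tuple("cdc"), tuple("ccb"), tuple("egf")]
inst, lab = make_instances(rows, 2)
print(len(inst), inst[9], lab[9])
```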
Third, $\forall i \in [w, m-1]$, the set of instance–label pairs $\{(O_i, f_{i+1,\cdot})\}$ can be constructed for the k-nearest matrix neighbors (kNMN).
Definition 6.
Given an instance $O_m \in SP$, any $\mathcal{N} \subseteq SP \setminus \{O_m\}$ is called the set of kNMN of $O_m$ if $|\mathcal{N}| = k$ and
$$\min_{O \in \mathcal{N}} \Delta(O_m, O) \ge \max_{O \in SP \setminus (\mathcal{N} \cup \{O_m\})} \Delta(O_m, O),$$
where Δ is the along–across similarity of the given matrix pair.
Note that the neighborhood $\mathcal{N}$ of $O_m$ may not be unique, since some other matrices may have the same similarity to $O_m$.
Fourth, the along–across similarity model Δ is proposed to obtain the neighborhood N by merging the popular similarity and distance metrics.
Definition 7.
Given the PAA-MTS $\bar{S} = (T, A, \bar{V}, \bar{f})$, the SAX-MTS $S = (T, A, V, f)$, and the sliding window size w, the similarity between the two matrix-based instances $O_i$ and $O_j$ is
$$\Delta(O_i, O_j) = R_{i,j} \times C_{i,j},$$
where the row vector similarity is
$$R_{i,j} = \frac{\sum_{l=1}^{w} s(h^{i}_{i-w+l,\cdot}, h^{j}_{j-w+l,\cdot})}{w}, \qquad (9)$$
and the column vector similarity is
$$C_{i,j} = \frac{\sum_{l=1}^{n} s(h^{i}_{\cdot,l}, h^{j}_{\cdot,l})}{n}, \qquad (10)$$
where
$$h = \begin{cases} \bar{f}, & \text{if the metric is available for the PAA-MTS}; \\ f, & \text{if the metric is available for the SAX-MTS}. \end{cases} \qquad (11)$$
Note that the row or column vector h in Equation (11) is indeed one of $\bar{f}$ and $f$, corresponding to the PAA- and SAX-MTSs, respectively. Moreover, the data types of the vectors in Equations (9) and (10) are coincident. In other words, each pair of vectors in Equations (9) and (10) comes either from the PAA-MTS or from the SAX-MTS; the case in which $h^{i}_{\cdot,l}$ comes from the PAA-MTS while $h^{j}_{\cdot,l}$ comes from the SAX-MTS is not permitted.
Table 4 presents the availability of the similarities and distances for Equation (11). Two things need to be further explained. One is the availability of the metrics. Given any two indices r and c (r, c ∈ IDs), PAA(r) = True or PAA(c) = True indicates that the r-th or the c-th metric fits the numerical data. Similarly, SAX(r) = True or SAX(c) = True means that the r-th or the c-th metric fits the symbolic data. For example, PAA(0) = True indicates that the Euclidean distance fits the PAA-MTS but not the SAX-MTS.
The other is the transformation from distance to similarity. As similarity and distance metrics are used simultaneously here, each distance needs to be transformed into a similarity. Therefore, given two vectors $h^i$ and $h^j$, the transformation from distance to similarity is
$$s(h^i, h^j) = \frac{1}{d(h^i, h^j) + 1},$$
where d denotes the distance between $h^i$ and $h^j$. This way, 100 combinations of distances and similarities exist. Their performances are discussed in Section 5.
Example 8.
With Table 3 and Table 4, let r = 8 (Jaccard similarity, SAX(8) = True), c = 1 (Manhattan distance, PAA(1) = True), and $w = 2$; the along–across similarity between $O_6$ and $O_7$ (i.e., $\Delta(O_6, O_7)$) is illustrated as follows. First, the SAX- and PAA-MTS forms of $O_6$ are
$$\begin{pmatrix} g & f & f \\ g & f & g \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1.173 & 1.007 & 0.922 \\ 1.496 & 0.842 & 1.490 \end{pmatrix},$$
respectively. Those of $O_7$ are
$$\begin{pmatrix} g & f & g \\ e & f & f \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1.496 & 0.842 & 1.490 \\ 0.272 & 0.609 & 0.959 \end{pmatrix},$$
respectively. Second, the row vector similarity is $R_{6,7} = \frac{\frac{2}{3} + \frac{1}{3}}{2} = 0.5$. More specifically, the Jaccard similarity between the row vectors (g, f, f) and (g, f, g) is $\frac{2}{3}$. Third, the column similarity is $C_{6,7} = \frac{\frac{1}{1.547 + 1} + \frac{1}{0.398 + 1} + \frac{1}{1.099 + 1}}{3} = \frac{0.393 + 0.715 + 0.476}{3} = 0.528$. More specifically, the Manhattan distance between the column vectors (1.173, 1.496) and (1.496, 0.272) is 0.323 + 1.224 = 1.547. Finally, $\Delta(O_6, O_7) = 0.5 \times 0.528 = 0.264$.
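A minimal sketch of the along–across similarity of Definition 7 under the (r, c) = (8, 1) combination of Example 8: the positional Jaccard of the example on the SAX rows and the Manhattan-based similarity on the PAA columns. All function names are our own; the final value reproduces $\Delta(O_6, O_7) \approx 0.264$.

```python
def row_jaccard(u, v):
    """Positional agreement of two symbol vectors, as used in Example 8."""
    return sum(a == b for a, b in zip(u, v)) / len(u)

def col_manhattan_sim(u, v):
    """Manhattan distance turned into a similarity via s = 1 / (d + 1)."""
    d = sum(abs(a - b) for a, b in zip(u, v))
    return 1.0 / (d + 1.0)

def along_across(sax_i, sax_j, paa_i, paa_j):
    """Delta(O_i, O_j) = R_{i,j} * C_{i,j}: rows over the SAX matrices,
    columns over the PAA matrices (Definition 7)."""
    w, n = len(sax_i), len(sax_i[0])
    R = sum(row_jaccard(sax_i[l], sax_j[l]) for l in range(w)) / w
    col = lambda mat, l: [r[l] for r in mat]
    C = sum(col_manhattan_sim(col(paa_i, l), col(paa_j, l))
            for l in range(n)) / n
    return R * C

sax6, sax7 = [list("gff"), list("gfg")], [list("gfg"), list("eff")]
paa6 = [[1.173, 1.007, 0.922], [1.496, 0.842, 1.490]]
paa7 = [[1.496, 0.842, 1.490], [0.272, 0.609, 0.959]]
print(along_across(sax6, sax7, paa6, paa7))  # about 0.264
```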
Fifth, given a future time stamp (e.g., $t_{11}$), the state at this time (e.g., $f_{11,\cdot}$) is unknown. Formally, the state occurring at time $t_{m+1}$ is denoted as $p_{m+1,\cdot} = (p_{m+1,1}, p_{m+1,2}, \dots, p_{m+1,n})$. To obtain the components of $p_{m+1,\cdot}$ with the kNN-like method, the instances, neighbors, and labels were defined above. Therefore, the label of $O_m$, i.e., $p_{m+1,\cdot}$, can be predicted with the following voting strategy.
Definition 8.
Given a SAX-MTS $S = (T, A, V, f)$, $O_m$, and $\mathcal{N}$, $\forall j \in [1, n]$, each component of $p_{m+1,\cdot} = (p_{m+1,1}, p_{m+1,2}, \dots, p_{m+1,n})$ is
$$p_{m+1,j} = \arg\max_{v_{a_j} \in V_{a_j}} vote(v_{a_j}),$$
and
$$vote(v_{a_j}) = \frac{\sum_{i \in \{i \mid O_i \in \mathcal{N}\}} I(v_{a_j} = f_{i+1,j})}{|\mathcal{N}|},$$
where $I(\cdot) = 1$ if the condition $(\cdot)$ is True; otherwise, $I(\cdot) = 0$.
Example 9.
With Table 3 and Table 4, let r = 8, c = 2, and $w = 2$; the process of computing $p_{11,\cdot} = (p_{11,1}, p_{11,2}, p_{11,3})$ is illustrated as follows.
First, the $\mathcal{N}$ of $O_{10}$ is found. Using the process shown in Example 8, the along–across similarities of $\{O_{10}\} \times \{O_2, O_3, \dots, O_9\}$ are listed in Table 5. Let the size of $\mathcal{N}$, i.e., k, be 4; then, $\mathcal{N} = \{O_5, O_4, O_7, O_6\}$.
Second, with the four nearest neighbors, the states/labels after them can be obtained, namely $f_{6,\cdot}$ = (g, f, g), $f_{5,\cdot}$ = (g, f, f), $f_{8,\cdot}$ = (c, d, c), and $f_{7,\cdot}$ = (e, f, f).
Third, the results of voting can hence be obtained as $p_{11,1}$ = g, $p_{11,2}$ = f, and $p_{11,3}$ = f; namely, $p_{11,\cdot}$ = (g, f, f). More specifically, in terms of $a_1$, $vote(g) = 2 > vote(c) = 1 = vote(e)$.
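The voting step of Definition 8 can be sketched as follows; vote_state is a hypothetical helper, and ties are broken by first occurrence, which is one way to realize the random selection mentioned later in Example 12.

```python
from collections import Counter

def vote_state(neighbor_labels):
    """Predict p_{m+1,.} by per-variable majority voting over the labels
    of the k nearest matrix neighbors (Definition 8)."""
    n = len(neighbor_labels[0])
    prediction = []
    for j in range(n):
        counts = Counter(label[j] for label in neighbor_labels)
        # most_common breaks ties by first occurrence.
        prediction.append(counts.most_common(1)[0][0])
    return tuple(prediction)

# The four labels of Example 9 yield p_{11,.} = (g, f, f).
labels = [("g", "f", "g"), ("g", "f", "f"), ("c", "d", "c"), ("e", "f", "f")]
print(vote_state(labels))
```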
Sixth, in general, the prediction performance is better when the difference between $p_{m+1,\cdot}$ and $f_{m+1,\cdot}$ is smaller. Measures of prediction performance such as the precision and recall are introduced here. For the state $f_{m+1,\cdot}$, the precision and recall have the same form, namely
$$P_{m+1} = \frac{\sum_{1 \le i \le n} I(p_{m+1,i} = f_{m+1,i})}{n}. \qquad (15)$$
Finally, with the above definitions, the problem of state prediction is proposed as follows.
Problem 1.
kNMN-based state prediction for MTS:
Input: $\bar{S} = (T, A, \bar{V}, \bar{f})$, $S = (T, A, V, f)$, w, and k;
Output: $p_{m+1,\cdot} = (p_{m+1,1}, p_{m+1,2}, \dots, p_{m+1,n})$.
Although two types of datasets, i.e., the PAA- and SAX-MTSs, are both used here, the space complexity remains the same. The time complexity is closely related to the size of the matrix instance and similarity metrics for vectors.
Example 10.
With Table 4, let r = 8 and c = 2; given the PAA-MTS $\bar{S}$, the SAX-MTS $S$, and the sliding window w, the time complexities of computing the row and column vector similarities between two matrix instances are both $\Theta(wn)$. Moreover, the size of SP is $m - w + 1$; hence, the time complexity of our method is $\Theta(wn(m - w + 1)) = \Theta(mn)$.

3.3. Tri-State

To enrich the semantics of the predictions, we extend each component of $p_{m+1,\cdot}$ to a column vector of length 3. Within each vector, different components have different semantics. This way, the form of the prediction is changed from a $1 \times n$ vector into a $3 \times n$ matrix.
First, we introduce the definition of the tri-partition alphabet as follows.
Definition 9.
Given an SAX-MTS $S = (T, A, V, f)$, $\forall a \in A$,
$$\Sigma_a = (\Gamma_a, \Lambda_a, \Omega_a)$$
is called a tri-partition alphabet of a if
  • $\Gamma_a \cup \Lambda_a \cup \Omega_a = \Sigma_a = V_a$; and
  • $\Gamma_a \cap \Lambda_a = \Gamma_a \cap \Omega_a = \Omega_a \cap \Lambda_a = \emptyset$.
Additionally, we call $\Gamma_a$, $\Lambda_a$, and $\Omega_a$ the strong, medium, and weak regions of attribute $a \in A$, respectively.
Example 11.
With Table 3, the range of values of variable NO$_2$ ($a_2$) is {a, b, c, d, e, f, g}. Let $\Gamma_{a_2}$ = {a, g}, $\Lambda_{a_2}$ = {b, f}, and $\Omega_{a_2}$ = {c, d, e}; then, $\Sigma_{a_2}$ is called a tri-partition alphabet of NO$_2$.
Definition 10.
Given a SAX-MTS $S = (T, A, V = \bigcup_{a \in A} V_a, f)$, $\Sigma = \bigcup_{a \in A} \Sigma_a$, and $O_m$, a tri-state at time stamp $t_{m+1}$ is
$$P_{m+1,\cdot} = \begin{pmatrix} p^{\Gamma_{a_1}}_{m+1,1} & \cdots & p^{\Gamma_{a_n}}_{m+1,n} \\ p^{\Lambda_{a_1}}_{m+1,1} & \cdots & p^{\Lambda_{a_n}}_{m+1,n} \\ p^{\Omega_{a_1}}_{m+1,1} & \cdots & p^{\Omega_{a_n}}_{m+1,n} \end{pmatrix},$$
where $\forall i \in [1, n]$, $p^{\Gamma_{a_i}}_{m+1,i} \in \Gamma_{a_i}$, $p^{\Lambda_{a_i}}_{m+1,i} \in \Lambda_{a_i}$, and $p^{\Omega_{a_i}}_{m+1,i} \in \Omega_{a_i}$.
Compared with the state in Definition 4, we replace $p_{m+1,i}$ with $p_{m+1,i} = (p^{\Gamma_{a_i}}_{m+1,i}, p^{\Lambda_{a_i}}_{m+1,i}, p^{\Omega_{a_i}}_{m+1,i})^T$, $\forall i \in [1, n]$. Moreover, this predicted vector can be interpreted as the most probable symbols from the strong, medium, and weak regions, respectively. Note that the three-way state is not needed for historical data, whose symbols are already known.
Therefore, we present the voting strategy for the three-way state prediction as follows. Given $i \in [1, n]$:
$$p^{\Gamma_{a_i}}_{m+1,i} = \arg\max_{v_{a_i} \in \Gamma_{a_i}} vote(v_{a_i}); \quad p^{\Lambda_{a_i}}_{m+1,i} = \arg\max_{v_{a_i} \in \Lambda_{a_i}} vote(v_{a_i}); \quad p^{\Omega_{a_i}}_{m+1,i} = \arg\max_{v_{a_i} \in \Omega_{a_i}} vote(v_{a_i}).$$
Practically, the regions $\Gamma_a$, $\Lambda_a$, and $\Omega_a$ can be obtained using various partition strategies and have meaningful explanations. Here, we partition the range of symbolic values of each attribute using the following strategy. Based on Equations (2) and (3), we can evaluate, for each symbol, its level of deviation from the mean. For each attribute $a_i$ ($i \in [1, n]$), a pair of thresholds $(\alpha_i, \beta_i)$ is given (n pairs in total), where $\alpha_i, \beta_i \in D = \{\delta_1, \delta_2, \dots, \delta_{g-1}\}$ and $\alpha_i \ge \beta_i > 0$. $\forall j \in [1, g-1]$, the tri-partition strategy is formally described as
$$\gamma_j \in \begin{cases} \Gamma_{a_i}, & \text{if } |\delta_j| \ge \alpha_i; \\ \Lambda_{a_i}, & \text{if } \alpha_i > |\delta_j| \ge \beta_i; \\ \Omega_{a_i}, & \text{if } |\delta_j| < \beta_i. \end{cases}$$
The combination of the PAA-MTS $\bar{S} = (T, A, \bar{V}, \bar{f})$ and the SAX-MTS $S = (T, A, V, f)$ is used here for the first time. The breakpoint $\delta_g$ is $+\infty$, and $\delta_g > \alpha_i$ always holds; hence, $\gamma_g$ always belongs to $\Gamma_{a_i}$.
However, up to 2n thresholds would need to be specified this way. Therefore, we assume $\forall i, j \in [1, n]$, $i \ne j$: $\alpha_i = \alpha_j$ and $\beta_i = \beta_j$ for brevity. Consequently, we have $\Sigma_{a_i} = \Sigma_{a_j}$, and $\Sigma = \{\Sigma_a = (\Gamma_a, \Lambda_a, \Omega_a) \mid a \in A\}$ can be denoted by $\Sigma = (\Gamma, \Lambda, \Omega)$. More choices are available for α and β with a greater g. For example, if $g = 4$, then $D = \{-0.67, 0, 0.67\}$; when the threshold α is set to 0.67, β can be set to 0.67 or 0.
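A sketch of the deviation degree-based tri-partition. As an assumption of this sketch, each symbol's interval $(\delta_{j-1}, \delta_j]$ is judged by its endpoint nearest zero, so that the extreme symbols land in the strong region; under α = 1.07 and β = 0.57, this reproduces the regions of Example 11. The function name is our own.

```python
def tri_partition(alphabet, breakpoints, alpha, beta):
    """Split g symbols into (strong, medium, weak) regions using the g - 1
    breakpoints and the thresholds alpha >= beta. Each symbol is judged by
    the deviation of its interval endpoint nearest zero (an assumption)."""
    D = [float("-inf")] + list(breakpoints) + [float("inf")]
    strong, medium, weak = [], [], []
    for j, symbol in enumerate(alphabet, start=1):
        dev = min(abs(D[j - 1]), abs(D[j]))   # deviation of symbol gamma_j
        if dev >= alpha:
            strong.append(symbol)
        elif dev < beta:
            weak.append(symbol)
        else:
            medium.append(symbol)
    return strong, medium, weak

D = [-1.07, -0.57, -0.18, 0.18, 0.57, 1.07]
print(tri_partition(list("abcdefg"), D, 1.07, 0.57))
# (['a', 'g'], ['b', 'f'], ['c', 'd', 'e'])
```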
However, the predicted tri-state is incomplete if no strong, medium, or weak symbols are found following the whole set of matrix neighbors. Namely, what the current method can guarantee is only that each variable has at least one predicted symbol. Formally, given a tri-state $P_{m+1,\cdot}$ at $t_{m+1}$, $\forall i \in [1, n]$, we have
$$(p^{\Gamma}_i, p^{\Lambda}_i, p^{\Omega}_i)^T \ne (\phi, \phi, \phi)^T.$$
Example 12.
With Table 3, let α = 1.07 and β = 0.57; then, $\Sigma_{a_1} = \Sigma_{a_2} = \Sigma_{a_3} = (\Gamma, \Lambda, \Omega)$, where Γ = {a, g}, Λ = {b, f}, and Ω = {c, d, e}. Based on the four labels of Example 9, i.e., $f_{6,\cdot}$ = (g, f, g), $f_{5,\cdot}$ = (g, f, f), $f_{8,\cdot}$ = (c, d, c), and $f_{7,\cdot}$ = (e, f, f), the predicted tri-state at $t_{11}$ is
$$P_{11,\cdot} = \begin{pmatrix} g & \phi & g \\ \phi & f & f \\ c/e & d & c \end{pmatrix}.$$
Note that ϕ means that the symbol at the current position is temporarily unknown. More specifically, the strong symbol of $a_2$ and the medium symbol of $a_1$ are unknown. Here, "c/e" indicates that the final predicted symbol is randomly selected from the two; for brevity, the symbol c is selected.
Moreover, the precision for the incomplete tri-state is calculated as follows:
$$P'_{m+1} = \frac{\sum_{1 \le i \le n} I(p^{\Gamma}_{m+1,i} = f_{m+1,i} \text{ or } p^{\Lambda}_{m+1,i} = f_{m+1,i} \text{ or } p^{\Omega}_{m+1,i} = f_{m+1,i})}{\sum_{i \in [1,n]} \sum_{j \in [1,3]} I(P_{m+1,\cdot}[j,i] \ne \phi)}. \qquad (20)$$
In order to remedy this defect, i.e., to obtain a completed tri-state, we propose two simple and effective filling strategies, called the individual and related ones, respectively. For each attribute, if one or two symbols are missing, the individual filling strategy (IFS) predicts them with the most frequent ones in the attribute's own historical data. Then, $\forall i \in [1, n]$, the IFS can be formally described as follows:
$$p^{\Gamma}_i = \arg\max_{\gamma \in \Gamma} \text{IFS-Count}(\gamma), \text{ if } p^{\Gamma}_i = \phi; \quad p^{\Lambda}_i = \arg\max_{\gamma \in \Lambda} \text{IFS-Count}(\gamma), \text{ if } p^{\Lambda}_i = \phi; \quad p^{\Omega}_i = \arg\max_{\gamma \in \Omega} \text{IFS-Count}(\gamma), \text{ if } p^{\Omega}_i = \phi,$$
where
$$\text{IFS-Count}(\gamma) = \frac{\sum_{j=1}^{m} I(f_{j,i} = \gamma)}{m}.$$
Example 13.
According to Example 12 and Table 3, for variable $a_1$, $p^{\Lambda}_1$ = b. This is because IFS-Count(b) = $\frac{3}{10}$ > IFS-Count(f) = 0. For variable $a_2$, $p^{\Gamma}_2$ = a. This is because IFS-Count(a) = $\frac{2}{10}$ > IFS-Count(g) = $\frac{1}{10}$. Hence, the tri-state filled by the IFS is
$$P_{11,\cdot} = \begin{pmatrix} g & a & g \\ b & f & f \\ c & d & c \end{pmatrix}.$$
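A sketch of the individual filling strategy; ifs_fill is a hypothetical helper, history is the symbol column of one variable, and the division by m in IFS-Count is omitted because it does not change the arg max.

```python
from collections import Counter

def ifs_fill(history, region):
    """Individual filling strategy: the most frequent historical symbol of
    this variable lying in the missing region, or None if none occurred."""
    counts = Counter(s for s in history if s in region)
    return counts.most_common(1)[0][0] if counts else None

# Example 13: the medium symbol of SO2 (a_1) and the strong symbol of NO2 (a_2).
so2 = list("bbbdggecce")                 # column a_1 of Table 3
no2 = list("aacdfffdcg")                 # column a_2 of Table 3
print(ifs_fill(so2, {"b", "f"}))         # 'b' (IFS-Count(b) = 3/10)
print(ifs_fill(no2, {"a", "g"}))         # 'a' (IFS-Count(a) = 2/10)
```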
The related filling strategy (RFS) predicts the missing symbols by considering the association relationships between pairs of variables. Given two variables $a_i$ and $a_j$ ($i, j \in [1, n]$, $i \ne j$), $a_j$ is the most linearly related variable of $a_i$; namely, $a_j = \arg\max_{a \in A \setminus \{a_i\}} \text{Pearson}(a_i, a)$. Their predicted vectors are $(p^{\Gamma}_i, p^{\Lambda}_i, p^{\Omega}_i)^T$ and $(p^{\Gamma}_j, p^{\Lambda}_j, p^{\Omega}_j)^T$. Then, the RFS can be formally described as follows:
$$p^{\Gamma}_i = \arg\max_{\gamma \in \Gamma} \text{RFS-Count}(\gamma), \text{ if } p^{\Gamma}_i = \phi; \quad p^{\Lambda}_i = \arg\max_{\gamma \in \Lambda} \text{RFS-Count}(\gamma), \text{ if } p^{\Lambda}_i = \phi; \quad p^{\Omega}_i = \arg\max_{\gamma \in \Omega} \text{RFS-Count}(\gamma), \text{ if } p^{\Omega}_i = \phi,$$
where
$$\text{RFS-Count}(\gamma) = \frac{\sum_{\gamma' \in \{p^{\Gamma}_j, p^{\Lambda}_j, p^{\Omega}_j\}} \sum_{l=1}^{m} I(f_{l,i} = \gamma \text{ and } f_{l,j} = \gamma' \text{ and } \gamma' \ne \phi)}{\sum_{\gamma' \in \{p^{\Gamma}_j, p^{\Lambda}_j, p^{\Omega}_j\}} I(\gamma' \ne \phi) \times m}.$$
Example 14.
Based on Example 12 and Table 3, the Pearson correlations among the three variables are as follows: Pearson($a_1$, $a_2$) = 0.892, Pearson($a_1$, $a_3$) = 0.919, and Pearson($a_2$, $a_3$) = 0.839. Hence, for variable $a_1$, the most related variable is $a_3$. When ($a_3$, g) happens, the co-occurring symbol set of $a_1$ is {g}; when ($a_3$, f) happens, it is {g, e, e}; and when ($a_3$, c) happens, it is {c}. No medium symbol for $p^{\Lambda}_1$ is available from the RFS; therefore, the result is b, which is predicted using the IFS.
Then, for variable $a_2$, the most related variable is $a_1$. When ($a_1$, g) happens, the co-occurring symbol set of $a_2$ is {f, f}; when ($a_1$, b) happens, it is {a, a, c}; and when ($a_1$, c) happens, it is {d, c}. Therefore, the result for $p^{\Gamma}_2$ is a.
Accordingly, the tri-state filled by the RFS is
$$P_{11,\cdot} = \begin{pmatrix} g & a & g \\ b & f & f \\ c & d & c \end{pmatrix}.$$
This result of the RFS is consistent with that of the IFS.
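A sketch of the related filling strategy; rfs_fill is a hypothetical helper. It counts how often each candidate symbol of the missing region of $a_i$ co-occurs with the predicted (non-missing) symbols of the most correlated variable $a_j$, and the caller falls back to the IFS when it returns None, as in Example 14.

```python
from collections import Counter

def rfs_fill(hist_i, hist_j, predicted_j, region):
    """Related filling strategy: among the symbols of the missing region,
    pick the one of a_i co-occurring most often with the predicted symbols
    of its most related variable a_j; None represents the phi case."""
    anchors = {s for s in predicted_j if s is not None}
    counts = Counter(si for si, sj in zip(hist_i, hist_j)
                     if sj in anchors and si in region)
    return counts.most_common(1)[0][0] if counts else None

# Example 14: filling the strong symbol of NO2 (a_2) from SO2 (a_1).
so2 = list("bbbdggecce")
no2 = list("aacdfffdcg")
pred_so2 = ("g", "b", "c")     # a_1's predicted column after the IFS
print(rfs_fill(no2, so2, pred_so2, {"a", "g"}))  # 'a'
```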
Moreover, the precision for the completed tri-state (the IFS- and RFS-based ones) is calculated as follows:
$$P''_{m+1} = \frac{\sum_{1 \le i \le n} I(p^{\Gamma}_{m+1,i} = f_{m+1,i} \text{ or } p^{\Lambda}_{m+1,i} = f_{m+1,i} \text{ or } p^{\Omega}_{m+1,i} = f_{m+1,i})}{3n}. \qquad (23)$$
With the above Equations (15), (20), and (23):
  • $0 \le P''_{m+1} \le \frac{1}{3}$; and
  • $P''_{m+1} \le P'_{m+1}$ and $P''_{m+1} \le P_{m+1}$.
Finally, with all of the above definitions, we can define the problem of three-way state prediction as follows:
Problem 2.
Tri-state prediction for MTS.
Input: $\bar{S} = (T, A, \bar{V}, \bar{f})$, $S = (T, A, V, f)$, w, k, α, and β;
Output: $P_{m+1,\cdot} = (p_{m+1,1}, p_{m+1,2}, \dots, p_{m+1,n}) = \begin{pmatrix} p^{\Gamma}_{m+1,1} & \cdots & p^{\Gamma}_{m+1,n} \\ p^{\Lambda}_{m+1,1} & \cdots & p^{\Lambda}_{m+1,n} \\ p^{\Omega}_{m+1,1} & \cdots & p^{\Omega}_{m+1,n} \end{pmatrix}$.
Compared with Problem 1, Problem 2 has two more parameters, α and β. An additional first step that generates Σ is required, but it has only a polynomial time complexity of $\Theta(mn)$. The output is a matrix P of size $3 \times n$. Hence, we can obtain the three most likely occurring symbols from the strong, medium, and weak regions, respectively. Note that Problem 1 obtains one predicted state at a time, while Problem 2 can obtain up to $3^n$ possible states. Notably, the time and space complexities of the two problems remain the same.

4. Algorithms

In this section, the framework of the three-way state prediction algorithm with k nearest matrix neighbors (kNMN-3WSP) is shown in Figure 3. Three stages, namely kNMN construction, alphabet tri-partition, and three-way state prediction, are proposed. Note that the datasets, i.e., the PAA-MTS $\bar{S}$ and the SAX-MTS $S$, are inputs of all stages. In Stages II and III, $\bar{S}$ and $S$ are omitted for brevity.

4.1. Stage I

Algorithm 1 presents the details of Stage I. First, (r, c) is a pair of indices that identifies the distances and similarities in Table 4. In other words, we have r, c ∈ {0, 1, ⋯, 9}. Moreover, if r = 0, the similarity between two row vectors is measured using the Euclidean distance; if c = 7, the similarity between two column vectors is measured using the Triangle similarity. Second, the cardinality of $O_m$ and of all elements in SP is $w \times n$, $|\mathcal{N}| = k$, and $m = |T|$. Third, the availability of PAA and SAX is the key to integrating $\bar{S}$ and $S$; they are mutually exclusive.
Algorithm 2 presents the details of Line 4 of Algorithm 1. With the last two columns of Table 4, if a metric supports the PAA-MTS, PAA(r) or PAA(c) is True (T). For example, PAA(0) = PAA(1) = True, and PAA(2) = PAA(3) = False (F). Finally, the time complexity of this stage is $\Theta(mn^2)$.
Algorithm 1 kNMN construction.
Input: $\bar{S} = (T, A, \bar{V}, \bar{f})$, $S = (T, A, V, f)$, w, k, and (r, c);
Output: $\mathcal{N}$;
Method: Construction.
1: Generate $O_m$ and SP with w;
2: Initialize $\mathcal{N} = \emptyset$;
3: for ($i \in [w, m-1]$) do
4:   Compute $\Delta(O_m, O_i)$ with (r, c);
5:   $\mathcal{N} = \mathcal{N} \cup \{(O_i, \Delta(O_m, O_i))\}$;
6: end for
7: Arrange the elements in set $\mathcal{N}$ in descending order of similarity Δ;
8: Retain only the first k elements of $\mathcal{N}$;
9: return $\mathcal{N}$
Algorithm 2 Similarity computation.
Input: $O_m$, $O_i$, and (r, c);
Output: $\Delta(O_m, O_i)$;
Method: Similarity.
1: row = 0.0;
2: for ($l \in [1, w]$) do
3:   if (PAA(r)) then
4:     row += $s(\bar{f}^{m}_{m-w+l,\cdot}, \bar{f}^{i}_{i-w+l,\cdot})$;
5:   else
6:     row += $s(f^{m}_{m-w+l,\cdot}, f^{i}_{i-w+l,\cdot})$;
7:   end if
8: end for
9: row /= w;
10: col = 0.0;
11: for ($l \in [1, n]$) do
12:   if (PAA(c)) then
13:     col += $s(\bar{f}^{m}_{\cdot,l}, \bar{f}^{i}_{\cdot,l})$;
14:   else
15:     col += $s(f^{m}_{\cdot,l}, f^{i}_{\cdot,l})$;
16:   end if
17: end for
18: col /= n;
19: return row × col;

4.2. Stage II

Algorithm 3 describes the details of Stage II. First, the variable g is specified to generate the SAX version of the MTS. In other words, $g \ge 2$ is the number of symbols for each attribute. Second, if α = β, then Λ is Ø. When g = 2, no choices other than α = β are available. Finally, the time complexity of this stage is only $\Theta(ng)$.
Algorithm 3 Alphabet tri-partition.
Input: $\bar{S} = (T, A, \bar{V}, \bar{f})$, $S = (T, A, V, f)$, α, and β;
Output: Σ = (Γ, Λ, Ω);
Method: Tri-partition.
1: Initialize Γ = Λ = Ω = Ø;
2: for ($i \in [1, g-1]$) do
3:   if ($|\delta_i| \ge \alpha$) then
4:     Γ = Γ ∪ {$\gamma_i$};
5:   else if ($|\delta_i| < \beta$) then
6:     Ω = Ω ∪ {$\gamma_i$};
7:   else
8:     Λ = Λ ∪ {$\gamma_i$};
9:   end if
10: end for
11: Γ = Γ ∪ {$\gamma_g$};
12: return Σ = (Γ, Λ, Ω);

4.3. Stage III

Algorithm 4 discusses the details of Stage III. First, $f_{j+1,i}$ is the label of attribute $a_i$. The predicted symbol is the one with the maximal frequency. Second, the purpose of using indices for counting is to improve the efficiency of the algorithm. In Line 6, Count(·) is a mapping function for the count matrix, whose size is $g \times 2$. The l-th position stores the frequency of $\gamma_l$ ($l \in [1, g]$). Generally, the matrix is denoted by M = ((Count(1), 1), (Count(2), 2), ⋯, (Count(g), g)). For example, with $g = 3$, let the indices of the symbols H, M, and L be 0, 1, and 2, respectively. The matrix ((3, 0), (2, 1), (4, 2)) means that the frequencies of H, M, and L are 3, 2, and 4, respectively. Moreover, $M_{1,\cdot}$ = (3, 0), $M_{1,1}$ = 3, and $M_{1,2}$ = 0.
In Line 9, the count matrix is sorted in descending order of count. Generally, $M' = (M'_{1,\cdot}, M'_{2,\cdot}, \dots, M'_{g,\cdot})$ is subject to (1) $\forall i \in [1, g]$, $M'_{i,\cdot} \in M$, and (2) $\forall j \in [i, g]$, $M'_{i,1} \ge M'_{j,1}$. For example, the matrix ((3, 0), (2, 1), (4, 2)) is transformed into ((4, 2), (3, 0), (2, 1)). In Lines 10–21, the algorithm searches for the symbol with the biggest count from each of the strong, medium, and weak regions. There is no need to continue searching once all three symbols of the current variable are known. The time complexity of this stage is $\Theta(mn)$.
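The per-variable search of Lines 3–21 can be sketched as follows; the names are our own, and ties are broken by first occurrence, so the c/e tie of Example 12 resolves to c.

```python
from collections import Counter

def predict_variable(neighbor_symbols, strong, medium, weak):
    """One iteration of Lines 3-21 of Algorithm 4: count the label symbols
    over the neighbors, sort by descending count, and take the first symbol
    encountered from each region."""
    counts = Counter(neighbor_symbols)           # the count matrix M
    p = {"strong": None, "medium": None, "weak": None}
    for symbol, _ in counts.most_common():       # M' in descending order
        if symbol in strong and p["strong"] is None:
            p["strong"] = symbol
        elif symbol in medium and p["medium"] is None:
            p["medium"] = symbol
        elif symbol in weak and p["weak"] is None:
            p["weak"] = symbol
        if all(v is not None for v in p.values()):
            break                                # all three symbols known
    return p

# Variable a_1 of Example 12: labels g, g, c, e give (g, phi, c).
print(predict_variable(["g", "g", "c", "e"],
                       {"a", "g"}, {"b", "f"}, {"c", "d", "e"}))
```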
Finally, the RFS considers more information than the IFS, but their time and space complexities are the same, namely Θ ( n m ) . This way, we can obtain four kinds of states called the state, tri-state, IFS-based tri-state (IFS-tri-state), and RFS-based tri-state (RFS-tri-state).
Algorithm 4 Three-way state prediction.
Input: $\bar{S} = (T, A, \bar{V}, \bar{f})$, $S = (T, A, V, f)$, $\mathcal{N}$, and Σ = (Γ, Λ, Ω);
Output: $P_{3 \times n} = \begin{pmatrix} p^{\Gamma}_1 & \cdots & p^{\Gamma}_n \\ p^{\Lambda}_1 & \cdots & p^{\Lambda}_n \\ p^{\Omega}_1 & \cdots & p^{\Omega}_n \end{pmatrix}$;
Method: Prediction.
1: Initialize $P_{3 \times n}$ by filling it with ϕ;
2: for ($i \in [1, n]$) do
3:   M = ((Count(1), 1), (Count(2), 2), ⋯, (Count(g), g)), Count(l) = 0 ($l \in [1, g]$);
4:   for (each neighbor O ∈ $\mathcal{N}$) do
5:     Get its last time stamp, denoted by $t_j$;
6:     Obtain the index of $f_{j+1,i}$ in $V_{a_i}$, denoted by l;
7:     Count(l)++;
8:   end for
9:   Obtain M′ by listing M in descending order of Count(·);
10:   for ($j \in [1, g]$) do
11:     Let l = $M'_{j,2}$;
12:     if ($\gamma_l \in \Gamma$ and $p^{\Gamma}_i = \phi$) then
13:       $p^{\Gamma}_i = \gamma_l$;
14:     else if ($\gamma_l \in \Lambda$ and $p^{\Lambda}_i = \phi$) then
15:       $p^{\Lambda}_i = \gamma_l$;
16:     else if ($\gamma_l \in \Omega$ and $p^{\Omega}_i = \phi$) then
17:       $p^{\Omega}_i = \gamma_l$;
18:     else
19:       break;
20:     end if
21:   end for
22: end for
23: return $P_{3 \times n}$;

5. Experiments

We discuss the following issues through experiments:
  • The prediction performance of our along–across similarity model;
  • The stability of the similarity metrics combination.

5.1. Dataset and Experiment Settings

Experiments are undertaken on four datasets from four different domains, i.e., the environmental, financial, industrial, and health domains. The most important information from these datasets is listed in Table 6.
With Table 4, 10 × 10 = 100 combinations need to be discussed. The test set consists of the last 20% of each of the above four MTSs. However, the training set is dynamic at different time points within the testing set. Generally, for each time point $i \in [80\% \cdot m, m]$, the training set contains all records within the time range $[1, i-1]$. In other words, 80% is the smallest training set ratio, attained when the time point i is $80\% \cdot m$.
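This dynamic training scheme amounts to an expanding-window evaluation; the following is a minimal sketch with hypothetical names, assuming the last 20% of the time stamps form the test set.

```python
def expanding_window_splits(m, test_ratio=0.2):
    """Yield (training indices, test index) pairs: for each test time point,
    the training set is every record strictly before it."""
    start = int(m * (1 - test_ratio))    # first test index, at 80% of m
    for i in range(start, m):
        yield range(i), i                # train on [0, i), test at i

# With m = 10 time stamps, the test points are 8 and 9 (0-based).
for train, test in expanding_window_splits(10):
    print(len(train), test)
```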

5.2. Prediction Performance

Figure 4 and Figure 5 show the precision and recall of the four kinds of states on the four datasets' test sets. First, the form (r, c), r, c ∈ [0, 9], indicates the indices of the row and column metrics, respectively. For example, (3, 8) means that the row metric is the Levenshtein distance and the column metric is the Jaccard similarity. Second, with increasing k, the precisions of the state, tri-state, IFS-based tri-state, and RFS-based tri-state decrease. Third, the precision of the state is better than that of the others. Moreover, the precision of the tri-state is slightly better than that of the IFS- and RFS-based ones, whose precisions are almost identical. This is because the three kinds of tri-states provide two additional symbols for each variable. However, the tri-state may be incomplete, while the IFS- and RFS-based ones are complete. Therefore, the precision of the tri-state lies between those of the state and the IFS- and RFS-based tri-states. This can be observed in Figure 4b,c.
In Figure 5, the recalls of the three kinds of tri-states are better than that of the state. Moreover, the recalls of the IFS- and RFS-based tri-states are the highest. Similarly, the recall of the tri-state is also between that of the state and the IFS- and RFS-based tri-states. Interestingly, the recall of the IFS- and RFS-based tri-states on the Stocks (Dataset II) can reach 95% and 93%, respectively.
Compared with the state, the three kinds of tri-states have better recall but worse precision. Although the improvement of IFS- and RFS-based tri-states is not significant compared to the tri-state, more information can be provided. In most cases, k = 1 is the first choice for precision and recall.

5.3. Stability

Table 7 and Table 8 list the top 10 metric combinations for precision and recall with the four kinds of states on the four datasets' test sets. We can observe that some metric combinations recur. Such combinations are considered more stable, occurring with a higher frequency/probability across different datasets. For stronger discrimination, we additionally introduce a weighted ranking strategy for each metric combination.
Given a metric combination $x \in [0, 9]^2$,
$$\rho(x) = \frac{\xi(x)^2}{\pi(x)}$$
is the stability of x, where ξ(x) is the number of occurrences of x on the four datasets. Moreover,
$$\pi(x) = \frac{\sum_{i \in \{\text{I}, \text{II}, \text{III}, \text{IV}\}} \eta(x, i)}{\xi(x)}$$
is the average ranking of x. If x does not occur in dataset i, then η(x, i) = 0; otherwise, η(x, i) is the ranking of x.
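A sketch of the stability weighting; stability is a hypothetical helper taking, for one metric combination, its rankings on the datasets where it occurs (datasets where it does not occur contribute nothing, matching η(x, i) = 0).

```python
def stability(rankings):
    """rho(x) = xi(x)^2 / pi(x): xi is the number of datasets on which the
    combination occurs, and pi is its average ranking over those datasets."""
    xi = len(rankings)
    if xi == 0:
        return 0.0
    pi = sum(rankings) / xi              # lower average ranking is better
    return xi ** 2 / pi

# A combination ranked 1st, 3rd, and 2nd on three of the four datasets.
print(stability([1, 3, 2]))              # 3^2 / 2 = 4.5
```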
Figure 6 shows the most stable metric combinations for precision. In terms of the state, combination ( 1 , 8 ) is the first choice. In terms of the tri-state, combination ( 1 , 8 ) is the first choice. In terms of the IFS-based tri-state, combination ( 3 , 8 ) is the first choice. In terms of the RFS-based tri-state, combination ( 3 , 8 ) is the first choice.
Figure 7 shows the most stable metric combinations for recall. In terms of the state, combination ( 1 , 8 ) is the first choice. In terms of the tri-state, combination ( 1 , 8 ) is the first choice. In terms of the IFS-based tri-state, combination ( 0 , 8 ) is the first choice. In terms of the RFS-based tri-state, combination ( 9 , 8 ) is the first choice.
With the above observations, the eighth metric, i.e., the Jaccard similarity, is the most frequently used, followed by the second one, i.e., the Manhattan distance.

6. Conclusions

In this paper, a new tri-state and its prediction problem were defined for multivariate time-series (MTS). With the tri-state, the most likely occurring strong, medium, and weak symbols can be obtained. Second, a deviation degree-based tri-partition strategy and its algorithm were designed: for the symbols of each variable, a symbol is stronger the further it deviates from the average value. Third, the along–across similarity model was proposed to capture the association relationships among both time stamps and variables. Fourth, the integration of the PAA and SAX versions of the MTS makes it possible to combine numerical and symbolic similarities or distances. Finally, when a new dataset is introduced, the first choices for the parameter settings are k = 1 (the neighborhood size), r = 1 (the Manhattan distance), and c = 8 (the Jaccard similarity).
The following research topics deserve further investigation:
  • More alphabet tri-partition strategies;
  • More tri-state completion strategies;
  • Adaptive learning of the parameters by cost-sensitive learning; and
  • More intelligent metrics combination strategies, e.g., integrated learning.
 

Author Contributions

Conceptualization, Z.-H.Z.; methodology, Z.-H.Z. and Z.-C.W.; software, Z.-C.W. and Z.-H.Z.; validation, Z.-C.W., J.-G.G. and S.-P.S.; formal analysis, Z.-H.Z., W.D. and Z.-C.W.; investigation, J.-G.G. and S.-P.S.; resources, X.-B.Z.; data curation, G.-S.C., J.-G.G. and S.-P.S.; writing—original draft preparation, Z.-H.Z. and W.D.; writing—review and editing, X.-B.Z., S.-P.S. and G.-S.C.; visualization, Z.-C.W., J.-G.G. and G.-S.C.; supervision, X.-B.Z. and W.D.; project administration, W.D.; funding acquisition, X.-B.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was jointly funded by the National Natural Science Foundation of China (grant number 41604114); the Sichuan Science and Technology Program (grant numbers 2019ZYZF0169, 2019YFG0307, 2021YFS0407); the A Ba Achievements Transformation Program (grant numbers 19CGZH0006, R21CGZH0001); the Chengdu Science and technology planning project (grant number 2021-YF05-00933-SN); and the Sichuan Tourism University Scientific Research Projects of China (grant number 2020SCTU14).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wei, W.W.S. Multivariate Time Series Analysis and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2018.
  2. Park, H.; Jung, J.Y. SAX-ARM: Deviant event pattern discovery from multivariate time series using symbolic aggregate approximation and association rule mining. Expert Syst. Appl. 2020, 141, 112950.
  3. Xu, J.; Tang, L.; Zeng, C.; Li, T. Pattern discovery via constraint programming. Knowl.-Based Syst. 2016, 94, 23–32.
  4. Zhang, Z.H.; Min, F. Frequent state transition patterns of multivariate time series. IEEE Access 2019, 7, 142934–142946.
  5. Zhang, Z.H.; Min, F.; Chen, G.S.; Shen, S.P.; Wen, Z.C.; Zhou, X.B. Tri-partition state alphabet-based sequential pattern for multivariate time series. Cogn. Comput. 2021, 1–19.
  6. Cheng, R.; Hu, H.; Tan, X.; Bai, Y. Initialization by a novel clustering for wavelet neural network as time series predictor. Comput. Intell. Neurosci. 2015, 2015, 1–9.
  7. Li, H.L. Multivariate time series clustering based on common principal component analysis. Neurocomputing 2019, 349, 239–247.
  8. Li, H.L.; Liu, Z.C. Multivariate time series clustering based on complex network. Pattern Recognit. 2021, 115, 107919.
  9. Baldán, F.J.; Benítez, J.M. Multivariate times series classification through an interpretable representation. Inf. Sci. 2021, 569, 596–614.
  10. Baydogan, M.G.; Runger, G. Learning a symbolic representation for multivariate time series classification. Data Min. Knowl. Discov. 2015, 29, 400–422.
  11. Araújo, R.D.A. A class of hybrid morphological perceptrons with application in time series forecasting. Knowl.-Based Syst. 2011, 24, 513–529.
  12. Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting causality in complex ecosystems. Science 2012, 338, 496–500.
  13. Ren, H.R.; Liu, M.M.; Li, Z.W.; Pedrycz, W. A piecewise aggregate pattern representation approach for anomaly detection in time series. Knowl.-Based Syst. 2017, 135, 29–39.
  14. Ju, Y.; Sun, G.Y.; Chen, Q.H.; Zhang, M.; Zhu, H.X.; Rehman, M.U. A model combining convolutional neural network and LightGBM algorithm for ultra-short-term wind power forecasting. IEEE Access 2019, 7, 28309–28318.
  15. Zhang, N.; Lin, A.; Shang, P. Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting. Phys. A Stat. Mech. Appl. 2017, 477, 161–173.
  16. Shen, F.; Liu, J.; Wu, K. Multivariate time series forecasting based on elastic net and high-order fuzzy cognitive maps: A case study on human action prediction through EEG signals. IEEE Trans. Fuzzy Syst. 2020, 29, 2336–2348.
  17. Xu, D.W.; Wang, Y.D.; Peng, P.; Shen, B.L.; Deng, Z.; Guo, H.F. Real-time road traffic state prediction based on kernel-kNN. Transp. A Transp. Sci. 2020, 16, 104–118.
  18. Yin, Y.; Shang, P.J. Forecasting traffic time series with multivariate predicting method. Appl. Math. Comput. 2016, 291, 266–278.
  19. Ma, J.; Cheng, J.C.; Lin, C.Q.; Tan, Y.; Zhang, J.C. Improving air quality prediction accuracy at larger temporal resolutions using deep learning and transfer learning techniques. Atmos. Environ. 2019, 214, 116885.
  20. Liu, P.H.; Liu, J.; Wu, K. CNN-FCM: System modeling promotes stability of deep learning in time series prediction. Knowl.-Based Syst. 2020, 203, 106081.
  21. Martínez, F.; Frías, M.P.; Pérez, M.D.; Rivera, A.J. A methodology for applying k-nearest neighbor to time series forecasting. Artif. Intell. Rev. 2019, 52, 2019–2037.
  22. Weytjens, H.; Lohmann, E.; Kleinsteuber, M. Cash flow prediction: MLP and LSTM compared to ARIMA and Prophet. Electron. Commer. Res. 2021, 21, 371–391.
  23. Zhou, Y.; Cheung, Y.M. Bayesian low-tubal-rank robust tensor factorization with multi-rank determination. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 62–76.
  24. Zhou, Y.; Lu, H.; Cheung, Y.M. Probabilistic rank-one tensor analysis with concurrent regularizations. IEEE Trans. Cybern. 2021, 51, 3496–3509.
  25. Chen, T.T.; Lee, S.J. A weighted LS-SVM based learning system for time series forecasting. Inf. Sci. 2015, 299, 99–116.
  26. Jimenez, F.; Palma, J.; Sanchez, G.; Marin, D.; Palacios, M.F.; López, M.L. Feature selection based multivariate time series forecasting: An application to antibiotic resistance outbreaks prediction. Artif. Intell. Med. 2020, 104, 101818.
  27. Qiu, X.H.; Zhang, L.; Suganthan, P.N.; Amaratunga, G.A.J. Oblique random forest ensemble via least square estimation for time series forecasting. Inf. Sci. 2017, 420, 249–262.
  28. Yao, Y.Y. The geometry of three-way decision. Appl. Intell. 2021, 51, 6298–6325.
  29. Yao, Y.Y. Set-theoretic models of three-way decision. Granul. Comput. 2021, 6, 133–148.
  30. Yao, Y.Y. Tri-level thinking: Models of three-way decision. Int. J. Mach. Learn. Cybern. 2020, 11, 947–959.
  31. Sang, B.B.; Guo, Y.T.; Shi, D.R.; Xu, W.H. Decision-theoretic rough set model of multi-source decision systems. Int. J. Mach. Learn. Cybern. 2018, 9, 1941–1954.
  32. Li, J.H.; Huang, C.C.; Qi, J.J.; Qian, Y.H.; Liu, W.Q. Three-way cognitive concept learning via multi-granularity. Inf. Sci. 2017, 378, 244–263.
  33. Yao, Y.Y. Three-way decisions and cognitive computing. Cogn. Comput. 2016, 8, 543–554.
  34. Deng, X.F.; Yao, Y.Y. Decision-theoretic three-way approximations of fuzzy sets. Inf. Sci. 2014, 279, 702–715.
  35. Hu, B.Q. Three-way decisions space and three-way decisions. Inf. Sci. 2014, 281, 21–52.
  36. Qian, J.; Liu, C.H.; Miao, D.Q.; Yue, X.D. Sequential three-way decisions via multi-granularity. Inf. Sci. 2020, 507, 606–629.
  37. Li, X.N.; Yi, H.J.; She, Y.H.; Sun, B.Z. Generalized three-way decision models based on subset evaluation. Int. J. Approx. Reason. 2017, 83, 142–159.
  38. Liu, D.; Liang, D.C.; Wang, C.C. A novel three-way decision model based on incomplete information system. Knowl.-Based Syst. 2016, 91, 32–45.
  39. Xu, W.H.; Li, M.M.; Wang, X.Z. Information fusion based on information entropy in fuzzy multi-source incomplete information system. Int. J. Fuzzy Syst. 2017, 19, 1200–1216.
  40. Zhang, H.R.; Min, F.; Shi, B. Regression-based three-way recommendation. Inf. Sci. 2017, 378, 444–461.
  41. Wang, M.; Min, F.; Zhang, Z.H.; Wu, Y.X. Active learning through density clustering. Expert Syst. Appl. 2017, 85, 305–317.
  42. Yu, H.; Wang, X.C.; Wang, G.Y.; Zeng, X.H. An active three-way clustering method via low-rank matrices for multi-view data. Inf. Sci. 2020, 507, 823–839.
  43. Yue, X.D.; Chen, Y.F.; Miao, D.Q.; Qian, J. Tri-partition neighborhood covering reduction for robust classification. Int. J. Approx. Reason. 2016, 83, 371–384.
  44. Zhou, B.; Yao, Y.Y.; Luo, J.G. Cost-sensitive three-way email spam filtering. J. Intell. Inf. Syst. 2014, 42, 19–45.
  45. Li, H.X.; Zhang, L.B.; Huang, B.; Zhou, X.Z. Sequential three-way decision and granulation for cost-sensitive face recognition. Knowl.-Based Syst. 2016, 91, 241–251.
  46. Min, F.; Zhang, Z.H.; Zhai, W.J.; Shen, R.P. Frequent pattern discovery with tri-partition alphabets. Inf. Sci. 2020, 507, 715–732.
  47. Lin, J.; Keogh, E.; Wei, L.; Lonardi, S. Experiencing SAX: A novel symbolic representation of time series. Data Min. Knowl. Discov. 2007, 15, 107–144.
  48. Shi, Q.Q.; Yin, J.M.; Cai, J.J.; Cichocki, A.; Yokota, T.; Chen, L.; Yuan, M.X.; Zeng, J. Block Hankel tensor ARIMA for multiple short time series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 5758–5766.
  49. Ma, X.Y.; Zhang, L.; Xu, L.; Liu, Z.C.; Chen, G.; Xiao, Z.L.; Wang, Y.; Wu, Z.T. Large-scale user visits understanding and forecasting with deep spatial-temporal tensor factorization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2403–2411.
  50. Chen, X.Y.; Sun, L.J. Low-rank autoregressive tensor completion for multivariate time series forecasting. arXiv 2020, arXiv:2006.10436.
  51. Wu, Y.K.; Zhuang, D.Y.; Labbe, A.; Sun, L.J. Inductive graph neural networks for spatiotemporal kriging. arXiv 2020, arXiv:2006.07527.
  52. Lonardi, S.; Lin, J.; Keogh, E.; Chiu, B.Y.C. Efficient discovery of unusual patterns in time series. New Gener. Comput. 2006, 25, 61–93.
  53. Amir, A.; Charalampopoulos, P.; Pissis, S.P.; Radoszewski, J. Dynamic and internal longest common substring. Algorithmica 2020, 82, 3707–3743.
  54. Behara, K.N.; Bhaskar, A.; Chung, E. A novel approach for the structural comparison of origin-destination matrices: Levenshtein distance. Transp. Res. Part C Emerg. Technol. 2020, 111, 513–530.
  55. Chung, N.C.; Miasojedow, B.; Startek, M.; Gambin, A. Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data. BMC Bioinform. 2019, 20, 644.
  56. Sun, S.B.; Zhang, Z.H.; Dong, X.L.; Zhang, H.R.; Li, T.J.; Zhang, L.; Min, F. Integrating triangle and jaccard similarities for recommendation. PLoS ONE 2017, 12, e0183570.
Figure 1. The original numerical MTS and PAA-MTS.
Figure 2. The PAA-MTS and SAX-MTS for NO$_2$.
Figure 3. The process of the kNMN-3WSP algorithm.
Figure 4. The precisions of four state prediction strategies. (a) Dataset I. (b) Dataset II. (c) Dataset III. (d) Dataset IV.
Figure 5. The recalls of four state prediction strategies. (a) Dataset I. (b) Dataset II. (c) Dataset III. (d) Dataset IV.
Figure 6. The most stable metric combinations for precision. (a) State. (b) Tri-state. (c) IFS-tri-state. (d) RFS-tri-state.
Figure 7. The most stable metric combinations for recall. (a) State. (b) Tri-state. (c) IFS-tri-state. (d) RFS-tri-state.
Table 1. Notations.

  Notation                               Description
  S  = (T, A, V = ⋃_{a∈A} V_a, f)        The original numerical MTS.
  S′ = (T′, A, V′ = ⋃_{a∈A} V′_a, f′)    The PAA version of S (PAA-MTS).
  S″ = (T″, A, V″ = ⋃_{a∈A} V″_a, f″)    The SAX version of S (SAX-MTS).
  m                                      The number of time stamps, |T|.
  n                                      The number of variables, |A|.
  g                                      The number of partitions; ∀a ∈ A, |V″_a| = g.
  D                                      The set of breakpoints, |D| = g − 1.
  δ                                      A breakpoint, δ ∈ D.
  Γ                                      The strong region.
  Λ                                      The medium region.
  Ω                                      The weak region.
  Σ = (Γ, Λ, Ω)                          The tri-partition alphabet.
  β (β ≥ 0)                              The threshold for the weak region.
  α (α ≥ β)                              The threshold for the strong region.
  f″_{i,·}                               A symbolic state occurring at time t_i.
  f′_{i,·}                               A numerical state occurring at time t_i.
  p_{m+1,·}                              A prediction of the state occurring at time t_{m+1}.
  w                                      The length of the sliding window.
  O                                      A matrix instance; |O| = w × n.
  Δ                                      The similarity of two matrix instances.
  k                                      The number of nearest matrix neighbors.
  N                                      The set of k-nearest matrix neighbors.
  P_{m+1,·}                              The form of the tri-state, a matrix of size 3 × n.
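The two thresholds α and β in Table 1 drive the tri-partition of the alphabet into Γ, Λ, and Ω. The sketch below is a rough illustration only, not the paper's implementation: it assumes each symbol carries a deviation degree, and assigns a symbol to the strong region when that degree reaches α, to the weak region when it falls below β, and to the medium region otherwise. The deviation values are invented for the example.

```python
# A hedged sketch of the tri-partition alphabet Sigma = (Gamma, Lambda, Omega);
# the deviation degrees below are invented, not taken from the paper.
def tri_partition(deviation, alpha, beta):
    """Split an alphabet into (strong, medium, weak) by deviation degree."""
    strong = {s for s, d in deviation.items() if d >= alpha}   # Gamma
    weak = {s for s, d in deviation.items() if d < beta}       # Omega
    medium = set(deviation) - strong - weak                    # Lambda
    return strong, medium, weak

# A toy g = 7 alphabet: deviation grows toward the extreme symbols.
deviation = {"a": 1.3, "b": 0.8, "c": 0.3, "d": 0.0, "e": 0.3, "f": 0.8, "g": 1.3}
print(tri_partition(deviation, alpha=1.0, beta=0.5))
# ({'a', 'g'}, {'b', 'f'}, {'c', 'd', 'e'})
```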
Table 2. The breakpoints for the N(0, 1) distribution [52].

        g = 3   g = 4   g = 5   g = 6   g = 7   g = 8   g = 9   g = 10
  δ1    −0.43   −0.67   −0.84   −0.97   −1.07   −1.15   −1.22   −1.28
  δ2     0.43    0.00   −0.25   −0.43   −0.57   −0.67   −0.76   −0.84
  δ3             0.67    0.25    0.00   −0.18   −0.32   −0.43   −0.52
  δ4                     0.84    0.43    0.18    0.00   −0.14   −0.25
  δ5                             0.97    0.57    0.32    0.14    0.00
  δ6                                     1.07    0.67    0.43    0.25
  δ7                                             1.15    0.76    0.52
  δ8                                                     1.22    0.84
  δ9                                                             1.28
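The entries of Table 2 are the standard normal quantiles that split N(0, 1) into g equiprobable regions. A minimal sketch (ours, using SciPy; the function name is illustrative) that reproduces any column:

```python
# Reproduce a column of Table 2: the g - 1 equiprobable N(0, 1) breakpoints.
from scipy.stats import norm

def sax_breakpoints(g):
    """Return the breakpoints delta_1 < ... < delta_{g-1} for an alphabet of size g."""
    return [round(float(norm.ppf(i / g)), 2) for i in range(1, g)]

print(sax_breakpoints(4))   # [-0.67, 0.0, 0.67], the g = 4 column of Table 2
print(sax_breakpoints(10))  # [-1.28, -0.84, -0.52, -0.25, 0.0, 0.25, 0.52, 0.84, 1.28]
```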
Table 3. An example of SAX-MTS and PAA-MTS. Each cell pairs the SAX symbol with its PAA value (in parentheses).

  T           SO2 (a1)       NO2 (a2)       PM2.5 (a3)
  20 (t1)     b (−0.989)     a (−1.422)     b (−0.857)
  21 (t2)     b (−0.966)     a (−1.460)     b (−0.770)
  22 (t3)     b (−0.615)     c (−0.318)     b (−0.752)
  23 (t4)     d (−0.106)     d (0.095)      b (−0.681)
  24 (t5)     g (1.173)      f (1.007)      f (0.922)
  25 (t6)     g (1.496)      f (0.842)      g (1.490)
  26 (t7)     e (0.272)      f (0.609)      f (0.959)
  27 (t8)     c (−0.203)     d (0.016)      c (−0.465)
  28 (t9)     c (−0.508)     c (−0.453)     b (−0.691)
  29 (t10)    e (0.447)      g (1.083)      f (0.846)
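Table 3 uses a g = 7 alphabet (a–g). The sketch below is a simplified reading, not the paper's code, of how such symbol/value pairs can be produced for one variable; the toy series and the segment count are illustrative.

```python
# A simplified sketch of PAA aggregation followed by SAX discretization.
import numpy as np
from scipy.stats import norm

def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    return np.array([seg.mean() for seg in np.array_split(series, n_segments)])

def sax(values, g):
    """Map z-normalized PAA values to symbols a, b, ... via N(0, 1) breakpoints."""
    breakpoints = norm.ppf(np.arange(1, g) / g)
    return [chr(ord("a") + int(np.searchsorted(breakpoints, v))) for v in values]

raw = np.random.randn(240)            # a toy z-normalized series
paa_values = paa(raw, n_segments=10)  # one PAA value per window
symbols = sax(paa_values, g=7)        # a 7-letter alphabet, as in Table 3
```

For instance, with g = 7 a PAA value of −0.318 falls between the breakpoints −0.57 and −0.18 and is encoded as c, matching the (t3, a2) cell of Table 3.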
Table 4. The availability of similarities and distances.

  ID   Name               Type         PAA     SAX
  0    Euclidean          distance     True    False
  1    Manhattan          distance     True    False
  2    LCSubstring [53]   distance     False   True
  3    Levenshtein [54]   distance     False   True
  4    Cosine             similarity   True    False
  5    Pearson            similarity   True    False
  6    Tanimoto [55]      similarity   True    False
  7    Triangle [56]      similarity   True    False
  8    Jaccard            similarity   False   True
  9    Jaro               similarity   False   True
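Among these metrics, the abstract recommends pairing Triangle (ID 7, applicable to the numerical PAA rows) with Jaccard (ID 8, applicable to the symbolic SAX rows) for new datasets. The following is a minimal sketch under our reading of ref. [56] for Triangle, 1 − ‖x − y‖/(‖x‖ + ‖y‖), and the usual set-based Jaccard; the exact definitions in the paper may differ.

```python
# Hedged sketches of the Triangle and Jaccard similarities; the definitions
# follow common usage, not necessarily the paper's exact formulation.
import numpy as np

def triangle_similarity(x, y):
    """Triangle similarity: 1 - ||x - y|| / (||x|| + ||y||), in [0, 1]."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    denom = np.linalg.norm(x) + np.linalg.norm(y)
    return 1.0 - np.linalg.norm(x - y) / denom if denom else 1.0

def jaccard_similarity(s, t):
    """Jaccard similarity: intersection over union of the two symbol sets."""
    a, b = set(s), set(t)
    return len(a & b) / len(a | b) if (a | b) else 1.0

print(triangle_similarity([1.0, 0.5], [0.8, 0.6]))  # close vectors -> near 1
print(jaccard_similarity("bbdg", "bcdg"))           # {b, d, g} shared -> 0.75
```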
Table 5. The similarity matrix of O10.

         O2      O3      O4      O5      O6      O7      O8      O9
  O10    0.051   0.059   0.243   0.334   0.105   0.109   0.000   0.063
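Table 5 lists the similarities Δ between the query instance O10 and the earlier instances O2–O9; the set N of k-nearest matrix neighbors simply collects the k largest entries of such a row. An illustrative sketch (the container and names are ours, not an API from the paper):

```python
# Selecting the k-nearest matrix neighbors from the similarity row of Table 5.
similarities = {"O2": 0.051, "O3": 0.059, "O4": 0.243, "O5": 0.334,
                "O6": 0.105, "O7": 0.109, "O8": 0.0, "O9": 0.063}

def k_nearest(sim, k):
    """Return the k instance names with the largest similarity to the query."""
    return sorted(sim, key=sim.get, reverse=True)[:k]

print(k_nearest(similarities, k=2))  # ['O5', 'O4']
```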
Table 6. The outlines of the datasets.

  Dataset   Name     |T|      |A|   Field
  I         WanLiu   35,064   12    Environment
  II        Stocks   4300     12    Finance
  III       IPES     33,001   11    Health
  IV        CACS     88,840   37    Industry
Table 7. The top 10 metric combinations for precision.

  Dataset   Type of State    Metric Combinations (r, c)
  I         State            (5, 1), (7, 1), (4, 1), (7, 8), (3, 7), (9, 1), (1, 8), (5, 8), (1, 7), (0, 8)
            Tri-state        (9, 1), (9, 8), (3, 8), (5, 1), (7, 8), (3, 0), (7, 1), (4, 1), (4, 7), (3, 1)
            IFS-tri-state    (3, 8), (9, 1), (9, 8), (5, 1), (3, 0), (7, 8), (8, 8), (7, 1), (4, 1), (4, 7)
            RFS-tri-state    (3, 8), (9, 1), (9, 8), (5, 1), (3, 0), (7, 8), (8, 8), (7, 1), (4, 1), (4, 7)
  II        State            (1, 8), (1, 7), (5, 0), (0, 7), (0, 4), (0, 0), (7, 7), (0, 9), (1, 4), (1, 0)
            Tri-state        (7, 7), (9, 0), (8, 8), (1, 7), (8, 1), (1, 1), (1, 3), (1, 9), (1, 8), (9, 8)
            IFS-tri-state    (7, 7), (3, 0), (3, 1), (3, 7), (1, 4), (1, 0), (1, 8), (1, 9), (1, 3), (1, 1)
            RFS-tri-state    (0, 0), (3, 1), (0, 7), (5, 0), (4, 1), (5, 7), (1, 0), (1, 7), (9, 1), (1, 3)
  III       State            (7, 8), (0, 8), (7, 4), (1, 4), (1, 8), (4, 8), (7, 7), (1, 7), (7, 9), (8, 8)
            Tri-state        (9, 8), (3, 3), (1, 7), (8, 8), (1, 4), (7, 8), (0, 4), (0, 8), (4, 8), (8, 0)
            IFS-tri-state    (3, 3), (7, 8), (4, 8), (9, 8), (1, 8), (0, 8), (8, 8), (9, 7), (1, 4), (0, 4)
            RFS-tri-state    (1, 8), (9, 8), (8, 1), (0, 7), (0, 1), (1, 4), (0, 8), (1, 0), (0, 4), (1, 1)
  IV        State            (0, 1), (0, 7), (0, 0), (1, 0), (5, 7), (3, 7), (5, 0), (1, 1), (7, 1), (7, 0)
            Tri-state        (7, 7), (5, 7), (4, 0), (5, 0), (0, 0), (5, 1), (0, 7), (4, 7), (4, 1), (7, 0)
            IFS-tri-state    (7, 7), (5, 7), (4, 0), (5, 0), (0, 0), (5, 1), (0, 7), (4, 7), (4, 1), (7, 0)
            RFS-tri-state    (5, 7), (4, 4), (0, 7), (7, 1), (7, 8), (7, 0), (5, 1), (5, 0), (5, 4), (4, 0)
Table 8. The top 10 metric combinations for recall.

  Dataset   Type of State    Metric Combinations (r, c)
  I         State            (5, 1), (7, 1), (4, 1), (7, 8), (3, 7), (9, 1), (1, 8), (5, 8), (1, 7), (0, 8)
            Tri-state        (3, 8), (9, 8), (8, 8), (4, 8), (5, 1), (5, 8), (9, 1), (3, 7), (3, 0), (5, 7)
            IFS-tri-state    (3, 8), (9, 8), (8, 8), (9, 1), (5, 1), (0, 8), (5, 8), (4, 8), (3, 7), (3, 0)
            RFS-tri-state    (3, 8), (9, 8), (8, 8), (9, 1), (5, 8), (4, 8), (0, 8), (5, 1), (3, 7), (5, 7)
  II        State            (1, 8), (1, 7), (5, 0), (0, 7), (0, 4), (0, 0), (7, 7), (0, 9), (1, 4), (1, 0)
            Tri-state        (7, 7), (0, 1), (3, 1), (3, 7), (1, 4), (1, 0), (1, 8), (1, 3), (1, 1), (1, 7)
            IFS-tri-state    (0, 0), (3, 1), (0, 1), (1, 0), (1, 8), (4, 0), (7, 0), (5, 0), (0, 8), (7, 7)
            RFS-tri-state    (9, 1), (3, 1), (4, 1), (5, 1), (9, 7), (0, 8), (0, 9), (7, 7), (1, 7), (1, 0)
  III       State            (7, 8), (0, 8), (7, 4), (1, 4), (1, 8), (4, 8), (7, 7), (1, 7), (7, 9), (8, 8)
            Tri-state        (9, 8), (3, 3), (1, 8), (0, 8), (4, 3), (8, 8), (7, 3), (0, 3), (1, 4), (1, 3)
            IFS-tri-state    (3, 3), (9, 8), (0, 8), (1, 4), (1, 8), (1, 3), (8, 8), (0, 4), (4, 3), (7, 3)
            RFS-tri-state    (9, 8), (1, 8), (0, 8), (1, 4), (0, 7), (1, 7), (8, 8), (0, 4), (4, 1), (1, 1)
  IV        State            (0, 1), (0, 7), (0, 0), (1, 0), (5, 7), (3, 7), (5, 0), (1, 1), (7, 1), (7, 0)
            Tri-state        (7, 7), (4, 7), (0, 0), (0, 7), (7, 1), (7, 0), (0, 4), (5, 0), (5, 7), (5, 1)
            IFS-tri-state    (7, 7), (4, 7), (0, 0), (0, 7), (7, 1), (7, 0), (0, 4), (5, 0), (5, 7), (5, 1)
            RFS-tri-state    (5, 4), (7, 7), (4, 7), (0, 0), (5, 0), (0, 7), (7, 1), (7, 0), (4, 1), (7, 4)
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
