1. Introduction
Lithium-ion (Li-ion) batteries are an attractive energy storage technology for many applications, including power systems, electric cars, telecommunications, and aerospace, owing to their small size, high energy density, low self-discharge rate, low cost, and long life [1]. Li-ion batteries degrade with age and with charge/discharge cycling, which reduces their capacity and output power [2]. Improper use of the battery accelerates this degradation. A battery management system (BMS) can provide risk warnings to users and support timely battery maintenance and replacement to guarantee the satisfactory performance of a battery [3]. An essential metric for monitoring battery health in a BMS is the state-of-health (SOH), which is defined as the ratio of the current capacity of the battery to its initial capacity. The SOH directly indicates the degree of degradation of the battery over its lifetime [4]. Once battery health has degraded beyond a certain point, failures such as battery leakage, insulation failure, and partial short circuits can occur and create safety hazards [5,6]. A battery’s SOH cannot be measured directly by sensors; it can only be estimated from measurable variables such as battery voltage, current, and temperature [7]. Additionally, the degradation of Li-ion batteries is a very complex process that depends to varying degrees on the working conditions [8].
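The SOH definition above reduces to a single ratio. The following is a minimal sketch, assuming capacities expressed in ampere-hours; the function name and the capacity values are hypothetical and are not taken from any data set.

```python
def state_of_health(current_capacity_ah, initial_capacity_ah):
    """Return the SOH as the ratio of current to initial capacity."""
    if initial_capacity_ah <= 0:
        raise ValueError("initial capacity must be positive")
    return current_capacity_ah / initial_capacity_ah


# Hypothetical capacities (Ah) measured at successive check-ups.
capacities_ah = [2.00, 1.92, 1.81, 1.70]

# SOH at each check-up, referenced to the first (initial) capacity.
soh_per_checkup = [state_of_health(c, capacities_ah[0]) for c in capacities_ah]
```

A value of 1.0 corresponds to a fresh battery, and the ratio can be multiplied by 100 if a percentage is preferred.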
The internal characteristics of Li-ion batteries are highly nonlinear, and their lifespan is influenced by several factors. In this sense, accurate state of charge estimation, SOH monitoring, thermal management, and increasing the useful life of batteries are among the main challenges of BMS [
9]. In recent years, digital twin (DT) technology has attracted a great deal of attention for battery monitoring and control from both academia and industry. Using DT technology, living models can be developed for systems and components to closely track their SOH and understand their degradation behavior, thereby prolonging component life and minimizing abnormal events [
10]. A good overview of DT-based predictive maintenance can be found in [
11,
12,
13].
Although the DT technique is still in its early stages of development, it has already been shown to be useful in the design, monitoring, and control of complex systems, such as Li-ion batteries. A battery DT creates a mapping between the physical entity and its virtual model, which interact closely with one another through bidirectional data exchange. Big data analytics, artificial intelligence, blockchain, and the (Industrial) Internet of Things are among the main DT-enabling technologies [
14].
Figure 1 represents a general overview of a battery DT and its services. Zhao et al. [
15] presented a DT framework using machine learning (ML) for battery SOH estimation to improve the BMS and optimize the operation of battery storage units. The framework is a hybrid model in which a long short-term memory (LSTM) network, the data-driven component, provides precise initial state-of-charge (SOC) estimates and impedance data to an extended Kalman filter, the physics-based component. The BMS measures, estimates, and regulates battery conditions to ensure the efficient and safe operation of the battery. However, numerous voltage, current, and temperature sensors are needed, which can pose new challenges, such as high costs, limited space, poor efficiency, and high failure rates. Muaaz et al. [
16] suggested a DT-based solution to these issues that allows the BMS to estimate and predict the SOH of the battery using only a voltage sensor. They also explained the importance and feasibility of implementing a BMS DT. In addition, to identify relevant variables for ML modeling, they analyzed a correlation matrix of all the variables of the BMS and retained the variables with a correlation coefficient greater than 0.6. Simple and multiple linear regression models were then employed to predict the continuous output variables.
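The correlation-based variable selection and simple linear regression described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of [16]: the variable names and data are hypothetical, and the threshold is applied to the absolute value of the Pearson coefficient, which is an assumption the source does not state.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def select_features(features, target, threshold=0.6):
    """Keep features whose |correlation| with the target exceeds the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) > threshold]


def fit_simple_linear(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


# Hypothetical BMS variables and SOH values (illustrative only).
bms_variables = {
    "voltage": [4.10, 4.05, 3.98, 3.93, 3.85],
    "ambient_noise": [0.3, -0.1, 0.2, -0.3, 0.1],
}
soh = [1.00, 0.97, 0.93, 0.90, 0.86]

selected = select_features(bms_variables, soh)
slope, intercept = fit_simple_linear(bms_variables["voltage"], soh)
```

With these illustrative numbers only the voltage variable passes the filter, which mirrors the idea of estimating the SOH from a single voltage sensor.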
One of the most important services of a battery DT is providing an accurate estimation of the SOH of the battery during its lifetime. Such information is critical for the efficient operational management of the battery, especially when integrating it with other power resources in microgrids, and the timely maintenance of the battery. The role of the battery DT is to provide an accurate model of the battery considering its operating environment as well as an efficient platform for integrating and managing the enormous amounts of data that are collected over the battery’s life [
17]. The DT-driven SOH estimation strategy features key attributes of autonomy and adaptation. The ultimate goal is to have a SOH estimation technique that is automatically updated based on the most recent information about the battery and the surrounding environment using efficient learning algorithms. Although there are many studies dedicated to battery SOH estimation, a DT-based approach taking full advantage of DT technologies is still missing. To develop such a model, a step-by-step approach should be adopted. This paper provides a brief overview of the current SOH estimation strategies for Li-ion batteries. A SOH estimation method with a novel data pre-processing technique is then proposed and validated through different simulation scenarios to enhance the SOH estimation accuracy without increasing the model’s complexity. Afterward, future steps towards developing an adaptive and automatic SOH estimation technique are discussed, and a conceptual framework for battery digital twinning is presented.
The rest of this paper is organized as follows. In Section 2, a brief overview of battery SOH estimation methods is given. Section 3 is dedicated to the introduction of the NASA data set for battery degradation analysis, and feature extraction for predicting the SOH of Li-ion batteries is also discussed. A novel data pre-processing framework for ML-based models to estimate battery SOH is provided in Section 4. In Section 5, the performance of the proposed model is evaluated in four different scenarios, and simulation results are thoroughly discussed. In Section 6, online learning is introduced to develop a DT of a battery. Finally, concluding remarks are presented in Section 7.
2. Battery SOH Estimation
Many studies have sought to estimate the SOH of batteries accurately. Generally, these studies can be classified into two categories: model-based and data-driven methods. The three model-based methods commonly used for SOH estimation are based on equivalent circuit, electrochemical, and empirical models [
18]. Modeling batteries with equivalent circuits is a simplistic approach with low accuracy [
19]. Electrochemical models are accurate laboratory models with complex partial differential equations [
20], while empirical models use a simple model of the battery to identify the key parameters but have difficulty describing the capacity degradation [
21]. Nowadays, continuous advancements in enhancing the electrochemical performance of Li-ion batteries are achieved by applying intricate chemical mechanisms that make the physical modeling of Li-ion batteries more and more complicated. Reference [
22] introduced a self-assembly strategy focusing on terminal groups to improve lithium adsorption capability, low activation energy for lithium diffusion, and excellent structural stability. On the other hand, data-driven techniques provide appropriate solutions to battery degradation estimation problems with fewer requirements for understanding the physical and chemical mechanisms. Due to their flexibility and effectiveness, data-driven methods have been increasingly attracting interest in the SOH estimation of batteries. In data-driven models, the SOH of the battery is directly calculated using historical data that are easy to collect from sensors. Data-driven models for battery SOH estimation can be further classified into two categories: statistical and ML models [
2].
A data-driven battery SOH estimation method generally involves the following four steps: data collection, feature extraction, model training, and SOH estimation [
23]. Many data-driven techniques have been presented for battery SOH estimation, which either directly use battery capacity data or extract features from sensor data to estimate the SOH [
24]. The remaining useful life (RUL) of a battery defines how many charge–discharge cycles are left before the battery state drops to a specific threshold. To compute the RUL of the battery, Lyu et al. [
25] used an optimized relevance vector machine approach. To predict the RUL and SOH of Li-ion batteries, Khumprom and Yodo [
26] provided a data-driven prognostic method using deep neural networks (NNs). The effectiveness of data-driven methods for battery aging characterization and SOH estimation was found to depend on the quality and quantity of the available data. Sufficient and representative data from various battery operating conditions and aging scenarios are crucial to ensure accurate predictions. Xia and Abu Qahouq [27] proposed an aging behavior pattern indicator for battery SOH estimation by analyzing, identifying, and classifying the patterns and trends of the measurement data, such as the voltage, current, and temperature, and deploying deep neural network (DNN) models. To forecast battery aging, She et al. [
28] integrated an incremental capacity analysis with a radial basis function (RBF) NN. The inputs chosen were the accumulated mileage of vehicles, the initial charging SOC, the average charging temperature, the average charging current, the average discharging temperature of battery systems, and a classification factor for the incremental capacity peak values. An RBF NN model was used to reduce the dependency on the quality of data sets and increase the flexibility for noisy data while being particularly useful for pattern recognition and regression tasks. It could also learn the relationship between the battery’s aging parameters (capacity fade) and the corresponding incremental capacity analysis values and measurement data.
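The RUL definition given in this paragraph (the number of cycles left before the battery state drops to a specific threshold) can be expressed as a short sketch. The SOH trajectory and the threshold of 0.8 are hypothetical; the source does not fix a threshold value.

```python
def remaining_useful_life(soh_trajectory, current_cycle, threshold=0.8):
    """Count the cycles from current_cycle until SOH first reaches the threshold.

    soh_trajectory[i] is the (measured or predicted) SOH at cycle i.
    Returns None if the threshold is never reached within the trajectory.
    """
    for cycle in range(current_cycle, len(soh_trajectory)):
        if soh_trajectory[cycle] <= threshold:
            return cycle - current_cycle
    return None


# Hypothetical predicted SOH per cycle (illustrative only).
predicted_soh = [1.0, 0.95, 0.9, 0.85, 0.8, 0.75]

rul_at_cycle_1 = remaining_useful_life(predicted_soh, current_cycle=1)
```

Here the threshold is first reached at cycle 4, so three cycles remain when the battery is at cycle 1.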
Convolutional neural network (CNN), recurrent neural network (RNN), gated recurrent unit (GRU), and LSTM models have been developed by different researchers for predicting the SOH and RUL of Li-ion batteries. An RNN model was provided in [
29] to estimate SOH degradation in batteries using current and voltage measurements. To improve the accuracy of the battery RUL prediction, Park et al. [
30] combined multichannel charging profiles with an LSTM model. Zheng et al. [
31] proposed a CNN–GRU model using current and voltage measurement data to estimate battery SOH. Ren et al. [
32] used a CNN–LSTM model to predict the RUL of the battery by considering the discharge capacity, linearly interpolated discharge capacity, linearly interpolated temperature, the quantity of discharge, and discharge time as main parameters. Kara et al. [
33] combined a CNN–LSTM model with particle swarm optimization (PSO) to improve the accuracy of RUL prediction. In this model, spatiotemporal relations are extracted from multivariate time series data, and the PSO algorithm is utilized to optimize the NN hyperparameters.
Data-driven models that do not directly use sensor data, such as voltage, current, and temperature, cannot estimate battery capacity with high accuracy. A solution to this is to extract efficient health features from measured data to improve the performance of SOH prediction models. There are two types of feature extraction: manual and automatic. In the manual feature extraction techniques, features are selected from the charging or discharging data of the battery, such as voltage, current, etc., and the most relevant features are fed into ML algorithms to estimate battery capacity [
34]. For example, Cui et al. [
35] extracted the duration of the constant voltage (CV) phase, the ratio of the constant current (CC) time and charging time, the area under the CC characteristic, and the time required for equal voltage rise, current drop, and voltage drop intervals as features. Then, the extracted features were evaluated by the Pearson and Spearman’s rank correlation coefficients. To evaluate the short-term SOH and long-term RUL of Li-ion batteries, Wang et al. [
36] applied indirect health indicators, such as the time corresponding to the same charging voltage interval, discharge voltage interval, and discharge temperature change, and a Gaussian process regression model. Zhu et al. [
37] proposed a method to estimate the battery SOH using a support vector machine (SVM) and features extracted from battery cycle data. They focused on the relaxation process that occurs after a full charge, as it is highly related to battery degradation. Each voltage relaxation characteristic was transformed into six statistical features: the variance, skewness, maximum, minimum, mean, and excess kurtosis. These features served as input to the SVM for SOH estimation, allowing for the accurate assessment of battery health based on the extracted characteristics from the relaxation process.
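The six statistical features listed above can be computed from a sampled relaxation curve as sketched below. The sketch assumes population (biased) central moments, which the source does not specify, and the voltage samples are hypothetical.

```python
def relaxation_features(voltages):
    """Six statistical features of one voltage relaxation curve."""
    n = len(voltages)
    mean = sum(voltages) / n
    # Population central moments of order 2, 3, and 4.
    m2 = sum((v - mean) ** 2 for v in voltages) / n
    m3 = sum((v - mean) ** 3 for v in voltages) / n
    m4 = sum((v - mean) ** 4 for v in voltages) / n
    if m2 == 0:
        raise ValueError("constant curve: skewness and kurtosis are undefined")
    return {
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "maximum": max(voltages),
        "minimum": min(voltages),
        "mean": mean,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Hypothetical voltage samples (V) recorded during rest after a full charge.
relaxation_curve = [4.200, 4.180, 4.170, 4.165, 4.162, 4.160]
features = relaxation_features(relaxation_curve)
```

The resulting dictionary corresponds to one input vector for a regression model such as an SVM.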
To track battery degradation, Ma et al. [
38] integrated an RNN model with the differential thermal voltammetry (DTV) signal processing method. First, DTV curves were smoothed using the Savitzky–Golay method and Pearson correlation analysis, and three key features were derived from various dimensions. Then, using National Aeronautics and Space Administration (NASA) [
39] battery data sets, four RNNs were developed and analyzed. In addition, a Monte Carlo (MC) simulation was performed to evaluate the performance under uncertainty, and the Bayesian optimization method was applied to modify hyper-parameters. A SOH estimation method integrating transfer learning and a deep belief network (DBN)–LSTM model was presented by Ma et al. [
40]. Transfer learning was applied to learn the features, and a DBN was used to estimate the SOH of the battery, while LSTM was utilized to consider the impact of historical data. The measured voltage, current, and temperature were used directly to extract six health features. The performance of their proposed technique was evaluated on the MIT battery data set and a mean absolute percentage error (MAPE) of 0.99% was reported.
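For reference, the MAPE metric mentioned above and the RMSE metric used later in this paper follow their standard definitions. A minimal sketch, with hypothetical true and predicted SOH values:

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    n = len(y_true)
    return 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))


# Hypothetical true and estimated SOH values (illustrative only).
soh_true = [1.00, 0.90, 0.80]
soh_pred = [0.99, 0.91, 0.79]

rmse_value = rmse(soh_true, soh_pred)
mape_value = mape(soh_true, soh_pred)
```

Because MAPE divides by the true value, it weights errors more heavily as the SOH decreases, whereas RMSE is expressed in the same units as the SOH itself.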
Xu et al. [
41] presented a feature selection method to improve the training efficiency of an NN by eliminating irrelevant features from the input data. Moreover, as more layers are added to an NN, the accuracy tends to saturate and the prediction errors can increase. They addressed this problem by incorporating skip connections into the CNN–LSTM model to improve the accuracy of SOH prediction while reducing the computational burden and enhancing the robustness. A CNN is used to extract features from the original data, LSTM is applied to predict the SOH, and a skip connection is added by concatenating the outputs of two different LSTMs and using the concatenated outputs as the input to an LSTM layer.
Manual feature extraction in battery SOH estimation has the merit of being rooted in the understanding of the battery’s physical behavior and can be scaled and applied effectively. However, the need for developing efficient automatic feature extraction methods is growing due to the increasing complexity and diversity of battery data sets. Automatic feature extraction offers the advantage of discovering hidden patterns and capturing complex relationships that may not be apparent while manually selecting features. It enables the exploration of a broader range of features and facilitates a more comprehensive characterization of battery behavior. Additionally, automatic feature extraction techniques can adapt to varying battery chemistries, configurations, and operating conditions, ensuring scalability and generalizability [
42].
For battery SOH estimation, few studies specifically investigate automatic feature extraction, and existing techniques frequently combine feature extraction and SOH estimation into a single supervised model that is trained using SOH values as labels and raw battery data as inputs. An effective deep learning (DL) model that is frequently used to extract features from measured data automatically is the CNN architecture. Typically, a one-dimensional (1D) CNN is utilized to extract the temporal features of sequence data, while a two-dimensional (2D) CNN is applied to extract spatial features from data such as images [
43]. Gong et al. [
44] calculated battery SOH using a CNN and an encoder–decoder model in which the encoder combines a CNN with a super-attention method for automatic feature extraction. Their model requires only the raw measurements of battery current, voltage, and temperature to estimate battery SOH.
CNN models are commonly used to extract features automatically from sensor data. However, in most studies, the focus is on improving the feature extraction methods to enhance the accuracy of SOH estimation by increasing the model complexity [
45] without much attention to the pre-processing of data. Zhou et al. [
46] proposed an attention depthwise temporal convolutional neural network (TCNN) model for feature extraction to improve battery SOH prediction. The TCNN is designed to capture temporal dependencies and patterns in battery data. The TCNN architecture typically consists of multiple layers of 1D temporal convolutional blocks, which are followed by pooling and fully connected layers. In order to capture long-range dependencies in the input sequences, the TCNN utilizes dilated causal convolution. By employing dilated convolutions, the model can incorporate a broader temporal context without significantly increasing computational costs. Additionally, attention mechanisms are integrated into the TCNN to focus on relevant temporal features of battery conditions. This integration allows the model to prioritize important patterns of the battery data to enhance its ability to analyze and interpret the temporal dynamics effectively. However, data quality matters as well: according to Andrew Ng, founder of DeepLearning.AI, good data are defined consistently and cover all edge cases [
47].
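The dilated causal convolution used by TCNN-style models, as described above, can be illustrated with a plain-Python sketch. This is a generic illustration of the operation, not the architecture of [46]; the kernel, the dilation factors, and the input sequence are hypothetical.

```python
def dilated_causal_conv1d(sequence, kernel, dilation=1):
    """Causal 1D convolution: output[t] uses only sequence[t], sequence[t - d], ...

    Positions before the start of the sequence are treated as zeros.
    """
    output = []
    for t in range(len(sequence)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            index = t - k * dilation
            if index >= 0:
                acc += weight * sequence[index]
        output.append(acc)
    return output


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal layers with one shared kernel size."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical input sequence and a two-tap kernel.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
filtered = dilated_causal_conv1d(signal, kernel=[1.0, 1.0], dilation=2)

# Four stacked layers with dilations 1, 2, 4, and 8 and kernel size 2.
field = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth while the number of weights per layer stays constant, which is why a broader temporal context comes at little extra computational cost.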
In this paper, a novel data pre-processing model is proposed that integrates the current research findings on the SOH estimation of Li-ion batteries with the most recent advancements in DL algorithms. With this model, the quantity and usefulness of the training data that are fed into the ML models are increased to improve the accuracy of the battery SOH estimation. The proposed pre-processing method converts the 1D discharge voltage data for all cycles to a 2D data set (discharge voltage cycle) and then extracts the new training data set from this large 2D data set by moving a sliding window of constant length and width over the entire 2D data set. Then, by feeding the 2D training voltage data to an ML model, features are automatically extracted in the training process. Finally, the battery voltage in the next cycle is predicted, and the SOH of the battery is calculated. An advantage of this model is that the autonomous feature extraction technique eliminates the need for extra mathematical calculations and the manual evaluation of measured data for extracting features. Another advantage is that the root-mean-squared error (RMSE) values are considerably reduced only by applying the proposed data pre-processing technique without increasing the complexity of the ML model. The performance of the proposed technique is evaluated on the NASA battery degradation data set, which is introduced in the next section.
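The sliding-window idea can be sketched as follows. This is only an illustration under assumptions that the text above does not fix: rows of the 2D data set are taken to be cycles, columns are taken to be sample points within a discharge, and the target of each window is taken to be the same column span in the next cycle. The window sizes, step sizes, and voltage values are hypothetical.

```python
def sliding_windows(matrix, height, width, row_step=1, col_step=1):
    """Extract (window, target) pairs from a 2D cycle-by-sample voltage matrix.

    window: `height` consecutive cycles by `width` consecutive samples.
    target: the same sample span in the cycle that follows the window.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    samples = []
    # Stop early enough that a next-cycle row always exists for the target.
    for r in range(0, n_rows - height, row_step):
        for c in range(0, n_cols - width + 1, col_step):
            window = [row[c:c + width] for row in matrix[r:r + height]]
            target = matrix[r + height][c:c + width]
            samples.append((window, target))
    return samples


# Hypothetical discharge voltages: 4 cycles (rows) by 5 sample points (columns).
voltage_matrix = [[round(4.2 - 0.1 * cycle - 0.2 * k, 2) for k in range(5)]
                  for cycle in range(4)]

training_pairs = sliding_windows(voltage_matrix, height=2, width=3)
```

Even this small matrix yields several overlapping training pairs, which illustrates how a sliding window increases the amount of training data without any change to the ML model.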
6. Battery DT with Online Learning
The conventional methods of battery SOH estimation involve building a model from sensor data and other sources and applying that model to estimate the behavior of the battery. However, this method is not completely capable of capturing the complicated and dynamic behavior of batteries in real-world applications and their aging mechanisms that are necessary for developing the battery DT. In this regard, adopting online learning methods is crucial. Online learning is a method that uses ML algorithms to continuously update and improve the accuracy of the prediction model. By embedding online learning into the SOH estimation model, the model is continuously updated with new data as it becomes available, enabling it to improve over time. In this way, the real-time data stream is continuously analyzed and the model parameters are updated in real-time employing ML techniques, which provides a more accurate representation of the behavior of the battery. The estimation of battery SOH with online learning is shown in
Figure 12. These models can recognize and respond to changes in the behavior of batteries over time, which is especially crucial for battery usage in mobile applications, for instance in electric vehicles, where the battery undergoes a wide range of operating conditions and usage patterns.
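The continuous parameter update described above can be illustrated with a deliberately simple model. The sketch below uses a linear model updated by one stochastic gradient step per incoming sample; it stands in for whatever ML model is actually deployed, and the class name, learning rate, and synthetic data stream are hypothetical.

```python
class OnlineLinearSOHModel:
    """Linear SOH estimator whose parameters are updated one sample at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, measured_soh):
        """Take one gradient step on the squared error; return the pre-update error."""
        error = self.predict(features) - measured_soh
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error
        return error


# Synthetic stream (illustrative only): SOH falls linearly with a normalized
# usage feature x in [0, 1).
model = OnlineLinearSOHModel(n_features=1)
errors = []
for i in range(2000):
    x = (i % 50) / 50.0
    soh = 1.0 - 0.2 * x
    errors.append(model.update([x], soh))
```

Each new measurement adjusts the parameters immediately, so the model can follow gradual changes in battery behavior without being retrained from scratch.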
The ability to continually learn, adjust over time, and avoid catastrophic forgetting are among the important challenges of online learning. The learning model must balance stability (the ability to keep previous information) with plasticity (the flexibility to adjust to new information) over time. When learning a new task, excessive plasticity might lead to the forgetting of previously learned information. Conversely, excessive stability can make learning sequential tasks more challenging [55]. Li et al. [
56] presented a combined CNN and transfer learning model to extract features automatically and fine-tune a pre-trained model for online battery SOH estimation. Yan et al. [
57] suggested an XGBoost-based platform for online battery health diagnosis. Data aggregation and feature extraction for real-time battery data during charging are performed using dynamic time warping to cluster the measured data, and a SOH prediction model is built using the XGBoost method.
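Dynamic time warping, mentioned above as the basis for clustering measured data, compares two sequences while allowing local stretching in time. The following sketch is the textbook dynamic-programming formulation with an absolute-difference cost, not the implementation of [57]; the charging curves are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] and b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j - 1]
                                    cost[i][j - 1],      # repeat a[i - 1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Hypothetical charging voltage curves (V) sampled at different rates.
curve_a = [3.6, 3.8, 4.0, 4.2]
curve_b = [3.6, 3.8, 3.8, 4.0, 4.2]  # same shape, one sample repeated
curve_c = [3.5, 3.7, 3.9, 4.1]       # shifted down by 0.1 V

distance_same_shape = dtw_distance(curve_a, curve_b)
distance_shifted = dtw_distance(curve_a, curve_c)
```

Curves that differ only in sampling or timing obtain a small distance, which makes the measure suitable for grouping charging segments of unequal length before feature extraction.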
Incorporating online learning mechanisms into the battery SOH estimation model facilitates the development of a battery DT that mirrors the true condition of the battery over its lifetime, which is the focus of the authors’ future research. The data pre-processing method proposed in this paper, which can automatically extract features from raw data, is the first step toward developing a battery DT.