A Computationally Efficient Approach for the State-of-Health Estimation of Lithium-Ion Batteries

Qin, Haochen; Fan, Xuexin; Fan, Yaxiang; Wang, Ruitian; Shang, Qianyi; Zhang, Dong

doi:10.3390/en16145414

by Haochen Qin, Xuexin Fan, Yaxiang Fan ^*, Ruitian Wang, Qianyi Shang and Dong Zhang

National Key Laboratory of Electromagnetic Energy, Naval University of Engineering, Wuhan 430033, China

^* Author to whom correspondence should be addressed.

Energies 2023, 16(14), 5414; https://doi.org/10.3390/en16145414

Submission received: 10 June 2023 / Revised: 6 July 2023 / Accepted: 13 July 2023 / Published: 16 July 2023

(This article belongs to the Special Issue Advanced Application Technology of Lithium-Ion Batteries)

Abstract:

High maintenance costs and safety risks caused by lithium-ion battery degradation have seriously restricted the application potential of batteries. This paper therefore proposes a computationally efficient approach for state of health (SOH) estimation in lithium-ion batteries that can be implemented in battery management system (BMS) hardware. First, among the variables of the charge profile, only the complete voltage data are taken as the input, which represents the full aging characteristics of the batteries while limiting the computational complexity. Then, the light gradient boosting machine (LightGBM) and weighted quantile regression (WQR) methods are combined to learn a nonlinear mapping between the measurable characteristics and the SOH. A confidence interval is used to quantify the uncertainty of the SOH estimate, and the resulting model is called LightGBM-WQR. Finally, two public datasets are employed to verify the proposed approach. The proposed LightGBM-WQR model achieves high accuracy in its SOH estimation, and the mean absolute error (MAE) across all cells is limited to 1.57%. In addition, the average computation time of the model over ten runs is less than 0.8 ms. This work shows that the model is effective and rapid in its SOH estimation. The SOH estimation model has also been tested on an edge computing module as a possible way to take over the computing load borne by the BMS, which provides a tentative solution for online practical applications such as energy storage systems and electric vehicles.

Keywords:

lithium-ion battery; state of health; battery management system; light gradient boosting machine; weighted quantile regression; interval estimation; edge computing

1. Introduction

Due to the characteristics of high energy density, remarkable cycling performance, and low environmental pollution, lithium-ion batteries are widely applied in portable electronic devices, electric transportation, and microgrid energy storage [1]. However, as lithium-ion batteries are cycled, an irreversible electrochemical reaction occurs inside of them. The reaction leads to battery degradation, which is specifically manifested as battery capacity fade and a rise in the internal resistance [2,3]. In severe cases, a thermal runaway may occur. In particular, incorrect operations will also intensify the battery aging process, which invariably increases the potential maintenance costs and security risks. Therefore, developing an efficient and reliable battery management system (BMS) has become an urgent demand. A BMS monitors the performance of lithium-ion batteries in real time by collecting characteristics such as the voltage, current, and temperature of the battery and by estimating the state [4]. In addition, various strategies are integrated into BMSs for battery charge–discharge control and health management. As one of the typical states estimated in a BMS, the state of health (SOH) is applied to track the actual aging characteristics of batteries during their operation. The BMS not only displays the estimated SOH value to the user in real time for timely battery maintenance but also employs the SOH as the decision support for online battery health management [5]. Because the electrical characteristics of aging lithium-ion batteries are manifested by the decrease in their capacity [6,7], the SOH can be expressed as follows:

SOH = \frac{Q_{now}}{Q_{full}}

(1)

where Q_{now} and Q_{full} represent the battery's current available capacity and initial capacity, respectively.

However, it is not feasible to directly measure the capacity of the battery in some applications; for example, direct measurements are often not applicable to online estimates. Moreover, the evaluation standard for battery aging behavior has not been made strictly clear.

The above reasons make the accurate estimation of the SOH challenging, and this problem has attracted the attention of many researchers. Most existing research on SOH estimation pursues estimation accuracy at the expense of computational efficiency, which makes these methods difficult to apply in BMSs. As SOH estimation is only one of many BMS tasks, it must occupy as few computing resources as possible, and a fast SOH estimate frees the BMS to perform other health management tasks. Therefore, it is necessary to develop a computationally efficient method for SOH estimation.

1.1. A Brief Review of Existing Approaches

A large number of approaches have been proposed for lithium-ion battery SOH estimation; these methods fall into three main categories: direct measurement methods, model-based methods, and data-driven methods [8].

The direct measurement method mainly uses raw measurement data related to battery aging and estimates the SOH by calculating the capacity or internal resistance. In such cases, the lithium-ion battery is charged under standard operating conditions and then discharged to the cutoff voltage at the rated current. The measured amount of charge released can then be used to calculate the SOH as shown in Equation (1). Another direct measurement method is electrochemical impedance spectroscopy (EIS), which monitors and studies the battery aging mechanism nondestructively by measuring the impedance [9]. The hybrid pulse power characteristic (HPPC) test, also a direct measurement method, requires special battery testing equipment to measure the direct current internal resistance of the battery and estimate the SOH [10]. These methods can accurately determine the discharge capacity and internal resistance, but the restrictions on their test conditions make them unsuitable for complex dynamic conditions. Therefore, they are better suited to laboratory applications and cannot meet the requirements of efficient estimation in BMSs.

Model-based methods aim to develop a mechanistic model that describes the degradation behavior of the battery, and they mainly focus on developing an equivalent circuit model (ECM) and electrochemical model. The ECM simulates the charge–discharge characteristics of lithium-ion batteries by numerically expressing the electrical components, including the capacitors and resistors [11]. To some extent, this approach considers the aging mechanism and has the advantages of a simple structure and good dynamic response [6,12,13]; however, the accuracy of this approach is highly dependent on the fidelity of the ECM model. Building high-fidelity models to adequately describe the dynamic behavior of aging under different operating conditions will be computationally burdensome. By numerically expressing the internal electrochemical reaction process of the lithium-ion battery, the electrochemical model describes the charge–discharge behavior and precisely quantifies the aging state [14,15]. However, the electrochemical model often consists of many nonlinear and coupled partial differential equations, which are prone to the dilemma where it becomes difficult to identify model parameters [16]. Yan et al. [17] proposed the Kalman filtering (KF) approach to assist in identifying lithium-ion batteries’ model parameters; however, due to the complex working mechanism of the lithium-ion battery, most existing studies only approximate the working process of lithium-ion batteries from a single perspective. These studies fail to accurately describe its actual situation under the coupling of multiple physical fields such as circuit, electrochemical, and thermal circuits. Even if a multi-physics field model is developed that can accurately estimate the SOH, the computational complexity that is caused by correcting the model’s parameters will abruptly increase, making it difficult to efficiently implement the model in BMSs.

Due to having flexibility and being battery model-free, data-driven methods are emerging and have become one of the most significant methods for battery SOH estimation [18,19]. Compared with model-based methods, this type of method has the advantages of high accuracy, robustness, and a relatively small complexity of online estimation [20]. They can also establish a nonlinear mapping relationship between direct measurement data and the SOH using machine learning methods without prior knowledge of the battery degradation and then utilize real-time data to estimate the SOH. One key benefit of data-driven-based methods is these data can be easily measured using BMS sensors, which certainly provide an opportunity for the deployment of a data-driven approach. Many popular machine learning methods have been applied to address this problem [8,21]; for instance, in ref. [22], a support vector regression (SVR) model was built to connect the health indicator and SOH of the battery. The experiment implied that the trained SVR model reflected the degradation process of lithium-ion batteries well. In [23], a novel Gaussian process regression (GPR) model was proposed based on the features that were extracted from direct measurement curves, and the model achieved an accurate SOH estimation with a reliable uncertainty representation. Li et al. [24] designed a variant long short-term memory (LSTM) neural network to guarantee the SOH estimation performance. The accuracies of the SOH estimation methods mentioned above have been verified in some public datasets. Furthermore, a hybrid neural network called gate recurrent unit-convolutional neural network (GRU-CNN) was proposed in our previous study, where we attempted to migrate it to a microcontroller for online estimation [25]; however, the computational complexity limited the estimation performance of the GRU-CNN approach.

As a branch of machine learning, ensemble learning is generally considered to be the interpretation of collective intelligence by machine learning, and it is capable of fast calculations [26]. Specifically, ensemble learning combines several similar base learners into one strong learner by using strategies such as bagging, boosting, and stacking; in this way, the generalization error of a single learner will be compensated. A more critical advantage of the method is that ensemble learning exhibits high computational efficiency when the computation of the base learner (e.g., decision tree) has a low cost [27]. For instance, random forest (RF) is a representative algorithm of the bagging strategy, and it classifies data samples using a binary decision tree as the base learner. Yang et al. [28] adopted the RF algorithm to ensemble two CNN models and proposed a solution for SOH estimation based on partial discharge data to achieve complementary advantages. Different from the bagging strategy, the boosting strategy integrates multiple weak learners along with an iterative fitting of residuals into a strong learner to approximate the objective function. Representative ensemble learning algorithms that apply boosting strategies include gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), etc. Qin et al. [29] selected the geometric features of the local voltage profile as the aging features to estimate the SOH using the GBDT model. Song et al. [30] adopted the average voltage of equal time intervals, the voltage difference, and the temperature difference as the feature inputs and developed an XGBoost model to estimate SOH. The authors achieved high estimation accuracy on the NASA dataset. LightGBM is a novel GBDT algorithm that has been used in many different fields of regression tasks [31]. So far, few studies have established LightGBM framework models for SOH estimation. 
With faster speed and lower memory consumption than other ensemble learning algorithms, LightGBM has solved many practical problems in data mining and machine learning, which makes it a promising candidate for estimating the health of lithium-ion batteries.

1.2. Research Motivation and Original Contribution

The development of computationally efficient methods for accurate SOH estimation can reduce maintenance costs, mitigate battery safety risks, and improve reliability. Although existing methods can achieve excellent estimation results, limited computing resources and the difficulty of obtaining the required measurements in practice make them hard to implement in BMSs.

To address this problem, we propose a new, efficient, and practical SOH estimation method based on LightGBM, an ensemble learning method with high computational efficiency. Because the method takes the charging voltage data measured directly by the BMS sensor as its input, it is suitable for online applications and further reduces the computational complexity by omitting the feature extraction process. In addition, weighted quantile regression (WQR) is combined with the loss function of LightGBM so that the model describes the estimation results more comprehensively while still providing a swift SOH estimate. The model not only produces a point estimate of the SOH but also gives the interval within which the SOH is distributed, that is, an interval estimate. The interval estimate reflects the confidence range as the SOH estimation results vary with a given parameter, thus providing more useful information. We also propose deploying edge intelligent computing modules, which are well suited to running artificial intelligence algorithms, alongside the BMS to share its computational load. The main contributions of this paper are as follows:

This paper proposes an approach for estimating the SOH in lithium-ion batteries under the LightGBM framework with the intent of achieving efficient calculations in BMSs.
In the LightGBM framework, a LightGBM-WQR model is developed to minimize the quantile regression loss functions, provide 90% confidence intervals for SOH estimation, and establish an evaluation method. These processes make the SOH calculation more efficient and contain more information.
Due to the varying contributions of data samples in the training set towards SOH estimation, randomly generated sequences are utilized for sample weighting in order to achieve superior integration results. Through multiple rounds of random weighting, the contribution of each sample is determined and subsequently used to select the optimal SOH estimation result.
The proposed method is verified in computer and edge intelligent computing modules, respectively; the advantages of this method for improving computational efficiency are illustrated by comparing and analyzing the computational complexity.

The remaining sections are organized as follows: Section 2 describes the selection and preprocessing of datasets. The details of the SOH estimation method are presented in Section 3. Section 4 focuses on the analysis of the experimental results. Finally, Section 5 summarizes the conclusions.

2. Dataset Selection and Preprocessing

In the development of a data-driven method, the dataset is crucial to construct an effective framework. This section presents the details of the battery dataset and data preprocessing used in this paper by analyzing the degradation characteristics of two publicly available degradation datasets for lithium-ion batteries.

2.1. Battery Dataset

This study adopted the NASA Randomized Battery Usage dataset (hereafter referred to as the NASA dataset) and Oxford Battery Degradation dataset (hereafter referred to as the Oxford dataset) for the experiments.

2.1.1. NASA Randomized Battery Usage Dataset

The NASA dataset was generated in 2014 by NASA's Ames Center for Predictive Science, and it is widely used to study the degradation characteristics of lithium-ion batteries [32]. The dataset includes 28 LG Chem 18650 LiFePO4 cells, which are named RW1 to RW28 after the random walk (RW) conditions under which they were cycled. The cells are divided equally into seven groups with different regimes. The groups differ in temperature, in the random walk procedure, and in how often the reference charge and discharge are performed. For example, the temperatures of Group 4 and Group 6 are 25 °C and 40 °C, respectively, which is the only difference between the loading procedures of these two groups. The same applies to the temperature settings of Groups 5 and 7. Groups 1 and 2 are both operated at 25 °C, so their experimental procedures differ only in the random walk procedure. Cells of Group 1 are continuously operated by repeatedly discharging them to 3.2 V using a randomized sequence of discharging currents ranging between 0.5 A and 4 A. After being completely discharged, these cells are charged for a duration selected randomly from multiples of 0.5 h between 0.5 h and 3 h. In contrast to Group 1, cells of Group 2 are continuously operated by repeatedly charging them to 4.2 V and then discharging them to 3.2 V using a randomized sequence of discharging currents ranging between 0.5 A and 4 A. The details of the other groups' conditions are shown in Table 1. Among these groups, the RW procedure is the most important controlled condition in the NASA dataset.

The typical operating voltage of these cells ranges from 3.2 V to 4.2 V, and the rated capacity is 2.1 Ah. The reference charge and discharge are conducted to determine the battery capacity by ampere-hour integration. The reference charge uses the constant current-constant voltage (CC-CV) mode, and the reference discharge uses the constant current (CC) mode. First, the cells are charged at 2 A until they reach 4.2 V, at which point charging switches to constant voltage mode and continues until the charging current falls below 0.01 A. Second, the cells are rested for a while with no current draw. Third, the cells are discharged at 2 A until the battery voltage reaches 3.2 V. Finally, they are rested for some time. The reference procedure is designed to evaluate the capacity as an indicator of the SOH [33]. In total, 947 charging curves are available for research on the SOH estimation of lithium-ion batteries.
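The ampere-hour integration mentioned above, together with Equation (1), can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the trapezoidal rule, and the synthetic constant-current discharge are assumptions made for the example.

```python
def capacity_ah(current_a, dt_s):
    """Integrate a sampled discharge current (A) into capacity (Ah).

    Assumption: samples are equally spaced by dt_s seconds and the
    trapezoidal rule is an acceptable approximation of the integral.
    """
    total_as = 0.0
    for i0, i1 in zip(current_a, current_a[1:]):
        total_as += 0.5 * (i0 + i1) * dt_s  # ampere-seconds for one interval
    return total_as / 3600.0                # convert A*s to A*h


def soh(q_now_ah, q_full_ah):
    """Equation (1): ratio of the current capacity to the initial capacity."""
    return q_now_ah / q_full_ah


# Hypothetical reference discharge: constant 2 A for one hour, sampled at 1 s.
current = [2.0] * 3601
q_now = capacity_ah(current, 1.0)
print(q_now, soh(q_now, 2.1))  # capacity in Ah, SOH against the 2.1 Ah rating
```

In this synthetic case the cell delivers 2.0 Ah, so the SOH relative to the 2.1 Ah rated capacity is roughly 0.95.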

2.1.2. Oxford Battery Degradation Dataset

The Oxford dataset adopts eight Kokam LiCoO2 batteries with a conventional operating voltage range of 2.7 V to 4.2 V and a rated capacity of 0.74 Ah [34]. Unlike the NASA dataset, the Oxford dataset was obtained through repeated experiments on lithium-ion batteries using ARTEMIS urban driving conditions at a constant ambient temperature of 40 °C. The cells, which are named Cell1 to Cell8, are first charged at a constant current of 2 C to 4.2 V and then discharged in a variable current mode. In each subsequent charge–discharge cycle, the cells are charged to 4.2 V at a constant current of 1 C and then discharged at a constant current of 1 C. The battery charging process data are recorded between the cell voltage limits once every 100 cycles. In a recorded cycle, the sampling interval is 1 s. However, all cells lack the data of the 1500th, 1700th, 3400th, 4700th, and 4900th cycles, and the current data are not recorded. Therefore, there are 517 charging voltage curves available for SOH estimation.

2.2. Dataset Preprocessing

The two public datasets record the voltage, current, capacity, and temperature, all of which can also be measured directly by a BMS. In practice, thanks to the effective control of the BMS, the charging process is more stable than the discharge process [25]. Moreover, in the NASA dataset, every reference profile consists of a full charge after a complete discharge. This avoids the uncertain initial state of charge caused by a previous incomplete discharging or charging process [35]. Therefore, the charge curves in the NASA dataset reflect battery aging more accurately than the discharge curves. In the Oxford dataset, the load profiles consist of full charge and discharge cycles. Specifically, the Oxford dataset records capacity instead of current, which to some extent indicates a steady reaction process. Moreover, within a single cycle, the change in SOH between the end of charging and the end of discharging is negligible. Thus, charging curves are also suitable for describing battery aging in the Oxford dataset. Stable charging curves can also reduce the impact of a few outliers [2].

Feature extraction isolates the parts of the charging curves that correlate strongly with battery aging. In many studies on the SOH estimation of lithium-ion batteries, researchers have used different feature extraction methods for different datasets. In other words, the same feature extraction method may correlate unevenly with aging across different batteries, so the existing methods still have deficiencies. To solve this problem, Yang et al. [36] analyzed a total of 18 characteristics, including voltage, current, temperature, and capacity, and selected those that had a high correlation with SOH. Although the analysis of feature selection was adequate and detailed, that paper ignored the computational burden caused by feature redundancy. Moreover, unselected features are often also correlated with SOH, even if less obviously than the selected ones, and they still characterize the degree of battery aging to a certain extent. Eliminating them may cause the processed data to lose some aging information about the battery. Therefore, the complete direct measurement data were chosen as the feature in order to preserve the full characteristics of the battery and avoid redundancy. Voltage is one of the most important physical quantities for characterizing the SOH of lithium-ion batteries [37]. For instance, Figure 1 shows how the charging voltage curves of RW20 and Cell8 vary with time. A color gradient indicates the decay of the cells across the curves, and each curve is a monotonic function of time. For a newly activated battery, the SOH is relatively large and the charging voltage increases slowly to the cut-off voltage. However, as the battery continues to perform charge and discharge cycles, the SOH gradually decreases, and the charging voltage rises to the cut-off voltage faster. As a result, charging voltage curves can be used directly as a measurement feature for SOH estimation.

The aforementioned facts demonstrate that estimating SOH by using the complete charging voltage curve as a feature can not only limit computational complexity but also preserve batteries’ degradation characteristics [38]. Consequently, the voltage during the complete charging curve was selected for SOH estimation in this study.

The complete charging voltage curves of each battery in the above two datasets are preprocessed as follows. First, each complete charging voltage curve is resampled into a fixed-length sequence of 256 points. According to the dataset information, the charging time differs between the complete charging voltage curves: as the number of cycles increases, the battery takes less and less time to charge. Because the time interval of the data collection is fixed, the number of voltage points collected for each complete charging process is different. For example, in the Oxford dataset, Cell8 has 43 more voltage points in the 1st cycle than in the 100th cycle. To avoid estimation results that are biased towards charging curves with more voltage points, resampling is used to unify the dimensions, so that curves of different lengths have the same number of sampling points after resampling. The sampling stride therefore differs between curves; it is obtained by dividing the number of voltage points by 256 and rounding down. Because each complete charging voltage curve corresponds to the available capacity Q_{now} of one cell, each voltage sequence is also in one-to-one correspondence with Q_{now}. The mapping between them is established by the SOH estimation model proposed in Section 3.
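The resampling step can be sketched as below. This is one reading of the description rather than the authors' implementation: it assumes the stride is the number of recorded points divided by 256 and rounded down, that every curve has at least 256 points, and that samples are taken from the start of the curve. The function name and the synthetic curve are hypothetical.

```python
def resample_fixed_length(voltage, n_points=256):
    """Downsample one charging voltage curve to exactly n_points samples.

    Assumption: stride = floor(len(voltage) / n_points), samples taken at
    indices 0, stride, 2 * stride, ... as described in the text.
    """
    n = len(voltage)
    if n < n_points:
        raise ValueError("curve is shorter than the target length")
    stride = n // n_points  # integer division rounds down
    return [voltage[i * stride] for i in range(n_points)]


# Hypothetical curve: 1000 samples rising linearly from 2.7 V to 4.2 V.
curve = [2.7 + 1.5 * i / 999 for i in range(1000)]
fixed = resample_fixed_length(curve)
print(len(fixed))
```

Because the last index used is 255 times the stride, a short tail of the curve may be dropped; an interpolation-based resampler would avoid that at slightly higher cost.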

Then, the dataset is augmented so that the data-driven algorithm becomes more robust to the noise that accompanies the measured characteristic in real-world measurement devices. In actual measurements, the data may deviate because of the systematic error of the BMS sensor. The two datasets were measured using laboratory equipment, for which this offset can be approximately ignored. However, the BMS of a deployed energy storage device is more tightly integrated electrically than a laboratory setup, which exposes it to interference injected into the system by other electronic and radio-frequency devices. These considerations motivate us to expand the dataset through data augmentation. Concretely, we assume that the noise injected by the electrical environment is white Gaussian noise with varying amplitudes. MATLAB is used to process the data and to introduce Gaussian white noise into the charging voltage curves. First, we apply the wgn function to generate white Gaussian noise with a mean of 0 and a deviation ranging from 0.3% to 3%. Then, this noise is added to the voltage curves to generate the noisy voltage variants shown in Figure 2. As Figure 1 and Figure 2 show, the added noise makes the voltage curve rough while maintaining its overall trend, whereas the raw data exhibit a much smoother profile. Finally, the augmented training set is composed of the noisy curves and the original noise-free voltage curves, which doubles the size of the training set. The augmented dataset thus serves both to train the SOH estimation model and to enhance its generalization capability. In summary, the augmentation technique makes the estimation model more robust to measurement error, improves its estimation accuracy, and helps prevent overfitting.
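The augmentation can be illustrated with a Python analogue of the MATLAB workflow. This is only a sketch under stated assumptions: the article does not say what the 0.3% to 3% deviation is measured against, so the code assumes it is a standard deviation relative to the mean voltage of each curve, drawn uniformly from that range. All names and the synthetic curves are hypothetical.

```python
import random


def add_gaussian_noise(curve, rel_std, rng):
    """Return a noisy copy of one voltage curve (zero-mean Gaussian noise).

    Assumption: sigma = rel_std * mean(curve); the reference level is not
    specified in the article.
    """
    sigma = rel_std * sum(curve) / len(curve)
    return [v + rng.gauss(0.0, sigma) for v in curve]


def augment(curves, rng, low=0.003, high=0.03):
    """Append one noisy variant per curve, doubling the training set."""
    noisy = [add_gaussian_noise(c, rng.uniform(low, high), rng) for c in curves]
    return list(curves) + noisy


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_curves = [[3.2 + 0.01 * i for i in range(100)],
                [3.3 + 0.009 * i for i in range(100)]]
augmented = augment(train_curves, rng)
print(len(train_curves), "->", len(augmented))
```

Only the training curves are perturbed here; whether the test set is also perturbed is a separate experimental choice that this sketch does not address.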

Finally, we normalize the voltage data x as follows:

x' = \frac{x - \min (x)}{\max (x) - \min (x)}

(2)

where x' represents the value of the normalized voltage.
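Equation (2) is ordinary min-max scaling. A minimal sketch follows, assuming the minimum and maximum are taken over a single voltage sequence; the article does not state whether they are computed per curve or over the whole dataset, and the function name is hypothetical.

```python
def min_max_normalize(x):
    """Scale a voltage sequence to the range [0, 1] as in Equation (2).

    Assumption: min and max are computed over this sequence only.
    """
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("constant sequence cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in x]


normalized = min_max_normalize([3.2, 3.7, 4.2])
print(normalized)
```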

3. LightGBM-WQR Model for SOH Estimation

In this section, the LightGBM framework in ensemble learning is combined with the weighted quantile regression (WQR) method to build a computationally efficient SOH estimation model. First, we formulate the current SOH estimation problem. Second, the principle of the LightGBM framework is introduced. Then, we present the weighted quantile regression method in detail and derive how to use WQR to improve LightGBM from the loss function. Lastly, the LightGBM-WQR model is developed for estimating SOH. This is the key contribution of this paper.

3.1. Problem Formulation

Given a training set of input–output pairs D = \{(x_{i}, y_{i})\}_{i=1}^{N_{D}}, the goal of the data-driven SOH estimation problem is to learn a nonlinear mapping f(\cdot) from the characteristics x to the SOH y, where x is the complete charging voltage data and N_{D} is the number of training examples. Subsequently, the SOH y' is estimated from newly observed charging voltage data x' and the nonlinear mapping f(\cdot) as follows:

y' = f(x')

(3)

In this paper, the approach called LightGBM-WQR is used to learn this nonlinear mapping f(\cdot).

3.2. LightGBM Framework

LightGBM was developed based on XGBoost and has been widely used in regression tasks [31]. The XGBoost framework adopts a boosting learning method, which iteratively builds several decision trees and integrates them into a strong learner. Compared with the gradient boosting tree, XGBoost introduces a regularization term and uses a second-order expansion to represent the loss function, which gives it a stronger fitting ability. LightGBM computes faster than XGBoost owing to three improvements. First, the gradient-based one-side sampling (GOSS) algorithm retains all samples with large gradients and randomly samples those with small gradients, which makes the selection of the best split point faster. Second, the exclusive feature bundling (EFB) algorithm bundles features that are completely or almost mutually exclusive in high-dimensional sparse data, thus reducing the data dimensionality. It is worth noting that both GOSS and EFB build on the histogram algorithm, which the LightGBM framework uses to reduce storage space and speed up intermediate computation. The combination of these three algorithms therefore plays a crucial role in improving the computational efficiency of the LightGBM framework. Third, LightGBM adopts a leaf-wise growth strategy with a depth limit, which at each step splits the leaf node with the highest splitting gain and avoids producing too many leaf nodes in the same layer. Compared with the level-wise growth strategy of decision trees in the XGBoost framework, this strategy improves the training speed of the model and inhibits overfitting by limiting the depth of the tree. In summary, LightGBM is an ensemble learning method that runs faster and with less computational complexity while still achieving a high-precision SOH estimation. The LightGBM framework is described in detail below.
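The GOSS idea described above can be sketched in a few lines. This is an illustrative re-implementation and not LightGBM's internal code: the function name and the rate parameters are hypothetical, and the re-weighting factor (1 - a) / b for the sampled small-gradient instances is taken from the original GOSS description rather than from this article.

```python
import random


def goss_sample(gradients, top_rate, other_rate, rng):
    """Select training instances by gradient-based one-side sampling.

    Keeps the top_rate fraction with the largest |gradient|, randomly draws
    an other_rate fraction from the remainder, and up-weights the drawn
    instances so the gradient sums stay approximately unbiased.
    Returns a dict mapping instance index -> weight.
    """
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    sampled = rng.sample(rest, n_other)
    amplify = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return weights


grads = [0.05, -0.9, 0.02, 0.7, -0.01, 0.03, -0.6, 0.04, 0.8, -0.02]
selected = goss_sample(grads, top_rate=0.2, other_rate=0.2, rng=random.Random(0))
print(sorted(selected.items()))
```

With these hypothetical settings only four of the ten instances are used to evaluate candidate splits, which is where the speed-up comes from.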

Given the battery training set D = \{(x_{i}, y_{i})\}_{i=1}^{N_{D}}, LightGBM iteratively builds an approximation \hat{f}(x) to a function f^{*}(x) that minimizes the expected value of a specified loss function L(y, f(x)) as follows:

\hat{f} = \arg \min_{f} E_{y, D} L(y, f(x))

(4)

LightGBM integrates K regression trees to approximate the final function, which is

{\tilde{y}}_{i} = f(x_{i}) = \sum_{k=1}^{K} f_{k}(x_{i})

(5)

where {\tilde{y}}_{i} represents the estimated SOH value for the ith sample, x_{i} is the normalized voltage input vector of that sample, and f_{k}(x_{i}) represents the kth regression tree. Each regression tree can be expressed as w_{q(x)}, q \in \{1, 2, \dots, T\}, where q stands for the decision rules of the tree, T denotes the number of leaves, and w is the sample weight of the leaf nodes in the tree structure. To guard against overfitting, the regularization term Ω is added when solving the objective function. The regularization coefficients γ and λ control the number of leaves and the weights of the leaf nodes, respectively, thereby limiting the complexity of the regression trees and suppressing overfitting. A smaller regularization term corresponds to lower complexity and stronger generalizability. The equations are as follows:

L (\hat{f}) = \sum_{i = 1}^{n} l (y_{i}, {\tilde{y}}_{i}) + \sum_{k} Ω (f_{k})

(6)

Ω (f) = γ T + \frac{1}{2} λ {‖ w ‖}^{2}

(7)

Here,

l

is a differentiable convex loss function, which measures the discrepancy between estimation

{\tilde{y}}_{i}

and target

y_{i}

. LightGBM is trained in an additive manner. The following shows the objective function of solving the kth iteration:

L^{(k)} = \sum_{i = 1}^{n} l (y_{i}, {\tilde{y}}_{i}^{(k - 1)} + f_{k} (x_{i})) + Ω (f_{k})

(8)

Using Taylor expansion, we transform

l

and combine the constant terms. As the constant term does not influence the solution of the objective function, it is omitted from the following equations:

{\tilde{L}}^{(k)} ≅ \sum_{i = 1}^{n} [g_{i} f_{k} (x_{i}) + \frac{1}{2} h_{i} f_{k}^{2} (x_{i})] + Ω (f_{k})

(9)

g_{i} = \frac{\partial l (y_{i}, {\tilde{y}}_{i}^{(k - 1)})}{\partial {\tilde{y}}_{i}^{(k - 1)}}

(10)

h_{i} = \frac{\partial^{2} l (y_{i}, {\tilde{y}}_{i}^{(k - 1)})}{\partial ({\tilde{y}}_{i}^{(k - 1)})^{2}}

(11)

g_{i}

and

h_{i}

are first- and second-order gradient statistics on the loss function. Define

I_{j} = \{i | q (x_{i}) = j\}

as the sample set of leaf

j

. Equation (9) can be rewritten by expanding

Ω

as follows:

{\tilde{L}}^{(k)} = \sum_{j = 1}^{T} [(\sum_{i \in I_{j}} g_{i}) w_{j} + \frac{1}{2} (\sum_{i \in I_{j}} h_{i} + λ) w_{j}^{2}] + γ T

(12)

The optimal weight

w_{j}^{*}

of leaf

j

can be computed for a fixed structure

q (x)

as follows:

w_{j}^{*} = - \frac{\sum_{i \in I_{j}} g_{i}}{\sum_{i \in I_{j}} h_{i} + λ}

(13)
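The step from Equation (12) to Equation (13) can be sketched as follows. The shorthand G_{j} and H_{j} for the per-leaf sums of g_{i} and h_{i} is introduced here only for brevity and does not appear in the original notation. Because each leaf contributes an independent quadratic in w_{j}, setting its derivative to zero gives the optimal weight, and substituting that weight back yields the per-leaf term of Equation (14):

```latex
G_{j} = \sum_{i \in I_{j}} g_{i}, \qquad H_{j} = \sum_{i \in I_{j}} h_{i}

\frac{\partial {\tilde{L}}^{(k)}}{\partial w_{j}} = G_{j} + (H_{j} + λ) w_{j} = 0
\;\Rightarrow\; w_{j}^{*} = - \frac{G_{j}}{H_{j} + λ}

G_{j} w_{j}^{*} + \frac{1}{2} (H_{j} + λ) {(w_{j}^{*})}^{2} = - \frac{1}{2} \frac{G_{j}^{2}}{H_{j} + λ}
```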

The corresponding minimum objective function values can be obtained as follows:

{\tilde{L}}^{(k)} (q) = - \frac{1}{2} \sum_{j = 1}^{T} \frac{{(\sum_{i \in I_{j}} g_{i})}^{2}}{\sum_{i \in I_{j}} h_{i} + λ} + γ T

(14)

which can be utilized as the evaluation function that measures the quality of a tree structure

q

. In practice, it is inefficient to enumerate all possible tree structures. Therefore, a greedy algorithm is applied, in which each candidate split is evaluated by the following gain:

L_{s p l i t} = \frac{1}{2} [\frac{{(\sum_{i \in I_{L}} g_{i})}^{2}}{\sum_{i \in I_{L}} h_{i} + λ} + \frac{{(\sum_{i \in I_{R}} g_{i})}^{2}}{\sum_{i \in I_{R}} h_{i} + λ} - \frac{{(\sum_{i \in I} g_{i})}^{2}}{\sum_{i \in I} h_{i} + λ}]

(15)

where

I_{L}

and

I_{R}

are the sample sets of the left and right nodes after the split, respectively. It is important to reiterate that a leaf-wise strategy makes splitting more effective. Compared with the traditional regression tree method, LightGBM has a faster convergence rate. In addition, the hyperparameter setting and optimization of LightGBM affect its actual prediction result. See Table 2 for the details of these hyperparameters.
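As a small numerical illustration of Equations (13) and (15), the sketch below evaluates the optimal leaf weight and the split gain for a toy node. The function names are hypothetical, and the squared-error loss used in the example (for which g_{i} is the residual and h_{i} = 1) is an assumption made only to obtain concrete numbers.

```python
def leaf_weight(g, h, lam):
    """Optimal leaf weight w_j* from Equation (13)."""
    return -sum(g) / (sum(h) + lam)


def leaf_score(g, h, lam):
    """Per-leaf term (sum g)^2 / (sum h + lambda) used in Equations (14)-(15)."""
    return sum(g) ** 2 / (sum(h) + lam)


def split_gain(g, h, left_idx, lam):
    """Gain of splitting a node into left/right children, Equation (15)."""
    left = set(left_idx)
    g_l = [g[i] for i in range(len(g)) if i in left]
    h_l = [h[i] for i in range(len(h)) if i in left]
    g_r = [g[i] for i in range(len(g)) if i not in left]
    h_r = [h[i] for i in range(len(h)) if i not in left]
    return 0.5 * (leaf_score(g_l, h_l, lam)
                  + leaf_score(g_r, h_r, lam)
                  - leaf_score(g, h, lam))


# Toy node: targets [1, 1, 3, 3], current prediction 0, squared-error loss
# l = 0.5 * (y - y_hat)^2, so g_i = y_hat - y_i and h_i = 1.
g = [-1.0, -1.0, -3.0, -3.0]
h = [1.0, 1.0, 1.0, 1.0]
gain = split_gain(g, h, left_idx=[0, 1], lam=0.0)
```

In this toy case the unsplit leaf would output 2, while the two children output 1 and 3, and the positive gain indicates that the split reduces the objective.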

Therefore, in the LightGBM framework, the SOH estimation model will be established with less computational complexity, a faster training process, and stronger generalizability.

3.3. Weighted Quantile Regression for LightGBM

To obtain uncertainty information during SOH estimation, quantile regression is introduced into LightGBM to realize SOH interval estimation. SOH interval estimation better reflects the distribution of model-predicted values within a certain interval and provides more potential health information about lithium-ion batteries [39,40]. As mentioned in Section 3, the Taylor expansion of the objective function under the LightGBM framework and the introduction of the second derivative not only improve the gradient convergence rate but also make it possible to select among multiple forms of the loss function. Therefore, SOH interval estimation can be achieved by adopting the quantile loss function.

The quantile loss function is commonly utilized in quantile regression to estimate parameters for each quantile and solve interval prediction problems. A QR model is often trained by solving an optimization problem that minimizes the quantile loss [41]:

P_{k, τ} = \arg \min_{P_{k, τ}} \sum_{i = 1}^{n} L_{i, τ}^{(k)} (y_{i}, f_{k, τ} (x_{k, i}, P_{k, τ}))

(16)

where

τ

represents the quantile. The quantile regression model

f_{k, τ} (x_{k, i}, P_{k, τ})

can be built for each quantile

τ

and model formed at the kth iteration. Equation (16) aims to optimize the parameter

P_{k, τ}

, and the result of the quantile model is expressed as

{\tilde{y}}_{i, τ}

. When the quantile regression model is trained with

D = {(x_{i}, y_{i})}_{i = 1}^{N_{D}}

,

P_{k, τ}

is only determined by the quantile value

τ

. Then, the quantile loss function

L_{i, τ}^{(k)} (y_{i}, {\tilde{y}}_{i, τ})

can be written as below:

L_{i, τ}^{(k)} = \begin{cases} (y_{i} - {\tilde{y}}_{i, τ}) \times τ, & {\tilde{y}}_{i, τ} \leq y_{i} \\ ({\tilde{y}}_{i, τ} - y_{i}) \times (1 - τ), & {\tilde{y}}_{i, τ} > y_{i} \end{cases}

(17)

where

L_{i, τ}^{(k)}

is the quantile loss function after the kth iteration with

τ

as the quantile. The quantile loss function, also known as the pinball loss function, is a metric for evaluating the performance of quantile regression models. When the quantile loss is smaller, the quantile regression performs better. The loss function is the key to combining quantile regression and LightGBM. From Equations (10) and (11), the information about the loss function of LightGBM has been included in the first-order derivative

g_{i}

and second-order derivative

h_{i}

. Therefore, by substituting Equation (17) into Equations (10) and (11), the first-order derivative

g_{i}

and second-order derivative

h_{i}

can be obtained. After the objective function is minimized, the weight of each leaf node is obtained as a first-order function of

τ

. When

τ

takes different values, the original model will obtain different estimation results using Equation (15). So far, we have established the LightGBM-QR model. This paper mainly focuses on the 90% estimation interval, so it is necessary to calculate the estimation values when

τ

is 0.05 and 0.95 as the lower and upper bounds of the prediction interval, respectively.
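The quantile (pinball) loss of Equation (17) and its first derivative with respect to the estimate can be sketched in a few lines. This is an illustrative sketch with hypothetical function names, not the authors' code. Note that the pinball loss is piecewise linear, so its exact second derivative is zero almost everywhere; implementations commonly substitute a constant or smoothed value for h_{i}, and how this is handled in LightGBM-WQR is not specified here.

```python
def pinball_loss(y, y_hat, tau):
    """Quantile loss of Equation (17) for a single sample."""
    if y_hat <= y:
        return (y - y_hat) * tau
    return (y_hat - y) * (1.0 - tau)


def pinball_gradient(y, y_hat, tau):
    """First derivative of Equation (17) with respect to y_hat."""
    # Under-estimation is penalized with slope -tau,
    # over-estimation with slope (1 - tau).
    return -tau if y_hat <= y else 1.0 - tau


# For the upper bound (tau = 0.95), under-estimating the SOH by 0.1 costs
# 19 times more than over-estimating it by the same amount.
under = pinball_loss(1.0, 0.9, 0.95)
over = pinball_loss(1.0, 1.1, 0.95)
```

The asymmetry is what pushes the model trained with τ = 0.95 toward the upper bound and the model trained with τ = 0.05 toward the lower bound.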

Moreover, as

g_{i}

and

h_{i}

determine the gradient boosting rate of the LightGBM-QR model, the iterative process can be further improved through random sequence weighting. To make the random process as fair as possible, each element of the sequence is drawn from a uniform distribution between 0 and 1, and the sequence is then normalized. Multiple sets of normalized weight sequences are randomly generated as hyperparameters of the model and participate in the SOH interval estimation. Through hyperparameter optimization, the weight sequence that yields the best estimation result is integrated into the model, and the resulting SOH estimation model is called LightGBM-WQR. LightGBM-WQR is trained according to the pseudocode shown in Algorithm 1 to compute the SOH.

Algorithm 1: Training algorithm for estimating the SOH
	Input	: Test data $x'$
	Output	: Estimation $f (x')$
	Data	: Training dataset $D = \{(x_{i}, y_{i})\}_{i = 1}^{N_{D}}$
1	Initialize the LightGBM-WQR hyperparameters $θ_{1}$, $θ_{2}$, …, $θ_{P}$
2	For $i$ = 1 to $P$ do
3		Loop until the terminal condition is met. One epoch:
4			Processing data with GOSS, EFB, and Histogram algorithms
5			Grow an independent decision tree with a leaf-wise strategy
6			Accumulate the decision tree by boosting and iterating Equations (4)–(17) to form a strong learner in Equation (3)
7			Terminal condition: The quadratic objective function of the strong learner obtains the optimal solution
8		end
9		Optimize hyperparameters $θ_{1}$, $θ_{2}$, …, $θ_{P}$ by GridSearchCV toolkit
10	end
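The random weight sequences used by the weighting step can be sketched as follows. Several details are assumptions made for illustration: the sequences are normalized to sum to 1, each weight multiplies the pinball loss of one training sample, and the function names are hypothetical. The exact normalization and the way the weights enter the boosting iterations may differ in the original implementation.

```python
import random


def random_weight_sequences(n_samples, n_candidates, seed=0):
    """Draw candidate weight sequences from U(0, 1) and normalize each one."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_candidates):
        raw = [rng.random() for _ in range(n_samples)]
        total = sum(raw)
        # Assumed normalization: each sequence sums to 1.
        candidates.append([value / total for value in raw])
    return candidates


def weighted_pinball(y, y_hat, weights, tau):
    """Weighted sum of per-sample quantile losses (Equation (17))."""
    total = 0.0
    for y_i, p_i, w_i in zip(y, y_hat, weights):
        diff = y_i - p_i
        loss = tau * diff if diff >= 0 else (tau - 1.0) * diff
        total += w_i * loss
    return total
```

Under these assumptions, each candidate sequence would be treated as one hyperparameter setting, a model would be trained per candidate, and the sequence with the best validation result would be retained.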

In this section, the SOH interval estimation model based on LightGBM-WQR is developed by replacing the loss function in the LightGBM framework with a quantile loss function. The complete procedure is shown in Figure 3. For the SOH prediction of lithium-ion batteries, the learning rate of the model is set to 0.1, the number of weak learners is set to 200, the maximum depth of the tree is set to 5, and the number of leaves in each tree is set to 15. The minimum child samples and minimum child weight are 18 and 0.001, respectively. All other hyperparameters keep their default values. Table 3 presents the hyperparameter settings.
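For readers who wish to reproduce a comparable setup, the settings above can be collected into a parameter dictionary for LightGBM's scikit-learn interface. This is a hedged sketch: it uses LightGBM's built-in quantile objective (`objective="quantile"` with `alpha` as the quantile), which is not necessarily identical to the weighted quantile loss of LightGBM-WQR, and mapping "weight" to `min_child_weight` is an assumption. The helper names are hypothetical.

```python
# Hyperparameters reported in the text (names follow the LightGBM sklearn API).
PAPER_PARAMS = {
    "learning_rate": 0.1,
    "n_estimators": 200,
    "max_depth": 5,
    "num_leaves": 15,
    "min_child_samples": 18,
    "min_child_weight": 0.001,  # assumed meaning of "weight" in the text
}


def quantile_params(tau):
    """Return the parameters for one quantile model (tau in (0, 1))."""
    params = dict(PAPER_PARAMS)
    params.update(objective="quantile", alpha=tau)
    return params


def fit_interval_models(X, y, taus=(0.05, 0.5, 0.95), sample_weight=None):
    """Fit one model per quantile; requires the lightgbm package."""
    from lightgbm import LGBMRegressor  # imported lazily so the config is usable alone

    models = {}
    for tau in taus:
        model = LGBMRegressor(**quantile_params(tau))
        models[tau] = model.fit(X, y, sample_weight=sample_weight)
    return models
```

The models fitted at τ = 0.05 and τ = 0.95 would give the lower and upper bounds of the 90% interval, and the model at τ = 0.5 the point estimate.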

4. Results and Discussion

This section divides the representative NASA and Oxford datasets into their respective training and testing sets and verifies that the proposed LightGBM-WQR method is effective for SOH estimation.

4.1. Metrics

Mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and R-square value are used to evaluate the accuracy of the model for estimating the battery SOH. The equations are as below.

MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\tilde{y}}_{i} |

(18)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\tilde{y}}_{i})}^{2}}

(19)

MAPE = \frac{1}{n} \sum_{i = 1}^{n} | \frac{y_{i} - {\tilde{y}}_{i}}{y_{i}} | \times 100 %

(20)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\tilde{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(21)

In addition, the mean pinball loss is adopted as an evaluation index for SOH interval estimation. This standard can evaluate the quantile regression error as well:

L_{s c o r e} = \frac{1}{n} \sum_{i = 1}^{n} L_{i, τ}

(22)

The value of this score depends on

τ

. When

τ

is equal to 0.5,

L_{s c o r e}

is half the MAE of the LightGBM-WQR model, because both branches of Equation (17) then reduce to half the absolute error. The interval estimation ability of the model needs to be evaluated from two aspects. On the one hand, the interval should cover the true values well. On the other hand, the interval width should be as small as possible. The more true values that are covered and the narrower the interval, the better the interval estimation ability. Therefore, the true value coverage

r

and average interval width

W

could be defined to characterize the effect of the interval estimation. The equations are as follows:

r = \frac{n_{1}}{n} \times 100 %

(23)

W = \frac{\sum_{i = 1}^{n} (y_{i, 0.95} - y_{i, 0.05})}{n}

(24)

where

n_{1}

is the number of true values covered by the interval;

y_{i, 0.95}

and

y_{i, 0.05}

are the estimated SOH values when

τ

is equal to 0.95 and 0.05, respectively. The possible interval of the true value of SOH is estimated by assigning a value to the quantile parameter

τ

. When

τ

is set at 0.05, the resulting value represents the lower bound of the interval; when it is set at 0.95, the resulting value represents the upper bound of the interval. The 90% confidence interval of SOH is defined as the range between its upper and lower limits, which represents a possible range of estimation results rather than a precise value, thus indicating the level of uncertainty involved.
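The metrics of this subsection can be computed with a short, dependency-free sketch such as the one below. The function names and the toy SOH values are hypothetical and serve only to illustrate the formulas; R-square is computed in its standard form, one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import math


def point_metrics(y, y_hat):
    """MAE, RMSE, MAPE (%), and R-square for point estimates."""
    n = len(y)
    err = [a - b for a, b in zip(y, y_hat)]
    mean_y = sum(y) / n
    ss_res = sum(e * e for e in err)
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return {
        "MAE": sum(abs(e) for e in err) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(err, y)) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


def interval_metrics(y, lower, upper):
    """True value coverage r (%) and average interval width W."""
    n = len(y)
    covered = sum(1 for a, lo, hi in zip(y, lower, upper) if lo <= a <= hi)
    width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return 100.0 * covered / n, width


# Toy SOH values (fractions of nominal capacity), for illustration only.
y_true = [1.00, 0.90, 0.80, 0.70]
y_pred = [0.98, 0.91, 0.80, 0.72]
y_low = [0.95, 0.85, 0.81, 0.65]
y_up = [1.02, 0.95, 0.85, 0.75]
metrics = point_metrics(y_true, y_pred)
coverage, width = interval_metrics(y_true, y_low, y_up)
```

In this toy example three of the four true values fall inside their intervals, so the coverage is 75%.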

4.2. SOH Estimation for the NASA Randomized Battery Usage Dataset

The NASA dataset divides cells into seven groups under different random walk currents and temperatures. To validate the performance of the proposed method on different load profiles, RW16, RW20, RW24, and RW28 are chosen as the testing set and the other 24 cells are chosen as the training set. The acquisition process for the NASA dataset has been described in detail in Section 2.

4.2.1. Effectiveness Analysis of the LightGBM Framework

To verify the effectiveness of the LightGBM framework, experiments are also performed using CNN, SVR, and XGBoost approaches under the same input and output conditions. In previous studies, these machine learning methods have been effective in SOH estimation. The point estimate of the LightGBM-WQR model at

τ

= 0.5 is also used for this comparison. The SOH estimation results are shown in Figure 4. It should be noted that Figure 4a has a large gap between the 97th day and the 143rd day because RW16 was operated under the random walk (RW) condition and was not fully charged during this period; this gap is inherent to the NASA dataset. Therefore, the SOH between days 97 and 143 cannot be estimated. The model evaluation is shown in Figure 5 and Figure 6.

As shown in Figure 4, the SOH of the cells significantly decreased over time. All of the methods in the experiments reflect this downward trend. Nevertheless, there are some discrepancies among the five models. From the SOH curves of Figure 4a, the models have a large error in estimating the starting point, especially CNN and XGBoost. In addition, the estimated SOH curves agree less closely with the measured values for RW24 than for RW16 and RW20. A similar situation also occurs in RW28, which is related to its operation at 40 °C. The error between the estimated and measured values is well represented in Figure 5. The MAE of the LightGBM model for these four batteries is between 0.57% and 1.66%, and the MAE of the LightGBM-WQR model is between 0.82% and 1.57%, which is generally smaller than that of the other machine learning approaches. The RMSE also varies with the cells and models. Overall, the RMSE of the two models under the LightGBM framework is smaller than that of the other methods. The MAPE follows a trend similar to the previous two indicators. This performance benefits from the processing of features using the GOSS, EFB, and histogram algorithms under the LightGBM framework. The evaluation of model capability from the R-square perspective is shown in Figure 6. Blocks of different colors represent the estimation ability of each estimation model for different batteries, and darker blocks represent a better estimation. All approaches have an R-square value above 95.7% on RW16 and RW20. However, for RW24, LightGBM and LightGBM-WQR outperform SVR and XGBoost and are on par with CNN. This partly reflects the poor estimation ability of SVR and XGBoost under the combination of higher temperature and lower current. The detailed errors are shown in Table 4. Meanwhile, the results also show that LightGBM and LightGBM-WQR adapt well to batteries that have been discharged at different rates.
In other words, the results indicate that the two models under the LightGBM framework have a high accuracy for SOH estimation.

4.2.2. Effectiveness Analysis of LightGBM-WQR

Compared with LightGBM, LightGBM-WQR provides more information on the SOH estimates, including the distribution of SOH estimation. The predicted results are shown in Figure 7.

The interval estimation of SOH is mainly attributed to quantile regression, and the selection of weights also plays an important role. The interval estimation effects are evaluated by using the defined true value coverage

r

and average interval width

W

. The closer that

r

and

W

are to 100% and 0, respectively, the more the SOH interval estimation is valid and accurate. The calculations of

r

and

W

have been given in Equations (23) and (24). First, we calculate

r

and

W

for RW16, RW20, RW24, and RW28 and then find the averages of

r

and

W

, respectively. The average of

r

and the average of

W

together form the evaluation of the LightGBM-WQR model for the SOH interval estimation. The uncertainty evaluation of the model on each battery is shown in Table 5. From Figure 7 and Table 5, the average over these four cells of

r

is 83.13%. Furthermore, the average value of

W

is 7.68%. Although a few values are not covered, they lie very close to the interval boundaries. This shows that the interval describes the aging trend of the battery well. The average value of

W

quantifies the uncertainty between each estimated SOH and its true value on the NASA dataset.

4.3. Estimation of Oxford Battery Degradation Dataset

Compared with the NASA dataset, the Oxford dataset applies the same load profile to each cell. Eight lithium-ion cells with different SOHs are numbered from Cell1 to Cell8, and the experiment is carried out under the same conditions. Cell4 and Cell8 are selected as the testing set, and the other cells are chosen as the training set. The eight cells are continuously cycled with constant-current, constant-voltage charging and constant discharge, and the charging voltage data are used to estimate the SOH. The specific details of this process have been presented in Section 2.

4.3.1. Effectiveness Analysis of the LightGBM Framework

Similarly, the Oxford dataset is used to repeat the above experiment. The SOH estimation results are shown in Figure 8, and the model evaluation is shown in Figure 9 and Figure 10.

Figure 9 shows that the MAE and RMSE of the LightGBM model on the Oxford dataset are between 0.36% and 0.46% and between 0.64% and 0.75%, respectively. The MAE of the LightGBM-WQR model is between 0.24% and 0.27%, and its RMSE is between 0.41% and 0.46%, which are much smaller than those of the other models. In terms of the R-square value, Figure 10 shows that the four models other than CNN all lie between 97.6% and 99.4%, indicating that SVR, XGBoost, and the two models under the LightGBM framework have a high accuracy for SOH estimation on this dataset. Table 6 lists more specific error values. Overall, the estimation results on the Oxford dataset are more accurate than those on the NASA dataset. One reason is the sampling time: the feature sampling interval of the Oxford dataset is 1 s, compared with 30 s for the NASA dataset, so the Oxford dataset contains more details about the aging process of lithium-ion batteries. More importantly, the eight cells in the Oxford dataset are cycled with a stable current at the same temperature, which greatly reduces the influence of current stress on battery aging. It is worth noting that LightGBM-WQR performs slightly better than LightGBM.

4.3.2. Effectiveness Analysis of LightGBM-WQR

In the Oxford dataset, LightGBM-WQR shows better interval estimation ability than in the NASA dataset, as shown in Figure 11.

The 90% interval not only wraps well around the SOH curve of the testing set but is also highly consistent with it in terms of trend. The experimental results on the Oxford dataset were evaluated using the same approach as for the SOH interval estimates on the NASA dataset. The details of the uncertainty evaluation on each battery are shown in Table 7. The average over these two cells of

r

is 83.09%. Most importantly, the average interval width is only 2.06%, which supports the validity of the interval estimation. Compared with the NASA dataset, the model has a better ability to describe the uncertainty of the Oxford dataset.

4.4. Computational Complexity Analysis

The computational complexity of the model is intuitively represented by the training time and test time of each dataset. The training experiment and first testing experiment were conducted on a computer with an AMD R7-4800H 2.90 GHz CPU, NVIDIA GeForce RTX 2060 (6 GB on-board memory) GPU, and 16 GB of RAM (3200 MHz). The second testing experiment was conducted on an NVIDIA Jetson Xavier NX edge computing module. Table 8 summarizes the mean computational times over the 10 runs.

Table 8 demonstrates that the training and test times of the two models under the LightGBM framework are both significantly shorter than those of XGBoost, which is also an ensemble learning method. They are also significantly shorter than those of the CNN method, which has a similar R-square value. The LightGBM-WQR model has a test time similar to that of SVR but is more useful owing to its interval estimation capability.

Similar to the results on the NASA dataset, the training and testing times of the two models under the LightGBM framework are short. Because the SOH estimation process performed in the BMS only includes the test part of this experiment, the training process of the model can be performed offline on a personal computer with stronger computational resources. The testing time in Table 8 is therefore of greater concern. The test time of the LightGBM-WQR method is much lower than that of the other methods except for SVR. LightGBM-WQR thus shows a clear advantage in terms of computational cost.
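A mean computation time over repeated runs, as reported in Table 8, can be measured with a small helper such as the following. This is a generic sketch using only the Python standard library; the function name `mean_runtime_ms` is hypothetical, and the actual measurement procedure used for Table 8 is not described in more detail in the text.

```python
import time


def mean_runtime_ms(fn, runs=10):
    """Call fn() `runs` times and return the mean wall-clock time in ms."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return sum(durations) / len(durations)
```

For a fitted model, the callable would wrap the prediction step on the test inputs, so that only inference time is measured, which is the part that would run on the BMS or edge module.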

5. Conclusions

Accurate SOH estimation is important to ensure the safety and reliability of lithium-ion batteries and to reduce their maintenance costs. In this paper, we have proposed a computationally efficient approach for SOH estimation. First, a LightGBM framework is applied to balance the real-time online estimation accuracy and computing cost. To avoid the complex calculations and incomplete feature coverage caused by feature extraction, this approach takes the complete charging voltage data of lithium-ion batteries as the input of SOH estimation. Utilizing the LightGBM framework greatly reduces the computational complexity of SOH estimation compared with the previous methods. Then, within the LightGBM framework, a LightGBM-WQR model is developed by exploiting quantile regression loss functions to estimate 90% confidence intervals for the SOH, and an evaluation criterion is set up. Specifically, the true value coverage

r

and average interval width

W

are also proposed to evaluate the LightGBM-WQR, and the effectiveness of the SOH interval estimation is evaluated using quantitative calculations. Finally, experiments with public datasets on computers and edge computing modules verify that the SOH interval estimation model is fast and effective. The mean absolute error, root mean square error, and average interval width are constrained within 1.57%, 2.82%, and 7.68%, respectively. The R-square value is also at a high level compared with the other models. In the future, we aim to conduct more in-depth research on battery aging and explore the application of data-driven methods on more indicators that can characterize battery aging, such as the prediction of remaining useful life.

Author Contributions

Conceptualization, H.Q. and Y.F.; methodology, H.Q. and X.F.; software, H.Q. and X.F.; validation, H.Q. and Q.S.; formal analysis, H.Q. and Y.F.; investigation, Y.F.; resources, X.F.; data curation, H.Q. and D.Z.; writing—original draft preparation, H.Q.; writing—review and editing, Y.F.; visualization, H.Q.; supervision, Y.F. and X.F.; project administration, R.W. and Y.F.; funding acquisition, Y.F. and X.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under grant number 52007196.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Richardson, R.R.; Birkl, C.R.; Osborne, M.A.; Howey, D.A. Gaussian Process Regression for In-Situ Capacity Estimation of Lithium-Ion Batteries. IEEE Trans. Ind. Inform. 2017, 12, 127–138.
Tian, J.; Xiong, R.; Shen, W.; Lu, J.; Yang, X.-G. Deep Neural Network Battery Charging Curve Prediction Using 30 Points Collected in 10 Min. Joule 2021, 5, 1521–1534.
Han, X.; Lu, L.; Zheng, Y.; Feng, X.; Li, Z.; Li, J.; Ouyang, M. A Review on the Key Issues of the Lithium Ion Battery Degradation among the Whole Life Cycle. eTransportation 2019, 1, 100005.
Khaleghi, S.; Hosen, M.S.; Karimi, D.; Behi, H.; Beheshti, S.H.; Van Mierlo, J.; Berecibar, M. Developing an Online Data-Driven Approach for Prognostics and Health Management of Lithium-Ion Batteries. Appl. Energy 2022, 308, 118348.
Wei, H.; Zhong, Y.; Fan, L.; Ai, Q.; Zhao, W.; Jing, R.; Zhang, Y. Design and Validation of a Battery Management System for Solar-Assisted Electric Vehicles. J. Power Sources 2021, 513, 230531.
Chen, Z.; Mi, C.C.; Fu, Y.; Xu, J.; Gong, X. Online Battery State of Health Estimation Based on Genetic Algorithm for Electric and Hybrid Vehicle Applications. J. Power Sources 2013, 240, 184–192.
Zhu, M.; Hu, W.; Kar, N.C. The SOH Estimation of LiFePO4 Battery Based on Internal Resistance with Grey Markov Chain. In Proceedings of the 2016 IEEE Transportation Electrification Conference and Expo (ITEC), Dearborn, MI, USA, 27–29 June 2016; IEEE: Dearborn, MI, USA, 2016; pp. 1–6.
Rauf, H.; Khalid, M.; Arshad, N. Machine Learning in State of Health and Remaining Useful Life Estimation: Theoretical and Technological Development in Battery Degradation Modelling. Renew. Sustain. Energy Rev. 2022, 156, 111903.
Goh, H.H.; Lan, Z.; Zhang, D.; Dai, W.; Kurniawan, T.A.; Goh, K.C. Estimation of the State of Health (SOH) of Batteries Using Discrete Curvature Feature Extraction. J. Energy Storage 2022, 50, 104646.
Rahimi-Eichi, H.; Ojha, U.; Baronti, F.; Chow, M.-Y. Battery Management System: An Overview of Its Application in the Smart Grid and Electric Vehicles. IEEE Ind. Electron. Mag. 2013, 7, 4–16.
Kamali, M.A.; Caliwag, A.C.; Lim, W. Novel SOH Estimation of Lithium-Ion Batteries for Real-Time Embedded Applications. IEEE Embed. Syst. Lett. 2021, 13, 206–209.
Bi, J.; Zhang, T.; Yu, H.; Kang, Y. State-of-Health Estimation of Lithium-Ion Battery Packs in Electric Vehicles Based on Genetic Resampling Particle Filter. Appl. Energy 2016, 182, 558–568.
Waag, W.; Fleischer, C.; Sauer, D.U. On-Line Estimation of Lithium-Ion Battery Impedance Parameters Using a Novel Varied-Parameters Approach. J. Power Sources 2013, 237, 260–269.
Haji Akhoundzadeh, M.; Panchal, S.; Samadani, E.; Raahemifar, K.; Fowler, M.; Fraser, R. Investigation and Simulation of Electric Train Utilizing Hydrogen Fuel Cell and Lithium-Ion Battery. Sustain. Energy Technol. Assess. 2021, 46, 101234.
Marcicki, J.; Canova, M.; Conlisk, A.T.; Rizzoni, G. Design and Parametrization Analysis of a Reduced-Order Electrochemical Model of Graphite/LiFePO4 Cells for SOC/SOH Estimation. J. Power Sources 2013, 237, 310–324.
Modeling of Galvanostatic Charge and Discharge of the Lithium/Polymer/Insertion Cell. Available online: https://iopscience.iop.org/article/10.1149/1.2221597 (accessed on 29 November 2022).
Yan, W.; Zhang, B.; Zhao, G.; Tang, S.; Niu, G.; Wang, X. A Battery Management System With a Lebesgue-Sampling-Based Extended Kalman Filter. IEEE Trans. Ind. Electron. 2019, 66, 3227–3236.
Tan, Y.; Zhao, G. Transfer Learning with Long Short-Term Memory Network for State-of-Health Prediction of Lithium-Ion Batteries. IEEE Trans. Ind. Electron. 2020, 67, 8723–8731.
Oji, T.; Zhou, Y.; Ci, S.; Kang, F.; Chen, X.; Liu, X. Data-Driven Methods for Battery SOH Estimation: Survey and a Critical Analysis. IEEE Access 2021, 9, 126903–126916.
Xia, Z.; Qahouq, J.A.A. Lithium-Ion Battery Ageing Behavior Pattern Characterization and State-of-Health Estimation Using Data-Driven Method. IEEE Access 2021, 9, 98287–98304.
Driscoll, L.; de la Torre, S.; Gomez-Ruiz, J.A. Feature-Based Lithium-Ion Battery State of Health Estimation with Artificial Neural Networks. J. Energy Storage 2022, 50, 104584.
Guo, Y.; Huang, K.; Hu, X. A State-of-Health Estimation Method of Lithium-Ion Batteries Based on Multi-Feature Extracted from Constant Current Charging Curve. J. Energy Storage 2021, 36, 102372.
Cai, L.; Lin, J.; Liao, X. An Estimation Model for State of Health of Lithium-Ion Batteries Using Energy-Based Features. J. Energy Storage 2022, 46, 103846.
Li, P.; Zhang, Z.; Xiong, Q.; Ding, B.; Hou, J.; Luo, D.; Rong, Y.; Li, S. State-of-Health Estimation and Remaining Useful Life Prediction for the Lithium-Ion Battery Based on a Variant Long Short Term Memory Neural Network. J. Power Sources 2020, 459, 228069.
Fan, Y.; Xiao, F.; Li, C.; Yang, G.; Tang, X. A Novel Deep Learning Framework for State of Health Estimation of Lithium-Ion Battery. J. Energy Storage 2020, 32, 101741.
Sagi, O.; Rokach, L. Ensemble Learning: A Survey. WIREs Data Min. Knowl. Discov. 2018, 8, e1249.
Yu, J. State of Health Prediction of Lithium-Ion Batteries: Multiscale Logic Regression and Gaussian Process Regression Ensemble. Reliab. Eng. Syst. Saf. 2018, 174, 82–95.
Yang, N.; Song, Z.; Hofmann, H.; Sun, J. Robust State of Health Estimation of Lithium-Ion Batteries Using Convolutional Neural Network and Random Forest. J. Energy Storage 2022, 48, 103857.
Qin, P.; Zhao, L.; Liu, Z. State of Health Prediction for Lithium-Ion Battery Using a Gradient Boosting-Based Data-Driven Method. J. Energy Storage 2022, 47, 103644.
Song, S.; Fei, C.; Xia, H. Lithium-Ion Battery SOH Estimation Based on XGBoost Algorithm with Accuracy Correction. Energies 2020, 13, 812.
Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems 30 (NIPS 2017); HSE University: Moscow, Russia, 2017.
Bole, B.; Kulkarni, C.S.; Daigle, M. Adaptation of an Electrochemistry-Based Li-Ion Battery Model to Account for Deterioration Observed under Randomized Use. Proc. Annu. Conf. PHM Soc. 2014, 6, 4.
Richardson, R.R.; Osborne, M.A.; Howey, D.A. Battery Health Prediction under Generalized Conditions Using a Gaussian Process Transition Model. J. Energy Storage 2019, 23, 320–328.
Birkl, C. Diagnosis and Prognosis of Degradation in Lithium-Ion Batteries; University of Oxford: Oxford, UK, 2017.
Wang, Z.; Zeng, S.; Guo, J.; Qin, T. State of Health Estimation of Lithium-Ion Batteries Based on the Constant Voltage Charging Curve. Energy 2019, 167, 661–669.
Yang, F.; Wang, D.; Xu, F.; Huang, Z.; Tsui, K.-L. Lifespan Prediction of Lithium-Ion Batteries Based on Various Extracted Features and Gradient Boosting Regression Tree Model. J. Power Sources 2020, 476, 228654.
Hu, X.; Xu, L.; Lin, X.; Pecht, M. Battery Lifetime Prognostics. Joule 2020, 4, 310–346.
Zhou, D.; Wang, B. Battery Health Prognosis Using Improved Temporal Convolutional Network Modeling. J. Energy Storage 2022, 51, 104480.
Koenker, R.; Bassett, G. Regression Quantiles. Econometrica 1978, 46, 33.
Harrell, F.E. A New Distribution-Free Quantile Estimator. Biometrika 1982, 69, 635–640.
Zhang, S.; Wang, Y.; Zhang, Y.; Wang, D.; Zhang, N. Load Probability Density Forecasting by Transforming and Combining Quantile Forecasts. Appl. Energy 2020, 277, 115600.

Figure 1. The charge voltage curves over time. (a) RW20 in the NASA dataset. (b) Cell8 in the Oxford dataset.

Figure 2. The variants of these two datasets. (a) RW20 in the NASA dataset. (b) Cell8 in the Oxford dataset.

Figure 3. The procedure of the LightGBM-WQR model for estimating the SOH of lithium-ion batteries.

Figure 4. Estimation accuracy of the testing set for the NASA dataset. (a) RW16, (b) RW20, (c) RW24, and (d) RW28.

Figure 5. MAE, RMSE, and MAPE of the testing set for the NASA dataset. (a) RW16, (b) RW20, (c) RW24, and (d) RW28.

Figure 6. R-square of the testing set for the NASA dataset.

Figure 7. The 90% interval estimation of the testing set for the NASA dataset. (a) RW16, (b) RW20, (c) RW24, and (d) RW28.

Figure 8. Estimation accuracy of the testing set for the Oxford dataset. (a) Cell4; (b) Cell8.

Figure 9. MAE, RMSE, and MAPE of the testing set for the Oxford dataset. (a) Cell4; (b) Cell8.

Figure 10. R-square of the testing set for the Oxford dataset.

Figure 11. The 90% interval estimation of the testing set for the Oxford dataset. (a) Cell4; (b) Cell8.

Table 1. Working conditions of the 28 lithium-ion batteries.

Group	Cells	Temperature	Randomized Current	Frequency of Performing Reference Charge and Discharge
Group 1	RW1, RW2, RW7, RW8	25 °C	Between 0.5 A and 4 A	Every 50 RW cycles
Group 2	RW3, RW4, RW5, RW6	25 °C	Between 0.5 A and 4 A	Every 50 RW cycles
Group 3	RW9, RW10, RW11, RW12	25 °C	Between −4.5 A and 4.5 A	Every 1500 RW cycles
Group 4	RW13, RW14, RW15, RW16	25 °C	Between 0.5 A and 5 A, probability bias to low current	Every 50 RW cycles
Group 5	RW17, RW18, RW19, RW20	25 °C	Between 0.5 A and 5 A, probability bias to high current	Every 50 RW cycles
Group 6	RW21, RW22, RW23, RW24	40 °C	Between 0.5 A and 5 A, probability bias to low current	Every 50 RW cycles
Group 7	RW25, RW26, RW27, RW28	40 °C	Between 0.5 A and 5 A, probability bias to high current	Every 50 RW cycles

Table 2. The main hyperparameters of LightGBM.

Hyperparameters	Interpretation
learning_rate	Controls the step size with which each new weak learner fits the residuals.
n_estimators	Sets the number of weak learners.
max_depth	Specifies the maximum depth of each tree.
num_leaves	Specifies the maximum number of leaves per tree.
min_child_samples	Specifies the minimum number of samples in a leaf node.
min_child_weight	Specifies the minimum sum of sample weights in a leaf node.
feature_fraction	The fraction of features randomly sampled when building each weak learner.
bagging_fraction	The fraction of samples randomly drawn, without replacement, to train each weak learner.
lambda_l1	The L1 regularization term $\gamma$.
lambda_l2	The L2 regularization term $\lambda$.
min_split_gain	The minimum gain required to split a node.

Table 3. Hyperparameters for LightGBM-WQR.

Hyperparameters	Values
learning_rate	0.1
n_estimators	200
max_depth	5
num_leaves	15
min_child_samples	18
min_child_weight	0.001
feature_fraction	1
bagging_fraction	1
lambda_l1	0
lambda_l2	0
min_split_gain	0
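
As a rough illustration of how the values in Table 3 might be passed to LightGBM, the following sketch builds one quantile regressor per quantile level with the scikit-learn style `LGBMRegressor` interface. The quantile levels (0.05, 0.5, 0.95) and the helper name `build_quantile_models` are assumptions made for this example; the weighting step of WQR is not reproduced here, and this is not the authors' code.

```python
# Hyperparameter values copied from Table 3.
LGBM_PARAMS = {
    "learning_rate": 0.1,
    "n_estimators": 200,
    "max_depth": 5,
    "num_leaves": 15,
    "min_child_samples": 18,
    "min_child_weight": 0.001,
    "feature_fraction": 1.0,
    "bagging_fraction": 1.0,
    "lambda_l1": 0.0,
    "lambda_l2": 0.0,
    "min_split_gain": 0.0,
}

# Assumed quantile levels: 0.05 and 0.95 bound a 90% interval, 0.5 is the median.
QUANTILES = (0.05, 0.5, 0.95)


def build_quantile_models(quantiles=QUANTILES, params=LGBM_PARAMS):
    """Return one untrained LGBMRegressor per quantile level (hypothetical helper)."""
    # Imported lazily so the configuration can be inspected without LightGBM installed.
    from lightgbm import LGBMRegressor

    return {
        q: LGBMRegressor(objective="quantile", alpha=q, **params)
        for q in quantiles
    }
```

With `max_depth` set to 5, a tree can hold at most 2^5 = 32 leaves, so `num_leaves = 15` is the tighter of the two limits on tree size.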

Table 4. SOH estimation errors for the batteries in the NASA dataset.

Battery	Algorithms	MAE	RMSE	R²	MAPE
RW16	CNN	0.0250	0.0304	0.9703	0.0400
	SVR	0.0180	0.0219	0.9845	0.0281
	XGBoost	0.0181	0.0215	0.9852	0.0285
	LightGBM	0.0097	0.0171	0.9812	0.0201
	LightGBM-WQR	0.0082	0.0136	0.9882	0.0180
RW20	CNN	0.0175	0.0211	0.9793	0.0284
	SVR	0.0108	0.0161	0.9880	0.0175
	XGBoost	0.0109	0.0160	0.9881	0.0177
	LightGBM	0.0119	0.0215	0.9572	0.0219
	LightGBM-WQR	0.0119	0.0213	0.9580	0.0221
RW24	CNN	0.0200	0.0284	0.8599	0.0257
	SVR	0.0276	0.0403	0.7170	0.0298
	XGBoost	0.0285	0.0418	0.6957	0.0307
	LightGBM	0.0166	0.0294	0.8492	0.0248
	LightGBM-WQR	0.0157	0.0286	0.8573	0.0128
RW28	CNN	0.0161	0.0191	0.8656	0.0181
	SVR	0.0118	0.0152	0.9147	0.0128
	XGBoost	0.0118	0.0152	0.9147	0.0127
	LightGBM	0.0057	0.0092	0.9381	0.0080
	LightGBM-WQR	0.0084	0.0138	0.8604	0.0092
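
The four error columns in Table 4 can be reproduced for any pair of true and estimated SOH sequences with a few lines of code. The sketch below assumes the standard textbook definitions of MAE, RMSE, R² and MAPE, with SOH expressed as a fraction; the function name and the sample values are made up for illustration and are not taken from either dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2 and MAPE for paired SOH values given as fractions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "MAPE": mape}


# Made-up SOH values, for illustration only.
soh_true = [1.00, 0.95, 0.90, 0.85, 0.80]
soh_pred = [0.99, 0.96, 0.90, 0.84, 0.82]
metrics = regression_metrics(soh_true, soh_pred)
print(metrics)
```

Because SOH is at most 1, dividing each absolute error by the true value makes the MAPE slightly larger than the MAE under these definitions.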

Table 5. Uncertainty evaluation on the NASA dataset.

Battery	True Value Coverage r	Average Interval Width W
RW16	88.88%	9.48%
RW20	80.00%	8.09%
RW24	81.82%	6.83%
RW28	81.82%	6.32%
Average	83.13%	7.68%
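
The two columns of Table 5 can be computed from interval bounds as sketched below. The sketch assumes that the coverage r is the share of true SOH values lying inside the estimated interval and that W is the mean difference between the upper and lower bounds; the function name and sample values are invented for illustration. The last lines check that the "Average" row is the plain mean of the four per-cell rows.

```python
def interval_metrics(y_true, lower, upper):
    """Return (coverage, mean_width) for prediction intervals [lower, upper]."""
    n = len(y_true)
    covered = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return covered / n, mean_width


# Made-up values, for illustration only.
soh_true = [1.00, 0.95, 0.90, 0.85, 0.80]
lower = [0.97, 0.93, 0.91, 0.82, 0.77]
upper = [1.02, 0.98, 0.95, 0.88, 0.83]
coverage, width = interval_metrics(soh_true, lower, upper)

# Per-cell values from Table 5, in percent.
table5_coverage = [88.88, 80.00, 81.82, 81.82]
table5_width = [9.48, 8.09, 6.83, 6.32]
avg_coverage = sum(table5_coverage) / len(table5_coverage)
avg_width = sum(table5_width) / len(table5_width)
print(f"{avg_coverage:.2f}% {avg_width:.2f}%")
```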

Table 6. SOH estimation errors for the batteries in the Oxford dataset.

Battery	Algorithms	MAE	RMSE	R²	MAPE
Cell4	CNN	0.0356	0.0377	0.6666	0.0431
	SVR	0.0074	0.0099	0.9768	0.0088
	XGBoost	0.0066	0.0093	0.9795	0.0077
	LightGBM	0.0036	0.0064	0.9810	0.0058
	LightGBM-WQR	0.0027	0.0046	0.9902	0.0041
Cell8	CNN	0.0085	0.0126	0.9697	0.0108
	SVR	0.0047	0.0065	0.9920	0.0059
	XGBoost	0.0045	0.0063	0.9923	0.0056
	LightGBM	0.0046	0.0075	0.9783	0.0061
	LightGBM-WQR	0.0024	0.0041	0.9936	0.0043

Table 7. Uncertainty evaluation on the Oxford dataset.

Battery	True Value Coverage r	Average Interval Width W
Cell4	82.98%	1.93%
Cell8	83.20%	2.19%
Average	83.09%	2.06%

Table 8. The mean computational times over the 10 runs on the NASA and Oxford datasets.

Dataset	Time	CNN	SVR	XGBoost	LightGBM	LightGBM-WQR
NASA	Training time (s)	354.862	5.0070	61.965	0.0614	12.138
	First testing time (s)	0.395	0.0006	0.0014	0.0005	0.0004
	Second testing time (s)	0.384	0.0096	0.0083	0.0032	0.0090
Oxford	Training time (s)	57.813	4.9990	157.948	0.0480	40.6970
	First testing time (s)	0.705	0.0076	0.2870	0.0048	0.0008
	Second testing time (s)	0.809	0.0145	0.1100	0.0040	0.0145
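
Mean times over ten runs, as reported in Table 8, can be collected with a simple wall-clock loop. The sketch below is one possible way to do this and is not the authors' measurement procedure; `mean_runtime` and `dummy_predict` are hypothetical names, and the dummy workload stands in for a call such as a trained model's prediction on the testing set.

```python
import time


def mean_runtime(fn, runs=10):
    """Return the mean wall-clock time, in seconds, of calling fn() `runs` times."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


def dummy_predict():
    # Stand-in workload; replace with the prediction call being timed.
    return sum(i * i for i in range(10_000))


mean_seconds = mean_runtime(dummy_predict, runs=10)
print(f"mean over 10 runs: {mean_seconds * 1e3:.3f} ms")
```

Timings of this kind depend on the hardware and on background load, so values measured on another machine will not match the table.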

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

