1. Introduction
Lithium-ion batteries are used as energy storage devices in a variety of applications, including portable electronics, grid storage, electric vehicles, and marine energy systems [1,2]. The market for Li-ion batteries is expected to register a compound annual growth rate (CAGR) of approximately 22% during the forecast period (2019–2024) [3]. The increasing demand is also enabled by the continuous decline in battery prices, with the volume-weighted average battery pack cost falling by 85% from 2010 to 2018, reaching an average of $176/kWh [4]. Batteries are critical subsystems that provide primary and/or standby energy within products and systems, and their performance degrades during their lifetime due to various degradation mechanisms resulting in capacity and power fade [5,6].
Cycle testing of Li-ion batteries is conducted to qualify a battery population according to the capacity and power requirements for its targeted application. However, testing under normal operating conditions can be prohibitively time-consuming. For example, 500 cycles at discharge and charge C-rates of 0.5C can take approximately 95 days (more than three months) to conclude. If the testing is conducted to reach a predefined failure threshold or end of life (i.e., a drop to 80% capacity in many cases), then the testing time can exceed even four months because state-of-the-art commercial Li-ion batteries can maintain more than 80% of capacity for more than 500 cycles. A survey [7] collating feedback from professionals from a broad spectrum of industrial sectors, including battery cell producers and battery pack and component developers, revealed time-to-market and reliability as their primary concerns. Due to the costs and associated delays in accessing the market, the battery industry is increasingly interested in utilizing accelerated testing procedures. Accelerated test methods are also desirable for qualification of second-life Li-ion batteries where the supply is expected to surpass 200 gigawatt-hours per year by 2030 and applications require less-frequent battery cycling [8].
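The duration quoted for 500 cycles at 0.5C can be checked with rough arithmetic. The sketch below assumes that each constant-current half-cycle lasts about 1/C-rate hours and adds illustrative allowances for the constant-voltage step and for rest periods; the default values of `cv_hours` and `rest_hours` are assumptions chosen for illustration, not values reported for that example.

```python
def cycle_test_days(n_cycles, charge_c_rate, discharge_c_rate,
                    cv_hours=0.25, rest_hours=1 / 6):
    """Rough wall-clock estimate (in days) for full-depth cycling.

    cv_hours and rest_hours are illustrative assumptions: time spent in the
    constant-voltage step and in each of the two rest periods per cycle.
    """
    cc_charge_hours = 1.0 / charge_c_rate      # e.g. 0.5C -> about 2 h
    discharge_hours = 1.0 / discharge_c_rate   # e.g. 0.5C -> about 2 h
    hours_per_cycle = (cc_charge_hours + cv_hours
                       + discharge_hours + 2 * rest_hours)
    return n_cycles * hours_per_cycle / 24.0


days_with_overheads = cycle_test_days(500, 0.5, 0.5)
days_current_only = cycle_test_days(500, 0.5, 0.5, cv_hours=0.0, rest_hours=0.0)
print(f"{days_current_only:.0f} to {days_with_overheads:.0f} days")
```

Even with no overheads at all, 500 cycles at 0.5C occupy a test channel for well over two months, which is the motivation for accelerated testing.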
Accelerated testing is conducted by elevating certain stress factors to increase the degradation rate and precipitate failure earlier than under normal operation. The cycling operation of batteries under a qualification process is usually characterized by five primary stress factors: ambient temperature, discharge C-rate, charge C-rate, constant voltage charge cut-off C-rate, and depth of discharge (DOD) [9,10,11]. Some or all of these stress factors can be selected to accelerate the battery degradation process. The effects of these stress factors on battery degradation are not uniform, however, and some factors can have a disproportionate impact on degradation. Additionally, multifactorial and interdependent stress factors may work in combination (interaction effects) to accelerate certain failure modes. It is not usually feasible to design a test matrix with many factors due to limited testing resources and the time needed to conduct such tests. Therefore, to achieve an optimal time reduction with the fewest stress factors, a key challenge is to identify the most impactful stress factors and utilize them for accelerated testing and modeling.
Within the literature, there is a substantial body of experimental studies that have investigated the effects of individual stress factors on Li-ion battery degradation during cycling operation. However, there is limited research on battery cycle life degradation under the influence of multiple (three or more) stress factors simultaneously. It is important to understand the individual and coupled effects of these stressors on battery degradation. The following studies have provided some initial analysis on the influence of multiple and interacting stressors: Diao et al. [12] conducted a full factorial design of experiment (DOE) to study the effects of temperature, discharge C-rate, and charge current cut-off on Li-ion batteries and found that only temperature influenced the capacity fade rate. Wang et al. [13] studied the cycle life of Li-ion batteries under three stress factors: DOD, temperature, and discharge rate. They found that the capacity loss was strongly affected by time and temperature, whereas the effect of DOD was less important at a 0.5C discharge rate. Schimpe et al. [14] studied the cycle aging of commercial Li-ion batteries under temperature and charge and discharge C-rate stress factors, with and without a constant voltage charging step. They showed that current rates had no effect on capacity loss at high temperature (55 °C). However, these studies lacked analysis to separate the individual and interaction effects of stress factors and did not consider all five stress factors that characterize the cycling operation of batteries.
Cui et al. [15] conducted orthogonal experiments to determine the key stress factor for capacity loss in commercial Li-ion batteries cycled at shallow (20% to 30%) DOD. The stress factors in their study included temperature, discharge C-rate, end-of-charge voltage, and DOD. They identified temperature as the most impactful stress factor for battery capacity loss, followed by discharge rate and DOD. The orthogonal design did not include the end-of-charge voltage stress factor, the effects of which were investigated separately. They did not investigate the interaction effects of stress factors.
Prochazka et al. [16] used a D-optimization method to reduce the full factorial design of four factors, including current, temperature, state of charge (SOC), and the change of SOC with five levels each ($4^5 = 1024$ tests) to 46 tests. They then implemented a linear regression model with individual, quadratic, and interaction effects of stress factors to conclude that three out of four factors were significant for the batteries they tested. These two studies [15,16] chose four stress factors, which alone cannot constitute the full cycle profile of batteries, leading to an incomplete ranking of individual and/or interaction effects of all the stress factors on battery capacity fade.
Su et al. [17] used seven principal factors in a reduced orthogonal DOE (18 tests) with three different levels for each factor. The factors included the charge/discharge currents during the constant current regime, the charge/discharge cut-off voltages and the corresponding durations during the constant voltage regime, and the ambient temperature. They then conducted analysis of variance (ANOVA) to identify significant stress factors and their ranking. However, the stress factors they chose are not practical for defining the conventional cycling profile used for battery qualification, which consists of a constant current/constant voltage (CCCV) charge (limited by charge cut-off current and not time) and a constant current discharge. They also chose more than two levels of stress factors for screening purposes, which unnecessarily increased the number of tests for a constant number of desired effects, and they did not consider the interaction effects of the stress factors. They recommended the determination of interaction effects of stress factors on battery degradation for future research.
This paper presents a study that addresses the limitations of the current state of the art and provides guidance for accelerated battery test planning. The study combines design of experiment for data generation with two machine learning techniques, least absolute shrinkage and selection operator (LASSO) and random forests (RF), for stress factor effects analysis and ranking. The data generation work included the implementation of a multi-stress half-fractional DOE consisting of two levels of stress factors to reduce the number of tests needed to train the models. To the best of our knowledge, this is the first multifactorial experimental study that investigates the significance and ranking of individual (main) and two-way interaction effects of all five stress factors (temperature, charge current, discharge current, constant voltage charge cut-off current, and DOD) required to constitute a typical cycling operation used in battery qualification testing.
The remainder of the paper is structured as follows:
Section 2 describes the half-fractional factorial DOE and experimental procedure.
Section 3 discusses the machine learning techniques used in this paper.
Section 4 presents the two case studies and discusses the analysis of the effects of stress factors.
Section 5 presents the conclusions.
2. Design of Experiment
A DOE was used to conduct the cycle testing of batteries with five different stress factors: ambient temperature, discharge C-rate, charge C-rate, constant voltage charge cut-off current, and DOD. Two levels for each selected stress factor were chosen to cover the entire range of the stress factor. The purpose of this study was to screen the stress factors and rank them in terms of their importance for battery capacity fade acceleration; therefore, two levels were sufficient for each stress factor. The DOE presented in this section was not intended for the accelerated testing of batteries, but rather served as a preliminary step to determine impactful stress factors that could be used for planning the accelerated testing of batteries. The DOE for the testing was as follows:
where $k$ is the number of stress factors, $I$ is the constant discharge current, $t$ is the discharge time, $Q_N$ is the battery capacity after $N$ cycles, and $T$ is the ambient temperature.
A full factorial DOE is the most comprehensive as it considers all the individual and higher-order interaction effects. A full factorial design ($2^k$) for five stress factors will result in $2^5 = 32$ test cases. Third- or higher-order interaction effects of stress factors are usually insignificant in most practical cases and thus can be confounded with individual and two-way interaction effects. A half-fractional factorial design ($2^{k-1}$, i.e., 16 test cases for five factors) reduces the number of test cases while preserving the individual and two-way interaction effects of stress factors, which can later be extracted using statistical analysis and machine learning. In a half-fractional design, one of the stress factors is represented as a logical multiplication of the remaining four or a lower number of stress factors. Table 1 shows the process of representing DOD ($X_5$) as a function of the remaining four stress factors in the proposed design. The design in Table 1 is a resolution V design, where the individual effects are aliased with at least fourth-order effects and the second-order (two-way) interaction effects are confounded with third-order effects. To understand the repeatability of results, three samples per test case were considered.
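The construction of a half-fractional design can be sketched in a few lines. The example below assumes the usual coded units (-1 for the low level, +1 for the high level) and generates the fifth factor as the product of the other four, which is the generator that yields a resolution V design for five factors; the column names `X1` to `X5` are illustrative and do not imply the factor order used in Table 1.

```python
from itertools import combinations, product


def half_fraction_design(k=5):
    """Build a 2^(k-1) design in coded units (-1 = low, +1 = high).

    The last factor is generated as the product of the first k-1 factors,
    i.e. X5 = X1*X2*X3*X4 for k = 5.
    """
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


def effect_columns(runs):
    """Return main-effect and two-way interaction columns keyed by name."""
    k = len(runs[0])
    cols = {f"X{j + 1}": [r[j] for r in runs] for j in range(k)}
    for a, b in combinations(range(k), 2):
        cols[f"X{a + 1}*X{b + 1}"] = [r[a] * r[b] for r in runs]
    return cols


runs = half_fraction_design(5)
cols = effect_columns(runs)

# If no main effect or two-way interaction is aliased with another one,
# every pair of these columns is orthogonal (dot product of zero).
orthogonal = all(
    sum(u * v for u, v in zip(cols[p], cols[q])) == 0
    for p, q in combinations(cols, 2)
)
print(len(runs), len(cols), orthogonal)  # 16 15 True
```

The 16 runs support exactly 15 mutually orthogonal effect columns (5 main effects and 10 two-way interactions), which is why the main and two-way effects remain separately estimable.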
The ranges of stress factors in the DOE were selected with a focus on portable electronic applications and electrode-electrolyte interfacial side-reaction degradation mechanisms (such as solid electrolyte interphase [SEI] layer growth). Battery capacity loss due to low-temperature-based transport limitations and lithium plating was not considered as part of this study. Factor ranges were chosen carefully to prevent/minimize abusive or unsafe conditions such as thermal decomposition of the SEI layer or electrolyte. Based on the literature, these mechanisms do not occur until 100 °C [18]. The battery surface temperature in the testing never reached those levels.
The first factor in the DOE was ambient temperature with a range of 25 °C to 55 °C. In the range, 25 °C represents a typical room-temperature condition and 55 °C (which is lower than the battery datasheet specification of 60 °C) was considered as the maximum (accelerated level) limit. This was based on battery manufacturers’ suggestions to prevent any abusive degradation mechanisms, considering that battery temperature can further rise due to the charge-discharge operations. The second stress factor was the battery discharge C-rate. The range 0C to 0.5C represents a typical normal range of continuous discharge rate for portable electronic applications such as smartphones and laptops. Discharge rates up to 2C are also expected, but only for short durations; hence, a discharge C-rate range of 0.5C to 2C was considered. Many battery manufacturers usually prescribe 2C as the maximum continuous discharge current.
The third stress factor was charge cut-off current during the constant voltage charging step. Reducing the charge cut-off current increases the constant voltage charging time but ensures that the open circuit voltage of the battery reaches closer to the end-of-charge voltage. Although time-consuming, C/100 (0.01C) is the lowest practical charge cut-off current that is used sometimes in battery testing. C/20 (0.05C), C/10 (0.1C), and C/5 (0.2C) are other examples of charge cut-off current levels that are typically used to charge a battery almost fully. Thus, two levels of C/5 and C/100 were used for charge cut-off. The charge cut-off 0.01C represents a high level (in Table 1) because it corresponds to “more” charging of the battery. Charge C-rate was the fourth stress factor in the design space. Charge C-rates close to 0.8C are used in portable electronic devices and have been prescribed in battery datasheets. Thus, 0C to 0.8C can be considered as the normal operating range. The charge C-rate stress factor at increased levels can accelerate unsafe mechanisms, such as lithium plating and dendrite growth. Thus, the maximum limit of charge C-rate was fixed to 1.2C based on suggestions from the battery manufacturers. DOD, the fifth factor, represents the amount of battery capacity being utilized in each cycle. DOD at 100% represents the full cycle condition and the maximum limit. Considering that the entire 0–100% range represents the normal operation and the factor effects are independent of the factor range (if there are no new mechanisms), 50% was selected as a lower limit of DOD in the testing.
All the test cases described in Table 1 involved continuous cycling of a Li-ion battery under the specified stress factors to characterize the capacity fade trend. The cycling profile included charge/discharge operations. The charging operation was conducted using a CCCV standard charge algorithm. The battery was charged using the prescribed constant charge C-rate up to the end-of-charge voltage (4.4 V) followed by the “top-up” using constant voltage charging until the charging current dropped below the prescribed charge cut-off. The discharge operation was performed using the prescribed constant discharge C-rate until the prescribed DOD was achieved. DOD of 100% was defined as the discharge to 3 V at 0.5C discharge current rate. For high discharge current tests (2C/1.3C), therefore, the batteries were further discharged to 3 V at 0.5C after hitting the 3 V threshold at the prescribed high level (2C/1.3C) of discharge C-rate. Rest times after discharge and after charge were not considered as stress factors in this study and were kept fixed at 10 min for the testing.
The cell characterization testing was conducted at the beginning of cycling testing (Table 1) to set up the baseline and intermediately between the cycling tests for comparison with baseline characteristics. The characterization testing included the measurements of “true” battery discharge capacity. This true value of discharge capacity was defined at a standard condition for comparison across different tests in Table 1. Discharge capacity measurement tests were conducted by cycling the battery at 25 °C ambient temperature using a standard full charge/discharge cycle. The battery was charged at 0.8C constant current up to the end-of-charge voltage (4.4 V) followed by the constant voltage charging until the charging current dropped below C/20. Following the 10 min of rest after the charging, the battery was discharged at C/2 constant current up to the end-of-discharge voltage (3 V). After discharge, a 10-min rest period was provided before charging the battery for the next cycle. For the initial capacity measurements of fresh cells, five cycles were conducted and the average value of the discharge capacities was used as the initial capacity. For the capacity measurements at prescribed intervals and at the end of the cycling testing, two cycles were conducted and the discharge capacity from the second cycle was used as the real capacity measurement. The intermediate characterization of discharge capacity was conducted at fixed intervals of 100 cycles. These discharge capacities measured at every 100 cycles were used for adjusting the discharge time required to achieve 50% DOD at a prescribed discharge C-rate in Table 1. The cycle testing based on the DOE in Table 1 was continued until 300 cycles (approximately 2 months) or a 20% drop in discharge capacity measurements, whichever occurred earlier.
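The adjustment of discharge time for partial-DOD tests can be illustrated with a short calculation. The sketch below assumes that the discharge current is set from the nominal (datasheet) capacity while the charge to be removed is a fraction of the most recently measured capacity; that convention, the function name, and the capacity values are assumptions for illustration only.

```python
def discharge_time_minutes(measured_capacity_ah, nominal_capacity_ah,
                           discharge_c_rate, dod_fraction):
    """Time to remove dod_fraction of the measured capacity at constant current.

    Assumption: the C-rate current is referenced to nominal capacity, and the
    DOD target is referenced to the latest measured discharge capacity.
    """
    current_a = discharge_c_rate * nominal_capacity_ah
    charge_to_remove_ah = dod_fraction * measured_capacity_ah
    return 60.0 * charge_to_remove_ah / current_a


# Hypothetical 3.0 Ah cell cycled at 2C and 50% DOD
fresh_minutes = discharge_time_minutes(3.0, 3.0, 2.0, 0.5)
aged_minutes = discharge_time_minutes(2.85, 3.0, 2.0, 0.5)
print(fresh_minutes, aged_minutes)
```

As the measured capacity falls, the discharge time for the same nominal DOD shortens, which is why the capacity is re-measured every 100 cycles.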
3. Algorithms for Stress Factor Ranking
Two machine learning algorithms, LASSO regression and RF, were considered for evaluating the effects of the stress factors on battery capacity fade and for ranking them in terms of their impact based on the experimental data in this study. It is important to note that these algorithms have been used in this study only for stress factor (effects) ranking before accelerated testing and not for building predictive capacity fade and life models [19,20]. The algorithms, therefore, were not applied to other (unseen) test datasets for prediction purposes; instead, the algorithms used all of the experimental data generated as part of this study.
LASSO is specifically designed to model linear relationships, with the added benefit of pruning the model weights to limit overfitting. The pruning function, also referred to as regularization, also provides a ranking of the predictor variables (features) [21]. While linear regression can lead to overfitting in small datasets, LASSO is more suitable due to its regularization property. RF, on the other hand, is a non-parametric, non-linear ensemble of decision trees [22]. Given its decision tree ensemble construction, RF is able to measure the relative importance of predictor variables by looking at how much the tree nodes that split on a particular predictor variable reduce the loss function, typically measured as impurity [22].
The task of identifying the highest contributing predictor variables for prediction is termed “predictor variable (feature) selection” and is generally applied in datasets where the number of predictor variables exceeds the order of hundreds. The primary goal of predictor variable selection is a reduction in computational time and memory. However, the present work made use of predictor variable selection for stress factor ranking in order of importance to capacity fade. Traditionally, supervised predictor variable selection is grouped into three broad categories: filter, wrapper, and embedded methods [23,24,25]. The focus here was on embedded methods, in particular LASSO and RF, due to (1) superior performance over filter methods, (2) better interpretability when compared to wrapper methods, and (3) use of the predictor performance as the objective function to evaluate the variable subset. The problem can be specified mathematically as follows:

$$Y = f(X) + \varepsilon \qquad (1)$$

where $Y$ is the response variable; $X$ is the matrix of predictor variables; and $\varepsilon$ is the random error term, which is independent of $X$ and has zero mean. The purpose of the algorithms is to learn the unknown function $f$ and understand how $Y$ is affected by $X$. In this study, predictor variables (features) refer to individual stress factors and their two-way interactions.
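As a concrete picture of these predictor variables, the sketch below turns one test case into a row of 15 features: five coded main effects and ten two-way products. The factor keys (`T`, `Id`, `Icut`, `Ic`, `DOD`) are hypothetical names, and the level table is an illustrative reading of the ranges described in Section 2; in particular, the low charge C-rate of 0.8C is an assumption.

```python
from itertools import combinations

# (low, high) level per factor; illustrative values, see the caveats above
LEVELS = {
    "T": (25.0, 55.0),      # ambient temperature, deg C
    "Id": (0.5, 2.0),       # discharge C-rate
    "Icut": (0.2, 0.01),    # CV cut-off C-rate; C/100 is the "high" level
    "Ic": (0.8, 1.2),       # charge C-rate (low level assumed)
    "DOD": (50.0, 100.0),   # depth of discharge, %
}


def code_level(name, value):
    """Map a physical level to coded units: -1 for low, +1 for high."""
    low, high = LEVELS[name]
    if value == low:
        return -1
    if value == high:
        return 1
    raise ValueError(f"{value} is not a design level of {name}")


def feature_row(test_case):
    """Return coded main effects plus all two-way interaction products."""
    coded = {name: code_level(name, test_case[name]) for name in LEVELS}
    row = dict(coded)
    for a, b in combinations(LEVELS, 2):
        row[f"{a}:{b}"] = coded[a] * coded[b]
    return row


row = feature_row({"T": 55.0, "Id": 0.5, "Icut": 0.01, "Ic": 0.8, "DOD": 100.0})
print(len(row), row["T"], row["Id"], row["T:Id"])  # 15 1 -1 -1
```

Coding every factor to the same -1/+1 scale keeps the effect sizes of different stress factors directly comparable.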
The average capacity fade rate was used as the response variable $Y$ in Equation (1). It was calculated by dividing the total drop in capacity (Ah) by the cumulative discharge (Ah), which is defined as the total amount of charge (Ah) delivered by the battery during the discharge operations of the entire testing. Cumulative discharge was selected in the denominator of Equation (3) in place of number of cycles because the term “cycle” is not well defined for the tests in Table 1. For example, a cycle with 50% DOD cannot be directly compared with a cycle with 100% DOD. Cumulative discharge also provides a good sense of testing (discharge) time for a given C-rate.
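The response variable described above amounts to $(Q_0 - Q_N) / \sum I\,t$, with $Q_0$ the initial capacity and the sum taken over all discharge operations. A minimal sketch follows; the function name and the numerical values are hypothetical and are not results from the study.

```python
def average_fade_rate(initial_capacity_ah, final_capacity_ah, discharge_steps):
    """Capacity drop (Ah) per Ah of cumulative discharge.

    discharge_steps: iterable of (current_a, duration_h) pairs, one per
    constant-current discharge operation in the test.
    """
    cumulative_discharge_ah = sum(i * t for i, t in discharge_steps)
    if cumulative_discharge_ah <= 0:
        raise ValueError("cumulative discharge must be positive")
    return (initial_capacity_ah - final_capacity_ah) / cumulative_discharge_ah


# Hypothetical cell: 3.0 Ah -> 2.7 Ah after 300 discharges of 1.5 A for 1 h
steps = [(1.5, 1.0)] * 300
rate = average_fade_rate(3.0, 2.7, steps)
print(f"{rate:.2e} Ah lost per Ah discharged")
```

Normalizing by throughput rather than by cycle count lets a 50% DOD test and a 100% DOD test be compared on the same footing.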
The estimation of $f$ requires training data generated by the cycle testing of batteries as per the DOE described in Table 1. There are two types of machine learning approaches: parametric and non-parametric. In this paper, both of these approaches have been demonstrated. The LASSO regression falls into the parametric category, where an assumption about the functional form of $f$ is made in advance and then training data are used to estimate the coefficients. Non-parametric approaches like RF do not make an explicit assumption about the functional form of $f$.
A linear model for five stress factors can be described by Equation (4):

$$Y = \beta_0 + \sum_{j=1}^{5} \beta_j X_j + \sum_{j=1}^{4} \sum_{k=j+1}^{5} \beta_{jk} X_j X_k + \varepsilon \qquad (4)$$

where $\beta_0, \beta_1, \ldots, \beta_5, \beta_{12}, \ldots, \beta_{45}$ are the regression coefficients (effect sizes) that need to be estimated. The most conventional algorithm to estimate these coefficients is least squares, which tries to minimize the residual sum of squares (RSS), defined as follows:

$$\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - \hat{f}(x_i) \right)^2 = \sum_{i=1}^{n} \Big( y_i - \hat{\beta}_0 - \sum_{j=1}^{5} \hat{\beta}_j x_{ij} - \sum_{j=1}^{4} \sum_{k=j+1}^{5} \hat{\beta}_{jk} x_{ij} x_{ik} \Big)^2 \qquad (5)$$

where $\hat{f}$ is an estimate of function $f$; $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_5, \hat{\beta}_{12}, \ldots, \hat{\beta}_{45}$ are estimates of $\beta_0, \beta_1, \ldots, \beta_5, \beta_{12}, \ldots, \beta_{45}$, respectively; $y_i$ and $x_{ij}$ are the response and the level of the $j$th stress factor for the $i$th observation; and $n$ is the number of training observations. LASSO regression, a regularization technique developed for linear regression problems, has been widely used in the literature as a predictor variable selection algorithm due to its simplicity and ease of interpretation [21,26]. The key assumption is that the best possible prediction rule is sparse; that is, only a few of the coefficients are different from zero. The magnitude of a nonzero coefficient reflects the variable’s importance: the larger the magnitude, the more important the variable is for the prediction. In LASSO regression, sparsity is achieved by adding a penalty term to the loss, based on a convex relaxation via the $\ell_1$ norm and weighted by a regularization term $\lambda$. The loss function formula thus becomes:

$$L = \mathrm{RSS} + \lambda \Big( \sum_{j=1}^{5} |\beta_j| + \sum_{j=1}^{4} \sum_{k=j+1}^{5} |\beta_{jk}| \Big) \qquad (6)$$

Equation (6) represents the summation of RSS (Equation (5)) and an $\ell_1$ norm penalty term. $\lambda$ represents the regularization term. The larger the regularization term, the more the weights of the model shrink toward zero, thus revealing the important predictor variables. The value of $\lambda$ was chosen based on the 10-fold cross-validation approach. In this approach, the training data are randomly divided into 10 parts or folds, and the first fold is treated as the validation set while the remaining nine folds are used for training purposes. This process is repeated 10 times so that each of the 10 folds serves once as the validation set, and the mean squared error is calculated for each validation set. The final mean squared error (MSE) is calculated as follows:

$$\mathrm{MSE} = \frac{1}{10} \sum_{v=1}^{10} \mathrm{MSE}_v \qquad (7)$$

where $\mathrm{MSE}_v$ is the mean squared error on the $v$th validation fold.
A value of $\lambda$ that minimizes the MSE can be chosen. However, to allow a stricter penalty on the coefficients, this study used a slightly larger value of $\lambda$: the value at which the MSE was one standard deviation above the minimum MSE. A non-parametric bootstrap method [27] was further used to calculate the probability of each coefficient in the regression model being zero, using 1000 bootstrap samples. In the bootstrap method, a training sample of size $n$ (the same as that of the actual training data) was randomly sampled with replacement. For this sample, the 10-fold cross-validation was conducted to select the $\lambda$ corresponding to the MSE at one standard deviation and calculate the corresponding estimates of the coefficients. This process was repeated 1000 times, and then the probability of zero was calculated for each coefficient as the fraction of repetitions in which its estimate was zero.
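The LASSO and bootstrap steps can be sketched with the standard library only. The code below minimizes the loss of Equation (6) (RSS plus $\lambda$ times the $\ell_1$ norm, with an unpenalized intercept) by cyclic coordinate descent, picks $\lambda$ with a one-standard-deviation rule from given cross-validation summaries, and estimates the probability of each coefficient being zero. It is a simplified illustration, not the study's implementation: $\lambda$ is held fixed across bootstrap samples instead of being re-selected by 10-fold cross-validation each time, and the toy data, function names, and noise level are assumptions.

```python
import random
from itertools import product


def soft_threshold(value, threshold):
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(X, y, lam, n_sweeps=100):
    """Minimise RSS + lam * sum(|beta_j|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col)
            if z == 0.0:
                continue
            # Correlation of column j with the residual that excludes its own fit
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid))
            new_beta = soft_threshold(rho, lam / 2.0) / z
            delta = new_beta - beta[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            beta[j] = new_beta
        shift = sum(resid) / n  # unpenalised intercept update
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta


def one_sd_lambda(lambdas, mean_mse, sd_mse):
    """Largest lambda whose CV error is within one SD of the minimum."""
    best = min(range(len(lambdas)), key=lambda i: mean_mse[i])
    limit = mean_mse[best] + sd_mse[best]
    return max(l for l, m in zip(lambdas, mean_mse) if m <= limit)


def zero_probability(X, y, lam, n_boot=200, seed=0):
    """Fraction of bootstrap fits in which each coefficient is exactly zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    zero_counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        _, beta = lasso_fit([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            zero_counts[j] += (b == 0.0)
    return [c / n_boot for c in zero_counts]


# Toy data: only the first of three coded factors affects the response
noise = random.Random(1)
X = [list(run) for run in product((-1, 1), repeat=3)] * 2
y = [1.0 + 0.5 * row[0] + noise.gauss(0.0, 0.02) for row in X]

lam = one_sd_lambda([0.1, 1.0, 3.0, 10.0], [1.2, 0.9, 0.95, 1.5], [0.1] * 4)
intercept, beta = lasso_fit(X, y, lam)
probs = zero_probability(X, y, lam)
print(lam, [round(b, 3) for b in beta], probs)
```

A coefficient with a probability of zero near 1 corresponds to an effect that the penalty removes in almost every resample, whereas an influential effect keeps a nonzero estimate throughout.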
The non-parametric machine learning method, RF, was also implemented for variable ranking as a comparison to the bootstrapped LASSO method. Randomization-based ensemble techniques such as RF have only recently been used as a predictor variable selection method [28]. In ensemble learning such as RF, a collection of single regression models (decision trees) is trained, and the output of the ensemble regression is obtained by aggregating or averaging the outputs of the single models. In other words, the generated RF is a collection of constructed decision trees, each sequentially conducting binary splits of the data during training in order to produce homogeneous subsets [22].
RF can handle interactions between variables, distinguishing relevant from irrelevant variables even when the number of variables is much larger than the number of samples [22,29]. The RF algorithm requires substantial hyperparameter tuning; therefore, instead of the common grid search, a random search approach was adopted. Exponential distributions were considered for each of the following hyperparameters: maximum number of predictor variables considered for splitting the node, maximum tree depth, minimum number of samples placed in a node before splitting, and minimum number of samples in a leaf node. Random search was adopted because, empirically and theoretically, randomly chosen trials are more efficient for hyperparameter optimization than trials on a grid, no matter the distribution from which they are chosen [30].
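A random search over these four hyperparameters might look like the sketch below. The dictionary keys mirror the argument names of scikit-learn's `RandomForestRegressor` purely for familiarity (the software used in the study is not stated), and the distribution means, clipping bounds, and the stand-in `score_fn` are all assumptions; in practice `score_fn` would return a cross-validated error for a forest trained with the sampled settings.

```python
import random


def sample_hyperparameters(n_predictors, rng):
    """Draw one hyperparameter set from clipped exponential distributions."""
    def draw(mean, low, high):
        # Exponential draw with the given mean, rounded and clipped to [low, high]
        value = int(round(rng.expovariate(1.0 / mean)))
        return max(low, min(high, value))

    return {
        "max_features": draw(5, 1, n_predictors),
        "max_depth": draw(5, 1, 30),
        "min_samples_split": draw(4, 2, 20),
        "min_samples_leaf": draw(2, 1, 10),
    }


def random_search(score_fn, n_predictors, n_trials=50, seed=0):
    """Return (best_score, best_params); lower scores are better."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = sample_hyperparameters(n_predictors, rng)
        score = score_fn(params)
        if best is None or score < best[0]:
            best = (score, params)
    return best


# Stand-in objective for demonstration only: prefers a tree depth near 3
best_score, best_params = random_search(
    lambda p: abs(p["max_depth"] - 3), n_predictors=15)
print(best_score, best_params)
```

Each trial draws all four values independently, so a fixed budget of trials explores more distinct values per hyperparameter than a grid of the same size would.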
Due to the randomization effect inherent to RF, the masking effect that plagues the decision tree algorithm (by which one predictor variable might never occur in any split because it leads to splits that are slightly worse) is greatly decreased, especially when the forest comprises decision trees on the order of hundreds or even thousands. The number of decision trees is therefore an important parameter that in some cases requires tuning. For instance, if computational power is not a concern, the larger the number of decision trees, the better the prediction. In the present work, the number of trees was restricted to 70 to prevent overfitting. Variable importance was calculated as the decrease in node impurity weighted by the probability of reaching that node, where the probability was simply the number of samples that reach the node, divided by the total number of samples. In mathematical terms, the importance of predictor variable $X_j$ over all trees $\varphi_1, \ldots, \varphi_m$ in the forest can be calculated by:

$$\mathrm{Imp}(X_j) = \frac{1}{m} \sum_{l=1}^{m} \sum_{t \in \varphi_l} 1(j_t = j)\, p(t)\, \Delta i(s_t, t), \qquad j = 1, \ldots, p \qquad (8)$$

where $p(t)\,\Delta i(s_t, t)$ is the weighted impurity decrease at node $t$ for split $s_t$, typically measured via mean decrease accuracy, mean increase error, or variance for regression; $j_t$ is the predictor variable used to split node $t$; $p$ is the number of predictor variables; $t$ is the node number; $m$ is the number of decision trees in the forest; and $p(t)$ is the proportion of samples reaching node $t$ over the total number of samples [31,32]. The present work adopts mean decrease accuracy for predictor variable selection.
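The weighting of impurity decrease by node probability can be made concrete with a small example. The sketch below uses variance as the impurity measure for regression, which is an assumption made for illustration (the study itself adopts mean decrease accuracy), and the tree summaries passed to `forest_importance` are hypothetical.

```python
def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def weighted_impurity_decrease(parent, left, right, n_total):
    """p(t) * delta_i(s_t, t) for one split, with variance as the impurity."""
    n_t = len(parent)
    decrease = (variance(parent)
                - len(left) / n_t * variance(left)
                - len(right) / n_t * variance(right))
    return (n_t / n_total) * decrease


def forest_importance(trees, n_predictors):
    """Average the per-node contributions of each predictor over all trees.

    trees: list of trees, each a list of (predictor_index, weighted_decrease)
    pairs, one pair per internal node.
    """
    totals = [0.0] * n_predictors
    for tree in trees:
        for j, decrease in tree:
            totals[j] += decrease
    return [total / len(trees) for total in totals]


# Root node holding all four samples, split perfectly into two pure children
root_gain = weighted_impurity_decrease([1, 1, 3, 3], [1, 1], [3, 3], n_total=4)

# Two hypothetical trees: predictor 0 is used in both, predictor 1 in one
importance = forest_importance([[(0, 1.0)], [(0, 0.5), (1, 0.25)]], 2)
print(root_gain, importance)  # 1.0 [0.75, 0.125]
```

A split made deep in a tree reaches few samples, so its contribution is scaled down by $p(t)$ even if it separates those samples well.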
5. Conclusions
This study determines and ranks the individual and two-way interaction effects of five stress factors on battery capacity fade to identify the most effective stress factors for accelerated battery test planning. This work also demonstrates an application of machine learning in ranking the battery stress factors and their interactions. Using a half-fractional factorial design with 16 tests, the capacity fade of 96 Li-ion batteries was studied under five stress factors (ambient temperature, discharge C-rate, charge C-rate, DOD, and constant voltage charge cut-off current) that constitute a typical cycling profile for batteries in qualification testing. The half-fractional factorial design was sufficient to preserve the individual (main) and two-way interaction effects of stress factors.
Machine learning methods, LASSO and random forest, were implemented for ranking the effects of stress factors. Temperature, discharge C-rate, and constant voltage charge cut-off current stress factors, in the form of either their individual effects or interactions, were always among the top four ranked effects for the capacity fade of batteries from the two manufacturers. Individual effects of charge C-rate and DOD were found to be least significant for the battery capacity fade. The study shows that charge cut-off current during the constant voltage charging phase, a relevant parameter during battery charging that has been underrepresented in the battery literature, was more dominant for battery degradation than the charge current used during the constant current charging phase.
The stress factor rank list obtained from this study can be used for planning accelerated Li-ion battery qualification testing with only top-ranked factors. However, the ranking should be read in context with the ranges of stress factors chosen in the experimental design, as harsher levels of an insignificant factor such as charge C-rate may have the potential to trigger additional degradation mechanisms such as lithium plating. The selection of only two levels of each stress factor in the design of experiments, although sufficient for ranking purposes, limits the determination of non-linear effects of stress factors. While the implementation of RF in this study has been limited to factor ranking, it can also be used for data-driven modeling to predict battery capacity fade in future work.