# A New Neural Network Approach to Short Term Load Forecasting of Electrical Power Systems


## Abstract


## 1. Introduction

## 2. The Proposed STLF Strategy

**Figure 1.** Structure of the proposed STLF strategy, including the preprocessor and hybrid forecast engine.

The set of candidate inputs of the proposed STLF strategy is constructed from lagged loads and exogenous variables:

$$S(t) = \{L(t-1), \ldots, L(t-N_L), EX_1(t), EX_1(t-1), \ldots, EX_1(t-N_1), \ldots, EX_P(t), EX_P(t-1), \ldots, EX_P(t-N_P)\} \quad (1)$$

In (1), $L(t-1), \ldots, L(t-N_L)$ are the historical values of load, since electrical load is dependent on its past values. The output of the STLF strategy is the load forecast of the next time interval, denoted by L(t) in Figure 1. The time interval depends on the STLF forecast step; for instance, for an hourly load forecast, t is measured in hours. Electrical load also depends on exogenous variables (such as temperature and humidity), in addition to its past values. These exogenous variables are denoted $EX_1$ to $EX_P$ in (1). Since the inputs of (1) have different ranges (such as load and temperature), we linearly normalize all inputs and the output to lie within the range [0,1] to avoid the masking effect. Linear normalization is a simple and well-known mathematical transformation. Suppose that an input x (such as load, temperature, or humidity) lies in the range $[x_{min}, x_{max}]$. Its linear normalization to [0,1] is as follows:

$$x_n = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (2)$$

where $x_n$ indicates the normalized value of x in the range [0,1]. Thus, each input is separately normalized based on its own minimum and maximum values. The normalized variable $x_n$ can easily be returned to the original range $[x_{min}, x_{max}]$ by means of the inverse transform:

$$x = x_n (x_{max} - x_{min}) + x_{min} \quad (3)$$
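The transforms (2) and (3) can be sketched in a few lines. This is a minimal illustration, not the authors' code; as the text notes, each input is normalized with its own minimum and maximum.

```python
def normalize(x, x_min, x_max):
    """Linearly map x from [x_min, x_max] to [0, 1], as in (2)."""
    return (x - x_min) / (x_max - x_min)

def denormalize(x_n, x_min, x_max):
    """Inverse transform (3): map x_n in [0, 1] back to [x_min, x_max]."""
    return x_n * (x_max - x_min) + x_min
```

For example, a temperature of 25 in a historical range of [10, 30] normalizes to 0.75, and the inverse transform recovers 25 exactly.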

Both forecast values $EX_i(t)$ and past values $EX_i(t-1), \ldots, EX_i(t-N_i)$ of each exogenous variable (such as the temperature forecast and past temperatures) are considered as input data in (1). Choosing these exogenous variables depends on engineering judgment and the availability of data. For instance, while residential customers are usually highly sensitive to weather conditions (such as temperature), industrial loads are not so sensitive to weather parameters [13]. A discussion of this matter can be found in [2]. In (1), $N_L$ and $N_1$ to $N_P$ indicate the order of back shift for the load L and the P exogenous variables $EX_1$ to $EX_P$, respectively. From a data mining viewpoint, these orders should be high enough that no useful information is missed. In [4], considering the short-run trend and the daily and weekly periodicity of hourly load time series, at least $N_L = N_1 = \ldots = N_P = 200$ has been proposed. However, this results in too large a set of inputs S(t) in (1), which cannot be directly applied to a forecast engine. Moreover, this large set may include ineffective inputs, which complicate the construction of the input/output mapping function of the STLF (i.e., the mapping S(t) → L(t)) for the forecast engine and degrade its performance. Thus, the set of inputs S(t) should be refined by a feature selection technique such that a minimum subset of the most informative inputs is selected and the unimportant features are filtered out. For this purpose, the two-stage feature selection technique proposed in our previous work [16] is used here. This technique is based on the information-theoretic criterion of mutual information and can evaluate both the relevancy of each input to the output and the redundant information among inputs. By means of this technique, the preprocessor selects a subset of the most relevant and non-redundant inputs from S(t). Details of the feature selection technique can be found in [16]. The selected inputs are given to the proposed hybrid forecast engine (Figure 1).

Suppose that the optimization problem has ND decision variables $x_1, \ldots, x_{ND}$. To solve this optimization problem, the harmony memory (HM) of HS is a matrix as follows:

$$HM = \begin{bmatrix} x_1^1 & x_2^1 & \cdots & x_{ND}^1 \\ x_1^2 & x_2^2 & \cdots & x_{ND}^2 \\ \vdots & \vdots & & \vdots \\ x_1^{HMS} & x_2^{HMS} & \cdots & x_{ND}^{HMS} \end{bmatrix} \quad (4)$$

where each row, called a harmony vector (HV), contains the ND decision variables $x_1, \ldots, x_{ND}$, and HMS indicates the number of harmony vectors in the HS population. In (4), the superscript of each HV represents its number, from 1 to HMS; in other words, the rows of the HM matrix are the individuals, or harmony vectors, of HS. To initialize the HM, the decision variables of each of its HVs are randomly initialized within their allowable limits. Then, the value of the objective function, denoted OF(.), is computed for each HV, and an improvisation counter IC is set to zero. For training of the MLP neural network (the forecast engine), the decision variables $x_1, \ldots, x_{ND}$ are the weights of the MLP, and the objective function OF(.) is the error function of the training phase of the MLP neural network that should be minimized. This error function will be introduced later.
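The initialization step described above can be sketched as follows. The function name, the use of a single scalar bound pair for all variables, and the sphere objective in the usage example are illustrative assumptions.

```python
import numpy as np

def initialize_hm(of, nd, hms, lo, hi, rng):
    """Step 1 of HS: randomly initialize HMS harmony vectors within
    [lo, hi] and evaluate the objective function OF(.) for each row."""
    hm = rng.uniform(lo, hi, size=(hms, nd))        # rows are harmony vectors
    of_values = np.array([of(hv) for hv in hm])     # OF(.) of each HV
    return hm, of_values
```

For instance, with a sphere objective, five harmony vectors of three variables each are drawn inside [-1, 1] and evaluated.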

**Case 1. Memory consideration without pitch adjustment:** If Rand1 < HMCR & Rand2 > PAR, then randomly select ${x}_{i}^{\mathit{\text{new}}}$ among the stored values of the ith decision variable in the HM, i.e., {${x}_{i}^{1}$,${x}_{i}^{2}$,…,${x}_{i}^{\mathit{\text{HMS}}}$}.

For each new decision variable ${x}_{i}^{\mathit{\text{new}}}$, these two random numbers are separately generated. The ith column of HM contains the previously stored values of the ith decision variable, and in this case (Rand1 < HMCR & Rand2 > PAR), ${x}_{i}^{\mathit{\text{new}}}$ is selected among them. As described in step 1, HMCR and PAR are two user-defined parameters of HS in the interval [0,1]. Thus, the probability of this case is HMCR × (1 − PAR).

**Case 2. Memory consideration with pitch adjustment:** If Rand1 < HMCR & Rand2 < PAR, then randomly select ${x}_{i}^{\mathit{\text{new}}}$ among {${x}_{i}^{1}$,${x}_{i}^{2}$,…,${x}_{i}^{\mathit{\text{HMS}}}$}. Moreover, pitch adjustment is also executed in this case as follows:

$${x}_{i}^{\mathit{\text{new}}} = {x}_{i}^{\mathit{\text{new}}} \pm Rand3 \times BW \quad (5)$$

where Rand3 is a random number in [0,1], BW is the bandwidth of the pitch adjustment, and the sign is chosen at random. The probability of this case is HMCR × PAR.

**Case 3. Randomization:** If Rand1 > HMCR, then randomly generate ${x}_{i}^{\mathit{\text{new}}}$ within its allowable range. In this case, ${x}_{i}^{\mathit{\text{new}}}$ is selected from its entire feasible range, not limited to those stored in the HM. The probability of this case is (1 − HMCR). After generating all decision variables ${x}_{i}^{\mathit{\text{new}}}$ (1 ≤ i ≤ ND) based on the above three cases, the new harmony vector is produced.
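The three cases above can be sketched component-wise as follows. This is a hedged illustration; the ± form of the pitch adjustment and the clipping back to the allowable range are assumptions consistent with standard HS [17,19], not a transcription of the authors' code.

```python
import numpy as np

def improvise(hm, lo, hi, hmcr, par, bw, rng):
    """Build HV_new component-wise from the three improvisation cases."""
    hms, nd = hm.shape
    hv_new = np.empty(nd)
    for i in range(nd):
        rand1, rand2 = rng.random(), rng.random()
        if rand1 < hmcr:
            # Cases 1-2: memory consideration - pick from the ith column of HM
            hv_new[i] = hm[rng.integers(hms), i]
            if rand2 < par:
                # Case 2 only: pitch adjustment within bandwidth BW
                hv_new[i] = np.clip(hv_new[i] + (2 * rng.random() - 1) * bw,
                                    lo, hi)
        else:
            # Case 3: randomization over the entire feasible range
            hv_new[i] = rng.uniform(lo, hi)
    return hv_new
```

With HMCR = 1 and PAR = 0, every component of the new vector comes straight from the corresponding HM column, matching Case 1.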

If the produced harmony vector ${\mathit{HV}}^{\mathit{\text{new}}}$ has a lower OF(.) value than the worst HV of the HM, ${\mathit{HV}}^{\mathit{\text{new}}}$ is included in the HM and the worst HV is excluded from it. Otherwise, the produced ${\mathit{HV}}^{\mathit{\text{new}}}$ is disregarded and the HM does not change.

The mutation operation of (6) generates a mutated harmony vector, denoted ${\mathit{HV}}_{m}^{\mathit{\text{new}}}$. In step 3 of the proposed MHS, this mutation operation is executed in addition to the improvisation of HS. In other words, both ${\mathit{HV}}^{\mathit{\text{new}}}$ and ${\mathit{HV}}_{m}^{\mathit{\text{new}}}$ are produced, by the improvisation and the mutation operation, respectively, in step 3 of the MHS. Then, step 4 of the MHS is performed sequentially for the two newly generated harmony vectors ${\mathit{HV}}^{\mathit{\text{new}}}$ and ${\mathit{HV}}_{m}^{\mathit{\text{new}}}$ to update the HM. That is, ${\mathit{HV}}^{\mathit{\text{new}}}$ is first compared with the worst HV of the HM, as described in step 4 of the HS algorithm. If the worst HV is replaced by ${\mathit{HV}}^{\mathit{\text{new}}}$, the new worst HV of the HM is found. Similarly, ${\mathit{HV}}_{m}^{\mathit{\text{new}}}$ is then compared with the current worst HV and replaces it provided that ${\mathit{HV}}_{m}^{\mathit{\text{new}}}$ has a lower OF(.) value.
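The sequential worst-replacement described above can be sketched as follows. The two candidate vectors are assumed to have already been produced by the improvisation and by the mutation of (6), which is not reproduced here; the function name is an illustrative assumption.

```python
import numpy as np

def mhs_update(hm, of_values, hv_new, hv_m_new, of):
    """Step 4 of the MHS: try to replace the worst HV of the HM, first
    with HV_new (from improvisation) and then with HV_m_new (from the
    mutation operation); a replacement happens only on a lower OF value."""
    for hv in (hv_new, hv_m_new):
        worst = int(np.argmax(of_values))   # index of the current worst HV
        of_hv = of(hv)
        if of_hv < of_values[worst]:
            hm[worst] = hv
            of_values[worst] = of_hv
```

Because only the worst row is ever replaced, the best HV of the memory can never get worse across iterations.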

To train the MLP neural network by the proposed MHS, the decision variables $x_1, \ldots, x_{ND}$ of (4) are considered as the weights of the MLP. For instance, if the MLP has 20 neurons in the input layer (corresponding to 20 inputs selected by the preprocessor), 10 neurons in the hidden layer, and one neuron in the output layer, it has 20 × 10 + 10 × 1 = 210 weights. These 210 weights are considered as the $x_1, \ldots, x_{ND}$ of the MHS (ND = 210). We should also determine the OF(.) of the MHS, i.e., the error function of the MLP neural network. To train an MLP neural network, the error function can be selected as the training error or the validation error. Here, the validation error is selected as the error function of the MLP, since it better evaluates the generalization performance of the NN (generalization is a measure of how well the NN performs on the actual problem once training is complete) [22,23].
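Decoding the flat decision vector into the 20-10-1 MLP of the example and evaluating a validation error as OF(.) might look like the sketch below. The tanh activation and the mean-squared-error form of the validation error are assumptions (the text only specifies that validation error is used); the absence of bias terms follows the 210-weight count above.

```python
import numpy as np

def mlp_forecast(x, flat_w, n_in=20, n_hid=10):
    """Decode a flat MHS decision vector into the weights of a 20-10-1
    MLP (20*10 + 10*1 = 210 weights, no biases) and run a forward pass
    on the rows of x."""
    w1 = flat_w[:n_in * n_hid].reshape(n_in, n_hid)   # input -> hidden
    w2 = flat_w[n_in * n_hid:].reshape(n_hid, 1)      # hidden -> output
    h = np.tanh(x @ w1)                               # hidden activations
    return (h @ w2).ravel()

def validation_error(flat_w, x_val, y_val):
    """OF(.) of the MHS: error of the MLP on a held-out validation set."""
    return float(np.mean((mlp_forecast(x_val, flat_w) - y_val) ** 2))
```

Each harmony vector of length ND = 210 is thus scored by running the network it encodes over the validation samples.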

After termination of the MHS, the decision variables $x_1, \ldots, x_{ND}$ of its final solution (the best HV of the last iteration) are taken as the weights of the NN-based forecast engine.

## 3. Numerical Results

The forecast error is evaluated by the mean absolute percentage error (MAPE) criterion:

$$MAPE = \frac{100}{NH} \sum_{t=1}^{NH} \frac{|L_{act}(t) - L_{for}(t)|}{L_{act}(t)} \quad (7)$$

where $L_{act}(t)$ and $L_{for}(t)$ represent the actual and forecast values of load in hour t, respectively, and NH indicates the number of hours in the forecast horizon; here, NH = 24 for day-ahead STLF. Four test weeks corresponding to the four seasons of 2009 (the third weeks of February, May, August, and November), indicated in the first column of Table 1, are considered for this numerical experiment, so as to represent the whole year. The MAPE value for each test week, shown in Table 1, is the average of the seven MAPE values of its forecast days; the average result over the four test weeks is shown in the last row of Table 1. For the sake of a fair comparison, all forecast methods of Table 1 use the same training period, comprising the 50 days prior to each forecast day, as well as the same set of inputs S(t) and the same preprocessor (Figure 1). Thus, each forecast method is fed the same selected inputs, since the purpose of this experiment is to compare the efficiency of the different forecast engines.
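The MAPE criterion can be transcribed directly (a trivial sketch; the function name is illustrative):

```python
import numpy as np

def mape(l_act, l_for):
    """Mean absolute percentage error, in percent, over the NH-hour
    forecast horizon (hourly actual vs. forecast loads)."""
    l_act = np.asarray(l_act, dtype=float)
    l_for = np.asarray(l_for, dtype=float)
    return float(100.0 * np.mean(np.abs(l_act - l_for) / l_act))
```

For example, forecasting 110 and 190 MW against actual loads of 100 and 200 MW gives errors of 10% and 5%, so a MAPE of 7.5%.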

**Table 1.** Comparison of the proposed hybrid forecast engine with five other prediction methods for day-ahead STLF of the PJM test case in the four test weeks of year 2009 (the reported results are in terms of the MAPE criterion).

| Test Week | Multivariate ARMA (%) | RBF (%) | MLP + BR (%) | MLP + BFGS (%) | MLP + LM (%) | Proposed (%) |
|---|---|---|---|---|---|---|
| Feb | 4.10 | 2.38 | 2.55 | 2.76 | 2.17 | 1.44 |
| May | 4.45 | 3.06 | 2.63 | 2.85 | 2.44 | 1.91 |
| Aug | 3.09 | 2.01 | 2.61 | 2.59 | 1.86 | 1.10 |
| Nov | 3.01 | 2.24 | 2.28 | 2.63 | 1.75 | 1.13 |
| Average | 3.66 | 2.42 | 2.52 | 2.71 | 2.05 | 1.39 |

**Table 2.** Comparison of the proposed MHS with five other stochastic search techniques for day-ahead STLF of the PJM test case in the four test weeks of year 2009 (the reported results are in terms of the MAPE criterion).

| Test Week | SA (%) | GA (%) | PSO (%) | DE (%) | HS (%) | MHS (%) |
|---|---|---|---|---|---|---|
| Feb | 3.98 | 2.68 | 2.39 | 2.11 | 1.87 | 1.44 |
| May | 4.86 | 3.46 | 2.68 | 2.35 | 2.21 | 1.91 |
| Aug | 2.69 | 2.38 | 2.37 | 2.21 | 1.51 | 1.10 |
| Nov | 3.17 | 3.04 | 2.69 | 2.33 | 1.53 | 1.13 |
| Average | 3.68 | 2.89 | 2.53 | 2.25 | 1.78 | 1.39 |

**Table 3.** Comparison of the STLF results of the proposed strategy with the STLF results of the PJM ISO for the four test weeks of year 2009 (the reported results are in terms of the MAPE criterion).

| Test Week | PJM ISO (%) | Proposed Strategy (%) |
|---|---|---|
| Feb | 4.37 | 1.44 |
| May | 6.49 | 1.91 |
| Aug | 3.12 | 1.10 |
| Nov | 3.50 | 1.13 |
| Average | 4.37 | 1.39 |

**Figure 2.** Curves of real values, forecast values, and forecast errors for the first test week of Table 1 (the forecast results are those of the proposed strategy).

**Figure 3.** Curves of real values, forecast values, and forecast errors for the second test week of Table 1 (the forecast results are those of the proposed strategy).

**Figure 4.** Curves of real values, forecast values, and forecast errors for the third test week of Table 1 (the forecast results are those of the proposed strategy).

**Figure 5.** Curves of real values, forecast values, and forecast errors for the fourth test week of Table 1 (the forecast results are those of the proposed strategy).

**Table 4.** Comparison of the STLF results of the proposed strategy with the STLF results of the PJM ISO for the 12 months of year 2009 (the reported results are in terms of the MAPE criterion).

| Test Month | PJM ISO (%) | Proposed Strategy (%) |
|---|---|---|
| Jan | 5.25 | 1.67 |
| Feb | 3.73 | 1.18 |
| March | 4.67 | 1.43 |
| April | 4.27 | 1.32 |
| May | 4.95 | 1.62 |
| June | 5.44 | 1.73 |
| July | 4.92 | 1.53 |
| Aug | 4.29 | 1.35 |
| Sept | 4.72 | 1.54 |
| Oct | 4.36 | 1.33 |
| Nov | 4.28 | 1.32 |
| Dec | 4.71 | 1.53 |
| Whole year | 4.64 | 1.46 |

Besides MAPE, the mean absolute error (MAE) criterion is used for this test case:

$$MAE = \frac{1}{NH} \sum_{t=1}^{NH} |L_{act}(t) - L_{for}(t)| \quad (8)$$

where $L_{act}(t)$, $L_{for}(t)$, and NH are as defined for (7). The MAE and MAPE results of the New York ISO [27], a support vector machine (SVM) [3], a hybrid network (composed of a Self-Organizing Map (SOM) for data clustering and groups of 24 SVMs for next-day load forecast) [3], and a wavelet transform combined with a neuro-evolutionary algorithm [22] for this test case are also reported in Table 5 for comparison. The results of these benchmark methods have been quoted from their respective references. The same test period (January 2004 and July 2004) and the same error criteria (MAE and MAPE) of these references are adopted for the proposed STLF strategy. The reported MAE and MAPE values for each test month in Table 5 are the averages of the corresponding daily MAE and MAPE values, respectively. Observe that the proposed strategy attains the lowest MAE and the lowest MAPE among all forecast methods of Table 5 in both test months, which illustrates its effectiveness. Its results for the two test months are graphically shown in Figure 6 and Figure 7, respectively. As seen from these figures, the forecast curve is so close to the real curve that, for most of the time, the two cannot be distinguished, and the error curve has small values. This numerical experiment further validates the efficiency of the proposed hybrid forecast engine.
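The MAE criterion used in Table 5 can be transcribed directly (again a trivial sketch with an illustrative function name):

```python
import numpy as np

def mae(l_act, l_for):
    """Mean absolute error, in MW, over the forecast horizon."""
    l_act = np.asarray(l_act, dtype=float)
    l_for = np.asarray(l_for, dtype=float)
    return float(np.mean(np.abs(l_act - l_for)))
```

Unlike MAPE, MAE is an absolute measure in MW, which is why Table 5 reports both: MAE reflects the scale of the New York load, while MAPE allows comparison across test cases.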

**Table 5.** Day-ahead STLF results for New York City for the two test months of January 2004 and July 2004.

| Type of Forecast | Jan 2004 MAE (MW) | Jan 2004 MAPE (%) | Jul 2004 MAE (MW) | Jul 2004 MAPE (%) |
|---|---|---|---|---|
| New York ISO [27] | 163.05 | 2.86 | 245.47 | 3.55 |
| SVM [3] | 143.21 | 2.38 | 207.74 | 3.03 |
| Hybrid Network [3] | 106.97 | 1.82 | 162.20 | 2.29 |
| Wavelet Transform + Neuro-Evolutionary Algorithm [22] | 87.01 | 1.68 | 123.62 | 2.02 |
| Proposed Strategy | 72.73 | 1.25 | 115.71 | 1.57 |

**Figure 6.** Curves of real values, forecast values, and forecast errors for the test month of January 2004 of Table 5 (the forecast results are those of the proposed strategy).

**Figure 7.** Curves of real values, forecast values, and forecast errors for the test month of July 2004 of Table 5 (the forecast results are those of the proposed strategy).

## 4. Conclusions

## References

- Hobbs, B.F.; Jitprapaikulsarn, S.; Konda, S.; Chankong, V.; Loparo, K.A.; Maratukulam, D.J. Analysis of the value for unit commitment of improved load forecasting. IEEE Trans. Power Syst.
**1999**, 14, 1342–1348. [Google Scholar] [CrossRef] - Shahidehpour, M.; Yamin, H.; Li, Z. Market Operations in Electric Power Systems; Wiley: New York, NY, USA, 2002. [Google Scholar]
- Fan, S.; Chen, L. Short-Term Load Forecasting Based on an Adaptive Hybrid Method. IEEE Trans. Power Syst.
**2006**, 21, 392–401. [Google Scholar] [CrossRef] - Amjady, N.; Daraeepour, A. Mixed Price and Load Forecasting of Electricity Markets by a New Iterative Prediction Method. J. Electr. Power Syst. Res.
**2009**, 79, 1329–1336. [Google Scholar] [CrossRef] - Huang, S.J.; Shih, K.R. Short-term load forecasting via ARMA model identification including non-Gaussian process considerations. IEEE Trans. Power Syst.
**2003**, 18, 673–679. [Google Scholar] [CrossRef] - Amjady, N. Short-Term Hourly Load Forecasting Using Time Series Modeling With Peak Load Estimation Capability. IEEE Trans. on Power Syst.
**2001**, 16, 798–805. [Google Scholar] [CrossRef] - Charytoniuk, W.; Chen, M.S.; van Olinda, P. Nonparametric regression based short-term load forecasting. IEEE Trans. Power Syst.
**1998**, 13, 725–730. [Google Scholar] [CrossRef] - Al-Hamadi, H.M.; Soliman, S.A. Short-term electric load forecasting based on Kalman filtering algorithm with moving window weather and load model. Electr. Power Syst. Res.
**2004**, 68, 47–59. [Google Scholar] [CrossRef] - Khotanzad, A.; Afkhami-Rohani, R.; Maratukulam, D. ANNSTLF—Artificial neural network short-term load forecaster—Generation three. IEEE Trans. Power Syst.
**1998**, 13, 1413–1422. [Google Scholar] [CrossRef] - McMenamin, J.S.; Monforte, F.A. Short-term energy forecasting with neural networks. Energy J.
**1998**, 19, 43–61. [Google Scholar] - Song, K.B.; Ha, S.K.; Park, J.W.; Kweon, D.J.; Kim, K.H. Hybrid load forecasting method with analysis of temperature sensitivities. IEEE Trans. Power Syst.
**2006**, 21, 869–876. [Google Scholar] [CrossRef] - Senjyu, T.; Mandal, P.; Uezato, K.; Funabashi, T. Next day load curve forecasting using hybrid correction method. IEEE Trans. Power Syst.
**2005**, 20, 102–109. [Google Scholar] [CrossRef] - Amjady, N. Short-Term Bus Load Forecasting of Power Systems by a New Hybrid Method. IEEE Trans. Power Syst.
**2007**, 22, 333–341. [Google Scholar] [CrossRef] - Weron, R. Modeling and Forecasting Electricity Loads and Prices: A Statistical Approach; Wiley: Chichester, UK, 2006. [Google Scholar]
- Hippert, H.S.; Pedreira, C.E.; Souza, R.C. Neural networks for short-term load forecasting: A review and evaluation. IEEE Trans. Power Syst.
**2001**, 16, 44–55. [Google Scholar] [CrossRef] - Amjady, N.; Keynia, F. Electricity Market Price Spike Analysis by a Hybrid Data Model and Feature Selection Technique. Electr. Power Syst. Res.
**2010**, 80, 318–327. [Google Scholar] [CrossRef] - Geem, Z.W.; Kim, J.H.; Loganathan, G.V. A new heuristic optimization algorithm: harmony search. Simulation
**2001**, 76, 60–68. [Google Scholar] [CrossRef] - Verma, A.; Panigrahi, B.K.; Bijwe, P.R. Harmony search algorithm for transmission network expansion planning. IET Gener. Transm. Distrib.
**2010**, 4, 663–673. [Google Scholar] [CrossRef] - Lee, K.S.; Geem, Z.W. A new meta-heuristic algorithm for continuous engineering optimization: harmony search theory and practice. Comput. Methods Appl. Mech. Eng.
**2005**, 194, 3902–3933. [Google Scholar] [CrossRef] - Das, S.; Mukhopadhyay, A.; Roy, A.; Abraham, A.; Panigrahi, B.K. Exploratory power of the harmony search algorithm: analysis and improvements for global numerical optimization. IEEE Trans. Syst. Man Cybern. Part B: Cybern.
**2010**, 99, 1–18. [Google Scholar] - Engelbrecht, A.P. Computational Intelligence: An Introduction, 2nd ed.; John Wiley & Sons: Chichester, UK, 2007. [Google Scholar]
- Amjady, N.; Keynia, F. Short-Term Load Forecasting of Power Systems by Combination of Wavelet Transform and Neuro-Evolutionary Algorithm. J. Energy
**2009**, 34, 46–57. [Google Scholar] [CrossRef] - Hush, D.R.; Horne, B.G. Progress in supervised Neural Networks. IEEE Signal Process. Mag.
**1993**, 10, 8–39. [Google Scholar] [CrossRef] - Pennsylvania-New Jersey-Maryland system operator. Available online: http://www.pjm.com (accessed on 1 December 2010).
- Pennsylvania State Climatologist. Available online: http://climate.met.psu.edu (accessed on 5 December 2010).
- Amjady, N.; Keynia, F. Day-ahead Price Forecasting of Electricity Markets by Mutual Information Technique and Cascaded Neuro-Evolutionary Algorithm. IEEE Trans. Power Syst.
**2009**, 24, 306–318. [Google Scholar] [CrossRef] - New York Independent System Operator. Available online: http://www.nyiso.com (accessed on 10 December 2010).

© 2011 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Amjady, N.; Keynia, F.
A New Neural Network Approach to Short Term Load Forecasting of Electrical Power Systems. *Energies* **2011**, *4*, 488-503.
https://doi.org/10.3390/en4030488
