# ANN Sizing Procedure for the Day-Ahead Output Power Forecast of a PV Plant


## Abstract


## 1. Introduction

## 2. Artificial Neural Networks

## 3. Methodology

#### 3.1. Early Assumptions

The activation function of the hidden neurons is the one implemented in MATLAB® (MathWorks Inc., Natick, MA, USA [20]); it maps the generic input x of the neuron into its output y.
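
As an illustration, a widely used hidden-neuron transfer function in the Neural Network Toolbox is the hyperbolic tangent sigmoid (`tansig`); the sketch below assumes this choice, which is not confirmed by the surrounding text:

```python
import math

def tansig(x: float) -> float:
    """Hyperbolic tangent sigmoid transfer function,
    y = 2 / (1 + exp(-2x)) - 1, mathematically equivalent
    to tanh(x); it saturates smoothly between -1 and +1."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0
```

The `2/(1+e^{-2x}) - 1` form is how the Toolbox documents `tansig`; it is numerically the same curve as `tanh`.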

#### 3.2. Data Set Definition

The available data were processed in MATLAB®, MathWorks Inc., Natick, MA, USA, and grouped in three clusters: “training”, “validation”, and “test”. Each group fulfills a specific task:

1. The training set includes the samples employed to train the network. It should contain enough different examples (days) to make the network able to generalize its learning.
2. The validation set contains additional samples (i.e., days not already included in the training set) used by the network to check and validate the training process.
3. The test set is the dataset corresponding to the days actually forecasted by the network.
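
The three clusters can be produced with a simple partition of the available days; the 70/15/15 split fractions below are illustrative assumptions, not values taken from this work:

```python
import random

def split_days(days, train_frac=0.70, val_frac=0.15, seed=0):
    """Partition a list of daily samples into training, validation
    and test clusters. The fractions are illustrative defaults;
    the remainder after training and validation becomes the test set."""
    rng = random.Random(seed)
    shuffled = days[:]          # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# One year of daily samples, here just indexed 0..364.
train, val, test = split_days(list(range(365)))
```

Shuffling before splitting ensures each cluster samples all seasons, which helps the network generalize across weather conditions.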

#### 3.3. Artificial Neural Network Size

The number of weights $N_{w}$ existing within a layered ANN is bound to the number of patterns $N_{p}$ in the training set and outputs $N_{o}$ by the following condition:

$$N_{w} \leq N_{p} \cdot N_{o}$$

If the degrees of freedom (i.e., $N_{w}$) are larger than the constraints associated with the desired response function (i.e., $N_{o}$ and $N_{p}$), the training procedure will be unable to completely constrain the weights in the network.

The number of hidden neurons $N_{n}$ for a three layer network is computed with the following formula, where $N_{i}$ is the number of inputs:

$$N_{n} = \frac{N_{i} + N_{o}}{2} + \sqrt{N_{p}}$$

For networks with more hidden layers, $N_{n}$ should be divided by the number of hidden layers.
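
The sizing condition and the hidden-neuron rule of thumb can be sketched as two small helpers; the encoded forms, $N_{w} \leq N_{p} \cdot N_{o}$ and $(N_{i}+N_{o})/2 + \sqrt{N_{p}}$, are reconstructions consistent with the surrounding discussion, and the example counts are arbitrary:

```python
import math

def weights_constrained(n_w: int, n_p: int, n_o: int) -> bool:
    """Sizing condition N_w <= N_p * N_o: when the free weights
    exceed the constraints imposed by the training patterns and
    outputs, training cannot fully constrain the network."""
    return n_w <= n_p * n_o

def hidden_neurons(n_i: int, n_o: int, n_p: int, hidden_layers: int = 1) -> int:
    """Rule-of-thumb hidden-layer size (assumed form):
    N_n = (N_i + N_o) / 2 + sqrt(N_p), divided by the number
    of hidden layers when more than one is used."""
    n_n = (n_i + n_o) / 2 + math.sqrt(n_p)
    return round(n_n / hidden_layers)
```

For example, with 10 inputs, 24 hourly outputs, and 365 training days, the rule suggests roughly 36 neurons in a single hidden layer.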

## 4. Evaluation Indexes

The hourly error of the forecast is the difference between the measured output power $P_{m,h}$ and the forecasted one $P_{p,h}$ in the same hour h. Therefore, we now consider the normalized mean absolute error $NMAE_{\%}$ as a reference for evaluating the performance of the forecasts:

$$NMAE_{\%} = \frac{100}{N \cdot C} \sum_{h=1}^{N} \left| P_{m,h} - P_{p,h} \right|$$

The $NMAE_{\%}$ is based on the net capacity of the plant C. N is the number of time samples (hours) considered in the evaluated period (i.e., $N=24$ in a daily error basis calculation).
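
Assuming the standard form of the index, $NMAE_{\%}$ can be computed directly from the two hourly power series and the plant capacity:

```python
def nmae_percent(measured, forecast, capacity):
    """Normalized mean absolute error (percent) between measured
    hourly power P_m,h and forecast power P_p,h, normalized by the
    net plant capacity C over the N samples of the period:
    NMAE% = 100 / (N * C) * sum(|P_m,h - P_p,h|)."""
    n = len(measured)
    return 100.0 / (n * capacity) * sum(
        abs(m - p) for m, p in zip(measured, forecast))
```

With N = 24 hourly samples the call yields a daily error; normalizing by capacity rather than by the measured power keeps night hours (zero production) from blowing up the index.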

As each $NMAE_{\%}$ value is a random variable, its trend can be analyzed by means of the theory of parametric estimation. Therefore, when looking for the settings which on average minimize the $NMAE_{\%}$ values, the FFNN will be analyzed as a function of:

1. the number of neurons in a single hidden layer;
2. the number of neurons in a double hidden layer; and
3. the number of trials in the ensemble forecast.

In each analysis:

1. one or more ANN parameters are kept constant;
2. the free parameter to be inspected (i.e., the number of neurons within a layer) is varied within a specific range; and
3. the $NMAE_{\%}$ values of similar networks are calculated.

In this way, a confidence interval for the $NMAE_{\%}$ can be calculated, assuming that the unknown mean of all of the possible $NMAE_{\%}$ values, with those ANN settings, lies within that interval of confidence.
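
The inspection procedure above (fix the other parameters, sweep the free one, collect an error sample per setting) can be sketched as a loop; `run_forecast` is a hypothetical stand-in returning a synthetic $NMAE_{\%}$, since training a real ANN is outside the scope of this sketch:

```python
import random

def run_forecast(neurons: int, rng: random.Random) -> float:
    """Hypothetical stand-in for one trained-ANN forecast trial,
    returning its NMAE%. Here the value is synthetic: an error
    bowl with a minimum near 120 neurons plus Gaussian noise."""
    return 5.0 + 0.01 * abs(neurons - 120) + rng.gauss(0.0, 0.3)

def sweep(neuron_values, n_t=10, seed=0):
    """For each value of the free parameter (hidden neurons),
    perform n_t forecast trials and collect the NMAE% sample
    associated with that setting."""
    rng = random.Random(seed)
    return {n: [run_forecast(n, rng) for _ in range(n_t)]
            for n in neuron_values}

# Sample of n_t = 10 trials for each neuron count from 20 to 140.
samples = sweep(range(20, 141, 20))
```

Each entry of `samples` is the finite sample drawn from the endless population of possible $NMAE_{\%}$ values for that setting, on which the statistics of the next subsection operate.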

#### NMAE Statistical Distributions and Confidence Limits

In order to analyze the $NMAE_{\%}$ behavior with respect to a specific ANN setting, this setting is assigned a starting constant value, and $n_{t}$ forecasts are performed. For each i-th forecast (trial), the $NMAE_{\%}$ is calculated. From a statistical point of view, the group of these $NMAE_{\%}$ values represents a sample of the endless population of all of the possible $NMAE_{\%}$ values related to the forecast performed with those specific ANN settings. After the same parameter has been changed, further $n_{t}$ forecasts are performed, and the associated $NMAE_{\%}$ values are calculated. This procedure is repeated until the maximum value of the ANN parameter is reached. As this value could be arbitrarily large, a reasonable threshold is set. Intuitively, a higher accuracy could be obtained by increasing the number of forecasts for each network, but such a highly time-consuming process should be compensated for by a much more striking performance, otherwise it is not worthwhile. Therefore, a tradeoff between the computational burden and the expected accuracy is defined.

Considering the group of $NMAE_{\%}$ values belonging to the same test set, obtained by the ANN with the given “p” settings, the relative sample mean ${\overline{NMAE}}_{p}$ is an estimator of the mean of all of the possible $NMAE_{\%}$ values, and it is defined as:

$${\overline{NMAE}}_{p} = \frac{1}{n_{t}} \sum_{i=1}^{n_{t}} NMAE_{i,p}$$

where $NMAE_{i,p}$ is the $NMAE_{\%}$ calculated for the i-th trial performed by the ANN with the p-th value of a given setting. In our case, the distribution of the $NMAE_{i,p}$ population is unknown. However, the sample mean can be calculated (3), as well as the sample variance $S_{p}^{2}$:

$$S_{p}^{2} = \frac{1}{n_{t} - 1} \sum_{i=1}^{n_{t}} \left( NMAE_{i,p} - {\overline{NMAE}}_{p} \right)^{2}$$

The confidence interval for the unknown population mean, $\mu_{p}$, is:

$${\overline{NMAE}}_{p} - t_{\alpha/2} \frac{S_{p}}{\sqrt{n_{t}}} \leq \mu_{p} \leq {\overline{NMAE}}_{p} + t_{\alpha/2} \frac{S_{p}}{\sqrt{n_{t}}}$$

The width of this interval depends on:

1. how confident we want to be in our assessment;
2. the sample standard deviation $S_{p}$; and
3. how large our sample size is.

The value of $t_{\alpha/2}$ is set by the relative t-Student distribution, selected according to the degrees of freedom equal to $n_{t} - 1$. Now that the distribution has been estimated, it is possible to define appropriate confidence intervals in which the mean $\mu_{p}$ of the population can be included, according to the sample mean ${\overline{NMAE}}_{p}$.
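
The sample mean, sample standard deviation, and t-Student confidence interval for one setting can be computed as follows; the critical value 2.262 is the tabulated $t_{\alpha/2}$ for 95% confidence with $n_{t} - 1 = 9$ degrees of freedom, and the ten $NMAE_{\%}$ values are synthetic:

```python
import statistics

# Tabulated t-Student critical value t_{alpha/2} for 95% confidence
# and 9 degrees of freedom (n_t = 10 trials).
T_CRIT_95_DF9 = 2.262

def confidence_interval(nmae_values, t_crit=T_CRIT_95_DF9):
    """Confidence interval for the unknown mean of the NMAE%
    population: sample mean +/- t_{alpha/2} * S_p / sqrt(n_t),
    with S_p the sample standard deviation (n_t - 1 denominator)."""
    n_t = len(nmae_values)
    mean = statistics.mean(nmae_values)
    s_p = statistics.stdev(nmae_values)   # sample, not population, stdev
    half_width = t_crit * s_p / n_t ** 0.5
    return mean - half_width, mean + half_width

# Synthetic NMAE% sample for one ANN setting (n_t = 10 trials).
lo, hi = confidence_interval(
    [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.1, 4.7, 5.0, 5.4])
```

Note that `statistics.stdev` already uses the $n_{t}-1$ denominator of the sample variance $S_{p}^{2}$; for a different $n_{t}$ or confidence level, the tabulated critical value must be changed accordingly.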

## 5. Case Study

For each network configuration, the $NMAE_{\%}$ was evaluated. The $NMAE_{\%}$’s sample mean, its sample variance, and the width of its confidence interval for a given degree of confidence were also calculated, as previously described. As the sample variance is different for each network configuration, the related confidence intervals will also have different amplitudes.

## 6. Neurons in a Single Layer ANN

For each number of neurons, the $NMAE_{\%}$ values were calculated. Afterwards, we calculated the average and the variance of this parameter, and we chose intervals of 95% confidence for the sample mean. Both methods used the Levenberg–Marquardt (LM) algorithm, together with an early stopping procedure. The “LM fast” method adopts the default settings, while the second one was changed in order to obtain a slower convergence towards the solution. This ensures higher protection from overlearning, as suggested in the user’s guide [20], together with the default values already set in the Neural Network Toolbox™. In more detail, the parameters for the two “LM” methods are shown in Table 1, where ω is the initial convergence speed, δ is the speed increase, and Φ is the speed decrease applied when the convergence speed is inadequate.

In Figure 1, the mean $NMAE_{\%}$ trend as a function of the neurons in the single hidden layer is shown. The error function reaches a minimum within a broad interval of neurons, after which it seems to increase again. This slight growth is not evident with the slow convergence algorithm (“LM slow”, Figure 2); however, we may expect that the error will not decrease further, as it has reached a minimum region. In fact, while an exact point for the minimum of the sample mean can be determined (the red points in Figure 1 and Figure 2), this value is included in an interval of confidence. If such a range overlaps the intervals of other minima (which are not necessarily adjacent), it is not possible to exactly determine which one of those points actually represents the absolute minimum (the “optimum” value). Two intervals of confidence are called “compatible” if their intersection is not null. The mean points representing intervals compatible with the minimum interval (red point in Figure 1 and Figure 2) have been highlighted in yellow. It can be noted that, for the fast convergence algorithm (“LM fast”, Figure 1) with an equal number of similar networks, the trend is more variable: the higher variance generates wider intervals. As a consequence, we have a higher number of compatible points, and higher uncertainty in the optimum configuration.
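
The compatibility test between two confidence intervals reduces to a non-empty intersection check:

```python
def compatible(interval_a, interval_b):
    """Two confidence intervals are 'compatible' when their
    intersection is not null, i.e., the larger of the lower
    bounds does not exceed the smaller of the upper bounds."""
    lo_a, hi_a = interval_a
    lo_b, hi_b = interval_b
    return max(lo_a, lo_b) <= min(hi_a, hi_b)
```

Every setting whose interval is `compatible` with the minimum-mean interval remains a candidate optimum, which is why wider intervals (higher variance) yield more candidates and more uncertainty.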

The typical error figures ($NMAE_{\%}$) vary according to the weather conditions, as follows: around 2% for a sunny day; 7% for a highly variable cloudy day; and 8% for a typical overcast day. The proposed PHANN approach has already been validated against other methods for day-ahead PV power forecasts in [33]: the results obtained in a real application scenario, after setting the neural network parameters with the methodology proposed here, have proved to reach lower error rates.

## 7. Number of Trials in the Ensemble Forecast

The trend of the ensemble $NMAE_{\%}$ as a function of the number of trials is outlined as follows. The Levenberg–Marquardt training algorithm is adopted with both fast and slow convergence, and the ensemble forecast is performed by an ANN with 120 neurons. A growing number of trials was analyzed, up to a maximum of one thousand.

The $NMAE_{\%}$ is calculated for ten independent ensemble forecasts. The term “independent” means that different ensemble forecasts do not have common trials. The n-th ensemble forecast is performed by using a growing quantity of trials, starting from one trial and going up to one hundred trials, in order to infer the global trend of the ensemble $NMAE_{\%}$, while trying to avoid any possible random influence due to a small number of cases.

Figure 3 and Figure 4 show the ten $NMAE_{\%}$ error curves as a function of the number of averaged outputs used for the ensemble method (grey lines). The mean $NMAE_{\%}$ of the ten ensemble forecasts is depicted in red; the mean $NMAE_{\%}$ of one thousand forecasts is the upper constant dashed green line; and the $NMAE_{\%}$ of the ensemble forecast made by one thousand trials is the lower constant dash-dotted blue line. The red line represents a “trend index”, and it rapidly tends to an asymptotic value that can be considered “stable”.
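
The ensemble output itself is simply the hour-by-hour average of the power forecasts produced by the independent trials; a minimal sketch:

```python
def ensemble_forecast(trial_outputs):
    """Ensemble output: hour-by-hour average of the hourly power
    forecasts produced by independent ANN trials. `trial_outputs`
    is a list of equal-length hourly forecast lists."""
    n_trials = len(trial_outputs)
    return [sum(hour) / n_trials for hour in zip(*trial_outputs)]
```

Averaging cancels part of the trial-to-trial randomness introduced by weight initialization, which is why the ensemble $NMAE_{\%}$ (blue line) sits below the mean of the single-trial errors (green line).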

Therefore, a small ensemble of trials already approaches the accuracy of the ensemble $NMAE_{\%}$ obtained with one thousand trials, which bears the brunt of a much higher computational burden.

## 8. Number of Neurons in a Dual Layer ANN

Figure 5 shows the mean $NMAE_{\%}$ trends of the dual layer networks as a function of the neurons of the first layer, for a total of 50 forecasts for each layout. From 20 to 120 neurons were employed, in increments of 20 neurons. In Figure 6, the neurons of the second layer were fixed at 25%, 50%, and 75% of the first layer neurons, rounded up, originating in this way three different curves. The best result is comparable to the one reached by the single layer networks. However, here the gradient of the orange curve suggests small margins of improvement as the number of neurons increases. Similarly to the analysis carried out for the single layer network, the $NMAE_{\%}$ is calculated for the ensemble of the forecasts, and it is shown together with the mean error of the single outputs. Also in this case, the shape of the curves suggests looking for the optimum layout towards a higher number of neurons. Moreover, the smallest value of the error for the ensemble output is slightly lower than the one obtained by the single layer networks. Therefore, the dual layer networks seem to provide better performance.
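
The grid of dual layer layouts described above (first layer from 20 to 120 neurons in steps of 20; second layer at 25%, 50%, and 75% of the first, rounded up) can be enumerated as:

```python
import math

def dual_layer_layouts(first_layer_sizes, ratios=(0.25, 0.50, 0.75)):
    """Enumerate the dual-layer layouts: for each first-layer size,
    the second layer is fixed at the given fractions of the first
    layer's neurons, rounded up."""
    return {n1: {r: math.ceil(n1 * r) for r in ratios}
            for n1 in first_layer_sizes}

# First hidden layer swept from 20 to 120 neurons in steps of 20.
layouts = dual_layer_layouts(range(20, 121, 20))
```

Each first-layer size therefore yields three candidate networks, one per ratio, matching the three curves of the figure.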

The simulations were run on an Intel® (Santa Clara, CA, USA) Core™ i7-2640M CPU, with an operating frequency of 2.8 GHz, and 8 GB of RAM.

## 9. Conclusions

The most suitable settings minimizing the $NMAE_{\%}$ of the 24 h ahead PV power forecast with a one-year historical dataset are: an ensemble size of ten trials, and 120 neurons in a single layer ANN configuration. The method outlined above is meant to be adopted for setting the most suitable ANN parameters in view of the day-ahead forecast of any PV plant’s output power.

## Author Contributions

## Conflicts of Interest

## References

1. Brenna, M.; Dolara, A.; Foiadelli, F.; Lazaroiu, G.C.; Leva, S. Transient analysis of large scale PV systems with floating DC section. *Energies* **2012**, *5*, 3736–3752.
2. Paulescu, M.; Paulescu, E.; Gravila, P.; Badescu, V. *Weather Modeling and Forecasting of PV Systems Operation*; Green Energy Technology; Springer: London, UK, 2013.
3. Omar, M.; Dolara, A.; Magistrati, G.; Mussetta, M.; Ogliari, E.; Viola, F. Day-ahead forecasting for photovoltaic power using artificial neural networks ensembles. In Proceedings of the 2016 IEEE International Conference on Renewable Energy Research and Applications (ICRERA), Birmingham, UK, 20–23 November 2016; pp. 1152–1157.
4. Cali, Ü. *Grid and Market Integration of Large-Scale Wind Farms Using Advanced Wind Power Forecasting: Technical and Energy Economic Aspects*; Kassel University Press: Kassel, Germany, 2011.
5. Gardner, M.; Dorling, S. Artificial neural networks (the multilayer perceptron)—A review of applications in the atmospheric sciences. *Atmos. Environ.* **1998**, *32*, 2627–2636.
6. Lippmann, R. An introduction to computing with neural nets. *IEEE ASSP Mag.* **1987**, *4*, 4–22.
7. Gandelli, A.; Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E. Hybrid model analysis and validation for PV energy production forecasting. In Proceedings of the International Joint Conference on Neural Networks, Beijing, China, 6–11 July 2014.
8. Nelson, M.M.; Illingworth, W.T. *A Practical Guide to Neural Nets*; Addison-Wesley: Reading, MA, USA, 1991; Volume 1.
9. Pelland, S.; Remund, J.; Kleissl, J.; Oozeki, T.; De Brabandere, K. Photovoltaic and Solar Forecasting: State of the Art. *Int. Energy Agency Photovolt. Power Syst. Program. Rep.* **2013**, *14*, 1–40.
10. Jayaweera, D. *Smart Power Systems and Renewable Energy System Integration*; Springer International Publishing: Cham, Switzerland, 2016; Volume 57.
11. Kalogirou, S.A. *Artificial Intelligence in Energy and Renewable Energy Systems*; Nova Science Publishers: Commack, NY, USA, 2006.
12. Raza, M.Q.; Nadarajah, M.; Ekanayake, C. On recent advances in PV output power forecast. *Sol. Energy* **2016**, *136*, 125–144.
13. Li, G.; Shi, J.; Zhou, J. Bayesian adaptive combination of short-term wind speed forecasts from neural network models. *Renew. Energy* **2011**, *36*, 352–359.
14. Quan, D.M.; Ogliari, E.; Grimaccia, F.; Leva, S.; Mussetta, M. Hybrid model for hourly forecast of photovoltaic and wind power. In Proceedings of the IEEE International Conference on Fuzzy Systems, Hyderabad, India, 7–10 July 2013.
15. Dolara, A.; Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E. A physical hybrid artificial neural network for short term forecasting of PV plant power output. *Energies* **2015**, *8*, 1138–1153.
16. Leva, S.; Dolara, A.; Grimaccia, F.; Mussetta, M.; Ogliari, E. Analysis and validation of 24 hours ahead neural network forecasting of photovoltaic output power. *Math. Comput. Simul.* **2017**, *131*, 88–100.
17. Ripley, B.D. Statistical Ideas for Selecting Network Architectures. In *Neural Networks: Artificial Intelligence and Industrial Applications*, Proceedings of the Third Annual SNN Symposium on Neural Networks, Nijmegen, The Netherlands, 14–15 September 1995; Kappen, B., Gielen, S., Eds.; Springer: London, UK; pp. 183–190.
18. Benardos, P.G.; Vosniakos, G.-C. Optimizing feedforward artificial neural network architecture. *Eng. Appl. Artif. Intell.* **2007**, *20*, 365–382.
19. Basheer, I.; Hajmeer, M. Artificial neural networks: Fundamentals, computing, design, and application. *J. Microbiol. Methods* **2000**, *43*, 3–31.
20. Beale, M.H.; Hagan, M.T.; Demuth, H.B. *Neural Network Toolbox™ User’s Guide*; MathWorks Inc.: Natick, MA, USA, 1992.
21. Widrow, B.; Lehr, M.A. 30 Years of Adaptive Neural Networks: Perceptron, Madaline, and Backpropagation. *Proc. IEEE* **1990**, *78*, 1415–1442.
22. Cover, T.M. *Geometrical and Statistical Properties of Linear Threshold Devices*; Department of Electrical Engineering, Stanford University: Stanford, CA, USA, 1964.
23. Cover, T.M. Capacity problems for linear machines. *Pattern Recognit.* **1968**, 283–289.
24. Baum, E.B.; Haussler, D. What Size Net Gives Valid Generalization? *Neural Comput.* **1989**, *1*, 151–160.
25. Hertz, J.; Krogh, A.; Palmer, R.G. *Introduction to the Theory of Neural Computation*; Westview Press: Boulder, CO, USA, 1991; Volume 1.
26. Grossman, T.; Meir, R.; Domany, E. Learning by Choice of Internal Representations. *Complex Syst.* **1989**, *2*, 555–575.
27. Chow, S.K.H.; Lee, E.W.M.; Li, D.H.W. Short-term prediction of photovoltaic energy generation by intelligent approach. *Energy Build.* **2012**, *55*, 660–667.
28. Frederick, M. *Neuroshell Manual*; Ward Systems Group Inc.: Frederick, MD, USA, 1996; Volume 2.
29. Mezard, M.; Nadal, J.-P. Learning in feedforward layered networks: The tiling algorithm. *J. Phys. A Math. Gen.* **1989**, *22*, 2191–2203.
30. Castellano, G.; Fanelli, A.M.; Pelillo, M. An iterative pruning algorithm for feedforward neural networks. *IEEE Trans. Neural Netw.* **1997**, *8*, 519–531.
31. Huang, S.-C.; Huang, Y.-F. Bounds on the number of hidden neurons in multilayer perceptrons. *IEEE Trans. Neural Netw.* **1991**, *2*, 47–55.
32. Shao, J. Fundamentals of Statistics. In *Mathematical Statistics*; Springer: New York, NY, USA, 2003; pp. 91–160.
33. Ogliari, E.; Dolara, A.; Manzolini, G.; Leva, S. Physical and hybrid methods comparison for the day ahead PV output power forecast. *Renew. Energy* **2017**, *113*, 11–21.
34. Dolara, A.; Leva, S.; Mussetta, M.; Ogliari, E. PV hourly day-ahead power forecasting in a micro grid context. In Proceedings of the EEEIC 2016—International Conference on Environment and Electrical Engineering, Florence, Italy, 6–8 June 2016.

**Figure 1.** Mean normalized mean absolute error ($NMAE_{\%}$) as a function of the number of neurons, with the Levenberg–Marquardt training algorithm set for a faster convergence. Compatible mean $NMAE_{\%}$ values in yellow, with 95% intervals of confidence; the minimum in red.

**Figure 2.** Mean $NMAE_{\%}$ as a function of the number of neurons, with the Levenberg–Marquardt training algorithm set for a slower convergence. Compatible mean $NMAE_{\%}$ values in yellow, with 95% intervals of confidence; the minimum in red.

**Figure 3.** $NMAE_{\%}$ of ten random ensembles (grey lines) as a function of the growing number of trials. The artificial neural network (ANN) has 120 neurons and the “LM fast” training algorithm. The mean of the ten ensemble forecasts ${\overline{NMAE}}_{p}$ is in red, the mean $NMAE_{\%}$ of one thousand forecasts is the dashed green line, and the ensemble $NMAE_{\%}$ of one thousand forecasts is the dash-dotted blue line.

**Figure 4.** $NMAE_{\%}$ of ten random ensembles (grey lines) as a function of the growing number of trials. The ANN has 120 neurons and the “Levenberg–Marquardt (LM) slow” training algorithm. The mean of the ten ensemble forecasts ${\overline{NMAE}}_{p}$ is in red, the mean $NMAE_{\%}$ of one thousand forecasts is the dashed green line, and the ensemble $NMAE_{\%}$ of one thousand forecasts is the dash-dotted blue line.

**Figure 5.** Mean $NMAE_{\%}$ of 50 different forecasts as a function of the hidden layers’ sizes, with a constant ratio of neurons.

**Figure 6.** Comparison between the mean $NMAE_{\%}$ and the ensemble $NMAE_{\%}$ of 50 trials as a function of the hidden layers’ sizes, kept at a constant ratio of neurons.

**Table 1.** Parameters of the “LM fast” and “LM slow” training methods.

| Settings | LM Fast (Default) | LM Slow |
|---|---|---|
| ω | 10^{−3} | 1 |
| δ | 10 | 2 |
| Φ | 10^{−1} | −0.5 |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E.
ANN Sizing Procedure for the Day-Ahead Output Power Forecast of a PV Plant. *Appl. Sci.* **2017**, *7*, 622.
https://doi.org/10.3390/app7060622
