Wind energy is characterized by instability due to a number of factors, such as weather, season, time of day, and climatic area. This instability in wind energy generation brings new challenges to electric power grids in terms of reliability, flexibility, and power quality. Addressing these challenges requires advanced techniques for accurate forecasting of wind energy. In this context, wind energy forecasting is closely tied to machine learning (ML) and deep learning (DL) as emerging technologies for creating an intelligent energy management paradigm. This article addresses the short-term wind energy forecasting problem in Estonia using a historical wind energy generation data set. Moreover, we taxonomically review the state-of-the-art ML and DL algorithms for wind energy forecasting and implement several widely used ML and DL algorithms for the day-ahead forecast. A detailed exploratory data analysis is conducted for the selection of model parameters. All models are trained, for the first time, on a real-time Estonian wind energy generation dataset with a frequency of 1 h. The main objective of the study is to foster an efficient forecasting technique for Estonia. The comparative analysis of the results indicates that Support Vector Machine (SVM), Non-linear Autoregressive Neural Networks (NAR), and Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM) are respectively 10%, 25%, and 32% more efficient than the TSO's forecasting algorithm. Therefore, RNN-LSTM is the best-suited and computationally effective DL method for wind energy forecasting in Estonia and can serve as a future solution.

Worldwide energy demand is increasing every year, and so is the environmental pollution caused by brown energy generation from fossil fuels. Therefore, the use of Renewable Energy Resources (RES) such as solar and wind has gained popularity due to lower carbon emissions. However, wind energy generation is variable and unstable due to variations in wind speed [

In the past, several research works have used deep learning methods for wind speed forecasting and wind power generation forecasting. A bibliometric visualization of the keywords used in studies related to wind energy forecasting over the past 5 years has been made in the VOSviewer software and is depicted in

Paper | Algorithms | Data size | Duration | Description |
---|---|---|---|---|
[ | ELM | 4 months | 1 h | ELM has better accuracy for multistep ahead forecasting. |
[ | SVM | 4 years | 1 day | Feature extraction based SVM outperforms KNN. |
[ | Randomizable filter classifier | 4 years | 1 day | Randomizable filter classifier gives better forecasting and a lower error compared to KNN. |
[ | SVM, ANN | 3 years | 1 day | ANN is found to be more accurate than SVM. |
[ | RNN-LSTM | 1 year | 1 h | LSTM forecasting with 10.43% RMSE. |
[ | SVM, KNN | 4 years | 1 day-1 month | KNN algorithm outperforms SVM, RT, and ET. |
[ | SVM, RBFNN | 2 years | 2 h | A hybrid model based on SVM and RBFNN gives only 6.84% MAPE. |
[ | RNN-LSTM | 4 years | 1 day | LSTM gives better forecasting. |
[ | DNN | 10 years | 1 day | DNN gives better forecasting compared to SVM. |
[ | ANFIS | 2 years | 1 h | ANFIS gives 2.25%, 3.35% and 3.86% MAPE. |
[ | Linear regression | 2 years | 6 h | ML algorithms give more accurate forecasting compared to statistical methods. |
[ | SVM, ANN | 2 years | 1 day | A hybrid SVM-ANN model outperforms individual models. |
[ | RNN-LSTM | 1 year | 3 days | LSTM is 25% more accurate than statistical methods. |
[ | ARIMA, ELM | 1 month | 1 day | Hybrid ARIMA-ELM gives MAPE of 2.21%, 2.94%, and 3.2% for three different sites. |
[ | TDCNN | 4 years | 1–24 h | TDCNN gives lower RMSE for forecasts up to 24 h ahead. |
[ | GBM | 3.75 years | 21–45 h | The improvement in GBM reaches on average 1% on MAE and 0.9% on RMSE. |
[ | SVM/ANN | 2.5 years | 6–24 h | SVM gives better 24 h ahead forecasting results than ANN. |
[ | RNN, KNN | 3 years | 1 day | LSTM is 18.3% more accurate than KNN and SVM. |
[ | MLP | 3 years | 70 h | ANN based MLP gives accurate forecasting for 70 h. |
Our work | LR, TR, SVM, ANFIS, AR, ARIMA, NAR, LSTM | 2–8 years | 24 h | RNN-LSTM gives better forecasting compared to LR, TR, SVM, ANFIS, ARIMA, AR and NAR. |

From all the above studies, it is clear that ML and DL algorithms are very useful in wind energy forecasting. However, making an accurate prediction remains difficult, and a universal model is not possible. Every scenario therefore requires a local dataset of wind speed, weather information, and location, and each model needs to be customized, built, and then trained. Accurate forecasting helps in the better management of demand and supply, smooth operation, flexibility, and reliability, and it also has economic implications.

In this research, different ML and DL forecasting algorithms are compared for day-ahead wind energy generation forecasting in Estonia. The one-year historical data set of Estonian wind energy generation was taken from the Estonian Transmission System Operator (TSO), ELERING [

The key contributions of this paper are summarized as follows:

To address the problem of wind energy forecasting in Estonia, state-of-the-art ML and DL algorithms are implemented and rigorously compared based on performance indices, such as root mean square error, computational complexity, and training time.

A detailed exploratory data analysis is conducted for the selection of optimal model parameters, which proves to be an essential part of all implemented ML and DL algorithms.

A total of six ML and two DL algorithms are implemented: linear regression, tree regression, SVM, ARIMA, AR, and ANFIS as ML algorithms, and NAR and RNN-LSTM as DL algorithms. All implemented algorithms are thoroughly compared with the wind energy forecast currently produced by the TSO, and our proposed RNN-LSTM forecasting algorithm proves to be a more accurate and effective solution based on the performance indices.

The structure of the paper is shown in

The most common ML tool for forecasting is regression-based algorithms [

This simplest and most commonly used algorithm computes a linear relationship between the output and the input variables. There can be more than one input variable. The general equation for linear regression, along with its details, can be found in [
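As an illustration of a linear relationship between one output and several input variables, the following is a minimal sketch of an ordinary least-squares fit. It is not the implementation used in this study; the function names `fit_linear_regression` and `predict_linear` and the toy data are assumptions introduced only for this example.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Least-squares fit of y ~ X @ w + b; returns (w, b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the intercept is estimated with the weights.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def predict_linear(X, w, b):
    """Apply the fitted linear relationship to new inputs."""
    return np.asarray(X, dtype=float) @ w + b

# Toy data (hypothetical): two input variables with y = 2*x1 - x2 + 3.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [3.0, 5.0]])
y_demo = 2.0 * X_demo[:, 0] - X_demo[:, 1] + 3.0
w_demo, b_demo = fit_linear_regression(X_demo, y_demo)
```

For a time series, the columns of `X` would typically be lagged values of the series itself, which is one possible way to use such a model for forecasting.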

This algorithm deploys a separate regression model for the different dependent variables, as these variables could belong to the same class. Then further trees are made at different time intervals for the independent variables. Finally, the sum of errors is compared and evaluated in each iteration, and this process continues until the lowest RMSE value is achieved. The general equation and the details of the algorithm are described in [

SVM is another commonly used ML algorithm due to its accuracy. In SVM, an error margin called ‘epsilon’ is defined and the objective is to reduce epsilon in each iteration. An error tolerance threshold is used in each iteration as SVM is an approximate method. Moreover, in SVM, two sets of variables are defined along with their constraints by converting the primal objective function into a Lagrange function. Further details of this algorithm are given in [
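The role of the error margin can be illustrated with the epsilon-insensitive loss that is standard in support vector regression: residuals smaller than epsilon incur no penalty, and larger residuals are penalized linearly. The sketch below is only an illustration under that standard definition; the function name and the value `epsilon=0.1` are assumptions, not settings reported in this study.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss max(0, |y_true - y_pred| - epsilon)."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    # Errors inside the epsilon tube are ignored; the rest grow linearly.
    return np.maximum(0.0, residual - epsilon)

# Hypothetical values: the first prediction falls inside the tube, the others do not.
loss_demo = epsilon_insensitive_loss([10.0, 10.0, 10.0], [10.05, 10.5, 9.0], epsilon=0.1)
```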

The RNN is usually categorized as a deep-learning algorithm. The RNN algorithm used in this paper is the Long Short-Term Memory (LSTM) [

This algorithm uses feedforward neural network architecture to forecast future values. This algorithm consists of three layers, and the forecasting is done iteratively. For a step ahead forecast, only the previous data is used. However, for the multistep ahead, previous data and forecasted results are also used, and this process is repeated until the forecast for the required prediction horizon is achieved. The mathematical relationship between input and output is as follows [

The Nonlinear Autoregressive Neural Network (NAR-NN) predicts the future values of the time series by exploring the nonlinear regression between the given time series data. The predicted output values are fed back as inputs for the prediction of new future values. The NAR-NN network is designed and trained as an open-loop system. After training, it is converted into a closed-loop system to capture the nonlinear features of the generated output [

This model is usually applied to such datasets that exhibit nonstationary patterns like wind energy datasets. There are mainly three parts of the ARIMA algorithm. The first part is AR where the output depends only on the input and its previous values.

The third part ‘I’ describes that the data have been updated by the amount of error calculated at each step to improve the efficiency of the algorithm. The final equation of ARIMA is as follows [

This algorithm is a hybrid of ANN and fuzzy logic. In the first step, the Takagi-Sugeno-Kang fuzzy inference modeling method is used to develop the fuzzy inference system [

This algorithm works on the Error Backpropagation (EBP) model. The model employs a Least Square Estimator (LSE) in the last layer, which optimizes the parameters of the fuzzy membership function. The EBP reduces the error in each iteration and then defines new ratios for the parameters to obtain optimized results. However, the learning algorithm is implemented in the first layer. The parameters defined in this method are usually linear [

Estonia is a Baltic country located in the northeastern part of Europe. Most of its energy is generated from fossil fuels, although RESs also contribute significantly. The distribution of RES and non-RES energy generation is shown in

In Estonia, a total of 139 wind turbines are currently installed, mainly along the coast of the Baltic Sea [

The data set used in this article is the Estonian general data on wind energy generation from 1 January 2011 to 31 May 2019. The frequency of the data set is one hour. This data set is highly variable due to the weather conditions in Estonia. The maximum value of wind energy production in the aforementioned period is nearly 273 MWh, the mean value is 76.008 MWh, the median is 57.233 MWh, and the standard deviation is 61.861 MWh. The moving average and the moving standard deviation are used to demonstrate the variable, dynamic nature of the time series dataset for Estonian wind power generation.
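The descriptive statistics and the moving average and moving standard deviation mentioned above can be computed along the following lines. This is a minimal sketch on synthetic data, not the analysis code of this study: the 24 h window, the use of the sample standard deviation, and the synthetic stand-in series are all assumptions made for illustration.

```python
import numpy as np

def summarize(series):
    """Maximum, mean, median and sample standard deviation of a series."""
    x = np.asarray(series, dtype=float)
    return {"max": x.max(), "mean": x.mean(),
            "median": np.median(x), "std": x.std(ddof=1)}

def rolling_mean_std(series, window=24):
    """Moving average and moving standard deviation over full windows only."""
    x = np.asarray(series, dtype=float)
    # Each row of `windows` is one window of `window` consecutive samples.
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    return windows.mean(axis=1), windows.std(axis=1, ddof=1)

# Synthetic hourly stand-in for the generation series (NOT the Estonian data).
rng = np.random.default_rng(0)
demo_series = np.clip(76.0 + 60.0 * rng.standard_normal(24 * 30), 0.0, None)
stats_demo = summarize(demo_series)
roll_mean, roll_std = rolling_mean_std(demo_series, window=24)
```

A moving average or moving standard deviation that changes noticeably over time is an indication that the series is nonstationary.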

The histogram and the probability density function (PDF) of the data are shown in

In time-series analysis, the autocorrelation measure is a very useful tool to observe the regression nature of the time-series data, and it provides a bird's-eye view for the selection of the number of lags if any regression-based forecasting model is employed. It is the correlation of the signal with its delayed version, which checks the dependency on previous values. The graph shows lags up to 20 h: lags up to the previous 16 h have a correlation value above 0.5, after which it drops significantly below 0.5. The confidence interval is identified by the calculated 2
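As a sketch of how the autocorrelation at each lag can be computed and used to pick the number of lags, the following example correlates a series with delayed copies of itself. The function names, the 0.5 threshold as a selection rule, and the sine-wave test signal are assumptions for illustration; they are not taken from the implementation behind the reported figure.

```python
import numpy as np

def autocorrelation(series, max_lag=20):
    """Sample autocorrelation for lags 0..max_lag (lag 0 equals 1)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    # Correlate the signal with a copy of itself delayed by k samples.
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def lags_above(acf, threshold=0.5):
    """Count consecutive lags, starting at lag 1, whose value exceeds threshold."""
    count = 0
    for value in acf[1:]:
        if value <= threshold:
            break
        count += 1
    return count

# Hypothetical test signal: a slow sine wave sampled hourly.
t = np.arange(480)
acf_demo = autocorrelation(np.sin(2.0 * np.pi * t / 48.0), max_lag=20)
```

Applied to the hourly generation series, `lags_above` would return the number of past hours whose correlation stays above the chosen threshold, which can guide the lag order of a regression-based model.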

The Estonian wind energy dataset has been used in this research. The dataset is divided into training, testing, and validation sets in proportions of 80%, 10%, and 10%, respectively. All simulations are carried out in MATLAB R2021a on a Windows 10 platform running on an Intel Core i7-9700 CPU with 64 GB RAM. Initially, the training data was standardized to zero mean and unit variance to avoid divergence during training. The same procedure was carried out for the test data as well. The prediction features and the response output parameter have also been defined for multistep ahead forecasting. The Estonian TSO is responsible for the forecasting of wind energy generation on an hourly basis. Their prediction algorithm forecasts wind energy generation 24 h in advance. It also generates the total energy production and the anticipated energy consumption.
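The data preparation described above can be sketched as follows. This is an illustration rather than the MATLAB code used in the study, and it rests on several assumptions: the split is chronological, the three parts are taken in the order training, testing, validation, and the test data are scaled with the mean and standard deviation of the training data.

```python
import numpy as np

def chronological_split(series, train_frac=0.8, test_frac=0.1):
    """Split a series in time order into training, testing and validation parts."""
    x = np.asarray(series, dtype=float)
    n_train = int(len(x) * train_frac)
    n_test = int(len(x) * test_frac)
    train = x[:n_train]
    test = x[n_train:n_train + n_test]
    validation = x[n_train + n_test:]  # the remaining share (about 10%)
    return train, test, validation

def standardize(values, mean, std):
    """Scale values to zero mean and unit variance using the given statistics."""
    return (np.asarray(values, dtype=float) - mean) / std

# Hypothetical series of 100 hourly samples.
train, test, validation = chronological_split(np.arange(100.0))
mu, sigma = train.mean(), train.std()
train_z = standardize(train, mu, sigma)
test_z = standardize(test, mu, sigma)  # assumption: reuse training statistics
```

Reusing the training statistics for the test data keeps information from the test period out of the model inputs; forecasts are mapped back to MWh by multiplying by `sigma` and adding `mu`.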

Most of the time, the actual energy generation is much higher than the forecast values. The gap can reach 70 MWh, which is large relative to the mean generation. Forecasting algorithms need to be more accurate than that. Such an underestimate can wrongly lead the energy supplier to use alternative energy sources rather than wind. These may be fossil fuels or any other resource, which will cost more to the supplier and eventually to the customer. This low accuracy motivated us to study, develop, and propose a more suitable forecasting algorithm for the prediction of wind power generation in Estonia.

In this study, the emphasis is on the accurate prediction of wind energy generation in Estonia. Eight different algorithms based on machine learning and DL are simulated and tested using the 1-year wind energy generation data set for a day-ahead prediction horizon. The results of all employed algorithms are compared based on RMSE values.
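Since all comparisons in this study are based on RMSE, a minimal sketch of the metric is given below for reference. The function name and the example values are hypothetical and are not measurements from the Estonian data set.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error between actual and forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # Square the errors, average them, then take the square root.
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Hypothetical hourly values: errors of -3, 4 and 0.
rmse_demo = rmse([50.0, 60.0, 70.0], [53.0, 56.0, 70.0])
```

Because the errors are squared before averaging, RMSE penalizes large individual misses more heavily than a mean absolute error would, and it is expressed in the same unit as the forecast quantity.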

The wind power generation data under study is highly nonlinear; therefore, a wide variety of linear and nonlinear forecasting algorithms needs to be tested to find an appropriate option. A thorough comparative analysis is conducted to compare the accuracies of all forecasting algorithms employed in this paper. Machine learning algorithms such as linear regression, AR, ARIMA, and tree-based regression do not perform adequately, while SVM gives good forecast accuracy.

In contrast, deep-learning algorithms such as NAR and RNN have a high degree of accuracy compared to all other algorithms employed, as the architectures of both algorithms are able to capture nonlinear features of the data. ANFIS, however, also gives relatively low accuracy. Because the data is highly nonlinear, the ML algorithms fit the curve less well and therefore yield lower accuracy than the DL methods.

DL models, in contrast, fit the curve better owing to their ANN architecture and therefore give more accurate forecasting results. These results indicate that, for this time-series forecasting problem, DL methods are more efficient than ML methods. The comparative analysis of ML algorithms and DL algorithms based on the RMSE value is depicted in

Algorithm name | RMSE value | Algorithm name | RMSE value |
---|---|---|---|
Linear regression | 37 | AR | 64 |
Tree-based regression | 34 | ARIMA | 78 |
SVM | 18 | ANFIS | 44 |
RNN-LSTM | 13 | NAR | 16 |

Furthermore, it is pertinent to mention that this energy forecasting topic has been under investigation for decades. The main issue is still the accuracy of forecasting. The main focus is to forecast wind energy on the basis of past data and not wind speed. Some researchers have tried to develop some hybrid models as well. However, it is extremely difficult to compare the results of these studies with our study as there are many parameters involved like the size of the dataset, location, time span, and then the algorithm used.

In this study, the best results are shown by the RNN-LSTM algorithm. The LSTM layer consists of 100 hidden units. This number was obtained by trial and error, varying the number of hidden units from 20 to 250; the model showed the best results for 100 units, and beyond that the results remained almost the same. The model uses historical data only; therefore, the number of features is one and the number of responses is also one. The algorithm is trained with the 'ADAM' solver, and the number of epochs was varied from 50 to 250. An epoch is one complete pass of the whole training data set through the neural network, covering both forward and backward propagation. Training starts from an initial learning rate of 0.005, and after a certain number of epochs the learning rate is dropped to a lower value. The gradient threshold is set to one. The simulation parameters are described in
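The learning-rate drop and the gradient threshold described above can be illustrated with the following sketch. It is not the training configuration of this study: only the initial rate of 0.005, the threshold of one, and the drop periods listed in the parameter table come from the text, while the drop factor of 0.2, the 1-based epoch convention, and clipping by the L2 norm are assumptions made for illustration.

```python
import numpy as np

def piecewise_learning_rate(epoch, initial_rate=0.005, drop_period=125, drop_factor=0.2):
    """Learning rate for a 1-based epoch under a piecewise-constant schedule."""
    # The rate is multiplied by drop_factor once per completed drop period.
    n_drops = (epoch - 1) // drop_period
    return initial_rate * drop_factor ** n_drops

def clip_gradient(grad, threshold=1.0):
    """Rescale the gradient so that its L2 norm does not exceed threshold."""
    grad = np.asarray(grad, dtype=float)
    norm = np.linalg.norm(grad)
    if norm > threshold:
        return grad * (threshold / norm)
    return grad

rate_first_half = piecewise_learning_rate(125)   # still the initial rate
rate_second_half = piecewise_learning_rate(126)  # after the first drop
clipped_demo = clip_gradient([3.0, 4.0], threshold=1.0)
```

With 250 epochs and a drop period of 125, such a schedule trains the first half at the initial rate and the second half at the reduced rate.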

Data size (Years) | No. of hidden states | Epoch | Learn rate drop period | Training time | RMSE |
---|---|---|---|---|---|
2 | 50 | 250 | 125 | 3:34 | 9.47 |
2 | 100 | 250 | 125 | 8:03 | 9.44 |
2 | 200 | 250 | 125 | 14:23 | 9.44 |
2 | 100 | 200 | 100 | 7:15 | 9.44 |
2 | 100 | 100 | 50 | 2:31 | 9.99 |
2 | 100 | 50 | 25 | 1:16 | 11.9 |
2 | 200 | 50 | 25 | 3:18 | 12.67 |
2 | 200 | 100 | 50 | 7:54 | 9.53 |

To make multistep predictions, the prediction function forecasts a single time step and then updates the state of the network after each prediction. The output of each step then acts as the input for the next step. The size of the data is also varied between 1 month and 96 months to observe its impact on the forecasting algorithm. The simulation results show that increasing the data size beyond 24 months does not affect the performance of the algorithm; almost the same RMSE value is obtained for 36, 60, and 96 months. The comparison is shown in
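The multistep procedure described above, in which each one-step forecast is fed back as the next input, can be sketched as follows. The sketch uses a stateless one-step model acting on a window of lagged values instead of an LSTM that updates its internal state, so it illustrates only the feedback principle; `recursive_forecast`, `mean_of_window`, and the window length are hypothetical names and choices.

```python
import numpy as np

def recursive_forecast(one_step_model, history, horizon=24, n_lags=3):
    """Forecast `horizon` steps by feeding each prediction back as an input."""
    window = list(np.asarray(history, dtype=float)[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        next_value = float(one_step_model(np.array(window)))
        forecasts.append(next_value)
        # Drop the oldest value and append the new forecast as the latest input.
        window = window[1:] + [next_value]
    return np.array(forecasts)

def mean_of_window(window):
    """Toy one-step model (assumption): predict the mean of the lag window."""
    return window.mean()

forecast_demo = recursive_forecast(mean_of_window, [1.0, 2.0, 3.0], horizon=3, n_lags=3)
```

Because later steps rely on earlier forecasts rather than on measurements, errors can accumulate over the horizon, which is one reason why day-ahead forecasting is harder than one-step-ahead forecasting.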

Data size (Months) | No. of hidden states | Epoch | Learn rate drop period | Training time | RMSE |
---|---|---|---|---|---|
1 | 100 | 200 | 100 | 0:23 | 46.45 |
6 | 100 | 200 | 100 | 1:42 | 18.43 |
12 | 100 | 200 | 100 | 3:27 | 13.67 |
24 | 100 | 200 | 100 | 7:15 | 9.44 |
36 | 100 | 200 | 100 | 10:40 | 9.44 |
60 | 100 | 200 | 100 | 17:49 | 9.44 |
96 | 100 | 200 | 100 | 28:22 | 9.44 |

In the past decade, ML and DL have become promising tools for forecasting problems. The highly nonlinear behavior of weather parameters, especially wind speed, makes wind energy forecasting for smart grids a challenging problem for ML and DL algorithms. Moreover, an accurate time-series forecasting algorithm can help provide flexibility in modern grids and has economic and technical implications for demand and supply management and for power flow analysis in power transmission networks. In this paper, six ML and two DL forecasting algorithms are implemented and compared for Estonian wind energy generation data.

Wind energy accounts for approximately 35% of total renewable energy generation in Estonia. This is the first attempt to provide an effective forecasting solution for the Estonian energy sector to maintain power quality on the existing electricity grid. We target the day-ahead prediction horizon, which is the normal practice for the TSO's wind energy forecasting model. Real-time year-long wind energy generation data are used for the comparative analysis of the ML and DL algorithms employed. Moreover, the results of all employed models are also compared with the forecasting results of the TSO's algorithm. The comparison of all ML and DL algorithms is based on performance indices, such as RMSE, computational complexity, and training time. For example, the results for May 31, 2019, show that the TSO's forecasting algorithm has an RMSE value of 20.48, whereas SVM, NAR, and RNN-LSTM have lower RMSE values. The results show that SVM, NAR, and RNN-LSTM are respectively 10%, 25%, and 32% more efficient than the TSO's forecasting algorithm. Therefore, it is concluded that the RNN-LSTM based DL forecasting algorithm is the best-suited forecasting solution among all compared techniques for this case.

We acknowledge the support received from the Estonian Research Council Grants PSG142, PUT1680, Estonian Centre of Excellence in Zero Energy and Resource Efficient Smart Buildings and Districts ZEBE, Grant 2014-2020.4.01.15-0016 funded by European Regional Development Fund.