Wind and solar energy are two popular forms of renewable energy that are used in microgrids and facilitate the transition towards net-zero carbon emissions by 2050. However, they are highly unpredictable because they depend strongly on weather and atmospheric conditions. In microgrids, smart energy management systems, such as integrated demand response programs, are set up to run on a step-ahead basis, which means that accurate forecasting of wind speed and solar irradiance intervals is becoming increasingly crucial to the optimal operation and planning of microgrids. With this in mind, a novel “bidirectional long short-term memory network” (Bi-LSTM)-based, deep stacked, sequence-to-sequence autoencoder (S2SAE) forecasting model for predicting short-term solar irradiation and wind speed was developed and evaluated in MATLAB. To create the deep stacked S2SAE prediction model, a deep Bi-LSTM-based encoder and decoder are stacked on top of one another to reduce the dimension of the input sequence, extract its features, and then reconstruct it to produce the forecasts. The hyperparameters of the proposed deep stacked S2SAE forecasting model were optimized using the Bayesian optimization algorithm. Moreover, the forecasting performance of the proposed Bi-LSTM-based deep stacked S2SAE model was compared to that of three other deep and shallow stacked S2SAEs, i.e., the LSTM-based deep stacked S2SAE model, the gated recurrent unit (GRU)-based deep stacked S2SAE model, and the Bi-LSTM-based shallow stacked S2SAE model. All these models were also modeled and optimized in MATLAB. The results, simulated using actual data, confirmed that the proposed model outperformed the alternatives by achieving an accuracy of up to 99.7%, which evidenced the high reliability of the proposed forecasting model.

With the rapid development of smart grids, microgrids have been garnering increasing interest as a unique method of power delivery. A microgrid is a small-scale, self-sustained, intelligent power system designed to deliver electricity to regional and local users, e.g., businesses. It can operate in grid-connected mode or in an islanded state during grid disruptions. Notably, such solutions have the potential to minimize energy delivery costs, increase load-point dependability, improve power quality, reduce emissions from power generation, manage investment costs in power transmission, and reduce the vulnerability of large-scale power systems [

Two of the most popular renewable energy sources (RES) used in multi-energy microgrids are wind and solar energy. However, both are highly unpredictable because they depend strongly on weather and atmospheric conditions, which means they need suitable controllers to extract constant maximum power [

The many benefits offered by deep learning (DL), including superior generalization capabilities, the ability to process large datasets, and support for both supervised and unsupervised learning algorithms, have proven invaluable in the development of forecasting solutions. Supervised learning algorithms learn the mapping function between the input and output variables of a labelled dataset, which allows the machine learning model to associate, for example, a signal dataset with an activity class. In contrast, unsupervised learning algorithms work on unlabelled data and can learn features from raw datasets and reconstruct the underlying patterns [

What distinguishes the supervised and unsupervised learning algorithms is the large-scale hierarchical data representation and several linear layers of processing. Therefore, increased computational complexity and a higher number of layers can contribute to a more intricate design of DL models. DL algorithms may be used to assess and utilize critical aspects of big data by facilitating the extraction of complicated patterns from enormous datasets, data tagging, semantic indexing, quick information retrieval, and the refinement of discriminating tasks [

Various cutting-edge DL-based methodologies have been proposed in the literature (as discussed in Section 2) to improve prediction accuracy and encourage prospective innovation in the sector. However, those methodologies are not without some significant drawbacks: (i) expertise is needed to choose the number of data points that will be fed into a model, making the model less trustworthy and less effective in extracting nonlinear features [

A novel Bi-LSTM-based deep stacked S2SAE for improving the accuracy of short-term solar irradiation and wind speed predictions was developed and evaluated in MATLAB.

The forecasting performance of the proposed Bi-LSTM-based deep stacked S2SAE model was compared to three other deep and shallow stacked S2SAEs, i.e., the LSTM-based deep stacked S2SAE model, GRU-based deep stacked S2SAE model, and Bi-LSTM-based shallow stacked S2SAE model.

Bayesian optimization was used to optimize the hyperparameters of all the forecasting models. The simulated results evidenced the superior performance of the proposed Bi-LSTM-based deep stacked S2SAE forecasting model relative to other models.

All the models were optimized with Bayesian optimization using at least 30 objective function evaluations.

The proposed model returned highly accurate results (up to 99.7%) when faced with unknown data and did not show evidence of vanishing gradient, over-fitting, or excessive network training problems.

The subsequent parts of the paper are organized as follows: Section 2 reports on the literature review conducted, Section 3 details the methodology followed, Section 4 discusses the results, and Section 5 draws a conclusion and offers recommendations for the future.

An accurate forecasting model is challenging to develop because of the apparent problems with fluctuations in wind speed and solar irradiance. As a result, various cutting-edge DL-based methodologies have been proposed in the literature to improve prediction accuracy and encourage prospective innovation in this sector. Hourly intervals of solar irradiance and wind speed were forecasted by authors in [

The performance and accuracy of wind speed and solar irradiation forecasting models may also be enhanced by combining various methodologies to create a hybrid prediction model. For instance, the authors of [

Approaches to solar irradiation and wind speed forecasting include both time series and regression models [

Moreover, time series can be associated with parallel series, and forecasting can also be applied to these parallel series (so-called ‘multivariate time series’). For example, a multivariate time series forecasting-based Bi-LSTM neural network was used in a study presented in [

In turn, regression forecasting models are often described as an interpolation technique. Time-series forecasting can also be done via regression. For example, a time-series auto-regression solar irradiation model has been proposed by [

A detailed literature review on DL and machine learning methods currently used to forecast solar irradiation can be found in [

Kernel-based popular forecasting models include the “support vector machine” (SVM) [

As follows from the above literature, the described DL-based forecasting algorithms require expertise to choose the correct number of data points, which makes the models less trustworthy and less effective in extracting nonlinear features. Some models also suffered from the problems of gradient disappearance, over-fitting, and excessive network training, while their lack of generalization capacity prevented them from learning complicated patterns. Moreover, the hyperparameters were not sufficiently fine-tuned to account for the lack of data. Therefore, the present study aimed to address these drawbacks by evaluating the effectiveness of deep stacked S2SAEs in solar irradiation and wind speed time series prediction. To this end, a novel Bi-LSTM-based deep stacked S2SAE for short-term solar irradiation and wind speed prediction was developed and evaluated in MATLAB. The forecasting performance of the proposed Bi-LSTM-based deep stacked S2SAE model was compared to that of three other deep and shallow stacked S2SAEs, i.e., the LSTM-based deep stacked S2SAE model, the GRU-based deep stacked S2SAE model, and the Bi-LSTM-based shallow stacked S2SAE model. Bayesian optimization was used to optimize the hyperparameters of all the forecasting models. The simulated results confirmed the superior performance of the proposed Bi-LSTM-based deep stacked S2SAE forecasting model relative to the other models.

The following section will explain the basic concepts of deep SAE and Bi-LSTM networks before detailing the proposed novel Bi-LSTM-based deep stacked S2SAE forecasting model for solar irradiance and wind speed forecasting.

Autoencoders are among the most significant neural network-based deep learning architectures falling under unsupervised machine learning. They consist of three distinct layers: input, hidden, and output. Encoding is handled by the hidden layer, whereas decoding is performed by the output layer. The network is trained to produce a replica of its input. This is made possible by the hidden layer, which learns representations of the inputs. Because autoencoders are trained to reproduce their input

An autoencoder can be either under-complete or over-complete. The hidden layer dimension in an under-complete autoencoder is smaller than that of the input layer, whereas in an over-complete autoencoder it is larger. The essential features of the inputs can be captured by an under-complete autoencoder. These networks are trained using the backpropagation technique. Autoencoders can represent either a linear or a nonlinear transformation. Under-complete autoencoders can be stacked and are used mainly in dimensionality reduction and data denoising [

To address the issue of vanishing gradients in RNNs, LSTM networks have been developed [

The following equations describe how a single LSTM network cell functions [
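The cell equations did not survive in this copy of the text. As a hedged reference only, the widely used standard LSTM formulation is sketched below. The symbols are assumptions chosen for illustration and may differ from the authors' notation: $x_t$ is the input, $h_t$ the cell's output, $c_t$ the cell state, $W$ and $R$ the input and recurrent weight matrices, $b$ the bias terms, $\sigma$ the sigmoid function, and $f_t$, $i_t$, $o_t$, $g_t$ the forget gate, input gate, output gate, and update signal.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + R_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + R_i h_{t-1} + b_i\right) \\
o_t &= \sigma\left(W_o x_t + R_o h_{t-1} + b_o\right) \\
g_t &= \tanh\left(W_g x_t + R_g h_{t-1} + b_g\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The additive update of $c_t$ is what lets gradients flow across many time steps, which is the property the text refers to when it says LSTM networks address vanishing gradients.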

LSTM cells can be stacked on top of each other to create a deep or multi-layered network, in which each LSTM layer consists of several hidden cells. The LSTM layers used in this study are bidirectional, meaning that the input sequence is processed in both the forward and backward directions [

This study aimed to evaluate the effectiveness of a novel Bi-LSTM-based deep-stacked S2SAE designed for short-term predictions of both solar irradiation and wind speed. To create a deep-stacked S2SAE prediction model, six Bi-LSTM network layers were stacked on top of one another. The proposed deep model was inspired by [

It is worth mentioning that the hidden states of every Bi-LSTM layer are fed as an input to a dropout layer, which is included to prevent the network from overfitting the data and consequently underperforming on unseen values. The dropout layer operates with a probability of 0.05, which means that each input is randomly set to zero with a probability of 5%. Similarly, the decoder is composed of three Bi-LSTM layers that receive this vector as an input and utilize it to generate a target sequence. The encoder uses Bi-LSTM cells to turn the input into a hidden state. Therefore, the hidden state of the most recent Bi-LSTM cell is the output vector generated by the encoder. Subsequently, the repeat vector’s reconstructed original sequence input is fed into the first Bi-LSTM-based hidden layer of the decoder. The layer uses this vector as its first hidden state, and the last time step’s output value is fed into the subsequent Bi-LSTM cell for the step-ahead forecast.
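The dropout behaviour described above can be illustrated with a minimal sketch. The function name `dropout` and the rescaling of surviving values by 1/(1 - p) (the common "inverted dropout" convention) are assumptions made for illustration; the text only states that inputs are zeroed with a probability of 0.05.

```python
import random


def dropout(inputs, p=0.05, training=True, rng=None):
    """Zero each input with probability p during training.

    Surviving values are scaled by 1 / (1 - p) so that the expected
    value of each element is unchanged (an assumed convention).
    """
    if not training or p == 0.0:
        # At inference time the layer passes its inputs through unchanged.
        return list(inputs)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else x * keep_scale for x in inputs]


# Roughly 5% of the hidden-state values are dropped on each training pass.
hidden_states = [1.0] * 10000
dropped = dropout(hidden_states, p=0.05, rng=random.Random(0))
print(dropped.count(0.0) / len(dropped))
```

Because a different random subset is removed on every pass, no single hidden unit can be relied on exclusively, which is the mechanism by which dropout discourages overfitting.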

By gathering information from several Bi-LSTM layers, the model’s forecasting performance may improve, and the model can learn more complex representations of time-series data in its hidden layers [

When developing an ML model, it is crucial to select optimum values of hyperparameters, i.e., parameters whose values are set before the model’s training begins. In RNNs, hyperparameters generally include the maximum number of iterations, mini-batch size, number of hidden layers, initial learning rate, momentum, activation functions, and regularization factor. Model-specific considerations dictate the specific hyperparameters to be used, and no set of hyperparameters is optimal for all models. In the presented study, the tuned hyperparameters were the initial learning rate, the number of hidden neurons in every Bi-LSTM layer, and the L2 regularization (weight decay) factor. The initial learning rate aids in finding generic patterns in the input sequence, and L2 regularization improves the model’s generalization and prevents overfitting, leading to more accurate forecasts. All the hyperparameters used were optimized using a Bayesian optimization method [

Bayesian optimization uses prior knowledge about the objective function and updates it with the knowledge gained via experimentation to minimize losses and increase the model’s accuracy. Other parameter-tuning methods, such as grid search and random search, were not used due to their inherent limitations. The shortcomings of a grid search become more apparent as the number of dimensions increases. In contrast, a random search is more akin to the greedy strategy because it stops at local optimum solutions rather than pursuing the global best [
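The remark about grid search and dimensionality can be made concrete with a small calculation. The sketch below assumes a hypothetical grid of 10 candidate values per hyperparameter; the function name is illustrative only. With the three hyperparameters tuned in this study, an exhaustive grid would need far more model trainings than the 30 objective function evaluations reported for Bayesian optimization.

```python
def grid_search_evaluations(values_per_hyperparameter, n_hyperparameters):
    """Number of model trainings needed to cover a full grid."""
    return values_per_hyperparameter ** n_hyperparameters


# Assumed grid resolution: 10 candidate values for each hyperparameter.
for n in (1, 2, 3):
    print(n, "hyperparameters:", grid_search_evaluations(10, n), "evaluations")
```

The count grows exponentially with the number of hyperparameters, which is why sample-efficient methods become attractive when each evaluation means training a deep network.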

The rectified linear unit (ReLU) activation function was used with every Bi-LSTM layer while training the proposed deep stacked S2SAE forecasting model to deal with vanishing gradients. It also speeds up and improves the learning process [

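The mean absolute percentage error (MAPE), root mean square error (RMSE), and coefficient of determination (R^{2}) performance metrics were calculated to evaluate the forecasting models. An R^{2} value closer to 1 would indicate a better fit between the forecasting model and the data given.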
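The metric equations are not preserved in this copy of the text. As a hedged illustration, the standard textbook definitions of MAPE, RMSE, and R^{2} can be written as follows; the function names are hypothetical, and this is not the authors' MATLAB implementation. Note that MAPE is undefined when an actual value is zero, so the sketch assumes strictly non-zero targets.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up example values, not data from the study.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

Lower MAPE and RMSE values and an R^{2} closer to 1 all point to a better forecast, which is how the results tables should be read.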

The forecasting performance of the proposed novel Bi-LSTM-based deep stacked S2SAE model for solar irradiation and wind speed forecasting was compared to that of three other deep and shallow stacked S2SAEs, i.e., the LSTM-based deep stacked S2SAE model, the GRU-based deep stacked S2SAE model, and the Bi-LSTM-based shallow stacked S2SAE model. A shallow stacked S2SAE has one hidden layer after the input layer on the encoder side and one hidden layer before the output layer on the decoder side, whereas a deep stacked S2SAE has two hidden layers on both sides. All the above stacked S2SAE forecasting models were developed and optimized using Bayesian optimization in MATLAB.

All the models were evaluated using annual (January 2021 to January 2022) hourly global horizontal irradiance (GHI) and wind speed data obtained from the publicly available “NREL Solar Radiation Research Laboratory (BMS)” dataset. In the case of the solar irradiation stacked S2SAE forecasting models, the training data included hourly GHI readings (W/m^{2}) from January 1, 2021, to March 11, 2021, the validation data included hourly GHI readings (W/m^{2}) from March 12–18, 2021, and the testing data included hourly GHI readings (W/m^{2}) from March 19–25, 2021. In the case of the wind speed stacked S2SAE forecasting models, the training data included hourly wind speed readings (m/s) from January 1, 2021, to March 10, 2021, the validation data included hourly wind speed readings (m/s) from March 11–18, 2021, and the testing data included hourly wind speed readings (m/s) from March 19–25, 2021. The GHI values from January 1–6, 2021, are shown in

The hyperparameters of each deep-stacked S2SAE forecasting model were tuned with Bayesian optimization using at least 30 objective function evaluations. The objective was to minimize the MAPE. As deep learning models are sensitive to data scaling, the training and validation data were normalized to have zero mean and unit variance. MAPE measures the extent of the network’s underprediction or overprediction, whereas the validation RMSE measures how successfully the network adapts to new, unseen data. Therefore, the iteration with the lowest MAPE and validation RMSE values was selected as the optimal outcome of the experiment. Optimized hyperparameter values used for all the wind speed and solar irradiation forecasting Bi-LSTM-based deep stacked S2SAE models are shown in

Forecasting task | Model | Input layer number of hidden units | Total network layers (input + hidden layers) | Learning rate | L2 regularization (weight decay)
---|---|---|---|---|---
Wind speed forecasting models | Shallow stacked Bi-LSTM S2SAE | 108 | 4 | 0.0015 | 0.0001
Wind speed forecasting models | Deep-stacked GRU S2SAE | 250 | 6 | 0.0010 | 0.0001
Wind speed forecasting models | Deep-stacked LSTM S2SAE | 61 | 6 | 0.0209 | 1.4729e-5
Wind speed forecasting models | Proposed deep-stacked Bi-LSTM S2SAE | 250 | 6 | 0.0037 | 0.0001
Solar irradiation forecasting models | Shallow stacked Bi-LSTM S2SAE | 247 | 4 | 0.0030 | 0.0001
Solar irradiation forecasting models | Deep-stacked GRU S2SAE | 228 | 6 | 0.0018 | 1.4046e-5
Solar irradiation forecasting models | Deep-stacked LSTM S2SAE | 248 | 6 | 0.0030 | 0.0001
Solar irradiation forecasting models | Proposed deep-stacked Bi-LSTM S2SAE | 172 | 6 | 0.0149 | 0.0001
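The chronological data split and the zero-mean, unit-variance normalization described in the text preceding the table can be sketched as follows. This is a minimal illustration under stated assumptions: the hourly series is synthetic, the helper names are hypothetical, the dates follow the GHI split reported in the text, and computing the mean and standard deviation from the training period only (and reusing them for the validation data) is an assumed convention that the text does not spell out.

```python
import statistics
from datetime import datetime, timedelta


def chronological_split(samples, train_end, val_end):
    """Split (timestamp, value) pairs by date, without shuffling."""
    train = [v for t, v in samples if t <= train_end]
    val = [v for t, v in samples if train_end < t <= val_end]
    test = [v for t, v in samples if t > val_end]
    return train, val, test


def zscore(values, mean, std):
    """Standardize values using a given mean and standard deviation."""
    return [(v - mean) / std for v in values]


# Synthetic hourly series covering January 1 to March 25, 2021 (placeholder values).
start = datetime(2021, 1, 1, 0)
n_hours = (datetime(2021, 3, 26, 0) - start) // timedelta(hours=1)
samples = [(start + timedelta(hours=h), float(h % 24)) for h in range(n_hours)]

train, val, test = chronological_split(
    samples,
    train_end=datetime(2021, 3, 11, 23),  # GHI training period ends on March 11
    val_end=datetime(2021, 3, 18, 23),    # validation covers March 12-18
)

# Assumption: scaling statistics come from the training period only.
mu = statistics.fmean(train)
sigma = statistics.pstdev(train)
train_n = zscore(train, mu, sigma)
val_n = zscore(val, mu, sigma)
print(len(train), len(val), len(test))
```

Keeping the split chronological avoids leaking future observations into training, which matters for any step-ahead forecasting task.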

Model | Wind speed MAPE (%) | Wind speed R^2 | Wind speed RMSE | Solar irradiation MAPE (%) | Solar irradiation R^2 | Solar irradiation RMSE
---|---|---|---|---|---|---
Shallow stacked Bi-LSTM S2SAE | 2.32 | 0.98 | 0.056 | 2.00 | 0.98 | 2.06
Deep-stacked GRU S2SAE | 6.87 | 0.44 | 0.099 | 4.20 | 0.91 | 2.03
Deep-stacked LSTM S2SAE | 5.96 | 0.1916 | 0.0988 | 9.10 | 0.78 | 4.70
Proposed deep-stacked Bi-LSTM S2SAE | 1.58 | 0.99 | 0.0358 | 0.2763 | 0.99 | 3.03

Moreover, when forecasting GHI, its RMSE value was 3.03, which was better than the LSTM-based deep stacked S2SAE and comparable to the GRU-based deep stacked S2SAE and Bi-LSTM-based shallow stacked S2SAE. Likewise, R-squared values reached 0.99 in both cases and were greater than the R-squared values for all the other models, confirming that the developed forecasting model is highly reliable. The lower values of MAPE indicate that the proposed Bi-LSTM deep stacked S2SAE-based forecasting model is 99.7% and 98.42% accurate for solar irradiation and wind speed forecasting, respectively. This would imply that the overall forecasts were only about 0.2763% and 1.58% off from the actual values. Similarly, lower RMSE values indicate that the observed data are very close to the predicted data.
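The accuracy figures quoted above appear to follow from subtracting the MAPE from 100%. The short check below reproduces that arithmetic; the function name is hypothetical, and the relationship itself is inferred from the reported numbers rather than stated as a formula in the text.

```python
def accuracy_from_mape(mape_percent):
    """Accuracy in percent, assuming accuracy = 100% - MAPE."""
    return 100.0 - mape_percent


solar_accuracy = accuracy_from_mape(0.2763)  # MAPE reported for GHI
wind_accuracy = accuracy_from_mape(1.58)     # MAPE reported for wind speed
print(round(solar_accuracy, 1), round(wind_accuracy, 2))
```

Under this reading, the 99.7% figure is the GHI result rounded to one decimal place, and 98.42% is the wind speed result.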

Since the proposed Bi-LSTM-based deep stacked S2SAE can successfully learn crucial unobserved characteristics from time series and subsequently provide accurate predictions, we can confidently say that it is an effective method. The proposed Bi-LSTM-based deep encoder reduces the input dimension and produces a single vector representation of the input time sequence data. Next, the Bi-LSTM-based deep decoder uses this single vector to learn and generate the target sequence. The proposed deep-stacked S2SAE model was able to effectively reconstruct the input sequence and use the reconstruction error to forecast GHI and wind speed, demonstrating its high efficacy. Although LSTM and GRU are not affected by the vanishing gradient problem, they both overfitted and failed to capture the non-linearities in the time sequence data. At each time step, every single Bi-LSTM layer combines the results of the forward and backward layers to generate output. Moreover, unlike other prediction methods, every Bi-LSTM forecast is based on the entire data sequence. Another advantage of the proposed model is that the deep-stacked S2SAE is built without flipping the source/target sequence, since the Bi-LSTM layer can learn in both the forward and reverse directions.
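How a bidirectional layer combines a forward and a backward pass at each time step can be sketched with a toy example. The scalar `run_rnn` cell below is a deliberately simplified stand-in for an LSTM cell, the weights are arbitrary, and pairing the two states per time step (concatenation) is an assumed combination rule; none of these details are taken from the paper.

```python
import math


def run_rnn(sequence, weight_in=0.5, weight_rec=0.5):
    """Toy scalar recurrent cell used in place of an LSTM cell."""
    h, states = 0.0, []
    for x in sequence:
        h = math.tanh(weight_in * x + weight_rec * h)
        states.append(h)
    return states


def bidirectional(sequence):
    """Return one (forward_state, backward_state) pair per time step."""
    forward = run_rnn(sequence)
    # Process the reversed sequence, then flip the states back to time order.
    backward = run_rnn(sequence[::-1])[::-1]
    return list(zip(forward, backward))


outputs = bidirectional([1.0, 2.0, 3.0])
for step, (fwd, bwd) in enumerate(outputs):
    print(step, fwd, bwd)
```

At the first time step the forward state has seen only the first value, while the backward state has already seen the whole sequence, which illustrates why each output of a bidirectional layer can draw on the entire input sequence.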

A novel Bi-LSTM-based deep stacked S2SAE for short-term solar irradiation and wind speed predictions was developed and evaluated in MATLAB 2022a. The forecasting performance of the proposed Bi-LSTM-based deep stacked S2SAE model was compared to that of three other deep and shallow stacked S2SAEs, i.e., the LSTM-based deep stacked S2SAE model, the GRU-based deep stacked S2SAE model, and the Bi-LSTM-based shallow stacked S2SAE model. Bayesian optimization was used to optimize the hyperparameters of all the forecasting models. The simulated results demonstrated the superiority of the proposed Bi-LSTM-based deep stacked S2SAE forecasting model over the other models. Compared to the other benchmark models, the proposed model achieved the lowest MAPE: 0.2763% when forecasting GHI and 1.58% when forecasting wind speed. It also achieved the lowest RMSE value of 0.0358 when forecasting wind speed. Moreover, the R-squared value reached 0.99 in both cases and was higher than in all the other models, confirming that the developed forecasting model is highly reliable. The presented study also explored the optimization of the initial learning rate, number of hidden neurons, and regularization factor at constant pre-defined values of other hyperparameters such as momentum and mini-batch size. On this basis, future research can consider optimizing such hyperparameters using nature-inspired algorithms such as artificial bee colony [

Adaptive network-based fuzzy inference system

Artificial neural network

Autoregressive integrated moving average model

Autoregressive moving average model

Bi-directional long short-term memory networks

Backpropagation neural network

Classification and regression tree

Convolutional neural networks

Deep belief networks

Deep learning

Extreme learning machine

Gradient boosting decision tree

Global horizontal irradiance

Gaussian process regression

Generalized regression neural network

Gated recurrent unit

Least-squares support vector machine

Long short-term memory networks

Model five tree

Mean absolute percentage error

Multi-layer perceptron neural network

Radial basis function neural network

Root mean square error

Recurrent neural networks

Sequence-to-sequence autoencoder

Stacked autoencoder

Support vector machine

Backward hidden state

Forward hidden state

Input

Bias term

Hidden representation

_{[t]}

Cell’s output

_{[t]}

Cell state

Hyperbolic tangent activation function

Weight matrix

Sigmoid activation function

Input of forget gate

_{g}

Input of input gate

_{g}

Input of output gate

Activation function

Activation function

Update signal

This research was funded by Warsaw University of Technology, Faculty of Mechanical and Industrial Engineering from funds for the implementation of KITT4SME (platform-enabled KITs of artificial intelligence for an easy uptake by SMEs) project. The project was funded under the European Commission H2020 Program, under GA 952119 (

The authors declare that they have no conflicts of interest to report regarding the present study.